
Module by: Laura Mandell. Edited by: Frederick Moody, Ben Allen


Non-Consuming Relevance: “The Grub Street Project”

Laura Mandell

To summarize my response to Professor Muri at the outset: yes. And now on to some details. I am director of a project that has been modeled upon and is partly sponsored by NINES (our host for this event), a project mentioned in Professor Muri’s excellent paper. It is called 18thConnect, and, like NINES, it will aggregate and peer-review digital projects. 18thConnect differs from NINES in one major way, however: ECCO. The Eighteenth-Century Collections Online database, mentioned by Professor Muri, contains page-images of 140,000 texts ranging in size from a two-page pamphlet to the 1500-page Clarissa. The Text Creation Partnership double-keyed 1,809 of these texts before running out of money. 18thConnect has a development team at the University of Illinois, headed by Robert Markley, that is augmenting an OCR program, training it specifically to read eighteenth-century texts. After meeting with Martin Mueller a few weeks ago, I have come to believe that 18thConnect can play a crucial part in creating—not just an acceptable level of error in text files corresponding to ECCO’s page images, but actually—a vital, immense dataset of eighteenth-century texts, minable by machines, word-searchable by all. That thrilling prospect will involve not just OCR but automated TEI tagging, automated linguistic markup, automated dictionary look-up, and more. I attach at the end of this essay a table of all the organizations that could be involved in producing this vast data set and that could benefit from the text-cleaning process. If Martin Mueller’s Project 2015 really works, as I believe it will, new kinds of knowledge will come from textual data, knowledge that will convince our institutions of the value of text as data, which will provide, as Professor Muri so aptly stated, the ground for making the digital humanities a going concern. Academic institutions must become consumers clamoring for textual data if the humanities are to be relevant in the twenty-first century.

The phrase “consuming relevance”—almost quoted in the title of my response—comes from Frank Kermode’s book An Appetite for Poetry (1989). He says of the canon that it gives us books of timeless, “consuming relevance,” texts whose meanings and messages were applicable to the time in which they were written but continue to be relevant and perhaps even prophetic, though the prophecy could of course be of the self-fulfilling sort arising as a consequence of veneration. Less abstractly: did Shakespeare understand modern psychology, or did he partly produce it? The question concerned Harold Bloom.1 But “Shakespeare’s” production of our inner lives wasn’t his, as attendees at this conference know. Insofar as we have been shaped by Shakespearean drama, it has been through the mechanisms of bardolatry instantiated in the eighteenth century and mediated by new print technologies that made it possible for Shakespeare’s work to be disseminated beyond his wildest imaginings.2 The first modern editor and the first modern apparatus materialize in the 1720s around Hamlet.3

However, a much newer technology is evoked by the phrase “non-consumption,” which my title almost quotes as well. The Google Research environment proposes to offer the “non-consumptive” use of texts, as John Unsworth, Lisa Spiro, and Don Waters have noted in various essays and talks about the Google Settlement, anticipated and actual.4 18thConnect, like NINES, will provide an alternative to that research environment, one that has been constructed by scholars and librarians working together, a necessary competitor to Google. But as Professor Muri noted, we cannot give away proprietary texts such as those in Gale-Cengage’s ECCO catalogue. We have been given permission by Gale to ingest those texts into a search engine that allows for full-text searching and, we hope, data-mining. Users of 18thConnect whose libraries do not subscribe to ECCO will not be allowed to “consume”—i.e., read—the ECCO texts, but only to use them, use-value somehow being imagined as completely distinct from the exchange-value that prompts consumption and production, library budgets and Gale-Cengage profits. The Grub Street Project provides another way of “using” texts: mapping them. And this cultural analysis of textual relevance via mapping will open up worlds of thought as yet undefined.

As we move from the consuming relevance that prompted literary scholars to virtually memorize a circumscribed set of texts to the non-consumptive data-mining and mapping that will generate we-know-not-what-kinds of knowledge, there is another form of consumption dogging us, the kind of “consuming” mentioned by Professor Muri in her paper: “time-consuming” tasks. When she uses the phrase, she describes what is happening for her at a university that sounds identical to my own. She talks about how her “research time is eaten up” by the multiple problems confronting the Grub Street Project, institutional and financial; ultimately her time is eaten up by public-relations problems endemic to scholarly editing and the digital humanities. I want to reiterate all of Professor Muri’s articulations of these problems, but I’ll do so in the most textually economical way: through anecdote.

First, I was offered the opportunity to serve as a grant reviewer for SSHRC’s Image, Text, Sound and Technology (ITST) projects in literary and textual studies, described in Professor Muri’s Appendix A. The quality not just of the projects but of the applications for these funds was astounding—it actually made me realize how far ahead the digital humanities are in Canada, but this isn’t the time to indulge in nationalistic sentiment. The participants on this adjudicating committee all knew that, no matter what, there would not be enough money to fund projects that deserved funding. And in fact SSHRC is in the process of reorganizing its granting structure precisely in order to let excellent digital humanities projects compete against traditional projects of lesser quality and value.5 SSHRC itself wishes to improve the statistics found in Appendix A: it would say yes to Professor Muri’s comments here.

Second, MLA held a workshop designed and run by Susan Schreibman called “Evaluating Digital Work for Promotion and Tenure” (December 2009). Susan asked a number of us to create a tenure case for a digital humanist, real or imagined, in no more than four pages. She distributed these cases to a crowd of twenty-eight workshop participants, many of them chairs or deans, and then gave them a checklist to help facilitate discussion among groups as to whether each of these cases was tenurable. Dr. Schreibman noticed at the end of the workshop that these mock-tenure-case reviewers were complaining that the candidates for tenure had not adequately explained the research value of their digital work. We thought they had. When these non-digital humanists reached the portions of the case documents in which that value was explained, they complained about technical jargon and skipped those sections. This is a severe problem. Such misunderstanding can be corrected by peer-reviewing organizations such as NINES and 18thConnect, which can instruct P&T committees on the value of a digital resource. My own Poetess Archive was the basis for my promotion to Full Professor because it has the NINES seal of approval. But there is a worse effect of this problem than the simple misunderstanding for which NINES or 18thConnect can compensate.

Professor Muri describes comprehension problems among reviewers of grant applications as well as of promotion and tenure cases. She never knows how much “basic education” in the digital humanities she must provide. What Dr. Schreibman and I discovered is that the basic education in the mock-tenure cases took up too much time and space, supplanting some of the discussion of research value. The same thing is happening right now in this response: I am writing about institutional issues and problems rather than discussing what kinds of research questions might be askable and answerable in a literary mapping of London, or in data-mining textual artifacts of the eighteenth century on a scale never before thought possible.

But it gets worse. Because so much textual space in articles, essays, books, P&T documents, and grant applications is consumed by basic education, writers then justify their consumption of time and space by making untenable utopian claims about the value of the medium alone, regardless of what’s in it. Imagine Samuel Johnson describing all the possibilities of the periodical medium rather than writing his essays for the Rambler, Adventurer, and Idler, essays that of course do perform such a media analysis, but not on the periodical medium as if it were contentless, as if AutoTrader and the Atlantic Monthly were in any sense “the same thing.” On the contrary, Johnson’s personae, his ramblers and adventurers, idly produce snippets of text and achieve specific ideas, both for the writer and his readers, by consuming short periods of time. The whole infamous database debate in PMLA was predicated on the idea that one could talk about the generic value of “a database,” as if the databases sustaining searches in the Whitman Archive were fundamentally “the same” as any other use of a database.6 Such generalizations spawn the utopian, absurd discourse about the power of digital media that articles such as John Unsworth’s excellent “The Importance of Failure” are designed to counteract.7

Dean Unsworth’s essay lays out how to formulate digital projects that perform research as well as how to write about them. He says that digital projects (archives, editions, databases, etc.) ought to be able “to declare the terms of their potential success or failure”: chances are, such an explanation is only possible if the scholars really know what they are trying to accomplish. Moreover, real research is only being done if the possibility of failure exists. “Failure” in the research proposal, “We want to know X,” means being able to conclude after research has been performed, “we can’t know X for the following reasons, but we can know Y.” Even interpretive projects follow this rule. One is frequently thrilled to find an essay or book that implements the most thorough analysis of a particular text via one particular methodology because it allows one to see how far that particular theoretical vantage point will take us, and where it breaks down or becomes uninteresting. And Dean Unsworth insists that the methodology be formulated and stated explicitly for such projects. Once again, no matter what the result, the digital project is a success insofar as it implements that methodology to the utmost and reveals what results can and cannot be had from it. However, Professor Unsworth admits, this kind of successful failure is not really suitable for institutional consumption: “Frankly, the only metric that is likely to matter to the universities that sponsor such projects is their success in attracting outside funding, but scholars, designers, and funding agencies ought to care about more than those simple intrinsic criteria.”

When I teach literature classes, I say to my students that their essays are “essais” in the French sense of the word: they are trials. I tell students, you’ll have your thesis, and you’ll try to prove it by quoting the text, but when you get to the end, if you no longer believe your thesis, that too is an excellent result. You tried it, and it didn’t work, and now you know that that idea, followed as far as you can take it, doesn’t work. What an amazing thing, really, to know that an idea you had while reading was really YOUR idea and not the author’s, that the idea is not in fact supported by the text. So, I say, if you don’t believe your thesis by the end of the paper, just say so, and I’ll give you an A. But is this the kind of concluding white paper expected by the NEH? And there is no need to idealize the sciences: we know how much data has been falsified for purposes of funding and prestige. The social consumption of research sometimes contaminates it the way TB infects the lungs.

So what kinds of success and failure (success from failure) do we want from mappings of the sort undertaken by Professor Muri? Her work parallels that undertaken by Fiona Black and Bertrum MacDonald in the SSHRC-funded History of the Book in Canada project, which began as a series of published volumes about this history and will end as a GIS project, mapping information given in the following chart:8

Figure 1 (Picture 1.png)

The books describing all this information are complete, published in six volumes and two languages by the University of Toronto Press. But the mapping is advancing with GIS technologies: in particular, a recent email from Fiona Black tells me, it is capitalizing on new geo-referencing technologies that make it possible to use historical maps.9 MacDonald and Black argue in their article describing this undertaking that mapping information obtained from databases such as the British Book Trade Index enables answering the following kinds of questions, beautifully tabulated for us:

Figure 2 (graphics1.png)

Figure 3 (graphics2.png)

(Black and MacDonald 514).
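As a side note on the geo-referencing mentioned above: at its simplest, geo-referencing a scanned historical map means fitting a transform from pixel coordinates to longitude and latitude using a few control points whose real-world locations are known. The sketch below fits an affine transform from three invented control points; real GIS tools fit many points and also warp the image, so this is only an illustration of the principle.

```python
def det3(m):
    """Determinant of a 3x3 matrix (list of lists)."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
          - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
          + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(m, v):
    """Solve a 3x3 linear system by Cramer's rule."""
    d = det3(m)
    out = []
    for j in range(3):
        mj = [row[:] for row in m]
        for i in range(3):
            mj[i][j] = v[i]
        out.append(det3(mj) / d)
    return out

def fit_affine(controls):
    """controls: three ((px, py), (lon, lat)) pairs -> pixel-to-lon/lat function."""
    m = [[px, py, 1.0] for (px, py), _ in controls]
    a, b, c = solve3(m, [lon for _, (lon, _) in controls])
    d_, e, f = solve3(m, [lat for _, (_, lat) in controls])
    return lambda px, py: (a * px + b * py + c, d_ * px + e * py + f)

# Invented control points for a hypothetical scan of a London map
controls = [((0, 0), (-0.12, 51.52)),
            ((1000, 0), (-0.08, 51.52)),
            ((0, 800), (-0.12, 51.50))]
to_lonlat = fit_affine(controls)
print(to_lonlat(500, 400))  # coordinates for the centre of the scan
```

Once such a transform exists, any feature located on the historical map image (a printing house, a bookshop) can be placed on a modern base map, which is what makes the GIS layering described above possible.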

It is never the case that such questions canNOT be answered using books or databases—of course they can. However, one would have to extract the information from the media in which it appears. The mapping pre-extracts it, and lets you go on to think about what this information shows us, once mapped.

Every medium abstracts information. Here, pictorially, are some abstractions of book history information:

Table 1 (graphics3.png, graphics4.jpg, graphics5.png)

These particular views of the book history information—a closed book, a pile of microfiche, and a database search form—are lossy to say the least, obfuscating to say the most. Nothing can be known until you open the book, look at the microfiche, search the database.

Figure 4 (graphics1-2.jpg)

These activities are trivial, you might say: the information is there. Is it? I want to trace one particular dataset that is “there” in many forms.

Ian Maxted published the book (inadequately pictured above) called The London Book Trades, 1775-1800: A Topographical Guide (Exeter: by the author, 1980). The book, clearly almost homemade, contains an introduction in typescript and a folder full of microfiche. The microfiche pictures a typed list, arranged by street, of the businesses involved in print culture, including the names of all who worked in them. The typescript, we are told, was designed for Maxted’s own personal use in checking the accuracy of information as he proofed The London Book Trades, 1775-1800, and was subsequently given to the British Library so that it could derive cataloguing information—identify printers and/or places from inadequate title pages. The microfiche images are almost unreadable:

Figure 5 (graphics2-2.png)

Producing this PDF of the microfiche image required transforming it from negative to positive. Maxted recognized the problem of communicating this information in book form, and to me he almost seems like a soldier in the effort to overcome information loss due to the affordances of the book medium, as can be seen clearly in the copyright information in his Topographical Guide:

© Ian Maxted, 1980

Subject to the provisions of the Copyright Acts any part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise without the permission of the author.

When I first read this sentence, I had to go back over it several times: surely I was missing the “not”—“Subject to the provisions of the Copyright Acts any part of this publication may NOT be reproduced . . . ,” but NOT is not there. A typo? The sentence following this one cleared up my confusion: “The author would however be interested to be informed of any major use made of the publication.” Maxted is saying, in effect, PLEASE reproduce this document in a more useful (“electronic”) form than I was able to do. He just wants his work to be effectively used. And, aware of the terrible quality of the microfiche, Maxted went further, reproducing the work in its entirety, with the rest of his works, on a blog:

Figure 6 (graphics3-2.png)

Much of Maxted’s work forms the basis of the British Book Trade Index, as one can see from their sources, though Maxted’s Topographical Guide is not (yet) listed among them. And yet, to what extent is the information even THERE, in that database? It is extractable, yes, but here is what one must go through to extract it:

Figure 7 (graphics4-2.png)

Figure 8 (graphics5-2.png)

By carefully following these guidelines, I could extract Maxted’s street-by-street list of booksellers, engravers, printers, and so on. The addresses are given in the returns, beautifully, indicating the dates an entity occupied a particular address, but only entry by entry. To get at that list of addresses by date, you have to click on each element in the returned list: each printer, bookseller, and so on. So, to understand the relationships among these printers and booksellers, one thing I would need to do is generate lists of where they were during specific periods: lists of street addresses by year.
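The regrouping itself is mechanically simple once the entry-by-entry returns have been extracted. Here is a sketch, with invented sample records (the BBTI’s actual export format is not reproduced here): each record pairs a name and address with the first and last year at that address, and the function inverts them into a by-year index.

```python
from collections import defaultdict

# Hypothetical records: (name, address, first year, last year at that address)
records = [
    ("J. Johnson", "72 St Paul's Churchyard", 1770, 1809),
    ("T. Cadell", "141 Strand", 1767, 1793),
    ("J. Dodsley", "Pall Mall", 1764, 1797),
]

def addresses_by_year(records):
    """Invert entry-by-entry records into a year -> [(name, address)] map."""
    by_year = defaultdict(list)
    for name, address, start, end in records:
        for year in range(start, end + 1):
            by_year[year].append((name, address))
    return by_year

index = addresses_by_year(records)
print(sorted(index[1780]))  # everyone trading in 1780, with addresses
```

The point of the sketch is only that the database already holds this information; what it lacks is a view that serves it up this way, year by year and street by street.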

I submit, without further arguing, that Maxted’s Topographical Guide breaks both the book and the database medium: neither can be used to convey this information effectively. The information of topographical relatedness, and all the possible research questions that go with it (“Location,” “Condition,” “Trend,” “Pattern,” given by Black and MacDonald, above), are lost in those particular material abstractions of print culture. We can only know the topography of printing by hooking that database up to maps—historical maps, which, as in Todd Presner’s Hypermedia Berlin, offer different maps and mappings for different times. What we need, in other words, is Allison Muri’s Grub Street project.

Right now, Professor Muri has only begun to enter the information needed into the databases hooked up to her maps—currently, it is primarily the ESTC, I believe. What if we could know via map views specific to particular times precisely in which printing houses the feminist texts she maps here were printed?

Figure 9 (graphics6-2.png)

What other kinds of texts were printed by the same printer and sold by the same bookseller? What collections of poetry came from those houses (and now I’m imagining linking my own Poetess Archive Database up to the Grub Street project), and whom did they include?

Professor Muri’s own idea is to go beyond even these socio-economic mappings to trace what she calls “patterns of literary motions”:

What might we discover if we map the “low-culture” of Grub Street communications alongside the “higher-order” communications of canonical authors . . . ? What are the relationships of shared, pirated, public, and freely-exchanged communications to furthering and encouraging practices of public creativity and knowledge? My preliminary task is to understand the passage of texts and books through the physical network (the streets) and the cultural network (the literature) of eighteenth-century London to see if I can gain insight into patterns of literary motions—the motions of texts, authors, books, and booksellers.10

I want to see what that project can show, and—to get back to John Unsworth’s notion of successful failure—what it cannot show. That is, what among those literary motions must be mapped in a different way, in a way other than through physical geography, and what kinds of mappings do we already have that abstract that information in more or less lossy form?11

To summarize and conclude: I want to know what happens when eighteenth-century London is mapped with texts and words, and spaces contextualized by prosopographies, biographies, and fictions. How does reading differ when spatialized and so topographically rendered rather than anthologically connected? What kinds of information lurk in the format, skewing the form by (mis)placing the observer? And then, to switch to possible research questions opened up by 18thConnect’s burgeoning data set . . . what kinds of “genres” appear on the scene when algorithms search among huge bodies of texts for semantic regularities or morphological similarities based on documentary forms or syntactic structure? What difference does it make to ask particular research questions when examining hundreds of thousands of eighteenth-century texts?—and what happens to such questions when they are reconfigured to be asked in terms of topographical orderings of cultural interactions? What difference do space and scale make? These should be our consuming questions as we speed along our Grub Street, the digital highway.12

Appendix I: 18thConnect: An Update

temporary address for 18thConnect, prior to launch in July 2010:

18thConnect, sponsored by NINES, began as an attempt to forestall severe degradation of information as the archive became digitized. But that has now changed. 18thConnect received an NEH-sponsored grant for supercomputer time and has conferred via NINES-sponsored meetings with other collaborative teams working on the same problem (all named in the table below). Now it looks as if it will be possible to know more than we could ever have known before about Anglo-American history, literature, crime, biography, anything conveyed by texts that were printed before 1850. Here follows a brief summary of the state of affairs, what 18thConnect is doing, and what we hope will come. At the end of this report follows a table of those people with whom we would like to work in creating the cleanest possible data set of perhaps as many as half a million eighteenth-century texts.

Current State of Affairs

The Eighteenth-Century Collections Online database owned by Gale Cengage contains page-images of 140,000 texts ranging in size from a two-page pamphlet to the 1500-page Clarissa. The Text Creation Partnership of the University of Michigan libraries double-keyed 1,809 of these texts before running out of money. The ESTC (English Short Title Catalogue) at the British Library and UC Riverside has given us its catalogue of 400,000 eighteenth-century texts. We have loaded that bibliography into a SOLR indexer. These records are being reviewed and corrected by the ESTC before being made public with the launch of 18thConnect, projected for July 2010. Those records overlap with the bibliographic records of ECCO texts. Benjamin Pauley is working with the ESTC and Google Books and engaging eighteenth-century scholars to connect ESTC numbers and other relevant data to items available through Google Books. Shef Rogers and Miami University have been given the right to produce a digital version of Cambridge’s Bibliography of English Literature, another resource for bibliographic data. These data sets will be the first to be aggregated by 18thConnect. In plain terms, when one searches the site, one will get back bibliographic data from ECCO, ESTC, Google Books, and the DBEL (Digital Bibliography of English Literature).

The NINES system adopted by 18thConnect also allows for plain-text searches, but it does not take in OCR that is under 99 percent correct, which means that neither the Google nor the ECCO plain-text files can be ingested by our system.13 Gale Cengage has given us permission to try to improve the OCR software and generate cleaner, searchable texts. They have given us this permission because the system we use allows one to search plain text without recreating any specific text. Like the NINES interface, the 18thConnect interface will return bibliographic data linked to the ECCO catalogue. By clicking a link, users of 18thConnect whose institutions subscribe to ECCO will be able to go directly to the page images; non-subscribers will be taken to a page encouraging their libraries to buy ECCO. Holding libraries are listed in the bibliographic data itself, so interlibrary loan is another possibility.
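For illustration, the 99-percent criterion can be operationalized as character-level similarity against a trusted (double-keyed) transcription. The threshold logic below is my own sketch, not the actual NINES ingestion code; the sample strings, with their long-s misreadings, are likewise invented.

```python
from difflib import SequenceMatcher

def ocr_accuracy(ocr_text: str, keyed_text: str) -> float:
    """Character-level similarity between OCR output and a keyed transcription."""
    return SequenceMatcher(None, ocr_text, keyed_text).ratio()

def ingestible(ocr_text: str, keyed_text: str, threshold: float = 0.99) -> bool:
    """Gate a text on the hypothetical 99-percent-correct criterion."""
    return ocr_accuracy(ocr_text, keyed_text) >= threshold

keyed = "An Essay on Criticism, written in the year 1709."
ocr   = "An Eſsay on Criticiſm, written in the year 1709."
print(round(ocr_accuracy(ocr, keyed), 3), ingestible(ocr, keyed))
```

Even two long-s confusions in a short line push the score well below 0.99, which is why raw eighteenth-century OCR so rarely clears the bar.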

Work Currently Underway

At McGill University, Ichiro Fujinaga created Gamera, an optical character recognition program for reading musical notes, optimizing it by creating libraries of images, specific to very short periods in the history of musical notation, for it to search against. Johns Hopkins University released Gamera14 in the hope that domain experts would customize it for their specific data sets in order to increase its power. Only one group had created the libraries necessary to convert from image recognition to text, however, and we were unable to contact that group.15 Jennifer Lieberman, a graduate student in English at Illinois, created the library that allows Gamera to output plain-text files, loading in fonts specific to the eighteenth century in order to train its image reader. Now Mike Behrens at Illinois is using JuXta, a collation tool developed by NINES, to compare the double-keyed texts donated as a test set by the Text Creation Partnership to the plain-text output of Gamera. The team will find and correct Gamera’s weaknesses, and will then run it over the full 140,000 .pdf files donated (with specific usage constraints) by Gale Cengage, producing plain-text versions of them by December 2010.
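The collation step can be sketched in miniature: align OCR output against a double-keyed transcription and tally the systematic confusions (the long s read as f, for instance). JuXta does this interactively and far more richly; Python’s difflib stands in for it here, and the sample sentences are invented.

```python
from difflib import SequenceMatcher
from collections import Counter

def confusions(keyed: str, ocr: str) -> Counter:
    """Count (ground-truth, OCR-output) substitution pairs."""
    pairs = Counter()
    sm = SequenceMatcher(None, keyed, ocr)
    for op, i1, i2, j1, j2 in sm.get_opcodes():
        if op == "replace":
            pairs[(keyed[i1:i2], ocr[j1:j2])] += 1
    return pairs

keyed = "The business of a poet, said Imlac, is to examine"
ocr   = "The bufinefs of a poet, faid Imlac, is to examine"
for (truth, seen), n in confusions(keyed, ocr).items():
    print(f"{truth!r} read as {seen!r} ({n}x)")
```

Aggregating such tallies over the whole TCP test set is what would reveal where the OCR engine’s period-specific weaknesses lie, and so where retraining effort should go.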

As an offshoot of the MONK project, Katrina Fenlon, a recent graduate of GSLIS working under the supervision of Tim Cole and Martin Mueller, designed a proof-of-concept tool for turning the "white-space XML" output of OCR (in this case ABBYY FineReader) into TEI P5 with very limited human intervention. Jennifer Lieberman is contacting Katrina Fenlon and Tim Cole so that we can get Gamera to produce the same output. This output makes it possible for the texts to be fed into MorphAdorner, released in 2009, a tool built by Phil Burns at Northwestern. MorphAdorner is a highly customizable natural language processing toolkit with special capabilities for the virtual orthographic standardization, lemmatization, and morphosyntactic tagging of written English between 1500 and 1800. I believe that MorphAdorner can be trained to automatically correct 80 percent of the errors found by looking words up in a period-specific dictionary. Martin Mueller and Craig Berry have developed Annolex, a collaborative data-curation tool that can be used for the remaining 20 percent of errors. Annolex allows people to hand-correct text: it presents unresolved words in context, and each editorial intervention is then submitted to an editor overseeing the work. 18thConnect can provide a) that editorial supervision and b) letters from an illustrious editorial board about any particular scholar’s work in correcting bibliographic and textual data.
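As a toy stand-in for that dictionary look-up pass (the lexicon, similarity cutoff, and review queue below are all invented for illustration; this is not MorphAdorner’s or Annolex’s actual logic): words absent from a period-specific lexicon are either auto-corrected to a close lexicon entry or flagged for a human editor.

```python
from difflib import get_close_matches

# A tiny invented period lexicon; a real one would have tens of thousands of entries
LEXICON = {"whilst", "physick", "musick", "publick", "says", "said"}

def correct(word: str, lexicon=LEXICON, cutoff: float = 0.8):
    """Return (corrected_word, needs_human_review)."""
    if word in lexicon:
        return word, False
    candidates = get_close_matches(word, lexicon, n=1, cutoff=cutoff)
    if candidates:
        return candidates[0], False      # confident automatic fix
    return word, True                    # unresolved: queue for an editor

print(correct("phyfick"))   # long-s OCR error -> ('physick', False)
print(correct("zzzz"))      # no close match  -> ('zzzz', True)
```

The 80/20 split described above corresponds to the two branches here: most OCR errors have one obvious close match in the period lexicon, and the residue is exactly what a collaborative curation tool like Annolex exists to catch.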

18thConnect would like to ask ASECS and the NACBS to issue a “My18thConnect” page, based on the MyNINES page, that will bring eighteenth-century specialists into the business of text correction. 18thConnect will allow all members of ASECS and NACBS to search the texts we have generated, texts that will be in the 99 percent clean range (per Peter Bajcsy), but only those members whose institutions subscribe will get access to the whole texts corresponding to the bibliographic data returned by their full-text search. Mellon is beginning to negotiate with Gale, Readex, and Proquest to let all eighteenth-century scholars have access to all those texts, regardless of whether their institutions subscribe. But in the meantime, 18thConnect can enlist those scholars in correcting the OCR via the Annolex interface. As a reward to those scholars—in addition to letters by a board of respected scholars describing their work—we hope to negotiate with Gale so that 18thConnect can provide those who perform such corrections access to the full-text images of the text they have corrected and the opportunity to use the plain-text files of such a text to create a digital scholarly edition, under the guidance of 18thConnect. All the enriched data can go back to Gale Cengage in return, but it will also go forward into a data-mining system structured by the Meandre workbench (SEASR) and minimally TEI-encoded using TEI-A (analytics), developed by Brian Pytlik Zillig of the Nebraska Center for Digital Humanities for the MONK project. On the My18thConnect page, users will be able to perform data-mining as well, and to keep their results whether or not they can get access to the full texts.

Table of those whom we would like to involve in development and use, their interactions coordinated by the 18thConnect Office at Miami

Table 2
Person | Organization | Institution | Interest / Skill
Martin Mueller, Phil Burns, Craig Berry | The Book of English, Project 2015 | Northwestern Univ. | MorphAdorner, NUPOS, Annolex
Jerome McGann, Andy Stauffer, Laura Mandell | NINES / 18thConnect | Univ. of Virginia, Miami Univ. of Ohio | SOLR index, peer review, JuXta
Brian Geiger, Moira Goff | ESTC (English Short Title Catalogue) | UC Riverside, British Library | Moira Goff of the British Library office has given 18thConnect 400,000 bibliographic records spanning the 18th century; Brian heads the UC Riverside office, which is working on 18th-century newspapers that could make use of this workflow process
Aaron McCollough, Paul Schaffner | TCP | Univ. of Michigan | Keying, post-processing, and encoding texts in ECCO and EEBO
Brian Pytlik Zillig | Center for Digital Humanities | Univ. of Nebraska, Lincoln | Automated TEI-A tagging
Peter Reill, Dino Felluga | ASECS and NAVSA | UCLA | Interests of scholar-users
Tim Cole, Katrina Fenlon | UIUC Libraries | Univ. of Illinois at Urbana-Champaign | XSLT transforming plain text to white-space XML
Peter Bajcsy, Alan Craig | NCSA | UIUC | Image and supercomputer expertise
Marshall Scott Poole, Kevin Franklin | I-CHASS | UIUC | Awarded supercomputer-time grant via NEH
John Unsworth | Dean, Graduate School of Library and Information Science; Director, Illinois Informatics Institute | UIUC | Experience in all areas, and MONK
Mike Behrens, Michael Simeone, Jenni Lieberman | 18thConnect Development Team | UIUC Dept. of English | Gamera development and 18thConnect interface development (for Annolex and Meandre)
Loretta Auvil, Bernie Ács | SEASR | UIUC | Meandre Workbench data-mining tools
Scott Dawson, Ray Bankoski | Gale-Cengage | Gale-Cengage | ECCO and the Burney Newspaper Collection
Table 3
Person | Role / Organization | Institution | Interest / Skill
Laurent Romary | INRIA Research Director; Chairman of the TEI Scientific Board; CNRS researcher | INRIA (Institut national de recherche en informatique et en automatique, France), TEI, Centre national de la recherche scientifique, and the ISO | Developer of the ISO standard Morphosyntactic Annotation Framework (ISO 24611)
Ichiro Fujinaga | Associate Professor, Music Technology; recipient of a Digging into Data Challenge grant | McGill University | Creator of Gamera
Stefan Sinclair, Geof Rockwell, Susan Brown | Voyeur and Jitter [tool name imprecise] | McMaster and Univ. of Alberta | Data-mining and tool-using windows; interoperability with all web resources
Heike Neuroth | TextGrid | Head of digital research at the State and University Library in Göttingen | Creators of a similar workflow process in Germany (includes texts from Georgian England)
James Mussell, Gerhard Brey | NCSE (Nineteenth-Century Serials Edition) and CCH | Birmingham Univ. / King’s College London | Early nineteenth-century periodicals; algorithms for name extraction
Robert Shoemaker, Tim Hitchcock | Connected Histories / London Lives / Old Bailey Online | Univ. of Sheffield, Univ. of Hertfordshire | Historians working in the 18th century; winners of a Digging into Data Challenge grant
Dan Edelstein, Chris Weaver | The Republic of Letters | Stanford Univ., Univ. of Oklahoma | Winners of a Digging into Data Challenge grant to digitize 53,000 18th-century letters
Mary Sauer-Games, Jo-Anne Hogan | ProQuest | ProQuest / Chadwyck-Healey | Proprietary catalogs and full text
William Pidduck | Adam Matthew | Adam Matthew Digital imprint | Proprietary catalogs and full text
??? | Readex | Evans | Proprietary catalogs and full text


  1. Shakespeare: The Invention of the Human (New York: Penguin Putnam, 1998).
  2. Margreta de Grazia, Shakespeare Verbatim: The Reproduction of Authenticity and the 1790 Apparatus (New York: Oxford University Press, 1991); Michael Dobson, The Making of the National Poet: Shakespeare, Adaptation and Authorship, 1660-1769 (New York: Oxford University Press, 1992).
  3. See my “Special Issue: ‘Scholarly Editing in the Twenty-First Century’—A Conclusion,” Literature Compass 7/2 (2010): 120–133, p. 125.
  4. John Unsworth, “Computational Work with Very Large Text Collections: Google Books, HathiTrust, the Open Content Alliance, and the Future of TEI,” Text Encoding Initiative Consortium Annual Members' Meeting, Ann Arbor, Michigan, November 13, 2009, accessed 1 March 2010; Lisa Spiro, “Digital Humanities in 2008, II: Scholarly Communication & Open Access,” accessed 1 March 2010; Don Waters, “The Changing Role of Special Collections in Scholarly Communications,” Research Library Issues 267 (December 2009), accessed 1 March 2010.
  5. See SSHRC's discussions about restructuring:
  6. PMLA 122 (October 2007).
  7. John Unsworth, “Documenting the Reinvention of Text: The Importance of Failure,” Journal of Electronic Publishing 3, no. 2 (December 1997).
  8. This chart and the subsequent information about this project come from Bertrum H. MacDonald and Fiona A. Black, “Using GIS for Spatial and Temporal Analyses in Print Culture Studies,” Social Science History 24.3 (Fall 2000): 505-535, p. 510.
  9. Email from Fiona Black, Sun, Mar 7, 2010 at 4:43 PM.
  10. Allison Muri, “The Technology and Future of the Book: What a Digital ‘Grub Street’ can tell us about Communications, Commerce, and Creativity,” in Producing the Eighteenth-Century Book: Writers and Publishers in England, 1650-1800, ed. Laura Runge, Pat Rogers (Newark: Univ. of Delaware Press, 2009), pp. 235-251,
  11. Is the table of contents such a “map”? See Laura Mandell, “Putting Contents on the Table,” Poetess Archive Journal 1.1 (2007). One of Alan Liu’s classes re-invented the tables of contents of specific anthologies as geographical maps of authors’ locations: what could we see if we did such a thing on a large scale?
  12. For an excellent, thought-provoking account of this equation, see Muri 238.
  13. NINES and 18thConnect share a SOLR index running a Lucene search engine.
  15. Reddy, Sravana and Crane, Gregory, “A Document Recognition System for Early Modern Latin,”
