Featured Abstract: Feb. 27

Featured Abstract: “Virtual Scriptorium St. Matthias”

Andrea Rapp, TU Darmstadt

In order to virtually reconstruct the mediaeval library of the Benedictine Abbey of St. Matthias in Trier, the project is digitizing all of the approximately 500 manuscripts that originate from the Abbey but are now dispersed throughout the world. The integration and modeling of catalog data allows presentation, navigation, research and networking of the library holdings. Modeling the project poses various challenges: each manuscript is a witness of a text or a work, and it should be associated with information about this specific manuscript / text / work and with critical text editions (e.g. Perseus) or other manuscript witnesses. At the same time, the collection of the library can be seen as an ensemble, which has been collected, maintained and curated with care. The DFG-Viewer is used to present and navigate each manuscript easily, although this tool was developed primarily for printed works. This decision brings about some problems for data modeling (TEI, METS-MODS). At the same time, all data will be incorporated into the virtual research environment TextGrid, where they are released for further processing. On the one hand, this makes it possible to support the scholarly work of individual researchers (or research groups); on the other hand, the virtual scriptorium St. Matthias can be “edited” as a social edition by the research community. One of the key questions will therefore be whether and how collaborative data modeling can be designed. www.stmatthias.uni-trier.de

Featured Abstract: Feb. 23

Featured Abstract: “Where Semantics Lies”

Stephen Ramsay, University of Nebraska

Should the syntax of XML have been scrapped in favor of s-expressions? This debate, which raged for years and which occasionally reappears, has all the ring of a religious war (Windows vs. Mac, Emacs vs. Vi, big-endian vs. little-endian). In this talk, I will suggest that while in general this discussion generated more heat than light, it pointed toward an important set of issues that bears on the problem of data modeling in the humanities. The question isn’t which syntax is superior, but rather, what does it mean for a syntax to have a semantics and (more critically) where does that semantics lie within the overall system?

I begin by claiming that our common definitions of “semantics” (within computer science) are too vague, and offer a definition loosely based on Wittgenstein’s notion of meaning as a function of use.  I then use that definition to distinguish between XML as a syntax that binds its semantics late in the overall computational process, and an s-expression-based language (like Lisp) that defines its semantics early.  I then pose the question: What would it look like if we were to imagine systems that take our present data models and bind them early?

The purpose of this exercise is neither to rekindle this debate, nor even to suggest that the conception of semantics within XML or s-expressions is flawed.  It is, rather, to reimagine our current data models as having options beyond what has been commonly offered — not just data to which we apply algorithms, but data that is itself algorithmic.

Featured Abstract: Feb. 20

Featured Abstract: “What is the Thing that Changes?: Space and Time through the Atlas of Historical County Boundaries”

Douglas Knox, Newberry Library

One would think that modeling historical administrative boundaries would be a straightforward matter, relatively free of the complications of fuzzier phenomena in the humanities. In fact, however, the precision of modeling tools throws the inherent difficulties of modeling administrative change over time and space into sharp relief. This presentation will draw on examples from the Atlas of Historical County Boundaries, an NEH-funded project of the Newberry Library completed in 2010, which documents every change in county boundaries in what is now the United States from colonial times through the year 2000. In addition to reviewing fundamental data modeling decisions of the project, the presentation will explore interesting edge cases, connections and similarities to other kinds of data, implicit models, and alternative ways of approaching the question of which objects of interest we imagine persisting through changes over time.

Featured Abstract: Feb. 16

Featured Abstract: “Digital Literary History and its Discontent”

Fotis Jannidis, University of Wuerzburg

Literary history and digital humanities present themselves to the interested observer like an unfinished bridge marked by a huge gap between the two sides. The side of literary history has been busy discussing the principles by which histories are constructed and the demand for ever wider concepts of their subject. On the side of digital literary studies there are various attempts to read ‘a million books’, stretching the notion of ‘reading’ to new limits. Most work has been done on classification, specifically on genre classification (e.g. Mueller or Jockers). Work to close the gap can start from both sides. My talk will discuss some of the concepts underlying contemporary literary histories, pushing towards a more formalized description. But this will only be possible for a very small part of the entire enterprise ‘literary history’, thus putting the methods of digital literary studies in a subsidiary role. And even when it is possible to describe some aspect (genre, concepts like author, reader etc.) more formally, most of the time this formal description cannot be applied automatically to larger collections of text. In a self-reflexive turn I will try to analyze how this ‘more formalized description’ is achieved, thereby describing the gap between the conceptual modelling done in any humanities research and the demands of a more formal description.

Featured Abstract: Feb. 13

Featured Abstract: “Objects, Process, Context in Time and Space – and how we model all this in the Europeana Data Model”

Stefan Gradmann, Humboldt University of Berlin

Once we start modeling complex objects as RDF graphs, that is, as aggregations of web resources in a linked data environment, we quickly get into questions regarding the boundaries of these aggregations, the ways we could describe their provenance, and the ways we could version them together with their context (and what are the boundaries of that ‘context’?). How do we model time and process context in such environments? Herbert van de Sompel has done some initial groundbreaking work in that area with his Memento project, but that is just a first step. We seem to have firmer ground for contextualisation on the spatial side: GeoNames, GeoCoordinates and the like seem to be much more stable conceptual areas. Maybe because the denotative aspect is stronger in space than in time?!

Featured Abstract: Feb. 9

Featured Abstract: “The Person Data Repository”

Alexander Czmiel, Berlin-Brandenburg Academy of Sciences and Humanities

I will present the data model of the Person Data Repository, a project based at the Berlin-Brandenburg Academy of Sciences and Humanities, which pursues a novel approach to structuring heterogeneous biographical data. The approach does not define a person as a single data record, but rather as the compilation of all statements concerning that person. Thus, it is possible to display complementary as well as contradictory statements in parallel, which meets one of the basic challenges of biographical research. In order to satisfy different research approaches and perspectives, the smallest entity of the Person Data Repository is not a person, but a single statement on a person, which is named “aspect” in the data model. An aspect bundles references to persons, places, dates and sources. With suitable queries it will be possible to create further narrations, whose first dimension is not necessarily a person, but possibly also a time span or a certain location. Additionally, every aspect is connected to its corresponding source and to current identification systems, like the LCCN or the German PND. This supports scientific transparency and compatibility with existing and future systems. To collect and create aspects of a person we built the “Archive-Editor”, a Java-based tool with a user-friendly but powerful interface for the Person Data Repository.
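The aspect-centred model described in this abstract can be illustrated with a minimal Python sketch. It is not the project's actual schema or code: the `Aspect` class, its field names, the `group_by` helper, and all identifiers such as `person:A` or `source:1` are hypothetical placeholders invented for illustration. The sketch only shows how a person can emerge as the set of statements referring to them, and how the same statements can be regrouped by place.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Aspect:
    """One statement about a person (hypothetical fields, not the real schema)."""
    statement: str
    persons: tuple[str, ...]      # identifiers of the persons referenced
    places: tuple[str, ...] = ()  # identifiers of the places referenced
    dates: tuple[str, ...] = ()   # dates as plain strings
    source: str = ""              # identifier of the source of the statement


def group_by(aspects: list[Aspect], dimension: str) -> dict[str, list[Aspect]]:
    """Group aspects by every value found in one reference field.

    `dimension` is the name of a tuple-valued field: "persons", "places"
    or "dates". An aspect that references several values appears in
    several groups.
    """
    grouped: dict[str, list[Aspect]] = defaultdict(list)
    for aspect in aspects:
        for value in getattr(aspect, dimension):
            grouped[value].append(aspect)
    return dict(grouped)


# Invented placeholder statements; the first two contradict each other
# and are kept side by side rather than merged into one record.
aspects = [
    Aspect("born in place:X", ("person:A",), ("place:X",), ("1800",), "source:1"),
    Aspect("born in place:Y", ("person:A",), ("place:Y",), ("1801",), "source:2"),
    Aspect("met person:A", ("person:B", "person:A"), ("place:X",), ("1830",), "source:3"),
]

# A "person" is the compilation of all statements that mention them.
statements_about_a = group_by(aspects, "persons")["person:A"]

# The same data regrouped with a place as the first dimension.
statements_at_x = group_by(aspects, "places")["place:X"]
```

Because no person record exists apart from the aspects, conflicting birth statements from different sources need no reconciliation step; each keeps its own source reference.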

Featured Abstract: Feb. 6

Featured Abstract: “Discovering our models: aiming at metaleptic markup applications through TEI customization”

Trevor Munoz, University of Maryland

The activity of data modeling involves generating a description of the structure that data will have in an information system. In practice, for many current humanities text projects, the outlines of these descriptions and the information systems they will work within are already known: the vocabulary of the Text Encoding Initiative (TEI) and some kind of toolchain suited to storing, processing, and retrieving XML. This should not obscure the modeling challenges involved in humanities text projects. Indeed, it is in the investigation and manipulation of these complex systems of information representation, through the customization mechanisms provided by the TEI, that much of the intellectual contribution of “small data” projects to the digital humanities can be found. It is also at this point in a project that the roles of (digital) humanist and librarian are most closely aligned. An examination of the process of developing TEI customizations for several projects will show some of the decisions whereby the digital representations of texts become strategic models, and also where the strategic emphases of librarians and humanists for those representations begin to fall out in slightly different ways. As the primary case study among those presented, the development of the Shelley-Godwin archive project will exhibit how TEI customization as an act of data modeling looks backward to traditions of editing and forward to new kinds of computer-enabled processing in an attempt to develop a rich, critically engaged record of engagement with an important body of texts.