Case Studies–Historical Archives (March 16):
[Thomas Stäcker] Hello everybody. First of all I would like to thank the organizers for the opportunity to speak to you. What I am going to do is to present a joint project of the University of Illinois at Urbana-Champaign and the Herzog August Bibliothek. I am afraid that I may disappoint you in this presentation because what I am going to say is presumably neither exciting nor new. But it may show how data modeling works in an actual humanities project.
The project I am talking about is one in a series of projects. It started in 2002, which may explain what I will say later about the standards we adopted. Nonetheless, it got funding for 2009 to 2012 and will be finished by the end of this year. Before discussing modeling aspects, I would like to briefly familiarize you with the project, early modern emblem books, and its objectives. Just some people involved: Mara Wade is the principal investigator. And the goals: by the end of this year, 1800 emblem books will be digitized. The emblem book is a genre that became popular in the 16th and 17th centuries. We will transcribe more than 15,000 mottos, and 10,000 picturae will be indexed with Iconclass notations. We plan to create, of course, a joint portal for emblem research.
This is the starting point of emblem book production. The first emblem book is said to be written by Alciato, Alciato’s Emblematum Liber, which appeared in 1531 and became extremely popular in the following years. This is why some call this century an emblematic one.
So what is an emblem? The first time I traveled to the United States, I was invited to a workshop addressing issues of emblematics at the University of Illinois. At the customs station in Chicago, while I was being questioned about what I was planning to do in the States, a customs officer actually asked me this very question. Apparently he had no clear idea what an emblem book is and probably classified emblems as rather suspicious. Otherwise he would not have spent a total of fifteen minutes on the interrogation, even though there was a very long queue, with three planes coming in from Europe at that time, all waiting for my explanation. So I did my very best, but without much success. Eventually I ended up calling it a sort of “ancient advert.” This seemed to satisfy him, and I was allowed to pass. Afterwards, I wondered if this really was a good description, and to some extent it is, because there is a picture and a text that riddle and play on the content of the message, which makes it more attractive and convincing, to sell the stories, so to speak. Ideally, an emblem consists of three parts: a motto, a pictura, and an epigram or subscriptio. Sometimes a book contains sophisticated compositions of emblems. As a book genre, it was international and multilingual, including translations into several languages such as Latin, Dutch, French or Italian. Quite often the components of an emblem are scattered all over the book and must be collected and recombined by the reader in order to make sense at all. But not only are the textual parts challenging for data modeling, so are the pictures.
Here is a series of descriptions relating to one engraving, once again consisting of various emblems. Obviously there are at least two different structures involved. First, the book structure or page sequence. Second, the emblem structure, consisting of motto, pictura, and epigram. Both of them are important for the encoding, and they refer to different entities: the book and the emblem. I confine myself to emblems and books, though of course there are other appearances of emblems, such as church ceilings, maiolica, porcelain ceilings, paintings, architecture and so forth. All or some of these may be intricately interconnected and interdependent, leading to specific modeling problems, but I will put this aside for the moment.
At the beginning in 2002 we had to face two major problems: there was no emblem standard ready at hand, and the question of identification demanded a solution. I will come to the latter later. The problem of which standard to apply presumably arises in every modeling process. Is it better to adopt a given modeling scheme, ontology, or standard, or to create a new one? Although the first is certainly preferable, in some cases it may be worth considering whether a new one is more appropriate for the material and easier to implement. In the case of emblematics, we decided to create a new one, based on a descriptive model that Stephen Rawles of the University of Glasgow put together, “the Spine of Information.” This decision may be debatable, but it worked fine. Implementing and sharing data turned out to be comparatively easy with the emblem namespace, as we called it. I do not want to go into much detail here but just give you an impression of how existing standards such as the TEI header or MODS are included and what the emblem schema looks like.
Here is the view in Oxygen. At first we used the TEI header for bibliographic descriptions, but later on we turned to MODS, as this is more popular in libraries. Certainly you can include other standards if this is preferable. Here is the construction of the components of an emblem, and here you have a view of an XML representation of the emblem, consisting of motto, transcription, the various translations of the transcription, the pictura, and the Iconclass notations included in the schema. And now the other structure, the book structure, as it is embedded in a structural description of the book.
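As a rough illustration of the emblem structure just described (motto with transcription and translations, pictura with Iconclass notations), the following Python sketch builds a small XML fragment. The namespace URI, element names, attribute names, and sample values are all hypothetical placeholders chosen for this example; they are not taken from the project's actual emblem schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace and element names, for illustration only;
# the real emblem schema may use different names and structure.
EMB = "http://example.org/emblem-sketch"
ET.register_namespace("emb", EMB)


def q(tag):
    """Return the namespace-qualified form of a tag name."""
    return f"{{{EMB}}}{tag}"


def build_emblem(emblem_id, motto, translations, iconclass_notations):
    """Build one emblem element with a motto and a pictura part."""
    emblem = ET.Element(q("emblem"), {"id": emblem_id})

    # Motto: one transcription plus any number of translations.
    motto_el = ET.SubElement(emblem, q("motto"))
    ET.SubElement(motto_el, q("transcription")).text = motto
    for lang, text in translations.items():
        ET.SubElement(motto_el, q("translation"), {"lang": lang}).text = text

    # Pictura: indexed by one or more Iconclass notations.
    pictura = ET.SubElement(emblem, q("pictura"))
    for notation in iconclass_notations:
        ET.SubElement(pictura, q("iconclass")).text = notation

    return emblem


# Placeholder values, not real project data or real Iconclass notations.
emblem = build_emblem(
    "E000001",
    "Festina lente",
    {"en": "Make haste slowly"},
    ["NOTATION-1", "NOTATION-2"],
)
xml_text = ET.tostring(emblem, encoding="unicode")
print(xml_text)
```

Bibliographic description (TEI header or MODS) would sit at the book level rather than inside each emblem, which is why it is left out of this fragment.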
By the way, this structural description is a sort of proto-standoff markup. We thought it up in 2000, which was very early, and we considered it a good idea to encode the structure as if we had a complete text, but leaving out the text itself. This is nothing other than a prototype of standoff markup, I think, except for the identification of the relation to the text, which caused us some problems.
As already mentioned, one crucial issue in all scholarly work is identification, in that it provides the basis for quoting, referencing, and, in digital environments, linking resources. The question is how unique identification can be secured across the various layers and entities involved. Here we can, again, take advantage of the FRBR model, even if it has been somewhat discredited at this conference. I still dare to use it. We can use its levels to characterize and describe the particular modeling problems we had and still have. All four levels may be identifiable; identifiers may be attached to them. What we deal with at the moment in this project is just the edition and the copy level, and we are desperately looking for a work identifier and a way to describe things at the work level. Typical identifiers at the edition level in the traditional humanities are bibliographic numbers, be they in an emblematic bibliography such as Landwehr or in a national union catalogue. What we are missing in these catalogues is a URI-based reference scheme. Some of the union catalogues in Germany, and I think in the [United] States as well, do offer persistent identifiers for bibliographic records.
However, except for some bibliographies such as the VD 17, which is a database of imprints of the 17th century, there is still no national unique identifier for an edition, let alone a global one. What we were able to provide in the project are globally unique URIs for the copies or, to be more precise, for the emblems that are in them. Accordingly, two identifiers were assigned to two copies of the same edition. Identical emblems from different copies received different identifiers. To ensure uniqueness, we made use of a handle service at the University of Illinois. Here is a screenshot of the handle service. Here you can ask for a URI for an emblem you are describing. We are aware that this can only be considered a first step. Further research, and this is really research at this stage, has to be done in order to link the various identical emblems together. So we might have identical emblems scattered over all of the copies we have, and you have to bring them together, maybe via OWL sameAs labels or indicators. We hope that this will eventually lead to an emblem work identifier. This is the overall goal: to bring together all the emblems we have identified by identifiers and to identify the emblem at the work level.
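To make the step from copy-level identifiers to a possible work-level grouping concrete, here is a minimal Python sketch. It assumes that researchers have recorded pairwise "same as" statements between emblem identifiers and derives the resulting groups with a simple union-find. The identifiers are invented placeholders, not real handles, and the grouping logic is only one possible reading of how sameAs-style links might be consolidated.

```python
from collections import defaultdict


def group_same_as(emblem_ids, same_as_pairs):
    """Group emblem identifiers that are connected by 'same as' statements."""
    parent = {e: e for e in emblem_ids}

    def find(x):
        # Follow parent pointers to the group representative,
        # shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in same_as_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = defaultdict(set)
    for e in emblem_ids:
        groups[find(e)].add(e)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical copy-level emblem identifiers (placeholders, not real handles).
ids = ["copyA-e1", "copyB-e1", "copyC-e1", "copyA-e2"]
pairs = [("copyA-e1", "copyB-e1"), ("copyB-e1", "copyC-e1")]

groups = group_same_as(ids, pairs)
print(groups)
```

Each resulting group is a candidate for one work-level identifier. Whether two emblems really belong in the same group remains a scholarly judgment; the code only consolidates statements that have already been made.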
Let me close with some slides relating to the presentation of the emblem books we have. First, the book structure involved in […] This is a slide that refers to the prototype of our database, so you are the first to see this. Check back maybe in a week or so; I do not know whether it is functional at the moment. Here we try to provide access to the different levels of an emblem. So here you can search books and emblems, search books only, search emblems only, or browse by Iconclass notations. Here is the emblem-level access page, where you can select emblems and look for details, mottos and so forth, and the pictura-level access page, where you can search for Iconclass headings. What I think is interesting is the Iconclass browsing view, where you can make use of the Iconclass notations independently of the language used. The Iconclass browser allows you to search in English, in French, in German, in Swedish, in Finnish, and in Italian of course. So this is a very comfortable way of doing research on pictures. Here is the same browser provided by Arkyves, which you can embed in your own page via an iframe. We use an OAI interface for delivering emblem-schema or book-level metadata. We also implemented an ORI interface, not for the public but for internal reasons, so that […], who was a partner of the project, and Arkyves, who was the other partner, were able to harvest all our data. They enriched our data. All of this happened on the basis of the emblem schema we provided for that, and we harvested the enriched data back into our database, so it is a nice way to collaborate in a digital environment. I leave you with the issues of modeling. Thank you very much.
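The harvesting workflow mentioned above can be sketched at the request level. Assuming the OAI interface follows OAI-PMH, a harvester asks for records with the `ListRecords` verb and a `metadataPrefix` naming the wanted format. The base URL and the `emblem` prefix below are assumptions made for illustration; the project's real endpoint and prefix names are not given in the talk.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix, set_spec=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        # Optional selective harvesting by set.
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and metadata prefix, for illustration only.
url = list_records_url("https://example.org/oai", "emblem")
print(url)
```

A partner would fetch this URL, enrich the returned records, and expose them through an endpoint of its own, from which the originating database can harvest them back in the same way.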
[Wendell Piez] I thought this was really interesting, that you referred to FRBR as an authority and yet the project just explodes the FRBR model. You are working with a genre in which the traditional definition of work in FRBR dissolves in this commingling of different emblem books, and in the way different emblem books replicate the same emblem with variation. How you define what a single emblem actually is is extremely problematic and interesting, and obviously you have had to grapple with that from early on. So Iconclass is the location for an identifier for a particular emblem that you’re going to […]. So that, for example, if you have an emblem for faith, depicted in various forms in various editions in different emblem books, even if it is not exactly the same picture, or it is somewhat changed, or the mottos are different, or the subscription is completely different, you can say that this is a faith thing that has an identity across that, correct?
[Thomas Stäcker] Exactly. So Iconclass is a thesaurus of iconography. They index concepts. They try to index concepts. Maybe Max [Schich] can explain it better — he’s an art historian. It is very powerful, to a certain extent. You cannot do everything with it, but it works fairly well with the emblems and maybe with baroque paintings. You can look for Jesus’ cross, what is the meaning of a particular depiction. This turns out very efficient in a multilingual context. As to your first question, this is a very difficult question, and in a way, we can just put that aside by saying that we identify every emblem by a unique identifier in every copy. We are aware that there are multiple appearances of the same emblem in various copies. The next step is the step of doing research on this, because we have to compare one copy with another copy and say “this here identification is identical with that identification.” That is research. We have got to do it, given the project we have.
[Wendell Piez] But you see, in effect, what Allen [Renear] suggested with FRBR is that we can relax some of the problems that we experienced with the FRBR mapping into domains other than the classical bibliographic domain, if instead of talking about manifestations and expressions, we simply talk about the simple structure level, which then can be represented by another simple structural level, where work remains at the top and then we have copy down at the bottom. Or, what is the term for the bottom line? Item. So in effect I think what this is suggesting is that we need to have that recursion also at the top, because work is not where it stops. We continue to go up, right? You have this production printer who printed emblem books for 30 years, but his emblem books have some of the same emblems as that guy over there in France. And yet they have variations?
[Thomas Stäcker] So what is the work then?
[Julia Flanders] This approach also recapitulates this erroneous suggestion that we start from the bottom and just derive similarities as we can with research, as you say.
[Thomas Stäcker] A good example of the problem we have here with the notion of work may be the tradition of Petrus Lombardus. This manuscript tradition is very sophisticated, and most researchers say there is no starting text, so to speak, but every century added to it or created its own Petrus Lombardus. So what actually is the work? This is very hard to say. On the other hand, these are all called Petrus Lombardus. So it may turn out to be something like that: they belong together in having the same name without being the same.
[Maximilian Schich] I have two remarks. One of them is very practical and the other is very, very conceptual. First thing: I think it is awesome to do exactly that: to identify the instances. Both the practical and the conceptual remark are basically saying that images and art history in general don’t know […]. So there is no thing where we can say “oh, these are the instances, so they are copies of the same kind of concept, which is the work,” right? So there is no one node where you could say […]. So even if you take completely identical ones and link them up as the same […], that is very dangerous; this is my practical remark. Because you have “n” times “n – 1” links, so twenty instances would be 380 links […] do by machine. Especially if you want to change something, you have to change all these instances, which is horrible. The other thing is, it is very likely that there will be a couple which are always used as examples, which are actually identical, and you can say “ok, this really is the repetition of this one model.” But if you look at it, like this lobster and globe thing, there will be […] other things and allusions to that, and if that’s actually the case, we actually have to do scholarship, and the scholarship not only involves writing texts and having a series of figures but also involves naming. To say, ok, here maybe we have one node where that is the type which is actually the version […] and there are other things where we say, really, a cloud with a […] of things that overlap and the structure of the map, which is very, very interesting. And the other thing is that Iconclass is really very subjective. So if you say Lobster or Globe, is that thing an equator going over it or whatever? And so we learn more about the curator’s classification than about Iconclass’s classification.
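The arithmetic behind the practical remark can be checked in a few lines of Python. The sketch compares linking every instance directly to every other instance with linking each instance once to a shared node. The shared-node alternative is an assumption added here for contrast, loosely following the "one node" wording in the remark; the function names are illustrative.

```python
def pairwise_links(n):
    """Directed links needed when each of n instances points to every other one."""
    return n * (n - 1)


def hub_links(n):
    """Links needed when each of n instances points once to a single shared node."""
    return n


for n in (5, 20, 100):
    print(n, pairwise_links(n), hub_links(n))

# Twenty instances: 20 * 19 = 380 pairwise links, against 20 links to a shared node.
```

The pairwise count grows quadratically, which is why any later change to one instance touches so many links.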
[Thomas Stäcker] I fully agree [that] Iconclass has its deficiencies. I see that. But it is, again, in view of that it is still a good tool to use here. I am not talking about the principal reliability of Iconclass for describing that sort of object, but it is a very practical and good tool to find these objects at all, because otherwise we have no chance to get to them. This is the sort of discussion I used to have with art historians, because every art historian tells me that Iconclass is awful. And it is true, sure. I ask him “what shall I do?” And he says “I have a new system. My system.”
[Wendell Piez] I think the anxiety comes from the implicit derivation of Iconclass with this . . . superwork thing which does not hold.
[Thomas Stäcker] I fully agree. Indeed, we have to distinguish between the magic of the idea being implied here and the completely practical approach of getting to the things. What we offer here are, in the first place, identifiers. Identifiers of pieces you can put together. What the reasoning is behind this “putting together” is another question. People like Henkel and Schöne, who put together these various emblems in one book, were doing exactly that. They compared the emblems, identified them in the books where they found them, and compiled them. This is an advantage. It is debatable from another angle, but if you have identifiers to refer to this as that and this as that, then you can maybe split and say that these are not the same errors, it is something else. These are similar errors. You can build a new ontology on these sorts of judgments, I think. What we’re providing is just a basis for doing research on emblem books.
[Kari Kraus] I actually really wanted to talk about Iconclass because I used to work at the William Blake Archive a million years ago, and we had exactly that problem: we had to figure out how to mark up images, and how to mark up the content of images. We ended up developing our own proprietary system, but in retrospect we could have very easily used Iconclass, and I do not entirely know why we rejected it, because we were aware of it. I think it is interesting that you say art historians hate the program, because of course it was designed by an art historian, developed by them. And I think one of the problems is that it’s proprietary. Theoretically, don’t you have to pay to use it?
[Thomas Stäcker] No, it has been free for about three or four years. The person in charge is Hans Brandhorst, so if you are interested you can reach him. You can even enhance and improve it; every suggestion is very welcome there. Get in touch with the Iconclass people themselves. This was one of my concerns, that it was proprietary and that you have to pay for it. If you go online with this business, you would have high costs.
[Kari Kraus] So one thing that’s always intrigued me about Iconclass is a feature that’s widely underused. In fact, I know of no instance of it being used for a project. And that is that it makes provisions for expressing relational information between the different components or parts of an image. You can use the colon sign. It can be used very denotatively. If you have a shepherd holding a flute, you can code for shepherd, use the colon, then code for flute. It expresses that there is some relationship between those things. It does not really express the nature of that relationship, but it does connect the two. One of the ways in which the manual suggests it might be used is to express more metaphoric or interpretative information. I think this is relevant because in a number of discussions we’ve thought about how you can have the affective triples or contradictory markup and so forth. So I always thought this would have enormous implications in the Blake Archive, because Blake uses so many visual puns. So you might use a snake as a wheel of a chariot, or snakes to represent horns on an animal. The Iconclass system allows you to search, for example, for all the ways in which “snake” was being used metaphorically. So your source domain is something. . .you would basically set it up as source domain and target domain separated by the colon. That way you could always search across one particular target domain and see what was being used as a source domain in your metaphoric expression. Do you know of any instance of anyone making use of that Iconclass feature?
[Thomas Stäcker] Well, Hans Brandhorst did once. He told me about it once, but I cannot remember the project he was involved in.
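A small Python sketch may help picture the colon mechanism described here. It assumes that a relational notation is stored as two codes joined by a colon, read as source and target, and builds an index from each source code to the images and targets it is paired with. The codes (`SNAKE`, `WHEEL`, and so on) and plate identifiers are plain-word placeholders, not real Iconclass notations, and the source/target reading follows the suggestion in the discussion rather than any documented Iconclass rule.

```python
from collections import defaultdict


def parse_pair(notation):
    """Split 'A : B' into (A, B); return None if there is no colon."""
    source, sep, target = notation.partition(":")
    if not sep:
        return None
    return source.strip(), target.strip()


def index_by_source(records):
    """Map each source code to a list of (image_id, target code) pairs."""
    index = defaultdict(list)
    for image_id, notations in records.items():
        for notation in notations:
            pair = parse_pair(notation)
            if pair is not None:
                source, target = pair
                index[source].append((image_id, target))
    return dict(index)


# Placeholder codes and image identifiers, for illustration only.
records = {
    "plate-1": ["SNAKE : WHEEL"],
    "plate-2": ["SNAKE : HORN", "SHEPHERD : FLUTE", "TREE"],
}

index = index_by_source(records)
print(index["SNAKE"])
```

Indexing by the second component instead would answer the reverse question: which source images are used for one particular target.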
[Kari Kraus] It is a really interesting project. And I have actually known about this project for a number of years.
[Julia Flanders] I have a question that I am going to go ahead and jump in and ask. A question that I wish Kari [Kraus] would ask of the nature of identicalness between images. It seems like the question you are asking researchers to make up their minds about, but I am wondering if you have arranged to make provisions for different kinds of identity or different degrees of identity between images? Like, yes printed from the same plate. Or printed from a revision of the plate. Or printed from a re-framing of a plate. Or “inspired by” or “allusion to.”
[Thomas Stäcker] Well, I am afraid that the modeling has not reached that stage yet. But actually, next week I will be getting in touch with people to think about exactly that, because it makes much more sense to do the research now that we have got the basis for it. Part of the next project we are going to apply for, and ask money for, should be to develop such a model for describing the certainty of assigning identity between entities. I think there are some precursors, so to speak, that one can make use of. There are other data models available that you have to look at to see what possibilities they may offer.
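Since no such model exists in the project yet, the following Python sketch is purely hypothetical. It shows one way the kinds of identity named in the question (same plate, revised plate, re-framed plate, inspired by, allusion to) could be recorded together with a certainty value and the name of the person asserting the relation. All class names, field names, and the 0-to-1 certainty scale are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Relation(Enum):
    """Kinds of relation between two pictures, as listed in the question."""
    SAME_PLATE = "printed from the same plate"
    REVISED_PLATE = "printed from a revision of the plate"
    REFRAMED_PLATE = "printed from a re-framing of the plate"
    INSPIRED_BY = "inspired by"
    ALLUSION_TO = "allusion to"


@dataclass(frozen=True)
class IdentityAssertion:
    source: str         # identifier of one emblem (placeholder values below)
    target: str         # identifier of the emblem it is compared with
    relation: Relation  # what kind of sameness is claimed
    certainty: float    # assumed scale: 0.0 (guess) to 1.0 (certain)
    asserted_by: str    # who made the judgment


def strict_matches(assertions, min_certainty=0.8):
    """Keep only same-plate assertions at or above the certainty threshold."""
    return [
        a for a in assertions
        if a.relation is Relation.SAME_PLATE and a.certainty >= min_certainty
    ]


assertions = [
    IdentityAssertion("copyA-e1", "copyB-e1", Relation.SAME_PLATE, 0.95, "researcher-1"),
    IdentityAssertion("copyA-e1", "copyC-e7", Relation.ALLUSION_TO, 0.60, "researcher-2"),
    IdentityAssertion("copyA-e2", "copyB-e2", Relation.SAME_PLATE, 0.50, "researcher-1"),
]

print(strict_matches(assertions))
```

Keeping the relation type and the certainty separate lets weaker relations such as allusions be recorded without being treated as identity.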
[Maximilian Schich] It is polemical, certainly. But there is an obvious spectrum, right? There are very useful uses, right? It makes very good sense to use it, because any classification is better than no classification, and having a system is better than having no system. But at the same time there is this idiosyncrasy thing. So, you can say “OK, we agree, Iconclass can take everything and it is not subjective,” but in the end, everyone has a subjective idea about Iconclass. If you measure their use, you find out that very little of it is actually used. It is perfectly legitimate to say “okay let us count the prompts” and say “ID, ID, ID,” or whatever. Without using Iconclass, use through the hashtags. The result is a little bit different but also useful. At the other end of the spectrum, there are artists and art historians. I know of one particular case, Chris Derkland, who was the curator of the […] in Munich; he once said to me that “everything I do in my career is to disprove Iconclass.”
[Fotis Jannidis] When you described how you think of the requirements, the usage requirements for the data model, did you—going back to something Kari [Kraus] said before—did you talk to the researchers or do any […] whatever? Or did you just say, I have a rather clear understanding what emblems are and therefore I am part of a community who knows what emblems are and we are trying to reconstruct?
[Thomas Stäcker] Well, there is a community and there was a very vivid discussion about the components of emblems and what we are going to describe and make it feasible, so to speak. I have already mentioned Stephen Rawles, who is the father of this information. We only translated this description of what this emblem is about into an XML, in the schema. So we tried to adopt an opinion of a particular group of researchers in a data model.
[Fotis Jannidis] Can you also say whether this opinion of what the thing is, of this community, is somewhat related to what they’re doing in their research?
[Thomas Stäcker] There is a wonderful project in Glasgow, a digitization project on emblems. This is the context out of which this model was developed as a descriptive model. They did not use a schema. The need for a schema turned up when we tried to share data. Before that, it was an isolated project. It was a description of what is going on, but further discussion in groups showed that we needed something to rely upon in a technical way. So we created a schema, and this was the basis for all further work with the service providers. . .so we went to Iconclass, since Hans Brandhorst is the owner of this company Arkyves and he is at the same time running Iconclass, so to speak. These people are very good at indexing with Iconclass, so we said, you should do that, but we need some sort of communication protocol. Here the emblem schema comes into play. But of course, it relies on the descriptive model the community gave us. There was some discussion afterwards of course, because [we had to ask things like] “do we have to adopt something?” or “do we need this really?” That sort of thing.