Panel discussion–Data modeling and humanities pedagogy

Panel discussion–Data modeling and humanities pedagogy (March 15):

Elisabeth Burr, Elizabeth Swanstrom, Susan Schreibman, Elena Pierazzo; Julia Flanders [moderator] (video)


[Julia Flanders] First of all, what’s distinctive about humanities data modeling as distinct from the modeling of data in other fields from a specifically pedagogical perspective? So what’s distinctive about teaching data modeling in a humanities context? How have you found it coming up in your own teaching? And the second question is: what value does data modeling hold for the digital humanities classroom? Or, in other quasi-pedagogical contexts, for example, ongoing professional development, as traditional humanists, what value does data modeling hold in that sort of professional transformation that the field may be undergoing? The third question was: what kind of conceptual challenges do humanists face in understanding data modeling as part of their work, or in understanding it at all? (Maybe, as another question.)

Finally, consider strategies for teaching data modeling, broadly speaking, to humanists. What kind of explanatory models have you seen working particularly well or particularly poorly? So I’m putting those questions out paradigmatically to start things off, but maybe I can just repeat the first question as a starting point: What’s distinctive about humanities data modeling as distinct from the modeling that may happen in other fields, from a pedagogical perspective? What’s distinctive about teaching data modeling in a humanities context? And if that’s a terrible starting point, feel free to pick one of the other questions to start with. You can have an insurrection and say “We hate that question.”

[Elena Pierazzo] Well it’s a difficult question because we have no idea. That’s the answer. I just teach modeling in the humanities and thus I have no idea what people in Mathematics teach. Or in other fields. I know only what we teach in my specific community, which is even different. So I do think it’s much more helpful for myself not to answer this question, because I don’t have an answer, but to go onward with another point.

The second question— “What does data modeling hold for the digital humanities classroom” —is very relevant for me. I teach XML, XSLT, TEI, but this is the most important bit of my teaching. I teach undergraduates, postgraduates, and PhD students, and these are people who aren’t too happy to be learning new technical things; I have to say, they normally resist a lot. But when I start on the modeling side of things, they fall back on being humanists again: they don’t have to learn angle brackets, they can take a breath, and they understand. That is actually the moment in which all the teaching I’ve been doing—conveying all the technical bits—makes sense, because they start to understand how to use it. To use what they know as humanities people and their knowledge, to put it into angle brackets. It’s the moment in which the click happens, and then they are much happier after. What is the challenge that they face? I do think the most important challenge is the fact that they aren’t used to looking at the data in the way we need them to look at it the moment they start modeling. So they have to look at exactly the same data they’ve been looking at—the text, basically—from a different perspective, looking at different features. So that’s the part that’s very hard for them, and we need to reiterate, reiterate, the same actions and goals for the documents: saying “This is a feature,” and they’ll say “Oh, right! I’ve never thought about that—yes, I know it’s a title, I know it’s a name, but why does it mean that?” And you say, “Well, it is a name, but it’s also something you may choose, or not, to encode.” And that is the part that is challenging for them. They are looking at material they’re familiar with, but with a different eye.


There are two different strategies I normally use. The first is what we call the “top down” approach, meaning “What do you want to achieve?” Visualize your website, the final website, what you want to do, and think backward about what you need to do to achieve these results. Because it’s easier to understand thinking of a final product than when thinking of the document they have, then modeling it from there. That’s what we call the “top down” approach.

The other, the second one is what I call the “theoretically rich” approach, in which we make them reflect on the modeling they already know. For instance, the theory of communication—[Roman] Jakobson, [Ferdinand de] Saussure—those are models of language, of communication, that most of them are familiar with because they studied them at the undergraduate level. So we say “Okay, this is a model; think on these terms.” Using the models they already are familiar with to apply to the documents, and start to make them understand what they’re actually doing—they’re applying those models somehow. And that helps because, again, they are using familiar knowledge, and they can make the step forward to something that’s more unfamiliar. That’s the second approach.

The third “trick”—and that I call a “trick”— is to use materials that they are familiar with. So if I’m talking to Medievalists, use medieval manuscripts. If I’m talking to an epigraphist, use stones. If I used the […Italians to colors, Italian material…], and so on and so forth. Something that I know they are expert at, so I can show them what they can do with their own material. That is my trick.

[Syd Bauman] Can you repeat the names of your first two methods?

[Elena Pierazzo] Sorry. The first one is called the “top down,” meaning, the final result. So we usually use “wireframing,” so we use the wireframing to do the website, to mark up the website, and then we work backwards. The second method I said is the “theoretically rich,” so I get down all the theories that they’ve been hearing, structuralist, poststructuralist, and I ask them to actually “stick” them into XML. And that’s the second model, just to answer your question.

[Elisabeth Burr] So perhaps I take over. My background is a bit further from Digital Humanities in the end. I teach Romance Linguistics—linguistics in a very broad sense. That means also perhaps the history of the science and social linguistics and all of these sort of things. But in my courses I always do not just want to teach my students something about linguistics, but also something about research methods and something about how you do research, actually, and above all, sound scientific approaches. That means, for example, also academic paper writing. Now my students in Romance Studies, as Elena already said, I mean, all these things which we are doing nowadays: markup, ontologies, data, and all these sorts of concepts, in a way, they have been always implicit. No one ever talked about data even if a text was always there. Everybody used text or filing cabinets; you bought boxes, you ordered them, you created some sort of ontology in the end, and you set up relations between them. But nobody ever looked at them like this. And so, doing research was always, and also writing, a scholarly paper, was always looked upon much more as a creative, mental process. This is still the case in Romance Studies in the end because, even with the computer, there’s no way of working or no approach which would teach students to exploit sensibly the technology to do all the work. This is not in place. My students normally arrive at the class, they write an academic paper, but they aren’t even conscious that the layout is a structure—what is a text? That a text is a structured entity, that there’s a systematic layout, that things mean things. They’ve got no idea about data. It’s very difficult for them even to set links between the sources they use and the bibliography. All these sorts of concepts are not really explicit, implicitly, and it’s not very consistent.

The approach I’ve been taking lately was to set up a project where the research question was about “What is the influence of the neogrammarians on Romance linguistics?” During this project, students had to mark up texts because the texts needed to be included in this portal we created. By marking up texts, by using oXygen, for example, they learn a lot about text. First of all, they notice how even their own texts are always marked up. They understand what “markup” is actually about. There’s a structure; if you define these elements, you can also change them. They also learn that rendering is a different level from the data itself. Another sort of thing which worked very well during this project was Endnote, because the students had to insert the data they collected for their projects. They had to learn that you have to respect the fields, that different publications have different kinds of information bits they need. They also learned that stylesheets can be applied consistently to data, and that you can even change the rendering and have the data remain the same. Integrating all of this together is what actually I would hope for—conceptual change. Also that students don’t look at computers any longer like a typewriter. That’s what they actually do. But also as a document which is structured, which I can change. The whole thing is a process of writing such a paper nowadays, because by applying a different output, I can change the whole structure of the paper. All these sorts of things. So I think, altogether, it should change their understanding of what they’re actually doing, and that’s the most important thing.
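The distinction drawn here between data and rendering can be made concrete with a small sketch. This is an illustration only, not something shown in the session: the `Book` record, its field names, and the two rendering functions are hypothetical, and the bibliographic entry is an invented placeholder rather than a real publication.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Book:
    """A bibliographic record: each piece of information has its own field."""
    author_last: str
    author_first: str
    year: int
    title: str
    place: str
    publisher: str


def render_author_date(b: Book) -> str:
    """One possible 'stylesheet': author-date ordering."""
    return (f"{b.author_last}, {b.author_first} ({b.year}). "
            f"{b.title}. {b.place}: {b.publisher}.")


def render_notes(b: Book) -> str:
    """A second 'stylesheet' applied to the very same record."""
    return (f"{b.author_first} {b.author_last}, {b.title} "
            f"({b.place}: {b.publisher}, {b.year}).")


# Invented placeholder data, used only to exercise the two renderings.
record = Book("Author", "Ann", 2000, "An Example Title",
              "Somewhere", "Example Press")

print(render_author_date(record))
print(render_notes(record))
```

The record is frozen, so neither rendering can alter it: changing the output style leaves the underlying data exactly as it was entered.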

[Elizabeth Swanstrom] I could add to that. I don’t teach data modeling per se, I teach literature. Yet I think that there are certain things in terms of data structures that are extremely useful for teaching literature. One of them is teaching them formalism, the elements of works and how things hang together. One thing I like to do, just to show students how it is the texts they read online are marked up and tagged and coming to them highly mediated, is to show them this very wonderful, simple example from the TEI standards of [William] Blake’s “Rose Thou Art Sick.” I say, “Oh, you see this and you’re familiar with it, you had it in your high school literature classes, and you see this online and perhaps what you don’t realize is that there’s all sorts of data attached to these poems so that you can enjoy it and see it and read it.” But at the same time, if you look at these lines, you have a way to parse this poem. We realize simply by looking at the structure of this poem that it’s comprised of lines and stanzas. This is actually a very useful tool for students who, I can sympathize, might be resistant to learning technical skills. I assure you, they’re also resistant to learning literary analysis. So this actually in some ways provides them with the sense, “Oh, I can do this.” Perhaps this whole idea of a formal analysis is not perhaps as abstract as it seems. Here’s actually a way that it’s done in a very simple framework. So that’s something I find extremely useful as a way of teaching literature. I’m also always on the lookout for the wonderful types of visualizations that we saw yesterday in Wendell [Piez]’s presentation and today with Vannevar Bush, etc. These are things that I use in my pedagogy and that I think I’d like to see more of in my own teaching.
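As a rough illustration of how markup exposes the lines and stanzas of a poem, the sketch below encodes Blake’s poem with TEI-style `<lg>` (line group) and `<l>` (line) elements and counts them. It is a simplified fragment written for this purpose, without the TEI namespace or header, and is not the example from the TEI documentation itself.

```python
import xml.etree.ElementTree as ET

# Simplified, TEI-style encoding: <lg> groups lines, <l> marks one verse line.
POEM_XML = """
<lg type="poem">
  <head>The Sick Rose</head>
  <lg type="stanza">
    <l>O Rose thou art sick.</l>
    <l>The invisible worm,</l>
    <l>That flies in the night</l>
    <l>In the howling storm:</l>
  </lg>
  <lg type="stanza">
    <l>Has found out thy bed</l>
    <l>Of crimson joy:</l>
    <l>And his dark secret love</l>
    <l>Does thy life destroy.</l>
  </lg>
</lg>
""".strip()

root = ET.fromstring(POEM_XML)

# Direct <lg> children of the poem are its stanzas.
stanzas = root.findall("lg")
lines_per_stanza = [len(stanza.findall("l")) for stanza in stanzas]
first_line = root.find(".//l").text

print(len(stanzas), "stanzas; lines per stanza:", lines_per_stanza)
print("First line:", first_line)
```

Once the structure is explicit, formal questions (how many stanzas, how long is each, what is the last word of every line) become simple queries over the same encoded text.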

[Susan Schreibman] So I’ve gone from a position of not teaching for a long time to doing, it seems, nothing but teaching, but teaching a lot. I started a Master’s. There’s a PhD in Trinity, and I teach undergraduates. I’m realizing for certainly the Master’s and PhD, the methods that I’ve used, I think, don’t serve me or my students—I hope they’re not listening. Because I used to teach one course, and how much you can get through in one course is very different from teaching a whole Master’s, even in Ireland where Master’s are just one year. So I’m very grateful to Fotis [Jannidis] and Julia [Flanders] for having this symposium now, because I’d started to rethink how I might teach, and this gave me an opportunity to think quite differently. What happens if I were to frame some of the introductory courses in terms of modeling? I’m not sure I would call it that, because I think that might scare some people away and anyway it’s a big deal to change your course name. It seems to me that there are maybe four not very mutually exclusive approaches. One is teaching modeling through theory, something very theoretical, or something very practical—that you’re teaching them modeling not from a theoretical point of view but by doing it. Or by having them model something, and as I said, not telling them what they’re doing. So it seems to me that, we were talking about this yesterday or this morning about teaching research methods as well. Not only to our students in digital humanities, but to all the other students who do research methods—how important it is, and I hadn’t thought of this before, to teach them the skills that allow them to manage and model the information.

[Susan Schreibman] In fact, I think we’re going to be doing a disservice to our professions if we don’t teach the next generation of students how to in fact model their own data. Because this is how we all collect data now. As someone said, we go into libraries, take photographs, we have a lot of Word documents, you try to find them on your hard drive, you put them on Dropbox, and you do searches to see if you can find them. So it seems to me even in training, which I’ve done a lot of in summer schools, data modeling always seems to be implicit. I don’t think I’ve ever seen, and someone can correct me if I’m wrong, in a digital humanities summer school, certainly not one I’ve done, maybe unfortunately, there is no data modeling course. We teach it implicitly, so we teach XML, there’s data modeling, or we teach XSLT or we teach something else, and it’s implicit but nobody signs into the [motions a global sweep or encompassing] . . . So it seems to me that we’re missing a whole level. And it’s a question that maybe we started: what’s different? As well, I haven’t taught in a Computer Science department, but I suspect one of the differences is that we don’t have data modeling as a kind of primary method that we teach apart from the technologies themselves. It seems to me like that thing where you teach someone how to fish, they can eat for a long time. If we’re teaching them XML or we’re teaching them a particular technology, then they know how to do that; but it seems to me what we probably should be teaching them is a level up. So that when the new technology comes along, as it will in their lifetimes if they stay in this field, they can then apply that knowledge that we’ve taught them to the new technology as opposed to having to learn each one separately.

So I had a recent experience, and it came up around the time I’d been thinking about this for this workshop, and I asked my students as a part of a Digital Scholarship Editing course to evaluate other thematic research collections. They did, and they wrote about it in the old style, and they did very well. Then I asked them for their own projects to do wireframes of theirs. They didn’t do so well. It seemed to me that they didn’t make that leap between what they understood when they evaluated in text the other thematic research collections and what I was asking them to do, which was to make an abstract model of the site they were going to create. So then I thought, maybe there’s a different approach: What happens if I ask them to model a thematic research collection as an exercise? And I asked one group to model it in XML, and I asked another group to use PowerPoint because it’s the model that’s important, not the language that they’re using to model. That would also teach the kind of thing that you (points to Swanstrom) were talking about: it’s not the language we use to express it but the concepts either over- or underlying it. The last point I made, also in preparation for this, was that I remembered when UVA was thinking of doing a Master’s in Digital Humanities. They put a lot of information online. I remembered, in fact I went back to look, that two of the core modules are knowledge representation. I remember this came about—this course that ultimately didn’t come about, actually, but a lot of work is online about it—after a series of discussions and workshops they had at UVA to figure out what could a Digital Humanities Master’s look like. I was struck again by the centrality of the knowledge representation courses in this.

[Julia Flanders] This is fascinating. What I’m hearing from these different examples is a number of different approaches that come from different disciplinary perspectives. I wonder whether you guys and also, obviously this is open, could speak to the question of if we are to teach data modeling, if we’re responding to the urge to make it an explicit part of our pedagogy, is that something that can be done outside of a disciplinary context, or in a trans- or interdisciplinary context? Could you teach a course on data modeling for, say, humanities graduate students, or would it need to be for classicists, for literary students, for linguists? Is there a way we could imagine data modeling as a topic in itself in a humanities context?

[Elena Pierazzo] Well what I do teach is to Digital Humanities students who have all sorts of backgrounds, so they don’t have one type of background. I do use some different examples. So I think you can do that. There are things that are very specific. Actually, I do teach text, so they need to be text-grounded. But text is something that’s common within most humanities background, so I just make sure that they take some documents and material we look at, which is of wide interest. The last document we used was the Gettysburg Address, which was a various lot of digital images and different versions, so it was a good example. It could be interesting for many different backgrounds. So that’s the reason why I wanted to choose something “iconic,” that’s what I mean.

[Laurent Romary] I thought Elena would say exactly the contrary. I agree with what she said before about the importance of working on sources which are very close to the domain of expertise of the students, their background. I think it’s a matter of giving technical confidence to the students. It means to make them be in the position to say “Nope, I know my stuff. As a linguist, I’ve got some concepts I acquired over my previous year of training; as a historian I’ve got some concepts. I take a document and this guy is asking me to put this together as an XML document. I’ve just learned what XML was 5 minutes ago. And I discover I can’t do that, because I don’t understand the material.” So I think it’s very essential that if you have heterogeneous classes, you need to know how to organize that in subgroups where they can have this epiphany moment where they say “Ah-ha, yeah, I can build up my concept of something more technical than I thought.” It’s a very important issue.

[Fotis Jannidis] Is there anything besides the models we already have, general concepts we are thinking of — XML, or this kind of model. We say “This is so basic to data modeling that they have to know about this, without any relevance to the kind of studying they do.” Like indirection, abstraction, things like that would be something they’d do.

[Elena Pierazzo] That’s why I use the “theory rich” approach, in a sense, because I build on theory that they already know. The point is to make them shift from the empirical — looking at the specific to the abstraction, and to make the model become abstract. They do this already in many circumstances. If you look at the theory of communication, the theory of literature or language, things with which they’re already familiar, they are able to understand what the signifier/signified are. That’s the moment where abstraction already happens. That’s how I use it — for these reasons.

[Syd Bauman] So I’m just about the furthest thing from someone who’s an expert in pedagogy. The thought occurs to me when we talk about teaching modeling, I wonder if we don’t divorce ourselves a little bit from the idea that this is “data modeling” or the idea that this is “knowledge representation.” And rather think of ourselves as. . . what we want to teach first is just information management. And then, couched that way . . . maybe I’m wrong; to me it doesn’t sound this scary. I’m asking Julia’s question, “Can we do this for a broad spectrum of people rather than directly aimed at certain disciplines?” All of us have experience in certain things that we can take advantage of. One that jumps to mind, using your first example of information modeling, is what information do we want to collect to keep track of your own personal collection of photographs? Or something like that, that everyone can get their teeth into. Something like “Who took the picture? What camera did you use? On what day? Who’s in the picture?” Stuff like that. Model that first, and then step into texts in disciplines that students have some access to? Is that a reasonable point or am I being crazy here?
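The photograph exercise suggested here might be sketched as follows. This is a minimal illustration under stated assumptions: the `Photo` class, its field names, and the helper `photos_showing` are hypothetical, and the sample entries are invented placeholders. The fields follow the questions raised: who took the picture, with what camera, on what day, and who is in it.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Photo:
    """One photograph in a personal collection."""
    filename: str
    photographer: str          # Who took the picture?
    camera: str                # What camera was used?
    taken_on: date             # On what day?
    people: list[str] = field(default_factory=list)  # Who is in the picture?


def photos_showing(photos: list[Photo], person: str) -> list[str]:
    """Return the filenames of all photos in which `person` appears."""
    return [p.filename for p in photos if person in p.people]


# Invented placeholder data.
collection = [
    Photo("img_001.jpg", "Person A", "Camera X", date(2012, 3, 14),
          ["Person B", "Person C"]),
    Photo("img_002.jpg", "Person B", "Camera Y", date(2012, 3, 15),
          ["Person A"]),
    Photo("img_003.jpg", "Person A", "Camera X", date(2012, 3, 15),
          ["Person C"]),
]

print(photos_showing(collection, "Person C"))
```

Even this small model forces decisions of the kind discussed in the panel: whether a photo can have more than one photographer, whether "people" are free text or references to shared records, and which questions the collection should be able to answer.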

[Elena Pierazzo] It’s not for me, at least not in my experience. The higher you go, the post-graduate and PhD students regard these as patronizing. They say “I know these things, I’m a PhD student. Treat me as an adult.” They really want to focus directly on something that is relevant to what they do. It doesn’t work if you don’t give them a challenge or some material that’s likely to be a part of their experience. That’s my experience, though.

[Syd Bauman] And undergraduates too? Or . . . ?

[Elena Pierazzo] It’s weird; they don’t react that much, do they? (Laughs.) I’m sorry, I’m very naughty. But it’s difficult with the undergraduate students because they normally listen to you and not [a lot?], whatever you do.

[Susan Schreibman] I was going to respond to Fotis [Jannidis]. I was again re-reading things about modeling and I was looking at Willard [McCarty]’s article [“Modeling: A Study in Words and Meanings”] in the Companion to Digital Humanities again. He broke down modeling into these six areas: analogy, representation, diagram, map, simulation, and experiment. Again, it’s a kind of step up. (Motions her hands to signal hierarchy.) I wonder if a model, as it were, like that could serve as a way to teach modeling, but not specifically about a particular language or tool or . . . And abstract out and demonstrate how these “become” models, and how we use them both theoretically and very practically.

[Elli Mylonas] I don’t know whether to believe that one ought to teach modeling for knowledge representation without any tools or without a specific tool. Because it seems to me, especially if it’s without any tool, if you’re talking about teaching humanities students who want to work on something specific, or who hope to work on something specific, the tool is how you test it. One of the real “ah-has” for this kind of thing is when someone realizes, whether by abstraction or something else, that things start to fall into place. That your model makes sense as you answer a question with it. And I think in order to do that, you have to have a tool you’re playing with. Maybe you aren’t just focusing on one tool. . . but . . .

[Susan Schreibman] Yeah, I think that’s the point. I think often we teach tools and then presume they get it. Or they get something, but it might not be something they can abstract to the next tool that comes along. So the tool becomes the vehicle, not the purpose.

[Stephen Ramsay] Yeah I’m really [brought on?] to this question because I want what Susan and Elena have suggested exactly. I want both things to happen. I love this one-level outlook but I also […] I teach programming, and the thing you’re trying to say when you teach programming is that “We’re going to learn this particular language, this particular tool.” And then at the same time you want to say, “By the way, forget completely about the tool because the tool doesn’t matter, we’re talking about a concept.” They know that as they go deeper and deeper into the tool, we say, “No, no, no, no stop that.” So you’re going back and forth. One of the ways I’ve tried to get around this is by constantly peppering the discussion about specific technology with as many anecdotes about other technologies as I can fit in. This is how Java thinks about it, this is how LISP thinks about it, this is how […] does it. I risk confusing them completely, but I also feel like that’s a worthwhile risk. So if I were teaching data modeling, I think what I’d want to do is say “Let’s make XML our home base for humanities department for today.” Let’s take them off […] as often as we can in trying to say “by the way, if we’re talking about relational databases—in fact, then let’s talk about that, how would it look?” […] I’m glad both of those came up because that’s exactly what I want, both of those things.

[Susan Schreibman] Absolutely, because in a way, what you’re teaching them is that you can express this in XML or you can express it within a relational database and here’s what doesn’t work, maybe, but have them figure it out. So, could a way of teaching this be to set a problem, and if you have groups working, to have them work in the different technologies. Or have them model the same thing in different technologies but some way of allowing them to get to that space you’re trying to give them with other examples. I don’t know if this will work either, but it’s exactly what you were talking about, which I imagine was the original question: “Do they do this differently in engineering or the computer sciences or mathematics?” So that once you pick up one language, it’s easier to pick up another. Because it’s the concepts that are important.
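The exercise of modeling the same thing in different technologies could look something like the sketch below, which expresses a tiny correspondence collection once as XML and once as relational tables, then asks both the same question. Everything here is an assumption made for illustration: the element names, table layout, and sample letters are invented, and the standard-library `xml.etree.ElementTree` and `sqlite3` modules stand in for whatever tools a class would actually use.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Model 1: XML. Sender and recipient are nested inside each letter.
XML_MODEL = """
<collection>
  <letter id="L1"><sender>Person A</sender><recipient>Person B</recipient></letter>
  <letter id="L2"><sender>Person B</sender><recipient>Person A</recipient></letter>
  <letter id="L3"><sender>Person C</sender><recipient>Person B</recipient></letter>
</collection>
""".strip()


def letters_to_xml(recipient: str) -> list[str]:
    """Ids of letters addressed to `recipient`, read from the XML model."""
    root = ET.fromstring(XML_MODEL)
    return sorted(letter.get("id") for letter in root.findall("letter")
                  if letter.findtext("recipient") == recipient)


# Model 2: relational. People are stored once; letters point at them by key.
def build_database() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
    conn.execute("""CREATE TABLE letter (
                        id TEXT PRIMARY KEY,
                        sender_id INTEGER REFERENCES person(id),
                        recipient_id INTEGER REFERENCES person(id))""")
    conn.executemany("INSERT INTO person VALUES (?, ?)",
                     [(1, "Person A"), (2, "Person B"), (3, "Person C")])
    conn.executemany("INSERT INTO letter VALUES (?, ?, ?)",
                     [("L1", 1, 2), ("L2", 2, 1), ("L3", 3, 2)])
    return conn


def letters_to_sql(conn: sqlite3.Connection, recipient: str) -> list[str]:
    """The same question, answered with a join over the relational model."""
    rows = conn.execute(
        """SELECT letter.id FROM letter
           JOIN person ON person.id = letter.recipient_id
           WHERE person.name = ? ORDER BY letter.id""",
        (recipient,))
    return [row[0] for row in rows]


conn = build_database()
print(letters_to_xml("Person B"))
print(letters_to_sql(conn, "Person B"))
```

Both models answer the question identically, but they differ in what they make easy: the XML version repeats each name as text inside the letter, while the relational version stores each person once and must join tables to recover the name.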

[Julia Flanders] It’s also a question of institutional strategy. I know at many American universities the question is “If we’re going to teach topics like this, where do they get taught?” Do they need to be taught in the departments? Do they need to be taught across departments? Do you benefit from having archaeologists and linguists and Romance languages people all in the same classroom learning data modeling, or do you benefit from having each group being taught separately? And I think in an interesting way this maps onto the phrase “getting your hands dirty” that came into the discussion yesterday and really makes us ask “Do we think of modeling as the new version of a liberal education?” Is it like the classics in the 19th century British education system, where once you know that, you know everything, you can go everywhere? Or is it more like Chemistry or Engineering, where once you know it you can do a bunch of stuff? That bears on the question of what happens when a scholar understands modeling: is it that they’re getting their hands dirty or, on the contrary, are they keeping them very clean?

[Elena Pierazzo] I like this idea that Stephen mentioned of presenting all of those programming languages. What I intend to do for that precise purpose is to say “Okay, I’m not teaching everything; I’m teaching XML, TEI, XSLT and CSS in twenty hours.” I mean, it’s kind of a big challenge, and Trevor [Muñoz] can say that, because he was in one of my classes. I always say that I can teach you 40% of that, 30% of that, 40%, 20%, 5%, but the rest is yours, because I want you to learn how to learn, not just learn the technology itself. So the fact that within the course I teach so many technologies at once, in a sense, and all of them are to be used together, makes some sense. And the fact that the students have to go and learn by themselves a lot. I say that in five years, this technology will be obsolete. We’re not preparing you for five years’ work, we want to prepare you to become a Digital Humanities person, someone who can learn a new technology in five years’ time. First, you have to learn how to learn. That is what’s assessed as well.

[Susan Schreibman] I was just going to respond to what you just asked, Julia, because I, like Elena, teach whoever signs up for the Digital Humanities course, but I think there’s an advantage to having different disciplines there. The advantage to just having your own discipline? — it does make it easier. But the other advantage is to get people who understand or will represent the same objects very differently because they come from different theoretical and disciplinary backgrounds. Allowing students to see how others do that representation or the knowledge they take out of it is stronger, possibly, than the benefit of having all the same types of people in the room.

[Desmond Schmidt] I just wanted to progress that a little bit by asking the question, is it just the way that you abstract when you teach digital humanities students about data modeling, or is it really just about XML or about relational database? I don’t think there is anything much abstract. I think it’s almost all tied to […]… The analogy with programming languages isn’t a perfect one. You’ve got many different similar programming languages, they’re all […] functional or whatever, one can to some extent swap between them. But the relational database model and XML are not exactly similar. I think what you’ve been describing is just teaching XML data modeling. I think that’s all we can do.

[Fotis Jannidis] My question sort of relates to that. I’m still trying to get to the level where there aren’t any abstractions worthwhile in weighing out […] of teaching XML. I’ve found that data modeling is a model of a certain perspective. So if you make this clear from the very beginning, they don’t think they’ve modeled some ontology, something that’s out there but it’s a model of a certain perspective. And usually, as a researcher talking to young researchers, we try to say “Okay, you will have a research question and then you model what your discoveries.” But they don’t have any research questions, and basically I’m having to ask around. Do you have an answer on how to do this? How do you make them think up research questions? I’m teaching a class to undergraduates and at the very beginning of the research . . . they don’t have any idea of what a research question could be, so they don’t have any info on this and they’re basically not interested in data modeling at all. I would be glad to find a way around that.

[Elizabeth Swanstrom] I can speak a little to that in terms of literary studies because I would never, in my department I don’t think we would ever have a class per se on data modeling. Yet I think it’s absolutely crucial for literary studies courses to incorporate digital humanities tools, and that’s what I think I’d like to see: perhaps a gap narrowed there. For my classes, I think what Syd brought up. . . many undergraduates who are majors in English at my college and at different colleges where I’ve taught would not feel patronized if they were asked “What would you use to gather your photographs? What would you use to gather your documents? What online tool would you be comfortable with using?” And that would be a wonderful way in to get them thinking about how to translate the analysis of literary texts. One tool that I’ve been using more frequently is the Electronic Literature Organization Directory, which I’ve been a member of for a few years now. It’s a wonderful resource in terms of gathering scholarship and information about works of electronic literature, net art, digital art, etc. — but it’s also a great pedagogical tool. For undergraduate literature students, it gives them the opportunity to engage with these texts, write an entry, go through a peer review process. But at the same time, they’re responsible for tagging their own entries. They’re responsible for thinking about how these texts they’re looking at are broken down and how they might translate that to other people. So it’s something that I think is something of an overlap.

[Elisabeth Burr] I think we’ve got two different types of setup here. [Elena’s] trying to teach digital humanities and [Elizabeth] is trying to integrate more or less everything into our literature or linguistics classes. So I agree with you [Elizabeth] that in my context, at the moment at least, I cannot see separate classes for across the humanities, for example. If I want to do anything about digital humanities, then I have to do it in an integrated way. Now I’m all for the integrated way anyway. Also with academic paper writing there are all of these sources and service centers and things like that and I think that they just cannot work. The best thing is to bind it all up with the topic of the seminar, and then they have got a research question, because in the traditional German seminar you always have to write a paper on a certain topic, then you have to collect information like sources and bibliographical data and you have to work on a structure and all these kinds of things. Now I break this all down into individual elements and then, in order to make people more conscious about this process and what they’re actually doing: consistency, structure. I mean, an academic paper anyway is a very structured document, so there’s not much arty-farty going on in this context. You have to do it in a certain way. But to make it conscious why it has to be done in a certain way (because it has to be scientific, data has to be explicit, and these sorts of things), I think that sort of markup and databases can actually help a lot to make students conscious of what they’re actually doing. And at the same time, they can learn something about the digital humanities. So it’s all in one pot.

[Allen Renear] I teach a course called Foundations of Information Modeling. At the beginning of the course I say, “You will not learn information modeling in this course. This course is about the foundations of information modeling.” We then spend five weeks learning formal logic, we review discrete mathematics, we open formal communication theory, we look at the Chomsky hierarchy, we look at how DTDs grow out of BNF … We look at relational models, we learn […] and one of the two calculi, neither of which is implemented in software; it’s all on paper. I tell my students, “You want to learn SQL? Drive to Borders, buy a book. That’s what grownups do, that’s what I did, you can do it too.” I do believe that learning data modeling is best done in books, not in the classroom. So our students are master’s students, they’re here for 18 months: it’s like their last 18 months of school for the rest of their life, and I don’t want them doing in the classroom what they could do far better in the workplace, with […] and projects and deadlines and up-to-date software, and a paycheck, too. This is their last chance to learn theoretical foundations, and we focus on things like that. And they have to have some backing, they have been acquainted, just barely, with relational databases and conceptual modeling, so they’re not encountering these things for the first time, but when we do conceptual modeling, we concentrate on how to translate the rectangles and arrows into first-order logic so that they understand what the underlying model is regardless of how […] so I think there is a lot we can do to prepare students for becoming good data modelers, and I actually think that … that operation is not about learning data modeling, it’s about learning fundamental principles that are going to serve them for the rest of time.

[Wendell Piez] I think that’s great and I think it’s really important, but one of the other themes we’re hearing here is that there isn’t necessarily such a hard and fast line between data modeling as we conceive of it in the digital humanities as something that’s part of our practice and very traditional pedagogical roles of learning the methods and practices of research and literary formalism or linguistic categories; those are, in essence, themselves data modeling and always have been, long before the computer. We refashion that in some ways when we bring digital technologies to it, but at the same time there’s also a continuum, right? So it’s like there are two goals here which in some ways are at odds, but in other ways, maybe they’re potentially complementary, where we see data modeling almost as an end in itself, something that fits within the digital humanities, something that we do because we’re building projects, building resources. Yet on the other hand it’s not really an end in itself; it’s also a means to another end. I just want to put that on the table, because I think that if we don’t keep in mind that there’s that aspect to it, then it defeats the whole train of theory.

[Elena Pierazzo] What I was commenting on for Fotis [Jannidis] about “How to give research questions to students that are too young to have research questions” . . . I don’t think you can. What you can do is ask them to model for a public, for a specific target. You say, “Okay, build a website for this public.” Once they start to do that, they can do it because of their experience in the world. You can make them reflect on what they’ve done, their actual answers to a research question that this particular public will have. It’s a way around.

[Maximilian Schich] I think this theory thing is really important . . . Learning how something in XSLT is done is also very important. But one aspect that is missing is there should also be something like the practical experience, because especially if I look at myself, I learned … a lot of systems … which essentially didn’t talk to each other. And if you were a relational guy, the [XML guy?] wouldn’t talk to you and vice versa, and there was a graph guy […] so one had a really hard time to actually trace the way down, to say to the librarians at the university, at the Library of Congress […looking for a book?…] there was this guy, he had an idea which we now […] So I think the history is not written yet, I mean the practical history, not how the theory evolved, but actually how it was practically implemented, what worked out and what didn’t. And I think that’s the very interesting part, which is also something we can teach the students. As you said, all the interesting work can be done, but the most interesting question is actually what can’t be done, because that will raise research questions and actually new developments in the field.

[Julia Flanders] It occurs to me to ask the librarians and archivists in the room whether there’s a distinctive role for that community in pedagogical mode, even though they may not actually be teaching courses in that sense.

[Jean Bauer] Librarian isn’t my title, but I was actually going to throw out an idea I tried in a class before I got this job. When I was teaching in the History department at UVA as a grad student, I tried to create a class that would get students thinking about the ways in which we record information and how that changes over time. And so we went to the Rare Books room, we looked at manuscripts, we set type in a Book School and on a printing press. I also took them up to NINES and they looked at XSLT; we made a reference map. For the database section, which is what I know best and what I do (relational databases), I gave them some basic readings on what databases are and how they are constructed. The assignment was to go and find one of the secondary source websites that they were using to research their papers, start putting in some search terms, and to write two paragraphs about how they thought the database was answering their queries. Could they tell if there was a clear relevancy to what responses they were getting back? Were they finding a certain functionality built-in or did it seem like there were certain things missing from the underlying model? We had a really, really great class when they came back, because they demo’ed all these sites to the other students, so the students learned about all of these other sources they could be looking at. Then we had this great conversation about “Well I could find this subject term but I couldn’t find that subject term.” Or, you know, “I was looking around and I finally found one that would really support some Boolean argument I was trying to create.” I think maybe it got them thinking about how the stuff they interact with all the time has been modeled. What they’re getting back, when they go and ask their research questions given their materials, is in fact partially predetermined for them, and so how can they start to work that into their research? They didn’t do any programming. They didn’t build any databases. I talked a little bit about databases and then I let them loose on databases . . . to see what came back.

[Elli Mylonas] I’ll be the anti-Allen here and say that Code Studies is an area that might well be brought in to teach data modeling, because it’s trying to look at theorizing and make use of the ideologies that are going into something as simple as writing code. Some of the papers that we’ve already heard at DH have a lot to do with looking at how databases are constructed or how knowledge systems are constructed and how that is created by people who do it, and focus on that aspect of knowledge representation.

[Elizabeth Swanstrom] I guess I would just like to add to that very quickly to say that’s precisely the kind of thing I’d love to see gathered, consolidated — best practices, successful case studies. That type of thing. I’ve already gone on record once now, this is the second time, although I think it’s so vital—so many of us teaching in the humanities who want to incorporate these techniques, have a kind of wonderfully rich but distributed resource. So I’m very tempted to put in an application for an NEH startup grant to have such a place for teachers in the humanities, best-case practices for using these things in our classrooms. What works? What doesn’t? What’s been disastrous? What’s been a wonderful success? etc. And I think that would be something we would all benefit from.

[Elena Pierazzo] In this respect, there’s a wonderful resource made by Lisa Spiro about the syllabi on the digital humanities around the world, about teaching the digital humanities. It is in Zotero. And this resource, this is a Zotero group about teaching digital humanities. I just wanted to point it out because it is really amazing.

[Maximilian Schich] There is one interesting thing which we didn’t talk about during the past one-and-a-half days which is also somehow not quite data modeling but actually is, which is ranking and other things in databases. They have a similar effect. If you create something, you do something with a data model and it gives you something back in a database engine that ranks something, like, say, PageRank. That has an effect, especially if you are involved in entering data: it has an effect on how you treat your database in the very same way as the configuration of the set of data models. There is, for example, one effect: if you have a power law distribution of how frequent a term is, or how much information there is about a certain thing, once we use this kind of PageRank process, your power law will get steeper and steeper, which is a big problem because it’s concentrated […] I think it would be a good example in this discussion of education: there are certain fields which became public […] where we have specialists on data models, on particular languages […]. But there are other things out there, where we have thousands of engineers from Google working on it every day, which have as strong an effect. And the question is how people prepare for that gap, because if they go somewhere, say they join a company, they do web sites with XML, they will be confronted with this: that the influence of that PageRank mechanism weighs stronger on how people […] their web site than their data model.

[Laurent Romary] I wanted to react quickly on two aspects […] One thing is the thread between Fotis and Elena concerning the research question: sometimes you can’t, because like they said we’ve got students, but you need to keep this in mind in a way. In answer to your query, how do you teach archivists and librarians: I did this seminar in Stefan’s institute last year where we encoded a diary of a woman from 1940, 1941, and it was quite a successful scenario at the end of the day. But we started by saying “Look, you’re putting yourself in the skin of someone responsible for putting together a digital edition in a library; you need to identify your title.” Let’s imagine a typical historian: do the edition and model the data in such a way that you know in a minute what you can do and what the historian will do. A typical case is modeling persons. So what’s an entry, […] blah blah blah, some objective info like you would have in an international library database, and then “Nazi”. Like, “hey, why did you write that?” Well, because it said so in the document: this woman is saying this sort of person is a Nazi. The historian has to decide. It’s not your job to put those kinds of features, you […] The other point is concerning getting our hands dirty. I spoke about confidence we need to give to the students. One of the messages I gave […] to people in Literaturwissenschaft, in literature, with no technical background at all. And I said that the objective of the course is at the end that we can go to a computer scientist and say I want this, this, this. I don’t want to bother except to say “that’s the kind of data I want” as input, as output, and that’s it. So you need to know enough of the technology to give orders, because you’ve got the science. The computer scientist is done, he will never be more comfortable in your field than you are. This is an important aspect.

[Julia Flanders] That’s very interesting. You see, one of the things your comment reminds me of is that the digital humanities community does data modeling on at least two levels. There are some of us whose job is to model data on behalf of others; we are, in effect, consultants. And there are others of us who are modeling on behalf of ourselves, for our research. We are the scholars, we are in charge of our projects, or both. Right, absolutely. But the point is that there are these two different ways in which data modeling functions in the digital humanities, and I’m wondering whether the teaching exercise is inflected by that. I teach a course in Electronic Publishing, which is secretly data modeling, at the Library School at Illinois. What I have my students do is pretend to have a project, so they have to follow the modeling exercise through schema development, […] document analysis, designing a publication and so forth. And I ask them to use a project of their own, to come up with some project they care about because, as has already been observed, it gives them a greater sense of investment in the outcome. But I’ve often thought that I would really like to be able to teach them how to take anybody’s data and think interestingly in a kind of vicarious way; in other words, teach people how to project their imagination into somebody else’s data and try to understand what those other elements might be. I don’t know how I might go about teaching that, or whether it’s just the same set of skills…

[Laurent Romary] Julia, you need to make sure that these kids can take pleasure in looking at someone else’s data, step by step, before you do that.

[Julia Flanders] Or maybe it’s just a matter of talent.

[Susan Schreibman] I was going to say that I teach a course and have for a long time, somewhat like Laurent?, and also in answer to your questions, Fotis and also in answer to you Julia, what happens is (and again, I’m teaching students from a wide variety; I taught this in the library school and now in Digital Humanities) that in a way they get sucked into the content. I’m sorry for all the Medievalists here; pick something Medieval. I do pick fairly contemporary: A) it’s what I’m interested in, and B) it’s something they can relate to. I have a class going on right now and they’re totally sucked into the content areas, doing research I haven’t even asked them to do. But they’re coming up now with the research questions; because now they’re so engaged with the content, they’re moving forward without me. So I find that way works very well. I just wanted to go back to maybe a couple things that were said in terms of how to abstract in a way that’s different. I was again thinking of Willard’s models of, say, something like representation. Trevor [Muñoz] was speaking about “representation” throughout his talk. That’s one way of representing. Elizabeth (points to Elizabeth Swanstrom), you were talking about representing literature through visualization tools. Jonathan Stray has a fabulous article about representing the Iraq war logs and the inherent issues with the algorithms that were used to create the representation and how that inflects or influences our way of reading it [“A full-text visualization of the Iraq War logs”].
So I’m wondering if there are ways, again, to go up one level and say “Okay, we’re going to do representation” — or maybe not even say that — and teach about all the ways in which we use data in ways of representing information in the world we live in, but then finding some ways for them to begin to move to the next step of modeling that themselves, but also realizing how their model affects the ultimate representations at the final point of whatever visuals they’re looking at. Or even within the encoding itself, if you’re thinking of something like XML.

[Fotis Jannidis] I just want to add that we have three opinions at the moment, or schools. One says that the best way to learn data modeling is to do something concrete … XML, for example. The next one says teach concepts of specific subject areas and disciplines, formalize them, concentrate on the more formal concepts and go from there. And the third would be Allen’s way, to say … don’t do this at all, just start with a foundation and go from there. And maybe you can point out to us why the things you mentioned, first-order logic and so on, are the foundation for the kinds of data modeling we’re talking about. In a way we can say we have had different concepts of data modeling in this room in the last two days, and what you are seeing as foundations seemingly are applicable only to a small subset of this. But this may be a wrong understanding, so maybe you can explain the selection. But I actually wanted to ask this question and the other question of the three approaches at the moment, and I just wanted to say: are they three approaches of equal validity or do we have preferences?

[Elke Teich] I don’t see them as exclusive. They all hang together. So, okay, let’s go from a discipline — you have a particular object you want to model with the text. You have a particular interest in the text. […] That’s a very abstract thing. In order to analyze the text, let’s say from a literary perspective, you have a set of […] methods. And you have some [model to apply?] I’m not talking about digital […] Now, to make things digital you need a computational model […] There I come back to what you said. Once you want a computational model, you have to know the foundations of computational modeling. That’s why you need this type of information as well. It all hangs together. You have to do everything.

[Stephen Ramsay] I feel like I’ve been puzzling over the question for ten years: how much to go toward Allen’s lofty interpretation, how much to focus on working inside the tool. I think we all have to negotiate that in various ways. But one thing we can say for certain is that for most of us our colleagues and our dean and our chair are likely to be a lot happier without […] that more analytical method, that more conceptual way of teaching not relational databases but relational calculi; it’s a safer place for us to be professionally. We’re not totally free here, in a way, because it’s very hard to get a course on XML into the curriculum; it can be hard to get that course on XML approved. And you start talking about foundations of informational theory and suddenly we […] “I teach this course but it’s secretly this…” […] Sometimes we’re trying to slip in the theoretical with the practical, sometimes we’re trying to slip in the practical with the theoretical. […]

[Allen Renear] When I had introduced a course, we had lots of courses on database design, system design, XML and so on. This was a deliberate effort to introduce a course that provided a theoretical foundation. I believed and my colleagues in New York believed that our students would produce much better systems, so throughout their career they would learn texts faster, they can understand them better, they can understand them inside. I only had one chance in 18 months, the last 18 months in the classroom; this was a good opportunity to learn these things. Otherwise, I feel like I’m watching the students look at the [ER?] diagram and realize that there are rectangles and arrows and they don’t know what they mean. They don’t know what they mean. This often comes up in the debate over what bracket markup language to use. And you realize it is being carried out about features that are truly irrelevant. And the features that are relevant are completely unseen by the students. That’s part of the frustration for me. I think that our students really need to understand what’s going on when they’re data modeling if they are going to learn these data modeling languages, and learn them well. One characterization of the approaches up here was pragmatic versus theoretical. And I’m not sure if I recognize any prejudice in respect to this foundational stuff. I’m a very theoretical … I’ve been a pragmatic modeler most of my life, but I’m a very theoretical modeler now. When I think modeling I believe I’m doing digital humanities just like I was taught to do. I’m doing it too. We’re doing it. […] wasn’t trying to build systems, and neither am I. […] So I think it’s relevant to that but I also see it as relevant converting XML and you’re modeling … I see it as just as relevant that your computation but I don’t see it as particularly relevant, I don’t see it as more relevant to computation […] than to theoretical modeling. There may be other perspectives that would be […] but not that one. I don’t know what you had in mind exactly.

[Fotis Jannidis] I think that there’s maybe an implicit argument for taking up the distinction by Elke, that you have data modeling, and then there’s a subgroup of computational data modeling. What you were talking about was laying foundations of computational data modeling, and that wouldn’t be the foundation for all data modeling.

[Allen Renear] I don’t see myself as particularly focused on foundations for computational data modeling. We want to simply understand a domain regardless of digital plans for managing data. I think logic and formal languages are all relevant, specifically in understanding a domain from a modeling point of view. It doesn’t have to be computational modeling; it has to be […]

[Kari Kraus] I want to jump on this, too, because I think that Steve is absolutely right about the departments. I think that’s true for, for example, the English department where I teach, but I also teach at an I-school and I would say that, in my experience, it’s actually the inverse. It’s much easier to get the course to fly, especially at the core level, if it’s something like a basic technology course, a basic programming course, and so forth, or cataloguing. I once taught classification theory, which was a survey of classificatory principles across different disciplines — biology, philosophy, linguistics, and the challenges from fields like cognitive science — and it was a blast, but I’ve only been able to teach it once. I think this maybe applies to our professional schools: if you’re teaching in a professional school, then very hands-on education is what counts.

[Julia Flanders] Well, and that’s especially interesting given the set of decisions that are being made now about whether graduate programs in the digital humanities are construed as professional schools. So for example when people ask me “Where are graduate programs in the Digital Humanities” and I do my annual look-through, I’m always astonished by how many of them are specifically aimed at marketing themselves as providing practical training for people who’re working in areas like media as opposed to people who are going to be teaching humanities subjects. At least, this is in the United States. I would be interested in how the landscape might be different in a place like Germany or elsewhere.

[Elena Pierazzo] We think of graduate programs: what about undergraduate programs? We don’t teach an undergraduate program; we teach some modules that people can plug in within their own program, major, whatever. There’s a big debate we’re having within King’s whether or not to institute an undergraduate program in digital humanities, and my hypothesis has always been to do a minor. More or less as you do, those of you in Würzburg: to have someone take a major in English or French and they do a minor in digital humanities, which is a methodological approach. But my new director, Andrew Prescott, has a completely different idea. His idea is that a discipline doesn’t exist if it doesn’t have a major, single honor, in the undergraduate level. So if we can’t all provide a single honor in digital humanities, then digital humanities is not a discipline. So there’s a large debate that we’d like to start now (we just started the debate at King’s, but we hope to use the Digital Humanities conference in Hamburg and the panel that we are both in on teaching in the digital humanities to start this debate), to say: “What would you fit within a Digital Humanities single honor for undergraduate level? Which kind of person, which kind of professional or non-professional profile are you trying to create? Someone who goes out there and is able to do a website? Someone who goes out there and is able to be a librarian? Or someone who does research afterwards? What will a graduate of digital humanities do?” I come from an experience from Italy, from Pisa, where there is a very healthy undergraduate degree in digital humanities which is more of the former approach: so create some professional, someone who’s able to do a localization of software, for instance. A technical writer of software, a web designer or something like that. This is what digital humanities has been. It works very well. People get a job. They’re very happy, there’s lots of people, etc. My question is: is that the only approach? Is that the professionalizing: creating the professional humanities person, someone that has a lot of humanities background but also the technique to do something with it, or do we want to create more of a researcher? Everybody’s a researcher because we do things more or less in that respect and what we do is more or less research. So I was wondering is this the only thing? That’s my question.

[Wendell Piez] I think that’s really interesting and I’m glad that you’re posing a question in that way. Generally, at the conference at Hamburg I think that will be really good. One of the things I’d like to see asked in that context is “What do we mean by asking this question?” When we ask this question, do we imply that, for example, an English major or a History major is bound for a particular professional track? When we ask this question, do we assume that a new discipline needs to create a space in the economic marketplace for itself before it can be recognized? Because the economy’s already giving up among the majors that already exist, and therefore, there’s no call for a new discipline unless you have a profession for it. Similarly, I think the question has to do with the way in which we conceive the humanities as a whole. Because as we know, there’s been a crisis going on for years at the graduate level about whether all English graduate PhDs are going to be professors. Is that a real assumption that we’re actually thinking? Is that something we’re presuming in order to legitimize our activities without actually being able to deliver on them? Those questions are bound to have a lot of really interesting and important questions about what the academic programs are all about.

[Maximilian Schich] I think that’s a very interesting point. I’m not that old but if you tracked since the early 1980s, what you could get for a certain skill set. […reference to some technical training …] proficient as a classical researcher in a particular field in the humanities in the 1980s you would have stood out like a rocket scientist […]. In the 1990s you could probably get a tenure track position as a researcher but not a professorship with that kind of [degree? training?], and then you would program, but still that was already a service position where you would work for a classical researcher […] And right now, if you look at the job descriptions, you get not even full […] 50% positions, which nevertheless presuppose you’re able to code, you’re able to do […], you’re able to work with large data sets and so on. Basically if you teach people only the skill sets and not the theoretical background to do something they will end up in service and will have lousy jobs. There are so many people who do similar stuff; there’s media school where you learn […] and all that stuff. And how many people have a great grasp of […]… This is really depressing for people who are finishing off their PhD programs.

[Elli Mylonas] I think more of the interest in data modeling at the undergrad level happens in the social sciences or can be applied in the social sciences. There was an effort here from the computer science department, which already colors what I’m about to say, who were trying to find out what it was that social scientists and humanists wanted, if there was going to be a minor or some component of the undergraduate education that was CS-like. The answer for the social sciences was easy: they taught them how to analyze the speeches of presidential candidates using XML as a subject, and things like that. The humanists were intractable. It was impossible to get the humanists to say what they wanted because the faculty couldn’t conceive of what they did in a way that was tractable to the computer scientists. Now I’m not saying it’s impossible; this is a story. What was being taught to the social scientists is what we talk about here a lot, but the humanities faculty don’t think it’s interesting. The CS faculty ended up teaching a lot of work on Moby-Dick. Because they did find some humanities students and they did find some humanities classes and they know how to count things and they know how to do statistics on them. That was applying computers and computing and technology to humanities subjects, but of course I don’t think the humanities faculty care about that one bit.

[Susan Schreibman] How long ago was that?

[Elli Mylonas] About two years ago.

[Susan Schreibman] Two years! Wow.

[Elli Mylonas] I would be very interested to see what other people think is digital humanities. What the humanists who are not digital think might be digital humanities. Our few digital humanists kept saying “We want to do text encoding; we think it’s interesting.” And the CS faculty thought that was really boring. So that’s a story about trying to get it done in the undergraduate curriculum because in the social sciences it works.

[Elisabeth Burr] I just wanted to tell you something about an example. I got an email a few days ago. One of the students who actually went through this project, she wrote back to me, saying she had a job in a big German firm which has nothing to do with digital humanities or humanities or anything like that. She knows how to write a text by using the computer in a professional way, and she has learned XML. She’s got this job and she is completely happy about it. So even just humanities can lead to a good professional start in life.

[Elena Pierazzo] We’re all very hopeful. We hope that.

[Elisabeth Burr] Because people have got something else. They bring a lot of other things with them. They get trained in the firm in the end, more and more specifically, and they’re trainable.

[Wendell Piez] Not only that but someone needs to know Greek…

[Julia Flanders] For this generation, you are that person!

[Wendell Piez] I was just sitting with three people who read Greek better than I do!

[Julia Flanders] I think it is time for a break.
