Theoretical Perspectives III (March 16):
Stephen Ramsay, “Where Semantics Lies” (paper, video)
[00:00]
[Stephen Ramsay] It is my sad lot to always be giving talks during the last session of the last day, right before lunch. You know, this sort of thing — when attention is flagging. And I add to that that I am also in the unenviable position of just preceding Michael [Sperberg-McQueen] and following all of you. It is terrible because this has been a really bracing, wonderful, enlightening conference for me. It is a terrible thing to try to have the second-to-last word.
[00:40]
Should the syntax of XML — this is a really good start for the last session, isn’t it? — should the syntax of XML have been scrapped in favor of S-expressions? This debate, which raged on for years, and which occasionally reappears, has all the ring of a religious war. Sort of like the war of Windows versus Mac, Emacs versus vi, big-endian versus little-endian, and so forth. The risks we take in even broaching the subject are manifold.
A talk based on a question like this is destined to be both technical and philosophical. Which is to say, bad. And try as I might, I will undoubtedly seem guilty of favoring one side over another. Whatever protestations I make to the contrary, it is in the nature of religious warfare to be on one side or another, and to be wrong whichever side you’re on.
[01:43]
But, my purpose here isn’t really to settle this question. It’s not even really meant to reintroduce the debate. What I want to do is to use this mostly wrongheaded back-and-forth to shake out something that I think is actually highly relevant to the topic of data modeling – in the humanities or anywhere else. That highly relevant point can be stated pithily by asking “where does the semantics lie in our computational systems?” In fact, what I would like to say is that this issue subtly affects the way we think about data modeling even when we try to think about data modeling in complete isolation from any concerns about the use of data models — or even for that matter, computational tractability. But before I launch in on this hopefully meaningful quest for theological insight, perhaps I should explain the terms of the debate that give rise to these meditations. To start with: “What the hell is an S-expression?”
An S-expression is a notation for representing tree structures, and it looks like this. [shows slide]
We could define S-expressions much more formally using an elegant, recursive definition. But this is perhaps beside the main point, because anyone looking at this will say “you mean like LISP?” Yes, like LISP, but let us lay that aside for a moment and consider the fact that anything we could possibly want to express in this notation can be expressed using the tree-structure notation we call XML. Now I’m leaving off attributes here, but it is easy to imagine how we might add them in. If we have something like that, where I have added an acronym attribute, then we could do something like this. . . I do not know if that is the best way. Several ideas have been proposed whereby a tree node can be annotated with a key-value pair. But the point is this: these two representations are 100% isomorphic. Anything I can do with one, I can do with the other. So you might suppose that one element of the debate involves syntax, and that is certainly true. Some people have argued quite vociferously — notice that I did not say “quite correctly” — that XML is simply a needlessly verbose form of S-expression syntax. The standard reply is that syntax matters. Paul Prescod is probably the most eloquent defender of XML over S-expressions as a syntactic matter. The S-expression syntax is perhaps less busy. On the other hand, do you really want your TEI document to end with 75 closing parentheses? But that is not the center of this debate at all. The center of the debate is the charge that XML has no semantics.
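[Editorial sketch — not the slide Ramsay showed — of the isomorphism just described, using a made-up element with an acronym attribute. The key/value convention on the S-expression side is only one of the several proposals he alludes to.]

```scheme
;; The XML form (illustrative only):
;;
;;   <org acronym="TEI">
;;     <name>Text Encoding Initiative</name>
;;   </org>
;;
;; The same tree as an S-expression, with the attribute carried as a
;; key/value pair attached to the node -- one convention among several:
'(org ((acronym "TEI"))
   (name "Text Encoding Initiative"))
```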
Before I delve into what such a thing could possibly mean in this context, let us dive down one additional rabbit hole and ask “What does it mean for something to have a semantics?” The most frequently offered answer to that question is that semantics refers to what a particular representation means. Terence Parr, who is the author of the ANTLR parser generator, and therefore someone who should presumably know what semantics is, says this: “loosely speaking, semantic analysis figures out what the input means. Anything beyond syntax is the semantics.” Now this is written in a book called Language Implementation Patterns. Hardly light reading, but not a textbook on formal languages. He can surely be forgiven for speaking loosely. But when we turn to actual textbooks on formal languages, we get statements like this: “This book is an analytical study of programming languages. Our goal is to provide a deep, working understanding of the essential concepts of programming languages. Most of the essentials relate to the semantics, or meaning, of the program elements.” Now that is from a formal textbook on formal language theory. It does not stop there, but anytime you say “What is semantics?” the books tend to say “well, it’s meaning.” Now both of these statements, and I could cite dozens more, seem to me to beg the question “What is meaning?” Or to put it more awkwardly, “What does it mean for something to mean something?” Obviously this paper can only get worse.
[06:25]
My favorite denizen of this particular rabbit hole is someone who’s already come up a couple of times, and that’s Ludwig Wittgenstein, who offered what I think is the most provocative answer to that question ever given. Some of you may be familiar with the canonical quotation in which his basic idea appears: “for a large class of cases — though not for all — in which we employ the word ‘meaning’ it can be defined thus: the meaning of a word is its use in the language.” It can be difficult to see at first what is so radical about this conception. Wittgenstein, in fairness, spends a few hundred pages drawing it out. Taken superficially, it may seem to be a statement about context — that how a word is used is important. But Wittgenstein goes considerably further than that by rejecting the entire notion that propositions are true or false or otherwise meaningful based on some condition exterior to those propositions. In fact, what he really says is that there is not anything other than use in context. There is not anything to speak of beyond this complex web of relations. “What is justice?” Justice is the set of moments in which the term is deployed. That does not make the question itself nonsensical or unanswerable. “What is justice?” is, after all, an instance in which the term is employed. But it does make it unlikely that we could get very far in forming a useful, all-purpose definition. And since forming useful, all-purpose definitions is presumably one of the goals of philosophy, we may find that posing questions like this gets us exactly nowhere.
[08:00]
What is useful about this for my purposes, though, is the fact that this idea of meaning and use gives us not only a way to talk about computational representations, but also a way to describe computation itself. Computation, stated in the most minimalistic way possible, is about taking information from one state to another. In the normative case, it is about taking some linguistic construct and producing another linguistic construct, though that is not at all essential. If you have a process that can take information and produce more information, we call that process a computation. It is what happens when you press the equals sign on a calculator, and it is what happens when you friend someone on Facebook. The fact that we have some process by which to effect that transformation indicates something in particular about the information with which we began. We say that it has a semantics. This restates Wittgenstein’s point quite succinctly. The information has meaning, has a semantics, because we can produce other states from it. States being anything from reorganizations to physical actions. In the absence of such productions, whether actual or potential, the information is literally meaningless. And while that condition might be rare, it sets a boundary condition on semantics. Most computational representations have a semantics because it is at least possible to imagine computations being performed on them. This is perhaps why [Daniel P.] Friedman and [Mitchell] Wand, from whom I drew that previous quotation about the essentials of programming languages having to do with semantics, go on a few sentences later to say, “the most interesting question about a program as object is ‘what does it do?’” If meaning is use, then who can argue? So when the LISPers say that XML has no semantics, they are presumably referring to the fact that by itself, XML has no inherent ability to produce anything at all. You need to describe that semantic meaning somewhere else. Which is exactly the same as saying that you need some process by which that representation is either transformed into some other kind of representation or otherwise results in another representation being produced.
[10:20]
But is that any less true of S-expressions? Isn’t an S-expression also a representation in search of a means by which it can be translated into some other representation? What could possibly cause someone to say that S-expressions have a semantics while XML does not? And I have read this entire flame war so you do not have to. But the answer to that question — that S-expressions have a semantics while XML does not — does have to do with LISP, because in LISP there is no inherent difference between the representation used for data and the representation you use for the process, i.e. code. This, by the way, is called homoiconicity, and it is an inherent property of all languages in the LISP family. The most striking example of homoiconicity outside of the LISP family is, wait for it, XSLT. In either case, it means that any code you write is also a data structure in the language and, conversely, any data structure you create is at least potentially an executable process. I say potentially because the LISPers are completely and totally wrong when they say that S-expressions have a semantics. They have a semantics if and only if you also have a way of taking that representation and using it to produce something else. That is to say, S-expressions have a semantics if you also have a LISP to process them. The consequent notion for XML is that XML has a semantics if and only if you also have a way of taking that representation and using it to produce something else. That is to say, XML has a semantics if you also have a schema combined with some way to process it. But notice the difference there: if you have S-expressions, you need a LISP runtime; if you have XML, you need a schema — which is to say, a grammar description combined with a type and structure ontology; what we would call, in this context, a data model — combined with a presumably Turing-complete language. The difference, in other words, has less to do with angle brackets and parentheses and much more to do with where the semantics lies in the overall system.
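[Editorial sketch of the homoiconicity point, assuming a standard Scheme; any LISP-family language would do. The same parenthesized notation is both a data structure and, handed to an evaluator, a program — and without that evaluator it is just an inert tree, which is the sense in which S-expressions “have a semantics” only if you also have a LISP to process them.]

```scheme
;; A list of four elements: the symbol + and three numbers. Pure data.
(define expr '(+ 1 2 3))

(car expr)   ; => +   (we can walk it like any other tree)

;; Handed to an evaluator, the same structure becomes code.
;; The evaluator is the "LISP runtime" that supplies the semantics.
(eval expr (interaction-environment))   ; => 6
```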
[12:43]
It is possible, of course, to process S-expressions without LISP. It would also be possible to separate the grammatical description of type and structure constraints from the entity responsible for effecting the transformation and still be doing LISP. We are not talking about some kind of new affordance offered by LISP, some deficiency in the XML ecosystem or the other way around. When it comes to taking things from one information state to another, either system could be designed either way. So my question is this: does it matter at all where you put the semantics? And the answer to that, I think, is yes. And for more or less the same reasons that syntax matters. The XML ecosystem implicitly imagines a radical decoupling between the act of data modeling and the act of processing data. In fact, it breaks the act of data modeling itself into several discrete stages, which, in practical terms, translates into a decoupling of the social act of marking up texts from the social act of modeling data, and both from the social act of processing data. I use the term “social act” as a way of designating different potential functions — job descriptions, if you like — in the overall job of computation. You can be the person who decides how a grammar is applied in a particular instance, or you can be the person who defines the grammar. Or you can be the person who uses the grammar and the document to translate the information into another state — or, obviously, you can be all three.
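[Editorial sketch, with hypothetical names, of the decoupling described here: a schema-like predicate stating structural constraints (the data-modeling role) and a transformation (the processing role), written as separate pieces over the same S-expression data. Either ecosystem could be organized either way; the question is only where the semantics ends up living.]

```scheme
;; The document: an S-expression tree.
(define doc '(person (name "Ada Lovelace") (born "1815")))

;; The data-modeling role: a schema-like predicate stating structural
;; constraints, saying nothing about what will be done with the data.
(define (valid-person? tree)
  (and (pair? tree)
       (eq? (car tree) 'person)
       (if (assq 'name (cdr tree)) #t #f)))

;; The processing role: one of indefinitely many transformations the
;; model permits.
(define (person->label tree)
  (string-append "Name: " (cadr (assq 'name (cdr tree)))))

(valid-person? doc)    ; => #t
(person->label doc)    ; => "Name: Ada Lovelace"
```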
[14:10]
What the LISPers argue for is really a world in which these three things are combined. Some partition of roles is, of course, still possible, but in practice the LISP ecosystem more or less demands that data modeling and data processing are never far from one another. While it’s possible to imagine an S-expression tagger — maybe that would be a “paren-er” — it is less easy to imagine that person not also being at some level a programmer. But forget about LISP, again, because the real issue is not whether LISP is good or bad. The issue is whether the distributed, decoupled model embodied in the XML ecosystem limits or expands our ideas about data modeling as compared to a more centralized workflow in which data modeling is never far from data processing. And here I will risk starting my own flame war by saying that, practically speaking, it does.
[15:00]
It does because it is not possible to fully describe the semantics of anything apart from the processing that is enabled by the semantic relationships so described. An XML schema, and here I’m talking about any kind of schema at all, describes a grammar. It is in fact explicitly based on BNF grammars, which of course are also used to describe programming languages. This, and not any particular instantiation, is the data model — a statement which the designers of XML, by the way, are in full agreement on. Typically a schema defines a set of data types and a set of ordering constraints, which are semantically meaningful only at the point that the document is processed. But why stop there? Why not use that schema to define a set of control structures for processing data? Why not state whether variables are bound late or early, lazily or not? Why not define a set of data structures into which the data might be trivially but predictably transformed? Well you see, they did — and it’s called XSLT. And it is separate. And it is optional, and that is good, and you might be right — in fact, I think you are. But the fact still remains: every data model is asymptotically approaching a processing model. I would even suggest that the question “are the data models we have proposed for the humanities sufficient to the task?” is equivalent to the question: “does the semantics reside in the right place in our model?” Not because shifting the semantics around gives you new processing powers, because it does not. But because to the degree that any data model attempts to stay neutral with respect to future processing regimes, it must limit the practical affordances that model offers to the data modeler. To do so might be to commit an act of magnanimity. To construct a data model in the absence of any particular judgment about future processes is, presumably, to let one thousand processes flourish. But it is also to limit what can be modeled, because that is exactly where a good number of the decisions about semantics are being made. We may comfort ourselves with the thought that every step up the chain of abstraction allows more flexibility at the processing level. But a dark voice remains, and should remain. Every step up the chain of abstraction also means separating further and further from what is presumably the point of all this, namely the attempt to exploit the computational tractability of the data. To give the processor more power is necessarily to give the data modeler less control. Not just less control over the processing, but less control over the data model itself.
[17:44]
So we really must ask ourselves: does having less control over the data model, which is not the same thing as saying “more flexibility,” make sense for our data? Should we have gone this way? Should we have attempted to create a more tightly coupled ecosystem, in which the line between data modeling and data processing vanishes as a practical matter, as it does, I would argue, in SQL? Should we now think about doing that? I don’t know, and I’m sorry to end with something so obviously decoupled from a practical recommendation of any kind. But I take the point of this symposium to have been: “can we do what we want to do?” And I think it’s at least apposite to point out that as long as we talk about what we want to do, we’re talking at least partially about what our data models cannot do by themselves.
**applause**
[18:48]
[Wendell Piez] I thought that was brilliant, thank you. One of the things I thought was very interesting and a really critical point was when you identified the relationship between the design of the XML ecosystem and the roles played by the different people who live in those workflows, right? And in fact, I would go further. I would say that actually XML takes a step back towards the LISP model, away from what SGML had. Also, SGML reflects a design that was very deliberately matched to those roles, because it was designed for a publishing system in which you have authors, or “creators,” who create the stuff and mark it up because they need to push it through the system. Then you have editors who define the rules. Then you have production specialists who then optimize the processing of the information for the digital realm. Right? So, the design of that system is actually — you know, “this comes out of IBM” among other such places — but it’s very reflective of a certain sort of industrial culture . . . information production and management strategy. Which is appropriate to certain kinds of information, as well.
[20:09]
[Stephen Ramsay] So is it our kind? Is that your question?
[Wendell Piez] Right. I guess that is a. . .
[Stephen Ramsay] I really don’t mean to take a strong stance on this. I really just want to move along.
[Wendell Piez] Yeah, I understand. I think your question “is it our kind?” is actually vital, because one of the reasons why XML has worked so well is because it does take a step back towards the LISP model. It says, you know, you don’t have to have your schema. You know, frankly, we can just go ahead and manage things with just the markup and the stylesheet. It gives you another way to get into that cycle, so that you don’t have to do everything up front, waterfall-style. That’s been really tremendously important and useful for us, right? But you see, the thing is that the dark side of the LISP model is that you cannot really do anything unless you are an auteur who understands everything at all points. It does not allow you to split things out.
[21:10]
[Stephen Ramsay] Right, and worse than that — XML also imagines that. . . when it says “no particular processing regime is imagined” it is absolutely. . . it doesn’t care what language you use, or what platform you use or anything like that. The answer to the question “How do you do anything on the LISP model?” is LISP. “How do I transform documents into other. . . S-expression?” LISP. “How do I extract information?” LISP. “How do I search the documents?” LISP. Right? No wonder the LISPers like it.
[21:46]
[Wendell Piez] You actually do hear that kind of rhetoric on the XML side also. Which is not to say that everybody loves XML syntax for everything but at a certain point, the syntax fades away and turns into the data model itself — and into the affordances of the tree-structure. So I think the point here is that splitting things out is actually very very good, because it allows us to distribute work and to work together and to communicate across boundaries and to optimize our roles.
[22:24]
[Stephen Ramsay] But would you grant to me that it is subtly reducing the affordances of the data modeler as the person who occupies that role?
[Wendell Piez] Absolutely. There is a compromise. There is a trade-off there.
[Stephen Ramsay] There is a trade-off. But that trade-off frightens me a little bit, especially in the context of a symposium like this, where we want to take data modeling and (makes a gesture of ascendance).
[Wendell Piez] Well what it does is that it puts you in a position where you can do those things, but only if you can affect communications within a larger community or system, which is a good thing, too, right?
[Stephen Ramsay] So we should invite other programs to this [symposium]?
[23:00]
[Laurent Romary] First, I must thank you for providing such a nostalgic brief just before leaving tonight — I will have to spend an awful night because I will be thinking about LISP and PROLOG. I think we were having this discussion yesterday. I was an old fan, and I just see LISP fade away. (Laughter.) You should have avoided that. Anyhow, two major things which are essential for our discussion here concerning data modeling. I would put them on the horizontal and vertical lines. The first thing is, when you speak about isomorphism — it is essential, when we are reflecting on ourselves as we model something, that we can take this stance of saying “okay, whatever tool we have, we can just draw things on paper. Like, we can draw a tree if we really like trees.” The isomorphism is essential and it is reflected in . . . our renditions, like thinking about the OMG with the notion of metamodels, or what we are doing also in this ISO committee on language resources, where we are describing a mechanism by which, through metamodels and decorations with data categories, we actually create classes of models which, in turn, can be instantiated in any kind of XML document, or whatever . . . syntax from the past that we don’t want to mention.
[24:32]
The second aspect . . . you point out the word “affordance.” That is a good word, because it is the purpose of the talk at the end of three days like this — it went very fast — to say, basically, that you cannot think of any kind of modeling without thinking of processing. When you take a little bit of distance, this notion of “affordance” says “okay, when you start modeling data, the purpose is to contemplate a certain series of processing that you would like to do.” As a researcher, you would like to be able to search the data. You would like to have, through those affordances, the capacity to link and create new concepts which are more on the research side than just the observation side. So those affordances are very central when modeling things. What are the concepts by which I want to create a certain stability in my data? So, it’s not just one single-sided orientation for one kind of processing. It’s really like a statistical method.
[25:30]
[Stephen Ramsay] Although I will point out to you that the TEI guidelines disagree with you on the fourth line on that. They disagree with you very quickly on the idea that what you do when beginning the data modeling process is think about the processing regimes you want to enable. That’s the way you’re not supposed to do it in the XML ecosystem. You may think that is crazy, and so do I, but. . .
[25:55]
[Laurent Romary] I think that having a very concrete project, having to encode some sources in the humanities, starts with trying not to overkill the use, for instance, of the TEI guidelines. Saying “what are the concepts I want to have in my text that I will annotate?” Are persons essential in my texts, such that I will either take them or disregard that part of the TEI guidelines? Because the TEI guidelines are a marketplace to create possible affordances.
[26:24]
[Stephen Ramsay] Although it’s interesting, because what you said is slightly different from the argument Julia [Flanders] was making yesterday. Or the day before, sorry — it’s all running together. If we can arrange ODD files and analyze them, what we will see is the scholarly community having a changing conception about text. I think the way I would say that, and I think the way Laurent [Romary] would say that as well, is that actually what we are seeing is changing process requirements over time. And that’s where the changing conceptions of text are occurring. It is not like we change around the ODD files because our notion of text is changing. That is too direct, right? Really what is happening is we want to do new things, and we keep adjusting the ODD. And I think this is really what I am trying to talk about. Where do we put the line between. . . where is it our job and somebody else’s job? Because I agree with you fundamentally, but I also think that when we talk about data modeling, we are locating a point on the slope at which modeling approaches processing. And that is really all I am trying to draw out in this paper: do we have that line right? And as a practical matter, we may.
[27:54]
[Michael Sperberg-McQueen] [Unclear audio]
[Stephen Ramsay] Well maybe. I do not know. I mean, I would also point out that XSLT bears a tremendous family resemblance to Scheme, and I think on purpose, because I think they were thinking about this exact issue when they wrote it. I think that this was one of the issues they were thinking about. Which of course the pro-programmers never thought about.
[28:25]
[Syd Bauman] First of all, thank you very much, Steve [Ramsay]. This was absolutely fantastic. I ate it up already, and I am going to eat it up again. I want to hold the position of — I was going to say “the devil’s advocate,” but perhaps after what you said to Laurent [Romary] I want to say TEI advocate — in saying that I am a fan of believing that we can decouple our semantics from our processing. That comes from my personal history. As many of you know, I work at the Women Writers Project generating these texts, building not necessarily good models back then, but models of these texts without any possibility of processing them, without any hope of processing them, for years to come, in a complete vacuum of a processing environment. Vaporware was my best friend. I had to develop the semantics without any processing. I think we can still usefully build models that represent our thoughts about texts and defer the processing either to later in time or to someone else down the road, at some other place in our institution, in a very useful way. Maybe I am wrong.
[29:39]
[Stephen Ramsay] Right. And the question is not “can we?” Well, we have. Sure, we did. Let us imagine LMNL. One of the things that is exciting — and here we are going back to the very first talk — what is exciting about LMNL to me is that, though there are practical reasons why you might want to model things and allow overlapping hierarchies and have them in the same thing . . . but one of the excitements about LMNL to me is that I see it as a kind of anarchic experimental playground for reimagining data modeling. Because we could, in a moment of perversion, say “you know what, this language is going to have no nouns in it. There is going to be no such thing as italics. There is only going to be italicize. Everything is going to be a processing instruction.” Right? That would be a very radical view.
[30:41]
[Elisabeth Burr] Isn’t that the same thing as from the 1970s with text formatting?
[Stephen Ramsay] Oh, you’re thinking of things like COCOA and other event-based… (audience responds by naming a few other programs.) TROFF, right. But with LMNL syntax, it is possible that the overlapping hierarchies, if you think of those as functions and not ontic boundaries or objects or something like that — then it is possible that the problem of overlapping hierarchies disappears, in part because you can always run functions in parallel. You can run as many as you want. You can say “this event started” and “this event ended” as opposed to saying “this is a discrete object” . . . I do not know, I have not sat and figured it out.
[31:30]
But my point is just this: I would not dispute that, first of all, we have done it, and that second of all we may have done it for all the right reasons and have the best possible system. On the other hand, as long as we are here to talk about data modeling, I really want to talk about other worlds, and whether those worlds might have given us a different view. Because I presume that this symposium arose in part out of a kind of anxiety. Do we have it right? Do we have the right tools? Do we have the right thoughts? Do we have the right affordances? It seems to me that the right way to ask the question is to say “well, maybe it is all wrong. Maybe it is the LISP guys — maybe they were right. . .” Well, the LISP guys were not right, obviously, but maybe they had something to say. (laughter)
[32:23]
[Maximilian Schich] I think you raised exactly the right point. It is not just in the 1990s that you could only model what you thought about the text. Throughout human history you could only model what you thought about the world. And now you can actually start processing. It is an interesting thing, that if we have applied the models and if we have collected the data, now we can actually process and take a look at whether it makes sense or not. We can look at the text and say “is Syd [Bauman]’s model better than the model of this other guy?” That is a huge shift. That is the race of data science versus data modeling.
[33:02]
[Wendell Piez] And what counts as “better” in that context is also going to be contested. What I would say is that in the good old days, back in the 1990s, we did not actually have semantics. We had claims about semantics. We had it in the sense that Steve [Ramsay] was talking about. We were declaring a semantics, because our data was like this, which meant that it was not like that. Therefore you can certainly say there was a semantic potential in that. And in fact I think that one of the fascinating things about descriptive markup regimes in general — and Trevor [Muñoz] was talking about this yesterday — is the way in which we actually do try to maintain a kind of discipline of non-commitment to particular operations in favor of . . . it is like Fotis [Jannidis] was just talking about with respect to the scholar talking about the Enlightenment. You know, there is a climate, which means that you decide what we mean by climate. I want to see what you are doing with LMNL, but I also think Syd [Bauman] is going to be able to use it to describe data structures which he has no idea how to process.
[34:29]
[Stephen Ramsay] Poor Michael [Sperberg-McQueen] is over there thinking that I am going to reinvent ROFF. But actually I have a grander plan. What I am going to invent is the TTI. It is the Text Transformation Initiative, and it imagines that . . . it is possible you could say “we are not going to define objects with three thousand tags. We are going to describe functions and operations.” Now, I am pretty convinced that is insane. But it is fun, as Allen [Renear] pointed out.
[35:01]
[Julia Flanders] I have to gloss this for a minute. I think that when you say “had no idea what we are going to do with it,” that that is actually not quite true. It is a little disingenuous, but very importantly so because in fact, we have what you might call an intermediary processing model, that is to say. . . we knew what kind of people were going to want to do what kind of thinking with a specific text. We did not know what kind of specific tools they were going to need to do those things, but we knew that they want to study people and we knew they want to study genre and we knew that they want to study these things. So I wonder if it is just a question of identifying processing at the right level of specificity.
[35:35]
[Laurent Romary] You have the dream of processing, at least.
[35:38]
[Wendell Piez] Well, you also had the ability and you did not fail to take advantage of the opportunity to validate against a schema.
[35:46]
[Julia Flanders] Sure, sure, but that is . . . after some point, yes.
[35:51]
[Stephen Ramsay] For the sake of time — we are talking about isomorphism and it is like . . . isomorphism all the way down. We are explicitly not talking about our ability to go from one input to another output. We are not talking about “can you do it?” We are talking about “have we separated the concerns right?” We are talking about “have we put people’s mental energy in the right place?” These questions seem, to me, paramount. They are connected to our architectures and our models. It comes back to — as I said at the beginning, I want an answer to the question “why does our workflow look so unlike some of the other workflows that are used?”
It was very interesting. Elke [Teich] put up her data model: Weka. R. RapidMiner. That is a very different way of thinking about data modeling, and not an illegitimate way of thinking about data modeling. I mean, we are sort of different. Maybe for good reasons, but what are those reasons?
[36:55]
[Syd Bauman] I just want to say, real quickly before you begin the next session: I think Wendell [Piez] was coining the term “potential semantics” — using semantics as a much more powerful concept.