Patrick Sahle, “Modeling Transcription”

Theoretical Perspectives II (March 15):

Patrick Sahle, “Modeling Transcription”


[Patrick Sahle] Ladies and gentlemen, I am […] following the presentation by Paul [Caton] yesterday to talk about [wheels?], and […] of course, as well, and why are we talking about transcription models. This is my program for the next twenty minutes. So you can see, I have to speed up a little bit, of course.

Traditionally, the idea behind transcription is that it yields with the textual description (that’s the quotation from an article by […] on the document) that there is a total linear order in the text and that there are signs and characters that can be identified objectively.

The idea is that textual identity is a linguistic code of text that is in the wording or lettering. But who believes in such an approach today? There are many full theoretical problems with the traditional approach. I’d like to sum them up in my own conclusion, which is that it’s impossible to state a closed set of features, effects, to be transcribed in order to produce identical texts that is independent from time or […] It’s impossible to clearly define what is a letter, or a character, or a sign, with any […] and which sign or character it is.

There are practical problems as well. Transcription depends on the expected usages of a transcribed document. With the digital turn, at the latest, users have expectations towards transcriptions that are way beyond the linguistic code. I will skip some things…


My theory of transcription is largely based on a theory of text that I had elaborated earlier, maybe ten years ago, and presented elsewhere, but haven’t published in an accessible way so far, and I am sorry about that. That theory of text aims at text, not as the creation of new text, but as the reproduction of textual objects. In that respect, it’s a useful basis for modeling transcription as well.

To keep it very, very short: what is text as the reproduction of text? Text is an attempt to convey meaning by expressing it verbally and fixing it with a document [or by meaning?]. There are other positions in between these main three positions.

Text may be seen as, for example, the work, as a specific structure beyond some linguistic code. Text may be seen as a set of graphs, signs, graphemes that can be interpreted as meaning characters or words.

And text can be seen as a visual phenomenon that sometimes carries semantic information directly, and not via written language, and that is why this forms a circle, and not a line.

Of course, I do not intend to state that there are only six notions of text. But, rather, that there is a whole universe of textual notions. Sometimes, […]  can be located as ranges on the wheel of text, which I will call the pluralistic model of text. So let’s talk about transcription.


Transcription is reading written down again. That’s why, traditionally, transcription leads to a linear set of characters and words. Text is what you look at and how you look at it. Text and transcriptions are defined by the glasses you wear, by the tools you apply. Transcription is mapping perception towards the target system. And then by reading or applying textual criticism, you add information to the transcription. So, there are reproductive and productive forces at work here. What we create as digital transcriptions are potentially information-rich resources, representations, that are not media presentations in the first place, but that are transmedial digital representations which become media presentations in further steps of processing.


Transcription as a result of reading–that seems to be pretty straightforward. You read the text and write it down. But that’s not true when you take a closer look at it. What you get is a physical document. What you see is a visual thing. What you read are signs, words, graphemes, and the result is a construction.


Let’s have another example. What you see might be a picture. On a grapheme level, we would have to transcribe it like that. Recoding it in the contemporary alphabet that doesn’t have the letter “v” would lead to this one. Taking our model alphabet as a target system would lead to this one. But “universum” as a word isn’t the result of simple rule-based character translation or identification. It’s the result of the identification of the string as representing a certain aspect of the word, which then is written down again. That was the traditional approach of [cabbala?]. That was the approach as triggered by print culture and its promotion of an abstract and word-letter oriented notion of text.


With the digital turn and its support for visual presentation, as well as multiple presentation, a pluralistic notion of text is facilitated. In this media and culture environment, various information channels may be transcribed and encoded. Like an image, mise-en-page, description of textual areas, graphemes, graphs, letters, or the structure of a work. Still, transcription is a distinction between noise and information, between noise and signal. And that distinction is made by the chosen text model or target system, which is like a pair of glasses.


What is noise for one target system is informational signal for another. There will never be a possibility for not having noise filtered out–no possibility to be complete towards all possible, recordable facts.
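[Editorial illustration, not part of the talk: the idea that the target system decides what counts as signal and what counts as noise can be sketched in a few lines of Python. The `Observation` class, the channel names, and the sample values are all hypothetical and chosen only for this sketch.]

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One recordable fact about a document (hypothetical structure)."""
    channel: str  # e.g. "character", "layout", "ink"
    value: str


def transcribe(observations, target_channels):
    """Split observations into signal and noise for one target system.

    The target system is modeled, very crudely, as the set of channels
    it is able to record. Everything else is filtered out as noise.
    """
    signal = [o for o in observations if o.channel in target_channels]
    noise = [o for o in observations if o.channel not in target_channels]
    return signal, noise


# Invented sample facts about one spot on a page.
observations = [
    Observation("character", "v"),
    Observation("layout", "line break"),
    Observation("ink", "red"),
]

# A target system that only knows the linguistic code.
linguistic_signal, linguistic_noise = transcribe(observations, {"character"})

# A more diplomatic target system that also records layout.
diplomatic_signal, diplomatic_noise = transcribe(
    observations, {"character", "layout"}
)
```

[In this sketch the line break is noise for the first target system and signal for the second, and the ink color is noise for both: no finite set of channels records everything.]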


There is another old problem that can be addressed by this approach: authenticity and identity. Identity is determined by the limits of the target system. A document and its transcription can be identical if the transcription is complete and without errors according to the target system and its rules. Identity may lie in the use of documents and transcriptions. If they serve the same purposes equally well, they might be called identical.

Some of the transcriptional processes seem to be productive rather than reproductive, but some people might argue as well that this is not transcription but rather editing. Possible examples of productive processes in transcription are shown here. We could say we have reproductive forces mainly on the side of the physically and visually given and productive forces on the side of the more abstract notions of text.

Maybe this isn’t very convincing, since all transcriptive processes are interpretative processes, and interpretation is productive. And it may depend on your own notion of text. If you believe that a text is completely constituted by the linguistic code, then you may understand the notation of that linguistic code as reproductive, as a reproductive act rather than a productive act, as I would do.


Maybe we would have to concede that within the mapping from the document towards the target system, there are interpretative and thus productive processes. Think of a physical description of a document–a manuscript description, for example. So, we would have to draw another line between reproduction and production as is shown here.

Maybe the distinction between reproductive and productive forces isn’t that productive at all. I liked it for a while, but I feel increasingly uncomfortable with it. It would be the first occasion where I would disagree with Mats Dahlström.

Elena [Pierazzo] has called it, in another wording, “the shift from the mimetic to the analytic level.” Maybe that’s a better wording. Maybe I will switch to that in the future.


Anyway, nowadays transcription isn’t only the act of writing down again what’s been the result of reading anymore. But, with the technologies at hand, it’s rather the protocol of reading processes. There’s a multitude of possible mappings at the same time. I’d like to make a strong argument here for recording these multiple mappings, which can be justified by the multiple audiences and user scenarios we have.

I’d like to make another strong argument for diplomatic, detailed transcriptions as close as possible to the document. Reading steps are based on each other. Readings are incremental, even over time, with various transcribers or users. There is no way back if you start on a higher level. So, we have to start on a level as basic as possible, and I’d like to stress the usefulness of the record of multiple mappings, not just final results of interpretation. We have seen examples of that in the previous slides, like having a facsimile, which is a sort of transcription to me, and the text, which is obviously. Another trivial example would be to record the abbreviation and the expansion. That’s what we do nowadays. It’s a record, it’s a notation of multiple readings, of course.


Another example: let’s assume we have a text represented or described as a digital image. We have a portion of text describable as a segment, as a zone, for example, of the page having coordinates. You may note that there are signs or even characters which are, as they are, like “sic,” as they are translated to a certain target alphabet (that’s already a strong step in processing: interpreting, identifying letters in uppercase, lowercase mode), or which might be corrected because they are found to be erroneous as compared to an abstract notion of a certain word and its autographic, canonized presentation.

You may as well visually note that the print mode is italics. You may interpret that as being highlighted compared to the surrounding text, or you may find that this has to be called an emphasized mode. And then you may interpret either the word as being, or the mode as indicating, a person name, which you may identify as the name of a certain identifiable person. And you may as well express this by an RDF statement like: “this text is talking about this person.” So what we have is multiple notations as a protocol of a reading process, where steps are built on each other. The act of transcribing aims at information-rich transcriptions as transmedia representations, which is another theory of mine, which I will not go into in detail here. From these, certain presentations, which are new documents, but which we may as well call performances or editions of a document, of a work, etc., can be generated or medialized. And these may emphasize or address particular notions of text, like a critical work-oriented edition versus a diplomatic document-oriented edition. This would be, for example, a more traditional critical edition emphasizing text as a linguistic code and word structure. And then the more diplomatic edition emphasizing text as being the document, and having a certain set of graphemes, let’s say, for the moment.
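[Editorial illustration, not part of the talk: the “protocol of a reading process, where steps are built on each other” might be recorded as a list of readings, each pointing back to the step it depends on. The `Reading` class, the level names, and all sample values (coordinates, name, statement) are invented for this sketch and do not come from the slides.]

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reading:
    """One step in a protocol of reading (hypothetical structure)."""
    level: str                      # which notion of text this step addresses
    value: str                      # what was noted at this step
    based_on: Optional[int] = None  # index of the earlier step it builds on


def build_protocol():
    """Return an invented example protocol, from image zone to statement."""
    return [
        Reading("image_zone", "page 1, x=120 y=340 w=90 h=18"),
        Reading("characters", "Example Name", based_on=0),
        Reading("visual_mode", "italics", based_on=0),
        Reading("emphasis", "highlighted", based_on=2),
        Reading("person_name", "Example Name", based_on=3),
        Reading("statement", "this text is talking about this person", based_on=4),
    ]


def chain(steps, index):
    """List the levels a reading depends on, most basic level first."""
    path = []
    current = index
    while current is not None:
        path.append(steps[current].level)
        current = steps[current].based_on
    return list(reversed(path))


protocol = build_protocol()
# Which earlier readings does the final statement rest on?
dependencies = chain(protocol, 5)
```

[Keeping every step, rather than only the final statement, is what allows a later reader to go back to a more basic level and branch off with a different interpretation.]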


Of course, this is all known to you. Sometimes it’s simply called the “single source principle.” But I wanted to show this again to make a point for thinking not just in terms of intended presentations or purposes of an intended edition, but also in terms of a description of text as abstract and information-rich and neutral towards presentation media as possible, and yes, I know there are limits to that approach.

One of the oldest core questions on transcription is surely objectivity and subjectivity. The traditional approach distinguishes between Befund and Deutung, as we have heard yesterday: between record or objectively identifiable features — “An A is an A is an A” — and interpretation on the other hand.

Surely I’m not the only one here who doesn’t really believe in this distinction anymore. I suppose as a starting point for further discussion to distinguish between two levels. First, every act of transcription is interpretative in the same way, as it is a process of mapping towards a target system. These target systems are disputable. Maybe they are arbitrary. No target system can be complete, since it’s related to future research questions, which we cannot know, yet.

For example, I will map these letters onto a given alphabet. I will describe these textual features according to a certain model of a textual genre. There is, for example, a salutation in a letter. Or another example, I will take a digital image of this document with 300 dpi, this color range, this camera, this lens, and this light. All of these are target systems, which are disputable, and which have to be chosen.


Second, decisions within these mappings towards target systems might be disputable in different ways and on different levels. Two transcribers taking pictures with the same camera in the same setup will probably get the same results. Two transcribers working with the same target alphabet and transcribing strictly on the character level would agree on 99% of the characters; that might be less or more when they transcribe on the word level. (Why don’t I see all of my notes?) But as we all know, if we give transcribers a non-trivial document and let them transcribe it according to the guidelines of the TEI, we will probably never get the same result twice.

Okay, now, the last one was a trap, because here I mixed both levels. If you give somebody just the TEI guidelines, you haven’t clearly defined the target system, so you cannot take that as an example.
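[Editorial illustration, not part of the talk: agreement between two transcribers on the character level could be estimated roughly as follows. The similarity measure from Python’s standard `difflib` module is an assumption of this sketch, not a method named by the speaker, and the two sample strings are invented.]

```python
from difflib import SequenceMatcher


def character_agreement(transcription_a, transcription_b):
    """Return a similarity ratio between 0.0 and 1.0 for two transcriptions.

    SequenceMatcher.ratio() is 2 * M / T, where M is the number of
    matching characters and T the total length of both strings.
    """
    return SequenceMatcher(None, transcription_a, transcription_b).ratio()


# Two hypothetical transcribers who chose different target alphabets.
agreement = character_agreement("vniversum", "universum")

# Two hypothetical transcribers with the same target system and reading.
full_agreement = character_agreement("universum", "universum")
```

[A figure like this is only meaningful when both transcribers worked towards the same, clearly defined target system; otherwise it mixes the two levels that are distinguished above.]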


Okay, what is it good for? I think that such a model is good for many things.

First of all, it’s interesting or it’s good for analysis of media history. Which media or text technology supports which notion of text?

It might further be useful in teaching textuality, transcription, or editing. It’s an extremely simplifying model. That’s why it’s useful to teach people about textuality.

It’s a means for cartography and visualization to support better understanding. For example, what do people mean when they talk about transcription or text? So if we have, for example, literature, articles like that one by Claus Huitfeldt and Michael Sperberg-McQueen, “What is Transcription?” then we can locate that article on that wheel of text. We can say, they are talking about this range, this portion of the model.

The same is true for recent works of Peter Stokes on paleography and transcription. On graphs, ideographs, allographs, characters, ontographs — we could as well locate the work of Peter Stokes at a certain place in that model. The same is true for when Melissa Terras talks on markup on a stroke level. Then she’s moving somewhere on TextG, text as graphs or graphemes. Or when Melissa Terras talks about the critical implications of digital imaging, this is talking about transcription in a certain sense.

Or, when Wendell Piez talked yesterday about text as reaching from plain text to tables. Well, tables are always a very interesting case when we discuss textuality, because with the tables we can see that sometimes the text and its meaning, its semantics, leaps over the linguistic code, because a table is in the first place given visually as a visual structure. We derive a meaning and a semantics from it, from that visual structure, and not from the words that are inserted in the cells.


Anyway, how does the model relate to other models? That would be another question. As we said yesterday, maybe we should look for a model of the models we use. Which ranges of the model are covered by which standard in which way? This can also be mapped. The model may help in the assessment and further development of these standards, and it may serve as a reference for software development. I still dream of software that supports the notation of a protocol of reading, or as we read.

Okay, two examples: How does the model relate to other models? One very trivial example is of course this one. FRBR. FRBR can be located here, like that, but very, very roughly; you’d have to have a more detailed discussion, of course, but that’s roughly where FRBR could be localized. Or another example: which ranges of the model are covered by standards in which way? Let’s have a look at the TEI. Obviously the TEI is traditionally focused on certain notions of text more than on others. The TEI assumes that you have a linguistic code and you talk about the linguistic code, for example, in terms of structures, of genres. That’s what the TEI is best at, I think. In this point of view, the TEI is not complete. It has some biases. But maybe things are on a good way in this example. Because we have the SIG on the manuscript, we have the SIG on ontologies, and we have another SIG on graphics. We could think that the TEI is on a good way, but of course, this is an extremely optimistic and friendly picture of the TEI, as it pretends that there are no gaps left. Maybe sometime, someone will locate all single elements and attributes on this map. That would be nice. It would maybe show another picture where the TEI has biases and where it’s strong and where it’s weak.

We have another example for another SIG — where can that be located? The SIG from linguistics. You can see it’s not so easy to stick to that very closed model. Everything has its place and fills a gap and so on. And as I’ve said, that model may serve as a reference for the development of software that would ideally support multiple notions and multiple notations of text.


Okay, other further challenges. Yes. What I would like to do, or what I would like to see being done in the future, is testing the usefulness of the model with three cases, for example, of editing. So you can use that as a guideline for reviewing editions to assess what notion of text is behind this transcription or this edition. What range of notions of texts are covered by this edition. Of course this model needs further detailed modeling. For example, the situation in text as a grapheme is a wide range–covers a wide range– of phenomena, of features that people want to transcribe. So, if I were to talk to Peter Stokes, I expect that he would tell me, “well, you have to be more differentiated and yielded, and you have to have this and that and you cannot pack it all together,” so, we have to be more detailed in certain areas of that model.

We have to talk about various levels of abstraction.


Maybe what I showed was processes of reading and in quotation, going counterclockwise around the model. But we can also argue that sometimes abstraction and interpretation takes another direction by going from the core of it to outer orbits, I would say, instead of moving around that wheel.

We might also further think about transcription and text as an ontological compound object, because what is inherent in that model is a problem–what is actually the ontological status of the things we are describing? That model suggests that we’re describing different things at the same time. We can further think about the depiction of the FRBR levels, the FRBR entities, on that model, and we would find out then that it’s hard to draw a clear borderline for the single FRBR elements. So, even if you try to transcribe a FRBR item, you will insert information that belongs to the manifestation, expression, or work level–that’s what I mean by ontological compound object. We can think about which steps in mapping and interpretation should be represented explicitly. Many of these processes are rule-based, of course, so the question is, should we note the rules or the results of the application?


For example, we could note a rule: all cases of the letter “U” before other vowels may be converted to the letter “V” for certain usage scenarios. But then, we wouldn’t have to note explicitly “V” in addition to a “U,” like in universe.
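[Editorial illustration, not part of the talk: noting the rule instead of its results could look like the following sketch. The function name, the restriction to lowercase letters, and the sample reading “uniuersum” are assumptions made only for this example.]

```python
import re

# Hypothetical diplomatic reading, recorded once, close to the document.
DIPLOMATIC = "uniuersum"


def apply_u_to_v_rule(text):
    """Convert a lowercase "u" to "v" when another vowel follows it.

    This encodes the rule as stated in the talk, in a simplified form:
    the lookahead (?=[aeiou]) checks the next letter without consuming it.
    """
    return re.sub(r"u(?=[aeiou])", "v", text)


# The normalized form is derived on demand for a given usage scenario,
# so it does not have to be stored next to the diplomatic reading.
normalized = apply_u_to_v_rule(DIPLOMATIC)
```

[Here `normalized` is “universum”: the first “u” stays because a consonant follows, the second becomes “v”. Only the diplomatic reading and the rule are kept; the “V” is not noted explicitly.]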

What about concurrent interpretations and their notations? Then interpretation may depend on context. We would have to discuss where a borderline between textual representation and external talking-about-a-text may be found. We would have to model interconnections between different documents in text, as well as same texts. So what if you take the work as a starting point for transcriptions?

Then you have to integrate various documents. But on the other hand, if you start with the physical document, you have to share information on the work level with other documents. How can we depict these relationships?

And finally, of course, we have the problem of texts and their contexts and their contextuality. How can we depict that on such a model? But that’s it for the moment, and thank you very much.

[Thomas Stäcker] Thank you very much for your elucidation and the resolution for defining, maybe, in a way, transcription.

Just a suggestion: What do you think of introducing the concept of transliteration in addition to transcription?

[Patrick Sahle] Well, transliteration, from my understanding, is transcription on the character level. Right? [laughter.] So it’s like with the example here. What’s your concept of transliteration?

[Thomas Stäcker] I’m talking about transcription in terms of interpretative approaches. So, a transcription is a concise interpretation, is a commentary, in a way. If you transcribe the manuscript, it’s very evident that the result of this transcription is the result of your interpretation of the text and what you mean, what you see there, or what you can interpret what the representation may be like. And, this is different from the sort of mechanical approach with transliteration. So I think there’s a step before transcription that could take […]

[Patrick Sahle] But then you end with transliteration–identification of characters. Or, let’s say signs?

[Syd Bauman] So, first, thank you very much for a very interesting model to think about. I first want to just confess that I was initially lulled into a false sense of security when I had presumed that your use of the word “transcription”—just as I was lulled also when Allen [Renear] used the word […]—that your use of the term “transcription” was similar to Sperberg-McQueen and Huitfeldt’s use of the term “transcription.” They are almost completely unrelated, actually. But in that false sense of security, one thing that’s fun to do with a talk is to poke at its premises right away.

I think in your second slide or so, you said that it’s not the case that there are writing systems for which we can’t. [Sahle goes back to second slide] “It is not possible to draw a border between essential and arbitrary features. All features are relative…” I think it’s the next slide back or the next slide forward. Now my mind is just boggled by watching this go by.

So if it’s not possible to draw a clear border between essential and arbitrary features and actions in text, I’m thinking that that’s probably true for some character sets. I’ll bet that in the room, we would all, or most of us, would agree on some character sets, that’s true. But, I’m wondering if we wouldn’t also be able to come up with character sets for which it’s not true.

[Patrick Sahle] Yes, that’s right, I’m talking about [non-trivial encoding?] here [laughter]. There might be a document that’s simple and that simply contains printed characters, and we all agree on these characters.

But my problem is, what are we interested in when we deal with texts? We’re interested sometimes in the meaning it conveys. And that may lie not in the characters. That is where we lose every border.

[Syd Bauman] This is where I fall afoul of that.

[Patrick Sahle] So if we say: okay, transcription is about characters, then maybe we can find safe ground. But if I’m working with historians that are interested in “facts” or something like that, which are not conveyed by the linguistic code, but by something else, what can I do for them? How can I argue to them “Well, you could have transcription but it would have no sense, no value for you”? So, what sense does the transcription have?

[Elena Pierazzo] Let’s go back to the transliteration point and I don’t think there is any such thing as a mechanical thing, ever. For instance, with “universe”, the “universum.” The difference between the “U” and the “V”: is that the transliteration? Is that the change in the alphabet? We change the alphabet since the moment in which the “universe” with the “V” or the “U” was more or less the same thing. And so we could call that a transliteration: the long “S” or the short “S” is that just a change of character? We change the alphabet and so we do that. But, the fact that we are able to get to the “universum” that is in bold over there is because we are mapping to our mind of the word “universum” that we know, so it makes sense for us to do that. And so, I mean, I do agree with the fact that there is a very fuzzy line in between not what it is that is on the page but what we choose to see on the page–because that is a very important point. My opinion is that it is not the effort you put to see what is on the page, but only what you choose to see on the page. You put two people together […] Is that new line, is that the same thing? We decide not to do the dimension, the spacing between the letters. Are we always thinking about that? Sometimes the justification of the margin makes a bigger space. Is that relevant? Perhaps it isn’t. But the point is that the game is always made with the choices and the reason might be, as you said, something…

[Syd Bauman] The evidence that Elena [Pierazzo] speaks the truth is that if you weren’t correct there wouldn’t be CAPTCHA. The character transcription system.

[Elena Pierazzo] At a three-letter level perhaps they are happy with that simplification, perhaps when you do a manuscript perhaps you are not happy with that simplification. Or sometimes you are. The point is that you cannot say a priori that this is the way it works. You never can say that. And every operation that you do from when you look at the manuscript when you type something or when you write with your pencil, is an editorial identification, that is what it is.

[Paul Caton] Yeah, I would have to completely disagree with you, Thomas [Stäcker], I think he did define transcription. My problem is that the transcription seemed to smear itself across everything from duplication to interpretation, and if you are going to model transcription, what I did not get from your model–and your model is fascinating, valuable with a lot of insight–but, what I didn’t get was a model of what you meant by transcription. What is this process that is going on?

As flawed as I think Michael [Sperberg-McQueen] would agree his initial model was, at least that was trying to say “this is the process, this is what happens.” It was an attempt to define transcription in the process of modeling. And what I didn’t get from your model was that you were actually defining transcription; you were modeling and defining a bunch of issues around transcription, but in the center, I couldn’t see what was there.

[Patrick Sahle] Yeah, but what you ask is what I’ve noted here. Where is the border? What is the border between transcribing a text, reproducing a text, and talking about the text productively. And my problem is I haven’t found that border.

[Paul Caton] But if the concept of transcription is to have any meaning, then you’ve got to put a border around it.

[Patrick Sahle] Yeah, right, right: I should have a border, but I don’t have one. [laughter] But if you could give me a border, I would destroy that border. [laughter from all]

[Michael Sperberg-McQueen] This may address the point that I think Paul [Caton] is driving at from a different angle. If you go back to the second slide…[laughter] the one that talks about the failure of the traditional model, I think that you are talking about the failure of the traditional approach to transcription as a method of representing text and not the failure of a particular account of transcription as an account of transcription.

So to come back to your historians. Yes. A historian says “I don’t care what characters are in the manuscript, I only care about the facts that are attested.” My response to the historian is, my question for you is “since when did the concept of the transcription take on the responsibility of making every historian happy?” Yes, they’re not happy with the transcription, so? Yes, transcription is not an adequate way to represent every phenomenon that is relevant to textuality. No! There is a difference in the lexicon of English as normally spoken, between transcription and facsimile reproduction. And it is precisely the elimination of certain classes of information that makes the distinction between transcriptions and facsimiles.

There is no need to define transcription in such a way as to ensure the result is an adequate tool for every conceivable, logical scientific operation. Indeed, if you want to know what transcription is good for, you must be willing to assign limits and say “yes, this explains why transcriptions don’t do certain things, and they do do others.”

[Patrick Sahle] I see your argument, I see your point. But, still I would ask: “how can you define, when you say transcription is about translating signs from a document, how can you define the limit on that?” For example, if you have a line break, it is not a sign.

[Michael Sperberg-McQueen] Isn’t it? [Michael and Patrick talking over each other] It’s a serious question, it’s a serious question. In any paleographic transcription prepared by a graduate of the École des Chartes, line breaks will be marked. Why? Because for them it is an important graphemic element. And the writing system that they are reading in the document makes them want to distinguish that. Now, they transcribe it into another writing system in which it is typically represented in a different way, as a vertical bar.

[Patrick Sahle] Now, you tell me, when you talk about transcription look it up in the dictionary. And then you say “the line break is a sign.” So I would say, “but the dictionary…” You did say it was a graphemic element but…

[Michael Sperberg-McQueen] I asked a question. I claim that the difference between transcriptions which record line breaks and transcriptions which do not, quite often are traceable to the different writing systems applied by the two transcribers.

[Fotis Jannidis] When people disagree so fundamentally it may be good to have dinner together.

[Maximilian Schich] I have a question though, at about 10:30, on your model. Your model is this nice little circle, right? Which makes me think of […] But at 10:30, there is this thing which I, as an art historian, there’s ontologies on top, if I’ve understood it right, and the visual out there, at say, 10 o’clock? How do you connect this? You talked about the rest of the clock at the time but…

[Patrick Sahle] …I can’t figure out what he means by “at 10:30”!

[Maximilian Schich] Yeah, sorry.

[Patrick Sahle] So, what is the problem with 10:30? Just a moment, please? Uh…10:30…

[Maximilian Schich] Ok, so…yeah.

[Patrick Sahle] Oh, of course, that one? Yeah. [Finding a slide labeled “Text as Reproduction of Textual Objects” with a clock-like wheel in the center. Schich refers to position of 10:30 on the wheel.]

[Maximilian Schich] […] something with red circles around it, right? It would be interesting to superimpose all of them, and there would probably be a gap at like…11 o’clock, right? Is that true? Because, how do you go from “text as an idea, intention, meaning, semantics,” etc., to “text as a visual object?” [quoting the slide]  All the ways you described how to transcribe were around basically from 12 to…

[Patrick Sahle] Yeah, but that’s why it’s a circle: because that’s not always the case. I’ll give you an example. Medieval History. I have administrative records, taxpayers from the city and the fact that the names have an order can be interpreted that these are neighbors or living in the same house here, so the fact that they’re positioned on the page in a certain way appears semantic. So, in this case, I go from here to there [using pointer on screen, but screen is off camera].

[Maximilian Schich] I strongly doubt that it could actually […] because it wraps around, right?  I don’t know if that…if it’s like…you know, you put all the colors into a painting and also…

[Patrick Sahle] That seems intentionally like an oversimplification.

[Fotis Jannidis] (Calls for questions/participants) Stephen [Ramsay]? And then, Laurent [Romary]?

[Stephen Ramsay] I think that whether this model is right or wrong or useful or not…it exposes a faultline in this discussion. The faultline is between people who insist that the world is full of nouns and people who insist that the world is full of verbs. You know?

Do you think things are defined by what you can do with them?…in the actions in which they are implicated?…the processes in which they participate? Or, do you think they are primarily defined by some kind of ontological status? And we can hear it in the way the comments switch back and forth as we go from one person to another. So, one person says, you know, what is essential and what is arbitrary in the text depends on what you’re trying to do, and two people will agree. And then…and that’s fine. And then, the next person says, “if you don’t draw a box around the word ‘transcription’ and say what it means, then it doesn’t mean it.”


So, for me, the way the conversation is shaking out, what seems to be critical in a discussion of data mining is thinking about that difference, because deciding where you are on the spectrum between “the world is verbs” and “the world is nouns” seems to, you know…have an effect on how you think about computational systems and their use and their being.

[Fotis Jannidis] Laurent [Romary]?

[Laurent Romary] Yes, carrying on with what’s been said…this discussion about the terminology there…we’re somehow the prisoners of terminology from the past which we’re trying to apply again and again, even when someone is trying to reach into the future.


Who cares about transliteration, transcription, annotation? We wanted to kill the primary source yesterday. I mean, we’ve got categories in a context where…we’ve seen that in a previous talk…we’ve got a continuum of objects which we enrich. And [for] most of these objects, we don’t have a category–we don’t have a noun or a verb for the operation–we know we’re at an evolutionary stage.


So when we annotate, when we speak, like you said, this was very essential when you said speaking about a text is like transcribing or annotating or what have you. It’s just a series of stages, so we need an underlying theory for considering how we organize this graph of changes from a physical object when it exists because sometimes, part of the job is just to reconstruct an object that we know may have existed, and this is the case in many works and manuscripts. And at the end of the day, you have an object which is never the end of the day because it’s obviously the primary source for further studies. And this is our dream, in a way: it’s to bring a pool of objects which are constantly enriched by further scholarly activities. So we should try to forget trying to place those things (where is annotation, where is transcription, transliteration?). Basically, we need to be able to get rid of that, and it’s not easy, because we don’t have the words to speak about this.

[Fotis Jannidis] You’re totally overrating the digital. I mean, it’s a new medium but, it’s…it’s not an [irrevocable? heretical? theoretical?] change of the world. We’re still working with primary sources…I’m not sure that the bigger revolution won’t be coming back to the same concepts because they’re very useful in many different contexts.

[Laurent Romary] […] the big revolution.

[Stefan Gradmann] Patrick [Sahle], thank you. I think there is one problem with your new model and that is that it suffers from a very diffuse use of the term sign. We sometimes use it in terms of something semiologically complex data…the linguistic sign…and then again we can use it for theoretical […] So I think your model cannot be applied without clarifying what level–semiological or character […] we’re operating under.

[Patrick Sahle] Usually, I try to avoid using the word sign. Yeah, today, I didn’t but really I try to avoid it because […]

Inaudible question from room… (Which word?)

[Patrick Sahle] A sign…a sign [clarifying].

[Syd Bauman] I’m just curious: Did you coin the term transmedialization? Or have I just never seen this word before?

[Patrick Sahle] That’s what I say in German. [laughter] I say Transmedialisierung.

Conversation takes over on the word transmedialization, but no one speaker or voice is clear.

[Fotis Jannidis] I get the feeling that the discussion is kind of…at its end today. [laughter again].
