Friday, October 28, 2011

Draw me a network

It would only seem natural, since we aim at reconstructing networks on the basis of our digital information, to set ourselves the ultimate goal of visualizing them - you know, mapping-wise. But somehow I am reluctant to go this way. I feel like someone who has been given a car, the keys to it and even the directions to go from A to B, and would still not be able to get there.

My reluctance has first to do with the fact that I really do not get how a bunch of lines and arrows can give you more information than a sorted list. The only people in awe of this kind of visualization are those who are able to interpret it. And as far as I am concerned, being a text kind of person, drawings are not particularly hermeneutically inspirational to me.

There is a more general problem, though. Why would you want to reconstruct networks when you work in German Literature? It is a different thing when you are, say, a historian. Making network reconstruction fruitful in German Literature means being able to draw lines between writers and texts, and drawing conclusions that concern the content and form of the texts. That is one thing. But there is also the peculiarity of the period we work on, where cooperative thinking and writing is structurally dominant. So the first step of the network extraction consists in reconstructing who contributed how to certain works (giving the first idea, correcting the manuscript, writing a review,...). There is no way to visualize all of those mechanisms as a whole. You have to pick a particular work or literary debate and see how several personalities work out a position around it in a very limited period of time. The idea of reconstructing "the intellectual network" is in itself absurd, as there are many networks interlaced with one another in relationships and in time.

As you can see on our logo, the very end of the perspective is still blurry.

Monday, October 24, 2011

Ja wohl, mein Herr!

I have updated my webpage! You can now find there an acceptable presentation of the activities of my research group as well as a presentation of the digital edition - with some more information than I posted here.

There is a "but", though: it is all in German. Not sure you can crack it all with Google Translate...

Tuesday, October 18, 2011

Unknotting the < note >

I am sure the suspense was unbearable, so here goes my post on the note-element.

For the TEI-geeks reading this blog: you have to bear in mind that we are a small project and could not afford to go into the subtleties of genetics or critical apparatus.

There were actually two layers of problems:
1) The most philologically troubling to me was the fact that you are supposed to encode very different textual elements with this < note > element: the footnotes of the author (in my example letter, the part situated at the bottom of the page), the elements added by the people who at some point rewrote (abridged, censored) the letter in order to publish it (here the part in red ink), the elements added by the archivists over time (for instance the foliation in pencil on my letter) and the comments we want to add to our edition, be it to explain in what book an extract was already published or what the "Curiosität" mentioned on the first line is.

2) The second problem concerns the author's notes. Of course, when the author has changed or added a single word, we don't mark it as a < note >, but with an < add > element. So the question was: if < add > and < note > are such closely related elements, when does an addition cease to be an < add > and qualify as a < note >?

The first idea was to have different types of notes. Each < note > could thus be specified by a @type and a @resp. To get back to my example, the footnote on the letter would have been a type="author" (the @resp being obvious), the red ink a type="second_hand". And there was the problem: there are two people who published an edition of this letter and may be responsible for the red ink: either the addressee - or the sender, but then he would have added this red ink stroke much later than he actually wrote this letter. There was no way this solution would not draw us into a black hole of endlessly more complex @types, @subtypes and @resp.

And then came this:

Laurent, who is obviously in a @hand-phase (didn't convince me at first, but now I am starting to see the point), suggested that we use the < note > element connected to a @hand to describe the different text layers from the author's hand, being thus able to give an exact account of the temporality of the writing process. The @resp attribute was to be used only for the notes we add to the text (in the form: type="comment" resp="anne.baillot"). Laurent was unsure what to do with all the other people who intervened between the author and us. But at that point, it was obvious to me that the separation between @hand and @resp was a good way to sort the things you can actually see on the manuscript from the additional information, and so we will use the @hand to describe all we can see on the manuscript and the @resp for all the immaterial input.
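
For the TEI-geeks, here is a minimal sketch of what this separation could look like in practice. It is only an illustration under assumptions: the sample text, the @hand values (#author, #red_ink, #archivist) and the @place values are hypothetical and not taken from our actual files; only type="comment" resp="anne.baillot" follows the form given above. The small Python function (sort_notes is a made-up name) just checks that the notes visible on the manuscript can be told apart from our editorial comments by looking at @hand versus @resp.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical fragment: all @hand and @place values are illustrative assumptions.
SAMPLE = f"""
<p xmlns="{TEI_NS}">
  Text of the letter
  <add hand="#author" place="margin">one added word</add>
  <note hand="#author" place="bottom">a footnote by the letter writer</note>
  <note hand="#red_ink">a remark added for a later publication</note>
  <note hand="#archivist">foliation in pencil</note>
  <note type="comment" resp="anne.baillot">an editorial explanation</note>
</p>
"""


def sort_notes(xml_text):
    """Split <note> elements into those visible on the manuscript (@hand)
    and those added by the editors (@resp)."""
    root = ET.fromstring(xml_text)
    on_manuscript = []
    editorial = []
    for note in root.iter(f"{{{TEI_NS}}}note"):
        text = (note.text or "").strip()
        if note.get("hand") is not None:
            # Something you can actually see on the manuscript.
            on_manuscript.append((note.get("hand"), text))
        elif note.get("resp") is not None:
            # Immaterial input: a comment added by the edition.
            editorial.append((note.get("resp"), text))
    return on_manuscript, editorial


on_manuscript, editorial = sort_notes(SAMPLE)
for hand, text in on_manuscript:
    print("manuscript:", hand, "-", text)
for resp, text in editorial:
    print("editorial:", resp, "-", text)
```

Note that the < add > in the sample is deliberately left out of the sorting: the sketch only separates the notes by @hand and @resp.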

If you have been following me so far, this is the point where you should say: But wait, this is not a solution to the add/note problem at all. Things are not so clear there. The idea of sorting < add > and < note > according to their position on the manuscript ("Lage" in Laurent's schema) is not sufficient - we too often encounter ends of sentences added in the margin AND complete new paragraphs written in the margin. Defining it by the context is a slippery slope as well: this involves so much interpretation that we would not be able to keep it coherent within the whole project. The aim now is to define for each subproject what the habits of each author are in this regard and to determine accordingly when to use < add > and when to use < note >.

The manuscript I used here is a letter written by the poet and dramatist Ludwig Tieck to his friend Friedrich von Raumer in the year 1823 (or 1824, look at the part in red ink at the top of the letter). It is preserved in the Staatsbibliothek zu Berlin-PK and is part of the first series of texts we will edit (in the subproject "Tieckiana").

PS: I allowed myself to add the 6 happy girls on Laurent's original.

Monday, October 17, 2011

Brackets, here we come!

Würzburg definitely helped me get over my "fear of the brackets" (an expression that applied only too well to me).

Hear, hear: After another 2 hours of discussion, we finally found a satisfactory way to use the <note> element. I am so thrilled I just had to post it. Details following tomorrow! (I will try to find a satisfactory way to explain problem & solution...)

Saturday, October 15, 2011

Adios, Würzburg

So here goes, heading back home (and already dreading the trip with the Deutsche Bahn)!

What I will remember of my first TEI conference:

1) Big projects and small projects do not really have the same workflow problems, but in the end we all end up trying to figure out a way to use the TEI-P5 guidelines without increasing their volume too much by adding this and that proposal for a complement because what we actually need is not described there yet. Which leads me back to the question of what standards are actually there for. I can't think of a definite answer to that. Do we need a communication basis or a completely unified system? The irritating part is of course that the guidelines will give you 3 ways to encode one thing and none to encode another, and you have to make the best of that situation for your project.

2) Long-term archiving is a problem for everybody. I suspected that before, and I think that since it is a general problem, it should not be the job of the project managers to find their own little solution, but a problem for larger infrastructures like DARIAH in Europe to take care of.

3) There are so many cool problems connected to encoding. As a project manager, I am very busy keeping everything working at all and do not really have time to think profoundly about the hermeneutical implications of each and every decision we make, but somehow it still has to do with what the early 20th century scholars did when setting standards for critical editions. We are in the process of redefining philology - the problem is, we are so deep in it that it is difficult to reflect on it. But if anything, that would be the one theoretical question that would interest me right now.

4) The <note> element is really a problem, not just for us. I will post on that later and I promise to do so on the Special Interest Group Manuscript list as well.

5) Posters are a cool way to present your project in a way that the person you are talking to gets the information he/she really wants. We should have more of this non-authoritative kind of scholarly communication in the traditional humanities. But it still is a challenge to stand next to James Cummings!

6) Würzburg reminds me too much of the city I grew up and got endlessly bored in as a teenager. I want to get back to Berlin!

Friday, October 14, 2011

Slim Slam

(This is the 1-minute epos I presented in the poster/poetry slam preceding the poster session at the TEI conference in Würzburg)

Once upon a time there were six brave girls
Working in such caves they call archives.
A good fairy came along and said, through his beard,
Come, miladies and hear the tale of the TEI,
For with the Guidelines you may
Alas! We followed the siren’s voice
And not only digitized letters
But could also make the choice
What could cure our distress?
Not even the poor students
That we taught at great expense
Of time, 'cause peekaboo:
See, they are infected too,
The disease has spread all over!
Come to our poster!!

Thursday, October 13, 2011

Digital Editions Session Part 2

It seems to me that the different projects presented so far each try to deal in their own way with the dilemma of simplicity vs. complexity. The paper of Bertrand Francois Gaiffe and Béatrice Stumpf, mostly a technical one, showed the difficulty of trying on the one hand to simplify the TEI part (using TEI-tite) and on the other hand to enrich it with the tools they need.

The last paper of the section was given by Matteo Romanello and Aurélien Berra. They presented an ongoing edition project of Athenaeus as the meeting of a Digital Humanist and a Classics scholar, starting by quoting Gregory Crane. The text itself is a compilation and the quoting system was a great incentive to make it a digital edition. The editors are - fittingly - using a wide range of quotation forms (<q>, <quote>, <cit>, <mentioned>, and several others). The complexity connected with the genre of the fragment, though abysmal, was presented very clearly (somehow I tend to consider that people who are able to explain the problems clearly are not that far away from the solution).

Even they are not moving within the limits of the TEI, asking for their project "what TEI should be combined with to create not only an edition but a complex virtual editing environment". They mentioned the Homer Multitext Project and went in more detail into the CTS protocol, which they use and intend to "go beyond". A beautiful project still waiting for funding to make it real.

Digital Bundle

Here are links to digital editions that were mentioned in the course of the day:

Klagenfurt edition of the works of Robert Musil
Chinese Buddhist Electronic Text Association (CBETA) - in the Chinese Buddhist version!
Letters of Van Gogh (ah, the Van Gogh letters!)
Mark Twain Project (Autobiography of Mark Twain)
Henrik Ibsen's Writings (Maximum Edition online, Minimum available in print)
Edition project of the Works of Thomas Le Roy (still virtual, presented here)
Digital Edition of Slovenian Literature
The Homer Multitext Project 

And I came across the Catalog of the Scholarly Digital Editions while browsing.

Digital Editions Session Part 1

The presentation of Marie Bisson's project of a Digital Edition of works by Thomas Le Roy was interesting to me especially because it is still pretty virtual - just like our project, still wrestling with several layers of technicalities before you can really say what it is going to look like. But the added value of comparing 3 versions of not so identical texts seems to appeal to the more experienced colleagues in the room.

I could see that Marie Bisson chose to have the values of her tags in French. We decided to switch from German to English in our own tagging at a pretty early stage for reasons of readability of the xml-file. I will have to ask Marie Bisson why she kept the French... (the only presentation of the project you can find online is also in French)

And now to Faust: a titanic project at last!!
The first paper by Gerrit Brüning and Katrin Henzel presented the multiple encoding of the same information as two sides of the same coin. In this huge genetic edition, textual and documentary transcripts are kept in different files containing different markups (markup rules are documented on a wikipage). In order to synchronize contents, an algorithm was developed to ensure automatic collation. Gregor Middell and Moritz Wissenbach then went into the technicalities. Looks like the Faust editors do not only know some mephistophelian tricks, but also Chapter 20 of the Guidelines almost by heart.

It is impressive how the individual work steps are being identified to reach the publishing objectives. Obviously, as the speakers mentioned, it is NOT something you can try at home by yourself: you need some serious institutional backup to realize it.

During the discussion, Werner Wegstein asked an interesting question (at least interesting for us) about the different scripts (ha!) but I was pretty disappointed by the complicated answer (still no clear solution for us in sight...)

Live from Würzburg: Blue, Red, Green, Yellow, Black Dariah

In the first session (birds singing and sun shining outside), Armin Volkmann and Christof Schöch presented DARIAH and some of its subprojects. The beauty of European projects is in general the complexity of the structure they develop in order to involve the different partners - and the acronyms they use. DARIAH is no exception - the bubbles and arrows showing the whole structure are colorful and complex - an administrative wonder if there ever was one.

One first big project they mentioned is the one working on standardizing metadata. In the course of the discussion, it turned out to be still pretty virtual, but it really is the kind of thing that one can only hope will not remain virtual.

Quoting from Martin Müller's letter, they presented several DARIAH projects working on developing TEI-aware tools. Christof later mentioned to me interesting projects like MONK and TXM textométrie that work in that direction.

On the question of community building (one of the aims of DARIAH), they suggested creating a Wikipedia portal. Although I am not sure which kind of crowd you can actually gather this way, it will surely reach a wider range of people.

Discussion also revolved around the question of the format of membership to the TEI-consortium. Although Martin Müller is not here, obviously his letter and the issues it raises are quite present.