Friday, September 30, 2011

Paso Doble

I might have to revise my judgement on Wikipedia. I have always considered it helpful only when you already have a substantial amount of background knowledge on the topic you are looking into. In the follow-up to my ISO-related worries, I have to admit that I gained some interesting insights through Wikipedia.

Laurent told me about the kind of Paso Doble ISO and Unicode danced for some years before they started working hand in hand. After he had explained to me the historical meaning of the ISO Latin script, I was somewhat more inclined to consider German Old Script/Kurrent as indeed related to it, Latin being a very wide family of scripts. So much for the soothing of my hermeneutic irritation.

This didn't help much, though, with our encoding: we still had to differentiate between two types of handwriting. As Laurent explained to me, ISO standards were primarily conceived for printed characters, not for handwritten ones. So it seemed only logical to use the ISO standard for the Fraktur script, in which the texts written in German Old Script were actually printed in the 18th and 19th centuries, to mark the difference from the non-German Latin script.
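For readers who like to see such things spelled out, here is a minimal sketch in Python of how a change of script could be recorded along these lines. It is an illustration under assumptions, not our actual encoding: I assume that each run of text is wrapped in a TEI `seg` element carrying an `xml:lang` tag made of a language code plus an ISO 15924 script code (Latn, 215, for Latin; Latf, 217, for Fraktur). The function names and the sample words are invented for the example.

```python
import xml.etree.ElementTree as ET

# Qualified name of the xml:lang attribute (the "xml" prefix is predefined).
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# ISO 15924 codes: four-letter code and numeric code.
SCRIPT_CODES = {
    "latin": ("Latn", 215),
    "fraktur": ("Latf", 217),  # used here as a stand-in for German Old Script
}


def script_tag(language, script):
    """Build a tag such as 'de-Latf' from a language code and a script name."""
    code, _number = SCRIPT_CODES[script]
    return f"{language}-{code}"


def encode_runs(runs, language="de"):
    """Wrap each (text, script) run in a <seg> element inside one <p>.

    The <p>/<seg> layout is an assumption made for this sketch only.
    """
    paragraph = ET.Element("p")
    for text, script in runs:
        seg = ET.SubElement(paragraph, "seg")
        seg.set(XML_LANG, script_tag(language, script))
        seg.text = text
    return ET.tostring(paragraph, encoding="unicode")


if __name__ == "__main__":
    # Invented sample: a German phrase followed by a word written in Latin script.
    print(encode_runs([("Hochgeehrter ", "fraktur"), ("Monsieur", "latin")]))
```

The point of the sketch is only that the distinction lives in the script subtag: a stylesheet could then render the `Latf` runs and the `Latn` runs in different fonts, as printed editions traditionally do.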

But the story doesn't end there. The first problem is that Fraktur was used not only as a printed version of German Old Script, but also as a printed version of Sütterlin, a very much simplified version of German Old Script used mostly in the 20th century. In terms of ISO 15924, that would be Latf, 217. But it is pretty unsatisfactory to stay at such a general level that German Old Script/Kurrent and Sütterlin would basically be considered the same. The deeper problem is probably that working with handwritten material is not really compatible with a system based on the differentiation between scripts and fonts (what is a script and what is "only" a font?).
And why does Fraktur (according to Wikipedia) have a Unicode number attributed to it, while neither Kurrent nor Sütterlin does?

For those who are deeply bored by these considerations and would like to actually get more than far-fetched metaphors, I recommend watching Strictly Ballroom.

Tuesday, September 27, 2011

Brave new canon

In a paper published in Digital Humanities Quarterly dating back to 2009 ("The Productive Unease of 21st-century Digital Scholarship"), Julia Flanders points out a series of problems our digital edition has to face and, more importantly, reflect upon.

One interesting point is the fact that digital editions and digitization projects give corpora a new importance. Texts that were so far considered secondary are now just as easily available as any other literature. Also, documents that were not easily accessible (mostly for preservation reasons) are now online like other, more robust media.

Not so long ago, handwritten material - which is most of the time unique - had to be tracked down to the small archive where it is preserved, and an adventurous journey had to be organized for you to be able to see and read it. You would reach the archive exhausted and utterly excited, see that it is only open 3 days a week for 2 hours, stand by the door the very minute it opens, grab your pencil (ink pens being strictly forbidden) and feverishly start transcribing, hoping to be able to decipher each and every letter, spot and stain during the few days of your stay. Once home, you would then start thinking about the content of your transcriptions.

Nowadays, you can search online catalogues by keyword (such as the very complete German Kalliope catalogue), see where the documents you are interested in are preserved, fill in an application for digital copies and have those sent home, when they are not online already.

This modifies, as Julia Flanders points out, the relationship between canonical and non-canonical literature. It is true, too, that the digital canon that is arising before our very eyes seems structured by rules that are largely independent of the traditional history of literature.

For scholars used to working on unpublished, handwritten material, it also modifies the relationship to the archives and to the objects of scholarly desire you can find there. In one click, you can leave behind the closed world of the egoistic discovery (you know what I mean: the manuscript you cannot really believe you actually found, and that you are so happy about that you almost cry or laugh to yourself in your archive room) and make it known to the whole world.

This deeply affects our relationship to archivists as well. Even more than us scholars, they have had to go through a radical evolution of their role as go-betweens, from the rooms of a small archive visited only by crazy scholars to a world-exposed position - with probably many more crazy non-scholars.

Monday, September 26, 2011

Latin Salsa

As you can see in this letter written by Lessing around 1750, it requires particular training just to be able to read the kind of script used in Germany in the 18th and 19th centuries (this letter is indeed a pretty nice and clean example - it usually gets far worse).

The texts we publish in our digital edition are mostly - but not exclusively - written in this kind of script, called German Old Script ("alte deutsche Schrift" in German, also known as "Kurrentschrift"). The problem we were confronted with was that the script changes very often within a text, especially between Latin and German Old Script. The importance of those changes of script is obvious in terms of the materiality of the document, but also as a way to describe the literary practice of the different authors. In traditional German editions of texts of that period, it is standard to render the script differences visually, most of the time by changing the font or font size.

But we couldn't find German Old Script in the ISO standards (which we use in our encoding). So we (that is, in this case, Laurent) bravely sent a request to have an ISO number attributed to it.

Here is the answer we received from the ISO committee:
"After submitting your email to the experts, it seems to me that Kurrent is just old German handwriting which uses the Latin script."

Of course, the idea that this would be "just another" Latin script is irritating for those of us who spend hours trying to figure out what these characters are. But the real problem is that it is precisely the difference from Latin script that we want to make noticeable.

So the little salsa with the ISO committee might last a little while, since we will certainly try to have Kurrent considered, if not a non-Latin script, at least a defined sub-class of Latin script.

Friday, September 23, 2011

Digital Edition "Das intellektuelle Berlin um 1800"

The Digital Edition of the junior research group is a work in progress. It is probably bound to stay a work in progress for a long time, but we hope to go online with a first set of texts by the beginning of 2012.

This edition has two main aims. The first one is to make texts available that have not been published so far or have only been partially published. The second aim is to use the newly published corpus to develop methods that can help describe the intellectual networks in Berlin at the beginning of the 19th century in a more accurate way.

The picture shows a screenshot of what the front-end of the edition should roughly look like: the darker orange upper border lists the different corpora; the lighter orange one gives access to different types of queries. The main window is composed of a digitization of the manuscript on the left side and its transcription on the right side. The transcription can be displayed either in a form that is fully faithful to the manuscript ("diplomatisch"), in a commented form ("kritisch"), or in the form of the original XML file.

I will be posting our specific encoding guidelines soon. They reflect our tireless efforts to respect the TEI Guidelines without getting lost in details...

Getting started

I had been considering an English version of my webpage for a long time and decided to go for a more dynamic form instead.
I already sensed that blogging was rapidly gaining academic recognition. This Guardian post definitely convinced me that it would be a good idea to try and go that way.
I will be posting news on my research, on current developments in Digital Humanities in Berlin and around - and on all the things that might be worth posting!