Tuesday, October 04, 2005

Internship update #($whatever)

I'm in the thick of things now; it's at the point where I can come in on Monday and be fairly self-directed about what I do with my time.

I spent the entire morning and most of the afternoon working on TEI markup of the Indiana Authors biographical information. As it is, I got through maybe six pages. It is tremendously slow going: not only does each individual reference to a person, area, topic, or date require markup, but nesting is imperative. Thus it's not acceptable to do
<settlement>Chicago</settlement>
; rather, you have to use
<placeName><settlement>Chicago</settlement></placeName>

And that's not even adding the type attribute values, which are required in the majority of instances of many of the elements. I guess what I am trying to say is that there has to be a better way. I understand that XML is hierarchical and therefore any scheme built on this model must use this basic concept. However, must it be so exacting? How can we expect every library to use this to mark up its texts? I don't even want to think about what it would take to mark up the entire Indiana library collection with this scheme.
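
To make the nesting rule concrete, here is a minimal Python sketch that wraps a settlement name in the required placeName element and optionally adds a type attribute. The function name `mark_up_settlement` and the attribute value "city" are assumptions for illustration only; they are not taken from the Indiana Authors encoding guidelines.

```python
import xml.etree.ElementTree as ET


def mark_up_settlement(name, settlement_type=None):
    """Return <placeName><settlement>name</settlement></placeName> as a string.

    settlement_type, if given, becomes the type attribute on <settlement>.
    The value used below ("city") is a hypothetical example.
    """
    place = ET.Element("placeName")
    attrs = {"type": settlement_type} if settlement_type else {}
    settlement = ET.SubElement(place, "settlement", attrs)
    settlement.text = name
    return ET.tostring(place, encoding="unicode")


print(mark_up_settlement("Chicago"))
print(mark_up_settlement("Chicago", "city"))
```

Even a small helper like this only saves typing; someone still has to decide, reference by reference, which element and which type apply.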

I propose an RDF-based scheme to serve this purpose. That way, the hierarchy is assumed and does not need to be explicitly stated in each instance in the form of tags. Also, using RDF instead of SGML/XML would make it much easier to add localized classes, properties, elements, and attributes, rather than relying on TEI to define an enormous and complex set of elements that the majority of collections being marked up might not even call for.
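
A rough sketch of the idea, using plain Python tuples as subject/predicate/object triples rather than a real RDF library. The `http://example.org/authors/` namespace and the class names `Settlement` and `Place` are hypothetical; only the `rdf:type` and `rdfs:subClassOf` URIs are standard. The point is that the hierarchy is stated once in the schema, and each instance needs a single type statement instead of nested tags.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
EX = "http://example.org/authors/"  # hypothetical local namespace

# Stated once for the whole collection: every Settlement is a Place.
schema = [(EX + "Settlement", RDFS_SUBCLASS_OF, EX + "Place")]

# Stated per reference: one triple, no nesting.
data = [(EX + "Chicago", RDF_TYPE, EX + "Settlement")]


def types_of(subject, triples):
    """Return the subject's direct types plus every superclass they imply."""
    found = {o for s, p, o in triples if s == subject and p == RDF_TYPE}
    frontier = list(found)
    while frontier:
        cls = frontier.pop()
        for s, p, o in triples:
            if s == cls and p == RDFS_SUBCLASS_OF and o not in found:
                found.add(o)
                frontier.append(o)
    return found


print(sorted(types_of(EX + "Chicago", schema + data)))
```

A local collection could add its own classes to the schema list in the same way, without waiting for them to appear in a shared element set.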

After the morning of coding, I took a break to do some reading on digital library workflows in anticipation of a meeting at 2pm with my supervisor and a programmer who's been working on the EVIADA project, a digital video repository. The meeting went well, and it looks like I'll be working on XSLT as my next project as soon as I formally renounce/disavow TEI. Kidding, of course. I look forward to that since, as my supervisor keeps reminding me, XSLT is a complicated and difficult but crucial part of the job. Nothing that's important is ever easy, though, so I anticipate facing off against and triumphing over this next challenge.
