ELAG 2001 - Integrating Heterogeneous Resources - Prague, 6-8 June 2001

WORKSHOP #8

XML AND RELATED TECHNOLOGIES

Well-defined formats have been important to libraries for ages. When new technologies emerge, it is important to investigate whether they are useful for library applications as well.

XML represents a family of related technologies that have become very popular and widely used. XML is a very flexible language that is useful for representing almost everything that needs a formal definition. Since the structure of XML is the same across heterogeneous applications, XML can be used as a standard language for defining formats for messages, records and files. One good example is the use of XML as a syntax (format) for MARC records.

The XML family already has many members, and it is still growing. Some of the more important members are XML Schema, namespaces, XSL and XSLT, XPath, XLink, XPointer and XML Query.

In this workshop we will take a closer look at those technologies and discuss whether they can be beneficial for library applications. The participants are invited to share their experiences, plans and thoughts regarding XML. We expect to have some useful discussions about the pros and cons of using XML technologies. Perhaps we can also end up with some guidelines on applying XML in libraries.

XML is the basic syntax that can be used to express almost everything in a simple yet formal way. The structure and rules for an XML document can be expressed in a DTD (Document Type Definition) or in an XML Schema. The DTD is inherited from the older SGML standard (Standard Generalized Markup Language, ISO 8879:1986(E)). XML Schema is a new proposal that has more power and richness for expressing the structure of, and rules for, the content of a document.

Use of namespaces can make the markup in an XML document globally unambiguous. This makes it easy to share and exchange documents and data without fear of name clashes.
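As a small illustration of the point about namespaces, the sketch below parses a document in which two vocabularies both use an element called "title". The namespace URIs and the element names are hypothetical and chosen only for this example; the parsing uses Python's standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

# Two vocabularies both define a "title" element. The namespace URIs
# are hypothetical and only serve to tell the two vocabularies apart.
DOC = """\
<record xmlns:bib="http://example.org/ns/bibliographic"
        xmlns:per="http://example.org/ns/person">
  <bib:title>XML and Related Technologies</bib:title>
  <per:title>Professor</per:title>
</record>
"""

root = ET.fromstring(DOC)

# ElementTree expands each prefixed name to "{namespace-uri}local-name",
# so the two "title" elements end up with distinct names.
titles = {child.tag: child.text for child in root}

book_title = titles["{http://example.org/ns/bibliographic}title"]
honorific = titles["{http://example.org/ns/person}title"]

print(book_title)
print(honorific)
```

Without the namespaces, a receiving application would have no formal way to tell a document title from a personal title.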

XML is not meant to be displayed directly. For display and presentation another language has been defined: XSL. The same XML document can be the origin of several different presentations through the use of different style sheets. One can use different formatting for low and high resolution screens, for print on paper and for export to other systems. One part of XSL is XSLT, which can be used for transforming a document into other formats. A MARC record in XML can be transformed into an ISO 2709 record, and a bibliography can be formatted as a PDF document. All this can be achieved from a basic record structure in XML and transformation using XSLT.

XPath is a specification of how to address parts of an XML document. It is a building block used by several of the other XML recommendations, such as XSL, XML Schema, XPointer and XML Query.
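To give an impression of what addressing parts of a document looks like, here is a minimal sketch using the limited XPath subset supported by Python's standard xml.etree.ElementTree module. The record layout (record, datafield, subfield) is only loosely modelled on MARC and is an assumption made for this example, as are the sample values.

```python
import xml.etree.ElementTree as ET

# A hypothetical MARC-like record; element names and values are
# invented for illustration.
RECORD = """\
<record>
  <datafield tag="100">
    <subfield code="a">Author, An</subfield>
  </datafield>
  <datafield tag="245">
    <subfield code="a">An example title</subfield>
    <subfield code="b">a subtitle</subfield>
  </datafield>
</record>
"""

root = ET.fromstring(RECORD)

# Address one specific part: subfield "a" of the field tagged 245.
title = root.find("./datafield[@tag='245']/subfield[@code='a']").text

# Address every subfield anywhere below the root, in document order.
codes = [sf.get("code") for sf in root.findall(".//subfield")]

print(title)
print(codes)
```

The path expressions select by element name and attribute value, so the calling code does not need to walk the tree by hand.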

We are all familiar with the simple links available in HTML. With XML it is possible to use more advanced connections both between and inside documents. XLink and XPointer make it possible to have two-way links and links that point to more than one target. A link may also address only a part of a document, so a user does not need to fetch a large document when only a part of it is of interest. This might be a useful feature when accessing full text documents.

There have been several approaches to defining a query language for XML. W3C has a working group that is preparing a specification for a new query language. The work is at an early stage, but the results so far seem promising.

XML can build the basis for formats on higher levels. RDF (Resource Description Framework) is useful for expressing relations between objects. RDF may express those relations in a syntax based on XML.
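The sketch below shows how relations expressed in RDF's XML syntax can be read back as (subject, property, object) statements. It handles only the very small subset of RDF/XML used in the sample and is not a general RDF parser. The resource URIs under example.org are hypothetical; the RDF and Dublin Core namespace URIs are the published ones.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Sample description; the example.org resource URIs are invented.
DOC = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://example.org/doc/workshop8">
    <dc:title>XML and Related Technologies</dc:title>
    <dc:relation rdf:resource="http://example.org/doc/elag2001"/>
  </rdf:Description>
</rdf:RDF>
"""


def triples(rdf_xml):
    """Return (subject, property, object) tuples for rdf:Description blocks."""
    root = ET.fromstring(rdf_xml)
    result = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # The object is either another resource or a literal text value.
            obj = prop.get(f"{{{RDF_NS}}}resource") or (prop.text or "").strip()
            result.append((subject, prop.tag, obj))
    return result


statements = triples(DOC)
for statement in statements:
    print(statement)
```

Each statement relates one object to a value or to another object, which is the kind of relation the paragraph above refers to.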

FRBR (Functional Requirements for Bibliographic Records) has two objectives: first, to define a framework that relates the data recorded in bibliographic records to the needs of the users of those records, and second, to recommend a basic level of functionality for records created by national bibliographic agencies. The data models that emerge from those objectives may very well be expressed in RDF, with XML as their basic syntax.

There are also some common misapprehensions about XML and its use in libraries.
One of them is that XML will replace MARC in bibliographic records. MARC is a format on another (higher) level than XML. MARC records can be expressed in XML syntax just as well as in tagged line formats or ISO 2709.
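The difference in level can be sketched in a few lines of Python: the same record content is written out once in an XML syntax and once in a simple tagged line syntax, and the XML form is read back without loss. The element names, the "$" subfield marker and the sample values are assumptions made for this illustration and do not follow any particular MARC XML format.

```python
import xml.etree.ElementTree as ET

# The record content itself: (tag, [(subfield code, value), ...]).
# Sample values are invented for illustration.
FIELDS = [
    ("100", [("a", "Author, An")]),
    ("245", [("a", "An example title"), ("b", "a subtitle")]),
]


def to_xml(fields):
    """Serialize the record using a hypothetical XML syntax."""
    record = ET.Element("record")
    for tag, subfields in fields:
        datafield = ET.SubElement(record, "datafield", {"tag": tag})
        for code, value in subfields:
            ET.SubElement(datafield, "subfield", {"code": code}).text = value
    return ET.tostring(record, encoding="unicode")


def to_tagged_lines(fields):
    """Serialize the same record as one tagged line per field."""
    lines = []
    for tag, subfields in fields:
        body = " ".join(f"${code} {value}" for code, value in subfields)
        lines.append(f"{tag} {body}")
    return "\n".join(lines)


def from_xml(xml_text):
    """Read the XML syntax back into the field structure."""
    record = ET.fromstring(xml_text)
    return [
        (df.get("tag"), [(sf.get("code"), sf.text) for sf in df])
        for df in record
    ]


print(to_xml(FIELDS))
print(to_tagged_lines(FIELDS))
```

The tags and subfield codes carry the MARC-level meaning in both outputs; only the surrounding syntax changes.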

The areas for applying XML in library applications are many, and the usefulness of XML is great since library data is heavily based on formal structure. In a library system almost all modules might benefit from XML structured data. This is especially true in document ordering, cataloguing, circulation and interlibrary loan.

During the days of ELAG 2001 we will share our experiences with bibliographic data and see if XML can give them a better structure. Your knowledge, ideas, thoughts, experiences and visions are the most important resources for this workshop. We will discuss both the strong and weak sides of XML in relation to our common knowledge. Hopefully we can together sort out where XML is most beneficial today and where it needs to be developed further.

Some useful links:

http://sunsite.berkeley.edu/XML4Lib/   XML for libraries discussion list.

http://www.w3c.org/   World Wide Web Consortium (W3C), from which most of the recommendations originate.
http://www.softwareag.com/xml/   XML information from Software AG.
http://www.xml.org/   XML resources from the OASIS consortium.
http://www.xml.com/   XML resources from O'Reilly & Associates.
http://www.isgmlug.org/   International SGML/XML Users' Group.

Prepared by Jan Erik Kofoed Jan.Kofoed@bibsys.no,
BIBSYS Library Automation, Trondheim,
27 April 2001.

