[LINK] RFI: Dublin Core 10 Years On

Glen Turner glen.turner at aarnet.edu.au
Wed Jan 17 15:33:39 AEDT 2007

Roger Clarke wrote:

> My impression is that a lot of developments have occurred alongside 
> rather than as part of the Dublin Core movement, e.g. DOI, DRM.

On the other hand a lot of file formats which need simple metadata
use Dublin Core to do the job, especially in XML file formats.
Two examples are Open Document Format and Scalable Vector Graphics.

> Are there convenient mechanisms to support authors to quickly generate
> metadata for their documents just before they release them?

Users who fill in OpenOffice.org's (ODF) Document Properties,
Inkscape's (SVG) Image Properties or Acrobat's (PDF) Properties
are creating Dublin Core records (in an Extensible Metadata
Platform wrapper); they just don't know it.  OOo automatically
creates a Dublin Core record containing the file name, date, user's
name and organisation, and can be configured to prompt for
the other metadata when saving for the first time.
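
To give a feel for how small such a record is, here is a minimal
sketch in Python that writes a few Dublin Core elements as XML.
The namespace URI is the standard Dublin Core element set; the
"metadata" wrapper element, the function name and the sample
values are made up for illustration, and this is not the exact
layout that ODF, SVG or XMP use.

```python
import xml.etree.ElementTree as ET

# Standard namespace for the Dublin Core element set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_dc_record(fields):
    """Return an XML string with one dc:* element per entry in fields."""
    ET.register_namespace("dc", DC_NS)
    # "metadata" is a hypothetical wrapper; real formats define their own.
    root = ET.Element("metadata")
    for name, value in fields.items():
        child = ET.SubElement(root, "{%s}%s" % (DC_NS, name))
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Sample values only; an application would take these from the
# user's profile and the document properties dialog.
record = build_dc_record({
    "title": "Quarterly report",
    "creator": "A. N. Author",
    "date": "2007-01-17",
})
print(record)
```

The point is that the application can fill in creator and date
itself and only needs to ask the user for the rest.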

In that sense Dublin Core has been a success. It prevented
a multiplicity of formats for representing basic metadata.
Certainly new XML formats that use something other than XMP
with DC for metadata about the content have the cluestick
applied pretty quickly.

The definition of Dublin Core has been tightened up with
more fields having recommendations to use a controlled
vocabulary. That should help user interfaces considerably
as users are much more likely to use a sticky dropdown
than to enter data into an empty field.

You might want to compare Dublin Core with the metadata
in MP3 files. ID3 tags were invented by amateurs and
it shows. For example the controlled vocabulary for
Genre is next to useless -- it differentiates Ballad
and Power Ballad but there are no entries for Spoken Word
let alone differentiating Spoken Word from Book Reading [1].
And the ID3 Year applies to the year of release of the
CD, although in practice everyone uses it as if it
were the year of recording of the track.

Then there was the question of whether the ID3 tag
goes at the start or the end of the file and
whether it is padded.  In practice you write the data
in both formats, as some players use one format
and other players use the other.
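
As a rough illustration of the two placements, the Python sketch
below checks a file's bytes for both. It assumes the two formats
in question are ID3v1 (a fixed 128-byte block at the end of the
file, starting with "TAG") and ID3v2 (a header at the start of
the file, starting with "ID3"). The function name is made up,
and the check only looks at the marker bytes; it does not
validate the rest of the tag or handle padding.

```python
def find_id3_tags(data: bytes) -> dict:
    """Report which ID3 markers are present in the raw bytes of a file."""
    # ID3v2: a 10-byte header at the very start, beginning with "ID3".
    has_v2 = len(data) >= 10 and data[:3] == b"ID3"
    # ID3v1: the last 128 bytes of the file, beginning with "TAG".
    has_v1 = len(data) >= 128 and data[-128:-125] == b"TAG"
    return {"id3v2_at_start": has_v2, "id3v1_at_end": has_v1}


# Synthetic example: a fake file carrying both tags, as a tagger
# that writes "both formats" would produce.
fake_v2_header = b"ID3" + bytes([3, 0, 0]) + bytes(4)
fake_audio = bytes([0xFF]) * 200
fake_v1_tag = b"TAG" + bytes(125)
both = find_id3_tags(fake_v2_header + fake_audio + fake_v1_tag)
only_audio = find_id3_tags(fake_audio)
```

A player that only knows one placement would see only half of
that result, which is why taggers end up writing both.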

In short, you can see what happens without a
well-considered document like Dublin Core, even though
Dublin Core appears almost trivial.

> Has auto-generation of metadata arrived?

That's the worst of all worlds. Automatically generated
metadata that attempts to analyse content will never
be perfect, so you are capturing the progress of the
technology at the time the record is created. Better
to use full text and use today's technology to analyse
the content.

Cheers, Glen

  [1] ID3 Genre tags:

