[LINK] Diacritics and Search Engines
Roger Clarke
Roger.Clarke at xamax.com.au
Wed Jan 16 12:36:16 AEDT 2008
It's embarrassing to have to admit it (because I've done some work in
this area), but I've just twigged to the obvious - diacritics such as
umlauts, acutes and cedillas are not handled well by search engines.
In the few languages that I'm familiar with, a letter with a
diacritic is appropriately treated as a variant of the base letter, e.g.
u-umlaut is still a u (although in some languages the unadorned
letter may not exist, or the two may be treated as different letters).
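
As a rough illustration of the 'variant of the letter' view, here is a
minimal Python sketch of diacritic folding: decompose each character
(Unicode NFD), drop the combining marks, and compare case-insensitively.
The function name fold_diacritics is hypothetical, and this is only an
assumption about how a matcher could behave, not a description of what
any particular search engine does.

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Strip combining marks (umlauts, acutes, cedillas) and case-fold.

    Hypothetical helper for illustration only.
    """
    # NFD splits e.g. U-umlaut into 'U' plus a combining diaeresis.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only characters that are not combining marks.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # casefold() gives a case-insensitive form suitable for comparison.
    return stripped.casefold()


title = "What 'Überveillance' Is, and What To Do About It"
query = "uberveillance"

# Fold both sides, so a query typed without the umlaut still matches.
print(fold_diacritics(query) in fold_diacritics(title))
```

Folding like this suits languages where the adorned letter is a variant
of the plain one. It would be the wrong choice where the two are
treated as different letters, and it does not cover conventions such as
writing u-umlaut as 'ue'.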
I tripped over the problem because people have reported that they're
unable to find my paper from last September:
What 'Überveillance' Is, and What To Do About It
[Heaven knows what your email-client did with the u-umlaut ...]
http://www.anu.edu.au/people/Roger.Clarke/DV/RNSA07.html
If linkers can point to sources that explain this to dubbos like me,
and what to do about it, I'd greatly appreciate the assistance.
--
Roger Clarke http://www.anu.edu.au/people/Roger.Clarke/
Xamax Consultancy Pty Ltd 78 Sidaway St, Chapman ACT 2611 AUSTRALIA
Tel: +61 2 6288 1472, and 6288 6916
mailto:Roger.Clarke at xamax.com.au http://www.xamax.com.au/
Visiting Professor in Info Science & Eng Australian National University
Visiting Professor in the eCommerce Program University of Hong Kong
Visiting Professor in the Cyberspace Law & Policy Centre Uni of NSW