[LINK] Curious tiny URL variant using Unicode

Martin Barry marty at supine.com
Sun Nov 13 04:06:49 AEDT 2011

> Paul Brooks wrote:
> > Yes - links displayed correctly in Thunderbird 8.0 email client, and
> > when clicked on were displayed correctly in Firefox 7.0.1 browser.

Same here on a text console in Mutt and opening in Firefox. :-)

> > Curiously, the URL shown at the bottom of the window when the mouse
> > hovers over it is an ASCII text translation:
> >
> > ж.tv is displayed as xn--f1a.tv
> > ᄒ.ws is displayed as xn--hqd.ws

This is Punycode encoding: http://en.wikipedia.org/wiki/Punycode
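
For anyone who wants to reproduce the two translations quoted above, here is a minimal sketch using the "punycode" codec from Python's standard library. The helper name to_ace is my own invention, and the sketch deliberately skips the normalisation and case-folding steps that a real IDNA implementation applies before encoding, so treat it as an illustration rather than what a browser actually does.

```python
def to_ace(domain: str) -> str:
    """Convert each non-ASCII label of a domain to its "xn--" ASCII form.

    Hypothetical helper for illustration only: it applies raw Punycode per
    label and does NOT perform the IDNA preparation (normalisation, case
    mapping, validity checks) that real resolvers and browsers perform.
    """
    labels = []
    for label in domain.split("."):
        if label.isascii():
            # Plain ASCII labels are passed through untouched.
            labels.append(label)
        else:
            # Punycode-encode the label and add the ACE prefix "xn--".
            labels.append("xn--" + label.encode("punycode").decode("ascii"))
    return ".".join(labels)


if __name__ == "__main__":
    for name in ("ж.tv", "ᄒ.ws", "example.com"):
        print(name, "->", to_ace(name))
```

Going the other way, bytes.decode("punycode") on the part after the "xn--" prefix recovers the original label.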

$quoted_author = "Rick Welykochy" ;
> I am sure Linkers are aware that there be nasties lurking in Unicode
> domain names. For example, there are several (perhaps many)
> characters in Unicode that look pretty well identical to the lower case
> letter "o".
> An attacker could issue the domain "micr*s*ft.com" where * is one of these
> other "o" characters, and it would appear to you as "microsoft.com". If
> your email client or web browser then shows the domain in its ASCII
> format, it would not look at all like what you expect and a possible
> exploit is averted.

The issue of homoglyphs in IDN is a well-discussed problem, but I've lost
track of where they got to. One option was to ban particular Unicode
characters, but that reduces the usability of IDN. Another option was to
ban the mixing of character sets (scripts) within a single name (i.e.
something that is mostly ASCII should theoretically be all ASCII), although
this has so many corner cases it's probably impractical.
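
To make the "no mixing" option concrete, here is a rough sketch of such a check. Python's standard library does not expose the Unicode Script property, so this uses the first word of each letter's Unicode character name (e.g. "LATIN", "CYRILLIC") as a crude stand-in. The function names are made up for this example and it is an assumption-laden heuristic, not how any registry or browser actually decides.

```python
import unicodedata


def scripts_in_label(label: str) -> set:
    """Return a rough set of scripts used by the letters in a label.

    Heuristic only: the first word of the Unicode character name stands in
    for the real Script property, which the stdlib does not provide.
    Digits, hyphens and other non-letters are ignored.
    """
    scripts = set()
    for ch in label:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def is_mixed_script(label: str) -> bool:
    """True if the label's letters appear to come from more than one script."""
    return len(scripts_in_label(label)) > 1


if __name__ == "__main__":
    spoof = "micr\u043es\u043eft"  # both "o" letters are CYRILLIC SMALL LETTER O
    for label in ("microsoft", spoof, "日本の"):
        print(label, sorted(scripts_in_label(label)), is_mixed_script(label))
```

The spoofed label is flagged, but so is the perfectly ordinary Japanese label that mixes kanji and hiragana, which is exactly the kind of corner case that makes a blanket rule hard to apply.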

