[LINK] Curious tiny URL variant using Unicode

Ben McGinnes ben at adversary.org
Sun Nov 13 11:43:15 AEDT 2011


On 13/11/11 4:06 AM, Martin Barry wrote:
>>
>>> ж.tv is displayed as xn--f1a.tv
>>> ᄒ.ws is displayed as xn--hqd.ws
> 
> This is punycode encoding. http://en.wikipedia.org/wiki/Punycode

Yes, it's been around for a long time.
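
For anyone who wants to reproduce the two names quoted above, here is
a minimal sketch using the "punycode" codec from Python's standard
library.  The helper names (to_ace, domain_to_ace) are my own, and the
sketch skips the normalisation steps a full IDNA implementation
performs, so treat it as an illustration rather than a resolver-grade
conversion.

```python
ACE_PREFIX = "xn--"  # marks a label as Punycode-encoded


def to_ace(label: str) -> str:
    """Return the ASCII form of one DNS label (hypothetical helper).

    Assumption: no IDNA normalisation or validity checks are applied.
    """
    if label.isascii():
        return label  # plain ASCII labels are left untouched
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def domain_to_ace(domain: str) -> str:
    """Encode each dot-separated label of a domain independently."""
    return ".".join(to_ace(part) for part in domain.split("."))


print(domain_to_ace("ж.tv"))   # xn--f1a.tv
print(domain_to_ace("ᄒ.ws"))   # xn--hqd.ws
```

Each label is encoded on its own, which is why the TLD part stays as
plain "tv" or "ws".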

>> An attacker could issue the domain "micr*s*ft.com" where * is one
>> of these other "o" characters, and it would appear to you as
>> "microsoft.com". If your email client or web browser then shows the
>> domain in its ASCII format, it would not look at all like what you
>> expect and a possible exploit is averted.
> 
> The issue of homoglyphs in IDN is a well-discussed problem but I've
> lost track of where they got to. One of the options was to ban
> particular Unicode characters, but then you reduce the usability of
> IDN. Another option was to ban the mixing of character sets within
> Unicode (i.e. something that was mostly ASCII should theoretically
> be all ASCII) although this has so many corner cases it's probably
> impractical.

Rather than risk breaking Punycode by adding complexity, it was
decided to address these issues in TLD policy.  In general, various
ccTLDs restrict domain licenses to ASCII plus the particular part of
Unicode that covers their native language and symbols.  This is
why you can get Chinese characters in .cn domains and kanji in .jp
domains, but you can't get Chinese characters in .jp or kanji in .cn.

The microsoft.com example wouldn't work because the .com policy
prevents using Unicode to do exactly what was described.
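
To give a feel for the kind of check such a policy implies, here is a
rough sketch that flags labels mixing more than one script.  It is not
how any registry actually implements its rules: the function names are
hypothetical, and the script is only approximated from the first word
of each character's Unicode name, because Python's standard library
does not expose the Unicode Script property.

```python
import unicodedata


def scripts_in(label: str) -> set:
    """Approximate the scripts used in a label (hypothetical helper).

    Assumption: the first word of a character's Unicode name
    (e.g. 'LATIN', 'CYRILLIC') is used as a stand-in for its script.
    """
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in label
        if ch.isalpha()  # ignore digits and hyphens
    }


def is_mixed_script(label: str) -> bool:
    """True if the letters of the label come from more than one script."""
    return len(scripts_in(label)) > 1


# Both "o" characters below are CYRILLIC SMALL LETTER O (U+043E).
spoof = "micr\u043es\u043eft"

print(is_mixed_script("microsoft"))  # False
print(is_mixed_script(spoof))        # True
print(sorted(scripts_in(spoof)))     # ['CYRILLIC', 'LATIN']
```

A real policy would need to handle the corner cases Martin mentions,
such as languages that legitimately combine scripts.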

The view of the IETF was that these sorts of things were the result of
legal and economic influences, so the best way to address them is with
legal and economic measures, in the form of domain name policy,
rather than through technical design.
Those ccTLDs which do not have such restrictive policies will become
known as less trustworthy, if they are not already.  All the gTLDs
have fairly robust policies already.

Obviously the URL shortener is using ccTLDs that may not have the most
stringent policies.  The two ccTLDs were .tv and .ws, which belong to
Tuvalu and Samoa, respectively.  Both countries view their ccTLD as a
new resource which can be used to fund more programs in their
countries (e.g. roads, running water, etc.).


Regards,
Ben
