December 29, 2023
IDN-tifying trends: insights from the set of non-Latin domain names

Internationalized domain names (IDNs) are domain names featuring characters beyond the basic (unaccented) Latin alphabet. They include names featuring accented characters (such as münchen.de) and names written entirely in other character sets (such as яндекс.рф – Yandex Russia). This infrastructure allows brand owners to create domain names in local languages and target content to specific markets, but it also gives bad actors scope to create names which are deceptively similar to the official domain names of trusted brands, e.g. by substituting a character with a visually similar non-Latin equivalent (a so-called ‘homoglyph’).
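As a rough illustration of what sits behind an IDN, the sketch below converts a name to the ASCII ('xn--') form that is stored in the DNS and lists any non-ASCII characters in a label together with their Unicode names. The helper names `to_ascii` and `non_ascii_chars` are illustrative only and are not taken from the study. Python's built-in `idna` codec follows the older IDNA 2003 rules, so this should be read as a minimal sketch rather than a production-grade converter.

```python
import unicodedata


def to_ascii(domain: str) -> str:
    """Return the ASCII ('xn--') form of a domain via Python's built-in IDNA codec."""
    return domain.encode("idna").decode("ascii")


def non_ascii_chars(label: str) -> list[tuple[str, str]]:
    """List each non-ASCII character in a label with its Unicode character name."""
    return [(ch, unicodedata.name(ch, "UNKNOWN")) for ch in label if ord(ch) > 127]


# "m\u00fcnchen.de" is münchen.de, written with an escape for clarity.
print(to_ascii("m\u00fcnchen.de"))      # xn--mnchen-3ya.de

# "goog\u0142e" contains an l with a stroke in place of the plain Latin l.
print(non_ascii_chars("goog\u0142e"))   # [('ł', 'LATIN SMALL LETTER L WITH STROKE')]
```

Listing character names in this way makes a substituted character visible even when the rendered domain name looks identical to the official one.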

In this study, we consider the full set of registered IDNs across all gTLDs (generic top-level domains, or domain extensions) for which zone files are available, covering around 1,000 different extensions, to identify trends and patterns, and indicators of potential abuse.

Overall, there are around 1.3 million gTLD IDNs currently registered, across 470 distinct domain extensions, with the most popular being .com (853k IDNs), .net (136k), and .在线 (Chinese for ‘online’) (28k).

A total of 267 distinct domain names were found to be homoglyph variations of the names of the top ten most valuable global brands in 2023, and not to be under the control of the brand owner. A significant proportion of these show indicators of having been registered for infringing use: 79 (30%) have active MX (mail exchange) records, indicating that they have been configured to send and receive e-mails and could therefore be associated with phishing activity, and 128 (48%) have privacy-protected whois records. Various examples were found to be explicitly hosting fraudulent or infringing content, including lookalike sites (e.g. ǥoogłe.com, googļe.com, googłe.online (re-directs to googłe.co) and ɠoogle.com (re-directs to gooqle.cm)) and instances of misdirection and brand confusion (e.g. gooqłe.com, gooġlɵ.com, visã.com and ɢoogle.net).
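The proportions quoted above follow directly from the counts. A minimal check, using only the figures given in the text (the constant and function names are illustrative):

```python
# Counts taken from the text above.
TOTAL_HOMOGLYPH_DOMAINS = 267
WITH_ACTIVE_MX = 79
WITH_PRIVACY_PROTECTED_WHOIS = 128


def share(count: int, total: int) -> int:
    """Return count as a percentage of total, rounded to the nearest whole number."""
    return round(100 * count / total)


print(share(WITH_ACTIVE_MX, TOTAL_HOMOGLYPH_DOMAINS))                # 30
print(share(WITH_PRIVACY_PROTECTED_WHOIS, TOTAL_HOMOGLYPH_DOMAINS))  # 48
```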

Across these non-official homoglyph domains, the average number of replaced characters in the second-level domain (SLD) name (the part of the domain name to the left of the dot) is 1.62. This highlights the need for detection technologies that analyse strings in full in order to detect visual similarity, rather than just identifying instances which differ from the official string by (say) a single character. Ten examples were identified of domains in which more than half of the characters have been replaced with non-Latin homoglyphs, including (all with 100% non-Latin characters): ᴀᴘᴘʟᴇ.com, арріе.com, арріе.net, арріө.com, аррӏе.com, ᴍɪᴄʀᴏꜱᴏꜰᴛ.com, ᴍᴄᴅᴏɴᴀʟᴅꜱ.com and ᴀᴍᴀᴢᴏɴ.com. Three of these domains have active MX records.
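The point about analysing strings in full can be sketched as follows. Each character of a candidate SLD is mapped to the plain Latin letter it resembles, and the result is compared with the brand string as a whole, so a name is matched however many of its characters have been swapped. The `HOMOGLYPHS` table is a tiny hand-picked sample assumed for illustration; it is not the mapping used in the study, and a real system would need a far larger confusables table. The function names are likewise hypothetical.

```python
# Hypothetical sample mapping from lookalike characters to plain Latin letters.
HOMOGLYPHS = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u04cf": "l",  # CYRILLIC SMALL LETTER PALOCHKA
    "\u0142": "l",  # LATIN SMALL LETTER L WITH STROKE
    "\u013c": "l",  # LATIN SMALL LETTER L WITH CEDILLA
}


def skeleton(label: str) -> str:
    """Replace every known lookalike character with its plain Latin counterpart."""
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in label)


def replaced_count(candidate: str, brand: str):
    """Return how many characters were swapped, or None if the candidate
    does not reduce to the brand string under the sample mapping."""
    if skeleton(candidate) != brand:
        return None
    # The mapping is one-to-one per character, so both strings have equal length here.
    return sum(1 for c, b in zip(candidate, brand) if c != b)


one_swap = replaced_count("goog\u0142e", "google")                     # 1 character replaced
all_swapped = replaced_count("\u0430\u0440\u0440\u04cf\u0435", "apple")  # all 5 replaced
average = (one_swap + all_swapped) / 2
print(one_swap, all_swapped, average)
```

A check that only looks for names differing from the brand by a single character would catch the first candidate but miss the second, even though both reduce to the brand string under the mapping.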

The following additional points warrant specific consideration by brand owners:

  • The number of homoglyph domains targeting trusted brands – and the significant proportion of these found to be actively infringing or to feature indicators of suspicious intentions – highlights the need for brand owners to monitor activity in this space, track examples of concern for content changes, and launch enforcement actions where appropriate.
  • Many top brands include potentially deceptive IDNs in their defensive domain portfolios; however, this approach in isolation is likely to be of limited effectiveness because of the effectively unlimited number of variations available to would-be infringers. Where domains are held for defensive reasons, it may be advisable to configure them to re-direct to the official brand website, to maximise traffic and minimise the risk of customer confusion.

Similar trends in potentially fraudulent domain registration activity have also been observed in the landscape of Web3 blockchain domains, which also allow for a wide range of non-Latin characters[1]. This arena is also worthy of careful consideration by brand owners, who may wish to explore brand protection strategies across these emerging technologies. This approach may be particularly valid as the availability of desirable domain names begins to run low across traditionally popular areas of the domain landscape, such as .com[2].

 The full version of the study can be downloaded here.

[1] https://www.iamstobbs.com/trends-in-web3-ebook

[2] https://www.iamstobbs.com/availability-of-domains-ebook

