The future of International SEO

The future of Search Engine Optimization (SEO) for International Business

Whitepaper

The World Wide Web now allows special characters in URLs, which gives crawlers a stronger signal for localizing queries, right down to regional dialects. This is a sizable change to the Web's language landscape, given that its coding has historically been based on English Latin characters.

A recent update to the World Wide Web changes how we can display and read certain languages that were previously not accessible to us in URLs. The Internet was founded in English (modern Latin), a language lacking diacritical marks (special characters). The update makes it possible to display special characters throughout the whole URL. Many languages, including French, German, Spanish, Arabic, Russian, Chinese, Greek and Hebrew, use special characters in their written form. Being able to display these languages in URLs means that people in certain countries and regions can now view their language as intended.

The major search engines rank results first by using the same language that the query used. The domain name or Uniform Resource Locator (URL) is one of the elements search engines take into consideration when listing and locating local resources on the Internet. With the new update, search engines will find it easier to home in on country and regional dialects and serve more locally relevant content. This is great news for international and local marketers alike.

What is this Whitepaper Addressing?


This whitepaper discusses the merits of special character language acceptance on the Internet and defines the new technical acceptance of language characters in light of the 2010 Internet Engineering Task Force (IETF) update and the ICANN update that followed in line with it in 2011. It also highlights the benefits of a local SEO strategy for international and local businesses and for digital marketers, and it offers recommended tactics for localizing online assets to target local search queries.

What is Hyper-local SEO?


Hyper-local SEO means optimizing online location assets to bring local Internet searchers to a business.

Markets are continuously expanding their reach while holding on to their core colloquialisms. People like to engage in their local language, and for day-to-day spending consumers will continue to go to their local stores. Hyper-local SEO develops and targets local Web pages (landing pages) so that they correlate with local queries and searches.

What is ICANN?
Internet Corporation for Assigned Names and Numbers (ICANN) is an internationally organized, public benefit non-profit organization responsible for coordinating Internet Protocol (IP) address space allocation, protocol identifier assignment, generic (gTLD) and country code (ccTLD) Top-Level Domain name system management, and root server system management functions.

URLs and International Scripts


The Internet was founded in English (modern Latin). Data formats and keyboard arrangements were therefore developed with a bias for English, a language lacking diacritical marks.

Internationalized Domain Names (IDNs) are a technical solution that translates names written in language-native scripts into an American Standard Code for Information Interchange (ASCII) text representation that is compatible with the Domain Name System (DNS). IDNs have been available to register since 2000 in a basic format that only partially included special character coding in the URL. The Chinese Domain Name Consortium (CDNC) led the way in 1999 in creating IDN character coding acceptance for double-byte characters, approved by ICANN.

The IETF updated Unicode support in the Domain Name System in 2010 to accept special characters at the network level. On June 20, 2011, ICANN said that new generic top-level domains (gTLDs) in any words, in any language, and with special characters based on IETF standards would be allowed so that corporations could promote their brands. This does more than just truncate characters and redirect them: it allows country-specific character coding in URLs from inception at the network level.

ICANN's recent announcement that new gTLDs will accept [dot] [extensions] in any language has had strong media coverage, with some companies, such as Google, applying for as many as 101 [dot] extensions and enterprise businesses waiting in line for ICANN's next auction.
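The translation of a native-script name into its DNS-compatible ASCII form can be sketched in a few lines. This is a minimal illustration, assuming Python's built-in `idna` codec (which follows the older IDNA 2003 rules rather than the 2010 revision); the helper names `to_ascii` and `to_unicode` and the sample hostnames are hypothetical.

```python
# Sketch: converting an internationalized domain name to and from its
# ASCII-compatible ("xn--") form using Python's built-in "idna" codec.
# Assumption: the codec implements IDNA 2003, not the 2010 IETF update.


def to_ascii(hostname: str) -> str:
    """Return the ASCII-compatible form of a Unicode hostname."""
    return hostname.encode("idna").decode("ascii")


def to_unicode(hostname: str) -> str:
    """Return the native-script form of an ASCII-encoded hostname."""
    return hostname.encode("ascii").decode("idna")


if __name__ == "__main__":
    # Hypothetical sample names, used only for illustration.
    for name in ["bücher.example", "пример.рф"]:
        ascii_name = to_ascii(name)
        print(name, "->", ascii_name, "->", to_unicode(ascii_name))
```

Each label that contains non-ASCII characters is converted separately and given an `xn--` prefix, while plain ASCII labels pass through unchanged.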

Source: www.webpronews.com/what-does-googles-gtld-applications-say-about-the-company-2012-06

What has not had as much publicity is the new character encoding acceptance. URLs have been updated to accommodate special characters that were not readily available in the original Latin-based URL coding. DNS is now compliant with the Universal Character Set Transformation Format, 8-bit (UTF-8) at the network level, thanks to an IDN update by the IETF with which ICANN complied when it released the new generic top-level domains (gTLDs).

This is great news for consumers in many countries. People can now search and type Uniform Resource Locators (URLs) in their local language with their own diacritics and special characters, such as the circumflex [^], diaeresis [¨], tilde [~], and cedilla [¸]. In layman's terms, special characters are now automatically accepted and displayable in URLs all over the world. This helps open the doors for hyper-local digital marketing in targeted emerging regions.
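To make the role of UTF-8 in the path portion of a URL concrete, here is a minimal sketch using Python's standard `urllib.parse` module. It shows the percent-encoded form that a path with special characters takes when it is written as plain ASCII; the helper names `encode_path` and `decode_path` and the sample paths are hypothetical.

```python
from urllib.parse import quote, unquote


def encode_path(path: str) -> str:
    """Percent-encode a URL path as UTF-8 bytes, keeping '/' separators."""
    return quote(path, safe="/")


def decode_path(path: str) -> str:
    """Turn a percent-encoded path back into readable text."""
    return unquote(path)


if __name__ == "__main__":
    # Hypothetical paths containing a cedilla-free set of diacritics.
    for path in ["/café", "/niño/crème-brûlée"]:
        encoded = encode_path(path)
        print(path, "->", encoded, "->", decode_path(encoded))
```

The readable and the encoded forms identify the same resource, so a keyword with its diacritics intact can appear in the address a visitor sees.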

Good for International Business


Up until now, all international domain names have had Latin letters after the dot (.jp, .cn, .com, etc.). The announcement by ICANN permits the extension after the dot to be written in international scripts. For example, Chinese: .中国 (.cn); Japanese: . (.jp); Egyptian: .مصر; Russian: .рф (.ru). Any word or meaning can be represented in its original character set, in any language, within the whole URL.

The benefit for business is that brands that contain special characters can now maintain their full brand identity. For example, МегаФон, a well-known brand in Russia (Moscow), has for years been forced to water down its identity online, using MegaFon in URLs instead because of the Russian Cyrillic letter Ф [F]. Russian consumers can now search in their own language with their own language keyboards and receive corresponding language responses.

The Russian registry reported that it broke through the 200,000 mark for registrations of .рф within the first six hours after it opened its doors. Russia's Cyrillic internationalized domain name, .рф, received its 800,000th registration on April 5, 2011, putting it in 15th place among European ccTLDs.

73% of Online Activity is related to Local Content


Source: Google

82% of Local Searchers Follow Up Offline via an In-Store Visit, Phone Call or Purchase and 50% of local searches result in purchases
Source: TMP/comScore

57% of Internet Users who Shop Online, Purchase Offline


Source: NPP Group

Why is This Important to Online Marketers?


Local people are now able to use their local language instead of adapting to the existing Latin characters. Ranking in search engines may become easier for websites in countries and languages that have historically had to base their URL nomenclature on Latin characters.

Optimizing for countries with linguistic colloquialisms is important for serving local search queries in up-and-coming markets. Local consumers can better identify with an online retailer that offers local deals, which grabs their attention and drives them to engage. The market potential is sizable, particularly in emerging regions that are open to change, such as Africa, the Middle East, the Far East, Asia, Asia Pacific and Eastern Europe. English terms will only perform well where English is the official or dominant language. Of the world's Internet users, 64.8% do not use English as their primary language; 35.7% use European languages and 32.3% use Asian languages.
Source: http://glreach.com/globstats

Out of 2 billion Internet users, statistics show that 1 billion do not use a Latin-based language.
Source: www.internetworldstats.com/stats.htm

90% of Internet users prefer online content in their native language.


Source: http://ec.europa.eu/public_opinion/flash/fl_313_en.pdf

International Location Factors


There are hundreds of technical and semantic signals used by search engines to decipher, extract and feed the location of queries. However, these all sit beneath what we call the international SEO backbone factors. International coding standards and updates, including ICANN gTLDs, W3C UTF-8/16, IETF, IPv6 and IRI, sit above the search engines and to a large degree govern the internationalization of the World Wide Web. Historically, search engines could detect language differences but were unable to pinpoint dialects and regional location just by crawling the content. Lang attributes have been needed inside span and div tags to create greater relevancy within international search indexes.
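As a rough illustration of the lang attributes mentioned above, the following sketch reads them out of a small HTML fragment with Python's standard `html.parser` module. It is not a description of how any particular search engine works; the class `LangCollector`, the function `collect_langs`, and the sample markup are hypothetical.

```python
from html.parser import HTMLParser
from typing import List, Tuple


class LangCollector(HTMLParser):
    """Collect (tag, lang) pairs for every element that declares a lang attribute."""

    def __init__(self) -> None:
        super().__init__()
        self.found: List[Tuple[str, str]] = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names arrive lowercased.
        for name, value in attrs:
            if name == "lang" and value:
                self.found.append((tag, value))


def collect_langs(html: str) -> List[Tuple[str, str]]:
    """Return the lang declarations found in an HTML string, in document order."""
    parser = LangCollector()
    parser.feed(html)
    parser.close()
    return parser.found


# Hypothetical page fragment targeting Canadian French with an English aside.
SAMPLE = (
    '<html lang="fr-CA"><body>'
    '<div lang="fr-CA">Magasinez nos soldes</div>'
    '<span lang="en-CA">Shop our sales</span>'
    "</body></html>"
)

if __name__ == "__main__":
    print(collect_langs(SAMPLE))
```

The values follow the language-region pattern (for example `fr-CA`), which is what lets markup distinguish a regional variant from the language as a whole.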

With the updates to IDNs and the move towards localization, these rules are under review. Having acceptable character deciphering and encoding will allow search engines to understand dialects more readily and to pinpoint regions.

Deciphering and Targeting a Local Query

1.) The major search engines rank results first by using the same language that the query used, then further down in the results, translating the query into other languages. Correct identification of the language of the queries is of critical importance to search engines.
Source: www.aclweb.org/anthology/P09-1120 Language Identification of Search Engine Queries

2.) The domain name or Uniform Resource Locator (URL) is one of the elements search engines take into consideration when listing and locating local resources on the Internet, notably through the Country Code Top-Level Domain (ccTLD). The new ICANN and W3C standards advance the ability to control and target traffic as the Web expands beyond the realms of .com, .org, .net, and ccTLDs.

3.) Google and other major search engines use the geographical information of the user's server as a signal. This signal becomes even more important when a domain has a generic TLD, such as .com, .net, or .gov, especially if the geographic targeting features in search engine webmaster tools are not enabled correctly.

4.) Hundreds of factors: many rules in the search engine ranking algorithms involve the location and frequency of keywords in a Web page, hyperlink, URL, or meta tag, both internally in the code (density ratio) and externally (citations). If you're expecting that the typical visitor to your site coming from a search engine will be using accented letters when they search, then using the accents on your page should improve your rank in the results.
Source: webhostingtalk.com
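The point about accented letters in item 4 can be illustrated with a short sketch. To a program, an accented keyword and its unaccented spelling are different strings unless something deliberately folds the accents away. Whether and how a given search engine does this is not stated in the source, so the code below only shows the mechanics, using Python's standard `unicodedata` module; the helper names are hypothetical.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def same_ignoring_accents(a: str, b: str) -> bool:
    """Compare two strings while ignoring case and removable accents."""
    return strip_accents(a).casefold() == strip_accents(b).casefold()


if __name__ == "__main__":
    query, page_keyword = "crème brûlée", "creme brulee"
    print(query == page_keyword)                       # exact match
    print(same_ignoring_accents(query, page_keyword))  # accent-insensitive match
```

This folding only covers letters that decompose into a base letter plus a combining mark; characters such as "ø" or "ß" are left as they are, which is one reason matching the searcher's own spelling on the page is the safer choice.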

Actionable Insights and Recommendations


Some 73% of online activity is related to local content, 50% of local searches result in purchases, and 90% of Internet users prefer online content in their native language. These statistics are compelling for companies that have a local presence, and diacritics and special characters are core to many cultures' business communications. International marketers now have the ability to talk to the newest emerging markets in their own language.

International local digital marketing (local SEO) is not just a function of chasing search engine algorithms. Search engines are themselves chasing, and complying with, the Web's backbone standards. A few key takeaways to help with international and local targeting:

1. Language and its diacritics and characters are a collection of rules specific to a language and/or a geographic area. When running a local campaign, limit the code and semantics to one dialect, or prepare separate micro-sites, sub-domains, or applications to target each location.
2. Place text that has been translated into a database, a separate file, or a particular location within a file. This will enable and inform targeted indexing and thus the correct serving of your content.
3. Place keywords in the target language (with special characters) in your URLs.
4. Implement lang attributes within your coding and select your targeted country within webmaster tools.
5. Consider in-country hosting or local rerouting options.
6. Comply with the rules and algorithms of in-country search engines.
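Takeaway 2 can be sketched as a small lookup that keeps translated text apart from page templates. This is only one possible arrangement under stated assumptions: the `TRANSLATIONS` store, the `lookup` function, the locale tags and the sample strings are all hypothetical, and a real site would more likely load such data from a database or per-locale files.

```python
from typing import Dict, Optional

# Hypothetical translation store: one entry per locale, kept separate from templates.
TRANSLATIONS: Dict[str, Dict[str, str]] = {
    "en": {"store_locator": "Find a store near you"},
    "fr": {"store_locator": "Trouvez un magasin près de chez vous"},
    "fr-CA": {"store_locator": "Trouvez un magasin près de chez vous au Québec"},
}


def lookup(key: str, locale: str, default_locale: str = "en") -> Optional[str]:
    """Look up a string, falling back from region to language to the default locale."""
    candidates = [locale]
    if "-" in locale:
        candidates.append(locale.split("-", 1)[0])  # "fr-BE" -> "fr"
    candidates.append(default_locale)
    for candidate in candidates:
        text = TRANSLATIONS.get(candidate, {}).get(key)
        if text is not None:
            return text
    return None


if __name__ == "__main__":
    for loc in ["fr-CA", "fr-BE", "de-DE"]:
        print(loc, "->", lookup("store_locator", loc))
```

Keeping one entry per dialect makes it straightforward to serve a regional variant where it exists and a sensible fallback where it does not.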

About Rio SEO


Rio SEO provides best-of-breed technology solutions for earned and owned digital media programs, specifically for SEO (search engine optimization) and social media marketing. Based in San Diego, Rio SEO is among the largest independent providers of SaaS-based SEO automation solutions with patented technology. Rio SEO offers application modules for organic search and social media with tools for content marketing, auditing, reporting, change tracking, keyword discovery, competitive analysis, mobile site optimization, SEO execution, and local SEO automation. Clients include brand marketers, retailers, and digital agencies. More information about Rio SEO is available by calling 858.876.3010 or at rioseo.com.

Appendix
Definitions
Double-byte characters: Some languages, such as Japanese and Chinese, are made up of double-byte characters. If you just translate the English text into Japanese, the page size may become too large. One byte can represent at most 256 distinct characters, while two bytes can represent up to 65,536.

Unicode: An effort to include all characters from previous code pages in a single character enumeration that can be used with a number of encoding schemes.

ICANN: The right to use a domain name is delegated by domain name registrars who are accredited by the Internet Corporation for Assigned Names and Numbers (ICANN), the organization charged with overseeing the name and number systems of the Internet. In addition to ICANN, each top-level domain (TLD) is maintained and serviced technically by an administrative organization operating a registry.

W3C: The World Wide Web Consortium is an international industry consortium founded in 1994 whose purpose is to develop specifications, guidelines, software, and tools to promote the Internet's evolution.

UTF-8: The Universal Character Set (UCS) Transformation Format, 8-bit, is a variable-width encoding that can represent every character in the Unicode character set. UTF-8 has become the dominant character encoding for the World Wide Web, accounting for more than half of all Web pages.
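The byte counts behind the definitions above can be checked directly. The sketch below, with a hypothetical helper named `byte_widths`, measures how many bytes a single character occupies in UTF-8 (variable width) and in UTF-16, which uses two bytes for the characters shown here.

```python
from typing import Dict


def byte_widths(ch: str) -> Dict[str, int]:
    """Return the number of bytes one character occupies in UTF-8 and UTF-16."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        # "utf-16-le" is used so that no byte-order mark is counted.
        "utf-16": len(ch.encode("utf-16-le")),
    }


if __name__ == "__main__":
    print("values in one byte:", 2 ** 8)
    print("values in two bytes:", 2 ** 16)
    for ch in ["a", "é", "ф", "日"]:
        print(repr(ch), byte_widths(ch))
```

An unaccented Latin letter takes one byte in UTF-8, accented Latin and Cyrillic letters take two, and the Japanese character takes three, which is why translated pages can grow in size.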

References
http://www.icann.org/en/news/announcements/announcement-30jun06-en.htm
http://en.wikipedia.org/wiki/UTF-8
webhostingtalk.com
http://glreach.com/globstats
www.internetworldstats.com/stats.htm
http://ec.europa.eu/public_opinion/flash/fl_313_en.pdf
www.aclweb.org/anthology/P09-1120 Language Identification of Search Engine Queries
www.newgtldsite.com/gtld-application-process/
http://en.wikipedia.org/wiki/Country_code_top-level_domains
http://www.independent.co.uk/news/media/the-internet-gets-international-with-the-arrival-of-nonlatin-domainnames-1967712.html
http://news.cnet.com/8301-1023_3-57459036-93/whats-.google-want-with-101-new-.domains-anyway/
www.webpronews.com/what-does-googles-gtld-applications-say-about-the-company-2012-06
www.icann.org/announcements/idn-tld-cdnc.pdf
The Forrester Wave: SEO Platforms, Q4 2012: http://www.rioseo.com/forrester-rio-seo-is-the-only-leader
http://searchenginewatch.com/article/2064539/How-Search-Engines-Rank-Web-Pages
http://marketingland.com/5-reasons-why-national-brands-should-be-utilizing-local-targeting-in-mobile-24416
http://searchenginewatch.com/article/2065661/How-Language-Affects-the-Long-Tail
http://tools.ietf.org/html/rfc5891
http://www.discountdomainsuk.com/domain_services.php

9255 Towne Center Drive, Suite 750, San Diego, CA 92121, 858.876.3010, rioseo.com
© 2013 Rio SEO, Inc. All rights reserved. Rio SEO, the Rio SEO logo, Tag & Trace, Rio SEO Activate, Rio SEO Advertise and Rio SEO Analyze are trademarks of Rio SEO, Inc. in the U.S. and various countries. 02.25.13