Search, Buscar, Поиск, جستجو, Cari: Creating a Good, Multilingual Search Experience

Aug 31, 2015

In 2015, DigitalGov Search dramatically expanded language support on our search results page, growing from just English and Spanish to 68 languages. Government agencies across the United States publish content in a growing number of languages to do the business of the country. Language-specific websites and mobile apps include not just translated content, but also translated site navigation and other lexical elements.

A word cloud of the word human, in multiple languages and colors.

kgtoh/iStock/Thinkstock

This month marks the 15th anniversary of EO 13166, which directed federal agencies and federally funded programs to provide meaningful access to information for people with limited English proficiency. In the years since its signing, over 275 websites or subsites have been published in over 65 languages, including a few multilingual portals. If your agency supports sites or apps in other languages, let us know about them!

Examples include USAGov en Español (formerly known as GobiernoUSA.gov) and the Spanish-language U.S. Citizenship and Immigration Services site. Beyond the experience of non-English-speaking site visitors in the U.S., many agencies have websites intended for audiences around the globe. The Department of State, for instance, has a local-language website for nearly all embassies and consulates. Imagine how jarring it would be, then, for a visitor to one of these sites to run a search and land on a results page written in English.

When a search is configured for another language, two major changes occur. First, search results will be primarily in that language. The language of a page is determined in two ways: by the locale setting in the page’s metadata, and by search engines that detect non-English words on the page and tag it in their indexes with a best guess as to its language. Second, the interface of the search results page is localized: any words that would appear on the page are presented to the user in the selected language, as you see on this search for ‘información sobre mi caso’ on USCIS.gov/es. This full and thoughtful translation of a system into another language is called localization.
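The two-step language check described above can be sketched roughly as follows. This is a minimal illustration in Python, not DigitalGov Search's actual implementation. It assumes the "locale setting" is the `lang` attribute on the page's `<html>` tag, and it stands in for real language detection with a deliberately tiny, hypothetical stopword table (`STOPWORDS`). The names `LangAttrParser`, `guess_from_text`, and `page_language` are invented for this example.

```python
import re
from html.parser import HTMLParser

# Hypothetical, deliberately tiny stopword lists. A real search engine would
# use a trained language detector rather than a handful of words.
STOPWORDS = {
    "en": {"the", "and", "of", "to", "is"},
    "es": {"el", "la", "de", "y", "sobre", "mi"},
}


class LangAttrParser(HTMLParser):
    """Records the lang attribute of the first <html> tag, if present."""

    def __init__(self):
        super().__init__()
        self.lang = None

    def handle_starttag(self, tag, attrs):
        if tag == "html" and self.lang is None:
            value = dict(attrs).get("lang")
            if value:
                # Keep only the primary subtag: "es-MX" becomes "es".
                self.lang = value.split("-")[0].lower()


def guess_from_text(text):
    """Best guess: the language whose stopwords occur most often, else None."""
    words = re.findall(r"\w+", text.lower())
    scores = {
        lang: sum(word in stops for word in words)
        for lang, stops in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def page_language(html):
    """Prefer the declared metadata; otherwise guess from the page text."""
    parser = LangAttrParser()
    parser.feed(html)
    if parser.lang:
        return parser.lang
    # Crudely strip tags before guessing from the remaining text.
    return guess_from_text(re.sub(r"<[^>]+>", " ", html))
```

With this sketch, a page that declares `lang="es-MX"` is reported as `es` without its text being inspected, while an undeclared page is classified only if it contains words from the stopword table.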

By now we are all familiar with on-the-fly translation services such as Google Translate, and we are aware of the accuracy problems in machine-generated translations. If you have not yet run a paragraph through an online translator and then had the result translated back into English, give it a try the next time you need a chuckle. Humans are still the most reliable translators from one language into another, because we understand how word choices change based on context and location.

All system text used in DigitalGov Search results pages has been crafted by people who know, for instance, which of the six French words for “related” is the right one for our “Related Searches” feature. (Hint: it’s not the one the computer recommends.) We also support right-to-left languages (like Arabic, Hebrew, and Urdu), and reorganize the page so that it reads naturally for users searching from right to left.
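One way to picture human-written system text combined with right-to-left support is a lookup table keyed by language code. The Python sketch below is an illustration under stated assumptions, not the DigitalGov Search code. The `SEARCH_LABEL` table reuses the five words from this article's title, `RTL_LANGUAGES` holds the three right-to-left languages named above plus Persian, and `search_button_html` is a hypothetical helper. Falling back to English when no translation exists is also an assumption.

```python
from html import escape

# Human-written labels. These five words come from the article's title.
SEARCH_LABEL = {
    "en": "Search",
    "es": "Buscar",
    "ru": "Поиск",
    "fa": "جستجو",
    "id": "Cari",
}

# Arabic, Hebrew, and Urdu are named in the text; Persian is also written
# right to left.
RTL_LANGUAGES = {"ar", "he", "ur", "fa"}


def text_direction(lang):
    """Return the HTML dir value for a language code."""
    return "rtl" if lang in RTL_LANGUAGES else "ltr"


def search_button_html(lang):
    """Render a search button labeled and oriented for the given language."""
    # Assumption: fall back to the English label when no translation exists,
    # and mark the button as English so the direction matches the label.
    label_lang = lang if lang in SEARCH_LABEL else "en"
    label = SEARCH_LABEL[label_lang]
    return '<button lang="{}" dir="{}">{}</button>'.format(
        label_lang, text_direction(label_lang), escape(label)
    )
```

Keeping the label and the direction tied to the same language code avoids the case where a fallback English label is rendered inside a right-to-left element.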

The following 58 languages have fully localized results page support with both language-specific results and localized system text. (A big shout out to the USAGov en Español and Department of State teams for the localized text. Thanks!)

Albanian [SQ]     Hausa [HA]        Portuguese [PT]
Amharic [AM]      Hebrew [HE]       Punjabi [PA]
Arabic [AR]       Hindi [HI]        Russian [RU]
Armenian [HY]     Hungarian [HU]    Serbian [SR]
Azerbaijani [AZ]  Indonesian [ID]   Sindhi [SD]
Belarusian [BE]   Italian [IT]      Slovak [SK]
Bosnian [BS]      Kalaallisut [KL]  Somali [SO]
Bulgarian [BG]    Khmer [KM]        Spanish [ES]
Catalan [CA]      Korean [KO]       Swahili [SW]
Chinese [ZH]      Kyrgyz [KY]       Tagalog [TL]
Croatian [HR]     Latvian [LV]      Tajik [TG]
Czech [CS]        Lithuanian [LT]   Thai [TH]
Danish [DA]       Macedonian [MK]   Turkish [TR]
Dutch [NL]        Malay [MS]        Turkmen [TK]
English [EN]      Mongolian [MN]    Ukrainian [UK]
Estonian [ET]     Montenegrin [ME]  Urdu [UR]
French [FR]       Norwegian [NO]    Uzbek [UZ]
Georgian [KA]     Pashto [PS]       Vietnamese [VI]
German [DE]       Persian [FA]
Greek [EL]        Polish [PL]

Ten more — Baluchi, Bangla, Creole, Finnish, Icelandic, Japanese, Kazakh, Romanian, Slovene, and Swedish — provide language-specific results, but don’t yet have localized system text. We have open-sourced our localizations, and encourage you to contribute to them on GitHub.

The next time your agency is setting up a non-English website, don’t forget about providing your visitors with a localized search experience, too. It’s easy. We’ve done the heavy lifting for you. Just sign up for DigitalGov Search, set up a search site for any of the languages above, and your visitors will have a good, seamless search experience in their own language.