Every Language, Every Script, All at Once
By Gabriel
From the start, we wanted Wokabulary to be a tool for learning any language, not just European languages.
That might sound obvious today. But the reality is: most computer systems, software, and websites were originally built with English and Western European languages in mind. Supporting the full richness of the world’s writing systems required thoughtful engineering at every step. In this article, we want to show you some of the hidden technologies and design decisions that make Wokabulary a truly global language learning tool.
Unicode: The Foundation for All Scripts
If you tried to write Korean, Hindi, or Amharic on a computer in the 1980s or 1990s, you would have quickly run into problems. Different systems used different “code pages” and encodings, and mixing languages often led to unreadable text.
Today, that chaos has largely been resolved thanks to Unicode. Unicode is a universal standard that assigns a unique number (called a “code point”) to every character, symbol, and even emoji, no matter what language it belongs to. Whether it’s the letter “A” (U+0041), the Korean syllable “한” (U+D55C), or the Devanagari character “अ” (U+0905), Unicode ensures every piece of text has a consistent identity across platforms.
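To make the idea of a code point concrete, here is a minimal sketch in Python (chosen only for illustration, it is not Wokabulary's own code) that looks up the number and official Unicode name of the three characters mentioned above. The helper name describe is our own invention.

```python
import unicodedata

def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

# The three examples from the paragraph above.
for ch in ["A", "한", "अ"]:
    print(ch, describe(ch))
```

The same lookup works for any character in any script, which is exactly what makes a single storage format for all languages possible.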
Wokabulary fully embraces Unicode. This means you can safely store, learn, and search vocabulary from any language — from commonly studied ones like Spanish and Japanese to lesser-known ones like Georgian or Khmer — without worrying about encoding errors or compatibility issues.
Typing: Making Multilingual Input Natural
While Unicode solves the storage problem, input is still challenging — especially when you’re learning a language with a completely different script.
Imagine learning Korean. Korean uses an alphabet called Hangeul, which is made up of simple, highly regular letters. To type Korean on a computer, you need to activate a Korean keyboard layout in the system settings. In the Hangeul layout, the key that used to be R now produces the consonant ㄱ, and K produces the vowel ㅏ. The system then automatically groups these letters into syllable blocks like 가 or 고, depending on the sequence you type.
Going back and forth between Latin and Korean script means manually switching the keyboard layout every time, a small but constant disruption when you are adding vocabulary.
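The grouping of letters into syllable blocks follows a simple arithmetic rule defined by Unicode. The sketch below, in Python, is only an illustration under simplifying assumptions: the key map covers just five keys of the standard two-set Korean layout, and the helper names compose and type_syllable are hypothetical. A real input method also has to decide whether a consonant ends one syllable or starts the next, which this sketch ignores.

```python
# Partial key map for the standard Korean two-set (Dubeolsik) layout.
KEY_TO_JAMO = {"r": "ㄱ", "s": "ㄴ", "g": "ㅎ", "k": "ㅏ", "h": "ㅗ"}

# Letters in the order used by the Unicode Hangul syllable composition formula.
LEADS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"
VOWELS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"
TAILS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")

def compose(lead, vowel, tail=""):
    """Combine a leading consonant, a vowel, and an optional final consonant."""
    index = (LEADS.index(lead) * 21 + VOWELS.index(vowel)) * 28 + TAILS.index(tail)
    # Precomposed Hangul syllables start at U+AC00.
    return chr(0xAC00 + index)

def type_syllable(keys):
    """Turn two or three Latin key presses into a single syllable block."""
    return compose(*(KEY_TO_JAMO[key] for key in keys.lower()))

print(type_syllable("rk"))   # 가
print(type_syllable("rh"))   # 고
print(type_syllable("gks"))  # 한
```

Because there are 19 leading consonants, 21 vowels, and 28 possible endings (including none), every one of the 11,172 modern syllable blocks gets its own code point by this formula.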
In Wokabulary, we’ve made this seamless: Each vocabulary language can have a preferred keyboard layout assigned. When you start entering or editing a word, Wokabulary automatically activates the correct keyboard for that language.
This small feature saves countless clicks and makes entering new words feel fluid and natural, whether you’re learning Korean, Arabic, Hebrew, or any other language.
Display: The Challenge of Fonts Across Scripts
Storing and inputting text is one thing. Displaying it beautifully and legibly across all languages is another.
Although Apple’s system font San Francisco supports a wide range of scripts, it can’t perfectly cover every typographic tradition. Different scripts sometimes need different styles to look right.
An interesting example is Urdu, spoken natively by around 70 million people in Pakistan and parts of India. Urdu uses an extended form of the Arabic alphabet, but it is traditionally written in a distinctive calligraphic style called Nastaliq. Unlike the style commonly used for Arabic, which runs along a flat horizontal baseline, Nastaliq forms a flowing, slanted cascade with elegant strokes and varying baselines, almost like a handwritten river of text.
macOS and iOS display Urdu text with a standard Arabic font by default because they cannot tell from the letters alone when Urdu-specific typographic conventions should apply. In Wokabulary, we detect when the language is set to Urdu and automatically apply a specialized font (Noto Nastaliq Urdu) that renders the text correctly in its traditional flowing style.
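Why can't the system work this out from the text itself? The short Python sketch below illustrates the problem. It approximates a character's script by the first word of its Unicode name, which is a rough heuristic and not a real script lookup. The font table and the names scripts_in and font_for are hypothetical; only the Urdu entry comes from this article.

```python
import unicodedata

def scripts_in(text):
    """Approximate the scripts in a text by the first word of each character name."""
    return {unicodedata.name(ch).split()[0] for ch in text if not ch.isspace()}

# Hypothetical language-to-font table; only the Urdu entry is from the article.
FONT_OVERRIDES = {"ur": "Noto Nastaliq Urdu"}

def font_for(language_code, default="system default"):
    """Pick a font from the vocabulary language, not from the characters."""
    return FONT_OVERRIDES.get(language_code, default)

arabic_word = "\u0643\u062a\u0627\u0628"  # "book" as spelled in Arabic
urdu_word = "\u06a9\u062a\u0627\u0628"    # "book" as spelled in Urdu

# Both words consist of characters from the same Arabic block.
print(scripts_in(arabic_word), scripts_in(urdu_word))
# Only the language setting tells the two apart.
print(font_for("ar"), "/", font_for("ur"))
```

Since both words report the same script, the decision has to come from the language the learner selected, which is the information Wokabulary already has.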
Diacritics: Tiny Marks, Big Challenges
Accents and diacritics — like the acute accent in é, the umlaut in ö, or the tilde in ñ — are common across many languages. But technically, they can be tricky.
In Unicode, characters with diacritics can be represented in two ways:
- As a precomposed character (e.g., “é” as a single unit, U+00E9)
- As a decomposed sequence (e.g., “e” followed by a combining acute accent, U+0065 + U+0301)
Visually, they look the same. But if you import vocabulary lists from different sources, you might encounter both forms, which can cause mismatches during searches or quizzes.
Wokabulary automatically normalizes all text internally to a consistent form. This means that whether you type “café” using a precomposed “é” or a decomposed “e” plus accent, Wokabulary treats both as the same word. And even if you type just “cafe” without any accent at all, Wokabulary understands what you mean and matches it correctly.
This normalization ensures consistent behavior and simplifies usage for learners working with accent-heavy languages like French, Spanish, or Vietnamese.
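The two steps described in this section, normalizing to one form and optionally ignoring accents, can be sketched in a few lines of Python. This is an illustration only: we assume the composed form NFC as the internal form, and the function names and the ignore_accents flag are hypothetical, not Wokabulary's actual implementation.

```python
import unicodedata

def normalize(text):
    """Bring text into one canonical form (NFC is assumed here)."""
    return unicodedata.normalize("NFC", text)

def strip_diacritics(text):
    """Decompose the text and drop all combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def matches(answer, expected, ignore_accents=False):
    """Compare two strings after normalization, optionally ignoring accents."""
    a, b = normalize(answer), normalize(expected)
    if ignore_accents:
        a, b = strip_diacritics(a), strip_diacritics(b)
    return a == b

precomposed = "caf\u00e9"   # é as a single code point
decomposed = "cafe\u0301"   # e followed by a combining acute accent

print(precomposed == decomposed)                          # False
print(matches(precomposed, decomposed))                   # True
print(matches("cafe", precomposed))                       # False
print(matches("cafe", precomposed, ignore_accents=True))  # True
```

Note that normalization alone only unifies the two spellings of “é”. Accepting plain “cafe” is a separate, more lenient comparison step.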
Typography: Adjusting for Detail and Clarity
Different writing systems vary enormously in how dense and detailed they are.
For example:
- Chinese characters like 學 (“learn”) and Japanese kanji like 語 (“language”) are packed with intricate strokes.
- Thai script words like เรียนรู้ (“learn”) use looping, stacked forms with subtle diacritics above and below the main letters.
- Hebrew words like שָׁלוֹם (“peace”) combine consonants and vowel marks compactly into a small visual space.
If displayed too small, these intricate features can easily blur together, making reading uncomfortable and learning harder.
Wokabulary adjusts font sizes dynamically based on the script and content to improve readability. Scripts with fine detail are displayed slightly larger, while still ensuring that longer multi-word phrases fit neatly.
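As a rough illustration of the idea, the Python sketch below picks a larger size for dense scripts and shrinks long phrases so they still fit. Every number in it (the scale factors, the base size, the length limit) is a hypothetical value chosen for the example, and the script check by character name is a simplification, not how Wokabulary actually measures text.

```python
import unicodedata

# Hypothetical scale factors, chosen only for illustration.
SCRIPT_SCALE = {"CJK": 1.3, "THAI": 1.2, "HEBREW": 1.2}

def font_size(text, base=17.0, max_chars=20):
    """Return a font size that grows for dense scripts and shrinks for long text."""
    scale = 1.0
    for ch in text:
        if ch.isspace():
            continue
        # The first word of the Unicode name serves as a rough script label.
        script = unicodedata.name(ch, "UNKNOWN").split()[0]
        scale = max(scale, SCRIPT_SCALE.get(script, 1.0))
    size = base * scale
    if len(text) > max_chars:
        # Shrink long phrases proportionally, but never below 80% of the base size.
        size = max(base * 0.8, size * max_chars / len(text))
    return round(size, 1)

for sample in ["learn", "學", "เรียนรู้"]:
    print(sample, font_size(sample))
```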
System Integration: For a Native Experience
Wokabulary is proudly built for macOS, iOS, and iPadOS — platforms that offer some of the best internationalization and accessibility support available.
Thanks to this, we can integrate powerful system features like:
- High-quality speech synthesis in dozens of languages.
- Built-in translation suggestions and dictionary lookup.
- Full support for Dynamic Type and VoiceOver, ensuring that Wokabulary is accessible to everybody.
By building on these strong foundations, we can focus on refining the learning experience, ensuring that Wokabulary works naturally and intuitively for every learner.
What’s New in Wokabulary 7.4
The latest update, Wokabulary 7.4, continues our mission to make the app even more accessible and learner-friendly:
- Larger default font sizes improve readability across scripts.
- Specialized font support for Urdu ensures authentic, beautiful typography for Nastaliq text.
- New settings let you choose between regular, large, and extra-large font sizes, tailoring the display to your personal needs.
We aim to make Wokabulary a tool that works for everyone — no matter which language you are learning, and no matter how that language is written.
Happy learning — in every script, everywhere!