Say the words “character encoding standard” to most people and their brains will congeal into a pile of glazed donuts, like 🍩. See how I embedded a cute little donut directly into that last sentence? You can thank Unicode for that. What is Unicode and how did it become the universal standard for digitally representing the world’s writing systems (yes, including emoji)? Plenty has been written about its history already, but here’s an attempt at a very brief overview.
A long time ago, when telephones still had cords and the internet didn’t fit in your pocket, there wasn’t a very reliable way to ensure that whatever character you included in your digital document would still be there when that document showed up on someone else’s Commodore 64. To support multilingual texts and data, a universal standard for encoding characters was vital. And although standards existed, each usually covered only one language or platform, not many. From “Understanding Unicode – I: A general introduction to the Unicode Standard” by Peter Constable:
This situation led to considerable difficulty for developers and for users working with multilingual data. Products were often tied to a single encoding, which did not allow users to work with multilingual data or with data coming from incompatible systems. Developers were also required to support multiple versions of their products to serve different markets, making development and deployment for multiple markets a difficult process. In order to support data created using others’ products, developers had to support a variety of different standards for a single language. In order to work with multilingual data, they needed to support several standards simultaneously since no one standard supported more than a handful of languages. In turn, it was impossible to support multilingual data in plain text. Developing software that had anything to do with multilingual text had become incredibly difficult.
Enter the Unicode Consortium, which, after several years of discussions and workshops between lots of super smart people at places like Xerox, Apple, IBM and Microsoft, published the first Unicode standard volume in 1991. Much more detail is available in this riveting chronological recap of the making of Unicode version 1.0. That might sound like a bit of an exaggeration, but imagine the process for a team of international engineers and developers to agree upon a set of multilingual standards that had the potential to grow indefinitely in the future. And they didn’t even have Slack to help with their collaborative workflows and productivity levels!
The Unicode Standard, now on version 8.0 (9.0 is due out in a few short weeks), is a colossal document that defines how the world’s written languages are represented digitally, both those in use today and classical languages from historical texts. And yes, this most recent version included a detailed report on guidelines for the design and implementation of emoji. Version 9.0 promises 72 new emoji (in addition to nearly 7,500 other characters), so you’ll finally be able to include that terrifyingly happy Clown Face in all your digital messages (thanks, Unicode!).
Emoji are now at the forefront of popular culture, in no small part thanks to Unicode. Oxford Dictionaries’ Word of the Year for 2015? None other than 😂. Facebook recently introduced emoji reactions, to much fanfare and internet rage. And save the date for an air-conditioned nap: The Emoji Movie comes to theaters in Summer 2017. For type designers, emoji is a serious design consideration (although I’m still waiting for the Garamond taco). Check out Colin Ford’s excellent ongoing series, Making Faces (and Other Emoji).
But Unicode is about a lot more than emoji. Like, 120,000+ more characters. To wrap your head around that, check out this graph by co-founder and Unicode Consortium President Mark Davis demonstrating the number of characters included over the past 20 years; emoji make up only a tiny number of defined characters.
So what are all of these 120,000+ characters? This handy chart lists all of the scripts and symbols in the most recent volume. To view it another way, this hypnotizing 2.5 hour video, part of the wonderful decodeunicode project from the Hochschule Mainz, shows 109,242 characters in Unicode 6.0 (it’s a few years old, ok?).
I’ll admit to occasionally playing this video in the background while working.
If language is culture and writing systems are ways of preserving that culture, then Unicode plays a monumental part in helping to provide access to minority and endangered languages, as well as the study of dead and classical languages. Given this task, it is understandable that there is some criticism of Unicode and its lack of support for certain minority languages. Language is also a form of political power, and decisions about what to include or leave out have implications far more impactful than the bacon emoji.
This past year, Unicode began its Adopt-a-Character fundraising program, allowing anyone to sponsor a single Unicode character of their choice for one year in perpetuity.
Alphabettes chose to adopt the 💌. How could we not? In honor of the February Love Letter series, the 💌 is all about spreading and sharing the adoration of the written word.
Part of the official sponsorship package included a printed certificate for our 1-year adoption of the U+1F48C Love Letter. When it arrived in the mail, I was more excited than I probably needed to be.
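For the curious, that U+1F48C on the certificate is the character’s code point, the number Unicode assigns to 💌. Here is a minimal Python sketch, using only the standard library’s `unicodedata` module, that looks up the code point, official name, and UTF-8 bytes for a character. The `describe` helper is a made-up name for illustration, not anything Unicode provides.

```python
import unicodedata


def describe(char: str) -> dict:
    """Return the code point, official name, and UTF-8 bytes of one character.

    `describe` is a hypothetical helper written for this post.
    """
    return {
        # Code points are conventionally written as U+ followed by hex digits.
        "code_point": f"U+{ord(char):04X}",
        # The character name comes from Python's bundled Unicode database.
        "name": unicodedata.name(char),
        # UTF-8 is one common way of turning a code point into bytes.
        "utf8": " ".join(f"{byte:02X}" for byte in char.encode("utf-8")),
    }


info = describe("\U0001F48C")  # the Love Letter emoji
print(info)
```

The name Python reports depends on the Unicode version bundled with your interpreter, so very recent characters may be missing on older installs.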
One of my favorite Twitter accounts these days is the Unicode Consortium’s, which tweets out recently adopted characters and their beloved adopters, although I might be one of the few people who haven’t muted it.
Alphabettes is now a Bronze sponsors of LOVE LETTER! #UnicodeSponsor https://t.co/X3d6nKqo0E pic.twitter.com/t0U5DiINnk
— Unicode Consortium (@unicode) April 14, 2016
Don’t worry, Unicode Consortium, we promise to take good care of 💌 for the year. It’s in good 🙌.