Subject: Yeah something interesting
Author:
Posted on: 2021-01-12 01:46:30 UTC

So emoji were in Unicode before they were on most phones outside Japan. That's why phones everywhere could type and send them to begin with. They got into Unicode because Japanese cellphone manufacturers had a gap in their encoding scheme for characters (the Japanese language is big and complex, lotta characters, so there were some unassigned numbers there), so an engineer suggested putting little pictograms into the character set. And what Unicode set out to do was take every character in the world and assign it a number in a single, uniform system, and provide standardized encoding schemes, so that you could always, always send text from one computer to another, in any language, with the guarantee that the person at the other end can read the text the same as you can, and at worst will need to find a font that supports all the weird characters you used. And since Japan was using emoji, they came along for the ride.
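If you want to see the "send it and get the same text back" guarantee in action, here's a rough Python sketch. The sample string and the round_trip helper are made up for illustration. It only shows that the standard encodings carry the same characters as different bytes, it's not how any particular phone or app does it.

```python
# Hypothetical sample text mixing scripts; any string would do.
message = "emoji: 絵文字 😀"


def round_trip(text: str, encoding: str) -> str:
    """Encode text to bytes (as if sending it), then decode it back (as if receiving)."""
    wire_bytes = text.encode(encoding)
    return wire_bytes.decode(encoding)


for encoding in ("utf-8", "utf-16-le", "utf-32-le"):
    size = len(message.encode(encoding))
    # Same characters come out the other end, whatever the byte layout was.
    assert round_trip(message, encoding) == message
    print(f"{encoding}: {size} bytes for {len(message)} code points")
```

The byte counts differ per encoding, but the decoded text is identical every time, which is the whole point.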

By the way, the fact that Unicode exists, and that Unicode works, is entirely, 100% insane. We imagine huge undertakings as big and flashy, but over the past few decades, without the world noticing, a team of linguists and computer scientists and programmers has quietly assigned numbers and invented encoding schemes to allow anyone to write in any language and character set and be universally understood by the entire world. That's mad. The fact that I can even complain, as a programmer, about Unicode problems, is only because it's worked so well and been so successful that nine times out of ten the only reason the system breaks down is because a dumb programmer like me made an assumption about how the world works that doesn't hold true outside their language. That is, let me be clear, magic.

Also it's worth noting that, as characters, emoji aren't any different from any other characters. They're just numbers that tell the software to display an image. The character for 't' is the same, it just says to display... well, a t. No, if you want to talk about truly weird characters, you need to talk about control characters. The best of which is obviously U+202E, ‮which forces text to be interpreted as right-to-left...
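To make the "they're all just numbers" point concrete, here's a small Python sketch using the standard unicodedata module. The describe helper is my own name, and picking U+1F600 as the example emoji is an arbitrary choice.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point, official name, and general category of one character."""
    name = unicodedata.name(ch, "<unnamed>")
    return f"U+{ord(ch):04X} {name} (category {unicodedata.category(ch)})"


# A plain letter, an emoji, and the right-to-left override, written as
# escapes so the override doesn't scramble the source code itself.
for ch in ("t", "\U0001F600", "\u202E"):
    print(describe(ch))
```

All three go through exactly the same lookup. The only thing that sets U+202E apart is its category (Cf, a format character), meaning it has no picture of its own and just changes how the text around it is laid out.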
