<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Tofugu &#187; html</title>
	<atom:link href="http://www.tofugu.com/tag/html/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.tofugu.com</link>
	<description>A Japanese Language &#38; Culture Blog</description>
	<lastBuildDate>Fri, 11 Apr 2014 22:42:45 +0000</lastBuildDate>
	<language>en-US</language>
		<sy:updatePeriod>hourly</sy:updatePeriod>
		<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.8.2</generator>
	<item>
		<title>The Sorry State Of Japanese On The Internet</title>
		<link>http://www.tofugu.com/2012/04/04/the-sorry-state-of-japanese-on-the-internet/</link>
		<comments>http://www.tofugu.com/2012/04/04/the-sorry-state-of-japanese-on-the-internet/#comments</comments>
		<pubDate>Wed, 04 Apr 2012 19:00:13 +0000</pubDate>
		<dc:creator><![CDATA[Hashi]]></dc:creator>
				<category><![CDATA[Editorial]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[html]]></category>
		<category><![CDATA[internet]]></category>
		<category><![CDATA[ruby]]></category>
		<category><![CDATA[standards]]></category>
		<category><![CDATA[unicode]]></category>

		<guid isPermaLink="false">http://www.tofugu.com/?p=17897</guid>
		<description><![CDATA[Japanese text on the web is a lot like politics and sausage &#8211; it&#8217;s a messy process that nobody should ever have to see. But in the time I&#8217;ve been working at Tofugu, I&#8217;ve had to bear witness to some horrible, horrible things. Let me pull back the curtain for a bit and show the [&#8230;]]]></description>
				<content:encoded><![CDATA[<p>Japanese text on the web is a lot like politics and sausage &#8211; it&#8217;s a messy process that nobody should ever have to see. But in the time I&#8217;ve been working at Tofugu, I&#8217;ve had to bear witness to some horrible, horrible things.</p>
<p>Let me pull back the curtain for a bit and show the absolute nightmare that exists behind Japanese text on the internet. </p>
<h2>Kanji, Kanji Everywhere</h2>
<p>In case you&#8217;re unfamiliar with Asian history: ancient Chinese culture had a tremendous impact on virtually every culture in East Asia. Other countries in the region adopted Chinese food (like ramen!), customs, and parts of the language.</p>
<p>Most of you probably already know that the complicated characters in Japanese called kanji come from Chinese characters, but it doesn&#8217;t stop there. Korean has its own adaptation of Chinese characters called <em>hanja</em>, and until the colonial era, Vietnamese was written with Chinese characters too.</p>
<p>These are all known as <strong>han</strong> characters, or sometimes <abbr title="Chinese, Japanese, Korean">CJK</abbr> (Chinese, Japanese, Korean) characters.</p>
<p><img src="http://www.tofugu.com/wp-content/uploads/2012/04/we-are-the-kanji.jpg" alt="We Are The World" title="we-are-the-kanji" width="710" height="384" class="aligncenter size-full wp-image-17906" />You&#8217;d think that this would be a good thing, right? All these different countries and cultures using han characters, it&#8217;s like everybody&#8217;s joined hands and is singing <cite>We Are The World</cite>, right?</p>
<p>Oh, if only.</p>
<h2>Why Kanji Doesn&#8217;t Look Quite Right On The Internet</h2>
<p>Why are these han characters a problem? It has to do with Unicode, the widely used standard for encoding text from different languages on computers.</p>
<p>Somewhere down the line, somebody thought that it would be a great idea to save time and space by deciding that, in Unicode, the han characters these languages share are, for all intents and purposes, exactly the same: one code point per character, no matter which language it shows up in. This process is called Han Unification, and it would soon become the bane of my existence.</p>
<p>Han Unification is a problem because han characters can look different and mean different things in each language.</p>
<p>Just take a look at this picture: it&#8217;s the same Unicode character, rendered in five different languages:<br />
<img src="http://www.tofugu.com/wp-content/uploads/2012/04/unicode-confusion.png" alt="" title="unicode-confusion" width="710" height="220" class="aligncenter size-full wp-image-17995" />The Chinese versions look <em>completely different</em> from the other languages.</p>
<p>Unless the website explicitly says that a piece of kanji text is Japanese (with the HTML <code>lang</code> attribute), it may not look quite right. The browser might pick a Chinese-style font for the kanji and make everything around it (like the kana) look out of place.</p>
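<p>As a rough sketch of what that markup can look like (the sample character and the surrounding tags are illustrative assumptions, not taken from any particular site), the same code point, U+76F4, can be tagged with different languages so the browser has a hint about which glyph style to use:</p>
<pre><code>&lt;!-- Same Unicode character, different language hints --&gt;
&lt;p&gt;Japanese: &lt;span lang="ja"&gt;直&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;Simplified Chinese: &lt;span lang="zh-Hans"&gt;直&lt;/span&gt;&lt;/p&gt;
</code></pre>
<p>Whether the two lines actually render differently still depends on which fonts are installed on the reader&#8217;s device.</p>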
<p>This becomes a problem more often than you might think, whether it&#8217;s <a href="http://forum.koohii.com/viewtopic.php?id=8331" title="Android OS Unicode font (Han unification) issues - Reviewing the Kanji - Learning Japanese" target="_blank">on your phone</a>, when <a href="http://code.google.com/p/ankidroid/issues/detail?id=939" title="Issue 939 -   ankidroid -    Force font -   Flashcards on Android - Google Project Hosting" target="_blank">using electronic flashcards</a>, or just <a href="http://www.guidetojapanese.org/blog/2009/10/28/fonts-matter-people/" title="Fonts matter people! | Tae Kim&#8217;s Blog" target="_blank">reading the news</a>.</p>
<p>And when you throw different fonts, operating systems, and browsers into the mix, all bets are off.</p>
<p>Worse still, people argue that <em>this is exactly what Unicode should be like</em>. The argument is that, despite stylistic and cultural differences, underneath it all these characters are essentially the same.</p>
<p>I can understand the rationale behind Han Unification but, since I have the emotional capacity of a child and just want things to work, I&#8217;m going to say that it&#8217;s <em>dumb</em> and <strong>stupid</strong> and <strong><em>I hate it</em></strong>.</p>
<h2>Why Japanese Isn&#8217;t Readable On The Internet</h2>
<p>But hey, if your kanji looks wrong, all&#8217;s not lost. You can always use <em>furigana</em>, the simple, little characters you see above kanji to help you read them. Right?</p>
<p><strong>Wrong.</strong></p>
<p>While the technology to do this on the web exists (the HTML <code>ruby</code> element), you won&#8217;t see it much. It doesn&#8217;t work in every web browser (Firefox, for one), and few people choose to use it on their websites.</p>
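<p>For reference, here is a minimal sketch of ruby markup (the word is just an example chosen for illustration). The <code>rt</code> element holds the furigana, and the optional <code>rp</code> elements supply fallback parentheses for browsers that don&#8217;t understand ruby:</p>
<pre><code>&lt;ruby&gt;
  漢字 &lt;rp&gt;(&lt;/rp&gt;&lt;rt&gt;かんじ&lt;/rt&gt;&lt;rp&gt;)&lt;/rp&gt;
&lt;/ruby&gt;
</code></pre>
<p>In a browser without ruby support, this should fall back to plain inline text along the lines of 漢字 (かんじ).</p>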
<p><img src="http://www.tofugu.com/wp-content/uploads/2012/04/ruby-comparison.png" alt="" title="ruby-comparison" width="710" height="203" class="aligncenter size-full wp-image-17948" />I would <em>love</em> to include furigana with the kanji I write to make things easier for beginners, but right now it&#8217;s not really an option.</p>
<p>Unfortunately, web developers seem much more interested in tech demos and proof-of-concept sites than in making sure the web looks as good in other languages as it does in English.</p>
<p>Maybe someday Japanese will get the first-class treatment on the web that it deserves, but right now I think we have a long way to go.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tofugu.com/2012/04/04/the-sorry-state-of-japanese-on-the-internet/feed/</wfw:commentRss>
		<slash:comments>73</slash:comments>
		</item>
	</channel>
</rss>
