Nowadays, free Japanese dictionaries are everywhere; you no longer have to pay an arm and a leg to buy a giant Japanese dictionary, but instead you can just look up things for free with a website, app, or browser extension.
But what a lot of people don’t know is that more often than not, these free dictionaries are labors of love. Very dedicated people build and maintain these dictionaries in their spare time, sacrificing a lot of time (and usually money) to keep them up and running.
I wanted to hear more about what it’s like to build and run one of these dictionaries, so I took some time to talk to Gregory Bober, the creator of one of my favorite Japanese dictionaries, Tangorin. We talked about Tangorin’s latest update, how Tangorin came about, where it’s going, and what’s wrong with Japanese dictionaries today.
Tell us a little about yourself: what’s your name, where are you from, etc.?
My name is Grzegorz (Gregory) Bober. I’m 26 and I’m from Poland. I’m a web developer working mostly on personal projects.
How did you get interested in learning Japanese in the first place?
I’ve always been a huge film buff. In my early teens I got interested in anime. I discovered a whole different film universe and wanted to watch and understand everything in it.
I had a chance to learn English very early on, so I got to enjoy movies on a whole new level. I never liked Polish cinema; there’s really not much to like. Once you get a taste of a really good American film and understand it without subtitles, cultural references included, there’s no going back.
With anime, I was stuck with subtitles, and that bothered me a lot since I knew very well how much I was missing out. That’s why I started learning Japanese. Simply to watch anime without subtitles.
Did you teach yourself Japanese, or did you learn in a class?
Luckily there was a small foreign language school in my home town that offered a course in Japanese language, culture, and calligraphy. I went there during high school. Then I got accepted to the Japanology department at the University of Warsaw. It was a very intense course in everything related to Japan, with a strong focus on language. Knowing kana, basic kanji, basics of grammar and having watched a lot of anime helped me tremendously.
I was a pretty good student at first, but since I’m very lazy and get bored easily, I got progressively worse. The fact that we started to spend too much time on classical Japanese and Buddhism didn’t help either. Learning Buddhist mantras by heart and deciphering Heian-jidai love letters from princes to every woman in the imperial court, while still not being able to have a proper conversation, seemed pointless. I lost interest and decided to drop out.
I never saw myself as a translator or a language teacher, so there was no point in staying. I wrote my thesis about the influences of Western culture in the works of Shinichiro Watanabe (Cowboy Bebop and Samurai Champloo). My professor liked it, but I never got my degree.
After so many years of studying, my Japanese still isn’t that great, but don’t worry, I don’t actually translate anything at Tangorin.
Have you ever traveled to or lived in Japan?
After I dropped out of university I spent a little over a year traveling, mostly in East and Southeast Asia. I stayed in Tokyo for six months. I lived in a long-term guest house in Nishi-Funabashi.
I’m not a huge fan of sightseeing. I prefer to stay in one place, live as much like a local as possible, and wait until I get to know my surroundings really well, so that’s exactly what I did. I got out of Tokyo only once, to climb Mt. Fuji. Most of the time I spent with friends in Shibuya, Akihabara, Shinjuku, Harajuku, and Roppongi (in that order).
My Japanese was good enough to live there comfortably without any English. I had to leave because my two visas had run out and it’s not that easy to stay in Japan as a freelance developer without a degree. I definitely want to go back.
What is Tangorin? What does the name “Tangorin” mean?
Tangorin is essentially an online interface to various open projects built for Japanese language students. My job is to normalize data from several dictionary files, mostly from Jim Breen’s WWWJDIC, combine them into a single fast and easy-to-search database, and provide some basic additional features like custom vocabulary lists.
As for the name, I wanted something short and simple that sounded Japanese but was easy to pronounce and spell in Western languages. 単語 (tango) means “words,” and 林 (rin) is a common suffix for dictionaries. I liked how it kind of sounded like tangerine, and that it was a very unpopular word on Google.
What new features are you adding to Tangorin?
The biggest change in the newest update will be a whole new interface based on Twitter Bootstrap. The main layout won’t change much but it will be more consistent and mobile friendly. Much faster too, performance- and bandwidth-wise. The most important thing is that it will help me make more updates on a more regular basis.
The wildcard functionality will be back (it was a terrible mistake to sacrifice it for database performance, sorry for that), along with a few new features in Vocabulary and improved sorting of search results. Shortly after the update I will add a look-up method based on the Wikipedia API. It will translate article names into selected languages.
When did you decide to make Tangorin?
While studying at the university I realized there was no good online Japanese dictionary. There was of course WWWJDIC, Hideki (no longer exists), jisho.org, and a few others, but apart from WWWJDIC they were all based on EDICT (and still are).
EDICT is basically a legacy database format for the newer, much better structured JMdict. The main difference between the two formats is that a single entry in JMdict contains all the synonyms, alternative kanji writings, and readings associated with the Japanese term it describes, whereas in EDICT they are divided into multiple reading-writing pairs, each with a copy of the English definition.
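To make the structural difference concrete, here is a minimal Python sketch. It is not the real JMdict or EDICT file format and not Tangorin's code: the `Entry` class, its field names, the `flatten` helper, and the sample entry are all hypothetical and only illustrate the idea of one grouped entry versus several duplicated records.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """JMdict-style entry (simplified): one record holds every writing and reading."""
    writings: list[str]
    readings: list[str]
    glosses: list[str]


def flatten(entry: Entry) -> list[tuple[str, str, str]]:
    """Split one grouped entry into EDICT-style (writing, reading, definition) records.

    Every record carries its own copy of the English definition.
    """
    definition = "/".join(entry.glosses)
    return [(w, r, definition) for w in entry.writings for r in entry.readings]


# Illustrative sample data, not copied from the dictionary files.
entry = Entry(
    writings=["取り扱い", "取扱い", "取扱"],
    readings=["とりあつかい"],
    glosses=["handling", "treatment"],
)

for record in flatten(entry):
    print(record)
```

One grouped entry becomes three separate records here, each repeating the same definition. Pairing every writing with every reading is a simplification; a real entry can restrict a reading to particular writings.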
I was also disappointed with the overall functionality of available dictionaries, how they weren’t properly linked together and lacked useful features like creating your own vocabulary lists. I made a simple interface for JMdict for personal use and then made it public at tangorin.com. Soon, most of my friends from Japanology started using it and they’ve been very helpful with its development.
What are your long-term goals for Tangorin?
First of all, to make it profitable. Right now running Tangorin costs me a lot of time and money. I really enjoy working on it and want to spend more time developing it. There’s a lot of room for improvement. Donations have been scarce but I don’t want to clutter the layout with more ads or limit Tangorin’s free functionality to offer features for a fee. I still need to figure this out. There’s a strong demand for a mobile app, especially on Android, so that’s definitely on my to-do list.
Apart from that: autosuggest, incorporating Japanese WordNet to build a synonyms dictionary, handwriting recognition, a simple morphological analyzer built with MeCab, kanji decomposition, a built-in spaced repetition system to effectively study words from Tangorin vocabulary lists, better forums, a REST API, pronunciation, audio files.
How do you try to make Tangorin stand out?
By focusing on developing a clear, fast, and easy-to-use search experience, and by combining different lookup methods so that you can search with Japanese, English, kana, romaji, kanji, and/or tags from a single input form.
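As a rough illustration of how a single input form could decide which lookup to run, the Python sketch below classifies a query by the scripts it contains. It is an assumption for illustration only, not Tangorin's implementation: the `detect_scripts` function and its labels are hypothetical, and the checks cover only the basic Hiragana, Katakana, and CJK Unified Ideographs Unicode blocks.

```python
def detect_scripts(query: str) -> set[str]:
    """Return the set of scripts found in a query: 'kana', 'kanji', and/or 'latin'."""
    scripts = set()
    for ch in query:
        code = ord(ch)
        if 0x3040 <= code <= 0x30FF:      # Hiragana and Katakana blocks
            scripts.add("kana")
        elif 0x4E00 <= code <= 0x9FFF:    # CJK Unified Ideographs block
            scripts.add("kanji")
        elif ch.isascii() and ch.isalpha():
            scripts.add("latin")          # English or romaji; not distinguishable here
    return scripts


for query in ["単語", "たんご", "tango", "読む"]:
    print(query, detect_scripts(query))
```

A query in Latin letters could be either English or romaji, so a dictionary taking this approach would presumably have to try both lookups and merge the results.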
So many Japanese dictionaries nowadays rely on Jim Breen’s WWWJDIC — do you think this is a good or a bad thing?
Definitely a bad thing. The fact that many dictionaries still use EDICT instead of JMdict makes it even worse. WWWJDIC is a fantastic project and the quality of its translations is pretty good. Any kind of alternative, especially for more experienced students, would be great.
We also need better example sentences.
I would love to license Kenkyusha’s database, both English-Japanese and Japanese-Japanese dictionaries, but I don’t have the resources to do that. Perhaps when Tangorin Android and iOS apps are finished.
Do you have any other projects you’re working on right now?
I have other small projects and ideas to work on but right now I’m focused only on Tangorin. It’s been five years since I started developing it and I feel like a lot more could have been done in that time.
Thanks to Gregory Bober for the interview! You can check out Tangorin here.