
ChaSen and MeCab


Aug 24, 2012
Hi, does anyone have practical experience with Japanese part-of-speech and morphological parsers such as MeCab, ChaSen, JUMAN, etc.?
I am writing a web application that provides a graphical interface to such a tool, and I am currently using the ChaSen parser because its output seems easier to work with.
However, I have noticed that it seems to make mistakes with certain verbs and adjectives. For example, if I feed ChaSen the sentence "店へ行って家に帰ります", I get the following:
店 ミセ 店 名詞-一般
へ ヘ へ 助詞-格助詞-一般
行っ オコナッ 行う 動詞-自立 五段・ワ行促音便 連用タ接続
て テ て 助詞-接続助詞
家 イエ 家 名詞-一般
に ニ に 助詞-格助詞-一般
帰り カエリ 帰る 動詞-自立 五段・ラ行 連用形
ます マス ます 助動詞 特殊・マス 基本形
。 。 。 記号-句点
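
For anyone who wants to work with output in this shape, here is a minimal Python sketch of splitting each line into named fields. The function name `parse_chasen_line` and the field names are my own labels, not part of ChaSen, and the column meanings (surface, reading, base form, part of speech, conjugation type, conjugation form) are inferred from the output above. It assumes whitespace-separated columns and skips anything with fewer than four fields, such as an end-of-sentence marker.

```python
from typing import NamedTuple, Optional


class ChasenToken(NamedTuple):
    """One analysed token; field names are illustrative labels only."""
    surface: str
    reading: str
    base: str
    pos: str
    conj_type: Optional[str] = None  # absent for non-inflecting words
    conj_form: Optional[str] = None


def parse_chasen_line(line: str) -> Optional[ChasenToken]:
    """Split one output line on whitespace; return None for short lines."""
    fields = line.split()
    if len(fields) < 4:  # blank line or an end-of-sentence marker
        return None
    return ChasenToken(*fields[:6])


sample = """店 ミセ 店 名詞-一般
行っ オコナッ 行う 動詞-自立 五段・ワ行促音便 連用タ接続
EOS"""

tokens = [t for t in map(parse_chasen_line, sample.splitlines()) if t]
for t in tokens:
    print(t.surface, t.base, t.pos)
```

With fields named like this, the base form of 行っ can be read directly from `tokens[1].base`, which makes it easy to spot a wrong analysis.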

As you can see, it analyses 行っ as a form of 行う (okonau) instead of 行く (iku). If I give the same sentence to MeCab, it is interpreted correctly:
店 名詞,一般,*,*,*,*,店,ミセ,ミセ
へ 助詞,格助詞,一般,*,*,*,へ,ヘ,エ
行っ 動詞,自立,*,*,五段・カ行促音便,連用タ接続,行く,イッ,イッ
て 助詞,接続助詞,*,*,*,*,て,テ,テ
家 名詞,一般,*,*,*,*,家,イエ,イエ
に 助詞,格助詞,一般,*,*,*,に,ニ,ニ
帰り 動詞,自立,*,*,五段・ラ行,連用形,帰る,カエリ,カエリ
ます 助動詞,*,*,*,特殊・マス,基本形,ます,マス,マス
。 記号,句点,*,*,*,*,。,。,。
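
The comma-separated layout above can also be turned into named fields with a few lines of Python. This is only a sketch: `parse_mecab_line` and the names in `IPADIC_FIELDS` are my own labels, and the column order is an assumption read off the output shown here (it appears to match the IPA dictionary layout). Lines with no feature column, such as an end-of-sentence marker, are skipped, and short feature lists are padded with `*`.

```python
from typing import Dict, Optional

# Assumed column order, inferred from the output above (hypothetical names).
IPADIC_FIELDS = [
    "pos", "pos_detail1", "pos_detail2", "pos_detail3",
    "conj_type", "conj_form", "base", "reading", "pronunciation",
]


def parse_mecab_line(line: str) -> Optional[Dict[str, str]]:
    """Split 'surface<whitespace>feature,feature,...' into a dict."""
    parts = line.split(None, 1)  # surface form, then the feature string
    if len(parts) != 2:          # blank line or an end-of-sentence marker
        return None
    surface, feature_str = parts
    features = feature_str.strip().split(",")
    # Pad in case some entries carry fewer features than expected.
    features += ["*"] * (len(IPADIC_FIELDS) - len(features))
    token = dict(zip(IPADIC_FIELDS, features))
    token["surface"] = surface
    return token


tok = parse_mecab_line("行っ 動詞,自立,*,*,五段・カ行促音便,連用タ接続,行く,イッ,イッ")
print(tok["surface"], tok["base"], tok["reading"])
```

Here `tok["base"]` is 行く, the reading expected for this sentence.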

I haven't done any post-install configuration of either MeCab or ChaSen, so it may be that I need to learn more about fine-tuning them, but any advice would be appreciated.