ChaSen and MeCab

edosan

エド
Joined
24 Aug 2012
Messages
13
Reaction score
0
Hi, does anyone have any practical experience with Japanese part-of-speech and morphological parsers such as MeCab, ChaSen, JUMAN, etc.?
I am writing a web application to provide a nice graphical interface to such a tool, and I am currently using the ChaSen parser because its output seems easier to work with.
I have noticed, however, that it seems to make mistakes with certain verbs and adjectives. For example, if I feed ChaSen the sentence "店へ行って家に帰ります", I get the following:
店 ミセ 店 名詞-一般
へ ヘ へ 助詞-格助詞-一般
行っ オコナッ 行う 動詞-自立 五段・ワ行促音便 連用タ接続
て テ て 助詞-接続助詞
家 イエ 家 名詞-一般
に ニ に 助詞-格助詞-一般
帰り カエリ 帰る 動詞-自立 五段・ラ行 連用形
ます マス ます 助動詞 特殊・マス 基本形
。 。 。 記号-句点

As you can see, ChaSen analyzes 行っ as a form of 行う (okonau) instead of 行く (iku). If I give the same sentence to MeCab, it interprets it correctly:
店 名詞,一般,*,*,*,*,店,ミセ,ミセ
へ 助詞,格助詞,一般,*,*,*,へ,ヘ,エ
行っ 動詞,自立,*,*,五段・カ行促音便,連用タ接続,行く,イッ,イッ
て 助詞,接続助詞,*,*,*,*,て,テ,テ
家 名詞,一般,*,*,*,*,家,イエ,イエ
に 助詞,格助詞,一般,*,*,*,に,ニ,ニ
帰り 動詞,自立,*,*,五段・ラ行,連用形,帰る,カエリ,カエリ
ます 助動詞,*,*,*,特殊・マス,基本形,ます,マス,マス
。 記号,句点,*,*,*,*,。,。,。

I haven't done any post-install configuration of either MeCab or ChaSen, so it may be that I need to learn more about fine-tuning them, but any advice would be appreciated.
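
For anyone who wants to compare the two outputs programmatically, here is a rough Python sketch that reads both formats into one common structure and lists the tokens where the base forms disagree. It assumes the column layouts are exactly as in the output pasted above (ChaSen: surface, reading, base form, POS, then optional conjugation type and form; MeCab: surface followed by nine comma-separated features). The names Token, parse_chasen_line, parse_mecab_line and base_form_mismatches are made up for this example and are not part of either tool. It also assumes that both parsers split the sentence into the same tokens and that no surface form contains whitespace or a comma.

```python
from typing import List, NamedTuple, Optional, Tuple


class Token(NamedTuple):
    surface: str
    reading: str
    base: str
    pos: str
    conj_type: Optional[str]
    conj_form: Optional[str]


def parse_chasen_line(line: str) -> Token:
    # Assumed columns: surface, reading, base form, POS,
    # then optionally conjugation type and conjugation form.
    cols = line.split()
    surface, reading, base, pos = cols[:4]
    conj_type = cols[4] if len(cols) > 4 else None
    conj_form = cols[5] if len(cols) > 5 else None
    return Token(surface, reading, base, pos, conj_type, conj_form)


def parse_mecab_line(line: str) -> Token:
    # Assumed layout: surface, whitespace, then comma-separated features:
    # POS, sub1, sub2, sub3, conj. type, conj. form, base, reading, pronunciation.
    surface, features = line.split(None, 1)
    f = features.strip().split(",")
    # Join the POS levels with "-" so they look like the ChaSen POS column.
    pos = "-".join(p for p in f[:4] if p != "*")
    conj_type = f[4] if f[4] != "*" else None
    conj_form = f[5] if f[5] != "*" else None
    reading = f[7] if len(f) > 7 else ""
    return Token(surface, reading, f[6], pos, conj_type, conj_form)


def base_form_mismatches(
    chasen_lines: List[str], mecab_lines: List[str]
) -> List[Tuple[str, str, str]]:
    # Pair tokens by position; only meaningful if both tools
    # segmented the sentence identically.
    result = []
    for c_line, m_line in zip(chasen_lines, mecab_lines):
        c = parse_chasen_line(c_line)
        m = parse_mecab_line(m_line)
        if c.surface == m.surface and c.base != m.base:
            result.append((c.surface, c.base, m.base))
    return result
```

Run on the two listings above, the only token this should flag is 行っ, with 行う from ChaSen and 行く from MeCab.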
 
