Opened 12 years ago

Closed 12 years ago

#31540 closed enhancement (fixed)

Enhance mecab family

Reported by: humem (humem)
Owned by: rsky0711@…
Priority: Normal
Milestone:
Component: ports
Version: 2.0.3
Keywords:
Cc: rsky0711@…
Port: mecab

Description

I propose enhancements to the MeCab family of ports in textproc. The current mecab port is excellent and very useful, but I think it lacks a few features.

First, the mecab port cannot install dictionaries other than IPA, such as jumandic and naist-jdic. Dictionary ports for mecab exist, but users have to modify a configuration file manually after installing them, which is a little inconvenient.

Second, other ports such as cabocha and mecab-java, which require the UTF-8 version of mecab rather than the default EUC-JP one, cannot specify the encoding of mecab. Although the port has variants for selecting its encoding, MacPorts unfortunately has no means to specify a port variant in a dependency.

I propose splitting the mecab port into three components: mecab meta ports (mecab, mecab-sjis, mecab-utf8), mecab-base, and mecab dictionary ports (mecab-ipadic*, mecab-jumandic*, mecab-naist-jdic*). Please refer to my user directory in svn: http://trac.macports.org/browser/users/hum/textproc

There are three mecab meta ports, one per encoding; the default mecab port, with no encoding in its name, is for EUC-JP. The encoding variants are deprecated: if you intend to install the UTF-8 version of MeCab, for example, you should install 'mecab-utf8' instead of 'mecab +utf8'. Each mecab meta port points the system default dictionary path at a specific dictionary. The default dictionary is the IPA dictionary, but you can select any of the available dictionaries with a variant, for example 'mecab +jumandic' or 'mecab-utf8 +naistjdic'.
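
To illustrate the naming scheme described above, here is a minimal Python sketch that builds the port spec for a given encoding and default dictionary. The helper name mecab_port_spec and the lookup keys ("eucjp", "sjis", "utf8", "ipadic", ...) are hypothetical labels chosen for this example; only the port names and variants come from the proposal, and nothing here is part of MacPorts itself.

```python
# Hypothetical helper illustrating the proposed naming scheme; not part of MacPorts.

# One meta port per encoding. The keys are labels made up for this sketch.
META_PORTS = {
    "eucjp": "mecab",      # default port, no encoding in its name
    "sjis": "mecab-sjis",
    "utf8": "mecab-utf8",
}

# Variant that selects the system default dictionary.
# IPA is the default dictionary, so it needs no variant.
DICT_VARIANTS = {
    "ipadic": "",
    "jumandic": "+jumandic",
    "naistjdic": "+naistjdic",
}


def mecab_port_spec(encoding="eucjp", dictionary="ipadic"):
    """Return the port name plus optional variant, e.g. 'mecab-utf8 +naistjdic'."""
    port = META_PORTS[encoding]
    variant = DICT_VARIANTS[dictionary]
    # strip() removes the trailing space when no variant is needed
    return f"{port} {variant}".strip()


if __name__ == "__main__":
    # Print the install command rather than running it.
    print("sudo port install", mecab_port_spec("utf8", "naistjdic"))
```

The resulting string is what you would pass to 'port install'; a port that needs the UTF-8 version can simply depend on mecab-utf8 by name, with no variant involved.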

The mecab dictionary ports for EUC-JP are deprecated, and primary dictionary ports without an encoding name are added for EUC-JP. In addition to ipadic and jumandic, a naist-jdic port is added.

In the Portfiles, indentation is adjusted, descriptions and maintainers are slightly changed, licenses are added, and checksums are updated. Livechecks are added to mecab-base, mecab-ipadic, mecab-jumandic and mecab-naist-jdic. A +dartsclone variant is added to mecab-base to use darts-clone instead of the standard darts.

Please check the proposed ports. Thanks in advance!

Change History (2)

comment:1 Changed 12 years ago by humem (humem)

Cc: rsky0711@… added; rsky071@… removed
Owner: changed from macports-tickets@… to rsky0711@…

comment:2 Changed 12 years ago by humem (humem)

Resolution: fixed
Status: changed from new to closed

Committed in r86218 (maintainer timeout). Let me know if you have any problems. Thanks.