
Ticket #21517 (closed enhancement: fixed)

Opened 2 years ago

Last modified 14 months ago

python25, python26: japanese locale errors

Reported by: null.atou@… Owned by: jwa@…
Priority: Normal Milestone:
Component: ports Version: 1.8.0
Keywords: Cc: mcalhoun@…
Port: python26 python25

Description

When the Python script eyeD3 is run in a Japanese environment, it fails with 'LookupError: unknown encoding: x-mac-japanese'.

% __CF_USER_TEXT_ENCODING=$UID:0:0 eyeD3 hoge.mp3   <--- encoding is 'mac-roman'
(snip; no error, but a lot of mojibake appears!)
% __CF_USER_TEXT_ENCODING=$UID:1:14 eyeD3 hoge.mp3  <--- in a Japanese environment, __CF_USER_TEXT_ENCODING is already set
(snip)
Uncaught exception: unknown encoding: x-mac-japanese
Traceback (most recent call last):
  File "/opt/local/bin/eyeD3", line 1215, in <module>
    retval = main();
  File "/opt/local/bin/eyeD3", line 1192, in main
    retval = app.handleFile(f);
  File "/opt/local/bin/eyeD3", line 566, in handleFile
    self.printTag(self.tag);
  File "/opt/local/bin/eyeD3", line 937, in printTag
    "replace"),
LookupError: unknown encoding: x-mac-japanese

This is because python26 (and also python25) does not look at the LANG environment variable (ja_JP.UTF-8 in most cases in Japan), but instead gets the encoding name 'x-mac-japanese' from the CoreFoundation functions CFStringGetSystemEncoding() and CFStringConvertEncodingToIANACharSetName() (see 'Lib/locale.py' and the source file 'Modules/_localemodule.c'). Unfortunately, Python only knows the codecs listed in the codec table at http://docs.python.org/library/codecs#standard-encodings, and that table has no 'x-mac-japanese', 'x-mac-trad-chinese', 'x-mac-korean', etc. Here is a simple test:

% __CF_USER_TEXT_ENCODING=$UID:1:14 python -c 'import locale; print "getdefaultlocale is", locale.getdefaultlocale(), ", getpreferredencoding :", locale.getpreferredencoding();'
getdefaultlocale is (None, 'x-mac-japanese') , getpreferredencoding : x-mac-japanese
% __CF_USER_TEXT_ENCODING=$UID:1:14 /usr/bin/python -c 'import locale; print "getdefaultlocale is", locale.getdefaultlocale(), ", getpreferredencoding :", locale.getpreferredencoding();'
getdefaultlocale is ('ja_JP', 'UTF8') , getpreferredencoding : UTF-8

Yes, Apple's python 2.6.1 in Snow Leopard (and also Apple's python 2.5.1 in Leopard) looks at LANG, not CF_...
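
To illustrate the failure mode: the LookupError comes from Python's codec registry, which rejects any encoding name it cannot resolve. The following is a minimal sketch, not part of the attached patch, of how a script could check the reported encoding and fall back to UTF-8. The helper names is_known_encoding and usable_encoding are made up for illustration, and the UTF-8 fallback is an assumption. Whether 'x-mac-japanese' itself resolves depends on the Python version.

```python
import codecs
import locale


def is_known_encoding(name):
    """Return True if Python's codec registry can resolve `name`."""
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        return False


def usable_encoding(reported, fallback="utf-8"):
    """Return `reported` if a codec exists for it, otherwise `fallback`.

    The fallback value is an assumption made for this sketch.
    """
    if reported and is_known_encoding(reported):
        return reported
    return fallback


if __name__ == "__main__":
    reported = locale.getpreferredencoding()
    print("reported: %s, using: %s" % (reported, usable_encoding(reported)))
```

This only works around the symptom in a calling script; it does not make Python honor LANG, which is what the patch is about.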

I referred to web pages (in Japanese) here and here, and made a simple patch for the locale problem. With this patch, python26 looks at the LANG environment variable and gets the well-known encoding 'UTF-8', so eyeD3 produces no error and no mojibake :-)

By the way, python31 has no unknown-encoding error, because a similar change was applied to Modules/_localemodule.c in version 3.1.

Attachments

patch-locale-from-apple-darwinsource.diff (1.1 KB) - added by null.atou@… 2 years ago.
patch to Lib/locale.py and Modules/_localemodule.c
hoge.mp3 (18.0 KB) - added by null.atou@… 2 years ago.
ID3v2.3; artist and album tags are in Japanese

Change History

Changed 2 years ago by null.atou@…

patch to Lib/locale.py and Modules/_localemodule.c

  Changed 2 years ago by jmr@…

  • keywords Python, locale removed
  • owner changed from macports-tickets@… to blb@…
  • port changed from python26, python25 to python26 python25
  • cc mcalhoun@…, mww@… added

You should also open a bug upstream (if you haven't already).

  Changed 2 years ago by blb@…

This looks like it could be python issue 1276, which has a different fix (one that was only committed to the py3k branch and not to 2.x). Does that fix work as well? Do you have a file which can be used to reproduce this issue (if that mp3 isn't shareable)?

  Changed 2 years ago by null.atou@…

In a Japanese environment, when a new user is created, the default LANG is set to ja_JP.UTF-8, while the dot file ~/.CFUserTextEncoding (which determines __CF_USER_TEXT_ENCODING) contains '1:14'. The '1' means that the encoding is CP10001, i.e. x-mac-japanese. But we Japanese do not use x-mac-japanese in Terminal; we use UTF-8. Yes, we prefer UTF-8, not x-mac-japanese.

At a glance, the approach in python issue 1276 seems to be adding encodings (x-mac-japanese, etc.) to the codec table. But those patches do not fix the locale problem in a LANG=ja_JP.UTF-8 environment, because LANG is still ignored. I also found that Python 3.1 (more precisely, 3.1rc2) removed the special routine for darwin or __APPLE__ that uses the CoreFoundation functions mentioned above, so that it follows the standard UNIX behavior. This change looks the same as Apple's patches.

% LANG=ja_JP.UTF-8 /usr/bin/python2.6 -c 'import locale; print(locale.getpreferredencoding());'
UTF-8
% LANG=ja_JP.UTF-8 /opt/local/bin/python3.1 -c 'import locale; print(locale.getpreferredencoding());'
UTF-8
% LANG=ja_JP.UTF-8 /opt/local/bin/python2.6 -c 'import locale; print(locale.getpreferredencoding());' 
x-mac-japanese  # bad!
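
The behavior being asked for, taking the encoding from the locale environment variables instead of CoreFoundation, can be sketched roughly as follows. This is an illustration only, not the code in the attached patch or in Python itself; the function name encoding_from_env is made up, and the lookup order LC_ALL, LC_CTYPE, LANG is the usual POSIX precedence, assumed here.

```python
import os


def encoding_from_env(environ=None):
    """Return the codeset part of LC_ALL / LC_CTYPE / LANG, or None.

    Locale names have the form language[_territory][.codeset][@modifier],
    for example "ja_JP.UTF-8".
    """
    if environ is None:
        environ = os.environ
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = environ.get(var)
        if not value:
            continue
        # Drop an optional "@modifier" suffix, then take what follows the dot.
        value = value.split("@", 1)[0]
        if "." in value:
            return value.split(".", 1)[1]
        return None  # e.g. "C" or "ja_JP": no explicit codeset given
    return None


if __name__ == "__main__":
    print(encoding_from_env({"LANG": "ja_JP.UTF-8"}))
```

With LANG=ja_JP.UTF-8 this yields 'UTF-8', which is a name the codec table knows.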

Changed 2 years ago by null.atou@…

ID3v2.3; artist and album tags are in Japanese

  Changed 2 years ago by null.atou@…

You can test with eyeD3 and hoge.mp3:

% __CF_USER_TEXT_ENCODING=$UID:1:0 /opt/local/bin/eyeD3 hoge.mp3
(LookupError)
% __CF_USER_TEXT_ENCODING=$UID:0:0 /opt/local/bin/eyeD3 hoge.mp3
(snip)
title: 		artist: ????????
album: ????????		year: 2009

Hmm, not mojibake but ??????? :-) With a private build from source, installed to /usr/local/bin and using /usr/bin/python, the result is as expected.

% LANG=ja_JP.UTF-8 /usr/local/bin/eyeD3 hoge.mp3

hoge.mp3	[ 18.00 KB ]
-------------------------------------------------------------------------------
Time: 00:01	MPEG1, Layer III	[ 128 kb/s @ 44100 Hz - Joint stereo ]
-------------------------------------------------------------------------------
ID3 v2.3:
title: 		artist: アップルの中の人
album: システムサウンド		year: 2009
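
The row of question marks is what one would expect if the tag text is encoded with the "replace" error handler (the traceback shows "replace" being passed in printTag) into a codec that cannot represent Japanese. A small sketch, assuming the mac-roman case from the first command and using the artist string from the output above:

```python
# -*- coding: utf-8 -*-
text = u"アップルの中の人"  # artist tag from the sample file

# mac-roman has no Japanese characters, so "replace" substitutes "?" for each.
as_mac_roman = text.encode("mac-roman", "replace")

# UTF-8 can represent every character, so the text survives a round trip.
as_utf8 = text.encode("utf-8", "replace")

print(as_mac_roman)
print(as_utf8.decode("utf-8") == text)
```

Each of the eight characters becomes one "?", matching the eight question marks shown for the artist field.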

  Changed 2 years ago by blb@…

Great, thanks for the reproducible tests; python26 fixed in r58097.

The other question is whether we want to bother with python25, since it doesn't work quite right on 10.6 and isn't as well maintained upstream anymore.

  Changed 2 years ago by null.atou@…

Replying to blb@…:

python26 fixed in r58097.

Thank you! eyeD3 works well, and python looks at LANG. No more error, no more mojibake. Thank you!

The other question is whether we want to bother with python25

Yes, of course, because python25 behaves the same way with respect to the LANG env-var.

  Changed 2 years ago by jmr@…

  • cc jwa@… added; mww@… removed

  Changed 17 months ago by blb@…

  • cc mcalhoun@… removed
  • owner changed from blb@… to mcalhoun@…

  Changed 14 months ago by jmr@…

  • cc mcalhoun@… added; jwa@… removed
  • owner changed from mcalhoun@… to jwa@…
  • summary changed from py26-eyed3-0.6.17 LookupError: unknown encoding: x-mac-japanese to python25, python26: japanese locale errors

  Changed 14 months ago by jmr@…

  • status changed from new to closed
  • resolution set to fixed

Applied to python25 in r74671.
