Ticket #20686 (new defect)

Opened 4 years ago

Last modified 13 months ago

gsed fails to handle non-ASCII characters (bytes with top-bit set) in C locale

Reported by: vinc17@…
Owned by: macports-tickets@…
Priority: Normal
Milestone:
Component: ports
Version: 1.7.1
Keywords:
Cc: jabronson@…
Port: gsed

Description

For instance:

$ echo "abécd" | LC_ALL=C gsed -e 's/.*//'
écd

With the sed from Mac OS X and with GNU sed under GNU/Linux, one gets a blank line, so I suppose that this is what the user expects, even though é isn't part of the US-ASCII character set specified by the C locale (and even though the result could depend on the encoding with some expressions).
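To compare several sed binaries on the same input, the check above can be wrapped in a small script. This is only a sketch: `leftover_bytes` is a hypothetical helper name, not part of sed or MacPorts, and the input is written with octal escapes (`\303\251`, assuming é is encoded as UTF-8) so the test does not depend on the terminal encoding.

```shell
#!/bin/sh
# leftover_bytes SED_CMD   (hypothetical helper, for illustration only)
# Reads stdin, runs s/.*// with the given sed in the C locale, and prints
# how many bytes survive, not counting the trailing newline.
# 0 means '.' matched every byte; a non-zero count reproduces this ticket.
leftover_bytes() {
    out=$(LC_ALL=C "$1" -e 's/.*//')
    printf '%s' "$out" | LC_ALL=C wc -c | tr -d ' '
}

# Try every sed that is actually installed; gsed is the MacPorts name.
for s in sed gsed; do
    if command -v "$s" >/dev/null 2>&1; then
        n=$(printf 'ab\303\251cd\n' | leftover_bytes "$s")
        printf '%s: %s byte(s) left\n' "$s" "$n"
    fi
done
```

A sed that leaves écd behind, as in the example above, would report a non-zero count here; one that prints a blank line would report 0.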

The consequence is that building ocaml fails if gsed is installed with the with_default_names variant (see bug #20275).

Change History

comment:1 Changed 4 years ago by jabronson@…

  • Cc jabronson@… added

Cc Me!

comment:2 Changed 4 years ago by nox@…

I don't get an empty line with Mac OS X sed:

Bellcross:~ nox$ which sed gsed
/usr/bin/sed
/opt/local/bin/gsed
Bellcross:~ nox$ echo "abécd" | LC_ALL=C sed -e 's/.*//'
écd
Bellcross:~ nox$ echo "abécd" | LC_ALL=C gsed -e 's/.*//'
écd

comment:3 Changed 4 years ago by vinc17@…

That's strange. I have Mac OS X 10.4.11. If you have Leopard, perhaps Apple introduced a bug.

comment:4 Changed 4 years ago by vinc17@…

BTW, does bug #20275 occur on your machine?

comment:5 Changed 4 years ago by nox@…

I don't use +with_default_names.

comment:6 Changed 4 years ago by vinc17@…

Yes, but even without +with_default_names (or without gsed installed), you should probably be able to reproduce the bug, because your Mac OS X sed is buggy too.

comment:7 Changed 4 years ago by russ.bubley@…

I encountered this problem too. For me (on Tiger):

machine:~/bin user$ which sed gsed
/usr/bin/sed
/opt/local/bin/gsed
machine:~/bin user$ echo "ab\303\251cd" | LC_ALL=C sed -e 's/.*//'

machine:~/bin user$  echo "ab\303\251cd" | LC_ALL=C gsed -e 's/.*//'
écd

comment:8 follow-up: ↓ 9 Changed 3 years ago by jmr@…

Has this been reported upstream?

comment:9 in reply to: ↑ 8 Changed 2 years ago by vinc17@…

Replying to jmr@…:

Has this been reported upstream?

Yes: sed fails to handle bytes with top-bit set in C locale under Mac OS X

comment:10 Changed 13 months ago by jmr@…

  • Owner changed from nox@… to macports-tickets@…

-> nomaintainer
