Namazu-users-en(old)



Re: Malformed UTF-8 character ...



On May 6, 2004 at 10:40, Tadamasa Teranishi wrote:

> > Figuring it was a LANG environment variable setting, I explicitly set
> > LANG to en_US (it had defaulted to en_US.UTF-8), but that did not fix it.
> > Maybe I should try en_US.ISO-8859-1?
> 
> xxxx.UTF-8 is not supported.

I'm aware of this.

> Instead of "en_US.UTF-8", you have to set "C".
> 
> Probably "LC_ALL" or "LC_CTYPE" etc. is set to en_US.UTF-8.
> Please set LC_ALL=C and then run mknmz.

Is there any drawback to adding the "use bytes" pragma to
avoid this problem?  Is there a need to support older versions
of Perl that do not provide the pragma?

As a sanity check, namazu could inspect the locale (checking the various
environment variables) and, if a UTF-8 locale is set, either generate
a warning and fall back to the C locale, or error out stating that
the locale is unsupported.
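
A minimal sketch of such a check, written as a shell wrapper rather than
inside mknmz itself. The function names are hypothetical and none of this
is Namazu's actual code; it only assumes the usual POSIX precedence of
LC_ALL over LC_CTYPE over LANG.

```shell
#!/bin/sh
# Hypothetical helper: print the locale that governs character handling.
# POSIX precedence: LC_ALL, then LC_CTYPE, then LANG; default is "C".
effective_ctype() {
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "${LC_CTYPE:-}" ]; then
        printf '%s\n' "$LC_CTYPE"
    else
        printf '%s\n' "${LANG:-C}"
    fi
}

# Succeeds if the locale name carries a UTF-8 codeset suffix,
# e.g. en_US.UTF-8, ja_JP.utf8, or en_US.UTF-8@euro.
is_utf8_locale() {
    case "$1" in
        *.[Uu][Tt][Ff]-8|*.[Uu][Tt][Ff]8) return 0 ;;
        *.[Uu][Tt][Ff]-8@*|*.[Uu][Tt][Ff]8@*) return 0 ;;
        *) return 1 ;;
    esac
}

# Warn and fall back to the C locale when a UTF-8 locale is in effect.
force_c_locale_if_utf8() {
    current=$(effective_ctype)
    if is_utf8_locale "$current"; then
        echo "warning: locale '$current' is unsupported; falling back to C" >&2
        LC_ALL=C
        export LC_ALL
    fi
}
```

A wrapper script could call force_c_locale_if_utf8 and then run
exec mknmz "$@". To error out instead of falling back, the function
could replace the LC_ALL assignment with exit 1.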

With newer Linux distributions now defaulting to UTF-8-based locales,
such checks may eliminate user mail to the list about this.

--ewh