[Namazu-users-en] Re: Problems with mknmz and Perl 5.8.6

Tadamasa Teranishi yw3t-trns at asahi-net.or.jp
Tue Jun 14 02:04:53 JST 2005

Earl Hood wrote:
> You do realize this greatly reduces the usability of namazu.

Of course, that may be so. 
However, the regrettable reality is that Namazu currently supports 
only the processing of 7-bit ASCII text. 

We understand the demand for using 8-bit characters, but it is simply 
not possible to support them at present. 

> In email, it is hard to control what charset you will get.  With my
> program, MHonArc, it does a fairly good job of "normalizing" the data,

Namazu's support for MHonArc is fairly old. 
It dates from before MHonArc began to output Unicode character entity 

> I do not care, because I do not want to deal with Japanese text
> with the data set in question.  Namazu should NOT be doing
> JP processing if the locale is not set to JP.  Therefore, things
> like multi-byte and wide characters are irrelevant.

If the input cannot be restricted, multi-byte and wide-character text 
may still be fed in. 
That can cause various adverse effects. 
For instance, the index may be corrupted. 

(To begin with, 8-bit characters have not even been tested.)
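
As a rough illustration only (this is not part of Namazu or mknmz), a
pre-check along the following lines could confirm that a document is
pure 7-bit ASCII before it is handed to the indexer. The function names
is_7bit_ascii and first_non_ascii_offset are hypothetical, and Python is
used purely for the sketch.

```python
def is_7bit_ascii(data: bytes) -> bool:
    """Return True if every byte lies in the 7-bit range 0x00-0x7F."""
    return all(b < 0x80 for b in data)


def first_non_ascii_offset(data: bytes) -> int:
    """Return the offset of the first byte >= 0x80, or -1 if there is none."""
    for offset, b in enumerate(data):
        if b >= 0x80:
            return offset
    return -1
```

A caller could skip, or report, any file for which is_7bit_ascii
returns False instead of indexing it.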

> If it is the decision of Namazu developers to exclude all character
> entity references in data, 

The corrected code is only a sample. 
As I wrote in my previous mail, this correction will never be applied 
to stable-2-0. 
However, since 8-bit characters cannot be supported, all I can say is: 
please use it with 7-bit ASCII text. 

A fix is scheduled for Namazu 2.0.15 for the part that goes through 
Japanese processing outside a Japanese environment of 
