[Namazu-users-en] Re: Problems with mknmz and Perl 5.8.6

Tadamasa Teranishi yw3t-trns at asahi-net.or.jp
Sun Jun 12 12:23:22 JST 2005

Earl Hood wrote:
> I have, in a myriad of ways.  I just recreated things on one of my
> local systems to make analysis easier.
> I've made the command used and the output of a
> stock namazu 2.0.14 installation available for your examination at
> <http://www.mhonarc.org/tmp/mknmz-out.txt.gz>.  I.e., no modifications
> to the namazu code were made, so the many "malformed utf-8 ..." messages
> are present.  Perl also complains about wide characters in print.
> I've also made available the input files and NMZ.* files at
> the following locations:
> <http://www.mhonarc.org/tmp/namazu-users-en_NMZ_files.tar.gz>
> <http://www.mhonarc.org/tmp/namazu-users-en_input_files.tar.gz>.

Unfortunately, Namazu supports only ASCII text as input.
(Japanese text can also be used, but only in a Japanese environment.)
For instance,
namazu-users-en/2000-07/msg00000.html is Japanese text.
namazu-users-en/2003-06/msg00000.html contains non-ASCII text.
There are many others as well.

Please use Namazu with ASCII-only text.
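
One way to find which input files would have to be left out is to scan
them for bytes outside the ASCII range before running mknmz. The
following is a minimal sketch in Python, not part of Namazu; the function
names is_ascii_file and find_non_ascii_files are assumptions made for
illustration.

```python
import os

def is_ascii_file(path, chunk_size=65536):
    """Return True if every byte in the file is 7-bit ASCII."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                # Reached end of file without seeing a non-ASCII byte.
                return True
            if not chunk.isascii():
                return False

def find_non_ascii_files(root):
    """Walk root and return a sorted list of files containing non-ASCII bytes."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not is_ascii_file(path):
                hits.append(path)
    return sorted(hits)
```

Calling find_non_ascii_files("namazu-users-en") on the unpacked input
files would list the messages to exclude from indexing. Note that
bytes.isascii() requires Python 3.7 or later.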

> Also, the "Malformed UTF-8 ..." warnings are popping up, regardless
> of what LANG or LC_ALL are set to.  I had to add a 'use bytes' pragma
> to mailnews.pl at line 212 to get rid of the warnings.

'use bytes' only suppresses the warning; it does not solve the root
of the problem.

By the way, line 216 of mailnews.pl contains EUC-JP text.
The Japanese processing is performed even outside a Japanese environment,
which is why the warning appears.
I want to correct this so that the Japanese processing is done only in a
Japanese environment.
(However, this does not mean that non-ASCII text will become usable.)
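
The idea of running the Japanese processing only in a Japanese
environment can be sketched roughly as follows. This is written in Python
purely for illustration and is not the actual change to mailnews.pl; the
helper names are hypothetical, and deciding the environment from LC_ALL,
LC_CTYPE and LANG (in that order) is an assumption that may differ from
how Namazu selects its language.

```python
import os

def is_japanese_environment(env=None):
    """Guess whether the locale is Japanese.

    Assumption: the first non-empty variable among LC_ALL, LC_CTYPE and
    LANG decides, following the usual POSIX precedence.
    """
    env = os.environ if env is None else env
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var)
        if value:
            return value.lower().startswith("ja")
    return False

def apply_if_japanese(text, japanese_step, env=None):
    """Run japanese_step on text only in a Japanese environment.

    japanese_step is a hypothetical callable standing in for the
    Japanese-specific processing; otherwise text is returned unchanged.
    """
    if is_japanese_environment(env):
        return japanese_step(text)
    return text
```

With such a guard, a non-Japanese locale never reaches the code that
contains the EUC-JP text, so the warning would not be triggered there.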
Key fingerprint =  474E 4D93 8E97 11F6 662D  8A42 17F5 52F4 10E7 D14E
