[Namazu-users-en] Re: 2.0.13 and wide characters problem +
index exists but no results problem
julio at cnedra.org
Thu Aug 12 07:18:29 JST 2004
Hi Takatsugu !
knok at daionet.gr.jp wrote:
> Unfortunately, current stable Namazu has no support for all UTF-8
Does that mean that we're doomed ? ;)
I've got a few questions regarding the issues Arkadiusz pointed out in
his mail on 27th of June.
I'm archiving a French newsgroup with mhonarc. I then generate
an index with mknmz and use namazu as the search tool.
Recently I've reached more than 250 000 text files in my database, and
I've seen lots of these errors when running mknmz:
Malformed UTF-8 character (unexpected continuation byte XYZ, with no
preceding start byte) messages
Wide character in print at /usr/bin/mknmz line 2475.
As a result, the answers to my queries are now always empty !!
With the full index:
With a temporary index I'm currently rebuilding:
I've spent weeks and weeks generating my index (it is now
370 MB), so I need to know whether I have to rebuild the whole thing...
Would using the fedora patch solve my problem ? Is there a way to
"clean" the index ? A mknmz option perhaps ? (I've not found it btw)
Namazu is really a great tool (and thanks a lot for bringing it to us
!), but it would really be a pain if I had to build my index again...
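To narrow down which archived messages trigger the "Malformed UTF-8 character" warnings, one option would be to scan the archive before running mknmz. The sketch below is not part of Namazu. It assumes the warnings come from files whose bytes are not valid UTF-8 (for example Latin-1 text), which is only a guess. The function name find_invalid_utf8 and the command-line handling are made up for illustration.

```python
import os
import sys


def find_invalid_utf8(root):
    """Yield (path, byte_offset) for each file under root that is not valid UTF-8.

    Hypothetical helper: it only reports files, it does not touch any index.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                data = fh.read()
            try:
                data.decode("utf-8")
            except UnicodeDecodeError as err:
                # err.start is the offset of the first byte that failed to decode
                yield path, err.start


if __name__ == "__main__":
    # Directory to scan; defaults to the current directory (assumption).
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for path, offset in find_invalid_utf8(root):
        print(f"{path}: invalid UTF-8 at byte {offset}")
```

Files reported this way could then be converted or left out before indexing. Whether that actually avoids the empty results is untested here.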
For your info:
12/08 0:04 julio at gourdon /mnt/frcd% mknmz -C
Loaded rcfile: /etc/namazu/mknmzrc
KAKASI: module_kakasi -ieuc -oeuc -w
ChaSen: module_chasen -j -F '%m '
Wakati: module_kakasi -ieuc -oeuc -w
Coding System: euc
Supported media types: (22)
Unsupported media types: (11) marked with minus (-) probably missing
application in your $path.
- application/excel: excel.pl
- application/ichitaro7: taro7_10.pl
- application/msword: msword.pl
- application/pdf: pdf.pl
- application/powerpoint: powerpoint.pl
- application/rtf: rtf.pl
- application/x-dvi: dvi.pl
- application/x-js-taro: taro7_10.pl
- application/x-rpm: rpm.pl
- application/x-tex: tex.pl
- audio/mpeg: mp3.pl
text/html; x-type=mhonarc: mhonarc.pl
text/plain; x-type=rfc: rfc.pl
zsh: 1803 exit 1 mknmz -C
Thanks a lot !
Julien Gourdon <julio at cnedra.org>