[Namazu-users-en] Re: namazu stopped working

IEM - network operating center noc at iem.at
Fri Nov 25 01:28:46 JST 2005

IEM - network operating center wrote:
> debugging output:
> running "namazu" from the command-line with debugging options gives me 
> following ("OSC" is a keyword which pops up every now and then)
> %> namazu -c -d --config=pd-list "OSC"
> namazu(debug): NAMAZUNORC: ''
> namazu(debug): load_rcfile: /etc/namazu/namazurc loaded
> namazu(debug):    5: Directive:  [Index]
> namazu(debug):       Argument 1: [/var/lib/namazu/index/pd-list]
> namazu(debug):   12: Directive:  [Template]
> namazu(debug):       Argument 1: [/var/lib/namazu/index/pd-list]
> namazu(debug):   14: Directive:  [Replace]
> namazu(debug):       Argument 1: [/var/lib/mailman/archives/private]
> namazu(debug):       Argument 2: [/pipermail]
> namazu(debug):   20: Directive:  [Logging]
> namazu(debug):       Argument 1: [on]
> namazu(debug): load_rcfile: pd-list loaded
> namazu(debug):  -n: 20
> namazu(debug):  -w: 0
> namazu(debug): query: [OSC]
> namazu(debug): Index name [0]: /var/lib/namazu/index/pd-list
> namazu(debug): set_phrase_trick: OSC
> namazu(debug): set_regex_trick: OSC
> namazu(debug): query.tokennum: 1
> namazu(debug): query.tab[0]: OSC
> namazu(debug): size of /var/lib/namazu/index/pd-list/NMZ.t: 132748
> namazu(debug): before nmz_strlower: [OSC]
> namazu(debug): after nmz_strlower:  [osc]
> namazu(debug): do WORD search
> namazu(debug): size of /var/lib/namazu/index/pd-list/NMZ.ii: 1492960
> namazu(debug): l:0: !
> namazu(debug): r:373239: µÎ¬
> namazu(debug): searching: ..)
> namazu(debug): searching:
> namazu(debug): searching: khz.

so after a bit more research i found that NMZ.ii does not return the
correct offset.
as far as i understand it, search::nmz_binsearch() performs a binary
search for the keyword, using NMZ.wi to look up the byte offset of a
given line in NMZ.w (which holds each keyword on a separate line).
it starts with line 186620 [=(373239+1)/2=(r+1)/2], which in fact
contains "clean;", but namazu thinks that it contains "..)".
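
The lookup scheme described above can be sketched roughly as follows. This is only an illustration of the idea (a sorted word file plus a table of byte offsets, one per line), not Namazu's actual code or on-disk format. The names build_offsets, word_at and binsearch are hypothetical, and the in-memory bytes merely stand in for NMZ.w and NMZ.wi.

```python
import io

def build_offsets(word_data: bytes) -> list[int]:
    """Return the byte offset at which each line of word_data starts."""
    offsets = []
    pos = 0
    for line in word_data.splitlines(keepends=True):
        offsets.append(pos)
        pos += len(line)          # length in bytes, newline included
    return offsets

def word_at(words: io.BytesIO, offsets: list[int], i: int) -> bytes:
    """Seek to the start of line i and read the keyword stored there."""
    words.seek(offsets[i])
    return words.readline().rstrip(b"\n")

def binsearch(words: io.BytesIO, offsets: list[int], key: bytes) -> int:
    """Binary search over the lines; return the line index or -1."""
    l, r = 0, len(offsets) - 1
    while l <= r:
        x = (l + r) // 2
        term = word_at(words, offsets, x)
        if term == key:
            return x
        if term < key:
            l = x + 1
        else:
            r = x - 1
    return -1

# Stand-in for a tiny, byte-sorted word file (hypothetical data).
data = b"clean\nkhz\nosc\nzoom\n"
offsets = build_offsets(data)
print(binsearch(io.BytesIO(data), offsets, b"osc"))  # prints 2
```

The search only works if every offset points at the first byte of a line. If even one offset is wrong, the term read at that position is garbage and the comparison sends the search in an arbitrary direction.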

more research revealed that the byte offset returned from NMZ.wi points
into the middle of a line, "clean....)"; since the term found there is
"..)", the binary search fails miserably.

i guess it is a problem with some multi-byte characters.
(which reminds me that when i build the index i get some warnings:
"Wide character in print at /usr/bin/mknmz line 2447, <GEN7162> line 

any hints on how i should proceed?
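
To make the multi-byte suspicion concrete, here is a small sketch of one way such a mismatch could arise. It assumes, purely for illustration, that the offset table was computed by counting characters while the reader seeks by bytes. Whether mknmz actually does this is not confirmed. The word list and the helper names (char_offsets, byte_offsets, read_at, mismatches) are hypothetical.

```python
# First entry contains two 3-byte UTF-8 characters (written as escapes).
words = ["\u691c\u7d22", "clean", "khz", "osc"]
data = "".join(w + "\n" for w in words).encode("utf-8")

def char_offsets(words):
    """Line starts counted in characters (the assumed faulty variant)."""
    offsets, pos = [], 0
    for w in words:
        offsets.append(pos)
        pos += len(w) + 1                    # +1 for the newline
    return offsets

def byte_offsets(words, encoding="utf-8"):
    """Line starts counted in encoded bytes (what a byte seek needs)."""
    offsets, pos = [], 0
    for w in words:
        offsets.append(pos)
        pos += len(w.encode(encoding)) + 1   # +1 for the newline
    return offsets

def read_at(data: bytes, offset: int) -> bytes:
    """Read from a byte offset up to the next newline."""
    end = data.find(b"\n", offset)
    return data[offset:end if end != -1 else len(data)]

def mismatches(data, offsets, words, encoding="utf-8"):
    """Indexes whose offset does not yield the expected keyword."""
    return [i for i, (off, w) in enumerate(zip(offsets, words))
            if read_at(data, off) != w.encode(encoding)]

print(mismatches(data, byte_offsets(words), words))  # prints []
print(mismatches(data, char_offsets(words), words))  # prints [1, 2, 3]
print(read_at(data, char_offsets(words)[2]))         # prints b'ean'
```

In this toy case every entry after the first multi-byte word is shifted, and the third lookup lands inside "clean" and returns only its tail. A check along the lines of mismatches(), run against the real index files, might show where the offsets first go wrong, but that would require knowing the exact layout of the offset file, which this sketch does not assume.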

