From alexander at oelzant.priv.at Sun Jul 9 03:13:42 2006
From: alexander at oelzant.priv.at (Alexander Oelzant)
Date: Sun Jul 9 03:13:52 2006
Subject: [Namazu-users-en] meta-information on large binary files
Message-ID: <20060708181342.GZ1579@comitan.oelzant.priv.at>
Is there any possibility (or planned feature) to have namazu read just a
few Kb of a file in order to extract metadata? In analogy to the mp3
filter I've written an ogg plugin, but for the large radio recordings
it's prohibitively slow.
As the metadata is at the beginning of the stream, the overhead is
exorbitant; maybe there should be a method to just retrieve a chunk at a
time?
I realise this is probably rather a request to namazu-developers, but
I'm still hoping I might have overlooked the appropriate feature ...
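To illustrate how little of the file is actually needed, here is a minimal Python sketch (not a Namazu filter) that reads only the first 64 KB of an Ogg Vorbis file and pulls the comment fields out of it. The function names and the 64 KB limit are assumptions made for this example, and the parser assumes the whole comment packet sits contiguously inside that first chunk.

```python
import struct


def read_head(path, limit=65536):
    """Read at most `limit` bytes from the start of a file."""
    with open(path, "rb") as fh:
        return fh.read(limit)


def parse_vorbis_comments(head):
    """Extract Vorbis comment fields from the leading bytes of a stream.

    Assumption: the comment packet is small enough to be stored
    contiguously in `head` (no Ogg page boundary inside it).
    """
    tags = {}
    marker = b"\x03vorbis"  # packet type 3 = comment header
    start = head.find(marker)
    if start < 0:
        return tags
    pos = start + len(marker)
    try:
        # Vendor string: 32-bit little-endian length, then the bytes.
        (vendor_len,) = struct.unpack_from("<I", head, pos)
        pos += 4 + vendor_len
        (count,) = struct.unpack_from("<I", head, pos)
        pos += 4
        for _ in range(count):
            (length,) = struct.unpack_from("<I", head, pos)
            pos += 4
            raw = head[pos:pos + length]
            if len(raw) < length:
                break  # comment runs past the bytes we read
            pos += length
            key, sep, value = raw.decode("utf-8", "replace").partition("=")
            if sep:
                tags.setdefault(key.upper(), []).append(value)
    except struct.error:
        pass  # header truncated; return what was parsed so far
    return tags
```

Calling `parse_vorbis_comments(read_head("recording.ogg"))` on a hypothetical recording would touch only the first 64 KB, regardless of how large the file is.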
hand
Alexander
--
Alexander Oelzant (Durchlaufstr. 7/4/5, A-1200 Wien)
alexander@oelzant.priv.at aoe@fsinf.htu.tuwien.ac.at
ex-internic, ripe, bofh, priv.at: !ao418
+43 1 3500929 +43 676 84441065 McQ
From knok at daionet.gr.jp Tue Jul 11 13:58:37 2006
From: knok at daionet.gr.jp (NOKUBI Takatsugu)
Date: Tue Jul 11 13:58:41 2006
Subject: [Namazu-users-en] Re: meta-information on large binary files
In-Reply-To: <20060708181342.GZ1579@comitan.oelzant.priv.at>
References: <20060708181342.GZ1579@comitan.oelzant.priv.at>
Message-ID: <877j2kloua.wl%knok@daionet.gr.jp>
At Sat, 8 Jul 2006 20:13:42 +0200,
Alexander Oelzant wrote:
> Is there any possibility (or planned feature) to have namazu read just a
> few Kb of a file in order to extract metadata? In analogy to the mp3
> filter I've written an ogg plugin, but for the large radio recordings
> it's prohibitively slow.
If all the target files are of a single media type, you can do it as follows:
$ mknmz -O indexdir -t audio/mpeg target-dir
The -t (--media-type) option skips reading the target file to detect its
binary signature.
--
NOKUBI Takatsugu
E-mail: knok@daionet.gr.jp
knok@namazu.org / knok@debian.org
From alexander at oelzant.priv.at Tue Jul 11 21:49:24 2006
From: alexander at oelzant.priv.at (Alexander Oelzant)
Date: Tue Jul 11 21:49:42 2006
Subject: [Namazu-users-en] Re: meta-information on large binary files
In-Reply-To: <877j2kloua.wl%knok@daionet.gr.jp>
References: <20060708181342.GZ1579@comitan.oelzant.priv.at>
<877j2kloua.wl%knok@daionet.gr.jp>
Message-ID: <20060711124924.GC1579@comitan.oelzant.priv.at>
On Tue, Jul 11, 2006 at 01:58:37PM +0900, NOKUBI Takatsugu wrote:
> At Sat, 8 Jul 2006 20:13:42 +0200,
> Alexander Oelzant wrote:
> > Is there any possibility (or planned feature) to have namazu read just a
> > few Kb of a file in order to extract metadata? In analogy to the mp3
> > filter I've written an ogg plugin, but for the large radio recordings
> > it's prohibitively slow.
>
> If all the target files are of a single media type, you can do it as follows:
>
> $ mknmz -O indexdir -t audio/mpeg target-dir
>
> The -t (--media-type) option skips reading the target file to detect its
> binary signature.
Thanks, but unfortunately the filter still has to read in the entire
file for indexing. For a 200M file that produces processes like the
following:
  PID USER  PR NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
22312 user  15 10 1598m 592m 1104 R  0.3 39.2 0:08.19 mknmz
With enough swap, it takes about twenty minutes to extract the 5 lines
of data and insert those in the db ;-)
I was hoping that $ON_MEMORY_MAX = 5000000; would take care of that,
e.g. by only reading in the first part of the file, but according to
tips.html it only influences the size of the db files kept in memory.
That is only logical, though, since redesigning namazu to read in files
chunk by chunk would probably involve rewriting all the filters to use a
read_chunk() function instead of accessing $$contref directly.
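As a rough illustration of what such a chunk-at-a-time interface could look like, here is a Python sketch (Namazu itself is written in Perl and has no such function; `read_chunks` and `first_bytes` are hypothetical names). A consumer that only needs the leading bytes stops iterating, and the rest of the file is never read.

```python
def read_chunks(path, chunk_size=8192):
    """Yield successive chunks of a file instead of loading it whole."""
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                return
            yield chunk


def first_bytes(path, wanted, chunk_size=8192):
    """Collect only as many chunks as are needed to get `wanted` bytes."""
    buf = bytearray()
    for chunk in read_chunks(path, chunk_size):
        buf.extend(chunk)
        if len(buf) >= wanted:
            break  # stop early; the remainder of the file is not read
    return bytes(buf[:wanted])
```

A metadata filter built on this pattern would hold at most a few chunks in memory, whereas a filter that receives the complete contents has to hold the entire file.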
hand
Alexander
--
Alexander Oelzant (Durchlaufstr. 7/4/5, A-1200 Wien)
alexander@oelzant.priv.at aoe@fsinf.htu.tuwien.ac.at
ex-internic, ripe, bofh, priv.at: !ao418
+43 1 3500929 +43 676 84441065 McQ
From yw3t-trns at asahi-net.or.jp Tue Jul 11 22:03:09 2006
From: yw3t-trns at asahi-net.or.jp (Tadamasa Teranishi)
Date: Tue Jul 11 22:04:40 2006
Subject: [Namazu-users-en] Re: meta-information on large binary files
References: <20060708181342.GZ1579@comitan.oelzant.priv.at>
Message-ID: <44B3A18D.B78163E1@asahi-net.or.jp>
Namazu is a full-text search system.
For most document files, all of the data has to be analyzed,
so Namazu processes a file by reading all of it at once.
Audio and video files are a special case
(their meta-information is contained in the first part of the file).
Therefore, it is difficult to change the design to read files
partially.
--
=====================================================================
TADAMASA TERANISHI yw3t-trns@asahi-net.or.jp
http://www.asahi-net.or.jp/~yw3t-trns/index.htm
Key fingerprint = 474E 4D93 8E97 11F6 662D 8A42 17F5 52F4 10E7 D14E
From yw3t-trns at asahi-net.or.jp Tue Jul 11 22:31:34 2006
From: yw3t-trns at asahi-net.or.jp (Tadamasa Teranishi)
Date: Tue Jul 11 22:33:05 2006
Subject: [Namazu-users-en] Re: meta-information on large binary files
References: <20060708181342.GZ1579@comitan.oelzant.priv.at>
<877j2kloua.wl%knok@daionet.gr.jp>
<20060711124924.GC1579@comitan.oelzant.priv.at>
Message-ID: <44B3A836.BEB25EAB@asahi-net.or.jp>
Alexander Oelzant wrote:
>
> > If all the target files are of a single media type, you can do it as follows:
> >
> > $ mknmz -O indexdir -t audio/mpeg target-dir
> >
> > The -t (--media-type) option skips reading the target file to detect its
> > binary signature.
>
> Thanks, but unfortunately the filter still has to read in the entire
> file for indexing. For a 200M file that produces processes like the
> following:
Namazu reads the entire file at once.
In the worst case, the file is copied once by mknmz, read once, written
once by the filter, and read once more.
Specifying the -t option only shortens the time spent guessing the
media type.
> I was hoping the $ON_MEMORY_MAX = 5000000; would take care of that,
> e.g. only reading in the first part of the file, but according to
> tips.html that only influences the size of the db files kept in memory,
$ON_MEMORY_MAX does not specify how much of a file is read.
When several files are read and the total size of the files read
exceeds $ON_MEMORY_MAX, the index is written out once.
With "$ON_MEMORY_MAX = 5000000;" and files of 200MB, the index is
written out after every single file.
Because writing the index takes time, setting $ON_MEMORY_MAX to a very
large value reduces how often the index is written and shortens the
total time.
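The effect on the number of index writes can be estimated with a small Python model. This is a simplified sketch of the behaviour described above, not mknmz's actual code; the function name and the batch of twelve 200 MB recordings are assumptions for illustration.

```python
def count_index_writes(file_sizes, on_memory_max):
    """Count index writes under a simple model: sizes of the files read
    are accumulated, and the index is written out (and the counter
    reset) whenever the running total exceeds the threshold."""
    writes = 0
    pending = 0
    for size in file_sizes:
        pending += size
        if pending > on_memory_max:
            writes += 1
            pending = 0
    if pending:
        writes += 1  # final write for whatever is still in memory
    return writes


recordings = [200_000_000] * 12  # twelve hypothetical 200 MB files

print(count_index_writes(recordings, 5_000_000))      # 12
print(count_index_writes(recordings, 1_073_741_824))  # 2
```

Under this model the small threshold forces one write per recording, while a threshold of about 1 GiB batches six recordings per write.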
If the machine has enough memory, a setting such as
"$ON_MEMORY_MAX = 1073741824;" should make processing faster than it is now.
--
=====================================================================
TADAMASA TERANISHI yw3t-trns@asahi-net.or.jp
http://www.asahi-net.or.jp/~yw3t-trns/index.htm
Key fingerprint = 474E 4D93 8E97 11F6 662D 8A42 17F5 52F4 10E7 D14E
From jhart at atr.jp Wed Jul 12 15:38:16 2006
From: jhart at atr.jp (J. Hart)
Date: Wed Jul 12 15:39:06 2006
Subject: [Namazu-users-en] Re: mknmz not working for Japanese language
documents ?
In-Reply-To: <44A48364.7030703@dcook.org>
References: <44A203AF.6040804@atr.jp> <44A2110F.8290A2A1@asahi-net.or.jp> <44A21684.2000506@atr.jp> <44A21A16.AFBFB65E@asahi-net.or.jp> <44A233A9.6050609@atr.jp> <44A237AC.76AABD77@asahi-net.or.jp> <44A35627.8020201@atr.jp> <44A35F7A.47363D63@asahi-net.or.jp> <44A37863.6020308@atr.jp> <44A37E22.B4F23A29@asahi-net.or.jp>
<44A472EC.1000105@atr.jp> <44A48364.7030703@dcook.org>
Message-ID: <44B498D8.8070201@atr.jp>
I still seem to be unable to reliably search Japanese documents with
Namazu, but searching English language documents seems to work
perfectly. We are able to find some documents using Japanese search
terms, but many are not found.
I am building the index as follows :
export LANG=C
mknmz -k --indexing-lang=ja_JP.eucJP -f mknmzrc -O index publications
We are invoking Namazu via a web page with a cgi script. What we can
find seems to be affected by how the page encoding is set for the browser.
From jhart at atr.jp Wed Jul 12 16:00:36 2006
From: jhart at atr.jp (J. Hart)
Date: Wed Jul 12 16:01:21 2006
Subject: [Namazu-users-en] (no subject)
In-Reply-To: <44A48364.7030703@dcook.org>
References: <44A203AF.6040804@atr.jp> <44A2110F.8290A2A1@asahi-net.or.jp> <44A21684.2000506@atr.jp> <44A21A16.AFBFB65E@asahi-net.or.jp> <44A233A9.6050609@atr.jp> <44A237AC.76AABD77@asahi-net.or.jp> <44A35627.8020201@atr.jp> <44A35F7A.47363D63@asahi-net.or.jp> <44A37863.6020308@atr.jp> <44A37E22.B4F23A29@asahi-net.or.jp> <44A472EC.1000105@atr.jp>
<44A48364.7030703@dcook.org>
Message-ID: <44B49E14.2010606@atr.jp>
Every night, we rebuild our search index to include things we've added
during the day. I have a log of the index build emailed to me.
In that log, I find the following :
1/716 -
http://133.186.90.12/hrcn/publications/2002ICRA/pdffiles/papers/026.pdf
system error occurred! (x-system/x-error) skipped.
1/715 -
http://133.186.90.12/hrcn/publications/2002ICRA/pdffiles/papers/033.pdf
system error occurred! (x-system/x-error) skipped.
1/714 -
http://133.186.90.12/hrcn/publications/2002ICRA/pdffiles/papers/050.pdf
system error occurred! (x-system/x-error) skipped.
1/713 -
http://133.186.90.12/hrcn/publications/2002ICRA/pdffiles/papers/054.pdf
system error occurred! (x-system/x-error) skipped.
Is there a way to find out what these error messages are caused by ?
Regards,
J. Hart
From jhart at atr.jp Wed Jul 12 16:10:26 2006
From: jhart at atr.jp (J. Hart)
Date: Wed Jul 12 16:11:12 2006
Subject: [Namazu-users-en] error messages from mknmz (added subject line)
In-Reply-To: <44A48364.7030703@dcook.org>
References: <44A203AF.6040804@atr.jp> <44A2110F.8290A2A1@asahi-net.or.jp> <44A21684.2000506@atr.jp> <44A21A16.AFBFB65E@asahi-net.or.jp> <44A233A9.6050609@atr.jp> <44A237AC.76AABD77@asahi-net.or.jp> <44A35627.8020201@atr.jp> <44A35F7A.47363D63@asahi-net.or.jp> <44A37863.6020308@atr.jp> <44A37E22.B4F23A29@asahi-net.or.jp> <44A472EC.1000105@atr.jp>
<44A48364.7030703@dcook.org>
Message-ID: <44B4A062.4050200@atr.jp>
I apologize for this repost. It seems I had forgotten to add a subject
line in my previous post ....m()m
Regards,
J. Hart
From yw3t-trns at asahi-net.or.jp Wed Jul 12 16:21:01 2006
From: yw3t-trns at asahi-net.or.jp (Tadamasa Teranishi)
Date: Wed Jul 12 16:22:31 2006
Subject: [Namazu-users-en] Re: error messages from mknmz (added subject line)
References: <44A203AF.6040804@atr.jp> <44A2110F.8290A2A1@asahi-net.or.jp> <44A21684.2000506@atr.jp> <44A21A16.AFBFB65E@asahi-net.or.jp> <44A233A9.6050609@atr.jp> <44A237AC.76AABD77@asahi-net.or.jp> <44A35627.8020201@atr.jp> <44A35F7A.47363D63@asahi-net.or.jp> <44A37863.6020308@atr.jp> <44A37E22.B4F23A29@asahi-net.or.jp> <44A472EC.1000105@atr.jp>
<44A48364.7030703@dcook.org> <44B4A062.4050200@atr.jp>
Message-ID: <44B4A2DD.22E1D5E0@asahi-net.or.jp>
"J. Hart" wrote:
>
> Every night, we rebuild our search index to include things we've added
> during the day. I have a log of the index build emailed to me.
> In that log, I find the following :
...
> Is there a way to find out what these error messages are caused by ?
Perhaps these files are larger than the limit set by
$FILE_SIZE_MAX.
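One way to test this guess is to list the files in the indexed directory that exceed the limit. The Python sketch below assumes the documents are available on the local filesystem; the function name is hypothetical, and the limit passed in must be the $FILE_SIZE_MAX value from your own mknmzrc.

```python
import os


def files_over_limit(root, limit_bytes):
    """Return (path, size) pairs for files under `root` larger than
    `limit_bytes`, sorted by path."""
    oversized = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            size = os.path.getsize(path)
            if size > limit_bytes:
                oversized.append((path, size))
    return sorted(oversized)
```

If the skipped PDFs from the log appear in the result of, say, `files_over_limit("publications", limit)`, the size limit is a likely cause. If they do not, the error comes from somewhere else.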
--
=====================================================================
TADAMASA TERANISHI yw3t-trns@asahi-net.or.jp
http://www.asahi-net.or.jp/~yw3t-trns/index.htm
Key fingerprint = 474E 4D93 8E97 11F6 662D 8A42 17F5 52F4 10E7 D14E
From yw3t-trns at asahi-net.or.jp Wed Jul 12 19:28:02 2006
From: yw3t-trns at asahi-net.or.jp (Tadamasa Teranishi)
Date: Wed Jul 12 19:29:32 2006
Subject: [Namazu-users-en] Re: mknmz not working for Japanese
languagedocuments ?
References: <44A203AF.6040804@atr.jp> <44A2110F.8290A2A1@asahi-net.or.jp> <44A21684.2000506@atr.jp> <44A21A16.AFBFB65E@asahi-net.or.jp> <44A233A9.6050609@atr.jp> <44A237AC.76AABD77@asahi-net.or.jp> <44A35627.8020201@atr.jp> <44A35F7A.47363D63@asahi-net.or.jp> <44A37863.6020308@atr.jp> <44A37E22.B4F23A29@asahi-net.or.jp>
<44A472EC.1000105@atr.jp> <44A48364.7030703@dcook.org>
<44B498D8.8070201@atr.jp>
Message-ID: <44B4CEB2.59F61CEF@asahi-net.or.jp>
"J. Hart" wrote:
>
> We are able to find some documents using Japanese search
> terms, but many are not found.
The cause cannot be narrowed down from such a general description.
Please give a concrete example.
> export LANG=C
> mknmz -k --indexing-lang=ja_JP.eucJP -f mknmzrc -O index publications
I am repeating myself, but the following might be better:
$ env LANG=ja_JP.eucjp mknmz -k -f mknmzrc -O index publications
Also, please tell me the contents of your mknmzrc.
To make sure, please check that everything works by running pltests:
$ cd pltests
$ rm test-log
$ env LANG=ja_JP.eucJP perl alltests.pl
> We are invoking Namazu via a web page with a cgi script. What we can
> find seems to be affected by how the page encoding is set for the browser.
The encoding of the web page that calls namazu.cgi should be EUC-JP.
First of all, let's check the behavior using the namazu command
instead of namazu.cgi.
A. Are the documents found by namazu but not by namazu.cgi?
B. Are they found by neither namazu nor namazu.cgi?
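The reason the page encoding matters can be seen by encoding one Japanese query in several ways: the browser submits the query in the encoding of the page, and an index built as EUC-JP only matches the EUC-JP byte sequence. The Python sketch below is purely illustrative; the sample word and the function name are assumptions.

```python
def encode_query(query, encodings=("euc_jp", "shift_jis", "utf-8")):
    """Return the byte sequence a query produces in each encoding."""
    return {name: query.encode(name) for name in encodings}


encoded = encode_query("検索")  # sample word meaning "search"
for name, raw in encoded.items():
    # Print the bytes in hex so the differences are visible.
    print(f"{name:10s} {raw.hex(' ')}")
```

The three lines of output differ, so a form submitted from a page that is not served as EUC-JP sends bytes the EUC-JP index cannot match.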
--
=====================================================================
TADAMASA TERANISHI yw3t-trns@asahi-net.or.jp
http://www.asahi-net.or.jp/~yw3t-trns/index.htm
Key fingerprint = 474E 4D93 8E97 11F6 662D 8A42 17F5 52F4 10E7 D14E
From anthonys at faredge.com.au Fri Jul 28 15:38:49 2006
From: anthonys at faredge.com.au (Anthony Sadler)
Date: Fri Jul 28 16:52:09 2006
Subject: [Namazu-users-en] MaxHit directive apparently being ignored
Message-ID: <001f01c6b210$7c0c2770$bf20a8c0@snow>
Hi peoples:
Summary of problem
------------------
I have a Fedora Core 3 box running namazu 2.0.14. For some reason, namazu apparently ignores the MaxHit directive in the .namazurc file. I do not know what the max setting is, but I know searches that return a small amount of entries Namaz
Example
-------
Here is an example case. Please note that hostnames and directory paths have been changed, though not to the detriment of this report.
On the web interface, I run a search for the word "test". I get a result of:
Web interface results
---------------------
References: [ (Too many documents hit. Ignored) ]
No document matching your query.
Up the top, I have this:
This index contains 5,072 documents and 122,292 keywords.
That would imply to me that there are a possible 5072 results. My .namazurc file is attached below in full. However, here are the relevant sections:
MaxHit 5000
MaxMatch 200000
So by that, I should have got some results.
Command line results
--------------------
This is what happens when I run namazu from the command line:
#=====================================================================#
[root@server cgi-bin]# namazu -n 2147483647 -f .namazurc test
Results:
References: [ (Too many documents hit. Ignored) ]
No document matching your query.
[root@server cgi-bin]#
#=====================================================================#
Notice how I have specified an insane value for -n. The .namazurc file I specified is the one below. Surely there cannot be more than 2147483647 matches in a database that contains 5,072 documents and 122,292 keywords?
Here is the debug output:
#=======================================================================================================#
[root@server cgi-bin]# namazu -n 2147483647 -f .namazurc test --debug
namazu(debug): NAMAZUNORC: ''
namazu(debug): 15: Directive: [Index]
namazu(debug): Argument 1: [/home/wwwroot/sites/lists.company.com/lists]
namazu(debug): 23: Directive: [Template]
namazu(debug): Argument 1: [/home/wwwroot/sites/lists.company.com/lists]
namazu(debug): 51: Directive: [Replace]
namazu(debug): Argument 1: [/home/wwwroot/sites/lists.company.com/]
namazu(debug): Argument 2: [http://lists.company.com/]
namazu(debug): 58: Directive: [Logging]
namazu(debug): Argument 1: [off]
namazu(debug): Scoring: tfidf: 0, dl: 0, freshness: 0, uri: 0
namazu(debug): 78: Directive: [Scoring]
namazu(debug): Argument 1: [simple]
namazu(debug): 85: Directive: [EmphasisTags]
namazu(debug): Argument 1: []
namazu(debug): Argument 2: []
namazu(debug): 92: Directive: [MaxHit]
namazu(debug): Argument 1: [5000]
namazu(debug): 99: Directive: [MaxMatch]
namazu(debug): Argument 1: [200000]
namazu(debug): load_rcfile: .namazurc loaded
namazu(debug): -n: 2147483647
namazu(debug): -w: 0
namazu(debug): query: [test]
namazu(debug): Index name [0]: /home/wwwroot/sites/lists.company.com/lists
namazu(debug): set_phrase_trick: test
namazu(debug): set_regex_trick: test
namazu(debug): query.tokennum: 1
namazu(debug): query.tab[0]: test
namazu(debug): size of /home/wwwroot/sites/lists.company.com/lists/NMZ.t: 8234976
namazu(debug): before nmz_strlower: [test]
namazu(debug): after nmz_strlower: [test]
namazu(debug): do WORD search
namazu(debug): size of /home/wwwroot/sites/lists.company.com/lists/NMZ.ii: 489168
namazu(debug): l:0: ??
namazu(debug): r:122291: ?????chmail2000@vip.sina.com
namazu(debug): searching: bodipy?
namazu(debug): searching: onwards,
namazu(debug): searching: technician's
namazu(debug): searching: unconfigured
namazu(debug): searching: ticket274.html
namazu(debug): searching: ticket1377
namazu(debug): searching: threadm
namazu(debug): searching: testing2
namazu(debug): searching: tender--
namazu(debug): searching: test3
namazu(debug): searching: terrible)
namazu(debug): searching: test()
namazu(debug): searching: tesoriero"
namazu(debug): searching: tesoriero';
namazu(debug): searching: test"
namazu(debug): searching: test
Results:
References: [ (Too many documents hit. Ignored) ]
No document matching your query.
[root@server cgi-bin]#
#=======================================================================================================#
If you would like, I could attach straces of the searches above. I have looked through them and have not seen anything in there that gives me any hints.
#=======================================================================================================#
#=======================================================================================================#
# This is a Namazu configuration file for namazu or namazu.cgi.
#
# Originally, this file is named 'namazurc-sample'. so you should
# copy this to 'namazurc' to make the file effective.
#
# Each item must be separated by one or more SPACE or TAB characters.
# You can use a double-quoted string for representing a string which
# contains SPACE or TAB characters like "foo bar baz".
##
## Index: Specify the default directory.
##
# Index /usr/local/var/namazu/index
Index /home/wwwroot/sites/lists.company.com/lists
##
## Template: Set the template directory containing
## NMZ.{head,foot,body,tips,result} files.
##
# Template /home/www/corbett/www/lists
Template /home/wwwroot/sites/lists.company.com/lists
##
## Replace: Replace TARGET with REPLACEMENT in URIs in search
## results.
##
## TARGET is specified by Ruby's perl-like regular expressions.
## You can capture sub-strings in TARGET by surrounding them
## with `(' and `)'and use them later as backreferences by
## \1, \2, \3,... \9.
##
## To use meta characters literally such as `*', `+', `?', `|',
## `[', `]', `{', `}', `(', `)', escape them with `\'.
##
## e.g.,
##
## Replace /home/foo/public_html/ http://www.foobar.jp/~foo/
## Replace /home/(.*)/public_html/ http://www.foobar.jp/\1/
## Replace /C\|/foo/ http://www.foobar.jp/
##
## If you do not want to do the processing on command line use,
## run namazu with -U option.
##
## You can specify more than one Replace rule, but only the
## first-matched rule is applied.
##
#Replace /home/foo/public_html/ http://www.foo.bar.jp/~foo/
Replace /home/wwwroot/sites/lists.company.com/ http://lists.company.com/
##
## Logging: Set OFF to turn off keyword logging to NMZ.slog.
## Default is ON.
##
Logging off
##
## Lang: Set the locale code such as `ja_JP.eucJP', `ja_JP.SJIS',
## `de', etc. This directive works only if the environment
## variable LANG is not set because the directive is mainly
## intended for CGI use. On the shell, you can set the
## environment variable LANG instead of using the directive.
##
## If you set `de' to it, namazu.cgi use
## NMZ.(head|foot|body|tips|results).de for displaying results
## and use a proper message catalog for `de'.
##
#Lang ja
##
## Scoring: Set the scoring method "tfidf" or "simple".
##
Scoring simple
##
## EmphasisTags: Set the pair of html elements which is used in
## keyword emphasizing for search results.
##
EmphasisTags "" ""
##
## MaxHit: Set the maximum number of documents which can be
## handled in query operation. If documents matching a
## query exceed the value, they will be ignored.
##
MaxHit 5000
##
## MaxMatch: Set the maximum number of words which can be
## handled in regex/prefix/inside/suffix query. If documents
## matching a query exceed the value, they will be ignored.
##
MaxMatch 200000
##
## ContentType: Set "Content-Type" header output. If you want to
## use non-HTML template files, set it suitably.
#ContentType "text/x-hdml"
#=======================================================================================================#
#=======================================================================================================#
Anthony Sadler
Far Edge Technology
w: (02) 8425 1410
From yw3t-trns at asahi-net.or.jp Fri Jul 28 17:35:27 2006
From: yw3t-trns at asahi-net.or.jp (Tadamasa Teranishi)
Date: Fri Jul 28 17:34:04 2006
Subject: [Namazu-users-en] Re: MaxHit directive apparently being ignored
References: <001f01c6b210$7c0c2770$bf20a8c0@snow>
Message-ID: <44C9CC4F.50CFF6A9@asahi-net.or.jp>
Anthony Sadler wrote:
>
> I have a Fedora Core 3 box running namazu 2.0.14. For some reason,
> namazu apparently ignores the MaxHit directive in the .namazurc file.
> I do not know what the max setting is, but I know searches that
> return a small amount of entries Namaz
Do not use Namazu 2.0.14.
You should upgrade to Namazu 2.0.16 at once.
Bugs in MaxHit and MaxMatch were fixed in Namazu 2.0.15.
> Up the top, I have this:
> This index contains 5,072 documents and 122,292 keywords.
>
> That would imply to me that there are a possible 5072 results.
> My .namazurc file is attached below in full. However, here are
> the relevant sections:
>
> MaxHit 5000
> MaxMatch 200000
>
> So by that, I should have got some results.
If "test" occurs in every document, the query fails because
MaxHit (5000) is smaller than 5072.
If MaxHit is raised to a larger value, the error will probably
go away.
Note, however, that the number of hits also includes deleted
documents.
Please upgrade to Namazu 2.0.16, compact the index with gcnmz,
and try again.
> Notice how I have specified an insane value for -n.
> The .namazurc file I specified is the one below. Surely there
> cannot be more than 2147483647 matches in a database that
> contains 5,072 documents and 122,292 keywords?
The -n option specifies the number of documents displayed at a time.
It does not set the maximum number of hits.
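The difference between the two settings can be modelled in a few lines of Python. This is a simplified sketch of the behaviour described in this reply, not Namazu's implementation; the function and parameter names are made up for the example.

```python
def run_query(matching_docs, max_hit, per_page):
    """Model of the two limits: MaxHit rejects the whole query when too
    many documents match; -n only limits how many results are shown."""
    if len(matching_docs) > max_hit:
        return "Too many documents hit. Ignored", []
    return "ok", matching_docs[:per_page]


docs = list(range(5072))  # pretend every document contains "test"

# MaxHit 5000 with a huge -n: still rejected, -n plays no part.
print(run_query(docs, max_hit=5000, per_page=2147483647)[0])

# A larger MaxHit lets the query through; -n then sets the page size.
status, page = run_query(docs, max_hit=10000, per_page=20)
print(status, len(page))
```

In this model no value of `per_page` can rescue a query that has already exceeded `max_hit`, which matches the "Too many documents hit" result seen with -n 2147483647.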
--
=====================================================================
TADAMASA TERANISHI yw3t-trns@asahi-net.or.jp
http://www.asahi-net.or.jp/~yw3t-trns/index.htm
Key fingerprint = 474E 4D93 8E97 11F6 662D 8A42 17F5 52F4 10E7 D14E