[Namazu-users-en] MaxHit directive apparently being ignored

Anthony Sadler anthonys at faredge.com.au
Fri Jul 28 15:38:49 JST 2006


Hi peoples:

Summary of problem
------------------
I have a Fedora Core 3 box running Namazu 2.0.14. For some reason, namazu apparently ignores the MaxHit directive in the .namazurc file. I do not know what the maximum setting is, but I know searches that return a small number of entries Namaz

Example
-------
Here is an example case. Please note that hostnames and directory paths have been changed, though not to the detriment of this report.

On the web interface, I run a search for the word "test". I get a result of:

Web interface results
---------------------
References: [ (Too many documents hit. Ignored) ]

No document matching your query.

At the top of the page, I have this:
This index contains  5,072  documents and  122,292  keywords.

That would imply to me that there are at most 5,072 possible results. My complete .namazurc file is included below. Here are the relevant settings:

MaxHit 5000
MaxMatch 200000

Given those settings, I should have got some results.
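
For reference, here is a minimal Python sketch of the MaxHit rule as the comment in the sample namazurc describes it ("If documents matching a query exceed the value, they will be ignored"). This is only a model of that one sentence, not Namazu's actual code; the function name `exceeds_max_hit` is made up for illustration.

```python
def exceeds_max_hit(matching_docs, max_hit):
    """Model of the documented rule: a hit count above MaxHit is ignored.

    Assumption: the comparison is a strict "greater than", as the wording
    "exceed the value" suggests. Namazu's real check may differ.
    """
    return matching_docs > max_hit


TOTAL_DOCS = 5072  # from "This index contains 5,072 documents"
MAX_HIT = 5000     # from .namazurc

# Worst case: the word occurs in every indexed document.
print(exceeds_max_hit(TOTAL_DOCS, MAX_HIT))
# A word occurring in exactly MaxHit documents would still be allowed.
print(exceeds_max_hit(MAX_HIT, MAX_HIT))
```

Under this model, only a word that occurs in more than 5,000 of the 5,072 documents would be ignored with the settings above.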

Command line results
--------------------
This is what happens when I run namazu from the command line:

#=====================================================================#
[root@server cgi-bin]# namazu -n 2147483647 -f .namazurc test
Results:

References:  [  (Too many documents hit. Ignored)  ]

No document matching your query.
[root@server cgi-bin]#
#=====================================================================#
Notice how I have specified an insane value for -n. The .namazurc file I specified is the one below. Surely there cannot be more than 2147483647 matches in a database that contains 5,072 documents and 122,292 keywords?

Here is the debug output:
#=======================================================================================================#
[root@server cgi-bin]# namazu -n 2147483647 -f .namazurc test --debug
namazu(debug): NAMAZUNORC: ''
namazu(debug):   15: Directive:  [Index]
namazu(debug):       Argument 1: [/home/wwwroot/sites/lists.company.com/lists]
namazu(debug):   23: Directive:  [Template]
namazu(debug):       Argument 1: [/home/wwwroot/sites/lists.company.com/lists]
namazu(debug):   51: Directive:  [Replace]
namazu(debug):       Argument 1: [/home/wwwroot/sites/lists.company.com/]
namazu(debug):       Argument 2: [http://lists.company.com/]
namazu(debug):   58: Directive:  [Logging]
namazu(debug):       Argument 1: [off]
namazu(debug): Scoring: tfidf: 0, dl: 0, freshness: 0, uri: 0
namazu(debug):   78: Directive:  [Scoring]
namazu(debug):       Argument 1: [simple]
namazu(debug):   85: Directive:  [EmphasisTags]
namazu(debug):       Argument 1: [<strong class="keyword">]
namazu(debug):       Argument 2: [</strong>]
namazu(debug):   92: Directive:  [MaxHit]
namazu(debug):       Argument 1: [5000]
namazu(debug):   99: Directive:  [MaxMatch]
namazu(debug):       Argument 1: [200000]
namazu(debug): load_rcfile: .namazurc loaded
namazu(debug):  -n: 2147483647
namazu(debug):  -w: 0
namazu(debug): query: [test]
namazu(debug): Index name [0]: /home/wwwroot/sites/lists.company.com/lists
namazu(debug): set_phrase_trick: test
namazu(debug): set_regex_trick: test
namazu(debug): query.tokennum: 1
namazu(debug): query.tab[0]: test
namazu(debug): size of /home/wwwroot/sites/lists.company.com/lists/NMZ.t: 8234976
namazu(debug): before nmz_strlower: [test]
namazu(debug): after nmz_strlower:  [test]
namazu(debug): do WORD search
namazu(debug): size of /home/wwwroot/sites/lists.company.com/lists/NMZ.ii: 489168
namazu(debug): l:0: »¾
namazu(debug): r:122291: ãããããchmail2000 at vip.sina.com
namazu(debug): searching: bodipy?
namazu(debug): searching: onwards,
namazu(debug): searching: technician's
namazu(debug): searching: unconfigured
namazu(debug): searching: ticket274.html
namazu(debug): searching: ticket1377
namazu(debug): searching: threadm
namazu(debug): searching: testing2
namazu(debug): searching: tender--
namazu(debug): searching: test3
namazu(debug): searching: terrible)
namazu(debug): searching: test()
namazu(debug): searching: tesoriero"
namazu(debug): searching: tesoriero</st1:personname>';
namazu(debug): searching: test"
namazu(debug): searching: test
Results:

References:  [  (Too many documents hit. Ignored)  ]

No document matching your query.
[root@server cgi-bin]#
#=======================================================================================================#
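
To confirm which directive values namazu actually read, the "Directive" / "Argument" lines in the debug output above can be collected mechanically. The following is a rough Python sketch; the regular expressions are based only on the line format shown in this report, and the function name `directives_from_debug` is hypothetical.

```python
import re

# Patterns derived from the debug lines quoted in this report, e.g.
#   namazu(debug):   92: Directive:  [MaxHit]
#   namazu(debug):       Argument 1: [5000]
DIRECTIVE_RE = re.compile(r"^namazu\(debug\):\s+(\d+): Directive:\s+\[(.*)\]$")
ARGUMENT_RE = re.compile(r"^namazu\(debug\):\s+Argument (\d+): \[(.*)\]$")


def directives_from_debug(lines):
    """Return {directive: [arg1, arg2, ...]} from `namazu --debug` output.

    Note: a directive that appears more than once (for example several
    Replace rules) keeps only its last occurrence in this simple sketch.
    """
    result = {}
    current = None
    for line in lines:
        line = line.rstrip()
        m = DIRECTIVE_RE.match(line)
        if m:
            current = m.group(2)
            result[current] = []
            continue
        m = ARGUMENT_RE.match(line)
        if m and current is not None:
            result[current].append(m.group(2))
    return result


SAMPLE = """\
namazu(debug):   92: Directive:  [MaxHit]
namazu(debug):       Argument 1: [5000]
namazu(debug):   99: Directive:  [MaxMatch]
namazu(debug):       Argument 1: [200000]
namazu(debug): load_rcfile: .namazurc loaded
""".splitlines()

print(directives_from_debug(SAMPLE))
```

For the excerpt in `SAMPLE`, this reports MaxHit as 5000 and MaxMatch as 200000, which matches what the .namazurc file sets.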

If you would like, I could attach straces of the searches above. I have looked through them and have not seen anything that gives me any hints.

#=======================================================================================================#
#=======================================================================================================#
# This is a Namazu configuration file for namazu or namazu.cgi.
#
#  Originally, this file is named 'namazurc-sample', so you should
#  copy this to 'namazurc' to make the file effective.
#
#  Each item must be separated by one or more SPACE or TAB characters.
#  You can use a double-quoted string for representing a string which
#  contains SPACE or TAB characters like "foo bar baz".


##
## Index: Specify the default directory.
##
# Index         /usr/local/var/namazu/index
Index           /home/wwwroot/sites/lists.company.com/lists


##
## Template: Set the template directory containing
## NMZ.{head,foot,body,tips,result} files.
##
# Template /home/www/corbett/www/lists
Template        /home/wwwroot/sites/lists.company.com/lists


##
## Replace: Replace TARGET with REPLACEMENT in URIs in search
## results.
##
## TARGET is specified by Ruby's perl-like regular expressions.
## You can capture sub-strings in TARGET by surrounding them
## with `(' and `)' and use them later as backreferences by
## \1, \2, \3,... \9.
##
## To use meta characters literally such as `*', `+', `?', `|',
## `[', `]', `{', `}', `(', `)', escape them with `\'.
##
## e.g.,
##
##    Replace  /home/foo/public_html/   http://www.foobar.jp/~foo/
##    Replace  /home/(.*)/public_html/  http://www.foobar.jp/\1/
##    Replace   /C\|/foo/               http://www.foobar.jp/
##
## If you do not want to do the processing on command line use,
## run namazu with -U option.
##
## You can specify more than one Replace rule, but only the
## first-matched rule is applied.
##
#Replace       /home/foo/public_html/  http://www.foo.bar.jp/~foo/
Replace         /home/wwwroot/sites/lists.company.com/  http://lists.company.com/


##
## Logging: Set OFF to turn off keyword logging to NMZ.slog.
## Default is ON.
##
Logging       off


##
## Lang: Set the locale code such as `ja_JP.eucJP', `ja_JP.SJIS',
## `de', etc.  This directive works only if the environment
## variable LANG is not set because the directive is mainly
## intended for CGI use.  On the shell, you can set the
## environment variable LANG instead of using the directive.
##
## If you set `de' to it, namazu.cgi uses
## NMZ.(head|foot|body|tips|results).de for displaying results
## and uses a proper message catalog for `de'.
##
#Lang          ja


##
## Scoring: Set the scoring method "tfidf" or "simple".
##
Scoring       simple


##
## EmphasisTags: Set the pair of html elements which is used in
## keyword emphasizing for search results.
##
EmphasisTags  "<strong class=\"keyword\">"   "</strong>"

##
## MaxHit: Set the maximum number of documents which can be
## handled in query operation.  If documents matching a
## query exceed the value, they will be ignored.
##
MaxHit 5000

##
## MaxMatch: Set the maximum number of words which can be
## handled in regex/prefix/inside/suffix query. If documents
## matching a query exceed the value, they will be ignored.
##
MaxMatch        200000

##
## ContentType: Set "Content-Type" header output. If you want to
## use non-HTML template files, set it suitably.
#ContentType    "text/x-hdml"
#=======================================================================================================#
#=======================================================================================================#
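
The header of the file above says that items are separated by one or more SPACE or TAB characters and that double-quoted strings may contain spaces. A small Python sketch that lists the active (uncommented) directives under those rules follows. It uses `shlex` as an approximation of that quoting, which is an assumption and not Namazu's own parser; the name `parse_namazurc` is hypothetical.

```python
import shlex


def parse_namazurc(text):
    """Return a list of (directive, [args]) for each active line.

    Assumptions: lines starting with '#' are comments, and POSIX-style
    shlex splitting is close enough to Namazu's quoting rules. Unquoted
    backslashes (as in some Replace patterns) are consumed by shlex, so
    this sketch is not suitable for those lines.
    """
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        tokens = shlex.split(line, comments=False, posix=True)
        entries.append((tokens[0], tokens[1:]))
    return entries


# A few lines taken from the file above, plus one commented-out example.
SAMPLE = r'''
# MaxHit 100
MaxHit 5000
MaxMatch        200000
EmphasisTags  "<strong class=\"keyword\">"   "</strong>"
'''

for name, args in parse_namazurc(SAMPLE):
    print(name, args)
```

Running something like this over the real file is a quick way to check that MaxHit is not commented out or defined twice.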

Anthony Sadler
Far Edge Technology
w: (02) 8425 1410
 


