From blewis at nps.org.au Thu Jun 1 09:14:34 2006 From: blewis at nps.org.au (Bryn Lewis) Date: Thu Jun 1 09:15:35 2006 Subject: [Namazu-users-en] cgi.exe ignoring config file Message-ID: <447E316A.10400@nps.org.au> hi, A simple setup question. I have successfully setup namazu.cgi.exe to run in cgi-bin on apache (on windows). If I put the index in C:\namazu\var\namazu\index it is happy. I can't get namazu.cgi.exe to look for the index anywhere else, however. I have edited the config file (.namzurc) to : #Index C:\namazu\var\namazu\x-index (I have changed C:\namazu\var\namazu\index to C:\namazu\var\namazu\x-index to test putting the index elsewhere). I have also tried changing '.namazurc' to 'namazurc' .namazurc is in C:\Program Files\Apache Group\Apache2\cgi-bin, along with namazu.cgi.bin any help appreciated, thanks, Bryn Lewis From warlord at MIT.EDU Thu Jun 1 09:32:37 2006 From: warlord at MIT.EDU (Derek Atkins) Date: Thu Jun 1 09:32:49 2006 Subject: [Namazu-users-en] Re: cgi.exe ignoring config file In-Reply-To: <447E316A.10400@nps.org.au> References: <447E316A.10400@nps.org.au> Message-ID: <20060531203237.05z3695ply8kwkg4@webmail.mit.edu> Quoting Bryn Lewis : > hi, > > A simple setup question. > > I have successfully setup namazu.cgi.exe to run in cgi-bin on apache (on > windows). If I put the index in C:\namazu\var\namazu\index it is happy. > > I can't get namazu.cgi.exe to look for the index anywhere else, however. > I have edited the config file (.namzurc) to : > > #Index C:\namazu\var\namazu\x-index This line is commented out (see that '#' at the beginning on the line?). That leading hash mark means "this line is a comment". So it's ignoring your change. Try creating an Index line that's not commented out: Index C:\namazu\var\namazu\x-index (notice the lack of the leading hash mark).. > (I have changed C:\namazu\var\namazu\index to > C:\namazu\var\namazu\x-index to test putting the index elsewhere). 
>
> I have also tried changing '.namazurc' to 'namazurc'
>
> .namazurc is in C:\Program Files\Apache Group\Apache2\cgi-bin, along
> with namazu.cgi.bin

You should keep it called .namazurc. The file namazurc needs to live
in the "central" config location... But I have no idea where that is
on Windows.

> any help appreciated, thanks, Bryn Lewis

-derek

--
Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
Member, MIT Student Information Processing Board (SIPB)
URL: http://web.mit.edu/warlord/ PP-ASEL-IA N1NWH
warlord@MIT.EDU PGP key available

From blewis at nps.org.au Thu Jun 1 14:17:03 2006
From: blewis at nps.org.au (Bryn Lewis)
Date: Thu Jun 1 14:17:57 2006
Subject: [Namazu-users-en] Re: cgi.exe ignoring config file
In-Reply-To: <20060531203237.05z3695ply8kwkg4@webmail.mit.edu>
References: <447E316A.10400@nps.org.au> <20060531203237.05z3695ply8kwkg4@webmail.mit.edu>
Message-ID: <447E784F.3040703@nps.org.au>

Derek Atkins wrote:
> Quoting Bryn Lewis :
>
>> I have successfully setup namazu.cgi.exe to run in cgi-bin on apache (on
>> windows). If I put the index in C:\namazu\var\namazu\index it is happy.
>>
>> I can't get namazu.cgi.exe to look for the index anywhere else, however.
>> I have edited the config file (.namzurc) to :
>>
>> #Index C:\namazu\var\namazu\x-index
>
>
> This line is commented out (see that '#' at the beginning on the line?).
> That leading hash mark means "this line is a comment". So it's ignoring
> your change. Try creating an Index line that's not commented out:
>
> Index C:\namazu\var\namazu\x-index
>

I tried that, but it still doesn't work. I get a message:
References: [ (can't open the index) ]
in the html.

>> I have also tried changing '.namazurc' to 'namazurc'
>>
> You should keep it called .namazurc. The file namazurc needs to live
> in the "central" config location... But I have no idea where that is
> on Windows.
> ok Bryn From yw3t-trns at asahi-net.or.jp Thu Jun 1 15:10:01 2006 From: yw3t-trns at asahi-net.or.jp (Tadamasa Teranishi) Date: Thu Jun 1 15:10:45 2006 Subject: [Namazu-users-en] Re: cgi.exe ignoring config file References: <447E316A.10400@nps.org.au> <20060531203237.05z3695ply8kwkg4@webmail.mit.edu> <447E784F.3040703@nps.org.au> Message-ID: <447E84B9.DAEA0297@asahi-net.or.jp> Bryn Lewis wrote: > > > This line is commented out (see that '#' at the beginning on the line?). > > That leading hash mark means "this line is a comment". So it's ignoring > > your change. Try creating an Index line that's not commented out: > > > > Index C:\namazu\var\namazu\x-index > > > I tried that, but it still doesn't work. I get a message: > References: [ (can't open the index) ] in the html. Is the index file put on the directory? This error occurs if there is no index file. How does the result of the following command become it? C:\> dir C:\namazu\var\namazu\x-index -- ===================================================================== TADAMASA TERANISHI yw3t-trns@asahi-net.or.jp http://www.asahi-net.or.jp/~yw3t-trns/index.htm Key fingerprint = 474E 4D93 8E97 11F6 662D 8A42 17F5 52F4 10E7 D14E From blewis at nps.org.au Thu Jun 1 15:20:25 2006 From: blewis at nps.org.au (Bryn Lewis) Date: Thu Jun 1 15:21:25 2006 Subject: [Namazu-users-en] Re: cgi.exe ignoring config file In-Reply-To: <447E84B9.DAEA0297@asahi-net.or.jp> References: <447E316A.10400@nps.org.au> <20060531203237.05z3695ply8kwkg4@webmail.mit.edu> <447E784F.3040703@nps.org.au> <447E84B9.DAEA0297@asahi-net.or.jp> Message-ID: <447E8729.7020600@nps.org.au> Tadamasa Teranishi wrote: > Bryn Lewis wrote: > >>>This line is commented out (see that '#' at the beginning on the line?). >>>That leading hash mark means "this line is a comment". So it's ignoring >>>your change. Try creating an Index line that's not commented out: >>> >>>Index C:\namazu\var\namazu\x-index >>> >> >>I tried that, but it still doesn't work. 
I get a message: >>References: [ (can't open the index) ] in the html. > > > Is the index file put on the directory? > This error occurs if there is no index file. > How does the result of the following command become it? > > C:\> dir C:\namazu\var\namazu\x-index Yes, the dir exists. I get a file listing (starting with NMZ.body) from: dir C:\namazu\var\namazu\x-index -The index is working ok. If I use the default settings I get a working search page. It is only when I try use .namazurc to use a different index that there is a problem. Bryn From yw3t-trns at asahi-net.or.jp Thu Jun 1 15:38:59 2006 From: yw3t-trns at asahi-net.or.jp (Tadamasa Teranishi) Date: Thu Jun 1 15:39:40 2006 Subject: [Namazu-users-en] Re: cgi.exe ignoring config file References: <447E316A.10400@nps.org.au> <20060531203237.05z3695ply8kwkg4@webmail.mit.edu> <447E784F.3040703@nps.org.au> <447E84B9.DAEA0297@asahi-net.or.jp> <447E8729.7020600@nps.org.au> Message-ID: <447E8B83.CD44AB4@asahi-net.or.jp> Bryn Lewis wrote: > > > Is the index file put on the directory? > > This error occurs if there is no index file. > > How does the result of the following command become it? > > > > C:\> dir C:\namazu\var\namazu\x-index > > Yes, the dir exists. I get a file listing (starting with NMZ.body) from: > dir C:\namazu\var\namazu\x-index How does the result of the following command become it? 
C:\> namazu test C:\namazu\var\namazu\x-index -- ===================================================================== TADAMASA TERANISHI yw3t-trns@asahi-net.or.jp http://www.asahi-net.or.jp/~yw3t-trns/index.htm Key fingerprint = 474E 4D93 8E97 11F6 662D 8A42 17F5 52F4 10E7 D14E From blewis at nps.org.au Thu Jun 1 15:57:10 2006 From: blewis at nps.org.au (Bryn Lewis) Date: Thu Jun 1 15:58:06 2006 Subject: [Namazu-users-en] Re: cgi.exe ignoring config file In-Reply-To: <447E8B83.CD44AB4@asahi-net.or.jp> References: <447E316A.10400@nps.org.au> <20060531203237.05z3695ply8kwkg4@webmail.mit.edu> <447E784F.3040703@nps.org.au> <447E84B9.DAEA0297@asahi-net.or.jp> <447E8729.7020600@nps.org.au> <447E8B83.CD44AB4@asahi-net.or.jp> Message-ID: <447E8FC6.50208@nps.org.au> Tadamasa Teranishi wrote: > Bryn Lewis wrote: > >>>Is the index file put on the directory? >>>This error occurs if there is no index file. >>>How does the result of the following command become it? >>> >>> C:\> dir C:\namazu\var\namazu\x-index >> >>Yes, the dir exists. I get a file listing (starting with NMZ.body) from: >>dir C:\namazu\var\namazu\x-index > > > How does the result of the following command become it? > > C:\> namazu test C:\namazu\var\namazu\x-index Results: References: [ test: 197 ] Total 197 documents matching your query. etc. From yw3t-trns at asahi-net.or.jp Thu Jun 1 16:08:23 2006 From: yw3t-trns at asahi-net.or.jp (Tadamasa Teranishi) Date: Thu Jun 1 16:09:04 2006 Subject: [Namazu-users-en] Re: cgi.exe ignoring config file References: <447E316A.10400@nps.org.au> <20060531203237.05z3695ply8kwkg4@webmail.mit.edu> <447E784F.3040703@nps.org.au> <447E84B9.DAEA0297@asahi-net.or.jp> <447E8729.7020600@nps.org.au> <447E8B83.CD44AB4@asahi-net.or.jp> <447E8FC6.50208@nps.org.au> Message-ID: <447E9267.A69BEA5F@asahi-net.or.jp> Bryn Lewis wrote: > > > How does the result of the following command become it? 
> > > > C:\> namazu test C:\namazu\var\namazu\x-index > > Results: > > References: [ test: 197 ] > > Total 197 documents matching your query. OK. There is no problem in the index. next, How does the result of the following command become it? C:\> dir "C:\Program Files\Apache Group\Apache2\cgi-bin" It is another one question. Is the result of pltests All PASS? C:\> cd namazu\pltests C:\namazu\pltests\ >perl alltests.pl -- ===================================================================== TADAMASA TERANISHI yw3t-trns@asahi-net.or.jp http://www.asahi-net.or.jp/~yw3t-trns/index.htm Key fingerprint = 474E 4D93 8E97 11F6 662D 8A42 17F5 52F4 10E7 D14E From blewis at nps.org.au Thu Jun 1 16:26:39 2006 From: blewis at nps.org.au (Bryn Lewis) Date: Thu Jun 1 16:27:34 2006 Subject: [Namazu-users-en] Re: cgi.exe ignoring config file In-Reply-To: <447E9267.A69BEA5F@asahi-net.or.jp> References: <447E316A.10400@nps.org.au> <20060531203237.05z3695ply8kwkg4@webmail.mit.edu> <447E784F.3040703@nps.org.au> <447E84B9.DAEA0297@asahi-net.or.jp> <447E8729.7020600@nps.org.au> <447E8B83.CD44AB4@asahi-net.or.jp> <447E8FC6.50208@nps.org.au> <447E9267.A69BEA5F@asahi-net.or.jp> Message-ID: <447E96AF.1020608@nps.org.au> Tadamasa Teranishi wrote: > OK. There is no problem in the index. > next, > How does the result of the following command become it? > > C:\> dir "C:\Program Files\Apache Group\Apache2\cgi-bin" > Directory of C:\Program Files\Apache Group\Apache2\cgi-bin 1/06/2006 03:09 PM . 1/06/2006 03:09 PM .. 1/06/2006 10:49 AM 3,228 .namzurc 2/03/2006 12:58 AM 1,176,793 namazu.cgi.exe 1/06/2006 10:51 AM 3,228 namzurc 1/06/2006 09:46 AM 0 NMZ.warnlog > It is another one question. > Is the result of pltests All PASS? > > C:\> cd namazu\pltests > C:\namazu\pltests\ >perl alltests.pl C:\namazu\pltests>perl alltests.pl Error: "pkgdatadir": Undefined environment variable. What does this tell me? I've installed Activeperl 5.8.8 Build 817. 
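A directory listing like the one above can be checked mechanically for config files whose names are close to, but not exactly, the expected one. What follows is only a minimal sketch in Python: `find_misnamed_configs` is a hypothetical helper, not part of Namazu, and the 0.8 similarity cutoff is an arbitrary assumption.

```python
import difflib


def find_misnamed_configs(filenames, expected=".namazurc", cutoff=0.8):
    """Return names that resemble `expected` without matching it exactly.

    Hypothetical helper for illustration; `cutoff` is an assumed
    similarity threshold, not a value taken from Namazu.
    """
    candidates = [name for name in filenames if name != expected]
    # get_close_matches ranks candidates by similarity to `expected`
    # and drops anything scoring below `cutoff`.
    return difflib.get_close_matches(expected, candidates, n=5, cutoff=cutoff)


# File names taken from the directory listing in the message above.
listing = [".namzurc", "namazu.cgi.exe", "namzurc", "NMZ.warnlog"]
for name in find_misnamed_configs(listing):
    print("possible misspelling of .namazurc:", name)
```

On Windows the list of names could come from `os.listdir()` on the cgi-bin directory instead of a literal list.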
From yw3t-trns at asahi-net.or.jp Thu Jun 1 16:38:04 2006
From: yw3t-trns at asahi-net.or.jp (Tadamasa Teranishi)
Date: Thu Jun 1 16:38:46 2006
Subject: [Namazu-users-en] Re: cgi.exe ignoring config file
References: <447E316A.10400@nps.org.au> <20060531203237.05z3695ply8kwkg4@webmail.mit.edu> <447E784F.3040703@nps.org.au> <447E84B9.DAEA0297@asahi-net.or.jp> <447E8729.7020600@nps.org.au> <447E8B83.CD44AB4@asahi-net.or.jp> <447E8FC6.50208@nps.org.au> <447E9267.A69BEA5F@asahi-net.or.jp> <447E96AF.1020608@nps.org.au>
Message-ID: <447E995C.1CCC9BB@asahi-net.or.jp>

Bryn Lewis wrote:
>
> Directory of C:\Program Files\Apache Group\Apache2\cgi-bin
>
> 1/06/2006 03:09 PM .
> 1/06/2006 03:09 PM ..
> 1/06/2006 10:49 AM 3,228 .namzurc
> 2/03/2006 12:58 AM 1,176,793 namazu.cgi.exe
> 1/06/2006 10:51 AM 3,228 namzurc
> 1/06/2006 09:46 AM 0 NMZ.warnlog

I see the problem now. It is a typo.

The file is named ".namzurc", but it should be ".namazurc".
"namazurc" is unnecessary in this directory.
It should work once the file is renamed.

> > It is another one question.
> > Is the result of pltests All PASS?
> >
> > C:\> cd namazu\pltests
> > C:\namazu\pltests\ >perl alltests.pl
>
> C:\namazu\pltests>perl alltests.pl
> Error: "pkgdatadir": Undefined environment variable.
>
> What does this tell me?

See README.txt, "6. About the environment variable setting".

--
=====================================================================
TADAMASA TERANISHI yw3t-trns@asahi-net.or.jp
http://www.asahi-net.or.jp/~yw3t-trns/index.htm
Key fingerprint = 474E 4D93 8E97 11F6 662D 8A42 17F5 52F4 10E7 D14E

From yw3t-trns at asahi-net.or.jp Thu Jun 1 16:42:12 2006
From: yw3t-trns at asahi-net.or.jp (Tadamasa Teranishi)
Date: Thu Jun 1 16:42:55 2006
Subject: [Namazu-users-en] Re: cgi.exe ignoring config file
References: <447E316A.10400@nps.org.au> <20060531203237.05z3695ply8kwkg4@webmail.mit.edu> <447E784F.3040703@nps.org.au> <447E84B9.DAEA0297@asahi-net.or.jp> <447E8729.7020600@nps.org.au> <447E8B83.CD44AB4@asahi-net.or.jp> <447E8FC6.50208@nps.org.au> <447E9267.A69BEA5F@asahi-net.or.jp> <447E96AF.1020608@nps.org.au> <447E995C.1CCC9BB@asahi-net.or.jp>
Message-ID: <447E9A54.340E7B39@asahi-net.or.jp>

Tadamasa Teranishi wrote:
>
> I see the problem now. It is a typo.
>
> The file is named ".namzurc", but it should be ".namazurc".
> "namazurc" is unnecessary in this directory.
> It should work once the file is renamed.

C:\xxxx\> ren .namzurc .namazurc

--
=====================================================================
TADAMASA TERANISHI yw3t-trns@asahi-net.or.jp
http://www.asahi-net.or.jp/~yw3t-trns/index.htm
Key fingerprint = 474E 4D93 8E97 11F6 662D 8A42 17F5 52F4 10E7 D14E

From chetan at srijan.in Thu Jun 22 21:46:29 2006
From: chetan at srijan.in (Chetan Thapliyal)
Date: Thu Jun 22 21:44:35 2006
Subject: [Namazu-users-en] Beginner's question
Message-ID: <449A9125.10203@srijan.in>

Hi all,

I am a beginner with kakasi. I could download and install it without trouble, and I am able to get the translation. The real problem is the exact meaning of the various options of kakasi's command line tool (kakasi). I tried to find the details on Google, but in vain. Could anybody send a brief description, or a link to a reference where I can get some help? I don't know Japanese.
As a result I am unable to get the reference from available Japanese site. Any attempt in this regards would be a great help for me. Thanks in advance for any sort of assistance. Regards, Chetan From knok at daionet.gr.jp Fri Jun 23 08:10:00 2006 From: knok at daionet.gr.jp (NOKUBI Takatsugu) Date: Fri Jun 23 08:10:02 2006 Subject: [Namazu-users-en] Re: Beginner's question In-Reply-To: <449A9125.10203@srijan.in> References: <449A9125.10203@srijan.in> Message-ID: <878xnou6rb.wl%knok@daionet.gr.jp> At Thu, 22 Jun 2006 18:16:29 +0530, Chetan Thapliyal wrote: > I am beginner for kakasi. I could download and intall it very well. > Also, I am able to get the translation also. But the real problem is the > exact meaning of the kakasi's command line tool's (kakasi) various > options. I tried to find the details from google but all in vain. Could > anybody send a brief description, or link to the source of reference > from where I can get some help. I don't know Japanese. As a result I am > unable to get the reference from available Japanese site. Any attempt in > this regards would be a great help for me. I think you try to use Windows version of Namazu, is it right? And if you want to use Namazu on English environment, you don't need KAKASI. It is required for Japanese documents only. -- NOKUBI Takatsugu E-mail: knok@daionet.gr.jp knok@namazu.org / knok@debian.org From starr at houston.rr.com Sat Jun 24 22:35:02 2006 From: starr at houston.rr.com (starr@houston.rr.com) Date: Sat Jun 24 22:35:08 2006 Subject: [Namazu-users-en] Namazu encoding Unix vs win32 Message-ID: I'm trying to generate the Namazu index on Unix (FreeBSD) and search it on win32. When the win32 search runs, a considerable amount of garbage is produced, along with some echoing of templates. Unix search works fine. Looks to me like it may be a coding problem. mknmz scripts are identical on the two ports. I'm scanning English, normal HTML. Any Idea where the difference is ? 
From yw3t-trns at asahi-net.or.jp Sat Jun 24 23:15:53 2006 From: yw3t-trns at asahi-net.or.jp (Tadamasa Teranishi) Date: Sat Jun 24 23:16:02 2006 Subject: [Namazu-users-en] Re: Namazu encoding Unix vs win32 References: Message-ID: <449D4919.AFEB17ED@asahi-net.or.jp> starr@houston.rr.com wrote: > > I'm trying to generate the Namazu index on Unix (FreeBSD) and search it > on win32. 'mknmz' is executed by UNIX and the index is made. It is possible to retrieve it by using the made index by 'namazu.exe' of Windows. The index is compatible. > When the win32 search runs, a considerable amount of garbage is > produced, along > with some echoing of templates. Unix search works fine. Please explain how to become it concretely. > Looks to me like it may be a coding problem. What are grounds of which it thinks like that? > mknmz scripts are identical on the two ports. What does it mean? It is necessary to execute mknmz only by UNIX. Moreover, the Windows version is mknmz.bat, and it is not the same as mknmz of UNIX in the batch file. Please present detailed information on the version of Namazu, the content of namazurc, and the directory composition, etc... Moreover, how does the result of pltests of Windows become it? -- ===================================================================== TADAMASA TERANISHI yw3t-trns@asahi-net.or.jp http://www.asahi-net.or.jp/~yw3t-trns/index.htm Key fingerprint = 474E 4D93 8E97 11F6 662D 8A42 17F5 52F4 10E7 D14E From warlord at MIT.EDU Sun Jun 25 01:06:17 2006 From: warlord at MIT.EDU (Derek Atkins) Date: Sun Jun 25 01:06:25 2006 Subject: [Namazu-users-en] Re: Namazu encoding Unix vs win32 In-Reply-To: (starr@houston.rr.com's message of "Sat, 24 Jun 2006 08:35:02 -0500") References: Message-ID: starr@houston.rr.com writes: > I'm trying to generate the Namazu index on Unix (FreeBSD) and search it > on win32. > When the win32 search runs, a considerable amount of garbage is > produced, along > with some echoing of templates. 
Unix search works fine. > Looks to me like it may be a coding problem. > mknmz scripts are identical on the two ports. > > I'm scanning English, normal HTML. > > Any Idea where the difference is ? My first guess would be an encoding issue, perhaps unicode v. UTF8? At least that would be my first guess. The index files are data files, not text files. My next guess would be line-ending convention. -derek -- Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory Member, MIT Student Information Processing Board (SIPB) URL: http://web.mit.edu/warlord/ PP-ASEL-IA N1NWH warlord@MIT.EDU PGP key available From starr at houston.rr.com Sun Jun 25 04:20:32 2006 From: starr at houston.rr.com (starr@houston.rr.com) Date: Sun Jun 25 04:20:37 2006 Subject: [Namazu-users-en] Re: Namazu encoding Unix vs win32 Message-ID: >> >> I'm trying to generate the Namazu index on Unix (FreeBSD) and search it >> on win32. >'mknmz' is executed by UNIX and the index is made. >It is possible to retrieve it by using the made index by 'namazu.exe' >of Windows. >The index is compatible. This is Good News ! >> When the win32 search runs, a considerable amount of garbage is >> produced, along >> with some echoing of templates. Unix search works fine. >Please explain how to become it concretely. First of all, I'm running the Namazu.CGI.exe through shttpd on a win98 system >> Looks to me like it may be a coding problem. >What are grounds of which it thinks like that? Just the fact that some help text from the template came through, preceeded by unintelligible characters where the "hits" are normally seen. Also, Namazu.cgi.exe finished normally without windows or shttpd seeing any problem. (I've had other search software get DLL errors and crash the system) >> mknmz scripts are identical on the two ports. >What does it mean? >It is necessary to execute mknmz only by UNIX. >Moreover, the Windows version is mknmz.bat, and it is not the same >as mknmz of UNIX in the batch file. 
to demonstrate:
After removing the CR at the end of each remaining line, in mknmz.bat
as found in the nmz2.0.16.001-win32.zip distribution,
this is the diff between the installed script from the FreeBSD port
(/usr/local/bin/mknmz) and the stripped .bat file:

$ awk '{print substr($0,1,length($0)-1 );}' < mknmz.bat > mknmz.bat.strip
$ diff mknmz.bat.strip /usr/local/bin/mknmz
1,14c1
< @rem = '--*-Perl-*--
< @echo off
< if "%OS%" == "Windows_NT" goto WinNT
< perl -x -S "%0" %1 %2 %3 %4 %5 %6 %7 %8 %9
< goto endofperl
< :WinNT
< perl -x -S %0 %*
< if NOT "%COMSPEC%" == "%SystemRoot%\system32\cmd.exe" goto endofperl
< if %errorlevel% == 9009 echo You do not have Perl in your PATH.
< if errorlevel 1 goto script_failed_so_exit_with_non_zero_val 2>nul
< goto endofperl
< @rem ';
< #! /c/Perl/bin//perl -w
< #line 15
---
> #! /usr/local/bin/perl -w
63,64c50,51
< my $PKGDATADIR = $ENV{'pkgdatadir'} || "C:/namazu/share/namazu";
< my $CONFDIR = "C:/namazu/etc/namazu"; # directory where mknmzrc are in.
---
> my $PKGDATADIR = $ENV{'pkgdatadir'} || "/usr/local/share/namazu";
> my $CONFDIR = "/usr/local/etc/namazu"; # directory where mknmzrc are in.
2699,2701d2685
<
< __END__
< :endofperl

This is only a minor point, just to confirm my quick observation.
All the perl code is identical after the .bat "wrapper".

--------
Here are the uncommented lines of /usr/local/etc/namazu/mknmzrc on FreeBSD:

package conf; # Don't remove this line!
$HTML_SUFFIX = "htm";
$ALLOW_FILE = ".*\\.(?:$HTML_SUFFIX)";

--------
Here are the uncommented lines of the namazurc in the windows98 system:

Index E:\install\cgi
Template E:\install\cgi
##Lang ja_JP.SJIS
#Lang ja.EUC
##Lang en
ContentType "text/x-html"

-- none of the Lang options seem to make any difference
I am unclear on their meaning for the win32 CGI

--------
Here are the uncommented lines of /usr/local/etc/namazu/namazurc on FreeBSD:

Index /var/NAMW
Template /var/NAMW

-- searching here works fine.
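The awk-plus-diff comparison above can also be sketched in Python. This is a minimal illustration, not the poster's method: `strip_cr` and `diff_ignoring_cr` are hypothetical names, and unlike the awk one-liner (which drops the last character of every line unconditionally) the sketch removes only a trailing carriage return.

```python
import difflib


def strip_cr(text):
    """Split text into lines and drop a trailing CR from each one."""
    return [line.rstrip("\r") for line in text.split("\n")]


def diff_ignoring_cr(windows_text, unix_text):
    """Unified diff of two scripts, ignoring CRLF vs LF line endings."""
    a = strip_cr(windows_text)
    b = strip_cr(unix_text)
    # The file labels are illustrative only; nothing is read from disk.
    return list(difflib.unified_diff(a, b, "mknmz.bat", "mknmz", lineterm=""))


# Tiny in-memory stand-ins for the two scripts (not the real mknmz).
windows_script = "#! /c/Perl/bin//perl -w\r\nprint 1;\r\n"
unix_script = "#! /usr/local/bin/perl -w\nprint 1;\n"
for line in diff_ignoring_cr(windows_script, unix_script):
    print(line)
```

When reading real files, open them with `newline=""` so that Python does not translate the line endings before the comparison.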
--------
Here is the Form.htm file:

[HTML markup lost in the archive. Surviving text: the title and heading "Search Gospel Codes", and the form labels "Display:", "Description:", "Sort:".]
Thanks for your help. From starr at houston.rr.com Sun Jun 25 05:22:07 2006 From: starr at houston.rr.com (starr@houston.rr.com) Date: Sun Jun 25 05:22:11 2006 Subject: [Namazu-users-en] Re: Namazu encoding Unix vs win32 Message-ID: Derek Atkins Writes: >> Any Idea where the difference is ? > My first guess would be an encoding issue, perhaps unicode v. UTF8? > At least that would be my first guess. The index files are data > files, not text files. My next guess would be line-ending convention. That was my first guess, especially the CRLF line-ending, but before I went to hexdumps I thought perhaps somebody had already dealt with it or I had messed up a parameter. From yw3t-trns at asahi-net.or.jp Sun Jun 25 05:25:41 2006 From: yw3t-trns at asahi-net.or.jp (Tadamasa Teranishi) Date: Sun Jun 25 05:26:46 2006 Subject: [Namazu-users-en] Re: Namazu encoding Unix vs win32 References: Message-ID: <449D9FC5.7F6DA6B6@asahi-net.or.jp> > Moreover, how does the result of pltests of Windows become it? First of all, please inform me of the result of pltests. see. README.txt - 9. Confirming the operation Please confirm whether the installation is correctly done if you do not pass the test. starr@houston.rr.com wrote: > > First of all, I'm running the Namazu.CGI.exe through shttpd > on a win98 system Then, let's test in the command line. (Let's test with namazu.exe. ) And, after normal operation is confirmed, let's test CGI in the next step. > >What does it mean? ... > to demonstrate: > After removing the CR at the end of each remaining line,in mknmz.bat > as found in the nmz2.0.16.001-win32.zip distribution, > this is the diff between the installed script from the FreeBSD port > (/usr/local/bin/mknmz) and the stripped .bat file: ... > This is only a minor point, just to confirm my quick observation. > All the perl code is identical after the .bat "wrapper". Yes. 
However, even if contents of mknmz.bat are broken, it is likely not to influence it in any way in Windows because mknmz.bat is not executed. > Here are the uncommented lines of /usr/local/etc/namazu/mknmzrc on FreeBSD: > > package conf; # Don't remove this line! > $HTML_SUFFIX = "htm"; > $ALLOW_FILE = ".*\\.(?:$HTML_SUFFIX)"; There is no problem. > Here are the uncommented lines of the namazurc in the windows98 system: > Index E:\install\cgi > Template E:\install\cgi > ##Lang ja_JP.SJIS > #Lang ja.EUC > ##Lang en > ContentType "text/x-html" Here might have to be made the following content. Index E:\install\cgi Template E:\install\cgi Lang C ContentType "text/html" or Index E:\install\cgi Template E:\install\cgi Lang C # ContentType "text/html" > Here are the uncommented lines of /usr/local/etc/namazu/namazurc on FreeBSD: > Index /var/NAMW > Template /var/NAMW > > -- searching here works fine. Were contents of "/var/NAMW" copied onto "E:\install\cgi" as a binary? Then, let's retrieve from the command line and test. C:\> namazu "word" E:\install\cgi -- ===================================================================== TADAMASA TERANISHI yw3t-trns@asahi-net.or.jp http://www.asahi-net.or.jp/~yw3t-trns/index.htm Key fingerprint = 474E 4D93 8E97 11F6 662D 8A42 17F5 52F4 10E7 D14E From yw3t-trns at asahi-net.or.jp Sun Jun 25 05:38:00 2006 From: yw3t-trns at asahi-net.or.jp (Tadamasa Teranishi) Date: Sun Jun 25 05:39:07 2006 Subject: [Namazu-users-en] Re: Namazu encoding Unix vs win32 References: Message-ID: <449DA2A8.38CE4DC8@asahi-net.or.jp> starr@houston.rr.com wrote: > > > My first guess would be an encoding issue, perhaps unicode v. UTF8? > > At least that would be my first guess. The index files are data > > files, not text files. My next guess would be line-ending convention. > > That was my first guess, especially the CRLF line-ending, > but before I went to hexdumps I thought perhaps > somebody had already dealt with it or I had > messed up a parameter. 
If it works normally from the command line, the problem is in the
Web server's settings.
The Web server's configuration can force an encoding.

The following is written in the Namazu FAQ (Japanese only :).
It is written with Japanese in mind; please adapt it for English
as appropriate.

=====================================================================
* Search results are garbled when searching with namazu.cgi for a
  Japanese character string.

  If the Web server is Apache and its configuration file contains

    SetServerEncoding UTF-8

  that setting causes the garbling, because namazu.cgi does not
  support UTF-8 input. Use

    SetServerEncoding EUC-JP

  instead. For other Web servers, use EUC-JP as well if there is
  a similar setting.

--
=====================================================================
TADAMASA TERANISHI yw3t-trns@asahi-net.or.jp
http://www.asahi-net.or.jp/~yw3t-trns/index.htm
Key fingerprint = 474E 4D93 8E97 11F6 662D 8A42 17F5 52F4 10E7 D14E

From yw3t-trns at asahi-net.or.jp Sun Jun 25 06:34:51 2006
From: yw3t-trns at asahi-net.or.jp (Tadamasa Teranishi)
Date: Sun Jun 25 06:36:06 2006
Subject: [Namazu-users-en] Namazu 2.0.16 RPM for Fedora Core 5
Message-ID: <449DAFFB.9C52F08@asahi-net.or.jp>

Namazu 2.0.16 RPMs for Fedora Core 5 are available as follows.

[Binary RPM]
http://www.akaneiro.jp/public/rpm/fc5/RPMS/i386/namazu-2.0.16-1.i386.rpm
http://www.akaneiro.jp/public/rpm/fc5/RPMS/i386/namazu-cgi-2.0.16-1.i386.rpm
http://www.akaneiro.jp/public/rpm/fc5/RPMS/i386/namazu-devel-2.0.16-1.i386.rpm

[Source RPM]
http://www.akaneiro.jp/public/rpm/fc5/SRPMS/namazu-2.0.16-1.src.rpm

* Each RPM file is signed with my GPG key.
* For environments other than Fedora Core 5, please build a binary RPM
  for your own environment from the source RPM.
  Attention..version..environment..do.
$ rpmbuild --rebuild namazu-2.0.16-1.src.rpm -- ===================================================================== TADAMASA TERANISHI yw3t-trns@asahi-net.or.jp http://www.asahi-net.or.jp/~yw3t-trns/index.htm Key fingerprint = 474E 4D93 8E97 11F6 662D 8A42 17F5 52F4 10E7 D14E From starr at houston.rr.com Sun Jun 25 06:53:44 2006 From: starr at houston.rr.com (starr@houston.rr.com) Date: Sun Jun 25 06:53:48 2006 Subject: [Namazu-users-en] Re: Namazu encoding Unix vs win32 Message-ID: Tadamasa Teranishi Writes: > First of all, please inform me of the result of pltests. > see. README.txt - 9. Confirming the operation > Please confirm whether the installation is correctly done if you do > not pass the test. OK. I was trying to avoid installing all the indexing and language complexity, on windows, but to fully diagnose the situation ... I've unzipped the nmz2.0.16.001-win32.zip distribution, alltests.pl is asking for NKF.pm -- where do I find it Please ? >Index E:\install\cgi >Template E:\install\cgi >Lang C >ContentType "text/html" -- this gave less garbage characters but still did not format output correctly > Were contents of "/var/NAMW" copied onto "E:\install\cgi" as a binary? Yes. Index and Templates were zipped on Unix platform, unzipped on win98 From starr at houston.rr.com Sun Jun 25 07:25:58 2006 From: starr at houston.rr.com (starr@houston.rr.com) Date: Sun Jun 25 07:26:03 2006 Subject: [Namazu-users-en] Re: Namazu encoding Unix vs win32 Message-ID: Tadamasa Teranishi Writes: > There is a problem in the setting of the Web server if it operates > normally in the command line. > Encoding is compulsorily set by setting the Web server. ... > The Web server is Apache, and to the configuration file ... > SetServerEncoding EUC-JP PLEASE NOTE: *** The Apache server I'm running on Unix does not have this directive configured. It works fine. What is the difference ? > Please operate. 
> Please operate it with EUC-JP for other Web servers if there is > a similar set item. I'm not using Apache on win98. The Shttpd I'm using doesn't convert codings. I guess I can forget about the pltests. I'll try to find a stripped down Apache for windows. However, just a suggestion, you may be able to trivially recode in the final output stage of namazu.cgi.exe. ** I'm still wondering why it works without recoding on Unix ... From yw3t-trns at asahi-net.or.jp Sun Jun 25 07:38:59 2006 From: yw3t-trns at asahi-net.or.jp (Tadamasa Teranishi) Date: Sun Jun 25 07:40:24 2006 Subject: [Namazu-users-en] Re: Namazu encoding Unix vs win32 References: Message-ID: <449DBF03.3C19CC5F@asahi-net.or.jp> starr@houston.rr.com wrote: > > Tadamasa Teranishi Writes: > > > First of all, please inform me of the result of pltests. ... > > Please confirm whether the installation is correctly done if you do > > not pass the test. ... > I've unzipped the nmz2.0.16.001-win32.zip distribution, > alltests.pl is asking for NKF.pm -- where do I find it Please ? You are not installing it according to README.txt. Again, please install it correctly according to the procedure of README.txt. -- ===================================================================== TADAMASA TERANISHI yw3t-trns@asahi-net.or.jp http://www.asahi-net.or.jp/~yw3t-trns/index.htm Key fingerprint = 474E 4D93 8E97 11F6 662D 8A42 17F5 52F4 10E7 D14E From yw3t-trns at asahi-net.or.jp Sun Jun 25 07:45:11 2006 From: yw3t-trns at asahi-net.or.jp (Tadamasa Teranishi) Date: Sun Jun 25 07:46:37 2006 Subject: [Namazu-users-en] Re: Namazu encoding Unix vs win32 References: Message-ID: <449DC077.2B0148A4@asahi-net.or.jp> starr@houston.rr.com wrote: > > I'm not using Apache on win98. > The Shttpd I'm using doesn't convert codings. It knows you are not using Apache. However, it is not understood whether your Web server changes Apache similar encode. Please change the setting if in your Web server, there is a setting concerning encode. 
The earlier example applies only to Apache.

> I'll try to find a stripped down Apache for windows.

Before that, does it work correctly from the command line?

--
=====================================================================
TADAMASA TERANISHI yw3t-trns@asahi-net.or.jp
http://www.asahi-net.or.jp/~yw3t-trns/index.htm
Key fingerprint = 474E 4D93 8E97 11F6 662D 8A42 17F5 52F4 10E7 D14E

From starr at houston.rr.com Sun Jun 25 09:11:10 2006
From: starr at houston.rr.com (starr@houston.rr.com)
Date: Sun Jun 25 09:11:15 2006
Subject: [Namazu-users-en] Re: Namazu encoding Unix vs win32
Message-ID:

Tadamasa Teranishi Writes:

>> The Shttpd I'm using doesn't convert codings.
> It knows you are not using Apache.
> However, it is not understood whether your Web server changes
> Apache similar encode.
> Please change the setting if in your Web server, there is a setting
> concerning encode.

shttpd (shttpd.sourceforge.net) does not change encoding.
It has no parameters to specify that.
shttpd is elegant in its simplicity: a 44K .exe; copy it into
a directory, which becomes the document root, click on it, and it
starts listening on port 80, serving CGI and all subdirectories.
Very solid.

> The earlier example applies only to Apache.
>> I'll try to find a stripped down Apache for windows.
> Before that, does it work correctly from the command line?

Not sure what you mean?

Command-line namazu.exe, run from the same cgi directory along with
the same namazurc and NMZ.* index files, gets the following:

References: [ (can't open the index) ]

(I really appreciate your help. Thank you.)
From yw3t-trns at asahi-net.or.jp Sun Jun 25 11:27:18 2006
From: yw3t-trns at asahi-net.or.jp (Tadamasa Teranishi)
Date: Sun Jun 25 11:29:18 2006
Subject: [Namazu-users-en] Re: Namazu encoding Unix vs win32
References:
Message-ID: <449DF486.7AEA2AC@asahi-net.or.jp>

starr@houston.rr.com wrote:
>
> command line namazu.exe, run from same cgi directory along with
> same namazurc and NMZ.* index files gets the following:
>
> References: [ (can't open the index) ]

At the risk of repeating myself, please check things in the following
order.

1. You have not installed it according to README.txt.
   Please install it again, correctly, following the procedure in
   README.txt.

2. Run all the tests with pltests and confirm that they pass.

3. Confirm that it works from the command line.

Test the CGI only after confirming that there are no problems with
these.

--
=====================================================================
TADAMASA TERANISHI yw3t-trns@asahi-net.or.jp
http://www.asahi-net.or.jp/~yw3t-trns/index.htm
Key fingerprint = 474E 4D93 8E97 11F6 662D 8A42 17F5 52F4 10E7 D14E

From starr at houston.rr.com Mon Jun 26 00:23:53 2006
From: starr at houston.rr.com (starr@houston.rr.com)
Date: Mon Jun 26 00:23:58 2006
Subject: [Namazu-users-en] Re: Namazu encoding Unix vs win32
Message-ID:

Tadamasa Teranishi Writes:

> Again, please install it correctly according to the procedure of
> README.txt.

First question from the instructions in README.txt:

Does the installation REALLY require C:\namazu as the ONLY possible
install location ?
I think I just answered my own question: (using korn shell)

[E:/installn/cgi] ./namazu.exe -d fragrant
namazu(debug): NAMAZUNORC: ''
namazu(debug): namazunorc: ''
namazu(debug): -n: 20
namazu(debug): -w: 0
namazu(debug): query: [fragrant]
namazu(debug): Index name [0]: C:/namazu/var/namazu/index
namazu(debug): set_phrase_trick: fragrant
namazu(debug): set_regex_trick: fragrant
namazu(debug): query.tokennum: 1
namazu(debug): query.tab[0]: fragrant
namazu(debug): C:/namazu/var/namazu/index/NMZ.i: No such file or directory
Results:

References: [ (can't open the index) ]

No document matching your query.

WHAT ????????

Why is the "C:/namazu/var/namazu/index" location HARDCODED ???
Why are you not reading the "Index" parameter from "namazurc" ???

This is an obvious inconsistency !!!
Immediate correction is required.

Fixing this should be TRIVIAL ! A 30 minute job.

The Unix version parses the "namazurc" file, so the code to read it is
there. Does "namazurc" have to be in some particular location to be
read ? Can you not look for it in the current working directory ?
See the GetCurrentDirectory() C function in the win32 API.

I HAVE TO ASSUME that this is also the major problem in "namazu.cgi.exe".
Does it read "namazurc" in the current working directory ?

I have limited space on drive C.

My system has worked reliably over the years
because I do not install ALPHA software *on my system drive*.
I do not mess with the ugliness of the windows registry.

You already have the right approach to configuring your installation,
the *rc files. Why not use them ??

How (on what compiler) are you building the win32 version of Namazu ?

Can it be (or has it been) built with GNU mingw from Linux/Unix with
the cross compiler ?
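The debug output above shows an empty NAMAZUNORC value and a default index path, which raises the question of which rc files and environment variables are actually present where the program runs. A small Python sketch for collecting that information follows. It is only a diagnostic aid under stated assumptions: the file names namazurc and .namazurc and the variable name NAMAZUNORC are taken from this thread, the function describe_setup is hypothetical, and the sketch does not reproduce how Namazu itself locates its configuration.

```python
import os
from pathlib import Path

# Both rc file names that come up in this thread.
RC_NAMES = ("namazurc", ".namazurc")


def describe_setup(directory, environ=None, env_names=("NAMAZUNORC",)):
    """Report which rc files exist in `directory` and what the given
    environment variables contain.

    Hypothetical helper for diagnosis only; it does not mirror Namazu's
    own lookup logic. Pass further variable names through `env_names`.
    """
    environ = os.environ if environ is None else environ
    base = Path(directory)
    report = {}
    for name in RC_NAMES:
        # True only if a regular file with exactly this name is present.
        report[name] = (base / name).is_file()
    for name in env_names:
        # Unset variables are reported as an empty string.
        report["$" + name] = environ.get(name, "")
    return report
```

Running describe_setup(".") in the directory that holds namazu.exe gives one dictionary that can be pasted into a bug report, which keeps the discussion about file names and variables concrete.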
From yw3t-trns at asahi-net.or.jp Mon Jun 26 00:45:57 2006
From: yw3t-trns at asahi-net.or.jp (Tadamasa Teranishi)
Date: Mon Jun 26 00:48:11 2006
Subject: [Namazu-users-en] Re: Namazu encoding Unix vs win32
References:
Message-ID: <449EAFB5.52A9C9E6@asahi-net.or.jp>

starr@houston.rr.com wrote:
>
> Does the installation REALLY require C:\namazu as the ONLY possible
> install location ?

The documentation is written on the assumption that Namazu is installed
in "C:\namazu". Installation in another directory is possible, but the
documentation does not explain how.

> I think I just answered my own question: (using korn shell)
>
> [E:/installn/cgi] ./namazu.exe -d fragrant
> namazu(debug): NAMAZUNORC: ''
> namazu(debug): namazunorc: ''
...
> References: [ (can't open the index) ]
>
> No document matching your query.
>
> WHAT ????????

The environment variable is not set. See:

- 6. About the environment variable setting

Please set it yourself if korn shell does not pick up the system's
environment variables.

> Why is the "C:/namazu/var/namazu/index" location HARDCODED ???
> Why are you not reading the "Index" parameter from "namazurc" ???
>
> This is an obvious inconsistency !!!

No. Because the environment variable is not set, the program does not
know where namazurc is.

> I HAVE TO ASSUME this this is also the major problem in "namazu.cgi.exe".
> Does it read "namazurc" in the current working directory ?

No.
However, namazu.cgi.exe reads the .namazurc file that exists in the
same directory.
(Please do not confuse it with the namazurc file.)

> My system has worked reliably over the years
> because I do not install ALPHA software *on my system drive*.
> I do not mess with the ugliness of the windows registry.

Namazu 2.0.16 is not ALPHA VERSION SOFTWARE.

However, the zip archive does not include an installer, and its
documentation is insufficient.
Therefore, the zip archive is regarded as an ALPHA VERSION.
(The Namazu program included in the zip file is a RELEASE VERSION.)

--
=====================================================================
TADAMASA TERANISHI yw3t-trns@asahi-net.or.jp
http://www.asahi-net.or.jp/~yw3t-trns/index.htm
Key fingerprint = 474E 4D93 8E97 11F6 662D 8A42 17F5 52F4 10E7 D14E

From yw3t-trns at asahi-net.or.jp Mon Jun 26 01:06:56 2006
From: yw3t-trns at asahi-net.or.jp (Tadamasa Teranishi)
Date: Mon Jun 26 01:09:16 2006
Subject: [Namazu-users-en] Re: Namazu encoding Unix vs win32
References: <449DF486.7AEA2AC@asahi-net.or.jp>
Message-ID: <449EB4A0.861BF32@asahi-net.or.jp>

Tadamasa Teranishi wrote:
>
> Please confirm it in the next order though it becomes a repetition.
>
> 1. You are not installing it according to README.txt.
> Again, please install it correctly according to the procedure of
> README.txt.
>
> 2. All the tests are confirmed with pltests and it is confirmed to
> pass.
>
> 3. It confirms the operation in the command line.
>
> CGI is tested only after it is confirmed that there is no problem
> in these.

It will take more time to sort out the problem if you do not check
things in this order.
More haste, less speed.

--
=====================================================================
TADAMASA TERANISHI yw3t-trns@asahi-net.or.jp
http://www.asahi-net.or.jp/~yw3t-trns/index.htm
Key fingerprint = 474E 4D93 8E97 11F6 662D 8A42 17F5 52F4 10E7 D14E

From jhart at atr.jp Wed Jun 28 13:21:03 2006
From: jhart at atr.jp (J. Hart)
Date: Wed Jun 28 13:21:20 2006
Subject: [Namazu-users-en] mknmz not working for Japanese language documents ?
Message-ID: <44A203AF.6040804@atr.jp>

I have a collection of English and Japanese language documents we would
like to index. I am having a problem trying to get mknmz to index the
Japanese documents. I tried the following:

mknmz -k --indexing-lang=ja -O index publications

I get the following messages:
----------------------------------------------------
Looking for indexing files...
28 files are found to be indexed. sh: no: command not found 1/28 - publications/2_01.pdf [application/pdf] sh: no: command not found sh: no: command not found 2/28 - publications/2_02.pdf [application/pdf] sh: no: command not found sh: no: command not found I am not sure what the error messages mean. When I do this without the --indexing-lang switch, it works perfectly, but the Japanese language documents are not indexed properly. I have installed Kakazi, NKF, and Namazu. I have not yet installed the Perl modules for speed until I have the indexing working properly. Any Ideas ? J. Hart From yw3t-trns at asahi-net.or.jp Wed Jun 28 14:18:07 2006 From: yw3t-trns at asahi-net.or.jp (Tadamasa Teranishi) Date: Wed Jun 28 14:20:25 2006 Subject: [Namazu-users-en] Re: mknmz not working for Japanese language documents ? References: <44A203AF.6040804@atr.jp> Message-ID: <44A2110F.8290A2A1@asahi-net.or.jp> "J. Hart" wrote: > > I am not sure what the error messages mean. > When I do this without the --indexing-lang switch, it works perfectly, > but the Japanese language documents are not indexed properly. > > I have installed Kakazi, NKF, and Namazu. I have not yet installed the > Perl modules for speed until I have the indexing working properly. Kakazi ??? (Is it a mistake of kakasi?) After Namazu was installed, kakasi or nkf might have been installed. Please show contents of mknmzrc. Please inform me of the result of the following command. $ mknmz -C -- ===================================================================== TADAMASA TERANISHI yw3t-trns@asahi-net.or.jp http://www.asahi-net.or.jp/~yw3t-trns/index.htm Key fingerprint = 474E 4D93 8E97 11F6 662D 8A42 17F5 52F4 10E7 D14E From jhart at atr.jp Wed Jun 28 14:41:24 2006 From: jhart at atr.jp (J. Hart) Date: Wed Jun 28 14:41:36 2006 Subject: [Namazu-users-en] Re: mknmz not working for Japanese language documents ? 
In-Reply-To: <44A2110F.8290A2A1@asahi-net.or.jp> References: <44A203AF.6040804@atr.jp> <44A2110F.8290A2A1@asahi-net.or.jp> Message-ID: <44A21684.2000506@atr.jp> Tadamasa Teranishi wrote: >"J. Hart" wrote: > > >Kakazi ??? (Is it a mistake of kakasi?) > > Oops....it is....My English is not as good as yours... m(_)m >After Namazu was installed, kakasi or nkf might have been installed. >Please show contents of mknmzrc. >Please inform me of the result of the following command. > >$ mknmz -C > > It would seem that it was not aware of Kakasi. Must Kakasi be installed before Namazu is installed ? Here is the output : -------------------------------------------- $ mknmz -C System: linux Namazu: 2.0.16 Perl: 5.008007 File-MMagic: 1.25 NKF: no KAKASI: no ChaSen: no MeCab: no Lang_Msg: en_US.UTF-8 Lang: en_US.UTF-8 Coding System: euc CONFDIR: /usr/local/etc/namazu LIBDIR: /usr/local/share/namazu/pl FILTERDIR: /usr/local/share/namazu/filter TEMPLATEDIR: /usr/local/share/namazu/template Supported media types: (35) Unsupported media types: (9) marked with minus (-) probably missing application in your $path. 
- application/excel: excel.pl application/gnumeric: gnumeric.pl application/ichitaro5: taro56.pl application/ichitaro6: taro56.pl - application/ichitaro7: taro7_10.pl application/macbinary: macbinary.pl application/msword: msword.pl application/pdf: pdf.pl application/postscript: postscript.pl - application/powerpoint: powerpoint.pl - application/rtf: rtf.pl application/vnd.kde.kivio: koffice.pl application/vnd.kde.kpresenter: koffice.pl application/vnd.kde.kspread: koffice.pl application/vnd.kde.kword: koffice.pl application/vnd.oasis.opendocument.graphics: ooo.pl application/vnd.oasis.opendocument.presentation: ooo.pl application/vnd.oasis.opendocument.spreadsheet: ooo.pl application/vnd.oasis.opendocument.text: ooo.pl application/vnd.sun.xml.calc: ooo.pl application/vnd.sun.xml.draw: ooo.pl application/vnd.sun.xml.impress: ooo.pl application/vnd.sun.xml.writer: ooo.pl application/x-apache-cache: apachecache.pl application/x-bzip2: bzip2.pl application/x-compress: compress.pl - application/x-deb: deb.pl - application/x-dvi: dvi.pl application/x-gzip: gzip.pl - application/x-js-taro: taro7_10.pl application/x-rpm: rpm.pl - application/x-tex: tex.pl application/x-zip: zip.pl - audio/mpeg: mp3.pl message/news: mailnews.pl message/rfc822: mailnews.pl text/hnf: hnf.pl text/html: html.pl text/html; x-type=mhonarc: mhonarc.pl text/html; x-type=pipermail: pipermail.pl text/plain text/plain; x-type=rfc: rfc.pl text/x-hdml: hdml.pl text/x-roff: man.pl From jhart at atr.jp Wed Jun 28 14:53:07 2006 From: jhart at atr.jp (J. Hart) Date: Wed Jun 28 14:53:19 2006 Subject: [Namazu-users-en] Re: mknmz not working for Japanese language documents ? In-Reply-To: <44A2110F.8290A2A1@asahi-net.or.jp> References: <44A203AF.6040804@atr.jp> <44A2110F.8290A2A1@asahi-net.or.jp> Message-ID: <44A21943.8000909@atr.jp> I reinstalled Namazu after making sure that NKF and KAKASI were both present. It seems to be working well now.... 
It would seem that Namazu will not detect the presence of either if
they are installed after Namazu is.

Many Thanks for your very kind assistance...

J. Hart

From yw3t-trns at asahi-net.or.jp Wed Jun 28 14:56:38 2006
From: yw3t-trns at asahi-net.or.jp (Tadamasa Teranishi)
Date: Wed Jun 28 14:58:57 2006
Subject: [Namazu-users-en] Re: mknmz not working for Japanese language documents ?
References: <44A203AF.6040804@atr.jp> <44A2110F.8290A2A1@asahi-net.or.jp> <44A21684.2000506@atr.jp>
Message-ID: <44A21A16.AFBFB65E@asahi-net.or.jp>

"J. Hart" wrote:
>
> It would seem that it was not aware of Kakasi. Must Kakasi be installed
> before Namazu is installed ?

kakasi and nkf are detected automatically only if they are installed
before Namazu is installed.

If kakasi and nkf were installed after Namazu, you can handle this by
editing mknmzrc:

$NKF = "/usr/local/bin/nkf";
$KAKASI = "/usr/local/bin/kakasi -ieuc -oeuc -w";
$WAKATI = $KAKASI;

> Here is the output :
> --------------------------------------------
> $ mknmz -C
> System: linux
> Namazu: 2.0.16
> Perl: 5.008007
> File-MMagic: 1.25
> NKF: no
> KAKASI: no
> ChaSen: no
> MeCab: no
> Lang_Msg: en_US.UTF-8
> Lang: en_US.UTF-8

By the way, Namazu doesn't support UTF-8.
Therefore, Lang_Msg and Lang should be C (for English).

To process Japanese, they need to be ja_JP.eucjp.
Please set ja_JP.eucjp using the environment variable LANG, etc.

--
=====================================================================
TADAMASA TERANISHI yw3t-trns@asahi-net.or.jp
http://www.asahi-net.or.jp/~yw3t-trns/index.htm
Key fingerprint = 474E 4D93 8E97 11F6 662D 8A42 17F5 52F4 10E7 D14E

From jhart at atr.jp Wed Jun 28 16:45:45 2006
From: jhart at atr.jp (J. Hart)
Date: Wed Jun 28 16:46:01 2006
Subject: [Namazu-users-en] Re: mknmz not working for Japanese language documents ?
In-Reply-To: <44A21A16.AFBFB65E@asahi-net.or.jp> References: <44A203AF.6040804@atr.jp> <44A2110F.8290A2A1@asahi-net.or.jp> <44A21684.2000506@atr.jp> <44A21A16.AFBFB65E@asahi-net.or.jp> Message-ID: <44A233A9.6050609@atr.jp> Tadamasa Teranishi wrote: >By the way. >Namazu doesn't support UTF-8. >Therefore, Lang_Msg and Lang should be C. (For English) > >It is necessary to make it to ja_JP.eucjp to process Japanese. >Please set ja_JP.eucjp and use environment variable LANG etc. > > I will change these and try it again. I understand that I will have to create a mknmzrc file in the directory "/usr/local/etc/namazu/". Is that correct ? Can I use the defaults from "mknmz -C" and customize those ? When I reinstalled Namazu after KAKASI and NKF, the Japanese search worked properly at last. When I installed the Perl modules to increase the speed, the Japanese text search no longer worked as it did before. I have probably done something wrong here, so I will have to check it tomorrow and let you know what I find. Thanks Again, J. Hart From yw3t-trns at asahi-net.or.jp Wed Jun 28 17:02:52 2006 From: yw3t-trns at asahi-net.or.jp (Tadamasa Teranishi) Date: Wed Jun 28 17:05:12 2006 Subject: [Namazu-users-en] Re: mknmz not working forJapanese languagedocuments ? References: <44A203AF.6040804@atr.jp> <44A2110F.8290A2A1@asahi-net.or.jp> <44A21684.2000506@atr.jp> <44A21A16.AFBFB65E@asahi-net.or.jp> <44A233A9.6050609@atr.jp> Message-ID: <44A237AC.76AABD77@asahi-net.or.jp> "J. Hart" wrote: > > I will change these and try it again. I understand that I will have to > create a mknmzrc file in the directory "/usr/local/etc/namazu/". Is > that correct ? Please copy/usr/local/etc/namazu/mknmz-sample, and make mknmzrc. Afterwards, edit contents of mknmzrc. > When I installed the Perl modules to increase the speed, the Japanese > text search no longer worked as it did before. It is because the Perl module was installed after Namazu is installed. 
Please rewrite it as follows when you use Text::Kakasi and the NKF Perl
module.

$NKF = "module_nkf";
$KAKASI = "module_kakasi -ieuc -oeuc -w";
$WAKATI = $KAKASI;

--
=====================================================================
TADAMASA TERANISHI yw3t-trns@asahi-net.or.jp
http://www.asahi-net.or.jp/~yw3t-trns/index.htm
Key fingerprint = 474E 4D93 8E97 11F6 662D 8A42 17F5 52F4 10E7 D14E

From yw3t-trns at asahi-net.or.jp Wed Jun 28 17:07:23 2006
From: yw3t-trns at asahi-net.or.jp (Tadamasa Teranishi)
Date: Wed Jun 28 17:09:43 2006
Subject: [Namazu-users-en] Re: mknmz not working for Japanese language documents ?
References: <44A203AF.6040804@atr.jp> <44A2110F.8290A2A1@asahi-net.or.jp> <44A21684.2000506@atr.jp> <44A21A16.AFBFB65E@asahi-net.or.jp> <44A233A9.6050609@atr.jp> <44A237AC.76AABD77@asahi-net.or.jp>
Message-ID: <44A238BB.4BA7F1DA@asahi-net.or.jp>

Tadamasa Teranishi wrote:
>
> Please copy/usr/local/etc/namazu/mknmz-sample, and make mknmzrc.
> Afterwards, edit contents of mknmzrc.

Correction: the file to copy is

/usr/local/etc/namazu/mknmzrc-sample

--
=====================================================================
TADAMASA TERANISHI yw3t-trns@asahi-net.or.jp
http://www.asahi-net.or.jp/~yw3t-trns/index.htm
Key fingerprint = 474E 4D93 8E97 11F6 662D 8A42 17F5 52F4 10E7 D14E

From jhart at atr.jp Thu Jun 29 13:25:11 2006
From: jhart at atr.jp (J. Hart)
Date: Thu Jun 29 13:25:30 2006
Subject: [Namazu-users-en] Re: mknmz not working for Japanese language documents ?
In-Reply-To: <44A237AC.76AABD77@asahi-net.or.jp>
References: <44A203AF.6040804@atr.jp> <44A2110F.8290A2A1@asahi-net.or.jp> <44A21684.2000506@atr.jp> <44A21A16.AFBFB65E@asahi-net.or.jp> <44A233A9.6050609@atr.jp> <44A237AC.76AABD77@asahi-net.or.jp>
Message-ID: <44A35627.8020201@atr.jp>

Tadamasa Teranishi wrote:

>Please copy/usr/local/etc/namazu/mknmz-sample, and make mknmzrc.
>Afterwards, edit contents of mknmzrc.
> > > >>When I installed the Perl modules to increase the speed, the Japanese >>text search no longer worked as it did before. >> >> > >It is because the Perl module was installed after Namazu is installed. > >Please rewrite it in the following content when you use Text::Kakasi >and the nkf Perl module. > >$NKF = "module_nkf"; >$KAKASI = "module_kakasi -ieuc -oeuc -w"; >$WAKATI = $KAKASI; > > Unfortunately this did not help. The Japanese text search does not work since I put in the Perl modules for NKF and Kakasi. I tried rebuilding and reinstalling Kakasi, Text-Kakasi, NKF and Namazu in order, but that did not solve the problem either. It looks like the Perl module installation left something behind. I will try removing everything entirely and reinstalling without the Perl modules and see if I get the Japanese text search back. From yw3t-trns at asahi-net.or.jp Thu Jun 29 14:04:58 2006 From: yw3t-trns at asahi-net.or.jp (Tadamasa Teranishi) Date: Thu Jun 29 14:07:21 2006 Subject: [Namazu-users-en] Re: mknmz not workingforJapanese languagedocuments ? References: <44A203AF.6040804@atr.jp> <44A2110F.8290A2A1@asahi-net.or.jp> <44A21684.2000506@atr.jp> <44A21A16.AFBFB65E@asahi-net.or.jp> <44A233A9.6050609@atr.jp> <44A237AC.76AABD77@asahi-net.or.jp> <44A35627.8020201@atr.jp> Message-ID: <44A35F7A.47363D63@asahi-net.or.jp> "J. Hart" wrote: > > Unfortunately this did not help. The Japanese text search does not work > since I put in the Perl modules for NKF and Kakasi. I tried rebuilding > and reinstalling Kakasi, Text-Kakasi, NKF and Namazu in order, but that > did not solve the problem either. It looks like the Perl module > installation left something behind. Such a problem has hardly happened in Japan. How does the result of the following command become it? 
$ perl -e 'use Text::Kakasi; print $Text::Kakasi::VERSION' $ perl -e 'use NKF; print $NKF::VERSION;' $ mknmz -C -- ===================================================================== TADAMASA TERANISHI yw3t-trns@asahi-net.or.jp http://www.asahi-net.or.jp/~yw3t-trns/index.htm Key fingerprint = 474E 4D93 8E97 11F6 662D 8A42 17F5 52F4 10E7 D14E From jhart at atr.jp Thu Jun 29 15:51:15 2006 From: jhart at atr.jp (J. Hart) Date: Thu Jun 29 15:51:38 2006 Subject: [Namazu-users-en] Re: mknmz not workingforJapanese languagedocuments ? In-Reply-To: <44A35F7A.47363D63@asahi-net.or.jp> References: <44A203AF.6040804@atr.jp> <44A2110F.8290A2A1@asahi-net.or.jp> <44A21684.2000506@atr.jp> <44A21A16.AFBFB65E@asahi-net.or.jp> <44A233A9.6050609@atr.jp> <44A237AC.76AABD77@asahi-net.or.jp> <44A35627.8020201@atr.jp> <44A35F7A.47363D63@asahi-net.or.jp> Message-ID: <44A37863.6020308@atr.jp> Tadamasa Teranishi wrote: >Such a problem has hardly happened in Japan. > >How does the result of the following command become it? > >$ perl -e 'use Text::Kakasi; print $Text::Kakasi::VERSION' > > 2.04 >$ perl -e 'use NKF; print $NKF::VERSION;' > > 2.06 >$ mknmz -C > > $ mknmz --indexing-lang=a_JP.eucjp -C Loaded rcfile: /usr/local/etc/namazu/mknmzrc System: linux Namazu: 2.0.16 Perl: 5.008007 File-MMagic: 1.25 NKF: module_nkf KAKASI: module_kakasi -ieuc -oeuc -w ChaSen: no MeCab: no Wakati: module_kakasi -ieuc -oeuc -w Lang_Msg: C Lang: a_JP.eucjp Coding System: euc CONFDIR: /usr/local/etc/namazu LIBDIR: /usr/local/share/namazu/pl FILTERDIR: /usr/local/share/namazu/filter TEMPLATEDIR: /usr/local/share/namazu/template Supported media types: (35) Unsupported media types: (9) marked with minus (-) probably missing application in your $path. 
- application/excel: excel.pl application/gnumeric: gnumeric.pl application/ichitaro5: taro56.pl application/ichitaro6: taro56.pl - application/ichitaro7: taro7_10.pl application/macbinary: macbinary.pl application/msword: msword.pl application/pdf: pdf.pl application/postscript: postscript.pl - application/powerpoint: powerpoint.pl - application/rtf: rtf.pl application/vnd.kde.kivio: koffice.pl application/vnd.kde.kpresenter: koffice.pl application/vnd.kde.kspread: koffice.pl application/vnd.kde.kword: koffice.pl application/vnd.oasis.opendocument.graphics: ooo.pl application/vnd.oasis.opendocument.presentation: ooo.pl application/vnd.oasis.opendocument.spreadsheet: ooo.pl application/vnd.oasis.opendocument.text: ooo.pl application/vnd.sun.xml.calc: ooo.pl application/vnd.sun.xml.draw: ooo.pl application/vnd.sun.xml.impress: ooo.pl application/vnd.sun.xml.writer: ooo.pl application/x-apache-cache: apachecache.pl application/x-bzip2: bzip2.pl application/x-compress: compress.pl - application/x-deb: deb.pl - application/x-dvi: dvi.pl application/x-gzip: gzip.pl - application/x-js-taro: taro7_10.pl application/x-rpm: rpm.pl - application/x-tex: tex.pl application/x-zip: zip.pl - audio/mpeg: mp3.pl message/news: mailnews.pl message/rfc822: mailnews.pl text/hnf: hnf.pl text/html: html.pl text/html; x-type=mhonarc: mhonarc.pl text/html; x-type=pipermail: pipermail.pl text/plain text/plain; x-type=rfc: rfc.pl text/x-hdml: hdml.pl text/x-roff: man.pl From yw3t-trns at asahi-net.or.jp Thu Jun 29 16:15:46 2006 From: yw3t-trns at asahi-net.or.jp (Tadamasa Teranishi) Date: Thu Jun 29 16:18:09 2006 Subject: [Namazu-users-en] Re: mknmz notworkingforJapanese languagedocuments ? 
References: <44A203AF.6040804@atr.jp> <44A2110F.8290A2A1@asahi-net.or.jp> <44A21684.2000506@atr.jp> <44A21A16.AFBFB65E@asahi-net.or.jp> <44A233A9.6050609@atr.jp> <44A237AC.76AABD77@asahi-net.or.jp> <44A35627.8020201@atr.jp> <44A35F7A.47363D63@asahi-net.or.jp> <44A37863.6020308@atr.jp> Message-ID: <44A37E22.B4F23A29@asahi-net.or.jp> "J. Hart" wrote: > > >$ perl -e 'use Text::Kakasi; print $Text::Kakasi::VERSION' > > > > > 2.04 OK. > >$ perl -e 'use NKF; print $NKF::VERSION;' > > > > > 2.06 OK. > >$ mknmz -C > > > > > $ mknmz --indexing-lang=a_JP.eucjp -C This makes a mistake. The correct answer is the following content. $ mknmz --indexing-lang=ja_JP.eucjp -C > Loaded rcfile: /usr/local/etc/namazu/mknmzrc > System: linux > Namazu: 2.0.16 > Perl: 5.008007 > File-MMagic: 1.25 > NKF: module_nkf > KAKASI: module_kakasi -ieuc -oeuc -w > ChaSen: no > MeCab: no > Wakati: module_kakasi -ieuc -oeuc -w > Lang_Msg: C > Lang: a_JP.eucjp > Coding System: euc It is likely to change. Lang: ja_JP.eucjp -- ===================================================================== TADAMASA TERANISHI yw3t-trns@asahi-net.or.jp http://www.asahi-net.or.jp/~yw3t-trns/index.htm Key fingerprint = 474E 4D93 8E97 11F6 662D 8A42 17F5 52F4 10E7 D14E From jhart at atr.jp Thu Jun 29 16:59:39 2006 From: jhart at atr.jp (J. Hart) Date: Thu Jun 29 16:59:54 2006 Subject: [Namazu-users-en] Re: mknmz notworkingforJapanese languagedocuments ? In-Reply-To: <44A37E22.B4F23A29@asahi-net.or.jp> References: <44A203AF.6040804@atr.jp> <44A2110F.8290A2A1@asahi-net.or.jp> <44A21684.2000506@atr.jp> <44A21A16.AFBFB65E@asahi-net.or.jp> <44A233A9.6050609@atr.jp> <44A237AC.76AABD77@asahi-net.or.jp> <44A35627.8020201@atr.jp> <44A35F7A.47363D63@asahi-net.or.jp> <44A37863.6020308@atr.jp> <44A37E22.B4F23A29@asahi-net.or.jp> Message-ID: <44A3886B.3040304@atr.jp> Tadamasa Teranishi wrote: >This makes a mistake. >The correct answer is the following content. 
> >$ mknmz --indexing-lang=ja_JP.eucjp -C > > > >>Loaded rcfile: /usr/local/etc/namazu/mknmzrc >>System: linux >>Namazu: 2.0.16 >>Perl: 5.008007 >>File-MMagic: 1.25 >>NKF: module_nkf >>KAKASI: module_kakasi -ieuc -oeuc -w >>ChaSen: no >>MeCab: no >>Wakati: module_kakasi -ieuc -oeuc -w >>Lang_Msg: C >>Lang: a_JP.eucjp >>Coding System: euc >> >> > >It is likely to change. > >Lang: ja_JP.eucjp > > I'm afraid I don't understand. What should I change ? With Thanks, J. Hart From yw3t-trns at asahi-net.or.jp Thu Jun 29 17:07:10 2006 From: yw3t-trns at asahi-net.or.jp (Tadamasa Teranishi) Date: Thu Jun 29 17:09:32 2006 Subject: [Namazu-users-en] Re: mknmznotworkingforJapanese languagedocuments ? References: <44A203AF.6040804@atr.jp> <44A2110F.8290A2A1@asahi-net.or.jp> <44A21684.2000506@atr.jp> <44A21A16.AFBFB65E@asahi-net.or.jp> <44A233A9.6050609@atr.jp> <44A237AC.76AABD77@asahi-net.or.jp> <44A35627.8020201@atr.jp> <44A35F7A.47363D63@asahi-net.or.jp> <44A37863.6020308@atr.jp> <44A37E22.B4F23A29@asahi-net.or.jp> <44A3886B.3040304@atr.jp> Message-ID: <44A38A2E.EF9C0FE0@asahi-net.or.jp> "J. Hart" wrote: > > I'm afraid I don't understand. What should I change ? $ mknmz --indexing-lang=ja_JP.eucjp -C ^^^^^^^^^^^ -- ===================================================================== TADAMASA TERANISHI yw3t-trns@asahi-net.or.jp http://www.asahi-net.or.jp/~yw3t-trns/index.htm Key fingerprint = 474E 4D93 8E97 11F6 662D 8A42 17F5 52F4 10E7 D14E From darren at dcook.org Thu Jun 29 17:08:43 2006 From: darren at dcook.org (Darren Cook) Date: Thu Jun 29 17:09:36 2006 Subject: [Namazu-users-en] Re: mknmz notworkingforJapanese languagedocuments ? 
In-Reply-To: <44A3886B.3040304@atr.jp> References: <44A203AF.6040804@atr.jp> <44A2110F.8290A2A1@asahi-net.or.jp> <44A21684.2000506@atr.jp> <44A21A16.AFBFB65E@asahi-net.or.jp> <44A233A9.6050609@atr.jp> <44A237AC.76AABD77@asahi-net.or.jp> <44A35627.8020201@atr.jp> <44A35F7A.47363D63@asahi-net.or.jp> <44A37863.6020308@atr.jp> <44A37E22.B4F23A29@asahi-net.or.jp> <44A3886B.3040304@atr.jp> Message-ID: <44A38A8B.5030502@dcook.org> >>>Lang: a_JP.eucjp >> Lang: ja_JP.eucjp > > I'm afraid I don't understand. What should I change ? You've a typo, missing the j of ja. Darren From jhart at atr.jp Fri Jun 30 09:34:41 2006 From: jhart at atr.jp (J. Hart) Date: Fri Jun 30 09:35:03 2006 Subject: [Namazu-users-en] Re: mknmz notworkingforJapanese languagedocuments ? In-Reply-To: <44A37E22.B4F23A29@asahi-net.or.jp> References: <44A203AF.6040804@atr.jp> <44A2110F.8290A2A1@asahi-net.or.jp> <44A21684.2000506@atr.jp> <44A21A16.AFBFB65E@asahi-net.or.jp> <44A233A9.6050609@atr.jp> <44A237AC.76AABD77@asahi-net.or.jp> <44A35627.8020201@atr.jp> <44A35F7A.47363D63@asahi-net.or.jp> <44A37863.6020308@atr.jp> <44A37E22.B4F23A29@asahi-net.or.jp> Message-ID: <44A471A1.8090806@atr.jp> Tadamasa Teranishi wrote: >This makes a mistake. >The correct answer is the following content. > >$ mknmz --indexing-lang=ja_JP.eucjp -C I understand now...:-) Here is the output from that command: -------------------------------------- $ mknmz --indexing-lang=ja_JP.eucjp -C Loaded rcfile: /usr/local/etc/namazu/mknmzrc System: linux Namazu: 2.0.16 Perl: 5.008007 File-MMagic: 1.25 NKF: module_nkf KAKASI: module_kakasi -ieuc -oeuc -w ChaSen: no MeCab: no Wakati: module_kakasi -ieuc -oeuc -w Lang_Msg: en_US.UTF-8 Lang: ja_JP.eucjp Coding System: euc CONFDIR: /usr/local/etc/namazu LIBDIR: /usr/local/share/namazu/pl FILTERDIR: /usr/local/share/namazu/filter TEMPLATEDIR: /usr/local/share/namazu/template Supported media types: (34) Unsupported media types: (10) marked with minus (-) probably missing application in your $path. 
- application/excel: excel.pl application/gnumeric: gnumeric.pl application/ichitaro5: taro56.pl application/ichitaro6: taro56.pl - application/ichitaro7: taro7_10.pl application/macbinary: macbinary.pl application/msword: msword.pl application/pdf: pdf.pl - application/postscript: postscript.pl - application/powerpoint: powerpoint.pl - application/rtf: rtf.pl application/vnd.kde.kivio: koffice.pl application/vnd.kde.kpresenter: koffice.pl application/vnd.kde.kspread: koffice.pl application/vnd.kde.kword: koffice.pl application/vnd.oasis.opendocument.graphics: ooo.pl application/vnd.oasis.opendocument.presentation: ooo.pl application/vnd.oasis.opendocument.spreadsheet: ooo.pl application/vnd.oasis.opendocument.text: ooo.pl application/vnd.sun.xml.calc: ooo.pl application/vnd.sun.xml.draw: ooo.pl application/vnd.sun.xml.impress: ooo.pl application/vnd.sun.xml.writer: ooo.pl application/x-apache-cache: apachecache.pl application/x-bzip2: bzip2.pl application/x-compress: compress.pl - application/x-deb: deb.pl - application/x-dvi: dvi.pl application/x-gzip: gzip.pl - application/x-js-taro: taro7_10.pl application/x-rpm: rpm.pl - application/x-tex: tex.pl application/x-zip: zip.pl - audio/mpeg: mp3.pl message/news: mailnews.pl message/rfc822: mailnews.pl text/hnf: hnf.pl text/html: html.pl text/html; x-type=mhonarc: mhonarc.pl text/html; x-type=pipermail: pipermail.pl text/plain text/plain; x-type=rfc: rfc.pl text/x-hdml: hdml.pl text/x-roff: man.pl -------------------------------------- With Thanks, J. Hart From jhart at atr.jp Fri Jun 30 09:40:12 2006 From: jhart at atr.jp (J. Hart) Date: Fri Jun 30 09:40:28 2006 Subject: [Namazu-users-en] Re: mknmz notworkingforJapanese languagedocuments ? 
In-Reply-To: <44A37E22.B4F23A29@asahi-net.or.jp>
References: <44A203AF.6040804@atr.jp> <44A2110F.8290A2A1@asahi-net.or.jp> <44A21684.2000506@atr.jp> <44A21A16.AFBFB65E@asahi-net.or.jp> <44A233A9.6050609@atr.jp> <44A237AC.76AABD77@asahi-net.or.jp> <44A35627.8020201@atr.jp> <44A35F7A.47363D63@asahi-net.or.jp> <44A37863.6020308@atr.jp> <44A37E22.B4F23A29@asahi-net.or.jp>
Message-ID: <44A472EC.1000105@atr.jp>

An additional note:

We are using Namazu with a search web page we have set up here. I have
learned that there are in fact some Japanese strings we are able to
find, but many that we are unable to find.
What we can find is dependent on the character encoding setting used by
the browser doing the search.
The documents we built the index from are very likely to be using
several different Japanese character encodings (e.g. Shift_JIS, EUC-JP).
I wonder what effect this might have.

J. Hart

From jhart at atr.jp Fri Jun 30 09:49:39 2006
From: jhart at atr.jp (J. Hart)
Date: Fri Jun 30 09:49:56 2006
Subject: [Namazu-users-en] Re: mknmz not working for Japanese (Correction)
In-Reply-To: <44A37E22.B4F23A29@asahi-net.or.jp>
References: <44A203AF.6040804@atr.jp> <44A2110F.8290A2A1@asahi-net.or.jp> <44A21684.2000506@atr.jp> <44A21A16.AFBFB65E@asahi-net.or.jp> <44A233A9.6050609@atr.jp> <44A237AC.76AABD77@asahi-net.or.jp> <44A35627.8020201@atr.jp> <44A35F7A.47363D63@asahi-net.or.jp> <44A37863.6020308@atr.jp> <44A37E22.B4F23A29@asahi-net.or.jp>
Message-ID: <44A47523.2060608@atr.jp>

My apologies. When I sent that command output, I had forgotten to set
the LANG environment variable. Here it is again:
$ export LANG=C
$ mknmz --indexing-lang=ja_JP.eucjp -C
-------------------------------------------
Loaded rcfile: /usr/local/etc/namazu/mknmzrc
System: linux
Namazu: 2.0.16
Perl: 5.008007
File-MMagic: 1.25
NKF: module_nkf
KAKASI: module_kakasi -ieuc -oeuc -w
ChaSen: no
MeCab: no
Wakati: module_kakasi -ieuc -oeuc -w
Lang_Msg: C
Lang: ja_JP.eucjp
Coding System: euc
CONFDIR: /usr/local/etc/namazu
LIBDIR: /usr/local/share/namazu/pl
FILTERDIR: /usr/local/share/namazu/filter
TEMPLATEDIR: /usr/local/share/namazu/template
Supported media types:   (34)
Unsupported media types: (10) marked with minus (-) probably missing application in your $path.
- application/excel: excel.pl
  application/gnumeric: gnumeric.pl
  application/ichitaro5: taro56.pl
  application/ichitaro6: taro56.pl
- application/ichitaro7: taro7_10.pl
  application/macbinary: macbinary.pl
  application/msword: msword.pl
  application/pdf: pdf.pl
- application/postscript: postscript.pl
- application/powerpoint: powerpoint.pl
- application/rtf: rtf.pl
  application/vnd.kde.kivio: koffice.pl
  application/vnd.kde.kpresenter: koffice.pl
  application/vnd.kde.kspread: koffice.pl
  application/vnd.kde.kword: koffice.pl
  application/vnd.oasis.opendocument.graphics: ooo.pl
  application/vnd.oasis.opendocument.presentation: ooo.pl
  application/vnd.oasis.opendocument.spreadsheet: ooo.pl
  application/vnd.oasis.opendocument.text: ooo.pl
  application/vnd.sun.xml.calc: ooo.pl
  application/vnd.sun.xml.draw: ooo.pl
  application/vnd.sun.xml.impress: ooo.pl
  application/vnd.sun.xml.writer: ooo.pl
  application/x-apache-cache: apachecache.pl
  application/x-bzip2: bzip2.pl
  application/x-compress: compress.pl
- application/x-deb: deb.pl
- application/x-dvi: dvi.pl
  application/x-gzip: gzip.pl
- application/x-js-taro: taro7_10.pl
  application/x-rpm: rpm.pl
- application/x-tex: tex.pl
  application/x-zip: zip.pl
- audio/mpeg: mp3.pl
  message/news: mailnews.pl
  message/rfc822: mailnews.pl
  text/hnf: hnf.pl
  text/html: html.pl
  text/html; x-type=mhonarc: mhonarc.pl
  text/html; x-type=pipermail: pipermail.pl
  text/plain
  text/plain; x-type=rfc: rfc.pl
  text/x-hdml: hdml.pl
  text/x-roff: man.pl

From darren at dcook.org Fri Jun 30 10:50:28 2006
From: darren at dcook.org (Darren Cook)
Date: Fri Jun 30 10:51:22 2006
Subject: [Namazu-users-en] Re: mknmz not working for Japanese language documents ?
In-Reply-To: <44A472EC.1000105@atr.jp>
References: <44A203AF.6040804@atr.jp> <44A2110F.8290A2A1@asahi-net.or.jp>
 <44A21684.2000506@atr.jp> <44A21A16.AFBFB65E@asahi-net.or.jp>
 <44A233A9.6050609@atr.jp> <44A237AC.76AABD77@asahi-net.or.jp>
 <44A35627.8020201@atr.jp> <44A35F7A.47363D63@asahi-net.or.jp>
 <44A37863.6020308@atr.jp> <44A37E22.B4F23A29@asahi-net.or.jp>
 <44A472EC.1000105@atr.jp>
Message-ID: <44A48364.7030703@dcook.org>

> What we can find is dependent on the character encoding setting used by
> the browser doing the search.
> The documents we built the index from are very likely to be using
> several different Japanese character encodings (e.g. Shift_JIS, EUC-JP).

I've not used the perl modules, but I can tell you what I do on a site
that isn't native EUC.

For indexing an English UTF8 site I use:
  mknmz --indexing-lang=en.UTF-8 -e ...

For indexing a Japanese UTF8 site I use (the -k means use kakasi):
  mknmz --indexing-lang=ja.UTF-8 -k -e ...

For searching (I'm using PHP module by the way) I convert the search
keywords to EUC:
  $kw_euc=mb_convert_encoding($kw,"EUC-JP","UTF8");

Then do the search, then for each search hit I convert the result back
from EUC to UTF8 ready for display, e.g.:
  $title=mb_convert_encoding(
    nmz_result_field($hlist,$n,'subject'),
    'UTF8','EUC-JP');

Darren

From jhart at atr.jp Fri Jun 30 11:09:35 2006
From: jhart at atr.jp (J. Hart)
Date: Fri Jun 30 11:09:55 2006
Subject: [Namazu-users-en] Re: mknmz not working for Japanese language documents ?
In-Reply-To: <44A48364.7030703@dcook.org>
References: <44A203AF.6040804@atr.jp> <44A2110F.8290A2A1@asahi-net.or.jp>
 <44A21684.2000506@atr.jp> <44A21A16.AFBFB65E@asahi-net.or.jp>
 <44A233A9.6050609@atr.jp> <44A237AC.76AABD77@asahi-net.or.jp>
 <44A35627.8020201@atr.jp> <44A35F7A.47363D63@asahi-net.or.jp>
 <44A37863.6020308@atr.jp> <44A37E22.B4F23A29@asahi-net.or.jp>
 <44A472EC.1000105@atr.jp> <44A48364.7030703@dcook.org>
Message-ID: <44A487DF.1040202@atr.jp>

Darren Cook wrote:

>I've not used the perl modules, but I can tell you what I do on a site
>that isn't native EUC.
>
>For indexing an English UTF8 site I use:
>  mknmz --indexing-lang=en.UTF-8 -e ...
>
>For indexing a Japanese UTF8 site I use (the -k means use kakasi):
>  mknmz --indexing-lang=ja.UTF-8 -k -e ...
>
>For searching (I'm using PHP module by the way) I convert the search
>keywords to EUC:
>  $kw_euc=mb_convert_encoding($kw,"EUC-JP","UTF8");
>
>Then do the search, then for each search hit I convert the result back
>from EUC to UTF8 ready for display, e.g.:
>  $title=mb_convert_encoding(
>    nmz_result_field($hlist,$n,'subject'),
>    'UTF8','EUC-JP');
>
>

Our documents may be of different encodings.
Do I need to have a separate site for each different encoding then ?

J. Hart

From yw3t-trns at asahi-net.or.jp Fri Jun 30 11:45:58 2006
From: yw3t-trns at asahi-net.or.jp (Tadamasa Teranishi)
Date: Fri Jun 30 11:48:51 2006
Subject: [Namazu-users-en] Re: mknmz not working for Japanese language documents ?
References: <44A203AF.6040804@atr.jp> <44A2110F.8290A2A1@asahi-net.or.jp>
 <44A21684.2000506@atr.jp> <44A21A16.AFBFB65E@asahi-net.or.jp>
 <44A233A9.6050609@atr.jp> <44A237AC.76AABD77@asahi-net.or.jp>
 <44A35627.8020201@atr.jp> <44A35F7A.47363D63@asahi-net.or.jp>
 <44A37863.6020308@atr.jp> <44A37E22.B4F23A29@asahi-net.or.jp>
 <44A472EC.1000105@atr.jp>
Message-ID: <44A49066.6E47DE8F@asahi-net.or.jp>

"J. Hart" wrote:
>
> The documents we built the index from are very likely to be using
> several different Japanese character encodings (e.g. Shift_JIS, EUC-JP).
>
> I wonder what effect this might have.

No problem.
Japanese content is automatically converted to EUC-JP by nkf and
then processed.
--
=====================================================================
TADAMASA TERANISHI yw3t-trns@asahi-net.or.jp
http://www.asahi-net.or.jp/~yw3t-trns/index.htm
Key fingerprint = 474E 4D93 8E97 11F6 662D 8A42 17F5 52F4 10E7 D14E

From yw3t-trns at asahi-net.or.jp Fri Jun 30 11:46:39 2006
From: yw3t-trns at asahi-net.or.jp (Tadamasa Teranishi)
Date: Fri Jun 30 11:49:04 2006
Subject: [Namazu-users-en] Re: mknmz not working for Japanese (Correction)
References: <44A203AF.6040804@atr.jp> <44A2110F.8290A2A1@asahi-net.or.jp>
 <44A21684.2000506@atr.jp> <44A21A16.AFBFB65E@asahi-net.or.jp>
 <44A233A9.6050609@atr.jp> <44A237AC.76AABD77@asahi-net.or.jp>
 <44A35627.8020201@atr.jp> <44A35F7A.47363D63@asahi-net.or.jp>
 <44A37863.6020308@atr.jp> <44A37E22.B4F23A29@asahi-net.or.jp>
 <44A47523.2060608@atr.jp>
Message-ID: <44A4908F.49190062@asahi-net.or.jp>

"J. Hart" wrote:
>
> My apologies....when I sent that command output, I had forgotten to set
> the LANG environment variable. Here it is again..
>
> $ export LANG=C
> $ mknmz --indexing-lang=ja_JP.eucjp -C

For English:

$ export LANG=C
$ mknmz -C

When the documents contain Japanese:

$ export LANG=C
$ mknmz --indexing-lang=ja_JP.eucjp -C

or

$ export LC_MESSAGES=C
$ export LANG=ja_JP.eucjp
$ mknmz -C

The latter method is recommended.

Moreover, if the terminal is EUC-JP:

$ export LANG=ja_JP.eucjp
$ mknmz -C

Messages are then displayed in Japanese
(provided LC_ALL, LC_MESSAGES, and LC_CTYPE are unset).
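
The way these variables combine can be sketched in a few lines of Python. This is only an illustration of the standard POSIX precedence (LC_ALL first, then the category variable such as LC_MESSAGES, then LANG); the helper name `effective_locale` is hypothetical, and whether mknmz resolves its own settings in exactly this order is an assumption, not something taken from the Namazu sources.

```python
def effective_locale(env, category):
    """Return the locale POSIX rules would select for one category.

    Precedence: LC_ALL, then the category variable (e.g. LC_MESSAGES),
    then LANG, then the default "C". Empty values count as unset.
    """
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name)
        if value:
            return value
    return "C"


# The recommended setup from the message above:
# English messages, Japanese (EUC-JP) locale for everything else.
recommended = {"LC_MESSAGES": "C", "LANG": "ja_JP.eucjp"}
print(effective_locale(recommended, "LC_MESSAGES"))
print(effective_locale(recommended, "LC_CTYPE"))

# The EUC-JP terminal setup: every category falls back to LANG.
euc_terminal = {"LANG": "ja_JP.eucjp"}
print(effective_locale(euc_terminal, "LC_MESSAGES"))
```

Passing `os.environ` as `env` would show the values in effect for the current shell.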
--
=====================================================================
TADAMASA TERANISHI yw3t-trns@asahi-net.or.jp
http://www.asahi-net.or.jp/~yw3t-trns/index.htm
Key fingerprint = 474E 4D93 8E97 11F6 662D 8A42 17F5 52F4 10E7 D14E

From yw3t-trns at asahi-net.or.jp Fri Jun 30 11:58:42 2006
From: yw3t-trns at asahi-net.or.jp (Tadamasa Teranishi)
Date: Fri Jun 30 12:01:06 2006
Subject: [Namazu-users-en] Re: mknmz not working for Japanese language documents ?
References: <44A203AF.6040804@atr.jp> <44A2110F.8290A2A1@asahi-net.or.jp>
 <44A21684.2000506@atr.jp> <44A21A16.AFBFB65E@asahi-net.or.jp>
 <44A233A9.6050609@atr.jp> <44A237AC.76AABD77@asahi-net.or.jp>
 <44A35627.8020201@atr.jp> <44A35F7A.47363D63@asahi-net.or.jp>
 <44A37863.6020308@atr.jp> <44A37E22.B4F23A29@asahi-net.or.jp>
 <44A472EC.1000105@atr.jp> <44A48364.7030703@dcook.org>
Message-ID: <44A49362.E94C52AB@asahi-net.or.jp>

Darren Cook wrote:
>
> I've not used the perl modules, but I can tell you what I do on a site
> that isn't native EUC.
>
> For indexing an English UTF8 site I use:
>   mknmz --indexing-lang=en.UTF-8 -e ...

It is a mistake.
Namazu doesn't support UTF-8.

> For indexing a Japanese UTF8 site I use (the -k means use kakasi):
>   mknmz --indexing-lang=ja.UTF-8 -k -e ...

It is a mistake.
Namazu doesn't support UTF-8.
(However, it can handle documents encoded in ja_JP.UTF-8.)

Use the following instead:

$ mknmz --indexing-lang=ja_JP.eucjp -k -e ...

Documents in ISO-2022-JP, Shift_JIS, and EUC-JP can all be handled
even though ja_JP.eucjp is specified.
The --indexing-lang option does not specify the encoding of the
documents being indexed.

> For searching (I'm using PHP module by the way) I convert the search
> keywords to EUC:
>   $kw_euc=mb_convert_encoding($kw,"EUC-JP","UTF8");

Search keywords are supported only in ISO-2022-JP, Shift_JIS, and EUC-JP.
(UTF-8 is unsupported. Therefore, converting them to EUC-JP as in this
example is recommended.)

> Then do the search, then for each search hit I convert the result back
> from EUC to UTF8 ready for display, e.g.:

The search results are always in EUC-JP (on UNIX).
--
=====================================================================
TADAMASA TERANISHI yw3t-trns@asahi-net.or.jp
http://www.asahi-net.or.jp/~yw3t-trns/index.htm
Key fingerprint = 474E 4D93 8E97 11F6 662D 8A42 17F5 52F4 10E7 D14E

From yw3t-trns at asahi-net.or.jp Fri Jun 30 12:04:07 2006
From: yw3t-trns at asahi-net.or.jp (Tadamasa Teranishi)
Date: Fri Jun 30 12:06:32 2006
Subject: [Namazu-users-en] Re: mknmz not working for Japanese language documents ?
References: <44A203AF.6040804@atr.jp> <44A2110F.8290A2A1@asahi-net.or.jp>
 <44A21684.2000506@atr.jp> <44A21A16.AFBFB65E@asahi-net.or.jp>
 <44A233A9.6050609@atr.jp> <44A237AC.76AABD77@asahi-net.or.jp>
 <44A35627.8020201@atr.jp> <44A35F7A.47363D63@asahi-net.or.jp>
 <44A37863.6020308@atr.jp> <44A37E22.B4F23A29@asahi-net.or.jp>
 <44A472EC.1000105@atr.jp> <44A48364.7030703@dcook.org>
 <44A487DF.1040202@atr.jp>
Message-ID: <44A494A7.52BC29AA@asahi-net.or.jp>

"J. Hart" wrote:
>
> Our documents may be of different encodings.
> Do I need to have a separate site for each different encoding then ?

That is not necessary at all.

Namazu's Japanese encoding support is as follows.

Encoding of documents:                ISO-2022-JP, Shift_JIS, EUC-JP, ja_JP.UTF-8
Encoding of search strings:           ISO-2022-JP, Shift_JIS, EUC-JP
Encoding of search results (on UNIX): EUC-JP
Encoding of messages (on UNIX):       EUC-JP
--
=====================================================================
TADAMASA TERANISHI yw3t-trns@asahi-net.or.jp
http://www.asahi-net.or.jp/~yw3t-trns/index.htm
Key fingerprint = 474E 4D93 8E97 11F6 662D 8A42 17F5 52F4 10E7 D14E

From darren at dcook.org Fri Jun 30 12:13:26 2006
From: darren at dcook.org (Darren Cook)
Date: Fri Jun 30 12:14:19 2006
Subject: [Namazu-users-en] Re: mknmz not working for Japanese...
In-Reply-To: <44A49362.E94C52AB@asahi-net.or.jp>
References: <44A203AF.6040804@atr.jp> <44A2110F.8290A2A1@asahi-net.or.jp>
 <44A21684.2000506@atr.jp> <44A21A16.AFBFB65E@asahi-net.or.jp>
 <44A233A9.6050609@atr.jp> <44A237AC.76AABD77@asahi-net.or.jp>
 <44A35627.8020201@atr.jp> <44A35F7A.47363D63@asahi-net.or.jp>
 <44A37863.6020308@atr.jp> <44A37E22.B4F23A29@asahi-net.or.jp>
 <44A472EC.1000105@atr.jp> <44A48364.7030703@dcook.org>
 <44A49362.E94C52AB@asahi-net.or.jp>
Message-ID: <44A496D6.1050400@dcook.org>

>>For indexing an English UTF8 site I use:
>>  mknmz --indexing-lang=en.UTF-8 -e ...
>
> It is a mistake.
> Namazu doesn't support UTF-8.
>
>>For indexing a Japanese UTF8 site I use (the -k means use kakasi):
>>  mknmz --indexing-lang=ja.UTF-8 -k -e ...
>
> It is a mistake.
> Namazu doesn't support UTF-8.
> (However, it can handle documents encoded in ja_JP.UTF-8.)

That is interesting, as the above both work fine. It is a year since I
set up the above, so my memory may be wrong, but I'm fairly sure I had
problems and using "ja.UTF-8" fixed it. I think I may have had to
upgrade nkf to get it working?

(See also:
http://www.mhonarc.org/archive/html/namazu-users-en/2005-06/msg00010.html
where it says:
"The text of ja_JP.UTF-8 can be processed by combining with nkf 2.0.5 if
it limits it to a Japanese environment. ")

Darren

From yw3t-trns at asahi-net.or.jp Fri Jun 30 12:28:15 2006
From: yw3t-trns at asahi-net.or.jp (Tadamasa Teranishi)
Date: Fri Jun 30 12:30:40 2006
Subject: [Namazu-users-en] Re: mknmz not working for Japanese...
References: <44A203AF.6040804@atr.jp> <44A2110F.8290A2A1@asahi-net.or.jp>
 <44A21684.2000506@atr.jp> <44A21A16.AFBFB65E@asahi-net.or.jp>
 <44A233A9.6050609@atr.jp> <44A237AC.76AABD77@asahi-net.or.jp>
 <44A35627.8020201@atr.jp> <44A35F7A.47363D63@asahi-net.or.jp>
 <44A37863.6020308@atr.jp> <44A37E22.B4F23A29@asahi-net.or.jp>
 <44A472EC.1000105@atr.jp> <44A48364.7030703@dcook.org>
 <44A49362.E94C52AB@asahi-net.or.jp> <44A496D6.1050400@dcook.org>
Message-ID: <44A49A4F.B79364A7@asahi-net.or.jp>

Darren Cook wrote:
>
> > It is a mistake.
> > Namazu doesn't support UTF-8.
> > (However, it can handle documents encoded in ja_JP.UTF-8.)
>
> That is interesting, as the above both work fine. It is a year since I
> set up the above, so my memory may be wrong, but I'm fairly sure I had
> problems and using "ja.UTF-8" fixed it. I think I may have had to
> upgrade nkf to get it working?

ja_JP.UTF-8 has been supported by nkf since version 2.0.
Therefore, mknmz can process documents in the ja_JP.UTF-8 encoding.

However, it is clearly a mistake to specify ja_JP.UTF-8 for the
--indexing-lang option, because the --indexing-lang option does not
specify the encoding of the documents being indexed.

You need to specify ja_JP.eucjp for the --indexing-lang option (on UNIX).

# In any case it means EUC-JP, though depending on the environment
# the name might be ja_JP.ujis.
--
=====================================================================
TADAMASA TERANISHI yw3t-trns@asahi-net.or.jp
http://www.asahi-net.or.jp/~yw3t-trns/index.htm
Key fingerprint = 474E 4D93 8E97 11F6 662D 8A42 17F5 52F4 10E7 D14E
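
The search-side handling discussed in this thread (a UTF-8 page in front, EUC-JP keywords and results behind it) can be sketched in Python with only the standard codecs. This is an illustration under stated assumptions, not the PHP module's API: `run_search` is a hypothetical stand-in for the real Namazu search call, and the other function names are likewise made up for the example.

```python
def to_namazu_keyword(keyword):
    """Encode a Unicode search keyword as EUC-JP bytes."""
    return keyword.encode("euc_jp")


def from_namazu_result(raw):
    """Decode an EUC-JP result field (bytes) back to Unicode text."""
    return raw.decode("euc_jp")


def run_search(keyword_euc):
    """Hypothetical stand-in for the real search call.

    It simply echoes the keyword back as a one-hit result list so the
    sketch stays self-contained.
    """
    return [keyword_euc]


keyword = "日本語"
hits = run_search(to_namazu_keyword(keyword))
titles = [from_namazu_result(hit) for hit in hits]

# Encode the decoded titles as UTF-8 when writing the page.
page_bytes = "\n".join(titles).encode("utf-8")
```

Keeping the conversion at the two boundaries (keyword in, result fields out) means the rest of the page code can work with ordinary Unicode strings.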