Namazu-users-en(old)



Re: Mailman & "Charactères Français"



At Mon,  2 Jun 2003 16:26:26 -0500,
dchartrand@xxxxxxxxxx wrote:
> Namazu is having problems displaying and understanding french characters such as
> "àéèçô..." when used to search Mailman archives. A word like "troisième" is
> displayed (and searched) as "troisime" in Namazu... Notice the missing "è".

I got a similar report in the past, so I tried to reproduce the problem
with the following steps:

1. Saved the mail <1054589186.3edbc102c03c0@xxxxxxxxxx> as a text file
   named "docs/french-text.txt".

2. Typed "LANG=C mknmz -O index ./docs" to build the index.

3. Typed "LANG=C namazu -h `sed -n '78p' index/NMZ.w` index > foo" to
   search for the word "troisième" (line 78 of index/NMZ.w), because I
   don't know how to type French characters.

4. Checked the file foo. It looks like this:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
        "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<!-- LINK-REV-MADE -->
<link rev=made href="mailto:webmaster@xxxxxxxxxxxxxxxxxxxxxxx">
<!-- LINK-REV-MADE -->
<title>Namazu: a Full-Text Search Engine: &lt;troisième&gt;</title>

  :
(snip)
  :
<h2>Results:</h2>
<p>
References:  [ troisième: 1 ] 
  :
(snip)

Hmm, I see no problem here.
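
The sed trick from step 3 can be tried on its own, without Namazu. The
following is only a minimal sketch: the temporary word list is a
hypothetical stand-in for index/NMZ.w (assuming one word per line, as
the "sed -n '78p'" call implies), and the ISO-8859-1 byte used for "è"
is also an assumption, since I have not checked which encoding the
index actually holds.

```shell
#!/bin/sh
# Work byte-wise, like the LANG=C runs in the steps above.
LC_ALL=C
export LC_ALL

# Hypothetical stand-in for index/NMZ.w: one word per line.
# \350 is the ISO-8859-1 byte for e-grave (an assumption about the encoding).
wordlist=$(mktemp)
printf 'premier\ndeuxi\350me\ntroisi\350me\n' > "$wordlist"

# Pull the word from line 3 instead of typing the accented character.
query=$(sed -n '3p' "$wordlist")

# Dump the bytes that would be handed to the search command.
printf '%s' "$query" | od -An -c

rm -f "$wordlist"
```

If the accented byte were lost somewhere on the way, the od dump would
show it at once, which is easier to judge than the terminal display.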

> I am using Mailman 2.1 and Namazu 2.0.12.

What is your environment? This is mine:

Debian GNU/Linux (today's unstable)
Linux 2.4.21-pre4
glibc 2.3.1
-- 
NOKUBI Takatsugu
E-mail: knok@xxxxxxxxxxxxx
	knok@xxxxxxxxxx / knok@xxxxxxxxxx