Namazu-devel-en(old)

Re: Doubts

From: Makoto Fujiwara <makoto@xxxxx>
Date: Thu, 26 Jun 2003 23:05:11 +0900
X-ml-name: namazu-devel-en
X-mail-count: 00075
References: <DD924B576CB1EA4CAF33ECFEAAE0D99501BD6D59@indhubbhs03.ad.infosys.com>

Navneet> 1. For Japanese documents, Namazu stores the index entries for the searched text in kana and hence converts
Navneet> all kanji to kana. Therefore a search cannot differentiate between two different kanji characters. Is this
Navneet> true?

No, the first assumption is wrong. The NMZ.* files store two-byte characters
as they are, without converting kanji to kana.
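
To illustrate why storing the two-byte characters unchanged keeps different kanji apart, here is a minimal sketch. It assumes EUC-JP as the two-byte encoding purely for the example, and the names KAWA_1, KAWA_2, and two_byte_form are hypothetical; it is not Namazu's indexing code.

```python
# Illustration only: EUC-JP is assumed as an example two-byte encoding.
# This is not taken from Namazu; all names here are hypothetical.
KAWA_1 = "川"  # read "kawa"
KAWA_2 = "河"  # also read "kawa"


def two_byte_form(ch: str) -> bytes:
    """Encode a single kanji as EUC-JP, with no conversion to kana."""
    return ch.encode("euc_jp")


encoded_1 = two_byte_form(KAWA_1)
encoded_2 = two_byte_form(KAWA_2)

# Each kanji occupies two bytes, and the two byte sequences differ,
# so a byte-level comparison can tell the characters apart.
print(len(encoded_1), len(encoded_2))
print(encoded_1 != encoded_2)
```

Both kanji share a reading, so a kana-only index could not separate them, whereas their stored two-byte forms remain distinct.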

Navneet> 2. Namazu uses software called nkf for Japanese processing. NKF 1.71 supports the following encodings:
Navneet> 7-bit JIS, MS-kanji (Shift_JIS) or EUC.

Yes.

Navneet> It does not support UTF-8. I heard that NKF 2.01 onwards supports UTF-8, but the recommended version of NKF for
Navneet> Namazu 2.0 is NKF 1.71?

Namazu does not support UTF-8 internally.
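
One possible way to work with this, sketched under assumptions: convert UTF-8 text to one of the encodings listed above before it reaches the indexer. The function name utf8_to_euc_jp and the choice of EUC-JP as the target are hypothetical, and this step is not something Namazu or nkf is claimed to do in this form.

```python
def utf8_to_euc_jp(data: bytes) -> bytes:
    """Transcode UTF-8 bytes to EUC-JP (hypothetical pre-processing step).

    Characters that EUC-JP cannot represent are replaced with "?"
    instead of raising an error.
    """
    text = data.decode("utf-8")
    return text.encode("euc_jp", errors="replace")


# Usage: transcode a short UTF-8 string and check that it round-trips.
utf8_bytes = "検索".encode("utf-8")
euc_bytes = utf8_to_euc_jp(utf8_bytes)
print(euc_bytes.decode("euc_jp") == "検索")
```

Replacing unrepresentable characters loses information, so a stricter variant might use errors="strict" and report the failing document instead.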
---
Makoto Fujiwara, 
Chiba, Japan, Narita Airport and Disneyland prefecture.

References:
- Doubts
  - From: Navneet Saraogi
