[Namazu-users-en] Re: [Fwd: Bad description of query result]

Rami Addady rami at active.co.il
Fri Sep 2 19:36:27 JST 2005


I tried both Earl's and NOKUBI's versions of decode_numbered_entity.

Neither of them gives me a full solution.

My Hebrew emails (Windows-1255 and UTF-8) are only partly indexed.  Only the
Latin characters are indexed.

>You may want to look at past discussion on this list.  For example:

When I use Earl's modification to decode_numbered_entity, the namazu.cgi
search result displays ????? instead of the Hebrew characters.

When I use NOKUBI's version of decode_numbered_entity, the namazu.cgi search
result displays the HTML file name instead of the mail title, and
"<<< text/html: EXCLUDED >>>" instead of the first line of the email.

The MHonArc HTML files are fine; there is no problem seeing the Hebrew



NOKUBI Takatsugu wrote:

>The following is my plan to patch:
>sub decode_numbered_entity ($) {
>    my ($num) = @_;
>    return ""
>        if ($num >= 0 && $num <= 31) || ($num >= 127 && $num <= 159) ||
>           ($num >= 255 && !util::islang('ja'));
>    return ""
>        if $num >=127 && util::islang('ja');
>    sprintf ("%c",$num);
>}
>It wouldn't affect the Japanese environment, and would work with
>iso-8859-* characters.
>I tested it, and it seemed good with the test suites.
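
To see how the quoted plan treats different code points, here is a minimal
standalone sketch.  It copies the conditions from the quoted sub, but islang()
and $LANG are hypothetical stand-ins for util::islang and the configured
language, so this only approximates what happens inside Namazu.  The sample
values are 65 ("A"), 233 (U+00E9, in the iso-8859-1 range) and 1488 (U+05D0,
HEBREW LETTER ALEF).

```perl
use strict;
use warnings;

# Hypothetical stand-in for util::islang; the real function is part of Namazu.
my $LANG = 'en';
sub islang { my ($lang) = @_; return $LANG eq $lang; }

# Same conditions as the quoted plan, with the closing brace in place.
sub decode_numbered_entity {
    my ($num) = @_;
    # Drop control characters, and everything from 255 up when not Japanese.
    return ""
        if ($num >= 0 && $num <= 31) || ($num >= 127 && $num <= 159) ||
           ($num >= 255 && !islang('ja'));
    # In a Japanese environment, drop everything outside 7-bit ASCII.
    return ""
        if $num >= 127 && islang('ja');
    return sprintf("%c", $num);
}

# Report each result as a code point so no wide characters are printed.
for my $num (65, 233, 1488) {
    my $s = decode_numbered_entity($num);
    printf "%d -> %s\n", $num,
        length($s) ? sprintf("U+%04X", ord $s) : "(dropped)";
}
```

With $LANG set to 'en', this prints U+0041 for 65, U+00E9 for 233, and
"(dropped)" for 1488.  Under these assumptions a Hebrew letter written as a
numbered entity is removed rather than decoded, which may be related to only
the Latin characters being indexed.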
