[Namazu-users-en] Re: html data not indexed in text/html mails

swati swati_longia at sifycorp.com
Fri Aug 5 14:08:32 JST 2005


Hello,

Thank you very much for that patch. I am able to index and search 
those text/html mails now, and everything seems to work fine.
Thanks again.

Regards,
Swati

Yukio USUDA wrote:

>swati wrote:
>
>>Hello all,
>>This is a sample mail that I was trying to index and search.
>>
>
>snip
>
>>In this mail I am able to search on words like rober and streams, which exist in the header part. But words like fear, member, or primers, which exist inside the HTML part of the mail, are not indexed or searchable. I also tried the new version of Namazu (namazu-2.0.15pre1), but with that too I am not able to index or search this type of mail.
>>
>>Can anyone suggest how I can get these mails indexed and searchable as well?
>>
>
>I made a patch for this type of mail (against namazu-2.0.15pre1).
>
>bash$ diff -ub filter/mailnews.pl.org filter/mailnews.pl
>--- filter/mailnews.pl.org      Mon Jun  6 14:41:42 2005
>+++ filter/mailnews.pl  Thu Aug  4 21:13:53 2005
>@@ -65,7 +65,7 @@
>     util::vprint("Processing mail/news file ...\n");
> 
>     uuencode_filter($cont);
>-    mailnews_filter($cont, $weighted_str, $fields);
>+    mailnews_filter($cont, $weighted_str, $headings, $fields);
>     mailnews_citation_filter($cont, $weighted_str);
> 
>     gfilter::line_adjust_filter($cont);
>@@ -79,11 +79,12 @@
> 
> # Original of this code was contributed by <furukawa at tcp-ip.or.jp>. 
> sub mailnews_filter ($$$) {
>-    my ($contref, $weighted_str, $fields) = @_;
>+    my ($contref, $weighted_str, $headings, $fields) = @_;
> 
>     my $boundary = "";
>     my $line     = "";
>     my $partial  = 0;
>+    my $htmlmail = "";
> 
>     $$contref =~ s/^\s+//;
>     # Don't handle if first like does'nt seem like a mail/news header.
>@@ -125,6 +126,10 @@
>                 # contributed by Hiroshi Kato <tumibito at mm.rd.nttdata.co.jp>
>                 $partial = $1;
>                 util::dprint("((partial: $partial))\n");
>+            } elsif ($line =~ m!text/html!i) {
>+               # The simplest form of an HTML email message.
>+               util::dprint("text/html mail\n");
>+               $htmlmail = "yes";
>             } elsif ($line !~ m!text/plain!i) {
>                 $$contref = '';
>                 return;
>@@ -161,6 +166,9 @@
>        multipart_process($contref, $boundary, $weighted_str, $fields);
> 
>     }
>+    if ($htmlmail) {
>+       html::html_filter($contref, $weighted_str, $fields, $headings);
>+    }
> }
> 
> # Prototype declaration for avoiding
>
>
>Yukio USUDA
>
>  
>
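
For anyone reading this in the archive, the Content-Type check that the
patch changes can be sketched roughly as below. This is only an
illustration, not code from Namazu: classify_content_type is a
hypothetical helper, and the earlier branches of the real elsif chain
(multipart and partial messages, only partly visible in the diff
context) are left out. It shows the two branches visible in the hunk:
text/html is now accepted and flagged, and anything that is neither
text/html nor text/plain still has its body discarded.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of Namazu): mirrors the two branches of the
# patched elsif chain that are visible in the diff.
sub classify_content_type {
    my ($line) = @_;
    return 'html'  if $line =~ m!text/html!i;   # new branch: patch sets $htmlmail
    return 'plain' if $line =~ m!text/plain!i;  # already accepted before the patch
    return 'skip';                              # patch keeps this: body is emptied
}

# Sample header lines, made up for illustration.
for my $header ('Content-Type: text/html; charset="iso-8859-1"',
                'Content-Type: text/plain; charset=us-ascii',
                'Content-Type: application/pdf') {
    printf "%-50s => %s\n", $header, classify_content_type($header);
}
```

In the patch itself, the 'html' case only sets a flag; the body is
passed to html::html_filter at the end of mailnews_filter, which is why
$headings had to be added to the argument list.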


-- 
Thanks n Regards ,
Swati

To me vi is Zen.  To use vi is to practice zen. Every command is
a koan. Profound to the user, unintelligible to the uninitiated.
You discover truth everytime you use it.

