namazu-ml(avocado)

Re: Searching PDF and other formats

From: Satoru Takabayashi <ccsatoru@xxxxxxxxxxxxxxxxxx>
Date: Sat, 10 Oct 1998 23:47:42 +0900
X-ml-name: namazu
X-mail-count: 01318
References: <19981010133036J.kunito@sachiko.hal.t.u-tokyo.ac.jp>

Gorochan ^o^ <kunito@xxxxxxxxxxxxxxxxxxx> wrote:

>There are many formats out there, such as Ichitaro documents and Word,
>so if mknmz is going to extract text from PDF, how about creating
>something like .mime.types and defining filters in it?

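Something along those lines already exists, in a limited form. mknmz
contains a definition like this:

| ## Mapping from suffix to helper program (man is an exception)
| %HELPER_PROGRAMS = (
|     'gz'  => '/bin/zcat',
|     'Z'   => '/bin/zcat',
|     'man' => '/usr/bin/jgroff -man -Tnippon',
| );

A filter must support both of the following calling conventions
(kekka, "result", is just the output file):

1. Take a file name as an argument and write the result to standard output.

    % filter filename > kekka

2. Read from standard input and write the result to standard output.

    % cat filename | filter > kekka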
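
As a rough sketch of a filter that supports both calling conventions,
assuming a POSIX shell: the names my_filter and extract_text are
hypothetical, and the tr call is only a placeholder for real text
extraction, not something shipped with mknmz.

```shell
# Hypothetical conversion step: a stand-in for real text extraction.
# Here it only strips carriage returns from its standard input.
extract_text() {
    tr -d '\r'
}

# Hypothetical filter entry point supporting both calling conventions.
my_filter() {
    if [ "$#" -ge 1 ]; then
        # Form 1: a file name was given as an argument.
        extract_text < "$1"
    else
        # Form 2: no argument, so read from standard input.
        extract_text
    fi
}

# When saved as a standalone script, the last line would be:
#   my_filter "$@"
```

Keeping the conversion in one function means both forms produce identical
output, which is what lets the same program be used either way.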


>That way, even a site that uses its own in-house format could be
>supported just by writing a filter.

Yes, exactly.

-- Satoru Takabayashi