[Namazu-users-en] Re: meta-information on large binary files

Alexander Oelzant alexander at oelzant.priv.at
Tue Jul 11 21:49:24 JST 2006

On Tue, Jul 11, 2006 at 01:58:37PM +0900, NOKUBI Takatsugu wrote:
> At Sat, 8 Jul 2006 20:13:42 +0200,
> Alexander Oelzant wrote:
> > Is there any possibility (or planned feature) to have namazu read just a
> > few Kb of a file in order to extract metadata? In analogy to the mp3
> > filter I've written an ogg plugin, but for the large radio recordings
> > it's prohibitively slow.
> If the target files are all of one media type, you can do it as follows:
> $ mknmz -O indexdir -t audio/mpeg target-dir
> The -t (--media-type) option skips reading the target file to find its
> binary signature.

Thanks, but unfortunately the filter still has to read in the entire
file for indexing. For a 200 MB file, that produces processes like this:
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND           
22312 user      15  10 1598m 592m 1104 R  0.3 39.2   0:08.19 mknmz             

With enough swap, it takes about twenty minutes to extract the 5 lines
of data and insert those in the db ;-)

I was hoping that $ON_MEMORY_MAX = 5000000; would take care of that,
e.g. by only reading in the first part of the file. According to
tips.html, though, it only influences the size of the db files kept in
memory. That is only logical: redesigning namazu to read files chunk by
chunk would probably involve rewriting all the filters to use a
read_chunk() function instead of accessing $$contref directly.
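
To illustrate the "read just a few Kb" idea outside of namazu, here is a
minimal sketch in Python (not namazu's Perl filter API). The names
read_head, parse_vorbis_comments and HEAD_LIMIT are hypothetical. It
assumes the Vorbis comment header sits within the first 64 KB and inside
a single Ogg page, so its bytes are contiguous; files with very large
tags (e.g. embedded cover art) would break that assumption.

```python
import struct

HEAD_LIMIT = 64 * 1024  # hypothetical cap, not a namazu setting


def read_head(path, limit=HEAD_LIMIT):
    """Return at most `limit` bytes from the start of `path`."""
    with open(path, "rb") as fh:
        return fh.read(limit)


def parse_vorbis_comments(head):
    """Extract KEY=value tags from a Vorbis comment header in `head`.

    Assumes the comment packet is contiguous inside `head`.
    Returns whatever was parsed before the data ran out ({} if nothing).
    """
    tags = {}
    marker = b"\x03vorbis"  # packet type 3 followed by the codec name
    pos = head.find(marker)
    if pos < 0:
        return tags
    pos += len(marker)
    try:
        # Vendor string: 32-bit little-endian length, then the bytes.
        (vendor_len,) = struct.unpack_from("<I", head, pos)
        pos += 4 + vendor_len
        # Number of user comments that follow.
        (count,) = struct.unpack_from("<I", head, pos)
        pos += 4
        for _ in range(count):
            (length,) = struct.unpack_from("<I", head, pos)
            pos += 4
            raw = head[pos:pos + length]
            if len(raw) < length:
                break  # comment cut off by the read limit
            pos += length
            key, sep, value = raw.decode("utf-8", "replace").partition("=")
            if sep:
                tags[key.upper()] = value
    except struct.error:
        pass  # header truncated; keep what was parsed so far
    return tags
```

Called as parse_vorbis_comments(read_head("recording.ogg")), this touches
only the first 64 KB regardless of the file size, which is the behaviour
a chunk-aware filter would need.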


Alexander Oelzant (Durchlaufstr. 7/4/5, A-1200 Wien)
alexander at oelzant.priv.at aoe at fsinf.htu.tuwien.ac.at
       ex-internic, ripe, bofh, priv.at: !ao418
            +43 1 3500929 +43 676 84441065                                  McQ
