Namazu-devel-ja (old)



Re: File-MMagic (Re: ichitaro)



knok@xxxxxxxxxxxxx (NOKUBI Takatsugu) wrote:

>> It doesn't seem to work.
>
>  Hmm, that's strange... Oh, looking more closely, my own script was
>using checktype_contents(). Even so, that behavior is odd.

What happens when you run my script in your environment, Nokubi-san?


>  I am appending the script I use here at the end of this mail, so could
>you try it on your side as well?

I tried it. (I changed it to print the file names as well.)

  % cd tests/data/ja
  % mmagic *(.)
  Makefile: text/plain
  Makefile.am: text/plain
  Makefile.in: text/plain
  acrobat3.pdf: application/pdf
  acrobat4.pdf: application/pdf
  d20000220.hnf: text/plain
  excel97.xls: application/powerpoint
  html.html: text/html
  mail-multipart.txt: message/rfc822
  mail.txt: message/rfc822
  man.1: text/x-roff
  msg00000.html: text/html
  plain.txt: text/plain
  plain.txt.Z: application/x-compress
  plain.txt.bz2: application/x-bzip2
  plain.txt.gz: Compressed:text/plain
  rfc0000.txt: text/plain; x-type=rfc
  taro4.jsw: application/octet-stream
  taro5.jaw: application/octet-stream
  taro6.jbw: application/octet-stream
  tex.tex: application/x-tex
  word6.doc: application/msword
  word95.doc: application/msword
  word97.doc: application/msword
  word98.doc: application/msword

That is the result. The Word files are fine, but

  excel97.xls: application/powerpoint

is a problem.

Incidentally, with mknmz I get

  % ./mknmz ../tests/data/ja
  (snip)
  14/17 - /home/satoru/cvs/namazu/tests/data/ja/word6.doc Unsupported format: word7
  14/16 - /home/satoru/cvs/namazu/tests/data/ja/word95.doc Unsupported format: word7
  14/15 - /home/satoru/cvs/namazu/tests/data/ja/word97.doc [application/msword]
  15/15 - /home/satoru/cvs/namazu/tests/data/ja/word98.doc [application/msword]

so mknmz recognizes application/msword correctly. Apparently
checktype_filehandle and checktype_filename behave oddly
(or is that the intended behavior?).


>  Also, just to be sure, could you show me the output of running
>
># perl -MFile::MMagic -e '$m = new File::MMagic; print "$File::MMagic::VERSION\n"; $m->check_magic();'
>
>  as well?

I have attached it to this mail.

-- Satoru Takabayashi
0.20.5
0	string	=BZh	application/x-bzip2
0	string	=#VRML V1.0 ascii	model/vrml
0	string	=#VRML V2.0 utf8	model/vrml
0	short	=51966	
>2	short	=47806	application/java
0	string	=.snd	
>12	belong	=1	audio/basic
>12	belong	=2	audio/basic
>12	belong	=3	audio/basic
>12	belong	=4	audio/basic
>12	belong	=5	audio/basic
>12	belong	=6	audio/basic
>12	belong	=7	audio/basic
>12	belong	=23	audio/x-adpcm
0	lelong	=6583086	
>12	lelong	=1	audio/x-dec-basic
>12	lelong	=2	audio/x-dec-basic
>12	lelong	=3	audio/x-dec-basic
>12	lelong	=4	audio/x-dec-basic
>12	lelong	=5	audio/x-dec-basic
>12	lelong	=6	audio/x-dec-basic
>12	lelong	=7	audio/x-dec-basic
>12	lelong	=23	audio/x-dec-adpcm
8	string	=AIFF	audio/x-aiff	
8	string	=AIFC	audio/x-aiff	
8	string	=8SVX	audio/x-aiff	
0	string	=MThd	audio/unknown	
0	string	=CTMF	audio/unknown	
0	string	=SBI	audio/unknown	
0	string	=Creative Voice File	audio/unknown	
0	string	=RIFF	audio/x-msvideo	
>8	string	=WAVE	audio/x-wav	
0	string	=/* XPM	image/x-xbm
0	string	=/*	text/plain
0	string	=//	text/plain
0	string	=\037\235	application/x-compress
0	string	=\037\213	application/x-gzip
0	string	=\037\036	application/octet-stream
0	short	=7967	application/octet-stream
0	short	=8191	application/octet-stream
0	string	=\377\037	application/octet-stream
0	short	=51973	application/octet-stream
0	string	=<MakerFile	application/x-frame
0	string	=<MIFFile	application/x-frame
0	string	=<MakerDictionary	application/x-frame
0	string	=<MakerScreenFon	application/x-frame
0	string	=<MML	application/x-frame
0	string	=<Book	application/x-frame
0	string	=<Maker	application/x-frame
0	string	=<HEAD	text/html
0	string	=<head	text/html
0	string	=<TITLE	text/html
0	string	=<title	text/html
0	string	=<html	text/html
0	string	=<HTML	text/html
0	string	=<!--	text/html
0	string	=<h1	text/html
0	string	=<H1	text/html
0	string	=P1	image/x-portable-bitmap
0	string	=P2	image/x-portable-greymap
0	string	=P3	image/x-portable-pixmap
0	string	=P4	image/x-portable-bitmap
0	string	=P5	image/x-portable-greymap
0	string	=P6	image/x-portable-pixmap
0	string	=IIN1	image/x-niff
0	string	=MM	image/tiff
0	string	=II	image/tiff
0	string	=GIF94z	image/unknown
0	string	=FGF95a	image/unknown
0	string	=PBF	image/unknown
0	string	=GIF	image/gif
0	beshort	=65496	image/jpeg
0	string	=BM	image/bmp
0	string	=;;	text/plain
0	string	=\012(	application/x-elc
0	string	=;ELC	application/x-elc
0	string	=Relay-Version:	message/rfc822
0	string	=#! rnews	message/rfc822
0	string	=N#! rnews	message/rfc822
0	string	=Forward to	message/rfc822
0	string	=Pipe to	message/rfc822
0	string	=Return-Path:	message/rfc822
0	string	=Path:	message/news
0	string	=Xref:	message/news
0	string	=From:	message/rfc822
0	string	=Article	message/news
0	string	=þ7#	application/msword
0	string	=Û¥-	application/msword
0	string	=%!	application/postscript
0	string	=\004%!	application/postscript
0	string	=%PDF-	application/pdf
38	string	=Spreadsheet	application/x-sc
0	string	=÷	application/x-dvi
0	leshort	=759	application/x-dvi
0	string	={\rtf	application/rtf
0	string	=³	video/mpeg
0	byte	=1	video/unknown
0	byte	=2	video/unknown
0	string	=DOC	
>43	byte	=20	application/ichitaro4
>43	byte	=21	application/ichitaro5
>43	byte	=22	application/ichitaro6
0	string	=\320\317\021\340\241\261\032\341	
>48	byte	=27	application/excel
0	string	=\320\317\021\340\241\261\032\341	
>64	byte	=0	application/powerpoint
0	string	=\320\317\021\340\241\261\032\341	
>64	byte	=1	application/msword
0	belong	=435	video/mpeg
0	belong	=442	video/mpeg
0	beshort	&65504	audio/mpeg
0	string	=MOVI	video/quicktime
4	string	=moov	video/quicktime
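
Note on the table above: the three entries that share the same 8-byte signature (sub-tests at offset 48 = 27 for application/excel, offset 64 = 0 for application/powerpoint, offset 64 = 1 for application/msword) may be related to the excel97.xls result. The following is only a rough Python model, not File::MMagic's actual code. It assumes that entries are tried in the order listed and that the first matching sub-test wins. The names guess_ole2_type and make_header are hypothetical, and the byte values used are synthetic, not taken from the real test files.

```python
# Hypothetical model of the three entries that share the OLE2 signature.
# Assumption: rules are tried in dump order and the first match wins.

# Standard OLE2 compound document signature (8 bytes).
OLE2_MAGIC = bytes([0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1])

# (offset of the sub-test, expected byte value, MIME type), in dump order.
OLE2_SUBTESTS = [
    (48, 27, "application/excel"),
    (64, 0, "application/powerpoint"),
    (64, 1, "application/msword"),
]


def guess_ole2_type(data: bytes) -> str:
    """Return the MIME type of the first sub-test that matches."""
    if not data.startswith(OLE2_MAGIC):
        return "application/octet-stream"
    for offset, expected, mime in OLE2_SUBTESTS:
        if len(data) > offset and data[offset] == expected:
            return mime
    return "application/octet-stream"


def make_header(byte48: int, byte64: int) -> bytes:
    """Build a synthetic 128-byte header with chosen bytes at 48 and 64."""
    buf = bytearray(128)  # zero-filled
    buf[:8] = OLE2_MAGIC
    buf[48] = byte48
    buf[64] = byte64
    return bytes(buf)


if __name__ == "__main__":
    # Byte 48 is not 27 and byte 64 is 0: falls through to the powerpoint rule.
    print(guess_ole2_type(make_header(byte48=30, byte64=0)))
    # Byte 48 is 27: the excel rule matches first.
    print(guess_ole2_type(make_header(byte48=27, byte64=0)))
    # Byte 48 is not 27 and byte 64 is 1: the msword rule matches.
    print(guess_ole2_type(make_header(byte48=30, byte64=1)))
```

If this model is close to what the module does, any OLE2 file whose byte at offset 48 is not 27 and whose byte at offset 64 is 0 would be reported as application/powerpoint, whatever application produced it. Whether excel97.xls actually has those byte values has not been checked here.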