
Re: About html-split

At 13 Mar 2000 14:06:52 +0900,
小関 吉則 (KOSEKI Yoshinori) <kose@xxxxxxxxxxxxxxxxxx> wrote:

> > After all, isn't URL-encoding the correct thing to do?
> > # Though some say URL encoding is the web browser's job.
> Huh, really??
> Does
> become?


> As for actual browser implementations, it seems to work if you
> use EUC-JP, as in Okano-san's patch. So for Japanese-only cases
> I think that is fine for now.

I heard this secondhand, but I think it was Internet Explorer 4.x for Windows

> I had vaguely assumed that nothing other than ASCII can go in
> name="...", so I wanted to hear an explanation from someone who
> knows the details. :-)


2.2. URL Character Encoding Issues

   URLs are sequences of characters, i.e., letters, digits, and special
   characters. A URL may be represented in a variety of ways: e.g., ink
   on paper, or a sequence of octets in a coded character set. The
   interpretation of a URL depends only on the identity of the
   characters used.

   In most URL schemes, the sequences of characters in different parts
   of a URL are used to represent sequences of octets used in Internet
   protocols. For example, in the ftp scheme, the host name, directory
   name and file names are such sequences of octets, represented by
   parts of the URL.  Within those parts, an octet may be represented by
   the character which has that octet as its code within the US-ASCII
   [20] coded character set.

   In addition, octets may be encoded by a character triplet consisting
   of the character "%" followed by the two hexadecimal digits (from
   "0123456789ABCDEF") which form the hexadecimal value of the octet.
   (The characters "abcdef" may also be used in hexadecimal encodings.)
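To make the triplet rule concrete, here is a minimal sketch in Python. The function names `percent_encode_octet` and `percent_decode` are made up for this illustration and come from neither the RFC nor html-split. The decoder accepts both uppercase and lowercase hex digits, as the paragraph above allows.

```python
def percent_encode_octet(octet: int) -> str:
    """Return the "%XX" triplet for one octet (hypothetical helper)."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("an octet must be in the range 0-255")
    # Two uppercase hexadecimal digits, zero-padded.
    return "%{:02X}".format(octet)


def percent_decode(text: str) -> bytes:
    """Turn %XX triplets back into octets; other characters are taken
    as their US-ASCII codes (hypothetical helper)."""
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] == "%":
            hex_digits = text[i + 1:i + 3]
            if len(hex_digits) != 2:
                raise ValueError("truncated %XX triplet")
            # int(..., 16) accepts "abcdef" as well as "ABCDEF".
            out.append(int(hex_digits, 16))
            i += 3
        else:
            code = ord(text[i])
            if code > 0x7F:
                raise ValueError("non-ASCII character outside a triplet")
            out.append(code)
            i += 1
    return bytes(out)
```

For example, `percent_decode("%7E")` and `percent_decode("%7e")` give the same single octet.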

   Octets must be encoded if they have no corresponding graphic
   character within the US-ASCII coded character set, if the use of the
   corresponding character is unsafe, or if the corresponding character
   is reserved for some other interpretation within the particular URL
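Applied to the anchor-name question in this thread, the three conditions above could be sketched in Python roughly as follows. This is only an illustration under assumptions: the unsafe and reserved character lists are taken from the rest of RFC 1738 section 2.2 (not quoted here), the name `encode_octets` and its `keep_reserved` flag are invented for the example, and EUC-JP is chosen only because the patch discussed above uses it.

```python
# Character sets as listed in RFC 1738 section 2.2 (assumed; not quoted above).
UNSAFE = set(' <>"#%{}|\\^~[]`')
RESERVED = set(";/?:@=&")


def encode_octets(data: bytes, keep_reserved: bool = False) -> str:
    """Percent-encode every octet that the rules say must be encoded.

    keep_reserved=True leaves reserved characters alone, for the case
    where they are being used for their reserved purpose.
    """
    out = []
    for octet in data:
        ch = chr(octet)
        # Graphic US-ASCII characters are 0x21-0x7E; everything else
        # (controls, space, DEL, and 0x80-0xFF) has no graphic character.
        is_graphic = 0x21 <= octet <= 0x7E
        if (not is_graphic
                or ch in UNSAFE
                or (ch in RESERVED and not keep_reserved)):
            out.append("%{:02X}".format(octet))
        else:
            out.append(ch)
    return "".join(out)


# A Japanese anchor name, converted to EUC-JP octets first.
anchor = "日本語"
encoded = encode_octets(anchor.encode("euc_jp"))
print(encoded)
```

Every octet of EUC-JP Japanese text is 0x80 or above, so each one comes out as a triplet and the resulting name="..." value is pure ASCII.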


Hirose Yoshihide