encoding.n
'\"
'\" Copyright (c) 1998 by Scriptics Corporation.
'\"
'\" See the file "license.terms" for information on usage and redistribution
'\" of this file, and for a DISCLAIMER OF ALL WARRANTIES.
'\"
'\" RCS: @(#) $Id: encoding.n,v 1.3.18.3 2004/10/27 14:23:56 dkf Exp $
'\"
.so man.macros
.TH encoding n "8.1" Tcl "Tcl Built-In Commands"
.BS
.SH NAME
encoding \- Manipulate encodings
.SH SYNOPSIS
\fBencoding \fIoption\fR ?\fIarg arg ...\fR?
.BE
.SH INTRODUCTION
.PP
Strings in Tcl are encoded using 16-bit Unicode characters.  Different
operating system interfaces or applications may generate strings in
other encodings such as Shift-JIS.  The \fBencoding\fR command helps
to bridge the gap between Unicode and these other formats.
.SH DESCRIPTION
.PP
Performs one of several encoding-related operations, depending on
\fIoption\fR.  The legal \fIoption\fRs are:
.TP
\fBencoding convertfrom\fR ?\fIencoding\fR? \fIdata\fR
Convert \fIdata\fR to Unicode from the specified \fIencoding\fR.  The
characters in \fIdata\fR are treated as binary data where the lower
8 bits of each character are taken as a single byte.  The resulting
sequence of bytes is treated as a string in the specified
\fIencoding\fR.  If \fIencoding\fR is not specified, the current
system encoding is used.
.TP
\fBencoding convertto\fR ?\fIencoding\fR? \fIstring\fR
Convert \fIstring\fR from Unicode to the specified \fIencoding\fR.
The result is a sequence of bytes that represents the converted
string.  Each byte is stored in the lower 8 bits of a Unicode
character.  If \fIencoding\fR is not specified, the current
system encoding is used.
.TP
\fBencoding names\fR
Returns a list containing the names of all of the encodings that are
currently available.
.TP
\fBencoding system\fR ?\fIencoding\fR?
Set the system encoding to \fIencoding\fR.  If \fIencoding\fR is
omitted then the command returns the current system encoding.  The
system encoding is used whenever Tcl passes strings to system calls.
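.PP
As a minimal sketch of how the options above fit together, the
following script converts a four-character string to UTF-8 bytes and
back again.  The variable names \fIoriginal\fR, \fIbytes\fR and
\fIrestored\fR are illustrative only, and the script assumes that the
\fButf-8\fR encoding is among those reported by \fBencoding names\fR.
.CS
# Build "cafe" with an e-acute (U+00E9) as the last character.
set original "caf[format %c 233]"

# Unicode string -> sequence of UTF-8 bytes.
set bytes [encoding convertto utf-8 $original]

# Sequence of UTF-8 bytes -> Unicode string.
set restored [encoding convertfrom utf-8 $bytes]

puts [string length $original]           ;# 4 characters
puts [string length $bytes]              ;# 5 bytes: e-acute needs two
puts [string equal $original $restored]  ;# 1

# Query, but do not change, the system encoding.
puts [encoding system]
.CE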
.SH EXAMPLE
.PP
It is common practice to write script files using a text editor that
produces output in the euc-jp encoding, which represents the ASCII
characters as single bytes and Japanese characters as two bytes.  This
makes it easy to embed literal strings that correspond to non-ASCII
characters by simply typing the strings in place in the script.
However, because the \fBsource\fR command always reads files using the
current system encoding, Tcl will only source such files correctly
when the system encoding matches the encoding used to write the file.
This tends not to be true in an internationalized setting.  For example,
if such a file was sourced in North America (where ISO8859-1 is normally
used), each byte in the file would be treated as a separate character
that maps to the 00 page in Unicode.  The resulting Tcl strings will
not contain the expected Japanese characters.  Instead, they will
contain a sequence of Latin-1 characters that correspond to the bytes
of the original string.  The \fBencoding\fR command can be used to
convert this string to the expected Japanese Unicode characters.  For
example,
.CS
set s [\fBencoding convertfrom\fR euc-jp "\\xA4\\xCF"]
.CE
would return the Unicode string "\\u306F", which is the Hiragana
letter HA.
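.PP
Another way to approach the same problem is to read the script through
a channel whose encoding has been set explicitly, and then evaluate the
text.  The sketch below assumes this is acceptable in place of
\fBsource\fR; the procedure name \fIsourceWithEncoding\fR and its
arguments are hypothetical and are not part of Tcl.  Unlike
\fBsource\fR, it does not update the value returned by
\fBinfo script\fR.
.CS
# Hypothetical helper: evaluate a script file stored in a known encoding.
proc sourceWithEncoding {fileName encodingName} {
    set chan [open $fileName r]
    # Decode the file's bytes with the given encoding, not the system one.
    fconfigure $chan -encoding $encodingName
    set script [read $chan]
    close $chan
    # Evaluate in the caller's scope, much as source would.
    uplevel 1 $script
}
.CE
A file saved as euc-jp (the file name here is a placeholder) could then
be loaded with:
.CS
sourceWithEncoding japanese.tcl euc-jp
.CE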
.SH "SEE ALSO"
Tcl_GetEncoding(3)
.SH KEYWORDS
encoding