README.Charsets
-
- PostgreSQL Charsets README
- Josef Balatka, <balatka@email.cz>
- Draft v0.1, Tue Jul 20 15:49:07 CEST 1999
-
- This document is a brief overview of the national charset support
- implemented in PostgreSQL ver. 6.5. It describes the relevant
- compilation options and gives setup tips for particular uses.
-
- ---------------------------------------------------------------------------
-
- Table of Contents
-
- 1. Locale awareness
-
- 2. Single-byte charsets recoding
-
- 3. Multi-byte support/recoding
-
- 4. Credits
-
- ---------------------------------------------------------------------------
-
- 1. Locale awareness
-
- The PostgreSQL server supports both a locale-aware mode and a
- locale-unaware mode (the default). You choose the mode at the
- configuration stage of the installation with the --enable-locale option.
-
- Without --enable-locale, the multi-language code is not compiled and
- PostgreSQL behaves as an ASCII-compliant application. This mode is
- faster, but it is suitable only if you don't have to handle
- nation-specific characters.
- With --enable-locale you get a locale-aware server that uses the LC_*
- environment variables to determine how to process national specifics.
- In this case strcoll(3) and similar functions are used internally,
- so speed is somewhat lower.
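-
- As an illustration only (this is not PostgreSQL code), the sketch below
- uses Python's wrapper around strcoll(3) to show how the sort order
- follows the active LC_COLLATE setting. The helper name
- sort_with_current_locale is hypothetical. Only the "C" locale is used,
- because it is always available and orders strings by raw character code.

```python
import functools
import locale


def sort_with_current_locale(words):
    """Sort words with strcoll(3), i.e. by the active LC_COLLATE rules."""
    return sorted(words, key=functools.cmp_to_key(locale.strcoll))


# The "C" locale compares raw character codes, so every upper-case ASCII
# letter sorts before every lower-case one.
locale.setlocale(locale.LC_COLLATE, "C")
c_order = sort_with_current_locale(["b", "A", "a", "B"])
print(c_order)

# With a national locale installed on the system (the name depends on the
# platform), the same call would follow that language's collation rules.
```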
-
- Note that --enable-locale is sufficient when all your clients use
- the same single-byte encoding as the database server.
-
- When your clients use an encoding different from the server's, you
- must additionally use the --enable-recode or --with-mb=<encoding>
- option on the server side, or a client that does the recoding itself
- (e.g. there is a PostgreSQL ODBC driver for Win32 that can handle
- various Cyrillic encodings). The --with-mb=<encoding> option is
- required for multi-byte charset support.
-
-
- 2. Single-byte charsets recoding
-
- You can set up this feature with the --enable-recode option. The
- option is described as 'enable Cyrillic recode support', which
- understates what it can do: it can be used for recoding *any*
- single-byte charset.
-
- This method uses the charset.conf file located in the $PGDATA directory.
- It's a typical text configuration file in which spaces and newlines
- separate items and records, and # introduces a comment. Three keywords
- with the following syntax are recognized:
-
- BaseCharset <server_charset>
- RecodeTable <from_charset> <to_charset> <file_name>
- HostCharset <host_spec> <host_charset>
-
- BaseCharset defines the encoding of the database server. Charset
- names are used only for mapping inside charset.conf, so you can
- freely choose names that are easy to type.
-
- Each RecodeTable record specifies a translation table between the
- server and a client. The file name is relative to the $PGDATA
- directory. The table file format is very simple: there are no keywords,
- and each line holds a pair of decimal or hexadecimal (0x-prefixed)
- character values:
-
- <char_value> <translated_char_value>
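-
- A minimal sketch of reading such a table, written in Python for
- illustration and not taken from the PostgreSQL sources. The function
- names and the sample values are hypothetical. Two assumptions go beyond
- the text above: character values that are not listed map to
- themselves, and '#' comments and blank lines are skipped.

```python
def parse_value(token):
    """Parse one character value: decimal, or hexadecimal with a 0x prefix."""
    if token.lower().startswith("0x"):
        return int(token, 16)
    return int(token, 10)


def parse_recode_table(text):
    """Build a 256-entry byte translation table from table-file text."""
    table = bytearray(range(256))  # assumption: unlisted values stay unchanged
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # assumption: '#' starts a comment
        if not line:
            continue
        src, dst = (parse_value(tok) for tok in line.split())
        table[src] = dst
    return bytes(table)


# Made-up sample pairs, one <char_value> <translated_char_value> per line.
SAMPLE_TABLE = """
0xE1 0xA0
226 161
"""

table = parse_recode_table(SAMPLE_TABLE)
recoded = bytes([0x41, 0xE1, 226]).translate(table)
print(recoded.hex())
```

- bytes.translate() applies the whole table in one call, which mirrors
- the byte-for-byte nature of single-byte recoding.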
-
- HostCharset records map an IP address to a charset. You can use a single
- IP address, an IP mask range starting from the given address, or an IP
- interval (e.g. 127.0.0.1, 192.168.1.100/24, 192.168.1.20-192.168.1.40).
-
- The charset.conf file is always processed to the end, so you can easily
- specify exceptions to earlier rules. In src/data you will find an
- example charset.conf and a few recoding tables.
-
- Because this solution maps the client's IP address to a charset,
- it has some obvious restrictions. You can't use different
- encodings on the same host at the same time. It's also inconvenient
- when you boot your client hosts into several operating systems.
- Nevertheless, when these restrictions are not limiting and you don't
- need multi-byte chars, it's a simple and effective solution.
-
-
- 3. Multi-byte support/recoding
-
- This is a new generation of charset encoding support in PostgreSQL,
- designed as a more comprehensive solution that supports both
- single-byte and multi-byte chars. You can set up this feature with
- the --with-mb=<encoding> option.
-
- There is no IP mapping file; recoding is controlled through new
- SQL statements. Recoding tables are included in the code. Many national
- charsets are already supported and more will follow.
-
- See doc/README.mb and doc/README.mb.jp for detailed instructions on how
- to use the multibyte support. The file doc/README.locale contains
- specific instructions on using the multibyte support with Cyrillic.
-
-
- 4. Credits
-
- I'd like to thank the PostgreSQL development team and all contributors
- for creating PostgreSQL. Thanks to Oleg Bartunov, Oleg Broytmann and
- Tatsuo Ishii for opening the door into the multi-language world.
-