10.1.5 Connection Character Sets and Collations

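Several character set and collation system variables relate to a client's interaction with the server. Some of these have been mentioned in earlier sections:

  • The server character set and collation are the values of the character_set_server and collation_server system variables.

  • The character set and collation of the default database are the values of the character_set_database and collation_database system variables.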

Additional character set and collation system variables are involved in handling traffic for the connection between a client and the server. Every client has connection-related character set and collation system variables.

A “connection” is what you make when you connect to the server. The client sends SQL statements, such as queries, over the connection to the server. The server sends responses, such as result sets or error messages, over the connection back to the client. This leads to several questions about character set and collation handling for client connections, each of which can be answered in terms of system variables:

  • What character set is the statement in when it leaves the client?

    The server takes the character_set_client system variable to be the character set in which statements are sent by the client.

  • What character set should the server translate a statement to after receiving it?

    For this, the server uses the character_set_connection and collation_connection system variables. It converts statements sent by the client from character_set_client to character_set_connection (except for string literals that have an introducer such as _latin1 or _utf8). collation_connection is important for comparisons of literal strings. For comparisons of strings with column values, collation_connection does not matter because columns have their own collation, which has a higher collation precedence.

  • What character set should the server translate to before shipping result sets or error messages back to the client?

    The character_set_results system variable indicates the character set in which the server returns query results to the client. This includes result data such as column values, and result metadata such as column names and error messages.
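
The three questions above can be pictured as three conversion points. The following Python sketch is only a model, not MySQL code: the function server_round_trip is hypothetical, Python codec names stand in for MySQL character set names, and the "server" simply echoes the statement back instead of executing it.

```python
# Hypothetical model of the three conversion points described above.
# Python codec names stand in for MySQL character set names; this is not MySQL code.

def server_round_trip(statement_bytes, character_set_client,
                      character_set_connection, character_set_results):
    # 1. The server takes the incoming bytes to be in character_set_client.
    text = statement_bytes.decode(character_set_client)
    # 2. It converts the statement to character_set_connection for processing.
    in_connection_charset = text.encode(character_set_connection)
    # 3. It converts what it sends back to character_set_results.
    processed = in_connection_charset.decode(character_set_connection)
    return processed.encode(character_set_results)


statement = "SELECT 'привет'"
sent = statement.encode("cp1251")  # the bytes that actually leave the client

# The client declared its character set correctly, so the text survives.
echoed = server_round_trip(sent, "cp1251", "utf-8", "cp1251")

# The client sent cp1251 bytes but declared latin1, so the server misreads them.
misread = server_round_trip(sent, "latin_1", "utf-8", "utf-8")
```

In this model, echoed is byte-for-byte what the client sent, while misread no longer decodes to the original text. That is the practical reason character_set_client must match the encoding the client really uses.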

Clients can fine-tune the settings for these variables, or depend on the defaults (in which case, you can skip the rest of this section). If you do not use the defaults, you must change the character settings for each connection to the server.

Two statements affect the connection-related character set variables as a group:

  • SET NAMES 'charset_name' [COLLATE 'collation_name']

    SET NAMES indicates what character set the client will use to send SQL statements to the server. Thus, SET NAMES 'cp1251' tells the server, “future incoming messages from this client are in character set cp1251.” It also specifies the character set that the server should use for sending results back to the client. (For example, it indicates what character set to use for column values if you use a SELECT statement.)

    A SET NAMES 'charset_name' statement is equivalent to these three statements:

    SET character_set_client = charset_name;
    SET character_set_results = charset_name;
    SET character_set_connection = charset_name;

    Setting character_set_connection to charset_name also implicitly sets collation_connection to the default collation for charset_name. It is unnecessary to set that collation explicitly. To specify a particular collation, use the optional COLLATE clause:

    SET NAMES 'charset_name' COLLATE 'collation_name'
    
  • SET CHARACTER SET charset_name

    SET CHARACTER SET is similar to SET NAMES but sets character_set_connection and collation_connection to character_set_database and collation_database. A SET CHARACTER SET charset_name statement is equivalent to these three statements:

    SET character_set_client = charset_name;
    SET character_set_results = charset_name;
    SET collation_connection = @@collation_database;

    Setting collation_connection also implicitly sets character_set_connection to the character set associated with the collation (equivalent to executing SET character_set_connection = @@character_set_database). It is unnecessary to set character_set_connection explicitly.
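
The difference between the two statements is easiest to see side by side. The sketch below models the session variables as a plain Python dictionary. The helper names set_names and set_character_set are hypothetical, and DEFAULT_COLLATION is a small illustrative subset of the default collations, assumed here to match MySQL 5.6.

```python
# Hypothetical model of how the two statements change the session variables.
# DEFAULT_COLLATION is a small illustrative subset, not the server's full table.
DEFAULT_COLLATION = {
    "latin1": "latin1_swedish_ci",
    "utf8": "utf8_general_ci",
    "koi8r": "koi8r_general_ci",
    "cp1251": "cp1251_general_ci",
}


def set_names(session, charset, collation=None):
    """Model of SET NAMES 'charset' [COLLATE 'collation']."""
    updated = dict(session)
    updated["character_set_client"] = charset
    updated["character_set_results"] = charset
    updated["character_set_connection"] = charset
    # Without COLLATE, the default collation of the character set is used.
    updated["collation_connection"] = collation or DEFAULT_COLLATION[charset]
    return updated


def set_character_set(session, charset):
    """Model of SET CHARACTER SET charset."""
    updated = dict(session)
    updated["character_set_client"] = charset
    updated["character_set_results"] = charset
    # The connection settings follow the default database, not charset.
    updated["collation_connection"] = session["collation_database"]
    updated["character_set_connection"] = session["character_set_database"]
    return updated


session = {
    "character_set_database": "latin1",
    "collation_database": "latin1_swedish_ci",
}
after_set_names = set_names(session, "utf8")
after_set_character_set = set_character_set(session, "utf8")
```

With a latin1 default database, after_set_names has character_set_connection set to utf8, whereas after_set_character_set keeps it at latin1. Only the client and results character sets become utf8 in the second case.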

Note

ucs2, utf16, utf16le, and utf32 cannot be used as a client character set, which means that they do not work for SET NAMES or SET CHARACTER SET.

The MySQL client programs mysql, mysqladmin, mysqlcheck, mysqlimport, and mysqlshow determine the default character set to use as follows:

  • In the absence of other information, the programs use the compiled-in default character set, usually latin1.

  • The programs can autodetect which character set to use based on the operating system setting, such as the value of the LANG or LC_ALL locale environment variable on Unix systems or the code page setting on Windows systems. For systems on which the locale is available from the OS, the client uses it to set the default character set rather than using the compiled-in default. For example, setting LANG to ru_RU.KOI8-R causes the koi8r character set to be used. Thus, users can configure the locale in their environment for use by MySQL clients.

    The OS character set is mapped to the closest MySQL character set if there is no exact match. If the client does not support the matching character set, it uses the compiled-in default. For example, ucs2 is not supported as a connection character set.

    C applications can use character set autodetection based on the OS setting by invoking mysql_options() as follows before connecting to the server:

    mysql_options(mysql,
                  MYSQL_SET_CHARSET_NAME,
                  MYSQL_AUTODETECT_CHARSET_NAME);

  • The programs support a --default-character-set option, which enables users to specify the character set explicitly to override whatever default the client otherwise determines.
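
The order of precedence in the list above can be sketched as follows. This is a rough illustration under stated assumptions, not the client library's implementation: choose_default_charset and OS_TO_MYSQL are hypothetical, the mapping table is deliberately partial, and the sketch parses the locale string directly (with LC_ALL taking priority over LANG, as POSIX specifies), whereas a real client obtains the codeset from the operating system.

```python
# Hypothetical sketch of the precedence: explicit option, then OS locale,
# then the compiled-in default. Not the MySQL client library's actual code.
COMPILED_IN_DEFAULT = "latin1"

# Partial, illustrative mapping from OS codeset names to MySQL character sets.
OS_TO_MYSQL = {
    "KOI8-R": "koi8r",
    "ISO-8859-1": "latin1",
    "ISO-8859-2": "latin2",
    "UTF-8": "utf8",
    "CP1251": "cp1251",
}


def choose_default_charset(environ, option=None):
    # 1. --default-character-set overrides everything else.
    if option:
        return option
    # 2. Otherwise look at the locale; LC_ALL overrides LANG.
    locale_value = environ.get("LC_ALL") or environ.get("LANG") or ""
    # A locale looks like language_TERRITORY.CODESET@modifier.
    codeset = locale_value.partition(".")[2].partition("@")[0].upper()
    # 3. Fall back to the compiled-in default if nothing matches.
    return OS_TO_MYSQL.get(codeset, COMPILED_IN_DEFAULT)


from_locale = choose_default_charset({"LANG": "ru_RU.KOI8-R"})
from_option = choose_default_charset({"LANG": "ru_RU.KOI8-R"}, option="utf8")
from_nothing = choose_default_charset({})
```

Here from_locale is koi8r, matching the LANG example in the text, from_option is utf8 because the explicit option wins, and from_nothing falls back to latin1.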

When a client connects to the server, it sends the name of the character set that it wants to use. The server uses the name to set the character_set_client, character_set_results, and character_set_connection system variables. In effect, the server performs a SET NAMES operation using the character set name.

With the mysql client, to use a character set different from the default, you could explicitly execute SET NAMES every time you start up. To accomplish the same result more easily, add the --default-character-set option setting to your mysql command line or to your option file. For example, the following option file setting sets the three connection-related character set variables to koi8r each time you invoke mysql:

[mysql]
default-character-set=koi8r

If you are using the mysql client with auto-reconnect enabled (which is not recommended), it is preferable to use the charset command rather than SET NAMES. For example:

mysql> charset utf8
Charset changed

The charset command issues a SET NAMES statement, and also changes the default character set that mysql uses when it reconnects after the connection has dropped.

Example: Suppose that column1 is defined as CHAR(5) CHARACTER SET latin2. If you do not say SET NAMES or SET CHARACTER SET, then for SELECT column1 FROM t, the server sends back all the values for column1 using the character set that the client specified when it connected. On the other hand, if you say SET NAMES 'latin1' or SET CHARACTER SET latin1 before issuing the SELECT statement, the server converts the latin2 values to latin1 just before sending results back. Conversion may be lossy if there are characters that are not in both character sets.
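
The lossy case can be reproduced outside the server. In this minimal sketch, Python's iso8859_2 and latin_1 codecs stand in for MySQL's latin2 and latin1 (an approximation, since MySQL's latin1 is closer to cp1252), and errors="replace" mimics the substitution of a question mark for characters that have no equivalent in the target character set.

```python
# Sketch of a lossy latin2 -> latin1 conversion using Python codecs as stand-ins.
stored_latin2 = "Łódź".encode("iso8859_2")  # bytes as stored in the latin2 column

# Convert to latin1 for the client; characters missing from latin1 become "?".
sent_latin1 = stored_latin2.decode("iso8859_2").encode("latin_1", errors="replace")

# What a latin1 client would display.
displayed = sent_latin1.decode("latin_1")
```

Only ó and d exist in both character sets, so displayed is "?ód?". The original Ł and ź cannot be recovered from the converted value.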

If you want the server to perform no conversion of result sets or error messages, set character_set_results to NULL or binary:

SET character_set_results = NULL;

To see the values of the character set and collation system variables that apply to your connection, use these statements:

SHOW VARIABLES LIKE 'character_set%';
SHOW VARIABLES LIKE 'collation%';

You must also consider the environment within which your MySQL applications execute. See Section 10.1.6, “Configuring the Character Set and Collation for Applications”.

http://dev.mysql.com/doc/refman/5.6/en/charset-connection.html
