w

Hi,

You can check and compare sort orders provided by these two collations here:

http://www.collation-charts.org/mysql60/mysql604.utf8_general_ci.european.html 
http://www.collation-charts.org/mysql60/mysql604.utf8_unicode_ci.european.html

utf8_general_ci is a very simple collation. What it does - it just 
- removes all accents 
- then converts to upper case 
and uses the code of this sort of "base letter" result letter to compare.

For example, these Latin letters: ÀÁÅåāă (and all other Latin letters "a" 
with any accents and in any cases) are all compared as equal to "A".

utf8_unicode_ci uses the default Unicode collation element table (DUCET).

The main differences are:

1. utf8_unicode_ci supports so called expansions and ligatures, for example: 
German letter ß (U+00DF LETTER SHARP S) is sorted near "ss" 
Letter Œ (U+0152 LATIN CAPITAL LIGATURE OE) is sorted near "OE".

utf8_general_ci does not support expansions/ligatures, it sorts 
all these letters as single characters, and sometimes in a wrong order.

2. utf8_unicode_ci is *generally* more accurate for all scripts. 
For example, on Cyrillic block: 
utf8_unicode_ci is fine for all these languages: 
Russian, Bulgarian, Belarusian, Macedonian, Serbian, and Ukrainian. 
While utf8_general_ci is fine only for Russian and Bulgarian subset of Cyrillic. 
Extra letters used in Belarusian, Macedonian, Serbian, and Ukrainian 
are sorted not well.

The disadvantage of utf8_unicode_ci is that it is a little bit 
slower than utf8_general_ci.

So when you need better sorting order - use utf8_unicode_ci, 
and when you utterly interested in performance - use utf8_general_ci.

Character Sets, Collation, Unicode :: utf8_unicode_ci vs utf8_general_ci的更多相关文章

  1. 3个问题:MySQL 中 character set 与 collation 的理解;utf8_general_ci 与 utf8_unicode_ci 区别;uft8mb4 默认collation:utf8mb4_0900_ai_ci 的含义

    MySQL 中 character set 与 collation 的理解 出处:https://www.cnblogs.com/EasonJim/p/8128196.html 推荐: 编码使用 uf ...

  2. mysql补充(1)校对集utf8_unicode_ci与utf8_general_ci

    创建数据库并设置编码utf-8 多语言(补充1 2) create database mydb default character set utf8 collate utf8_general_ci; ...

  3. Firebird Character Sets and Collations

    Firebird Character Sets and Collations Every CHAR or VARCHAR field can (or, better: must) have a cha ...

  4. 【转】Mysql中的排序规则utf8_unicode_ci、utf8_general_ci的区别总结

    Mysql中utf8_general_ci与utf8_unicode_ci有什么区别呢?在编程语言中,通常用unicode对中文字符做处理,防止出现乱码,那么在MySQL里,为什么大家都使用utf8_ ...

  5. Mysql中的排序规则utf8_unicode_ci、utf8_general_ci的区别总结

    Mysql中utf8_general_ci与utf8_unicode_ci有什么区别呢?在编程语言中,通常用unicode对中文字符做处理,防止出现乱码,那么在MySQL里,为什么大家都使用utf8_ ...

  6. 10.1.5 Connection Character Sets and Collations

    10.1.5 Connection Character Sets and Collations Several character set and collation system variables ...

  7. utf8_unicode_ci与utf8_general_ci

    下面摘录一下Mysql 5.1中文手册中关于utf8_unicode_ci与utf8_general_ci的说明: 当前,utf8_unicode_ci校对规则仅部分支持Unicode校对规则算法.一 ...

  8. Mysql中的排序规则utf8_unicode_ci、utf8_general_ci总结

    Mysql中utf8_general_ci与utf8_unicode_ci有什么区别呢?在编程语言中,通常用unicode对中文字符做处理,防止出现乱码,那么在MySQL里,为什么大家都使用utf8_ ...

  9. utf8_unicode_ci、utf8_general_ci区别

    摘录一下Mysql 5.1中文手册中关于utf8_unicode_ci与utf8_general_ci的说明:   当前,utf8_unicode_ci校对规则仅部分支持Unicode校对规则算法.一 ...

随机推荐

  1. ultragrid checkbox

    울트라그리드에 체크박스 넣을 사용하는 속성. cols["checked"].Header.Caption = ""; cols["checked ...

  2. 从12306网站新验证码看Web验证码设计与破解

    2015年3月16日,铁路官方购票网站12306又出新招,在登录界面推出了全新的验证方式,用户在填写好登录名和密码之后,还要准确的选取图片验证码才能登陆成功.据悉,12306验证码改版后,目前所有抢票 ...

  3. bitnami-redmineserver迁移

    1. 背景 在Redmineserver迁移过程中.假设前后两个Redmine的版本号一样,事情就简单,假设版本号不一样,就有可能面临两个版本号数据库不兼容.那就比較麻烦了.本文旨在介绍数据库不兼容时 ...

  4. 【转载】Oracle数据字典详解

    转自:http://czmmiao.iteye.com/blog/1258462 Oracle数据字典概述 数据库是数据的集合,数据库维护和管理这用户的数据,那么这些用户数据表都存在哪里,用户的信息是 ...

  5. tensorflow 之模型的保存与加载(一)

    怎样让通过训练的神经网络模型得以复用? 本文先介绍简单的模型保存与加载的方法,后续文章再慢慢深入解读. #!/usr/bin/env python3 #-*- coding:utf-8 -*- ### ...

  6. node多项目同时运行,nginx端口监听转发

    在服务器端安装pm2 npm install npm2 -g --save 之后再项目目录下运行 pm2 start app.js 在查看进程,是否已经启动 pm2 list 多个项目,我们只要监听端 ...

  7. Redis 的 fields 遇到的问题

    问题描述:本地和测试环境同使用一台redis服务器,本地环境和测试环境使用 key,fileds,value 中的fileds 来区分,例如 key fields value 004920c6eba1 ...

  8. IPC之信号量

    无名信号量 POSIX标准提出了有名信号量和无名信号量来同步进程和线程,而linux(2.6以前)只实现了无名信号量. sem_overview中有详细介绍:man 7 sem_overview. S ...

  9. FAT,FAT32,NTFS单目录文件数量限制

    http://hi.baidu.com/huaxinchang/item/5ba53ba9b29631756dd4551b —————————————————————————————————————— ...

  10. 更改MVC注册Areas的顺序,掌控Areas的运作

    [转自:http://www.cnblogs.com/dozer/archive/2010/04/14/change-order-of-MVC-Areas.html] 一.前言 首先,有人要问,为什么 ...