Going down your list:

  • "Unicode" isn't an encoding, although unfortunately, a lot of documentation imprecisely uses it to refer to whichever Unicode encoding that particular system uses by default. On Windows and Java, this often means UTF-16; in many other places, it means UTF-8. Properly, Unicode refers to the abstract character set itself, not to any particular encoding.
  • UTF-16: 2 bytes per "code unit". This is the native format of strings in .NET, and generally in Windows and Java. Values outside the Basic Multilingual Plane (BMP) are encoded as surrogate pairs. (These are relatively rarely used - which is a good job, as very few developers get them right, I suspect. I very much doubt that I do.)
  • UTF-8: Variable length encoding, 1-4 bytes per code point. ASCII values are encoded as ASCII using 1 byte.
  • UTF-7: Usually used for mail encoding. Chances are if you think you need it and you're not doing mail, you're wrong. (That's just my experience of people posting in newsgroups etc - outside mail, it's really not widely used at all.)
  • UTF-32: Fixed width encoding using 4 bytes per code point. This isn't very efficient, but makes life easier outside the BMP. I have a .NET Utf32String class as part of my MiscUtil library, should you ever want it. (It's not been very thoroughly tested, mind you.)
  • ASCII: Single byte encoding only using the bottom 7 bits. (Unicode code points 0-127.) No accents etc.
  • ANSI: There's no one fixed ANSI encoding - there are lots of them. Usually when people say "ANSI" they mean "the default locale/codepage for my system" which is obtained viaEncoding.Default, and is often Windows-1252 but can be other locales.

There's more on my Unicode page and tips for debugging Unicode problems.

The other big resource of code is unicode.org which contains more information than you'll ever be able to work your way through - possibly the most useful bit is the code charts.

Unicode, UTF, ASCII, ANSI format differences的更多相关文章

  1. 字符集、字符编码、国际化、本地化简要总结(UNICODE/UTF/ASCII/GB2312/GBK/GB18030)

    PS:要转载请注明出处,本人版权所有. PS: 这个只是基于<我自己>的理解, 如果和你的原则及想法相冲突,请谅解,勿喷. 环境说明   普通的linux 和 普通的windows.    ...

  2. Unicode(UTF&UCS)深度历险

    Unicode(UTF&UCS)深度历险 计算机网络诞生后,大家慢慢地发现一个问题:一个字节放不下一个字符了!因为需要交流,本地化的文字需要能够被支持. 最初的字符集使用7bit来存储字符,因 ...

  3. UNICODE与ASCII

    1.ASCII的特点 ASCII 是用来表示英文字符的一种编码规范.每个ASCII字符占用1 个字节,因此,ASCII 编码可以表示的最大字符数是255(00H—FFH).这对于英文而言,是没有问题的 ...

  4. utf-8,Unicode和ASCII区别

    一.ASCII 码 我们知道,计算机内部,所有信息最终都是一个二进制值.每一个二进制位(bit)有0和1两种状态,因此八个二进制位就可以组合出256种状态,这被称为一个字节(byte).也就是说,一个 ...

  5. V++ MFC CEdit输出数组 UNICODE TO ASCII码

    MFC怎么在静态编辑框中输出数组 //字符转ASCII码void CUTF8Dlg::OnBnClickedButtonCharAscii(){ // TODO: 在此添加控件通知处理程序代码 Upd ...

  6. 创建文件夹并解决解决unicode和ASCII码转换的问题

    # -*- coding: UTF-8 -*-import sysimport timeimport os #解决unicode和ASCII码转换的问题reload(sys) #解决unicode和A ...

  7. c# Unicode 转换 ASCII

    /// <summary> /// Unicode 转换 ASCII /// </summary> /// <param name="theText" ...

  8. python2.7 处理unicode和ascii字符串混用问题

    python2.7默认的编码方式为ascii码,如下可以查询: import sys sys.getdefaultencoding() 如果直接在unicode和ascii字符串之间做计算.比较.连接 ...

  9. ValueError: All strings must be XML compatible: Unicode or ASCII, no NULL bytes or control characters

    转载请注明原文地址:https://www.cnblogs.com/cnodoo/p/9281366.html  ValueError: All strings must be XML compati ...

随机推荐

  1. winform窗体(五)——布局方式

    一.默认布局 ★可以加panel,也可以不加: ★通过鼠标拖动控件的方式,根据自己的想法布局.拖动控件的过程中,会有对齐的线,方便操作: ★也可选中要布局的控件,在工具栏中有对齐工具可供选择,也有调整 ...

  2. 《java JDK7 学习笔记》之类和对象

    1.在java中,要产生对象必须先定义类,类是对象的设计图,对象是类的实例.类定义时使用class关键词,建立实例对象要使用new关键词.以类名声明的变量,称为参考名称.参考变量或直接叫参考. 2.想 ...

  3. Linux 下编译升级 Python

    一.Centos下升级python3.4.3 1.下载安装 wget http://www.python.org/ftp/python/3.4.3/Python-3.4.3.tgz wget http ...

  4. Asp.Net MVC+BootStrap+EF6.0实现简单的用户角色权限管理5

    我们先直接拷贝下blank.html这个页面的代码,顺带先建立一个Home控制器,并添加Index视图.将代码拷贝进去. <!DOCTYPE html> <html lang=&qu ...

  5. 关于mysql数据库插入数据,不能插入中文和出现中文乱码问题

    首先,推荐一篇博客:http://www.cnblogs.com/sunzn/archive/2013/03/14/2960248.html 当时,我安装完mysql数据库后,新建一个数据库后插入数据 ...

  6. jquery——移动端滚动条插件iScroll.js

    官网:http://cubiq.org/iscroll-5 demo: 滚动刷新:http://cubiq.org/dropbox/iscroll4/examples/pull-to-refresh/ ...

  7. 使用IntelliJ IDEA搭建多maven模块JAVA项目

    一.新建项目和模块 步骤: 1. 新建一个项目,因为maven管理jar包非常方便,故此处建立一个maven项目:New Project->Maven->(Create from arch ...

  8. 【JavaScript Demo】回到顶部功能实现

    随着网站的不断发展,需要展示的内容也越来越丰富,这导致网页上能展示的内容越来越多.当内容堆积影响了用户体验,就需考虑如何提升用户体验.在这一系列的改动中,“回到顶部”的功能成为了一个经典. 1.页面布 ...

  9. UVA - 11137 Ingenuous Cubrency[背包DP]

    People in Cubeland use cubic coins. Not only the unit of currency iscalled a cube but also the coins ...

  10. P1546 最短网络 Agri-Net

    题目背景 农民约翰被选为他们镇的镇长!他其中一个竞选承诺就是在镇上建立起互联网,并连接到所有的农场.当然,他需要你的帮助. 题目描述 约翰已经给他的农场安排了一条高速的网络线路,他想把这条线路共享给其 ...