+ (NSString *)replaceUnicode:(NSString *)unicodeStr { 

NSString *tempStr1 = [unicodeStrstringByReplacingOccurrencesOfString:@"\\u"withString:@"\\U"];
NSString *tempStr2 = [tempStr1stringByReplacingOccurrencesOfString:@"\""withString:@"\\\""];
NSString *tempStr3 = [[@"\""stringByAppendingString:tempStr2]stringByAppendingString:@"\""];
NSData *tempData = [tempStr3dataUsingEncoding:NSUTF8StringEncoding];
NSString* returnStr = [NSPropertyListSerializationpropertyListFromData:tempData
mutabilityOption:NSPropertyListImmutable
format:NULL
errorDescription:NULL]; return [returnStrstringByReplacingOccurrencesOfString:@"\\r\\n"withString:@"\n"]; }

汉字与utf8相互转化

NSString* strA = [@"%E4%B8%AD%E5%9B%BD"stringByReplacingPercentEscapesUsingEncoding:NSUTF8StringEncoding];
NSString *strB = [@"中国"stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding];

NSString 转化为utf8

NSString *strings = [NSStringstringWithFormat:@"abc"];

NSLog(@"strings : %@",strings);

CF_EXPORT
CFStringRef CFURLCreateStringByAddingPercentEscapes(CFAllocatorRef allocator,CFStringReforiginalString,CFStringRef charactersToLeaved, CFStringReflegalURLCharactersToBeEscaped,CFStringEncoding encoding); NSString *encodedValue = (__bridge NSString*)CFURLCreateStringByAddingPercentEscapes(nil, (__bridgeCFStringRef)strings,nil, (CFStringRef)@"!*'();:@&=+$,/?%#[]",kCFStringEncodingUTF8);

iso8859-1 到 unicode编码转换

+ (NSString *)changeISO88591StringToUnicodeString:(NSString *)iso88591String
{ NSMutableString *srcString = [[[NSMutableString alloc]initWithString:iso88591String] autorelease]; [srcString replaceOccurrencesOfString:@"&" withString:@"&" options:NSLiteralSearch range:NSMakeRange(, [srcString length])];
[srcString replaceOccurrencesOfString:@"&#x" withString:@"" options:NSLiteralSearch range:NSMakeRange(, [srcString length])]; NSMutableString *desString = [[[NSMutableString alloc]init] autorelease]; NSArray *arr = [srcString componentsSeparatedByString:@";"]; for(int i=;i<[arr count]-;i++){ NSString *v = [arr objectAtIndex:i];
char *c = malloc();
int value = [StringUtil changeHexStringToDecimal:v];
c[] = value &0x00FF;
c[] = value >> &0x00FF;
c[] = '\0';
[desString appendString:[NSString stringWithCString:c encoding:NSUnicodeStringEncoding]];
free(c);
} return desString;
}

Q: Is there a standard method to package a Unicode character so it fits an 8-Bit ASCII stream?

A: There are three or four options for making Unicode fit into an 8-bit format.

a) Use UTF-8. This preserves ASCII, but not Latin-1, because the characters >127 are different from Latin-1. UTF-8 uses the bytes in the ASCII only for ASCII characters. Therefore, it works well in any environment where ASCII characters have a significance as syntax characters, e.g. file name syntaxes, markup languages, etc., but where the all other characters may use arbitrary bytes. 
Example: “Latin Small Letter s with Acute” (015B) would be encoded as two bytes: C5 9B.

b) Use Java or C style escapes, of the form \uXXXXX or \xXXXXX. This format is not standard for text files, but well defined in the framework of the languages in question, primarily for source files.
Example: The Polish word “wyjście” with character “Latin Small Letter s with Acute” (015B) in the middle (ś is one character) would look like: “wyj\u015Bcie".

c) Use the &#xXXXX; or &#DDDDD; numeric character escapes as in HTML or XML. Again, these are not standard for plain text files, but well defined within the framework of these markup languages.
Example: “wyjście” would look like “wyjście"

d) Use SCSU. This format compresses Unicode into 8-bit format, preserving most of ASCII, but using some of the control codes as commands for the decoder. However, while ASCII text will look like ASCII text after being encoded in SCSU, other characters may occasionally be encoded with the same byte values, making SCSU unsuitable for 8-bit channels that blindly interpret any of the bytes as ASCII characters.
Example: “ wyjÛcie” where indicates the byte 0x12 and “Û” corresponds to byte 0xDB. [AF] & [KW]

如c所描述,这是一种“未标准"但广泛采用的做法,说是山寨编码也行 :-)

所以编码过程是

字符串 -> Unicode编码 -> &#xXXXX; or &#DDDDD;

解码过程反过来即可

http://unicode.org/faq/utf_bom.html#General

Unicode转化为汉字的更多相关文章

  1. 如何利用java把文件中的Unicode字符转换为汉字

    有些文件中存在Unicode字符和非Unicode字符,如何利用java快速的把文件中的Unicode字符转换为汉字而不影响文件中的其他字符呢, 我们知道虽然java 在控制台会把Unicode字符直 ...

  2. python判断unicode是否是汉字,数字,英文,或者其他字符

    下面这个小工具包含了 判断unicode是否是汉字,数字,英文,或者其他字符. 全角符号转半角符号. unicode字符串归一化等工作. 还有一个能处理多音字的汉字转拼音的程序,还在整理中. #!/u ...

  3. js 中文汉字转Unicode、Unicode转中文汉字、ASCII转换Unicode、Unicode转换ASCII、中文转换&#XXX函数代码

    最近看不少在线工具里面都有一些编码转换的代码,很多情况下我们都用得到,这里脚本之家小编就跟大家分享一下这些资料 Unicode介绍 Unicode(统一码.万国码.单一码)是一种在计算机上使用的字符编 ...

  4. 数字转化为汉字,如5->五

    //数字转化为汉字 如5-->五-(NSString*)translation:(NSString *)arebic{   NSString *str = arebic;    NSArray ...

  5. 在 Java 中将 Unicode 编码的汉字转码

    今天在做一个新浪微博的抓取测试,发现抓取后的内容是Unicode编码的,完全找不到熟悉的汉字了,下面搜索出来的一种方法,完全可行,只是不知到Java内部是否提供了相关的类库. 实现方法如下: publ ...

  6. Unicode编码转换汉字

    Uri.UnescapeDataString(string) #region Unicode转换汉字 Console.WriteLine(Uri.UnescapeDataString("\u ...

  7. 根据Unicode码生成汉字

    最近需要一批汉字字符数据,类似数字字符与ASCII码之间的对应关系,汉字字符与Unicode码之间也存在对应关系. 所以可以遍历Unicode码批量生成汉字. 其中,汉字为宽字符,输出时候注意需要修改 ...

  8. 使用jdk自带的工具native2ascii 转换Unicode字符和汉字

    1.控制台转换 1.1 将汉字转为Unicode: C:\Program Files\Java\jdk1.5.0_04\bin>native2ascii 测试 \u6d4b\u8bd5 1.2 ...

  9. java中unicode utf-8以及汉字之间的转换工具类

    1.       汉字字符串与unicode之间的转换 1.1          stringToUnicode /** * 获取字符串的unicode编码 * 汉字"木"的Uni ...

随机推荐

  1. vs文件属性(生成操作)各选项功能

    右击项目里的文件,选择属性(F4)会有[生成操作]的选项. 它提供了14项选择,如图: 在这说一下常用的选项: 1.编译 编译用于c#代码类的操作,编译之后输出在该程序集的bin目录下.换句话说,代码 ...

  2. tp3.2博客详情页面查询上一篇下一篇

  3. typeScript入门(三)接口

      接口我感觉是很常用的一块 定义标准: 接口的作用:在面向对象的编程中,接口是一种规范的定义,它定义了行为和动作的规范,在程序设计里面,接口起到一种限制和规范的作用.接口定义了某一批类所需要遵守的规 ...

  4. ActiveMQ VirtualTopic

    参考网址: http://activemq.apache.org/virtual-destinations.html http://blog.csdn.net/kimmking/article/det ...

  5. ArcGIS图框工具5.2发布,支持ArcGIS10.0,10.110.2,支持国家2000坐标系

    ArcGIS图框工具5.2发布,支持ArcGIS10.0,10.110.2,支持国家2000坐标系 下载地址http://files.cnblogs.com/gisoracle/atktoolnew. ...

  6. VC++中对数据类型的限制limits.h文件内容

    limits.h文件中规定了是IDE在OS中规定了每个数据类型的最大值和最小值以及在程序源代码中编译时候所占用的字节数,这这样做有利于帮助程序员在编写程序的时候有效控制在选择合适数据类型的显示范围值. ...

  7. 1.Zabbix 3.0 基础

    请查看我的有道云笔记: http://note.youdao.com/noteshare?id=85046af7675851675679a47beadc7aa3&sub=000AB0B2409 ...

  8. May 13th 2017 Week 19th Saturday

    Mountains look beautiful from a distance. 远处看山山更美. This gnomic seems to circulate very long, its mor ...

  9. c++中的const用法(很详细)——转

    http://www.cnblogs.com/ymy124/archive/2012/04/16/2451433.html const给人的第一印象就是定义常量. (1)const用于定义常量. 例如 ...

  10. 如何修改Fiori Launchpad里Tile计数调用的时间间隔

    Fiori launchpad里的Tile上有一个数字,例如下图My Leads的例子:每隔指定的时间间隔,会向后台发起一次数据请求,读取当前Lead的个数. 这个请求可以在Chrome Develo ...