【BUG】xml读取异常Invalid byte 1 of 1-byte UTF-8 sequence
来自http://blog.csdn.net/chenyanbo/article/details/6866941
xml读取异常Invalid byte 1 of 1-byte UTF-8 sequence
说简单点当你解析别人的xml格式出现这个错误可能就是别人在生成xml时没有保存为utf-8的字符编码格式。
在中文版的window下java的默认的编码为GBK,也就是所虽然我们标识了要将xml保存为utf-8格式但实际上文件是以GBK格式来保存的,所以这也就是为什么能够我们使用GBK、GB2312编码来生成xml文件能正确的被解析,而以UTF-8格式生成的文件不能被xml解析器所解析的原因。
xml解析时遇到的编码异常:
- org.dom4j.DocumentException: Invalid byte 1 of 1-byte UTF-8 sequence. Nested exception: Invalid byte 1 of 1-byte UTF-8 sequence.
- at org.dom4j.io.SAXReader.read(SAXReader.java:484)
- at org.dom4j.io.SAXReader.read(SAXReader.java:321)
- at com.dataoperate.PaseXml.pXml(PaseXml.java:28)
- at com.dataoperate.JdbcOp.insertDb(JdbcOp.java:30)
- at com.dataoperate.JdbcOp.main(JdbcOp.java:89)
- Nested exception:
- com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid byte 1 of 1-byte UTF-8 sequence.
- at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.invalidByte(UTF8Reader.java:684)
- at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(UTF8Reader.java:554)
- at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(XMLEntityScanner.java:1742)
- at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.peekChar(XMLEntityScanner.java:487)
- at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2687)
- at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648)
- at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:140)
- at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:511)
- at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808)
- at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
- at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119)
- at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1205)
- at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:522)
- at org.dom4j.io.SAXReader.read(SAXReader.java:465)
- at org.dom4j.io.SAXReader.read(SAXReader.java:321)
- at com.dataoperate.PaseXml.pXml(PaseXml.java:28)
- at com.dataoperate.JdbcOp.insertDb(JdbcOp.java:30)
- at com.dataoperate.JdbcOp.main(JdbcOp.java:89)
- Nested exception: com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid byte 1 of 1-byte UTF-8 sequence.
- at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.invalidByte(UTF8Reader.java:684)
- at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(UTF8Reader.java:554)
- at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(XMLEntityScanner.java:1742)
- at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.peekChar(XMLEntityScanner.java:487)
- at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2687)
- at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648)
- at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:140)
- at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:511)
- at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808)
- at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
- at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119)
- at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1205)
- at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:522)
- at org.dom4j.io.SAXReader.read(SAXReader.java:465)
- at org.dom4j.io.SAXReader.read(SAXReader.java:321)
- at com.dataoperate.PaseXml.pXml(PaseXml.java:28)
- at com.dataoperate.JdbcOp.insertDb(JdbcOp.java:30)
- at com.dataoperate.JdbcOp.main(JdbcOp.java:89)
org.dom4j.DocumentException: Invalid byte 1 of 1-byte UTF-8 sequence. Nested exception: Invalid byte 1 of 1-byte UTF-8 sequence.
at org.dom4j.io.SAXReader.read(SAXReader.java:484)
at org.dom4j.io.SAXReader.read(SAXReader.java:321)
at com.dataoperate.PaseXml.pXml(PaseXml.java:28)
at com.dataoperate.JdbcOp.insertDb(JdbcOp.java:30)
at com.dataoperate.JdbcOp.main(JdbcOp.java:89)
Nested exception:
com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid byte 1 of 1-byte UTF-8 sequence.
at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.invalidByte(UTF8Reader.java:684)
at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(UTF8Reader.java:554)
at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(XMLEntityScanner.java:1742)
at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.peekChar(XMLEntityScanner.java:487)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2687)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:140)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:511)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1205)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:522)
at org.dom4j.io.SAXReader.read(SAXReader.java:465)
at org.dom4j.io.SAXReader.read(SAXReader.java:321)
at com.dataoperate.PaseXml.pXml(PaseXml.java:28)
at com.dataoperate.JdbcOp.insertDb(JdbcOp.java:30)
at com.dataoperate.JdbcOp.main(JdbcOp.java:89)
Nested exception: com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid byte 1 of 1-byte UTF-8 sequence.
at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.invalidByte(UTF8Reader.java:684)
at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(UTF8Reader.java:554)
at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(XMLEntityScanner.java:1742)
at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.peekChar(XMLEntityScanner.java:487)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2687)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:140)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:511)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1205)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:522)
at org.dom4j.io.SAXReader.read(SAXReader.java:465)
at org.dom4j.io.SAXReader.read(SAXReader.java:321)
at com.dataoperate.PaseXml.pXml(PaseXml.java:28)
at com.dataoperate.JdbcOp.insertDb(JdbcOp.java:30)
at com.dataoperate.JdbcOp.main(JdbcOp.java:89)
解决:
1、最简单就是把<?xml version="1.0" encoding="UTF-8"?>改成<?xml version="1.0" encoding="gbk"?>
2、或者把xml打开另存的时候把字符集改为UTF-8后保存
3、在代码解析的时候先把xml重新写一遍
- SAXReader reader = new SAXReader();
- org.dom4j.Document document = reader.read("D:\\ha.xml");
- OutputFormat of = new OutputFormat();
- of.setEncoding("UTF-8"); //改变编码方式
- XMLWriter writer = new XMLWriter(new FileWriter "d:\\dom4j.xml"), of);
SAXReader reader = new SAXReader();
org.dom4j.Document document = reader.read("D:\\ha.xml");
OutputFormat of = new OutputFormat();
of.setEncoding("UTF-8"); //改变编码方式
XMLWriter writer = new XMLWriter(new FileWriter "d:\\dom4j.xml"), of);
4、直接dom4j读取的时候用io来读,修改字符编码
- FileInputStream in = new FileInputStream(new File(fileName));
- Reader read = new InputStreamReader(in,"gbk");
- Document document = reader.read(read);
【BUG】xml读取异常Invalid byte 1 of 1-byte UTF-8 sequence的更多相关文章
- Xml读取异常--Invalid byte 1 of 1-byte UTF-8 sequence
xml读取异常Invalid byte 1 of 1-byte UTF-8 sequence org.dom4j.DocumentException: Invalid byte 1 of 1-byte ...
- mybatis异常invalid comparison: java.util.Date and java.lang.String
原文链接:http://blog.csdn.net/wanghailong_qd/article/details/50673144 mybatis异常invalid comparison: java. ...
- xml 读取递归算法
xml 读取递归算法:
- paip.获取proxool的配置 xml读取通过jdk xml 初始化c3c0在代码中总结
paip.获取proxool的配置 xml读取通过jdk xml 初始化c3c0在代码中 xml读取通过jdk xml 初始化c3c0在代码中.. ... 作者Attilax 艾龙, EMAI ...
- Java基础知识强化之IO流笔记27:FileInputStream读取数据一次一个字节数组byte[ ]
1. FileInputStream读取数据一次一个字节数组byte[ ] 使用FileInputStream一次读取一个字节数组: int read(byte[] b) 返回值:返回值其实是实际 ...
- Linq to XML 读取XML 备忘笔记
本文转载:http://www.cnblogs.com/infozero/archive/2010/07/13/1776383.html Linq to XML 读取XML 备忘笔记 最近一个项目中有 ...
- C#基础笔记---浅谈XML读取以及简单的ORM实现
背景: 在开发ASP.NETMVC4 项目中,虽然web.config配置满足了大部分需求,不过对于某些特定业务,我们有时候需要添加新的配置文件来记录配置信息,那么XML文件配置无疑是我们选择的一个方 ...
- python 读取文件时报错UnicodeDecodeError: 'gbk' codec can't decode byte 0x80 in position 205: illegal multibyte sequence
python读取文件时提示"UnicodeDecodeError: 'gbk' codec can't decode byte 0x80 in position 205: illegal m ...
- svn checkout The XML response contains invalid XML
svn checkout 报错:The XML response contains invalid XML 待解决? ---目前没有找到好的解决方法,svn数据库中存的log入手应该可以,有时间再去看 ...
随机推荐
- http://python.jobbole.com/85056/ 简单 12 步理解 Python 装饰器,https://www.cnblogs.com/deeper/p/7482958.html另一篇文章
好吧,我标题党了.作为 Python 教师,我发现理解装饰器是学生们从接触后就一直纠结的问题.那是因为装饰器确实难以理解!想弄明白装饰器,需要理解一些函数式编程概念,并且要对Python中函数定义和函 ...
- Nginx HTTP 过滤addition模块(响应前后追加数据)
--with-http_addition_module 需要编译进Nginx 其功能主要在响应前或响应后追加内容 add_before_body 指令 将处理给定子请求后返回的文本添加到响应正文之前 ...
- 数字对——RMQ+二分答案
题目描述 小H是个善于思考的学生,现在她又在思考一个有关序列的问题. 她的面前浮现出一个长度为n的序列{ai},她想找出一段区间[L, R](1 <= L <= R <= n). 这 ...
- EF 跨库查询
原因:最近公司项目,遇到一个ef跨库查询的问题.(只是跨库,并不是跨服务器哈) 主要我们的一些数据,譬如地址,城市需要查询公共资料库. 但是本身我的程序设计采用的是ef框架的.因此为这事花费了1天时间 ...
- 架构师成长之路7.1 CDN理论
点击返回架构师成长之路 架构师成长之路7.1 CDN理论 CDN,Content Distribute Network,内容分发网络:CDN解决的是如何将数据快速可靠从源站传递到用户的问题.用户获取数 ...
- 【BZOJ2227】[ZJOI2011]看电影(组合数学,高精度)
[BZOJ2227][ZJOI2011]看电影(组合数学,高精度) 题面 BZOJ 洛谷 题解 这题太神仙了. 首先\(K<N\)则必定无解,直接特判解决. 现在只考虑\(K\ge N\)的情况 ...
- 洛谷 P1378 油滴扩展 改错
P1378 油滴扩展 题目描述 在一个长方形框子里,最多有\(N(0≤N≤6)\)个相异的点,在其中任何一个点上放一个很小的油滴,那么这个油滴会一直扩展,直到接触到其他油滴或者框子的边界.必须等一个油 ...
- python多线程用法及与单线程耗时比较
下面,通过一个简单的例子,来把多线程和单线程执行任务的耗时做个比较 import time import threading # 音乐播放器 def music(func, loop): for i ...
- selenium 登陆小技巧
from selenium import webdriver from selenium.webdriver.common.keys import Keys driver = webdriver.Fi ...
- 【bzoj1565】 NOI2009—植物大战僵尸
http://www.lydsy.com/JudgeOnline/problem.php?id=1565 (题目链接) 题意 给出$n*m$的棋盘,僵尸攻击每个格子可以获得$v$的分数,每个格子又会保 ...