关于 MySQL 中 utf8 的问题

有一个段子,我每次看了都头皮发麻,这个段子是: 手持两把锟斤拷,口中疾呼烫烫烫,脚踏千朵屯屯屯,笑看万物锘锘锘 我相信凡是和我有一样感觉的应该都知道这是啥,所以为了避免这个问题,我们工作的建议就是,所有的地方都设置成:UTF-8,确实我们这样工作了很多年,不知道哪一年 emoji 突然火了,于是在上家公司做论坛的时候,有一个小伙伴遇到了类似于下面的一个小问题(我找不到原报错日志了): Incorrect string value: ‘\xF0\x9F\x98\x83 <…’ for column ‘summary’ at row 1 当时问我,我也不知道原因,然后经过一番搜索之后,发现原来是 MySQL 存储 emoji 时出错了,存储 emoji 的时候,不应该用 utf8,而是用 utf8mb4。 随着移动互联网的到来,emoji 的应用越来越广泛,本以为这个问题大家都知道了,所以就没有深究过,想着自己记着就行了,但是不知道为什么最近这个问题突然火了,原来不是所有人都知道这个问题,甚至有人说,在做微信开发的时候,天知道微信用户会取什么昵称,所以不得已把用户的昵称都 base64 编码一下,展示的时候再解码,我只能表示:厉害,人民群众的智慧是无穷的。 其实产生这个问题的原因也很简单,据说是 MySQL 的一个 bug: MySQL 的 utf8mb4 是真正的utf8。 MySQL 的 utf8 是一种专属的编码,它能够编码的 Unicode 字符并不多。 关于什么是编码,我相信大家应该都知道了,如果不知道请自行搜索(当年曾经给一个计算机在读博的学姐科普什么是 GBK、GB2312、GB18030、UTF-8、BIG5、ISO-8859-1 等等,也是心累),至于为什么 MySQL 的 utf8 不是真正的 utf8,这是一个历史原因,我搜索的资料是:MySQL 从 4.1 版本开始支持 UTF-8,也就是 2003 年,而今天使用的 UTF-8 标准(RFC 3629)是随后才出现的。 所以现在我们看到的很多 MySQL(或者 MySQL 的分支:MariaDB)资料依然在建议,MySQL 的编码使用 utf8,这在绝大多数的情况下,确实也没有问题,但确实是错误的,最好的选择是,选用真正的 utf8,也就是:utf8mb4。对于已经选用 utf8 需要改用 utf8mb4 的项目,这有一个指南: ...

August 4, 2018 · 1 min · 87 words · Bridge Li

MalformedByteSequenceException: 3 字节的 UTF-8 序列的字节 2 无效

今天这篇文章比较简单,写一个老夫近期工作中遇到的一个问题,这个问题困扰了老夫几个月了,虽然借助强大的Google百度了好久,但一直没有彻底解决,一直感觉挺简单一问题,也挺常见的一问题(网上问这个问题的还挺多),怎么就没有一个靠谱点的解决方案呢,刚好上个周呢时间稍有空闲,于是仔细研究了一下,终于找到了问题的根源,然后同事一言点醒梦中人,豁然开朗,一举解决,所以记录一下,先说一下异常:MalformedByteSequenceException: 3 字节的 UTF-8 序列的字节 2 无效,详细的堆栈信息如下: 严重: StandardWrapper.Throwable org.springframework.beans.factory.BeanDefinitionStoreException: IOException parsing XML document from file [D:J2EEapache-tomcat-7.0.62webappscrm-v1.0WEB-INFclassesspring-kafka-consumer.xml]; nested exception is com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: 3 字节的 UTF-8 序列的字节 2 无效。 at org.springframework.beans.factory.xml.XmlBeanDefinitionReader.doLoadBeanDefinitions(XmlBeanDefinitionReader.java:409) at org.springframework.beans.factory.xml.XmlBeanDefinitionReader.loadBeanDefinitions(XmlBeanDefinitionReader.java:335) at org.springframework.beans.factory.xml.XmlBeanDefinitionReader.loadBeanDefinitions(XmlBeanDefinitionReader.java:303) at org.springframework.beans.factory.support.AbstractBeanDefinitionReader.loadBeanDefinitions(AbstractBeanDefinitionReader.java:180) at org.springframework.beans.factory.support.AbstractBeanDefinitionReader.loadBeanDefinitions(AbstractBeanDefinitionReader.java:216) at org.springframework.beans.factory.support.AbstractBeanDefinitionReader.loadBeanDefinitions(AbstractBeanDefinitionReader.java:187) at org.springframework.web.context.support.XmlWebApplicationContext.loadBeanDefinitions(XmlWebApplicationContext.java:125) at org.springframework.web.context.support.XmlWebApplicationContext.loadBeanDefinitions(XmlWebApplicationContext.java:94) at org.springframework.context.support.AbstractRefreshableApplicationContext.refreshBeanFactory(AbstractRefreshableApplicationContext.java:129) at org.springframework.context.support.AbstractApplicationContext.obtainFreshBeanFactory(AbstractApplicationContext.java:540) at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:454) at org.springframework.web.servlet.FrameworkServlet.configureAndRefreshWebApplicationContext(FrameworkServlet.java:658) at org.springframework.web.servlet.FrameworkServlet.createWebApplicationContext(FrameworkServlet.java:624) at org.springframework.web.servlet.FrameworkServlet.createWebApplicationContext(FrameworkServlet.java:672) at org.springframework.web.servlet.FrameworkServlet.initWebApplicationContext(FrameworkServlet.java:543) at org.springframework.web.servlet.FrameworkServlet.initServletBean(FrameworkServlet.java:484) at org.springframework.web.servlet.HttpServletBean.init(HttpServletBean.java:136) at javax.servlet.GenericServlet.init(GenericServlet.java:158) at org.apache.catalina.core.StandardWrapper.initServlet(StandardWrapper.java:1284) at org.apache.catalina.core.StandardWrapper.loadServlet(StandardWrapper.java:1197) at org.apache.catalina.core.StandardWrapper.allocate(StandardWrapper.java:864) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:134) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122) at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:505) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:170) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103) at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:957) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:423) at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1079) at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:620) at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:316) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) at java.lang.Thread.run(Thread.java:745) Caused by: com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: 3 字节的 UTF-8 序列的字节 2 无效。 at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.invalidByte(UTF8Reader.java:687) at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(UTF8Reader.java:408) at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(XMLEntityScanner.java:1735) at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.scanData(XMLEntityScanner.java:1234) at com.sun.org.apache.xerces.internal.impl.XMLScanner.scanComment(XMLScanner.java:778) at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanComment(XMLDocumentFragmentScannerImpl.java:1038) at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2988) at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:606) at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:117) at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:510) at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:848) at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:777) at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141) at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:243) at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:347) at org.springframework.beans.factory.xml.DefaultDocumentLoader.loadDocument(DefaultDocumentLoader.java:76) at org.springframework.beans.factory.xml.XmlBeanDefinitionReader.doLoadDocument(XmlBeanDefinitionReader.java:428) at org.springframework.beans.factory.xml.XmlBeanDefinitionReader.doLoadBeanDefinitions(XmlBeanDefinitionReader.java:390) &#8230; 35 more 三月 16, 2016 5:41:59 下午 org.apache.catalina.core.StandardWrapperValve invoke 严重: Allocate exception for servlet springServlet com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: 3 字节的 UTF-8 序列的字节 2 无效。 at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.invalidByte(UTF8Reader.java:687) at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(UTF8Reader.java:408) at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(XMLEntityScanner.java:1735) at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.scanData(XMLEntityScanner.java:1234) at com.sun.org.apache.xerces.internal.impl.XMLScanner.scanComment(XMLScanner.java:778) at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanComment(XMLDocumentFragmentScannerImpl.java:1038) at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2988) at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:606) at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:117) at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:510) at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:848) at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:777) at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141) at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:243) at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:347) at org.springframework.beans.factory.xml.DefaultDocumentLoader.loadDocument(DefaultDocumentLoader.java:76) at org.springframework.beans.factory.xml.XmlBeanDefinitionReader.doLoadDocument(XmlBeanDefinitionReader.java:428) at org.springframework.beans.factory.xml.XmlBeanDefinitionReader.doLoadBeanDefinitions(XmlBeanDefinitionReader.java:390) at org.springframework.beans.factory.xml.XmlBeanDefinitionReader.loadBeanDefinitions(XmlBeanDefinitionReader.java:335) at org.springframework.beans.factory.xml.XmlBeanDefinitionReader.loadBeanDefinitions(XmlBeanDefinitionReader.java:303) at org.springframework.beans.factory.support.AbstractBeanDefinitionReader.loadBeanDefinitions(AbstractBeanDefinitionReader.java:180) at org.springframework.beans.factory.support.AbstractBeanDefinitionReader.loadBeanDefinitions(AbstractBeanDefinitionReader.java:216) at org.springframework.beans.factory.support.AbstractBeanDefinitionReader.loadBeanDefinitions(AbstractBeanDefinitionReader.java:187) at org.springframework.web.context.support.XmlWebApplicationContext.loadBeanDefinitions(XmlWebApplicationContext.java:125) at org.springframework.web.context.support.XmlWebApplicationContext.loadBeanDefinitions(XmlWebApplicationContext.java:94) at org.springframework.context.support.AbstractRefreshableApplicationContext.refreshBeanFactory(AbstractRefreshableApplicationContext.java:129) at org.springframework.context.support.AbstractApplicationContext.obtainFreshBeanFactory(AbstractApplicationContext.java:540) at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:454) at org.springframework.web.servlet.FrameworkServlet.configureAndRefreshWebApplicationContext(FrameworkServlet.java:658) at org.springframework.web.servlet.FrameworkServlet.createWebApplicationContext(FrameworkServlet.java:624) at org.springframework.web.servlet.FrameworkServlet.createWebApplicationContext(FrameworkServlet.java:672) at org.springframework.web.servlet.FrameworkServlet.initWebApplicationContext(FrameworkServlet.java:543) at org.springframework.web.servlet.FrameworkServlet.initServletBean(FrameworkServlet.java:484) at org.springframework.web.servlet.HttpServletBean.init(HttpServletBean.java:136) at javax.servlet.GenericServlet.init(GenericServlet.java:158) at org.apache.catalina.core.StandardWrapper.initServlet(StandardWrapper.java:1284) at org.apache.catalina.core.StandardWrapper.loadServlet(StandardWrapper.java:1197) at org.apache.catalina.core.StandardWrapper.allocate(StandardWrapper.java:864) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:134) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122) at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:505) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:170) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103) at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:957) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:423) at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1079) at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:620) at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:316) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) at java.lang.Thread.run(Thread.java:745) 看完堆栈信息,我相信遇到这个问题的同学,已经知道咋回事了。说起来也很简单,就是tomcat跑一个web项目时,报这个异常,导致项目起不来。在仔细看之后,应该是编码问题导致的,对,就是编码的问题了,看报错的那个XML文件编译之后的情况如下: ...

March 20, 2016 · 2 min · 284 words · Bridge Li