UTF-8 encoding configuration issues

Published: 2023/12/4

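Note: whether text comes out garbled depends on the concrete request implementation class. What has been established so far is that before RequestDispatcher.forward is called the request is an org.apache.catalina.connector.RequestFacade, while after the forward it is an org.apache.catalina.core.ApplicationHttpRequest. The two apply different default decoding logic when they parse parameters, and the factors that cause garbling also differ with the protocol in use.
See also: Tomcat source analysis: internals of ServletRequest.getParameterValues, request charset and query string encoding.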

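How garbled text arises
Take the Chinese character “中”. Encoded as UTF-8 it becomes three bytes, percent-encoded as %E4%B8%AD, which are then submitted to the Tomcat container via GET or POST. If you do not tell Tomcat that the parameter is UTF-8 encoded, Tomcat assumes it is ISO-8859-1. ISO-8859-1 (which is compatible with US-ASCII, the standard character set of URIs) is a single-byte, ASCII-compatible encoding that assigns a character to every byte value, so Tomcat takes the three bytes to be three ISO-8859-1 characters and decodes them to “ä¸” followed by U+00AD (a soft hyphen, usually invisible). After decoding, that string exists in the JVM as Unicode, whereas HTTP transmits bytes and databases store bytes. Depending on what each endpoint needs, you can therefore either encode the Unicode string with UTF-8 and store the resulting bytes in the database (which stores the three mis-decoded characters, not “中”), or take the three ISO-8859-1 bytes that correspond to the three characters and decode them as UTF-8 to recover the Unicode character “中”. The second route works because any byte stream can be treated as ISO-8859-1 without loss. The result is then sent to the client through the response, and the bytes actually transmitted depend on the Content-Type you set.
Summary: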

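  • 1, HTTP GET and POST transmit bytes, and databases store bytes as well (500 MB of space means 500 M bytes).
  • 2, Garbled text arises when the charset used for decoding differs from the one used for encoding: the same bytes can map to different characters under different encodings, and some byte sequences have no mapping at all under a given encoding (which is where the ? in garbled text comes from).
  • 3, A decoded string exists in the JVM as Unicode.
  • 4, If the Unicode characters in the JVM are the ones you expect (the encoding and decoding charsets are the same or compatible), there is no problem. If they are not, as in the example above where the JVM holds three Unicode characters, you can get the three ISO-8859-1 bytes of those characters and decode them as UTF-8 to produce the intended Unicode character “中”.
  • 5, ISO-8859-1 is a single-byte, ASCII-compatible encoding that uses every value of the byte, so a system that handles ISO-8859-1 can transmit and store a byte stream in any other encoding without dropping anything. In other words, any byte stream can safely be treated as ISO-8859-1.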
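Points 4 and 5 can be checked with a small, self-contained sketch. The class and method names (Latin1RoundTrip, viaLatin1, viaUtf8, repair) are made up for illustration and are not part of Tomcat or of the original article; only standard JDK calls are used.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Latin1RoundTrip {

    // Decode arbitrary bytes as ISO-8859-1, then encode the string back to bytes.
    public static byte[] viaLatin1(byte[] raw) {
        String s = new String(raw, StandardCharsets.ISO_8859_1);
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    // The same round trip through UTF-8, for comparison.
    public static byte[] viaUtf8(byte[] raw) {
        String s = new String(raw, StandardCharsets.UTF_8);
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Recover a string whose UTF-8 bytes were wrongly decoded as ISO-8859-1.
    public static String repair(String misdecoded) {
        byte[] originalBytes = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // GB2312 bytes of the character U+4E2D; they are not valid UTF-8.
        byte[] gb = { (byte) 0xD6, (byte) 0xD0 };

        // ISO-8859-1 maps every byte value, so the bytes survive unchanged.
        System.out.println(Arrays.equals(gb, viaLatin1(gb))); // true

        // UTF-8 replaces the malformed bytes with U+FFFD, so the originals are lost.
        System.out.println(Arrays.equals(gb, viaUtf8(gb)));   // false

        // The three mis-decoded characters E4, B8, AD turn back into U+4E2D.
        String fixed = repair("\u00e4\u00b8\u00ad");
        System.out.println(fixed.equals("\u4e2d"));           // true
    }
}
```

The repair only works when the wrong decoding was ISO-8859-1 (or another charset that maps every byte); after a lossy decode, such as the UTF-8 case above, the original bytes cannot be recovered.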


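The code below shows that encoding with different charsets gives different results, and that text appears garbled whenever the encoder and decoder disagree or the Chinese character does not exist in ISO-8859-1.

Java code: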
    import java.io.UnsupportedEncodingException;
    import java.math.BigInteger;
    import java.net.URLDecoder;
    import java.net.URLEncoder;
    import java.util.Arrays;

    public class EncodingDemo {
        public static void main(String[] args) {
            try {
                // URL-encoding "中" with UTF-8 gives %e4%b8%ad
                // (as ISO-8859-1 these bytes are "ä¸" plus a soft hyphen, U+00AD)
                String item = new String(new byte[] { (byte) 0xe4, (byte) 0xb8, (byte) 0xad }, "UTF-8");
                // 中
                System.out.println(item);

                item = new String(new byte[] { (byte) 0xe4, (byte) 0xb8, (byte) 0xad }, "ISO-8859-1");
                // "ä¸" plus U+00AD
                System.out.println(item);

                // [0, -3]: as a signed value 253 needs a leading zero byte
                System.out.println(Arrays.toString(new BigInteger("253").toByteArray()));
                // 11111101
                System.out.println(Integer.toBinaryString(253));

                // Back to 中: take the ISO-8859-1 bytes and decode them as UTF-8
                item = new String(item.getBytes("ISO-8859-1"), "UTF-8");
                System.out.println(item);
                // "ä¸" plus U+00AD again: take the UTF-8 bytes and decode them as ISO-8859-1
                item = new String(item.getBytes("UTF-8"), "ISO-8859-1");
                System.out.println(item);

                // "中" encoded with UTF-8 is %E4%B8%AD (3 bytes)
                System.out.println(URLEncoder.encode("中", "UTF-8"));
                // "中" encoded with ISO-8859-1 is %3F (1 byte): the character does not
                // exist in ISO-8859-1, so the encoding of "?" is returned instead
                System.out.println(URLEncoder.encode("中", "ISO-8859-1"));
                // "中" encoded with GB2312 is %D6%D0 (2 bytes)
                System.out.println(URLEncoder.encode("中", "GB2312"));

                // Decoding the UTF-8 form %E4%B8%AD with UTF-8 gives the correct 中
                System.out.println(URLDecoder.decode("%E4%B8%AD", "UTF-8"));
                // Decoding the ISO-8859-1 form %3F with ISO-8859-1 gives ?
                System.out.println(URLDecoder.decode("%3F", "ISO-8859-1"));
                // Decoding the GB2312 form %D6%D0 with GB2312 gives the correct 中
                System.out.println(URLDecoder.decode("%D6%D0", "GB2312"));
                // Decoding the UTF-8 form %E4%B8%AD with ISO-8859-1 gives "ä¸" plus U+00AD.
                // This is the so-called garbled text: each of the 3 bytes becomes the
                // ISO-8859-1 character for that byte, since ISO-8859-1 uses every byte value
                System.out.println(URLDecoder.decode("%E4%B8%AD", "ISO-8859-1"));
                // Decoding the UTF-8 form %E4%B8%AD with GB2312 gives 涓 followed by a
                // replacement character (often shown as ?): the first 2 bytes %E4%B8 are 涓
                // in GB2312, and the third byte %AD on its own is not valid GB2312
                System.out.println(URLDecoder.decode("%E4%B8%AD", "GB2312"));
            } catch (UnsupportedEncodingException e) {
                e.printStackTrace();
            }
        }
    }

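  • Tomcat's default encoding settings and the related standards:
    For GET requests, the "URI Syntax" specification requires HTTP query strings (also called GET parameters) to use US-ASCII; any character outside that range must be percent-encoded in the form %xx (for example %61). Because ISO-8859-1 and ASCII are compatible for characters in the range 0x20 to 0x7E, most web containers, Tomcat included, decode the %xx bytes of a URI as ISO-8859-1 by default. (This describes older Tomcat releases; from Tomcat 8 on the default URIEncoding is UTF-8.) The URIEncoding attribute of the Connector changes the charset used to decode the %xx bytes of the URI. URIEncoding must match the charset the client used to encode the query string of the GET request. Alternatively, set Content-Type to tell the container which charset was used to encode the characters in the URL (in Tomcat this takes effect only when the Connector's useBodyEncodingForURI attribute is true).

    A POST request should state its own encoding through the Content-Type header. Because many clients do not set an explicit encoding, Tomcat falls back to ISO-8859-1. Note the difference between the charset used to decode the URI, the request charset, and the response charset! Different Request implementations relate these three to each other differently.

    For POST requests, ISO-8859-1 is the default encoding of HTTP requests and responses defined by the Servlet specification. If the charset of the request or response has not been set, the Servlet specification mandates ISO-8859-1. The encoding of a request or response is specified through its Content-Type header.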
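The fallback rule described above can be sketched with the JDK alone. The helper name charsetOf and the class RequestCharset are hypothetical and only illustrate the rule; a real container has its own header parsing, which this sketch does not reproduce.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class RequestCharset {

    // Hypothetical helper: choose the charset for a request body from the value
    // of its Content-Type header, falling back to ISO-8859-1 when none is given.
    public static Charset charsetOf(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                // Case-insensitive match on the "charset=" parameter name.
                if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                    String name = p.substring(8).trim().replace("\"", "");
                    try {
                        if (Charset.isSupported(name)) {
                            return Charset.forName(name);
                        }
                    } catch (IllegalArgumentException e) {
                        // Illegal or empty charset name: fall through to the default.
                    }
                }
            }
        }
        return StandardCharsets.ISO_8859_1;
    }

    public static void main(String[] args) {
        byte[] body = { (byte) 0xE4, (byte) 0xB8, (byte) 0xAD };

        // The client declared UTF-8, so the three bytes decode to one character.
        Charset declared = charsetOf("application/x-www-form-urlencoded; charset=UTF-8");
        System.out.println(new String(body, declared).length()); // 1

        // No charset declared: ISO-8859-1 is assumed and three characters result.
        Charset assumed = charsetOf("application/x-www-form-urlencoded");
        System.out.println(new String(body, assumed).length());  // 3
    }
}
```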

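    If a GET or POST request does not set the encoding through Content-Type, Tomcat uses ISO-8859-1 by default. SetCharacterEncodingFilter can change the default request encoding (encoding: the charset to use; ignore: true sets the encoding whether or not the client specified one, false sets it only when the client did not specify one; the default is true).
    Note: this filter should generally be placed ahead of all other filters (before Servlet 3.0 the order follows the order of the filter-mapping entries in web.xml; from Servlet 3.0 on the order can be configured explicitly), because once a value has been read from the request, setting the encoding afterwards has no effect. On the first read, Tomcat decodes the variables submitted in the query string or the POST body with the encoding in effect at that moment and stores them in a parameters collection; every later read fetches the value directly from that collection.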
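Why the order matters can be seen in a toy model of the parse-once behaviour described above. ToyRequest is a made-up class, not Tomcat code, and for simplicity it decodes the query string with the request encoding, which real Tomcat does only under certain Connector settings.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.HashMap;
import java.util.Map;

public class ToyRequest {
    private final String rawQuery;           // for example "name=%E4%B8%AD"
    private String encoding = "ISO-8859-1";  // default when nothing is set
    private Map<String, String> parameters;  // parsed on first read, then cached

    public ToyRequest(String rawQuery) {
        this.rawQuery = rawQuery;
    }

    public void setCharacterEncoding(String encoding) {
        this.encoding = encoding;
    }

    public String getParameter(String name) {
        if (parameters == null) {
            // First read: decode everything with the encoding in effect right now.
            parameters = new HashMap<>();
            for (String pair : rawQuery.split("&")) {
                int eq = pair.indexOf('=');
                if (eq < 0) {
                    continue;
                }
                parameters.put(decode(pair.substring(0, eq)), decode(pair.substring(eq + 1)));
            }
        }
        // Later reads never decode again; they only look up the cached value.
        return parameters.get(name);
    }

    private String decode(String s) {
        try {
            return URLDecoder.decode(s, encoding);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported encoding: " + encoding, e);
        }
    }

    public static void main(String[] args) {
        ToyRequest early = new ToyRequest("name=%E4%B8%AD");
        early.setCharacterEncoding("UTF-8");               // set before the first read
        System.out.println(early.getParameter("name").length()); // 1

        ToyRequest late = new ToyRequest("name=%E4%B8%AD");
        late.getParameter("name");                         // something reads a parameter first
        late.setCharacterEncoding("UTF-8");                // too late: values are cached
        System.out.println(late.getParameter("name").length());  // 3
    }
}
```

In this model, a filter that reads any parameter before the encoding filter runs plays the role of the early getParameter call in the second case.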

    Recommended steps for using UTF-8 everywhere:

    • 1, Set URIEncoding="UTF-8" on your <Connector> in server.xml. This makes Tomcat decode HTTP GET requests as UTF-8.
    • 2, Use a character encoding filter with the default encoding set to UTF-8. Many requests do not specify an encoding themselves, and Tomcat then uses ISO-8859-1 as the HttpServletRequest encoding; the filter changes that.
    • 3, Change all your JSPs to include charset name in their contentType. For example, use <%@page contentType="text/html; charset=UTF-8" %> for the usual JSP pages and <jsp:directive.page contentType="text/html; charset=UTF-8" /> for the pages in XML syntax (aka JSP Documents). This specifies the encoding used by the JSP page.
    • 4, Change all your servlets to set the content type for responses and to include charset name in the content type to be UTF-8. Use response.setContentType("text/html; charset=UTF-8") or response.setCharacterEncoding("UTF-8"). This sets the encoding of the response.
    • 5, Change any content-generation libraries you use (Velocity, Freemarker, etc.) to use UTF-8 and to specify UTF-8 in the content type of the responses that they generate. This specifies the encoding used by every template engine.
    • 6, Disable any valves or filters that may read request parameters before your character encoding filter or jsp page has a chance to set the encoding to UTF-8. SetCharacterEncodingFilter should generally come first; otherwise it may have no effect.



    Java code:
    /*
     * Licensed to the Apache Software Foundation (ASF) under one or more
     * contributor license agreements.  See the NOTICE file distributed with
     * this work for additional information regarding copyright ownership.
     * The ASF licenses this file to You under the Apache License, Version 2.0
     * (the "License"); you may not use this file except in compliance with
     * the License.  You may obtain a copy of the License at
     *
     *     http://www.apache.org/licenses/LICENSE-2.0
     *
     * Unless required by applicable law or agreed to in writing, software
     * distributed under the License is distributed on an "AS IS" BASIS,
     * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
     * See the License for the specific language governing permissions and
     * limitations under the License.
     */

    package filters;


    import java.io.IOException;
    import javax.servlet.Filter;
    import javax.servlet.FilterChain;
    import javax.servlet.FilterConfig;
    import javax.servlet.ServletException;
    import javax.servlet.ServletRequest;
    import javax.servlet.ServletResponse;


    /**
     * <p>Example filter that sets the character encoding to be used in parsing the
     * incoming request, either unconditionally or only if the client did not
     * specify a character encoding.  Configuration of this filter is based on
     * the following initialization parameters:</p>
     * <ul>
     * <li><strong>encoding</strong> - The character encoding to be configured
     *     for this request, either conditionally or unconditionally based on
     *     the <code>ignore</code> initialization parameter.  This parameter
     *     is required, so there is no default.</li>
     * <li><strong>ignore</strong> - If set to "true", any character encoding
     *     specified by the client is ignored, and the value returned by the
     *     <code>selectEncoding()</code> method is set.  If set to "false",
     *     <code>selectEncoding()</code> is called <strong>only</strong> if the
     *     client has not already specified an encoding.  By default, this
     *     parameter is set to "true".</li>
     * </ul>
     *
     * <p>Although this filter can be used unchanged, it is also easy to
     * subclass it and make the <code>selectEncoding()</code> method more
     * intelligent about what encoding to choose, based on characteristics of
     * the incoming request (such as the values of the <code>Accept-Language</code>
     * and <code>User-Agent</code> headers, or a value stashed in the current
     * user's session.</p>
     *
     * @author Craig McClanahan
     * @version $Id: SetCharacterEncodingFilter.java 939521 2010-04-30 00:16:33Z kkolinko $
     */

    public class SetCharacterEncodingFilter implements Filter {


        // ----------------------------------------------------- Instance Variables


        /**
         * The default character encoding to set for requests that pass through
         * this filter.
         */
        protected String encoding = null;


        /**
         * The filter configuration object we are associated with.  If this value
         * is null, this filter instance is not currently configured.
         */
        protected FilterConfig filterConfig = null;


        /**
         * Should a character encoding specified by the client be ignored?
         */
        protected boolean ignore = true;


        // --------------------------------------------------------- Public Methods


        /**
         * Take this filter out of service.
         */
        public void destroy() {

            this.encoding = null;
            this.filterConfig = null;

        }


        /**
         * Select and set (if specified) the character encoding to be used to
         * interpret request parameters for this request.
         *
         * @param request The servlet request we are processing
         * @param response The servlet response we are creating
         * @param chain The filter chain we are processing
         *
         * @exception IOException if an input/output error occurs
         * @exception ServletException if a servlet error occurs
         */
        public void doFilter(ServletRequest request, ServletResponse response,
                             FilterChain chain)
        throws IOException, ServletException {

            // Conditionally select and set the character encoding to be used
            if (ignore || (request.getCharacterEncoding() == null)) {
                String encoding = selectEncoding(request);
                if (encoding != null)
                    request.setCharacterEncoding(encoding);
            }

            // Pass control on to the next filter
            chain.doFilter(request, response);

        }


        /**
         * Place this filter into service.
         *
         * @param filterConfig The filter configuration object
         */
        public void init(FilterConfig filterConfig) throws ServletException {

            this.filterConfig = filterConfig;
            this.encoding = filterConfig.getInitParameter("encoding");
            String value = filterConfig.getInitParameter("ignore");
            if (value == null)
                this.ignore = true;
            else if (value.equalsIgnoreCase("true"))
                this.ignore = true;
            else if (value.equalsIgnoreCase("yes"))
                this.ignore = true;
            else
                this.ignore = false;

        }


        // ------------------------------------------------------ Protected Methods


        /**
         * Select an appropriate character encoding to be used, based on the
         * characteristics of the current request and/or filter initialization
         * parameters.  If no character encoding should be set, return
         * <code>null</code>.
         * <p>
         * The default implementation unconditionally returns the value configured
         * by the <strong>encoding</strong> initialization parameter for this
         * filter.
         *
         * @param request The servlet request we are processing
         */
        protected String selectEncoding(ServletRequest request) {

            return (this.encoding);

        }


    }