Parsing XML in Java
Tags: Java Basics
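There are two main XML parsing techniques: DOM and SAX.
DOM
  Builds a tree in memory that mirrors the hierarchy of the XML document; tags, attributes, and text are all wrapped as node objects in that tree.
  Advantage: querying, adding, updating, and deleting nodes is easy.
  Disadvantage: a very large XML file can exhaust memory.
SAX
  Uses an event-driven model and parses while reading: the document is scanned from top to bottom, and each time the parser reaches an element it calls the corresponding handler method.
  Advantage: the document is never held in memory as a whole, so large files do not exhaust memory.
  Disadvantage: queries are inconvenient, and adding, updating, and deleting nodes is not supported.
Various companies and organizations provide parsers for the DOM and SAX approaches:
  JAXP, from Sun
  dom4j, from the Dom4j project (the most widely used, for example by Hibernate)
  JDOM, from the JDOM project
For the history of these three parsers, see the article "java解析xml文件四種方式" (Four ways to parse XML files in Java).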
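JAXP parsing
JAXP is part of Java SE and lives in the javax.xml.parsers package. It provides the following classes for DOM and SAX:
  DOM: DocumentBuilder, DocumentBuilderFactory
  SAX: SAXParser, SAXParserFactory
The sample XML is shown below; the following sections use JAXP to query, add, update, and delete its nodes.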
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE beans SYSTEM "constraint.dtd">
<beans>
    <bean id="id1" class="com.fq.domain.Bean">
        <property name="isUsed" value="true"/>
    </bean>
    <bean id="id2" class="com.fq.domain.ComplexBean">
        <property name="refBean" ref="id1"/>
    </bean>
</beans>
The constraint.dtd it references:
<!ELEMENT beans (bean*)>
<!ELEMENT bean (property*)>
<!ATTLIST bean
    id CDATA #REQUIRED
    class CDATA #REQUIRED>
<!ELEMENT property EMPTY>
<!ATTLIST property
    name CDATA #REQUIRED
    value CDATA #IMPLIED
    ref CDATA #IMPLIED>
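The parser only checks these constraints when validation is switched on. The following is a minimal sketch of that, not part of the original project: it assumes the DTD is supplied as an internal subset in a string (the article keeps it in an external constraint.dtd), and the class name DtdValidationSketch and the #REQUIRED / #IMPLIED defaults are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class DtdValidationSketch {

    // Internal DTD subset mirroring the constraints above (assumption: the
    // attribute defaults are #REQUIRED / #IMPLIED as written here).
    static final String DTD =
            "<!DOCTYPE beans [\n"
            + "<!ELEMENT beans (bean*)>\n"
            + "<!ELEMENT bean (property*)>\n"
            + "<!ATTLIST bean id CDATA #REQUIRED class CDATA #REQUIRED>\n"
            + "<!ELEMENT property EMPTY>\n"
            + "<!ATTLIST property name CDATA #REQUIRED value CDATA #IMPLIED ref CDATA #IMPLIED>\n"
            + "]>\n";

    /** Parses DTD + xmlBody with validation enabled and returns the validity errors. */
    static List<String> validate(String xmlBody) {
        final List<String> errors = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // check the document against its DOCTYPE
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { /* ignored in this sketch */ }
                public void error(SAXParseException e) { errors.add(e.getMessage()); }
                public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(DTD + xmlBody)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return errors;
    }

    public static void main(String[] args) {
        // A <bean/> without the required class attribute is reported as a validity error.
        System.out.println(validate("<beans><bean id=\"id1\"/></beans>"));
    }
}
```

Validity problems arrive through ErrorHandler.error() rather than as exceptions, which is why the sketch collects them in a list; only well-formedness problems abort the parse.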
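JAXP-DOM
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.junit.Test;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

/**
 * @author jifang
 * @since 16/1/13 11:24 PM
 */
public class XmlRead {

    @Test
    public void client() throws ParserConfigurationException, IOException, SAXException {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // Load config.xml from the classpath and parse it into a DOM tree.
        Document document = builder.parse(ClassLoader.getSystemResourceAsStream("config.xml"));
    }
}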
The parse(String/File/InputSource/InputStream param) methods of DocumentBuilder parse an XML file into a Document object that represents the whole document. Document (in the org.w3c.dom package) is an interface whose parent interface is Node; other subinterfaces of Node include Element, Attr, and Text.
Commonly used Node methods:
  Node appendChild(Node newChild): adds the node newChild to the end of the list of children of this node.
  Node removeChild(Node oldChild): removes the child node indicated by oldChild from the list of children and returns it.
  NodeList getChildNodes(): a NodeList that contains all children of this node.
  NamedNodeMap getAttributes(): a NamedNodeMap containing the attributes of this node (if it is an Element), or null otherwise.
  String getTextContent(): returns the text content of this node and its descendants.
Commonly used Document methods:
  NodeList getElementsByTagName(String tagname): returns a NodeList of all the Elements in the document with the given tag name, in document order.
  Element createElement(String tagName): creates an element of the type specified.
  Text createTextNode(String data): creates a Text node given the specified string.
  Attr createAttribute(String name): creates an Attr of the given name.
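The methods listed above can be tried out on an in-memory document. This is a small sketch under stated assumptions: the XML is a shortened string rather than config.xml, and the class and helper names (DomApiSketch, parse, appendProperty) are made up for the illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class DomApiSketch {

    /** Parses an XML string into a DOM Document (checked exceptions are wrapped). */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Creates a property element with a name attribute and text, and appends it to the first bean. */
    static Element appendProperty(Document document, String name, String text) {
        Element bean = (Element) document.getElementsByTagName("bean").item(0);
        Element property = document.createElement("property");
        property.setAttribute("name", name);
        property.appendChild(document.createTextNode(text));
        bean.appendChild(property);
        return property;
    }

    public static void main(String[] args) {
        Document document = parse("<beans><bean id=\"id1\"><property name=\"isUsed\"/></bean></beans>");
        Element added = appendProperty(document, "note", "hello");
        Element bean = (Element) added.getParentNode();
        System.out.println(bean.getChildNodes().getLength()); // number of children of the bean
        System.out.println(bean.getAttributes().getLength()); // number of attributes of the bean
        System.out.println(bean.getTextContent());            // text of all descendants
        bean.removeChild(added);                              // detach the new node again
        System.out.println(bean.getChildNodes().getLength());
    }
}
```

Element.setAttribute(name, value) is used here as a shorter alternative to createAttribute plus setAttributeNode; both end up with the same attribute on the element.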
DOM: querying
public class XmlRead {

    private Document document;

    @Before
    public void setUp() throws ParserConfigurationException, IOException, SAXException {
        document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(ClassLoader.getSystemResourceAsStream("config.xml"));
    }

    @Test
    public void client() throws ParserConfigurationException, IOException, SAXException {
        // Print every attribute of every <bean/> element.
        NodeList beans = document.getElementsByTagName("bean");
        for (int i = 0; i < beans.getLength(); ++i) {
            NamedNodeMap attributes = beans.item(i).getAttributes();
            scanNameNodeMap(attributes);
        }
    }

    private void scanNameNodeMap(NamedNodeMap attributes) {
        for (int i = 0; i < attributes.getLength(); ++i) {
            Attr attribute = (Attr) attributes.item(i);
            System.out.printf("%s -> %s%n", attribute.getName(), attribute.getValue());
        }
    }
}
@Test
public void client() {
    list(document, 0);
}

// Recursively prints the name of every element, indented one tab per nesting level.
private void list(Node node, int depth) {
    if (node.getNodeType() == Node.ELEMENT_NODE) {
        for (int i = 0; i < depth; ++i) {
            System.out.print("\t");
        }
        System.out.println("<" + node.getNodeName() + ">");
    }
    NodeList childNodes = node.getChildNodes();
    for (int i = 0; i < childNodes.getLength(); ++i) {
        list(childNodes.item(i), depth + 1);
    }
}
DOM: adding a node
Add a <property/> element under the first <bean/>. The result should look like this:
<bean id="id1" class="com.fq.domain.Bean">
    <property name="isUsed" value="true"/>
    <property name="name" value="simple-bean">新添加的</property>
</bean>
/**
 * @author jifang
 * @since 16/1/17 5:56 PM
 */
public class XmlAppend {

    private Transformer transformer;
    private Document document;

    @Before
    public void setUp() throws ParserConfigurationException, IOException, SAXException {
        document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(ClassLoader.getSystemResourceAsStream("config.xml"));
    }

    @Test
    public void client() {
        Node firstBean = document.getElementsByTagName("bean").item(0);

        // Create the new <property/> element with its attributes and text.
        Element property = document.createElement("property");
        Attr name = document.createAttribute("name");
        name.setValue("name");
        property.setAttributeNode(name);
        Attr value = document.createAttribute("value");
        value.setValue("simple-bean");
        property.setAttributeNode(value);
        property.appendChild(document.createTextNode("新添加的"));

        firstBean.appendChild(property);
    }

    @After
    public void tearDown() throws TransformerException {
        // Write the in-memory DOM back to the XML file.
        transformer = TransformerFactory.newInstance().newTransformer();
        transformer.transform(new DOMSource(document),
                new StreamResult("src/main/resources/config.xml"));
    }
}
Note: the change only takes effect once the in-memory DOM has been written back to the XML document.
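To see that a change lives only in the in-memory tree until it is serialized, the DOM can be written to a string instead of a file. The sketch below is only an illustration: the class and method names (DomSerializeSketch, parse, toXml) are hypothetical, and it omits the XML declaration to keep the output short.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class DomSerializeSketch {

    /** Parses an XML string into a DOM Document (checked exceptions are wrapped). */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Serializes the current state of the DOM tree with an identity Transformer. */
    static String toXml(Document document) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(document), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String source = "<beans><bean id=\"id1\"/></beans>";
        Document document = parse(source);
        Element bean = (Element) document.getElementsByTagName("bean").item(0);
        bean.setAttribute("id", "changed");   // modifies the tree only
        System.out.println(source);           // the original text is untouched
        System.out.println(toXml(document));  // the serialized tree carries the change
    }
}
```

Passing a StreamResult that wraps a file path, as the test class above does in tearDown, is the same call with a different destination.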
DOM: updating a node
Change the <property/> added above to:
<property name="name" value="new-simple-bean">simple-bean是新添加的</property>
@Test
public void client() {
    NodeList properties = document.getElementsByTagName("property");
    for (int i = 0; i < properties.getLength(); ++i) {
        Element property = (Element) properties.item(i);
        // Find the property whose value is "simple-bean" and rewrite it.
        if (property.getAttribute("value").equals("simple-bean")) {
            property.setAttribute("value", "new-simple-bean");
            property.setTextContent("simple-bean是新添加的");
            break;
        }
    }
}
DOM: deleting a node
Delete the <property/> element that was just modified:
@Test
public void client() {
    NodeList properties = document.getElementsByTagName("property");
    for (int i = 0; i < properties.getLength(); ++i) {
        Element property = (Element) properties.item(i);
        if (property.getAttribute("value").equals("new-simple-bean")) {
            // A node is removed through its parent.
            property.getParentNode().removeChild(property);
            break;
        }
    }
}
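The loop above stops after the first match, which sidesteps a subtlety: the NodeList returned by getElementsByTagName is live, so it shrinks as nodes are removed. The following sketch shows one way to remove every match by walking the list backwards; the class and method names (DomRemoveAllSketch, removeAll) are hypothetical and the XML is a shortened in-memory sample.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomRemoveAllSketch {

    /** Parses an XML string into a DOM Document (checked exceptions are wrapped). */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Removes every element with the given tag name and returns how many were removed. */
    static int removeAll(Document document, String tagName) {
        NodeList nodes = document.getElementsByTagName(tagName);
        int removed = 0;
        // Iterate from the end: the list is live, so removing item i shifts
        // everything after it, but never the items before it.
        for (int i = nodes.getLength() - 1; i >= 0; --i) {
            Node node = nodes.item(i);
            node.getParentNode().removeChild(node);
            ++removed;
        }
        return removed;
    }

    public static void main(String[] args) {
        Document document = parse(
                "<beans><bean><property/><property/></bean><bean><property/></bean></beans>");
        System.out.println(removeAll(document, "property"));
        System.out.println(document.getElementsByTagName("property").getLength());
    }
}
```

A forward loop with ++i over the same live list would skip every other match, because each removal moves the next element into the index that was just visited.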
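JAXP-SAX
A SAXParser instance is obtained from the newSAXParser() method of a SAXParserFactory instance. Its parse(String uri, DefaultHandler dh) method, which parses the XML file, returns nothing, but compared with the DOM method it takes one more parameter, the event handler DefaultHandler:
  when the parser reaches a start tag, it calls the handler's startElement() method;
  when it reaches tag content (text), it calls the handler's characters() method;
  when it reaches an end tag, it calls the handler's endElement() method.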
SAX: querying
/**
 * @author jifang
 * @since 16/1/17 9:16 PM
 */
public class SaxRead {

    @Test
    public void client() throws ParserConfigurationException, IOException, SAXException {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(ClassLoader.getSystemResourceAsStream("config.xml"), new SaxHandler());
    }

    // Echoes the whole document: start tags with attributes, text, and end tags.
    private class SaxHandler extends DefaultHandler {

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes)
                throws SAXException {
            System.out.print("<" + qName);
            for (int i = 0; i < attributes.getLength(); ++i) {
                String attrName = attributes.getQName(i);
                String attrValue = attributes.getValue(i);
                System.out.print(" " + attrName + "=" + attrValue);
            }
            System.out.print(">");
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            System.out.print(new String(ch, start, length));
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            System.out.print("</" + qName + ">");
        }
    }
}
// Prints only the text content of <property/> elements.
private class SaxHandler extends DefaultHandler {

    private boolean isProperty = false;

    // Kept from the original example; a single parse runs on one thread,
    // so the isProperty flag alone would be enough.
    private Lock mutex = new ReentrantLock();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes)
            throws SAXException {
        if (qName.equals("property")) {
            mutex.lock();
            isProperty = true;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (isProperty) {
            System.out.println(new String(ch, start, length));
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if (qName.equals("property")) {
            try {
                isProperty = false;
            } finally {
                mutex.unlock();
            }
        }
    }
}
Note: SAX cannot add, update, or delete nodes.
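One detail worth knowing when collecting text with SAX: the parser is allowed to deliver the text of a single element in several characters() calls, so printing each call separately can split a value. A common pattern is to buffer between startElement() and endElement(). The sketch below illustrates it on an in-memory string; the class and method names (SaxTextSketch, propertyTexts) are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxTextSketch {

    /** Returns the complete text of every property element, in document order. */
    static List<String> propertyTexts(String xml) {
        final List<String> texts = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder buffer; // non-null only while inside a property element

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) {
                if (qName.equals("property")) {
                    buffer = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (buffer != null) {
                    buffer.append(ch, start, length); // may be called more than once per element
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (qName.equals("property")) {
                    texts.add(buffer.toString());
                    buffer = null;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return texts;
    }

    public static void main(String[] args) {
        System.out.println(propertyTexts(
                "<beans><bean><property name=\"a\">one &amp; two</property><property name=\"b\"/></bean></beans>"));
    }
}
```

Text containing entity references such as &amp; is a typical case where a parser reports the content in more than one chunk; the buffer makes the result independent of how the chunks are split.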
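Dom4j parsing
Dom4j started as a fork of JDOM: it split off from the original JDOM project and offers a more capable and better-performing parser than JDOM (for example, it supports XPath). To use Dom4j, add the following dependency to the pom:
<dependency>
    <groupId>dom4j</groupId>
    <artifactId>dom4j</artifactId>
    <version>1.6.1</version>
</dependency>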
The sample XML is shown below; the following sections use Dom4j to query, add, update, and delete its nodes:
<?xml version="1.0" encoding="utf-8"?>
<beans xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns="http://www.fq.me/context"
       xsi:schemaLocation="http://www.fq.me/context http://www.fq.me/context/context.xsd">
    <bean id="id1" class="com.fq.benz">
        <property name="name" value="benz"/>
    </bean>
    <bean id="id2" class="com.fq.domain.Bean">
        <property name="isUsed" value="true"/>
        <property name="complexBean" ref="id1"/>
    </bean>
</beans>
The schema it references:
<?xml version="1.0" encoding="utf-8"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        targetNamespace="http://www.fq.me/context"
        elementFormDefault="qualified">
    <element name="beans">
        <complexType>
            <sequence>
                <element name="bean" maxOccurs="unbounded">
                    <complexType>
                        <sequence>
                            <element name="property" maxOccurs="unbounded">
                                <complexType>
                                    <attribute name="name" type="string" use="required"/>
                                    <attribute name="value" type="string" use="optional"/>
                                    <attribute name="ref" type="string" use="optional"/>
                                </complexType>
                            </element>
                        </sequence>
                        <attribute name="id" type="string" use="required"/>
                        <attribute name="class" type="string" use="required"/>
                    </complexType>
                </element>
            </sequence>
        </complexType>
    </element>
</schema>
/**
 * @author jifang
 * @since 16/1/18 4:02 PM
 */
public class Dom4jRead {

    @Test
    public void client() throws DocumentException {
        SAXReader reader = new SAXReader();
        // Load config.xml from the classpath and parse it into a Dom4j Document.
        Document document = reader.read(ClassLoader.getSystemResource("config.xml"));
    }
}
As in JAXP, Document is an interface (here in the org.dom4j package). It extends Branch, which in turn extends Node; other subinterfaces of Node include Element, Attribute, Text, and CDATA.
Commonly used Node methods:
  Element getParent(): returns the parent Element if this node supports the parent relationship, or null if it is the root element or does not support the parent relationship.
Commonly used Document methods:
  Element getRootElement(): returns the root Element of this document.
Commonly used Element methods:
  void add(Attribute/Text param): adds the given Attribute or Text to this element.
  Element addAttribute(String name, String value): adds the attribute value of the given local name.
  Attribute attribute(int index): returns the attribute at the specified index.
  Attribute attribute(String name): returns the attribute with the given name.
  Element element(String name): returns the first element for the given local name and any namespace.
  Iterator elementIterator(): returns an iterator over all of this element's child elements.
  Iterator elementIterator(String name): returns an iterator over the elements contained in this element that match the given local name and any namespace.
  List elements(): returns the elements contained in this element.
  List elements(String name): returns the elements contained in this element with the given local name and any namespace.
Commonly used Branch methods:
  Element addElement(String name): adds a new Element node with the given name to this branch and returns a reference to the new node.
  boolean remove(Node node): removes the given Node if it is an immediate child of this branch.
Dom4j: querying
/**
 * @author jifang
 * @since 16/1/18 4:02 PM
 */
public class Dom4jRead {

    private Document document;

    @Before
    public void setUp() throws DocumentException {
        document = new SAXReader().read(ClassLoader.getSystemResource("config.xml"));
    }

    @Test
    @SuppressWarnings("unchecked")
    public void client() {
        Element beans = document.getRootElement();
        // Walk over every <bean/> and print its id, class, and properties.
        for (Iterator iterator = beans.elementIterator(); iterator.hasNext(); ) {
            Element bean = (Element) iterator.next();
            String id = bean.attributeValue("id");
            String clazz = bean.attributeValue("class");
            System.out.println("id: " + id + ", class: " + clazz);
            scanProperties(bean.elements());
        }
    }

    public void scanProperties(List<? extends Element> properties) {
        for (Element property : properties) {
            System.out.print("name: " + property.attributeValue("name"));
            // A property carries either a value or a ref attribute.
            Attribute value = property.attribute("value");
            if (value != null) {
                System.out.println("," + value.getName() + ": " + value.getValue());
            }
            Attribute ref = property.attribute("ref");
            if (ref != null) {
                System.out.println("," + ref.getName() + ": " + ref.getValue());
            }
        }
    }
}
Dom4j: adding a node
Append a <property/> element to the end of the first <bean/>:
<bean id="id1" class="com.fq.benz">
    <property name="name" value="benz"/>
    <property name="refBean" ref="id2">新添加的標簽</property>
</bean>
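/**
 * @author jifang
 * @since 16/1/19 9:50 AM
 */
public class Dom4jAppend {

    private Document document;

    @Before
    public void setUp() throws DocumentException {
        document = new SAXReader().read(ClassLoader.getSystemResource("config.xml"));
    }

    @Test
    public void client() {
        Element beans = document.getRootElement();
        Element firstBean = beans.element("bean");
        // addElement creates the child and appends it in one step.
        Element property = firstBean.addElement("property");
        property.addAttribute("name", "refBean");
        property.addAttribute("ref", "id2");
        property.setText("新添加的標簽");
    }

    @After
    public void tearDown() throws IOException {
        // Write the in-memory document back to the XML file, pretty-printed.
        OutputFormat format = OutputFormat.createPrettyPrint();
        XMLWriter writer = new XMLWriter(new FileOutputStream("src/main/resources/config.xml"), format);
        try {
            writer.write(document);
        } finally {
            writer.close();
        }
    }
}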
Reading and writing the XML document can be wrapped in a small utility class, which makes later calls more convenient:
/**
 * @author jifang
 * @since 16/1/19 2:12 PM
 */
public class XmlUtils {

    // Loads an XML document from the classpath.
    public static Document getXmlDocument(String config) {
        try {
            return new SAXReader().read(ClassLoader.getSystemResource(config));
        } catch (DocumentException e) {
            throw new RuntimeException(e);
        }
    }

    // Writes the document to the given path, pretty-printed, and closes the writer.
    public static void writeXmlDocument(String path, Document document) {
        try {
            XMLWriter writer = new XMLWriter(new FileOutputStream(path), OutputFormat.createPrettyPrint());
            try {
                writer.write(document);
            } finally {
                writer.close();
            }
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
Insert a <property/> element right after the first <property/> of the first <bean/>:
<bean id="id1" class="com.fq.benz">
    <property name="name" value="benz"/>
    <property name="rate" value="3.14"/>
    <property name="refBean" ref="id2">新添加的標簽</property>
</bean>
public class Dom4jAppend {

    private Document document;

    @Before
    public void setUp() {
        document = XmlUtils.getXmlDocument("config.xml");
    }

    @Test
    @SuppressWarnings("unchecked")
    public void client() {
        Element beans = document.getRootElement();
        Element firstBean = beans.element("bean");
        List<Element> properties = firstBean.elements();
        // Create the element in the same namespace as its future parent.
        Element property = DocumentFactory.getInstance()
                .createElement("property", firstBean.getNamespaceURI());
        property.addAttribute("name", "rate");
        property.addAttribute("value", "3.14");
        // Insert at index 1, directly after the first <property/>.
        properties.add(1, property);
    }

    @After
    public void tearDown() {
        XmlUtils.writeXmlDocument("src/main/resources/config.xml", document);
    }
}
Dom4j: updating a node
Change the first <property/> of the bean id1 to:
<property name="name" value="翡青"/>
@Test
@SuppressWarnings("unchecked")
public void client() {
    Element beans = document.getRootElement();
    Element firstBean = beans.element("bean");
    List<Element> properties = firstBean.elements();
    // Overwrite the value attribute of the first <property/>.
    Element property = properties.get(0);
    property.attribute("value").setValue("翡青");
}
Dom4j: deleting a node
@Test
@SuppressWarnings("unchecked")
public void delete() {
    List<Element> beans = document.getRootElement().elements("bean");
    for (Element bean : beans) {
        if (bean.attributeValue("id").equals("id1")) {
            List<Element> properties = bean.elements("property");
            for (Element property : properties) {
                // Remove the property named "name" through its parent, then stop.
                if (property.attributeValue("name").equals("name")) {
                    property.getParent().remove(property);
                    break;
                }
            }
            break;
        }
    }
}
Dom4j: a practical example
In the article on Java reflection we implemented an object pool that loads beans from a JSON configuration file. We can now add support for loading them from an XML configuration (the XML file is the same as above):
/**
 * @author jifang
 * @since 16/1/18 9:18 PM
 */
public class XmlParse {

    private static final ObjectPool POOL = ObjectPoolBuilder.init(null);

    // Reads the configuration from the classpath and returns its root <beans/> element.
    public static Element parseBeans(String config) {
        try {
            return new SAXReader().read(ClassLoader.getSystemResource(config)).getRootElement();
        } catch (DocumentException e) {
            throw new RuntimeException(e);
        }
    }

    // Instantiates one bean, fills in its fields from the <property/> elements,
    // and registers it in the pool under its id.
    public static void processObject(Element bean, List<? extends Element> properties)
            throws ClassNotFoundException, IllegalAccessException, InstantiationException, NoSuchFieldException {
        Class<?> clazz = Class.forName(bean.attributeValue(CommonConstant.CLASS));
        Object targetObject = clazz.newInstance();
        for (Element property : properties) {
            String fieldName = property.attributeValue(CommonConstant.NAME);
            Field field = clazz.getDeclaredField(fieldName);
            field.setAccessible(true);
            if (property.attributeValue(CommonConstant.VALUE) != null) {
                // A literal value: convert it to the field's type and set it.
                SimpleValueSetUtils.setSimpleValue(field, targetObject, property.attributeValue(CommonConstant.VALUE));
            } else if (property.attributeValue(CommonConstant.REF) != null) {
                // A reference: look up the bean that is already in the pool.
                String refId = property.attributeValue(CommonConstant.REF);
                Object object = POOL.getObject(refId);
                field.set(targetObject, object);
            } else {
                throw new RuntimeException("neither value nor ref");
            }
        }
        POOL.putObject(bean.attributeValue(CommonConstant.ID), targetObject);
    }
}
Note: the code above is only the XML parsing part of the object pool project; the complete project is available at git@git.oschina.net:feiqing/commons-frame.git
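Because ObjectPool, CommonConstant, and SimpleValueSetUtils live in that project, the snippet above cannot be run on its own. The following self-contained sketch reproduces the same idea with the JDK only. It is not the project's implementation: it uses the JDK DOM API instead of Dom4j, a plain Map in place of the object pool, two made-up bean classes (Engine, Car), and it handles only String values.

```java
import java.io.StringReader;
import java.lang.reflect.Field;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class BeanLoaderSketch {

    // Hypothetical beans used only for this illustration.
    static class Engine {
        private String name;
        @Override public String toString() { return "Engine(" + name + ")"; }
    }

    static class Car {
        private Engine engine;
        @Override public String toString() { return "Car(" + engine + ")"; }
    }

    /** Builds every bean in document order and returns the beans keyed by id. */
    static Map<String, Object> load(String xml) {
        Map<String, Object> pool = new LinkedHashMap<>(); // stands in for the object pool
        try {
            Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList beans = document.getElementsByTagName("bean");
            for (int i = 0; i < beans.getLength(); ++i) {
                Element bean = (Element) beans.item(i);
                Class<?> clazz = Class.forName(bean.getAttribute("class"));
                Object target = clazz.getDeclaredConstructor().newInstance();
                NodeList properties = bean.getElementsByTagName("property");
                for (int j = 0; j < properties.getLength(); ++j) {
                    Element property = (Element) properties.item(j);
                    Field field = clazz.getDeclaredField(property.getAttribute("name"));
                    field.setAccessible(true);
                    if (property.hasAttribute("value")) {
                        field.set(target, property.getAttribute("value")); // String fields only
                    } else if (property.hasAttribute("ref")) {
                        field.set(target, pool.get(property.getAttribute("ref"))); // must be defined earlier
                    } else {
                        throw new IllegalStateException("neither value nor ref");
                    }
                }
                pool.put(bean.getAttribute("id"), target);
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return pool;
    }

    public static void main(String[] args) {
        String xml = "<beans>"
                + "<bean id=\"e\" class=\"" + Engine.class.getName() + "\"><property name=\"name\" value=\"v8\"/></bean>"
                + "<bean id=\"c\" class=\"" + Car.class.getName() + "\"><property name=\"engine\" ref=\"e\"/></bean>"
                + "</beans>";
        System.out.println(load(xml));
    }
}
```

The sketch resolves a ref by looking at beans that were already built, so a referenced bean has to appear before the bean that uses it; a fuller implementation would need to handle forward references and type conversion for non-String fields.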
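XPath
XPath is a language for finding information in an XML document; it can be used to navigate through the elements and attributes of the document.
Expression    Description
/             Selects from the root node (/beans matches the root <beans/>; /beans/bean matches every <bean/> directly under <beans/>)
//            Searches the whole document regardless of position (//property matches every <property/> in the document)
*             Matches any element node (/* matches the root element; //* matches every element)
@             Matches attributes (//@name matches every name attribute)
[position]    Positional predicate (//property[1] matches every <property/> that is the first <property/> child of its parent; //property[last()] matches every one that is the last such child)
[@attr]       Attribute predicate (//bean[@id] matches every <bean/> that has an id attribute; //bean[@id='id1'] matches every <bean/> whose id attribute equals 'id1')
Predicates: a predicate is used to find a specific node, or a node that contains a specific value.
For the full XPath syntax, see the W3School XPath tutorial.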
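The expressions in the table can be checked with the XPath engine that ships with the JDK (javax.xml.xpath). This is a small sketch for experimenting, not part of the article's Dom4j code; the class name XPathSyntaxSketch and the count helper are made up, and the XML is the namespace-free JAXP sample from earlier in shortened form.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathSyntaxSketch {

    static final String XML = "<beans>"
            + "<bean id=\"id1\" class=\"com.fq.domain.Bean\"><property name=\"isUsed\" value=\"true\"/></bean>"
            + "<bean id=\"id2\" class=\"com.fq.domain.ComplexBean\"><property name=\"refBean\" ref=\"id1\"/></bean>"
            + "</beans>";

    /** Evaluates the expression against XML and returns how many nodes it selects. */
    static int count(String expression) {
        try {
            Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(expression, document, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] expressions = {
            "/beans/bean", "//property", "/*", "//*", "//@name",
            "//property[1]", "(//property)[1]", "//bean[@id]", "//bean[@id='id1']"
        };
        for (String expression : expressions) {
            System.out.println(expression + " selects " + count(expression) + " node(s)");
        }
    }
}
```

Comparing //property[1] with (//property)[1] shows the difference between "first among its siblings" and "first in the whole document": in this sample the former selects one <property/> per <bean/>, the latter exactly one node.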
XPath support in Dom4j
Dom4j does not support XPath out of the box; the following dependency has to be added to the pom:
<dependency>
    <groupId>jaxen</groupId>
    <artifactId>jaxen</artifactId>
    <version>1.1.6</version>
</dependency>
The Dom4j Node interface provides these methods for XPath:
  List selectNodes(String xpathExpression)
  List selectNodes(String xpathExpression, String comparisonXPathExpression)
  List selectNodes(String xpathExpression, String comparisonXPathExpression, boolean removeDuplicates)
  Object selectObject(String xpathExpression)
  Node selectSingleNode(String xpathExpression)
Querying with XPath
/**
 * @author jifang
 * @since 16/1/20 9:28 AM
 */
public class XPathRead {

    private Document document;

    @Before
    public void setUp() throws DocumentException {
        document = XmlUtils.getXmlDocument("config.xml");
    }

    @Test
    @SuppressWarnings("unchecked")
    public void client() {
        // An unprefixed name such as bean only matches elements in no namespace.
        // If config.xml declares a default namespace, use //*[local-name()='bean'] instead.
        List<Element> beans = document.selectNodes("//bean");
        for (Element bean : beans) {
            System.out.println("id: " + bean.attributeValue("id") + ", class: " + bean.attributeValue("class"));
        }
    }
}
Updating with XPath
@Test
public void client() {
    // Locate the <bean/> whose id is id2 and remove it from its parent.
    Node bean = document.selectSingleNode("//bean[@id=\"id2\"]");
    bean.getParent().remove(bean);
}
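One thing to watch with these XPath calls is namespaces. In XPath 1.0 an unprefixed name like bean selects only elements that are in no namespace, so against a document whose root declares a default namespace (as the Dom4j sample config does with http://www.fq.me/context) the expression //bean selects nothing. The sketch below demonstrates the rule with the JDK's javax.xml.xpath rather than with Dom4j, so it illustrates XPath behaviour, not Dom4j's API; the prefix c and the class name NamespaceXPathSketch are arbitrary choices for the example.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NamespaceXPathSketch {

    static final String NS = "http://www.fq.me/context";
    static final String XML =
            "<beans xmlns=\"" + NS + "\"><bean id=\"id1\"/><bean id=\"id2\"/></beans>";

    /** Counts the nodes selected by the expression; the prefix c is bound to NS. */
    static int count(String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // JAXP ignores namespaces unless asked
            Document document = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                public String getNamespaceURI(String prefix) {
                    return "c".equals(prefix) ? NS : XMLConstants.NULL_NS_URI;
                }
                public String getPrefix(String uri) {
                    return NS.equals(uri) ? "c" : null;
                }
                public Iterator<String> getPrefixes(String uri) {
                    return NS.equals(uri)
                            ? Collections.singletonList("c").iterator()
                            : Collections.<String>emptyIterator();
                }
            });
            NodeList nodes = (NodeList) xpath.evaluate(expression, document, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count("//bean"));                    // unprefixed name, no namespace
        System.out.println(count("//c:bean"));                  // prefix bound to the namespace
        System.out.println(count("//*[local-name()='bean']"));  // ignores the namespace
    }
}
```

When selectSingleNode finds nothing it returns null, so code like the removal above is safer with a null check before calling getParent().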