2011-08-26 42 views
5

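Using a SAX parser, how do you parse an XML file that has tags with the same name inside different elements?

Is it possible to supply a path expression to a SAX parser? I have an XML file with several tags that share the same name but sit inside different elements. Is there any way to tell them apart? Here is the XML: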

<Schools> 
    <School> 
     <ID>335823</ID> 
     <Name>Fairfax High School</Name> 
     <Student> 
      <ID>4195653</ID> 
      <Name>Will Turner</Name> 
     </Student> 
     <Student> 
      <ID>4195654</ID> 
      <Name>Bruce Paltrow</Name> 
     </Student> 
     <Student> 
      <ID>4195655</ID> 
      <Name>Santosh Gowswami</Name> 
     </Student> 
    </School> 
    <School> 
     <ID>335824</ID> 
     <Name>FallsChurch High School</Name> 
     <Student> 
      <ID>4153</ID> 
      <Name>John Singer</Name> 
     </Student> 
     <Student> 
      <ID>4154</ID> 
      <Name>Shane Warne</Name> 
     </Student> 
     <Student> 
      <ID>4155</ID> 
      <Name>Eddie Diaz</Name> 
     </Student> 
    </School> 
</Schools> 

I want to distinguish a school's Name and ID from a student's Name and ID.

Thanks for the replies:

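I created a student POJO with the fields school_id, school_name, student_id and student_name, plus getters and setters for them. Below is my provisional parser implementation. While parsing the XML, I need to put the school name and ID and the student name and ID into the POJO and return it. Can you tell me how to implement a stack to tell them apart? Here is my parser skeleton: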

import org.xml.sax.Attributes; 
import org.xml.sax.SAXException; 
import org.xml.sax.helpers.DefaultHandler; 

public class HandleXML extends DefaultHandler { 

    private student info; 
    private boolean school_id = false; 
    private boolean school_name = false; 
    private boolean student_id = false; 
    private boolean student_name = false; 
    private boolean student = false; 
    private boolean school = false; 


    public HandleXML(student record) { 
     super(); 
     this.info = record; 
     school_id = false; 
     school_name = false; 
     student_id = false; 
     student_name = false; 
     student = false; 
     school = false; 
    } 

    @Override 
    public void startElement(String uri, String localName, 
      String qName, Attributes attributes) 
      throws SAXException { 
    if (qName.equalsIgnoreCase("student")) { 
      student = true; 
     } 
    if (qName.equalsIgnoreCase("school")) { 
      school = true; 
     } 
    if (qName.equalsIgnoreCase("school_id")) { 
      school_id = true; 
     } 
    if (qName.equalsIgnoreCase("student_id")) { 
      student_id = true; 
     } 
    if (qName.equalsIgnoreCase("school_name")) { 
      school_name = true; 
     } 
    if (qName.equalsIgnoreCase("student_name")) { 
      student_name = true; 
     } 
    } 

    @Override 
    public void endElement(String uri, String localName, 
      String qName) 
      throws SAXException { 
    } 

    @Override 
    public void characters(char ch[], int start, int length) 
      throws SAXException { 

     String data = new String(ch, start, length); 

    } 
} 
+0

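Also check out http://stackoverflow.com/questions/1863250/ - there are some projects that let you use a subset of XPath on a streamed document. If you can fit your problem into that subset, the resulting code will be far better than any hand-written SAX handler code.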

+0

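SAX parsing with context works like a state machine (https://en.wikipedia.org/wiki/Finite-state_machine): you need flags that you switch on and off to know where you are. It can get very messy very quickly, so consider other options before you go too far.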

Answers

13

OK, I haven't played with SAX in Java for years, so here is my take on it:

package play.xml.sax; 

import org.xml.sax.Attributes; 
import org.xml.sax.SAXException; 
import org.xml.sax.helpers.DefaultHandler; 

import javax.xml.parsers.ParserConfigurationException; 
import javax.xml.parsers.SAXParser; 
import javax.xml.parsers.SAXParserFactory; 
import java.io.IOException; 
import java.util.ArrayList; 
import java.util.List; 
import java.util.Stack; 

public class Test1 { 
    public static void main(String[] args) { 
     SAXParserFactory spf = SAXParserFactory.newInstance(); 
     SchoolsHandler handler = new SchoolsHandler(); 
     try { 
      SAXParser sp = spf.newSAXParser(); 
      sp.parse("schools.xml", handler); 
      System.out.println("Number of read schools: " + handler.getSchools().size()); 
     } catch (SAXException se) { 
      se.printStackTrace(); 
     } catch (ParserConfigurationException pce) { 
      pce.printStackTrace(); 
     } catch (IOException ie) { 
      ie.printStackTrace(); 
     } 
    } 
} 

class SchoolsHandler extends DefaultHandler { 
    private static final String TAG_SCHOOLS = "Schools"; 
    private static final String TAG_SCHOOL = "School"; 
    private static final String TAG_STUDENT = "Student"; 
    private static final String TAG_ID = "ID"; 
    private static final String TAG_NAME = "Name"; 

    private final Stack<String> tagsStack = new Stack<String>(); 
    private final StringBuilder tempVal = new StringBuilder(); 

    private List<School> schools; 
    private School school; 
    private Student student; 

    public void startElement(String uri, String localName, String qName, Attributes attributes) { 
     pushTag(qName); 
     tempVal.setLength(0); 
     if (TAG_SCHOOLS.equalsIgnoreCase(qName)) { 
      schools = new ArrayList<School>(); 
     } else if (TAG_SCHOOL.equalsIgnoreCase(qName)) { 
      school = new School(); 
     } else if (TAG_STUDENT.equalsIgnoreCase(qName)) { 
      student = new Student(); 
     } 
    } 

    public void characters(char ch[], int start, int length) { 
     tempVal.append(ch, start, length); 
    } 

    public void endElement(String uri, String localName, String qName) { 
     String tag = peekTag(); 
     if (!qName.equals(tag)) { 
      throw new InternalError(); 
     } 

     popTag(); 
     String parentTag = peekTag(); 

     if (TAG_ID.equalsIgnoreCase(tag)) { 
      int id = Integer.valueOf(tempVal.toString().trim()); 
      if (TAG_STUDENT.equalsIgnoreCase(parentTag)) { 
       student.setId(id); 
      } else if (TAG_SCHOOL.equalsIgnoreCase(parentTag)) { 
       school.setId(id); 
      } 
     } else if (TAG_NAME.equalsIgnoreCase(tag)) { 
      String name = tempVal.toString().trim(); 
      if (TAG_STUDENT.equalsIgnoreCase(parentTag)) { 
       student.setName(name); 
      } else if (TAG_SCHOOL.equalsIgnoreCase(parentTag)) { 
       school.setName(name); 
      } 
     } else if (TAG_STUDENT.equalsIgnoreCase(tag)) { 
      school.addStudent(student); 
     } else if (TAG_SCHOOL.equalsIgnoreCase(tag)) { 
      schools.add(school); 
     } 
    } 

    public void startDocument() { 
     pushTag(""); 
    } 

    public List<School> getSchools() { 
     return schools; 
    } 

    private void pushTag(String tag) { 
     tagsStack.push(tag); 
    } 

    private String popTag() { 
     return tagsStack.pop(); 
    } 

    private String peekTag() { 
     return tagsStack.peek(); 
    } 
} 

class School { 
    private int id; 
    private String name; 
    private List<Student> students = new ArrayList<Student>(); 

    public String getName() { 
     return name; 
    } 

    public void setName(String name) { 
     this.name = name; 
    } 

    public int getId() { 
     return id; 
    } 

    public void setId(int id) { 
     this.id = id; 
    } 

    public void addStudent(Student student) { 
     students.add(student); 
    } 

    public List<Student> getStudents() { 
     return students; 
    } 
} 

class Student { 
    private int id; 
    private String name; 

    public String getName() { 
     return name; 
    } 

    public void setName(String name) { 
     this.name = name; 
    } 

    public int getId() { 
     return id; 
    } 

    public void setId(int id) { 
     this.id = id; 
    } 
} 

schools.xml contains your sample XML. Note that I stuffed everything into a single file, but only because I was just playing around.

+0

http://www.javatpoint.com/questionanswer.jsp?thread=3474 – Rohit

13

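A SAX parser hands you each element in document order. You have to maintain a stack to keep track of nesting (push onto the stack when handling startElement and pop in endElement). You can then tell the different <Name> elements apart by what is currently on the stack.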

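Alternatively, just keep a variable that records whether you last encountered a <School> tag or a <Student> tag, so that you know which kind of <Name> you are looking at.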

+4

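+1 for keeping a stack, that is the way to go. You can generate an XPath-like string by printing the current contents of the stack. Using flags to tell which tag you are inside is ugly. :-P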
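The XPath-like path string mentioned in this comment can be sketched roughly as follows. This is a minimal illustration, not code from the thread: the class name `PathHandler`, its `seen` list and the `parse` helper are made up for the example, and it assumes every value of interest sits in a leaf element such as <ID> or <Name>.

```java
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

import javax.xml.parsers.SAXParserFactory;
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Hypothetical handler: records "path=text" for each element that ends with non-blank text.
class PathHandler extends DefaultHandler {
    private final Deque<String> stack = new ArrayDeque<>(); // open elements, innermost first
    private final StringBuilder text = new StringBuilder();
    final List<String> seen = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        stack.push(qName);
        text.setLength(0); // discard whitespace seen before this element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String value = text.toString().trim();
        if (!value.isEmpty()) {
            seen.add(currentPath() + "=" + value);
        }
        text.setLength(0);
        stack.pop();
    }

    // Joins the stack from the root down, e.g. /Schools/School/Student/Name.
    private String currentPath() {
        StringBuilder path = new StringBuilder();
        Iterator<String> it = stack.descendingIterator();
        while (it.hasNext()) {
            path.append('/').append(it.next());
        }
        return path.toString();
    }

    // Convenience helper (an assumption for this sketch): parse XML held in a String.
    static List<String> parse(String xml) throws Exception {
        PathHandler handler = new PathHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.seen;
    }
}
```

With the path available, a handler could compare it against "/Schools/School/Name" or "/Schools/School/Student/Name" instead of inspecting only the parent tag.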

2

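Yes, making sense of XML with a SAX parser is usually a bit more complicated than with DOM. Basically, you need to maintain state/context in your SAX handler so that you can distinguish these cases.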

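Note that another key to implementing a SAX handler is understanding that a single value can be split across multiple characters events.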
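A small sketch of what that means in practice: accumulate text in a buffer and read it only in endElement. The class `NameCollector` and its `demo` method are invented for this illustration. `demo` calls the callbacks by hand to simulate a parser that delivers one <Name> value in two chunks, which a real parser is allowed to do.

```java
import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: buffers character data until the element ends.
class NameCollector extends DefaultHandler {
    private final StringBuilder buffer = new StringBuilder();
    String lastName; // text of the most recently closed <Name> element

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        buffer.setLength(0); // start collecting text for the new element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one text node, so append rather than assign.
        buffer.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("Name".equals(qName)) {
            lastName = buffer.toString().trim();
        }
    }

    // Simulates a parser that splits "Fairfax High School" into two characters events.
    static String demo() {
        NameCollector handler = new NameCollector();
        handler.startElement("", "Name", "Name", new AttributesImpl());
        char[] part1 = "Fairfax High".toCharArray();
        char[] part2 = " School".toCharArray();
        handler.characters(part1, 0, part1.length);
        handler.characters(part2, 0, part2.length);
        handler.endElement("", "Name", "Name");
        return handler.lastName;
    }
}
```

A handler that assigned the value inside characters would keep only the last chunk here, which is the bug this note warns about.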

1

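SAX is event-based: it reads the XML document serially and reports it through callbacks. Because the whole document is not loaded into memory, SAX is well suited to reading large XML documents. You might want to look at XPath, for example: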

XPathFactory xPathFactory = XPathFactory.newInstance(); 
XPath xPath = xPathFactory.newXPath(); 
String expression = "/Schools/School/ ..."; 
// Compile the expression to get an XPathExpression object. 
XPathExpression xPathExpression = xPath.compile(expression); 
Object result = xPathExpression.evaluate(xmlDocument); 
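For completeness, here is one way the snippet above might be fleshed out into something that runs. It is only a sketch: the class `XPathSketch` and its `evaluate` helper are made up for illustration, and it builds a full DOM tree first, so it gives up the memory advantage of SAX described above.

```java
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import java.io.StringReader;

// Hypothetical helper: evaluates an XPath expression against XML held in a String.
class XPathSketch {
    static String evaluate(String xml, String expression) throws Exception {
        // Load the whole document into memory as a DOM tree.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xPath = XPathFactory.newInstance().newXPath();
        // XPath.evaluate(String, Object) returns the string value of the first matching node.
        return xPath.evaluate(expression, doc);
    }
}
```

With the sample document, `/Schools/School[1]/Name` would select the first school's name and `/Schools/School[1]/Student[1]/Name` the name of its first student, so the two kinds of <Name> are told apart by the path alone.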
+0

What is possible in SAX is explained well in Jim Garrison's answer.

+1

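@Don Roby, granted, it is possible, but would you really do it this way with SAX? I think using SAX this way is overly complicated, and the same thing can be done more concisely with XPath (no flags, no stack), although I think – eon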

+1

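Writing a SAX application is more work, but it is worth it if you need the memory savings SAX gives you and can afford the extra programming effort. Still, I think you are right to worry: anyone who needs to ask the question posed here is going to have a hard time working with SAX.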

0
private boolean isInStudentNode; 
...................................................  

public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException { 
    // entering a Student node 
    if (qName.equalsIgnoreCase("Student")) { 
     isInStudentNode = true; 
    } 
    ... 
} 

public void endElement(String uri, String localName, String qName) throws SAXException { 
    // leaving a Student node 
    if (qName.equalsIgnoreCase("Student")) { 
     isInStudentNode = false; 
     ........... 
    } 

    // leaving a Name node (school or student) 
    if (qName.equalsIgnoreCase("Name")) { 
     if (isInStudentNode) student.setName(...); 
     else school.setName(...); 
    } 
} 

