2014-01-31 297 views
0

Is there any "faster" way to parse an XML file with libxml2? Right now I do it with the following C++ code:

#include <stdio.h>
#include <libxml/parser.h>
#include <libxml/tree.h>

void parse_element_names(xmlNode * a_node, int *calls)
{ 
    xmlNode *cur_node = NULL; 

    for (cur_node = a_node; cur_node; cur_node = cur_node->next) {
        (*calls)++;
        if (xmlStrEqual(xmlCharStrdup("to"), cur_node->name)) {
            //printf("node type: <%d>, name <%s>, content: <%s> \n", cur_node->children->type, cur_node->children->name, cur_node->children->content);
            //do something with the content
            parse_element_names(cur_node->children->children, calls);
        }
        else if (xmlStrEqual(xmlCharStrdup("from"), cur_node->name)) {
            //printf("node type: <%d>, name <%s>, content: <%s> \n", cur_node->children->type, cur_node->children->name, cur_node->children->content);
            //do something with the content
            parse_element_names(cur_node->children->children, calls);
        }
        else if (xmlStrEqual(xmlCharStrdup("note"), cur_node->name)) {
            //printf("node type: <%d>, name <%s>, content: <%s> \n", cur_node->children->type, cur_node->children->name, cur_node->children->content);
            //do something with the content
            parse_element_names(cur_node->children->children, calls);
        }
        .
        .
        .
        //about 100 more node names coming
        else {
            parse_element_names(cur_node->children, calls);
        }
    }

} 
int main(int argc, char **argv) 
{ 

    xmlDoc *doc = NULL; 
    xmlNode *root_element = NULL; 

    if (argc != 2)
        return(1);

    /*parse the file and get the DOM */
    doc = xmlReadFile(argv[1], NULL, XML_PARSE_NOBLANKS);

    if (doc == NULL) {
        printf("error: could not parse file %s\n", argv[1]);
        return(1);
    }
    int calls = 0; 
    /*Get the root element node */ 
    root_element = xmlDocGetRootElement(doc); 
    parse_element_names(root_element,&calls); 

    /*free the document */ 
    xmlFreeDoc(doc); 

    xmlCleanupParser(); 

    return 0; 
} 

Is this really the fastest way? Or is there a better/faster solution you can suggest?

Thanks

+0

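Using a hash table should be faster if you have many node names. Note that the way you use 'xmlCharStrdup' leaks memory: every call allocates a new copy of the string that is never freed. Simply cast the string literal to 'const xmlChar *' instead. Dereferencing 'children' twice also looks wrong to me. – nwellnhof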

Answers

2

xmlReadFile et al. are built on top of libxml2's SAX parser interface (actually, the SAX2 interface), so it is generally faster to use your own SAX parser if you don't need the resulting xmlDoc.

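If you have many different element names to distinguish, as in your example, the fastest approach is usually to create a separate function for each type of node and use a hash table to look up these functions.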
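A minimal sketch of that dispatch idea, kept independent of libxml2 so it compiles on its own. Everything here (ParseState, the handle_* functions, dispatch) is a hypothetical name chosen for illustration, not part of libxml2. In the tree walk from the question, the name argument would come from cur_node->name cast to const char *.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical per-document state that the handlers fill in.
struct ParseState {
    std::vector<std::string> recipients;
    std::vector<std::string> senders;
    int unknown = 0;  // number of element names without a handler
};

// One function per element type; the signature is an assumption.
using Handler = void (*)(ParseState&, const std::string& content);

void handle_to(ParseState& s, const std::string& content) {
    s.recipients.push_back(content);
}

void handle_from(ParseState& s, const std::string& content) {
    s.senders.push_back(content);
}

void handle_note(ParseState&, const std::string&) {
    // Container element: nothing to record in this sketch.
}

// The table is built once on first use; each lookup is O(1) on average,
// no matter how many element names are registered.
const std::unordered_map<std::string, Handler>& handler_table() {
    static const std::unordered_map<std::string, Handler> table = {
        {"to", handle_to},
        {"from", handle_from},
        {"note", handle_note},
    };
    return table;
}

// Looks up the handler for an element name and calls it.
// Returns false (and counts the element) if no handler is registered.
bool dispatch(ParseState& s, const std::string& name, const std::string& content) {
    const auto& table = handler_table();
    auto it = table.find(name);
    if (it == table.end()) {
        ++s.unknown;
        return false;
    }
    it->second(s, content);
    return true;
}
```

With this in place, the chain of about 100 else-if branches collapses into a single dispatch(state, name, content) call per node, and adding a new element type only means writing one function and adding one table entry.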