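Reading content from an XML file and storing it in an array

This is my first time working with XML, and I am having trouble storing the contents of an XML file in an array. I use libxml2 to parse the XML file, and I can retrieve the data and print it. The code is given below: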

#include <stdio.h> 
#include <string.h> 
#include <stdlib.h> 
#include <libxml/xmlmemory.h> 
#include <libxml/parser.h> 
#include <wchar.h> 

wchar_t buffer[7][50]={"\0"}; 

static void parseDoc(const char *docname) 
{ 

    xmlDocPtr doc; 
    xmlNodePtr cur; 
    xmlChar *key; 
    int i=0; 
    doc = xmlParseFile(docname); 

    if (doc == NULL) { 

    fprintf(stderr,"Document not parsed successfully. \n"); 
    return; 
    } 

    cur = xmlDocGetRootElement(doc); 

    if (cur == NULL) 
    { 
     fprintf(stderr,"empty document\n"); 
     xmlFreeDoc(doc); 
     return; 
    } 

    cur = cur->xmlChildrenNode; 

    while (cur != NULL) 
    { 
     key = xmlNodeListGetString(doc, cur->xmlChildrenNode, 1); 
     wmemcpy(buffer[i],(wchar_t*)(key),size(key)); /*segmentation fault at this stage*/   
     printf("Content : %s\n", key); 
     xmlFree(key); 
     i++; 
     cur = cur->next; 
    } 
    xmlFreeDoc(doc); 
    return; 
} 

int main(void) 
{ 
    const char *docname="/home/workspace/TestProject/Text.xml"; 
    parseDoc (docname); 
    return (1); 
}

The sample XML file is given below:

<?xml version="1.0"?> 
<story> 
    <author>John Fleck</author> 
    <datewritten>June 2, 2002</datewritten> 
    <keyword>example keyword</keyword> 
    <headline>This is the headline</headline> 
    <para>This is the body text.</para> 
</story>

When printed to the screen, the output for the file's contents is as follows:

Content : null

Content : John Fleck

Content : null

Content : June 2, 2002

Content : null

Content : example keyword

Content : null

Content : This is the headline

Content : null

Content : This is the body text.

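I suspect that the content being null in a few places causes the problem in the copy and therefore the segmentation fault. Please let me know how to fix this and whether there is a better way to do it. I have done a similar XML file read with the MSXML parser; this is my first time using the Linux APIs.

Edit: I now perform the copy as shown below, but the contents of the wchar_t array are garbled. Further help would be appreciated.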

while (cur != NULL) { 

    key = xmlNodeListGetString(doc, cur->xmlChildrenNode, 1); 
    if(key!=NULL) 
    { 
     wmemcpy(DiscRead[i],(const wchar_t *)key,sizeof(key)); 
     i++; 
    } 

    printf("keyword: %s\n", key); 
    xmlFree(key); 

    cur = cur->next; 
}

Source

2013-09-23 Santhosh Pai

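Your code has several problems:

You use wchar_t for your string array. That does not work with the UTF-8 encoded strings you get from libxml2. You should stick with xmlChar or use char.
You use xmlNodeListGetString to get the text content of a node, passing cur->xmlChildrenNode as the node list. The latter is NULL for text nodes, so xmlNodeListGetString returns NULL as an error condition. You should simply call xmlNodeGetContent on the current node, but only if it is an element node.
Using xmlChildrenNode as a field name is deprecated. You should use children.
Calling wmemcpy is dangerous. I'd suggest something safer like strlcpy.

Try something like this: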
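To make the first and last points concrete, here is a small sketch of the arithmetic behind the wmemcpy call. The helper names are hypothetical and exist only for illustration. The point is that wmemcpy counts in wide characters, so passing a byte count makes it read several times more bytes than the UTF-8 string actually owns.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Bytes that wmemcpy(dst, src, n) reads from src: n wide characters, not n bytes. */
static size_t wmemcpy_bytes_read(size_t n)
{
    return n * sizeof(wchar_t);
}

/* Bytes that a NUL-terminated UTF-8 string really occupies, terminator included. */
static size_t utf8_bytes_owned(const char *s)
{
    assert(s != NULL);
    return strlen(s) + 1;
}

/* Returns 1 if wmemcpy with count n would read past the end of s, else 0. */
static int wmemcpy_overreads(const char *s, size_t n)
{
    return wmemcpy_bytes_read(n) > utf8_bytes_owned(s);
}
```

Passing the string length as the count, as the question's code appears to intend, makes wmemcpy_overreads report an over-read for "John Fleck" on any platform where wchar_t is wider than one byte. Even when the read stays in bounds, the destination ends up holding UTF-8 bytes packed into wide characters, which is why the array looks garbled.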

char buffer[7][50]; 

static void parseDoc(const char *docname) 
{ 
    xmlDocPtr doc; 
    xmlNodePtr cur; 
    xmlChar *key; 
    int i = 0; 
    doc = xmlParseFile(docname); 

    if (doc == NULL) { 
     fprintf(stderr, "Document not parsed successfully. \n"); 
     return; 
    } 

    cur = xmlDocGetRootElement(doc); 

    if (cur == NULL) { 
     fprintf(stderr, "empty document\n"); 
     xmlFreeDoc(doc); 
     return; 
    } 

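    /* Visit only element children; whitespace between tags shows up as text nodes. */
    for (cur = cur->children; cur != NULL; cur = cur->next) { 
     if (cur->type != XML_ELEMENT_NODE) 
      continue; 
     key = xmlNodeGetContent(cur); 
     /* xmlChar is unsigned char, so cast for the char-based string functions. */
     strlcpy(buffer[i], (const char *)key, sizeof(buffer[i])); 
     printf("Content : %s\n", key); 
     xmlFree(key); 
     i++; 
    } 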

    xmlFreeDoc(doc); 
}

You should also check that i does not exceed the number of strings the array can hold.

Source
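One possible way to combine the index check with a bounded copy is sketched below. It uses only the standard C library, so it does not depend on strlcpy being available. The helper store_entry and the two macros are hypothetical names chosen for this example; the sizes mirror the buffer[7][50] array above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ENTRIES 7   /* rows in the buffer, as in buffer[7][50] */
#define ENTRY_SIZE 50   /* bytes per row, including the terminating NUL */

/*
 * Copy text into the next free row of table.
 * Returns 0 on success, 1 if the text was truncated to fit,
 * and -1 if the table is already full (nothing is written).
 */
static int store_entry(char table[][ENTRY_SIZE], size_t *count, const char *text)
{
    assert(table != NULL && count != NULL && text != NULL);

    if (*count >= MAX_ENTRIES)
        return -1;                       /* refuse to write past the last row */

    char *dst = table[*count];
    strncpy(dst, text, ENTRY_SIZE - 1);  /* may leave dst without a terminator */
    dst[ENTRY_SIZE - 1] = '\0';          /* so terminate explicitly */
    (*count)++;

    return strlen(text) >= ENTRY_SIZE ? 1 : 0;
}
```

Inside the loop this would be called as store_entry(buffer, &count, (const char *)key), with the loop stopping once it returns -1. Note that truncation works on bytes, so a long UTF-8 string may be cut in the middle of a multibyte character.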

2013-09-23 18:34:45 nwellnhof

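Thank you very much for the answer, that was it.

strlcpy didn't work, so I used memcpy instead. Is there a way to convert the result into a wchar_t array? Any help would be great.

Don't use 'memcpy' to copy strings. If you can't use 'strlcpy', try 'strncpy', but note that it has some caveats. On Linux you usually don't need 'wchar_t', because UTF-8 is the preferred character encoding. – nwellnhof

The buffer array is not big enough. Increase the buffer size to buffer[7+3][50].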
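If a wide-character array is genuinely required, the bytes have to be converted rather than cast. A minimal sketch using the standard mbstowcs function follows; to_wide is a hypothetical helper name. It assumes the program has already selected a UTF-8 locale, for example with setlocale(LC_ALL, "") from <locale.h>; in the default "C" locale only ASCII input is guaranteed to convert.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <wchar.h>

/*
 * Convert a NUL-terminated multibyte string (UTF-8 when a UTF-8 locale
 * is active) into dst, which has room for dst_len wide characters.
 * Returns the number of wide characters written, excluding the
 * terminator, or (size_t)-1 if src is invalid in the current locale.
 */
static size_t to_wide(wchar_t *dst, size_t dst_len, const char *src)
{
    assert(dst != NULL && src != NULL && dst_len > 0);

    size_t n = mbstowcs(dst, src, dst_len - 1);
    if (n == (size_t)-1) {
        dst[0] = L'\0';   /* leave an empty string on conversion failure */
        return n;
    }
    dst[n] = L'\0';       /* mbstowcs does not terminate when the output is full */
    return n;
}
```

With libxml2 the source argument would be (const char *)key, and the destination one row of a wchar_t array. Output that is longer than the row is truncated at a whole wide character.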

wchar_t buffer[7][50]={"\0"}; 
... 
while (cur != NULL) { 
    wmemcpy(buffer[i],(wchar_t*)(key),size(key)); /*segmentation fault */ 
    printf("Content : %s\n", key); 
    ... 
    i++; 
}

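The output is 10 lines of "Content : ...". So i is incremented from 0 to 9, but buffer can only be indexed from 0 to 6. Accessing index 7 and above is undefined behavior, which eventually shows up as a segfault.

Source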

2013-09-23 06:23:44 chux

This didn't work, but I changed the logic and now I can copy, although the contents are somewhat garbled. Any idea how to fix this? I have updated my question.
