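I'm having trouble parsing some XML data in C#. I get "There is an error in XML document (155, 23)", yet the XML itself appears to be fine, and the failure always happens on page 13.
Method summary:
The method takes a keyword and searches www.clinicaltrials.gov for it through the site's URI. For example:
http://www.clinicaltrials.gov/ct2/results?term=ALL&Search=Search&displayxml=true
That URI returns the matching clinical studies as XML. Because there is so much clinical data, each page holds only 20 studies. To get to the next page you append &pg=2 for the second page, and so on. My code walks through all the pages and deserializes each one into C# objects.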
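The paging arithmetic described above can be sketched roughly as follows. This is only an illustration, not the code from the question: PagingSketch, TotalPages and BuildPageUri are hypothetical names, and escaping the keyword with Uri.EscapeDataString is an added assumption that the original code does not make.

```csharp
using System;

public static class PagingSketch
{
    // Page size stated in the question: 20 studies per page.
    public const int StudiesPerPage = 20;

    // Number of pages needed to hold totalStudies results (integer ceiling division).
    public static int TotalPages(int totalStudies, int pageSize = StudiesPerPage)
    {
        if (pageSize <= 0) throw new ArgumentOutOfRangeException(nameof(pageSize));
        if (totalStudies <= 0) return 0;
        return (totalStudies + pageSize - 1) / pageSize;
    }

    // Builds the URI for one result page. The keyword is escaped (an assumption,
    // not in the original code) so that spaces or '&' cannot break the query string.
    public static string BuildPageUri(string baseUri, string keyword, int page)
    {
        return baseUri + "ct2/results?term=" + Uri.EscapeDataString(keyword)
             + "&Search=Search&displayxml=true&pg=" + page;
    }
}
```

For instance, PagingSketch.TotalPages(241) gives 13, and PagingSketch.BuildPageUri("http://www.clinicaltrials.gov/", "ALL", 13) gives the page-13 URI used in this question.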
Problem:
The problem is that when it reaches page 13, it crashes with the following error:
InvalidOperationException was unhandled: There is an error in XML document (155, 23)
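When I copy the XML for page 13, page 12, or any other page close to page 13 into an XML validator, it says the XML is fine. When I look through the XML myself, I can't find any error. I wondered whether memory might be full, but after only 240 objects? If I search for a keyword that returns fewer than 13 pages of results, the results are retrieved without problems.
Here is the code I wrote to retrieve and parse the XML: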
public List<search_resultsClinical_study> SearchStudyByKeyword(string keyword)
{
    int currentPage = 1;
    double numberOfStudiesOnAPage = 20;
    double totalPages = 1; // if not, it will crash anyway
    List<search_results> searchResult = new List<search_results>();
    try
    {
        while (totalPages >= currentPage)
        {
            // crashes if the search is larger than 13 pages... have to figure out why...
            string newUri = URI + "ct2/results?term=" + keyword + "&Search=Search&displayxml=true&pg=" + currentPage;
            System.Xml.Serialization.XmlSerializer reader = new System.Xml.Serialization.XmlSerializer(typeof(search_results));
            XmlReader xmlReader = XmlReader.Create(newUri);
            search_results studies = new search_results();
            studies = (search_results)reader.Deserialize(xmlReader);
            searchResult.Add(studies);
            // the count attribute holds the total number of hits, so recompute the page count
            totalPages = Math.Ceiling((double)studies.count / numberOfStudiesOnAPage);
            currentPage += 1;
        }
        //return searchResult;
        // Append all studies to one list, easier to handle for the user
        List<search_resultsClinical_study> result = new List<search_resultsClinical_study>();
        foreach (search_results sr in searchResult)
        {
            foreach (search_resultsClinical_study cs in sr.clinical_study)
            {
                result.Add(cs);
            }
        }
        return result;
    }
    catch (WebException)
    {
        Debug.Write("404 - Might be an invalid search term ");
        return null;
    }
}
The error occurs on the following line:
studies = (search_results)reader.Deserialize(xmlReader);
The search_results class:
/// <remarks/>
[System.Xml.Serialization.XmlTypeAttribute(AnonymousType = true)]
[System.Xml.Serialization.XmlRootAttribute(Namespace = "", IsNullable = false)]
public partial class search_results
{
    private string queryField;
    private search_resultsClinical_study[] clinical_studyField;
    private uint countField;

    /// <remarks/>
    public string query
    {
        get
        {
            return this.queryField;
        }
        set
        {
            this.queryField = value;
        }
    }

    /// <remarks/>
    [System.Xml.Serialization.XmlElementAttribute("clinical_study")]
    public search_resultsClinical_study[] clinical_study
    {
        get
        {
            return this.clinical_studyField;
        }
        set
        {
            this.clinical_studyField = value;
        }
    }

    /// <remarks/>
    [System.Xml.Serialization.XmlAttributeAttribute()]
    public uint count
    {
        get
        {
            return this.countField;
        }
        set
        {
            this.countField = value;
        }
    }
}
/// <remarks/>
[System.Xml.Serialization.XmlTypeAttribute(AnonymousType = true)]
public partial class search_resultsClinical_study
{
    private byte orderField;
    private decimal scoreField;
    private string nct_idField;
    private string urlField;
    private string titleField;
    private search_resultsClinical_studyStatus statusField;
    private string condition_summaryField;
    private string last_changedField;

    /// <remarks/>
    public byte order
    {
        get
        {
            return this.orderField;
        }
        set
        {
            this.orderField = value;
        }
    }

    /// <remarks/>
    public decimal score
    {
        get
        {
            return this.scoreField;
        }
        set
        {
            this.scoreField = value;
        }
    }

    /// <remarks/>
    public string nct_id
    {
        get
        {
            return this.nct_idField;
        }
        set
        {
            this.nct_idField = value;
        }
    }

    /// <remarks/>
    public string url
    {
        get
        {
            return this.urlField;
        }
        set
        {
            this.urlField = value;
        }
    }

    /// <remarks/>
    public string title
    {
        get
        {
            return this.titleField;
        }
        set
        {
            this.titleField = value;
        }
    }

    /// <remarks/>
    public search_resultsClinical_studyStatus status
    {
        get
        {
            return this.statusField;
        }
        set
        {
            this.statusField = value;
        }
    }

    /// <remarks/>
    public string condition_summary
    {
        get
        {
            return this.condition_summaryField;
        }
        set
        {
            this.condition_summaryField = value;
        }
    }

    /// <remarks/>
    public string last_changed
    {
        get
        {
            return this.last_changedField;
        }
        set
        {
            this.last_changedField = value;
        }
    }
}
/// <remarks/>
[System.Xml.Serialization.XmlTypeAttribute(AnonymousType = true)]
public partial class search_resultsClinical_studyStatus
{
    private string openField;
    private string valueField;

    /// <remarks/>
    [System.Xml.Serialization.XmlAttributeAttribute()]
    public string open
    {
        get
        {
            return this.openField;
        }
        set
        {
            this.openField = value;
        }
    }

    /// <remarks/>
    [System.Xml.Serialization.XmlTextAttribute()]
    public string Value
    {
        get
        {
            return this.valueField;
        }
        set
        {
            this.valueField = value;
        }
    }
}
The XML that fails:
http://www.clinicaltrials.gov/ct2/results?term=ALL&Search=Search&displayxml=true&pg=13
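Does anyone have a clue why this error occurs? I have also added an XmlSchema and tried generating the C# classes from that schema!
Thanks for your help!
Do this simple test: before trying to deserialize, dump each page to disk. You can do it like this: http://stackoverflow.com/questions/3988832/how-to-create-an-xml-file-from-a-xmlreader Afterwards, try to deserialize the files from disk. – 2014-10-01 09:00:28
Hey, thanks for the reply! Even when I dump each page to disk before trying to deserialize, I still get the same error. – 2014-10-01 09:33:56
Attach the specific XML you are having trouble with and add the structure of search_results. – 2014-10-01 10:02:41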
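One more way to narrow this down: XmlSerializer wraps the underlying failure in the InvalidOperationException, so its InnerException usually names the real cause. The following is a minimal sketch of that check, not a confirmed diagnosis. StudySketch and DeserializeDiagnostics are hypothetical names, and the class is a cut-down stand-in for search_resultsClinical_study. The assumption being tested (not confirmed anywhere in this thread) is that the order element is a running index across pages: with 20 studies per page, page 13 would then be the first page on which order exceeds 255, the largest value a byte can hold.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical stand-in for one <clinical_study> element; only the field under test is kept.
[XmlRoot("clinical_study")]
public class StudySketch
{
    public byte order { get; set; }
}

public static class DeserializeDiagnostics
{
    // Returns null when the XML deserializes cleanly; otherwise returns the outer
    // message together with the type and message of the inner exception.
    public static string Diagnose(string xml)
    {
        var serializer = new XmlSerializer(typeof(StudySketch));
        try
        {
            using (var reader = new StringReader(xml))
            {
                serializer.Deserialize(reader);
            }
            return null;
        }
        catch (InvalidOperationException ex)
        {
            // The outer exception only reports (line, column); the inner one says why.
            Exception cause = ex.InnerException ?? ex;
            return ex.Message + " -> " + cause.GetType().Name + ": " + cause.Message;
        }
    }
}
```

Calling DeserializeDiagnostics.Diagnose with an order of 255 returns null, while an order of 256 returns a description of the failure. If the same pattern shows up with the real page-13 XML, declaring order as int instead of byte would be the thing to try.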