2015-11-07 79 views
3

iTextSharp System.NullReferenceException

I'm trying to write a method that returns a PDF as bytes:

public static Byte[] HtmlToBytes(string htmlText)
{
    Byte[] bytes;

    using (var ms = new MemoryStream())
    {
        using (var doc = new Document(PageSize.A4, 10, 10, 10, 10))
        {
            using (var writer = PdfWriter.GetInstance(doc, ms))
            {
                writer.CloseStream = false;
                doc.Open();
                using (var msHtml = new MemoryStream(Encoding.UTF8.GetBytes(htmlText)))
                {
                    XMLWorkerHelper.GetInstance().ParseXHtml(writer, doc, msHtml, Encoding.UTF8);
                }
            }
        }
        bytes = ms.ToArray();
    }

    return bytes;
}

But at this line

XMLWorkerHelper.GetInstance().ParseXHtml(writer, doc, msHtml, Encoding.UTF8);

it throws a NullReferenceException, even though the string I pass in is just plain HTML such as tables and other tags.

Here's the stack trace:

[NullReferenceException: Object reference not set to an instance of an object.] 
iTextSharp.tool.xml.pipeline.html.HtmlPipeline.Close(IWorkerContext context, Tag t, ProcessObject po) +76 
iTextSharp.tool.xml.XMLWorker.EndElement(String tag, String ns) +186 
iTextSharp.tool.xml.parser.XMLParser.EndElement() +111 
iTextSharp.tool.xml.parser.state.ClosingTagState.Process(Char character) +61 
iTextSharp.tool.xml.parser.XMLParser.ParseWithReader(TextReader reader) +247 
iTextSharp.tool.xml.parser.XMLParser.Parse(TextReader reader) +5 
iTextSharp.tool.xml.XMLWorkerHelper.ParseXHtml(PdfWriter writer, Document doc, TextReader inp) +453 
TCC.Globals.HtmlToBytes(String htmlText) in C:\Users\Felipe\Source\Workspaces\Workspace\SgLeitos\TCC\TCC\Helpers\Globals.cs:118 
TCC.Controllers.RelatoriosController.Leitos(Nullable`1 id) in C:\Users\Felipe\Source\Workspaces\Workspace\SgLeitos\TCC\TCC\Controllers\RelatoriosController.cs:34 
lambda_method(Closure , ControllerBase , Object[]) +107 
System.Web.Mvc.ActionMethodDispatcher.Execute(ControllerBase controller, Object[] parameters) +14 
System.Web.Mvc.ReflectedActionDescriptor.Execute(ControllerContext controllerContext, IDictionary`2 parameters) +157 ... 
+0

If you set a breakpoint on the XMLWorkerHelper line, has the msHtml variable actually been instantiated? – gardarvalur

+0

Yes, it is not null. –

Answers

2

I tried the following:

public static Byte[] HtmlToBytes(string htmlText)
{
    Byte[] bytes;

    using (var ms = new MemoryStream())
    {
        using (var doc = new Document(PageSize.A4, 10, 10, 10, 10))
        {
            using (var writer = PdfWriter.GetInstance(doc, ms))
            {
                writer.CloseStream = false;
                doc.Open();
                using (var msHtml = new MemoryStream(Encoding.UTF8.GetBytes(htmlText)))
                {
                    XMLWorkerHelper.GetInstance().ParseXHtml(writer, doc, msHtml, Encoding.UTF8);
                }
            }
        }
        bytes = ms.ToArray();
    }

    return bytes;
}

private static void Main(string[] args)
{
    // Case 1: a complete, well-formed XHTML document. Works.
    var str = @"<!DOCTYPE html><html lang=""en"" xmlns=""http://www.w3.org/1999/xhtml""><head><meta charset=""utf-8"" /><title></title></head><body><table border=""1"" style=""width:100%""><tr><td>Jill</td><td>Smith</td><td>50</td></tr><tr><td>Eve</td><td>Jackson</td><td>94</td></tr></table></body></html>";
    var s = HtmlToBytes(str);

    // Case 2: a well-formed table fragment on its own. Works.
    var str2 = @"<table border=""1"" style=""width:100%""><tr><td>Jill</td><td>Smith</td><td>50</td></tr><tr><td>Eve</td><td>Jackson</td><td>94</td></tr></table>";
    s = HtmlToBytes(str2);

    // Case 3: the opening tag is corrupted (<tabl=...>).
    var str3 = @"<tabl=""width:100%""><tr><td>Jill</td><td>Smith</td><td>50</td></tr><tr><td>Eve</td><td>Jackson</td><td>94</td></tr></table>";
    s = HtmlToBytes(str3); // NullReferenceException here, with the corrupted HTML
}

So the likely answer is that your HTML is malformed.
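One way to catch this before iTextSharp sees the input is to check that the markup is well-formed XML first. The sketch below is only an illustration: `XhtmlCheck` and `IsWellFormed` are hypothetical names, not part of iTextSharp, and it relies solely on `System.Xml`. It assumes the input has a single root element, and it is probably stricter than XMLWorker itself (for example, HTML entities such as `&nbsp;` would be rejected), so treat a failure as a hint rather than proof.

```csharp
using System;
using System.IO;
using System.Xml;

public static class XhtmlCheck
{
    // Hypothetical helper: returns true when the markup parses as well-formed XML.
    // On failure, 'error' carries the parser's message, which includes line/position.
    public static bool IsWellFormed(string markup, out string error)
    {
        var settings = new XmlReaderSettings
        {
            // Skip over a leading <!DOCTYPE html> instead of rejecting it.
            DtdProcessing = DtdProcessing.Ignore,
            // Never resolve external resources while checking.
            XmlResolver = null
        };

        try
        {
            using (var reader = XmlReader.Create(new StringReader(markup), settings))
            {
                // Reading to the end forces the whole document to be parsed.
                while (reader.Read()) { }
            }
            error = null;
            return true;
        }
        catch (XmlException ex)
        {
            error = ex.Message;
            return false;
        }
    }
}
```

Calling `XhtmlCheck.IsWellFormed(htmlText, out var error)` before `HtmlToBytes(htmlText)` would let the caller log `error` and reject the input, instead of hitting a NullReferenceException deep inside the parser.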

+0

That was it. I found that my HTML-to-string converter was also inserting \r and \n, as well as tabs, into the output. –

+1

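@FelipeDeguchi Likewise, take a look at how I solved it. It isn't hard, and you could have tested it the same way yourself. I imported the NuGet package in a few seconds ([tutorial](https://docs.nuget.org/consume/package-manager-console)) and checked 3 test cases without any prior knowledge of the library used in this example (iTextSharp). I didn't know in advance how to solve a problem like yours either, and I'm not blaming you; I was at your stage once too. But in the future, the best way to practice programming and problem solving is to try on your own first. I'd also recommend reading [What have you tried?](http://mattgemmell.com/what-have-you-tried/). GL :) –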
