2016-02-19

Python writes only part of a string to a CSV row

I am trying to write all matches of a tag to a CSV file in Python. My string is:

<pre class="CodeRay highlight"><code data-lang="java"><span class="annotation">@CDIUI</span>(<span class="string"><span class="delimiter">"</span><span class="content">cdievents</span><span class="delimiter">"</span></span>) 
<span class="annotation">@Theme</span>(<span class="string"><span class="delimiter">"</span><span class="content">valo</span><span class="delimiter">"</span></span>) 
<span class="directive">public</span> <span class="type">class</span> <span class="class">CDIEventUI</span> <span class="directive">extends</span> UI { 
    <span class="annotation">@Inject</span> 
    InputPanel inputPanel; 

    <span class="annotation">@Inject</span> 
    DisplayPanel displayPanel; 

    <span class="annotation">@Override</span> 
    <span class="directive">protected</span> <span class="type">void</span> init(VaadinRequest request) { 
     Layout content = 
      <span class="keyword">new</span> HorizontalLayout(inputPanel, displayPanel); 
     setContent(content); 
    } 
}</code></pre> 

The Python code I use to write the hits to a CSV file is:

hits = soup.find_all("pre", "CodeRay highlight")# "programlisting") 
f = open('extractedsuorceTEST2.csv','ab') 
writer = csv.writer(f) 
writer.writerow(('page', hits[0].text.encode('UTF-8').replace('Â',' '))) 

With this code, hits[0] contains:

'@CDIUI("cdievents")\n@Theme("valo")\npublic class CDIEventUI extends UI {\n @Inject\n InputPanel inputPanel;\n\n @Inject\n DisplayPanel displayPanel;\n\n @Override\n protected void init(VaadinRequest request) {\n  Layout content =\n   new HorizontalLayout(inputPanel, displayPanel);\n  setContent(content);\n }\n}'

But the result written to the CSV file is:

@CDIUI(""cdievents"") 
@Theme(""valo"") 
public class CDIEventUI extends UI { 
    @Inject 
    InputPanel inputPanel; 

    @Inject 
    DisplayPanel displayPanel; 

    @Override 
    protected void init(VaadinRequest request) { 
     Layout content = 

when it should be:

@CDIUI("cdievents") 
@Theme("valo") 
public class CDIEventUI extends UI { 
    @Inject 
    InputPanel inputPanel; 

    @Inject 
    DisplayPanel displayPanel; 

    @Override 
    protected void init(VaadinRequest request) { 
     Layout content = 
      new HorizontalLayout(inputPanel, displayPanel); 
     setContent(content); 
    } 
} 

Can anyone suggest a solution? Thanks.

What does hits[0].text.encode('UTF-8').replace('Â',' ') contain? – jsfan

hits[0].text.encode('UTF-8').replace('Â',' '): '@CDIUI("cdievents")\n@Theme("valo")\npublic class CDIEventUI extends UI {\n @Inject\n InputPanel inputPanel;\n\n @Inject\n DisplayPanel displayPanel;\n\n @Override\n protected void init(VaadinRequest request) {\n  Layout content =\n   new HorizontalLayout(inputPanel, displayPanel);\n  setContent(content);\n }\n}' – user3707761

The string is complete at this point, but when I write it to the CSV file with writer.writerow, it gets truncated! – user3707761

Answer


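You have to be careful not to just abandon the file or the csv.writer object. If the file is never closed, whatever is still sitting in its write buffer may not reach the disk, which would explain output that stops partway through.

Try changing your code to: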

hits = soup.find_all("pre", "CodeRay highlight")# "programlisting") 
with open('extractedsuorceTEST2.csv','ab') as f: 
    writer = csv.writer(f) 
    writer.writerow(('page', hits[0].text.encode('UTF-8').replace('Â',' '))) 
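
For readers on Python 3, here is a minimal sketch of the same idea, not the asker's exact setup. It assumes Python 3 (text mode with newline="" instead of 'ab'), uses a shortened stand-in string in place of hits[0].text, and writes to a hypothetical temporary path. It also shows that the doubled quotes in the raw file are ordinary CSV escaping: reading the row back with csv.reader returns the original string, embedded newlines included.

```python
import csv
import os
import tempfile

# Stand-in for hits[0].text from the question: a multi-line string with quotes.
snippet = (
    '@CDIUI("cdievents")\n'
    '@Theme("valo")\n'
    'public class CDIEventUI extends UI {\n'
    '    Layout content =\n'
    '      new HorizontalLayout(inputPanel, displayPanel);\n'
    '    setContent(content);\n'
    '}'
)

# Hypothetical output path, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "extracted_demo.csv")

# The with-block closes the file on exit, so buffered rows are flushed to disk.
# newline="" lets the csv module control line endings (Python 3 convention).
with open(path, "a", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(("page", snippet))

# Parse the file with the csv module to recover the field as one value.
with open(path, newline="", encoding="utf-8") as f:
    rows = list(csv.reader(f))

# Also keep the raw text to see how the quotes were escaped on disk.
with open(path, newline="", encoding="utf-8") as f:
    raw = f.read()
```

In this sketch rows holds a single row whose second field equals snippet, while raw shows the quotes doubled inside one quoted field. A viewer that treats every physical line as a record may therefore display the field as if it were cut off even when the file is complete.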

If that still fails, check the field size limit and increase it if necessary.
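
A hedged sketch of checking and raising that limit with csv.field_size_limit, assuming Python 3 and an in-memory buffer with made-up data. As the Python documentation describes it, this limit is enforced by the parser (csv.reader), so it is more likely to matter when the file is read back than when it is written.

```python
import csv
import io

# Calling with no argument returns the current limit without changing it.
current_limit = csv.field_size_limit()

# A field one character longer than the current limit (illustrative data).
big_field = "x" * (current_limit + 1)
buffer = io.StringIO()
csv.writer(buffer).writerow(("page", big_field))

# Reading it back fails while the limit is too small...
buffer.seek(0)
try:
    list(csv.reader(buffer))
    too_large = False
except csv.Error:
    too_large = True

# ...and succeeds once the limit is raised. The call returns the old limit.
old_limit = csv.field_size_limit(current_limit * 2)
buffer.seek(0)
rows = list(csv.reader(buffer))

# Restore the previous limit so other code is unaffected.
csv.field_size_limit(old_limit)
```

Note that the writer accepted the oversized field without complaint; only the reader rejected it until the limit was increased.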

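Thanks for your help. But what about the rest of the truncated string: new HorizontalLayout(inputPanel, displayPanel); setContent(content); } } – user3707761

Sorry, I didn't realise that was the problem. I spotted the first difference, but not the truncation. I'll edit my answer. – jsfan

Sorry, CSVWriter does not provide context manager functionality. I misremembered that and didn't check the documentation. I've fixed my code. – jsfan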
