Python BeautifulSoup4 get_text() or regular expression

I am using Python 2.7.5 and BeautifulSoup4. I need to extract the text from an HTML tag. I get the output <a class="username offline popupctrl" href="member.php?20938-NarutoO" title="NarutoO je offline"><strong><font color="#5566CC">NarutoO</font></strong></a> after the command:

print post_owner[0]

I only need the nickname, NarutoO, and I would prefer not to use get_text().

My code:

post_owner = soup.findAll(attrs={'class':'username offline popupctrl'}) 
for row1 in post_owner: 
    text = ''.join(row1.findAll(text=True)) 
    data1 = text.strip() 
    text_file.write("USER NAME\n") 
    member_count = member_count + 1 
    data1 = data1.encode('utf-8') 
    text_file.write(str(data1) + '\n')

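I have tried some solutions from other posts. If I understand correctly, findAll gives me a list of all matches, and my code writes out every match one after another. I only need to access individual elements of the post_owner list and use them without the HTML tags. For example: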

print post_owner[0]
print post_owner[4]
print post_owner[2]
.
.
.

Sorry for the poor explanation, I am really tired :o

Source

2015-05-27 bezoadam

Why don't you want to use 'get_text' when it is clearly the best option? – jwilner

Because when I use get_text with findAll, it returns an error. – bezoadam

What exactly is the error message? – har07
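The exact error was never posted, so the following is only a guess at the cause: findAll returns a list-like ResultSet, and get_text() exists on individual tags, not on the list as a whole. A minimal sketch, assuming the markup from the question plus one made-up second user ("Example"), and using find_all, the current spelling of findAll:

```python
from bs4 import BeautifulSoup

# Sample markup modelled on the tag in the question.
# The second link ("Example") is hypothetical, added only to show indexing.
html = '''
<a class="username offline popupctrl" href="member.php?20938-NarutoO" title="NarutoO je offline"><strong><font color="#5566CC">NarutoO</font></strong></a>
<a class="username offline popupctrl" href="member.php?1-Example" title="Example je offline"><strong>Example</strong></a>
'''

soup = BeautifulSoup(html, 'html.parser')
post_owner = soup.find_all(attrs={'class': 'username offline popupctrl'})

# Calling get_text() on the whole result set raises AttributeError,
# because the result set is a list of tags, not a single tag.
try:
    post_owner.get_text()
except AttributeError as exc:
    print('get_text() on the result set fails:', exc)

# Index into the list first, then ask the single tag for its text.
first_name = post_owner[0].get_text().strip()
second_name = post_owner[1].get_text().strip()
print(first_name)
print(second_name)
```

Any index works the same way, so post_owner[4] or post_owner[2] can be read individually as long as that many matches exist.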

Use soup.select and get()

[i.get('title') for i in soup.select('.username')]
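As a rough sketch of how this behaves on the tag from the question: get('title') returns the title attribute, which there also carries the status text, while the visible nickname is the text inside the link. The .string lookup below is an alternative suggested here, not part of the answer, and it assumes each link wraps exactly one piece of text.

```python
from bs4 import BeautifulSoup

# The tag shown in the question, used as sample input.
html = ('<a class="username offline popupctrl" href="member.php?20938-NarutoO" '
        'title="NarutoO je offline"><strong><font color="#5566CC">NarutoO'
        '</font></strong></a>')

soup = BeautifulSoup(html, 'html.parser')

# The answer's approach: read the title attribute of every .username element.
titles = [i.get('title') for i in soup.select('.username')]

# Alternative without get_text(): .string follows single-child tags
# down to the one text node, here the nickname.
names = [i.string for i in soup.select('.username')]

print(titles)
print(names)
```

If a link contained more than one text node, .string would be None, so this only suits markup shaped like the sample.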

Source

2015-06-02 02:54:04 nickanor
