Getting a list of text with find_all in BS4

I'll start by saying that I'm very new to Python. I've been building a Discord bot with discord.py and Beautiful Soup 4, and here's where I'm at:

@commands.command(hidden=True)
async def roster(self):
    """Gets a list of CD's members"""
    url = "http://www.clandestine.pw/roster.html"
    async with aiohttp.get(url) as response:
        soupObject = BeautifulSoup(await response.text(), "html.parser")
    try:
        text = soupObject.find_all("font", attrs={'size': '4'})
        await self.bot.say(text)
    except:
        await self.bot.say("Not found!")

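Here's the output:

I've tried using get_text() in several different ways to strip the brackets and HTML tags from this output, but it raises an error every time. How can I either do that, or put this data into an array or list and then print only the plain text?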

2017-03-07 Craig

Which versions of Python and Beautiful Soup are you using? I assume it's Python >= 3.5, given the async/await syntax.

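find_all returns a list of Tag objects from BeautifulSoup, and you are sending that list as-is. The brackets you're seeing come from the list itself.

Either return them as a list of strings: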

# On Python 3, compare against a bytes prefix (b"\xe2") and keep the results as str
text = [member.get_text().strip() for member in soupObject.find_all("font", attrs={'size': '4'}) if not member.get_text().encode("utf-8").startswith(b"\xe2")]

Or as a single string:

# Join str values (not bytes) so that ",".join works on Python 3
text = ",".join([member.get_text() for member in soupObject.find_all("font", attrs={'size': '4'}) if not member.get_text().encode("utf-8").startswith(b"\xe2")])
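
To see where the brackets and the get_text() error come from, here is a minimal sketch. The HTML snippet and the names in it are made up for illustration and are not the real roster page. It shows that find_all returns a list-like result of Tag objects, so get_text() has to be called on each element rather than on the result as a whole.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the roster page.
SAMPLE_HTML = """
<font size="4">Alice</font>
<font size="4"> Bob </font>
<font size="2">not a member</font>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
tags = soup.find_all("font", attrs={"size": "4"})

# Each element is a Tag, so printing the result shows raw HTML inside brackets.
print(tags)

# get_text() belongs to individual Tag objects, not to the result of find_all,
# so calling it on the whole result raises AttributeError.
try:
    tags.get_text()
except AttributeError as exc:
    print("AttributeError:", exc)

# Calling it once per element gives plain strings.
names = [tag.get_text().strip() for tag in tags]
print(names)  # ['Alice', 'Bob']
```

If you only ever expect one match, find() returns a single Tag (or None), and get_text() can be called on that directly.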

2017-03-07 15:35:51 Zroq

Replace

text = soupObject.find_all("font", attrs={'size': '4'})

with this:

all_font_tags = soupObject.find_all("font", attrs={'size': '4'}) 
list_of_inner_text = [x.text for x in all_font_tags] 
# If you want to print the text as a comma separated string 
text = ', '.join(list_of_inner_text)
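
One thing worth noting: find_all returns an empty list when nothing matches, and it does not raise, so the except branch in the original command would probably never send "Not found!" for a missing roster. A rough sketch of how the parsing could be pulled out of the command and checked explicitly is below. The helper names member_names and roster_message are hypothetical, and the HTML strings are made-up test input.

```python
from bs4 import BeautifulSoup


def member_names(html):
    """Return the stripped text of every <font size="4"> tag in html."""
    soup = BeautifulSoup(html, "html.parser")
    tags = soup.find_all("font", attrs={"size": "4"})
    # get_text(strip=True) trims whitespace; skip tags that end up empty.
    return [t.get_text(strip=True) for t in tags if t.get_text(strip=True)]


def roster_message(html):
    """Build the text to send: a comma-separated list, or a fallback."""
    names = member_names(html)
    # An empty list means nothing matched, so fall back explicitly.
    return ", ".join(names) if names else "Not found!"


print(roster_message('<font size="4">Alice</font><font size="4">Bob</font>'))
print(roster_message("<p>no roster here</p>"))
```

Inside the command you would then pass the downloaded page text to roster_message and send the returned string with self.bot.say.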

2017-03-07 15:37:15
