2016-06-11 21 views

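BeautifulSoup findAll HTML class with multiple variable class inputs

I have the following code, which scrapes a website for divs with the class "odd" or "even". I'd like to make "odd" and "even" arguments that my function accepts, which would also let me add other divs. Here is my code: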

# 
# Imports 
# 

import urllib2 
from bs4 import BeautifulSoup 
import re 
import os 
from pprint import pprint 

# 
# library 
# 

def get_soup(url): 
    page = urllib2.urlopen(url) 
    contents = page.read() 
    soup = BeautifulSoup(contents, "html.parser") 
    body = soup.findAll("tr", ["even", "odd"]) 
    string_list = str([i for i in body]) 
    return string_list 


def save_to_file(path, soup): 
    with open(path, 'w') as fhandle: 
        fhandle.write(soup)


# 
# script 
# 

def main(): 
    url = r'URL GOES HERE' 
    path = os.path.join('PATH GOES HERE') 
    the_soup = get_soup(url) 
    save_to_file(path, the_soup) 



if __name__ == '__main__': 
    main() 

I'd like to incorporate *args into the code so that the get_soup function looks like this:

def get_soup(url, *args): 
    page = urllib2.urlopen(url) 
    contents = page.read() 
    soup = BeautifulSoup(contents, "html.parser") 
    body = soup.findAll("tr", [args]) 
    string_list = str([i for i in body]) 
    return string_list 

def main(): 
    url = r'URL GOES HERE' 
    path = os.path.join('PATH GOES HERE') 
    the_soup = get_soup(url, "odd", "even") 
    save_to_file(path, the_soup) 

Unfortunately, this doesn't work. Any ideas?


Do you have the URL of a site to test against? –

Answer


Don't wrap args in a list. args is already a tuple, so just pass it directly:

body = soup.findAll("tr", args) 

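If you pass [args], you end up with something like [("odd", "even")]: a list containing a single tuple rather than a list of class names.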
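The difference can be seen without BeautifulSoup at all. The following is a minimal sketch in plain Python; `wrap_demo` is a hypothetical helper introduced only for illustration and is not part of the original code.

```python
def wrap_demo(*args):
    """Return args as received, and args wrapped in a list (the mistake)."""
    # *args collects the extra positional arguments into a tuple.
    as_received = args
    # Wrapping that tuple in brackets nests it instead of listing its items.
    wrapped = [args]
    return as_received, wrapped


as_received, wrapped = wrap_demo("odd", "even")
print(as_received)  # ('odd', 'even')
print(wrapped)      # [('odd', 'even')]
```

The first value is a flat sequence of class names, which is what findAll can match against. The second has only one element, the tuple itself, so no individual class name is ever compared.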

Also, str([i for i in body]) doesn't really make sense. It is the same as just calling str(body), and I don't see how that format would be useful.
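As one possible alternative to str(body), each matched row could be written on its own line. This is only a sketch under assumptions: it runs on Python 3, parses an inline string instead of fetching a URL, and uses find_all, the current bs4 name for findAll. `SAMPLE_HTML` and `rows_as_text` are hypothetical names, not part of the original code.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the scraped page.
SAMPLE_HTML = """
<table>
  <tr class="odd"><td>1</td></tr>
  <tr class="even"><td>2</td></tr>
  <tr class="other"><td>3</td></tr>
</table>
"""


def rows_as_text(html, *classes):
    """Return the <tr> elements matching any of the given classes, one per line."""
    soup = BeautifulSoup(html, "html.parser")
    # Pass the class names as a flat sequence, not as [classes].
    rows = soup.find_all("tr", list(classes))
    # Join the rows with newlines instead of taking str() of the whole list.
    return "\n".join(str(row) for row in rows)


result = rows_as_text(SAMPLE_HTML, "odd", "even")
print(result)
```

Joining the rows this way avoids the surrounding brackets and commas that str() of a list would add, which may make the saved file easier to read or re-parse.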


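This is perfect! As for str([i for i in body]): it is a mix of two functions that I hadn't cleaned up yet. I evidently copied the wrong function, although it does the same thing as my other one. Thanks @Padraic Cunningham! – Lefty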


No worries, you're welcome. –