Check if value exists

I am new to Python and I am writing a web scraper that looks up the <td> cells in the rows of an HTML table:

# open CSV with URLS to scrape 
csv_file = csv.reader(open('urls.csv', 'rb'), delimiter=',') 

names = [] 
for data in csv_file: 
    names.append(data[0]) 

for name in names: 
    html = D.get(name); 
    html2 = html 
    param = '<br />'; 
    html2 = html2.replace("<br />", " | ") 
    print name 

    c = csv.writer(open("darkgrey.csv", "a")) 
    for row in xpath.search(html2, '//table/tr[@class="bgdarkgrey"]'): 
        cols = xpath.search(row, '/td') 
        c.writerow([cols[0], cols[1], cols[2], cols[3], cols[4]])

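What it does is get the '<td>' values from 4 tables.

The problem is that some tables don't have cols[2], cols[3] or cols[4].

Is there a way I can check whether these exist?

Thanks

Source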

2013-02-05 user1970557

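A bit off-topic, but do you really want to append to "darkgrey.csv"? If I were you, I would open that file with "w" in the global scope, to keep it from growing without bound every time you test the script again. Also make sure to close it! – RickyA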
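
A minimal sketch of that suggestion, assuming Python 3 (the question's code is Python 2). The helper name `write_rows` and the sample rows are hypothetical. The file is opened once in "w" mode, and the `with` block closes it automatically:

```python
import csv
import os
import tempfile


def write_rows(path, rows):
    """Write all rows to `path`, replacing any previous contents."""
    # "w" truncates the file, so repeated test runs do not accumulate rows.
    # newline="" is what the csv module documentation recommends in Python 3.
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        for row in rows:
            writer.writerow(row)
    # Leaving the with-block closes the file, even if an exception occurred.


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "darkgrey.csv")
    write_rows(demo_path, [["a", "b", "c", "d", "e"]])
    write_rows(demo_path, [["f", "g", "h", "i", "j"]])  # second run overwrites
    with open(demo_path, newline="") as handle:
        print(list(csv.reader(handle)))
```

With "a" instead of "w", the second call would leave both rows in the file.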

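I'm not totally familiar with xpath, but you should be able to just check the length of cols (as long as it isn't some really weird object that only looks like a sequence in other respects):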

if len(cols) >= 5: 
    ...
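
Spelled out a little further, the length check might look like the sketch below. The function name `first_five_or_none` is made up for illustration, and `cols` is assumed to behave like a list:

```python
def first_five_or_none(cols):
    """Return the first five cells, or None when the row is too short."""
    if len(cols) >= 5:
        return [cols[0], cols[1], cols[2], cols[3], cols[4]]
    # Fewer than five cells: let the caller decide what to do with this row.
    return None


print(first_five_or_none(["a", "b", "c", "d", "e"]))
print(first_five_or_none(["a", "b"]))
```

The caller can then skip rows for which the function returns None, or handle them separately.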

Another common Python idiom is to just try it and see.

try: 
    c.writerow([cols[0], cols[1], cols[2], cols[3], cols[4]]) 
except IndexError: 
    # Failed because `cols` isn't long enough. Do something else.
    pass
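
As a self-contained sketch of the same idiom, with one possible "something else": fall back to whatever cells exist. The name `row_or_fallback` is hypothetical, and the fallback behaviour is an assumption, not something the question specifies:

```python
def row_or_fallback(cols):
    """Try to take five cells; on a short row, keep the cells that exist."""
    try:
        return [cols[0], cols[1], cols[2], cols[3], cols[4]]
    except IndexError:
        # `cols` has fewer than five items, so one of the indexes failed.
        return list(cols)


print(row_or_fallback(["a", "b", "c", "d", "e"]))
print(row_or_fallback(["a", "b"]))
```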

Finally, assuming cols is a list, you can always make sure it's long enough:

cols.extend(['']*5)

This pads your columns with empty strings so that you have at least 5 columns (usually more).
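
Since extending by five always adds five items, the padded list usually ends up longer than five. If exactly five columns are wanted, a slice can trim it back. A small sketch, where `pad_to_five` is a made-up helper name:

```python
def pad_to_five(cols):
    """Return exactly five cells, padding short rows with empty strings."""
    padded = list(cols)        # copy, so the caller's list is left untouched
    padded.extend([''] * 5)    # now at least five items, usually more
    return padded[:5]          # trim back down to exactly five


print(pad_to_five(['a', 'b']))
print(pad_to_five(['a', 'b', 'c', 'd', 'e', 'f', 'g']))
```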

Source

2013-02-05 15:48:09 mgilson

Brilliant! I ran the first part and it seems to work. I haven't used Python before, so it's all a learning curve. – user1970557

c.writerow([col[x] for x in range(0,len(col))])

Also, don't forget to close the "darkgrey.csv" file!

Source

2013-02-05 15:50:13 RickyA

Maybe easier: 'col[:5]' - slicing sequences is forgiving :) – mgilson

Wow, after testing it, it really is forgiving. Learned something new today :) – RickyA
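
A quick way to see the difference between slicing and indexing on a short list (the variable names here are only for illustration):

```python
short = ['a', 'b']

# Slicing past the end does not raise; it returns whatever is there.
first_five = short[:5]
print(first_five)

# Indexing past the end does raise.
try:
    short[4]
    raised = False
except IndexError:
    raised = True
print(raised)
```

So writing `cols[:5]` to the CSV works for rows of any length, though short rows will simply have fewer columns.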

Like this:

c.writerow([cols[0], cols[1], cols[2] if len(cols) > 2 else '', cols[3] if len(cols) > 3 else '', cols[4] if len(cols) > 4 else ''])

Source

2013-02-05 15:57:10 Guddu

Another possibility: maybe you meant cols = xpath.search(row, 'td') rather than cols = xpath.search(row, '/td')? (without the slash)
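
To illustrate the idea of a relative path, here is a sketch using the standard library's `xml.etree.ElementTree` rather than the `xpath` module from the question. Whether that module treats '/td' and 'td' the same way is not confirmed here, and the sample markup is invented:

```python
import xml.etree.ElementTree as ET

SAMPLE = (
    "<table>"
    "<tr class='bgdarkgrey'><td>a</td><td>b</td></tr>"
    "<tr class='other'><td>x</td></tr>"
    "</table>"
)

root = ET.fromstring(SAMPLE)
rows = []
# ".//tr[...]" searches below the root for matching rows.
for row in root.findall(".//tr[@class='bgdarkgrey']"):
    # "td" with no leading slash is relative: direct <td> children of this row.
    rows.append([td.text for td in row.findall("td")])

print(rows)
```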

Source

2013-02-05 16:08:13 eviltnan
