2016-09-02

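I want to print a particular line only once inside a loop, but putting the print statement inside the loop gives me the same result four times. How do I stop the for loop from printing the line repeatedly? I want that line printed just once. I am using Python 3.4.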

Below are the full HTML, the Python code that uses it, and the script's output.

<ul class="breadcrumbs" id="BREADCRUMBS">
    <li class="breadcrumb_item " itemscope="" itemtype="http://data-vocabulary.org/Breadcrumb">
        <a class="breadcrumb_link" href="/Tourism-g191-United_States-Vacations.html" itemprop="url" onclick="ta.setEvtCookie('Breadcrumbs', 'click', 'Country', 1, this.href); ">
            <span itemprop="title">United States</span>
        </a>
        <span class="separator">›</span>
    </li>
    .
    .
    .
    .

The Python script that prints the result:

ulpart = soup.find_all("ul", {"class": "breadcrumbs"})
for unorder in ulpart:
    div2 = soup.find_all("li", {"class": "breadcrumb_item "})
    for listitem in div2[0:]:
        country = soup.select_one("li.breadcrumb_item a[onclick*=Country]").get_text(strip=True)
        print(country)

This code prints the same result four times:

United States
United States
United States
United States

But I want "United States" printed only once, like this:

United States
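
A likely reason for the repetition, sketched without BeautifulSoup: the body of the inner loop never uses the loop variable `listitem`. It calls `soup.select_one(...)` on the whole document each time, and `select_one` returns the first match, so every iteration produces the same text. In the sketch below, `breadcrumbs`, its placeholder entries, and `first_country` are hypothetical stand-ins for the parsed page and the `select_one` call, not part of the original script.

```python
# Hypothetical stand-in for the four parsed <li class="breadcrumb_item"> elements.
# Only the first entry comes from the HTML above; the rest are placeholders.
breadcrumbs = [
    {"level": "Country", "title": "United States"},
    {"level": "Other", "title": "placeholder 2"},
    {"level": "Other", "title": "placeholder 3"},
    {"level": "Other", "title": "placeholder 4"},
]


def first_country(items):
    """Stand-in for soup.select_one(...): return the first match in the whole document."""
    for item in items:
        if item["level"] == "Country":
            return item["title"]
    return None


# What the question's loop effectively does: one lookup per list item,
# but the lookup ignores `item`, so the result is identical every time.
repeated = []
for item in breadcrumbs:
    repeated.append(first_country(breadcrumbs))

# Doing the lookup once, outside any loop, yields a single value to print.
country = first_country(breadcrumbs)
print(country)
```

If only the country is needed, moving the lookup out of the loops in this way avoids the duplicates without any extra bookkeeping.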

Answers


Because you are working with an unordered list, you can use Python's set data type (use a list instead if you need to preserve order):

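printed = set()

ulpart = soup.find_all("ul", {"class": "breadcrumbs"})
for unorder in ulpart:
    div2 = soup.find_all("li", {"class": "breadcrumb_item "})
    for listitem in div2[0:]:
        country = soup.select_one("li.breadcrumb_item a[onclick*=Country]").get_text(strip=True)
        printed.add(country)  # a set stores each distinct value only once

# Print each distinct country once, after the loops have finished.
for country in printed:
    print(country)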
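
To see the set approach in isolation, here is a minimal sketch that swaps the BeautifulSoup lookups for a hard-coded list. The `scraped` list is an assumption standing in for the value produced on each of the four loop iterations.

```python
# Hypothetical stand-in for the text returned on each of the four iterations.
scraped = ["United States"] * 4

printed = set()
for country in scraped:
    printed.add(country)  # adding a value that is already present has no effect

# Sets have no defined order, so sort before printing for stable output.
for country in sorted(printed):
    print(country)
```

The set keeps one copy of each value, so the four identical strings collapse to a single entry.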

That worked like magic. With your help I got what I wanted :) – Hassan

printed_countries = []  # countries already printed, in first-seen order

ulpart = soup.find_all("ul", {"class": "breadcrumbs"})
for unorder in ulpart:
    div2 = soup.find_all("li", {"class": "breadcrumb_item "})
    for listitem in div2[0:]:
        country = soup.select_one("li.breadcrumb_item a[onclick*=Country]").get_text(strip=True)
        # Print only the first time a given country is seen.
        if country not in printed_countries:
            print(country)
            printed_countries.append(country)
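
The list-based check above can be sketched as a small reusable helper. The function name `unique_in_order` and the sample values are hypothetical; the helper pairs a list (to keep first-seen order) with a set (so the membership test stays fast on long inputs).

```python
def unique_in_order(values):
    """Return the distinct values from `values`, keeping first-seen order."""
    seen = set()   # fast membership checks
    result = []    # preserves the order in which values first appear
    for value in values:
        if value not in seen:
            seen.add(value)
            result.append(value)
    return result


# Hypothetical sample input with duplicates.
for name in unique_in_order(["United States", "United States", "Canada", "United States"]):
    print(name)
```

For a handful of breadcrumb items the plain list check is fine. The extra set only matters when the input grows large, because `in` on a list scans every element.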