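Limiting the text area searched by Python
I want to search a web scrape and count how many times a string appears in it. However, I only want to search between two points, x and y, within the scrape.
Can anyone tell me the simplest way to count the occurrences of SEA BASS between MAIN FISHERMAN and SECONDARY FISHERMAN in the example web scrape below?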
<p style="color: #555555;
font-family: Arial,Helvetica,sans-serif;
font-size: 12px;
line-height: 18px;">June 21, 2013 By FISH PPL Admin </small>
</div>
<!-- Post Body Copy -->
<div class="post-bodycopy clearfix"><p>MAIN FISHERMAN – </p>
<p><strong>CHAMP</strong> – Pedro 00777<br />
BAIT – LOCATION1 – 2:30 – SEA BASS (3 LBS 11/4)<br />
MULTI – LOCATION2 – 7:30 – COD (3 LBS 13/8)<br />
LURE – LOCATION5 – 3:20 – RUDD (2 LBS 6/1)</p>
<p>JOE BLOGGS <a href="url">url</a><br />
BAIT – LOCATION4 – 4:45 – ROACH (5 LBS 3/1)<br />
MULTI – LOCATION2 – 5:50 – PERCH (3 LBS 6/1)<br />
LURE – LOCATION1 – 3:45 – PIKE (2 LBS 5/1) </p>
BAIT – LOCATION1 – 2:30 – SEA BASS (3 LBS 11/4)<br />
MULTI – LOCATION1 – 3:45 – JUST THE JUDGE (3 LBS 3/1)<br />
LURE – LOCATION3 – 8:25 – SCHOOL FEES (2 LBS 7/1)</p>
<div class="post-bodycopy clearfix"><p>SECONDARY FISHERMAN – </p>
<p><strong>SPOON – <a href="url">url</a></strong><br />
BAIT – LOCATION1 – 2:30 – SEA BASS (3 LBS 11/4)<br />
MULTI – LOCATION2 – 7:30 – COD (3 LBS 7/4)<br />
LURE – LOCATION1 – 4:25 – TROUT (2 LBS 5/1)</p>
I tried to do this with the code below, but to no avail.
html = website.read()
pattern_to_exclude_unwanted_data = re.compile('MAIN FISHERMAN(.*)SECONDARY FISHERMAN')
excluding_unwanted_data = re.findall(pattern_to_exclude_unwanted_data, html)
print excluding_unwanted_data("SEA BASS")
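For comparison, here is a minimal sketch of one way this could work, written for Python 3 rather than the Python 2 used in the attempt above. The helper name count_between and the shortened sample string are hypothetical and only for illustration. It assumes the two markers each appear in the page and that the text is already a str (in Python 3, bytes returned by read() would need decoding first). The two changes from the attempt are re.DOTALL, so that "." can match across line breaks, and calling count() on the captured text instead of calling the result list.

```python
import re

def count_between(text, start, end, needle):
    """Count occurrences of needle in the text between the start and end markers."""
    # re.DOTALL lets '.' match newlines, so the captured span can cross lines.
    # The non-greedy '.*?' stops at the first end marker found after start.
    pattern = re.compile(re.escape(start) + r"(.*?)" + re.escape(end), re.DOTALL)
    match = pattern.search(text)
    if match is None:
        # One or both markers are missing, so there is nothing to count.
        return 0
    # group(1) is the text captured between the two markers.
    return match.group(1).count(needle)

# Shortened, hypothetical stand-in for the scraped page shown in the question.
sample = """<p>MAIN FISHERMAN - </p>
BAIT - LOCATION1 - 2:30 - SEA BASS (3 LBS 11/4)<br />
MULTI - LOCATION2 - 7:30 - COD (3 LBS 13/8)<br />
BAIT - LOCATION1 - 2:30 - SEA BASS (3 LBS 11/4)<br />
<p>SECONDARY FISHERMAN - </p>
BAIT - LOCATION1 - 2:30 - SEA BASS (3 LBS 11/4)<br />"""

print(count_between(sample, "MAIN FISHERMAN", "SECONDARY FISHERMAN", "SEA BASS"))
```

The SEA BASS line after SECONDARY FISHERMAN is not counted, because it falls outside the captured span.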
Oops, yes, I didn't think of DOTALL, and the "group" thing was just sloppiness on my part. Thanks! – alexis