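How do I retrieve information from Yelp using jSoup?
I'm new to programming and have been teaching myself Java as I go. What I'm currently trying to do is pull the names of all the businesses returned by a particular Yelp search and store the results in an array. Here is what I have so far: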
import java.util.ArrayList;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import java.io.IOException;
public class YelpScraper
{
    public static void main(String[] args) throws IOException
    {
        String url = "https://www.yelp.com/search?find_desc=&find_loc=new+jersey&ns=1";
        Document document = Jsoup.connect(url).get();
        Elements elements = document.getElementsByClass("biz-name js-analytics-click");
        for (Element element : elements)
        {
            System.out.println(elements.toString());
        }
    }
}
Now here is my problem. This is the output:
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/darios-restaurant-newark" data-hovercard-id="resfu-JNLUKR3l82D5W7-A"><span>Dario’s Restaurant</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/sushi-house-21-newark-2" data-hovercard-id="vMpJRWxm71XSBnWL9XfYpQ"><span>Sushi House 21</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/burger-walla-newark" data-hovercard-id="JmPZ-AyewjQPIJkKbkU0dA"><span>Burger Walla</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/hobbys-delicatessen-and-restaurant-newark" data-hovercard-id="-dEkFa3N6SXLahAMBAM8EA"><span>Hobby’s Delicatessen & Restaurant</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/krugs-tavern-newark" data-hovercard-id="YhiUGWjAB1y7reqoKLWCow"><span>Krug’s Tavern</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/mcwhorter-barbecue-newark" data-hovercard-id="6xf4H2rOCtUIhyMgazRsnA"><span>McWhorter Barbecue</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/spanish-tavern-newark" data-hovercard-id="muXH1f3nwoSgWB3KN-rAfA"><span>Spanish Tavern</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/casa-d-paco-newark" data-hovercard-id="iIJ-dWgYcZTewVGJyP6EfQ"><span>Casa d’Paco</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/hero-king-handcrafted-sandwiches-newark" data-hovercard-id="hzwE2ub1J7fTwJDjTJwksA"><span>Hero King Handcrafted Sandwiches</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/the-green-chicpea-newark-2" data-hovercard-id="bDWWtSm-8uoW9_urjMCzTA"><span>The Green Chicpea</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/darios-restaurant-newark" data-hovercard-id="resfu-JNLUKR3l82D5W7-A"><span>Dario’s Restaurant</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/sushi-house-21-newark-2" data-hovercard-id="vMpJRWxm71XSBnWL9XfYpQ"><span>Sushi House 21</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/burger-walla-newark" data-hovercard-id="JmPZ-AyewjQPIJkKbkU0dA"><span>Burger Walla</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/hobbys-delicatessen-and-restaurant-newark" data-hovercard-id="-dEkFa3N6SXLahAMBAM8EA"><span>Hobby’s Delicatessen & Restaurant</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/krugs-tavern-newark" data-hovercard-id="YhiUGWjAB1y7reqoKLWCow"><span>Krug’s Tavern</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/mcwhorter-barbecue-newark" data-hovercard-id="6xf4H2rOCtUIhyMgazRsnA"><span>McWhorter Barbecue</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/spanish-tavern-newark" data-hovercard-id="muXH1f3nwoSgWB3KN-rAfA"><span>Spanish Tavern</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/casa-d-paco-newark" data-hovercard-id="iIJ-dWgYcZTewVGJyP6EfQ"><span>Casa d’Paco</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/hero-king-handcrafted-sandwiches-newark" data-hovercard-id="hzwE2ub1J7fTwJDjTJwksA"><span>Hero King Handcrafted Sandwiches</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/the-green-chicpea-newark-2" data-hovercard-id="bDWWtSm-8uoW9_urjMCzTA"><span>The Green Chicpea</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/darios-restaurant-newark" data-hovercard-id="resfu-JNLUKR3l82D5W7-A"><span>Dario’s Restaurant</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/sushi-house-21-newark-2" data-hovercard-id="vMpJRWxm71XSBnWL9XfYpQ"><span>Sushi House 21</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/burger-walla-newark" data-hovercard-id="JmPZ-AyewjQPIJkKbkU0dA"><span>Burger Walla</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/hobbys-delicatessen-and-restaurant-newark" data-hovercard-id="-dEkFa3N6SXLahAMBAM8EA"><span>Hobby’s Delicatessen & Restaurant</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/krugs-tavern-newark" data-hovercard-id="YhiUGWjAB1y7reqoKLWCow"><span>Krug’s Tavern</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/mcwhorter-barbecue-newark" data-hovercard-id="6xf4H2rOCtUIhyMgazRsnA"><span>McWhorter Barbecue</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/spanish-tavern-newark" data-hovercard-id="muXH1f3nwoSgWB3KN-rAfA"><span>Spanish Tavern</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/casa-d-paco-newark" data-hovercard-id="iIJ-dWgYcZTewVGJyP6EfQ"><span>Casa d’Paco</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/hero-king-handcrafted-sandwiches-newark" data-hovercard-id="hzwE2ub1J7fTwJDjTJwksA"><span>Hero King Handcrafted Sandwiches</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/the-green-chicpea-newark-2" data-hovercard-id="bDWWtSm-8uoW9_urjMCzTA"><span>The Green Chicpea</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/darios-restaurant-newark" data-hovercard-id="resfu-JNLUKR3l82D5W7-A"><span>Dario’s Restaurant</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/sushi-house-21-newark-2" data-hovercard-id="vMpJRWxm71XSBnWL9XfYpQ"><span>Sushi House 21</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/burger-walla-newark" data-hovercard-id="JmPZ-AyewjQPIJkKbkU0dA"><span>Burger Walla</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/hobbys-delicatessen-and-restaurant-newark" data-hovercard-id="-dEkFa3N6SXLahAMBAM8EA"><span>Hobby’s Delicatessen & Restaurant</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/krugs-tavern-newark" data-hovercard-id="YhiUGWjAB1y7reqoKLWCow"><span>Krug’s Tavern</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/mcwhorter-barbecue-newark" data-hovercard-id="6xf4H2rOCtUIhyMgazRsnA"><span>McWhorter Barbecue</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/spanish-tavern-newark" data-hovercard-id="muXH1f3nwoSgWB3KN-rAfA"><span>Spanish Tavern</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/casa-d-paco-newark" data-hovercard-id="iIJ-dWgYcZTewVGJyP6EfQ"><span>Casa d’Paco</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/hero-king-handcrafted-sandwiches-newark" data-hovercard-id="hzwE2ub1J7fTwJDjTJwksA"><span>Hero King Handcrafted Sandwiches</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/the-green-chicpea-newark-2" data-hovercard-id="bDWWtSm-8uoW9_urjMCzTA"><span>The Green Chicpea</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/darios-restaurant-newark" data-hovercard-id="resfu-JNLUKR3l82D5W7-A"><span>Dario’s Restaurant</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/sushi-house-21-newark-2" data-hovercard-id="vMpJRWxm71XSBnWL9XfYpQ"><span>Sushi House 21</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/burger-walla-newark" data-hovercard-id="JmPZ-AyewjQPIJkKbkU0dA"><span>Burger Walla</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/hobbys-delicatessen-and-restaurant-newark" data-hovercard-id="-dEkFa3N6SXLahAMBAM8EA"><span>Hobby’s Delicatessen & Restaurant</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/krugs-tavern-newark" data-hovercard-id="YhiUGWjAB1y7reqoKLWCow"><span>Krug’s Tavern</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/mcwhorter-barbecue-newark" data-hovercard-id="6xf4H2rOCtUIhyMgazRsnA"><span>McWhorter Barbecue</span></a>
<a class="biz-name js-analytics-click" data-analytics-label="biz-name" href="/biz/spanish-tavern-newark" data-hovercard-id="muXH1f3nwoSgWB3KN-rAfA"><span>Spanish Tavern</span></a>
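As you can see, the output is raw HTML, when all I want is the names of the businesses. Any ideas on how I could do this differently? Apparently getElementsByClass() is not the method I should be using. Thanks in advance, guys!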
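For what it's worth, here is a minimal sketch of one way to get just the names, assuming the markup looks like the output above. Two things stand out in the posted loop: it prints `elements` (the whole collection) instead of `element` on every pass, which would explain the repeated blocks, and `toString()` returns the markup rather than the text. The class `BizNames`, the helper `extractNames`, and the inline HTML sample are made up for illustration; `select()` and `text()` are regular jsoup calls.

```java
import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class BizNames
{
    // Hypothetical helper: collects the visible text of every <a> tag
    // that carries the class "biz-name".
    static List<String> extractNames(String html)
    {
        Document document = Jsoup.parse(html);
        List<String> names = new ArrayList<>();
        for (Element element : document.select("a.biz-name"))
        {
            // text() returns the text inside the tag, without the markup
            names.add(element.text());
        }
        return names;
    }

    public static void main(String[] args)
    {
        // Inline sample copied from the output above, so no network call is needed
        String html = "<a class=\"biz-name js-analytics-click\" href=\"/biz/sushi-house-21-newark-2\"><span>Sushi House 21</span></a>"
            + "<a class=\"biz-name js-analytics-click\" href=\"/biz/burger-walla-newark\"><span>Burger Walla</span></a>";
        System.out.println(extractNames(html)); // prints [Sushi House 21, Burger Walla]
    }
}
```

Against the live page, the same loop should work on the `Document` returned by `Jsoup.connect(url).get()`, provided Yelp still serves these class names.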
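Hey, wow, thanks! Works like a charm! I actually just picked up jSoup yesterday. If you don't mind, could you explain the syntax of the select() method? From what I've seen, you always start with a . and then the class name, followed by the tag?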
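On the select() syntax: as far as I know it follows CSS selector rules, so the order is the other way round. The tag name (if any) comes first, then a `.` and the class name; a bare `.name` matches any tag with that class, and `>` picks direct children. A small sketch to illustrate, where `SelectorDemo`, `count`, and the sample HTML are made up for the example:

```java
import org.jsoup.Jsoup;

public class SelectorDemo
{
    // Hypothetical helper: counts the elements matched by a CSS query.
    static int count(String html, String cssQuery)
    {
        return Jsoup.parse(html).select(cssQuery).size();
    }

    public static void main(String[] args)
    {
        String html = "<div class=\"biz-name\">heading</div>"
            + "<a class=\"biz-name js-analytics-click\"><span>Sushi House 21</span></a>";

        System.out.println(count(html, ".biz-name"));                     // 2: any tag with class biz-name
        System.out.println(count(html, "a.biz-name"));                    // 1: only <a> tags with that class
        System.out.println(count(html, "a.biz-name.js-analytics-click")); // 1: <a> tags with both classes
        System.out.println(count(html, "a.biz-name > span"));             // 1: <span> directly inside such an <a>
    }
}
```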