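I am using the JSOUP package to search for specific titles, such as a Facebook title. Here is my code, which prints each TITLE. From the TITLEs I want to pick out the Facebook one. How do I split a word out of a string using a Java regular expression?
Program: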
package googlesearch;

import java.io.IOException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class SearchRegexDiv {
    private static String REGEX = ".?[facebook]";

    public static void main(String[] args) throws IOException {
        Pattern p = Pattern.compile(REGEX);
        String google = "http://www.google.com/search?q=";
        //String search = "stackoverflow";
        String search = "hortonworks";
        String charset = "UTF-8";
        String userAgent = "ExampleBot 1.0 (+http://example.com/bot)"; // Change this to your company's name and bot homepage!
        Elements links = Jsoup.connect(google + URLEncoder.encode(search, charset)).userAgent(userAgent).get().select(".g>.r>a");
        for (Element link : links) {
            String title = link.text();
            String url = link.absUrl("href"); // Google returns URLs in format "http://www.google.com/url?q=<url>&sa=U&ei=<someKey>".
            url = URLDecoder.decode(url.substring(url.indexOf('=') + 1, url.indexOf('&')), "UTF-8");
            if (!url.startsWith("http")) {
                continue; // Ads/news/etc.
            }
            //.?facebook
            if (title.matches(REGEX)) {
                System.out.println("Done");
                title.substring(title.lastIndexOf(" ") + 1); //split the String
                //(example.substring(example.lastIndexOf(" ") + 1));
            }
            System.out.println("Title: " + title);
            System.out.println("URL: " + url);
        }
    }
}
OUTPUT:
Title: Hortonworks - Facebook logo
URL: https://www.facebook.com/hortonworks/
From the output I get a list of URLs and TITLEs in the format above.
I want to match the titles that contain the word Facebook, and split each one into two strings like:
String social_media = "facebook";
String org = "hortonworks";
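For what it's worth, two things in the program above probably keep the condition from ever being true: `[facebook]` is a character class that matches one single character out of f, a, c, e, b, o, k rather than the word, and `String.matches` only succeeds when the whole title matches the pattern. Below is a minimal sketch of one way to do the split, assuming every title of interest looks like "Hortonworks - Facebook ..." as in the output shown. The class name `TitleSplit`, the method `splitTitle` and the " - " separator are assumptions made for illustration, not part of the original code.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TitleSplit {
    // Group 1: everything before the " - " separator (assumed to be the organisation).
    // Group 2: the literal word "facebook", matched case-insensitively.
    private static final Pattern FACEBOOK_TITLE =
            Pattern.compile("^(.+?)\\s*-\\s*(facebook)\\b", Pattern.CASE_INSENSITIVE);

    /** Returns {org, socialMedia} in lower case, or null when the title does not fit the assumed format. */
    public static String[] splitTitle(String title) {
        Matcher m = FACEBOOK_TITLE.matcher(title);
        // find() looks for the pattern inside the title; matches() would require the whole title to fit.
        if (!m.find()) {
            return null;
        }
        return new String[] {
            m.group(1).trim().toLowerCase(Locale.ROOT),
            m.group(2).toLowerCase(Locale.ROOT)
        };
    }

    public static void main(String[] args) {
        String[] parts = splitTitle("Hortonworks - Facebook logo");
        if (parts != null) {
            String org = parts[0];
            String socialMedia = parts[1];
            System.out.println("org = " + org);
            System.out.println("social_media = " + socialMedia);
        }
    }
}
```

Inside the loop of the original program, something like `String[] parts = TitleSplit.splitTitle(title);` could replace the `title.matches(REGEX)` check. If the real titles use a different separator, the pattern would need adjusting.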
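Java is not JavaScript; removed the tag. – mplungjan
Maybe I'm missing something, but what does this have to do with Perl? Removed the perl tag. –
Maybe a Perl regex guru would be useful :) – mplungjan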