2013-10-17 35 views

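Parsing text from multiple paragraph tags in Android

I have multiple paragraph tags like the ones below, all at the same level, with the class attribute "js-tweet-text tweet-text". I need to extract their text in Android, for example: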

Caged parrot sings for its master. Industrialists & IAS officers named in the charge sheet. 
Sometext................ 

HTML text:

<p class="js-tweet-text tweet-text">Caged parrot sings for its master. Industrialists &amp; IAS officers named in the charge sheet. <a href="/PMOIndia" class="twitter-atreply pretty-link" dir="ltr" ><s>@</s><b>PMOIndia</b></a> &amp; MOS Coal left scot free.</p> 


<p class="js-tweet-text tweet-text">Sometext................ <a href="/PMOInd" class="twitter-atreply pretty-link" dir="ltr" ><s>@</s><b>PMOIndia</b></a> &amp; MOS Coal left sc free.</p> 

and so on.

Can anyone help?

Answers

1

I used the Jsoup parser for this requirement on Android:

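import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// Fetch and parse the page. get() throws IOException; on Android, run this off the main thread.
Document doc = Jsoup.connect("https://twitter.com/someperson/") 
          .userAgent("Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.110 Safari/537.36") 
          .get(); 

// Select every <p> whose class attribute is exactly "js-tweet-text tweet-text".
Elements elements = doc.select("p[class=js-tweet-text tweet-text]"); 

for (Element tmp : elements) { 
       // text() returns the text of the paragraph and its child tags, with entities decoded.
       String value = tmp.text(); 
     } 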

The code above returns the text of every paragraph whose class attribute matches "js-tweet-text tweet-text".

1

Maybe this can be done with a regular expression, but since I don't know what will appear inside the tags, this will do:

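String input = "<p class=\"js-tweet-text tweet-text\">Caged parrot sings for its master. Industrialists &amp; IAS officers named in the charge sheet. <a href=\"/PMOIndia\" class=\"twitter-atreply pretty-link\" dir=\"ltr\" ><s>@</s><b>PMOIndia</b></a> &amp; MOS Coal left scot free.</p>"; 
// inTag is true while the scan is between '<' and '>', i.e. inside a tag.
boolean inTag = false; 
StringBuilder result = new StringBuilder(); 
for (int i = 0; i < input.length(); i++) { 
    char c = input.charAt(i); 
    if (c == '<') { 
        inTag = true;       // a tag starts: stop copying 
    } else if (c == '>') { 
        inTag = false;      // the tag ends: resume copying after this character 
    } else if (!inTag) { 
        result.append(c);   // plain text outside any tag 
    } 
} 
// HTML entities such as &amp; are copied as-is, not decoded.
System.out.println(result); 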

Output

Caged parrot sings for its master. Industrialists &amp; IAS officers named in the charge sheet. @PMOIndia &amp; MOS Coal left scot free.
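
The output above still contains "&amp;" where the question expects "&", and the loop handles only one paragraph at a time. As a rough sketch of how both could be handled with the standard library alone, the class below uses a regular expression to find each matching paragraph, strips the inner tags, and decodes a few common entities. The class name TweetTextExtractor and its methods are hypothetical, and the pattern assumes the class attribute is written exactly as in the question and that the paragraphs are not nested; for arbitrary HTML a real parser such as Jsoup is likely the safer choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
public class TweetTextExtractor {

    // Matches each <p class="js-tweet-text tweet-text">...</p>; group 1 is the inner HTML.
    // Assumption: the class attribute is written exactly like this and comes first.
    private static final Pattern PARAGRAPH = Pattern.compile(
            "<p\\s+class=\"js-tweet-text tweet-text\"[^>]*>(.*?)</p>",
            Pattern.DOTALL | Pattern.CASE_INSENSITIVE);

    // Matches any remaining tag such as <a ...>, <s> or </b>.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    // Returns the plain text of every matching paragraph, in document order.
    public static List<String> extract(String html) {
        List<String> texts = new ArrayList<>();
        Matcher m = PARAGRAPH.matcher(html);
        while (m.find()) {
            String text = TAG.matcher(m.group(1)).replaceAll("");
            texts.add(decodeEntities(text).trim());
        }
        return texts;
    }

    // Decodes only a handful of common entities. "&amp;" is handled last so that
    // "&amp;lt;" becomes "&lt;" rather than "<".
    static String decodeEntities(String s) {
        return s.replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&#39;", "'")
                .replace("&amp;", "&");
    }

    public static void main(String[] args) {
        String html =
                "<p class=\"js-tweet-text tweet-text\">Caged parrot sings for its master. Industrialists &amp; IAS officers named in the charge sheet. <a href=\"/PMOIndia\" class=\"twitter-atreply pretty-link\" dir=\"ltr\" ><s>@</s><b>PMOIndia</b></a> &amp; MOS Coal left scot free.</p>\n"
              + "<p class=\"js-tweet-text tweet-text\">Sometext................ <a href=\"/PMOInd\" class=\"twitter-atreply pretty-link\" dir=\"ltr\" ><s>@</s><b>PMOIndia</b></a> &amp; MOS Coal left sc free.</p>";
        for (String text : extract(html)) {
            System.out.println(text);
        }
    }
}
```

With the two paragraphs from the question as input, this prints one line per paragraph, with "&amp;" decoded to "&" and "@PMOIndia" kept as plain text.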