Parsing a string - HTTP string

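I want to do something like the following, so that I am left with only the website part of the string. I am having trouble with the quotes inside the string.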

    // This is what I read into a string:
    // <td width="118"><a href="research.html" class="navText style10 style12">

    // I want to parse this so that I am left with only research.html

    // I sometimes also get a string that contains:
    // <a href="http://www.ucalgary.ca" class="style18"><font size="3">University of Calgary</font></a></div>

    // From this string I want to keep http://www.ucalgary.ca

What I have so far does not work for every case. I would appreciate your help! My code is:

    public class Parse
    {
        public static void main(String[] args)
        {
            String h = "<a href=\"http://www.departmentofmedicine.com/policy.htm\">";
            int n = getIndexOf(h, '"', 0);

            String[] a = h.substring(n).split(">");
            String url = a[0].replaceAll("\"", "");
            //String value = a[1].replaceAll("</a", "");

            System.out.println(url + " ");
        }

        public static int getIndexOf(String str, char c, int n)
        {
            int pos = str.indexOf(c, 0);
            while (n-- > 0 && pos != -1)
            {
                pos = str.indexOf(c, pos + 1);
            }
            return pos;
        }
    }

Source

2014-10-08 chillax786

Take a look at the Java String methods. They already have strip and the like. – jgr208 2014-10-08 14:51:28

It is not clear: from your input, what do you want to keep/extract? – ToYonos 2014-10-08 14:53:44

only departmentofmedicine.com/policy.htm /// This input works but the other inputs i mentioned above dont seem to work!! For example if i use this as input///// University of Calgary – chillax786 2014-10-08 14:58:52

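I would give Pattern and Matcher a try, like this:

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String s = "<a href=\"http://www.departmentofmedicine.com/policy.htm\">";

    // Capture everything between href=" and the next double quote.
    Pattern p = Pattern.compile(".*href=\"([^\"]*).*");
    Matcher m = p.matcher(s);
    if (m.matches()) {
        System.out.println(m.group(1));
    }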
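
The same idea can be checked against all three inputs from the question. The following is only a sketch: the class name `HrefExtractor` and the method `extractHref` are made up for illustration, and it uses `Matcher.find()` rather than `matches()`, so the pattern does not need the leading and trailing `.*`. It assumes the attribute is always written as `href="..."` with double quotes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
public class HrefExtractor {
    // Captures whatever sits between href=" and the next double quote.
    private static final Pattern HREF = Pattern.compile("href=\"([^\"]*)\"");

    // Returns the first href value in the input, or null if there is none.
    public static String extractHref(String html) {
        Matcher m = HREF.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String[] inputs = {
            "<td width=\"118\"><a href=\"research.html\" class=\"navText style10 style12\">",
            "<a href=\"http://www.ucalgary.ca\" class=\"style18\"><font size=\"3\">University of Calgary</font></a></div>",
            "<a href=\"http://www.departmentofmedicine.com/policy.htm\">"
        };
        for (String input : inputs) {
            System.out.println(extractHref(input));
        }
    }
}
```

Because `find()` searches for the pattern anywhere in the input, the earlier `width="118"` attribute in the first string does not get in the way.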

Source

2014-10-08 15:09:44

A small snippet:

    String h = "<a href=\"http://www.departmentofmedicine.com/policy.htm\">";
    String url = h.substring(h.indexOf("http")).replace("\">", "");
    System.out.println(url);

The output will be: http://www.departmentofmedicine.com/policy.htm

Tested on my machine.

Also, please post the possible cases, so that I can suggest a better solution.

Solution for all three possibilities:

    //String h1 = "<a href=\"http://www.departmentofmedicine.com/policy.htm\">";
    //String h1 = "<a href=\"ucalgary.ca\" class=\"style18\"><font size=\"3\">University of Calgary</font></a>";
    String h1 = "<td width=\"118\"><a href=\"research.html\" class=\"navText style10 style12\">";

    // Skip past href=" and cut at the next double quote.
    int start = h1.indexOf("href=\"") + "href=\"".length();
    String url = h1.substring(start, h1.indexOf("\"", start));

    System.out.println(url);

Uncomment the String h1 lines one at a time and check the result against your requirements.

The code above gives the following output:
research.html
http://www.departmentofmedicine.com/policy.htm
ucalgary.ca
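
The substring approach assumes that every input contains `href="` followed by a closing quote. As far as I can tell, an input without it (such as the plain text "University of Calgary" mentioned in the comments) would end in a `StringIndexOutOfBoundsException`. A rough sketch of the same technique with guards added is below; the class name `HrefSubstring` and the method `extractHref` are hypothetical, and returning `null` for a missing attribute is just one possible choice.

```java
// Hypothetical helper class, not part of the original answer.
public class HrefSubstring {
    private static final String MARKER = "href=\"";

    // Returns the text between href=" and the following double quote,
    // or null when the marker or the closing quote is missing.
    public static String extractHref(String html) {
        int markerPos = html.indexOf(MARKER);
        if (markerPos == -1) {
            return null;
        }
        int start = markerPos + MARKER.length();
        int end = html.indexOf('"', start);
        if (end == -1) {
            return null;
        }
        return html.substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println(extractHref("<td width=\"118\"><a href=\"research.html\" class=\"navText style10 style12\">"));
        System.out.println(extractHref("<a href=\"http://www.departmentofmedicine.com/policy.htm\">"));
        System.out.println(extractHref("University of Calgary"));
    }
}
```

Computing the start index once and passing it to `indexOf` as the search offset avoids building the intermediate substring twice.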

Source

2014-10-08 15:11:01

The output will be: – 2014-10-08 15:12:12

This is another case: – chillax786 2014-10-08 15:20:02

this is also another case: University of Calgary – chillax786 2014-10-08 15:20:42
