Trying to extract substrings from a BufferedReader by reading certain tags

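I'm using a BufferedReader to read in 5 web pages, each separated by a blank line, and I want to use substring to extract each page's URL, HTML, source and date. I need some guidance on how to use substring correctly to do this. Cheers.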

public static List<WebPage> readRawTextFile(Context ctx, int resId) {

    InputStream inputStream = ctx.getResources().openRawResource(R.raw.pages);

    InputStreamReader inputreader = new InputStreamReader(inputStream);
    BufferedReader buffreader = new BufferedReader(inputreader);
    String line;
    StringBuilder text = new StringBuilder();

    try {
        while ((line = buffreader.readLine()) != null) {

            if (line.length() == 0) {
                // Ignore for now.
                // Will be used when a blank line is encountered.
            }

            if (line.length() != 0) {
                // Here I want substring to pull out the correct strings.
                int sURL = line.indexOf("<!--");
                int eURL = line.indexOf("-->");
                line.substring(sURL, eURL);
                // Problem is here
            }
        }
    } catch (IOException e) {
        return null;
    }
    return null;
}

Source

2013-01-04 rtkgpe

This is the kind of text I want to extract. I want to strip the tags from <!--Address:http://www.google.co.uk.html--> so that I'm left with this, which I can then store: http://www.google.co.uk.html – rtkgpe

Why do you want to do this with substring operations? Just use String.replace(). – Smit

Don't return null in the catch block; call printStackTrace() instead. It will help you find out whether something went wrong.
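The advice about the catch block can be sketched roughly as follows. This is only an illustration: the class name `SafeReadDemo`, the method `readLines`, and the choice to return an empty list after logging are assumptions made for the example, not part of the original code, and a `StringReader` stands in for the Android raw resource.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not from the original post.
class SafeReadDemo {

    // Reads every line from the source. If reading fails, the exception is
    // printed (instead of being swallowed) and an empty list is returned.
    static List<String> readLines(Reader source) {
        List<String> lines = new ArrayList<>();
        // try-with-resources closes the reader even when an exception occurs
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            e.printStackTrace(); // makes the failure visible in the log
            return Collections.emptyList();
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(readLines(new StringReader("first\nsecond")));
    }
}
```

Returning an empty list rather than null is a design choice for this sketch; it spares callers a null check, but whether it suits the real method depends on how the result is used.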

String str1 = "<!--Address:google.co.uk.html-->";
// Approach 1: take the text between the delimiters with substring()
int st = str1.indexOf("<!--"); // index of the '<' that starts "<!--"
int en = str1.indexOf("-->");  // index of the first '-' of "-->"
str1 = str1.substring(st + 4, en); // + 4 skips the four characters of "<!--"
System.out.println(str1);

// Approach 2: delete the delimiter characters with a regex character class
String str2 = "<!--Address:google.co.uk.html-->";
str2 = str2.replaceAll("[<>!-]", "");
System.out.println(str2);

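Note: replaceAll takes a regular expression, so it replaces every part of the string that matches the pattern. With the character class [<>!-], that means every '<', '>', '!' and '-' is removed, including any hyphen inside the address itself.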
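To make the note concrete, here is a small sketch comparing the character-class regex with removing only the literal delimiters. The class name `ReplaceAllDemo`, both method names, and the hyphenated sample address are made up for the illustration.

```java
// Hypothetical demo class, not from the original post.
class ReplaceAllDemo {

    // Character class: removes every '<', '>', '!' and '-' anywhere in the input.
    static String stripWithCharClass(String s) {
        return s.replaceAll("[<>!-]", "");
    }

    // Literal replacement: removes only the "<!--" and "-->" delimiters.
    static String stripDelimiters(String s) {
        return s.replace("<!--", "").replace("-->", "");
    }

    public static void main(String[] args) {
        // Made-up address containing a hyphen
        String input = "<!--Address:my-site.co.uk.html-->";
        System.out.println(stripWithCharClass(input)); // Address:mysite.co.uk.html
        System.out.println(stripDelimiters(input));    // Address:my-site.co.uk.html
    }
}
```

For an address without any of those four characters, both approaches give the same result, which is why the difference is easy to miss.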

Source

2013-01-04 02:07:15 Smit

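Thanks. I need to be able to extract the addresses from the BufferedReader, so it would go through the text file, find each address, strip the tags and return the address. – rtkgpe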
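One way this could look, as a minimal sketch rather than a definitive implementation: loop over the lines, take whatever sits between "<!--" and "-->", and keep it if it starts with a given label. The names `TagExtractor`, `extractTagged` and `extractFromText` are invented for the example, the sketch assumes at most one tag per line, and a `StringReader` with made-up sample text replaces the Android raw resource. Only the "Address:" label comes from the thread; the second sample URL is invented.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not from the original post.
class TagExtractor {
    static final String OPEN = "<!--";
    static final String CLOSE = "-->";

    // Returns the text after `label` for every line holding <!--label...-->.
    // Assumes at most one tag per line.
    static List<String> extractTagged(BufferedReader reader, String label) throws IOException {
        List<String> values = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            int start = line.indexOf(OPEN);
            if (start < 0) {
                continue; // blank line, or a line without a tag
            }
            int end = line.indexOf(CLOSE, start + OPEN.length());
            if (end < 0) {
                continue; // opening delimiter without a closing one
            }
            String content = line.substring(start + OPEN.length(), end).trim();
            if (content.startsWith(label)) {
                values.add(content.substring(label.length()).trim());
            }
        }
        return values;
    }

    // Convenience wrapper for in-memory text, used here for demonstration.
    static List<String> extractFromText(String text, String label) {
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            return extractTagged(reader, label);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String sample = "<!--Address:http://www.google.co.uk.html-->\n"
                + "\n"
                + "some page text\n"
                + "<!--Address:http://www.example.com.html-->\n";
        System.out.println(extractFromText(sample, "Address:"));
    }
}
```

In the Android method from the question, the same loop body could run on the existing `buffreader`; other fields would need their own labels, which the thread does not show.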

@rob12243 I don't understand. In any case, you can use whatever logic achieves your goal. – Smit

I think what you want is something like this:

public class Test {
    public static void main(String[] args) {
        String text = "<!--Address:google.co.uk.html-->";
        String converted1 = text.replaceAll("\\<!--", "");
        String converted2 = converted1.replaceAll("\\-->", "");
        System.out.println(converted2);
    }
}

Output: Address:google.co.uk.html

Source

2013-01-04 01:49:09 9ine

Thanks, I'll see whether I can adapt it so that I can save the 5 websites. – rtkgpe

Since you're using replaceAll(), why do two separate conversions? You can achieve the same thing with a single regex. Anyway. – Smit
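A single-regex version might look like the sketch below, which uses alternation to match either delimiter in one call. The class name `SingleRegexDemo` and the method name are hypothetical; the thread does not show the exact pattern the commenter had in mind.

```java
// Hypothetical demo class, not from the original post.
class SingleRegexDemo {

    // One pass: the alternation matches either "<!--" or "-->" and removes it.
    static String stripCommentDelimiters(String text) {
        return text.replaceAll("<!--|-->", "");
    }

    public static void main(String[] args) {
        String text = "<!--Address:google.co.uk.html-->";
        System.out.println(stripCommentDelimiters(text)); // Address:google.co.uk.html
    }
}
```

Unlike a character class, the alternation only removes the complete delimiters, so hyphens inside the address are left alone.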

@smit Yes, you're right. Thanks for the advice :) – 9ine
