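Please help me solve this problem. The program below reads URLs from a file, downloads each page, and saves it, but the pages are saved in the wrong encoding instead of UTF-8 (when a page is saved, its content shows up as garbled characters). Where in the code should I make the change so that the files are saved in UTF-8? Please show what to insert.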
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;
import java.util.Scanner;
public class url{
public static void main(String[] args) {
try {
URL PageUrl;
URLConnection GetConn = null;
Scanner sc = new Scanner(new File("C:\\test\\url.txt"));
String htmlPage;
while (sc.hasNext()){
htmlPage = sc.nextLine();
PageUrl = new URL(htmlPage);
GetConn = PageUrl.openConnection();
GetConn.connect();
// establish connection:
Scanner scUrl = new Scanner(GetConn.getInputStream());
StringBuffer sb = new StringBuffer();
while(scUrl.hasNext()){
sb.append(scUrl.nextLine());
}
scUrl.close();
String htmlFileName = ("C:\\test\\1\\"+title(sb.toString())+".html");
FileWriter FWriter = new FileWriter(htmlFileName);
BufferedWriter BWriter = new BufferedWriter(FWriter);
BWriter.write(sb.toString());
BWriter.close();
} // end while
sc.close();
}
catch (IOException io) {
System.out.println(io);
}
}
private static String title(String str){
return str.substring(str.indexOf("title>")+6, str.indexOf("</title>"));
}
}
I need the program to save the resulting files in UTF-8 encoding.
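One likely cause: `new Scanner(InputStream)` and `new FileWriter(String)` both use the platform default charset (before Java 18 this is often not UTF-8 on Windows), so the bytes can be decoded and re-encoded incorrectly. Below is a minimal sketch of naming the charset explicitly on both the read and the write side. The class name `Utf8PageSaver` and the helpers `readAll` and `writeUtf8` are made up for illustration, and the sketch assumes the pages are actually served as UTF-8; a page in another encoding would need that charset passed to `readAll` instead.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class, not part of the original program.
public class Utf8PageSaver {

    // Decode the whole stream with the given charset, keeping line breaks.
    static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    // Write the text as UTF-8 regardless of the platform default encoding.
    static void writeUtf8(Path target, String text) throws IOException {
        try (BufferedWriter writer =
                 Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            writer.write(text);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: java Utf8PageSaver <url> <output-file>");
            return;
        }
        URLConnection conn = new URL(args[0]).openConnection();
        String html;
        try (InputStream in = conn.getInputStream()) {
            // Assumption: the page is UTF-8 encoded.
            html = readAll(in, StandardCharsets.UTF_8);
        }
        writeUtf8(Paths.get(args[1]), html);
    }
}
```

In the program from the question, the same idea would mean replacing `new Scanner(GetConn.getInputStream())` with `new Scanner(GetConn.getInputStream(), "UTF-8")` and replacing the `FileWriter` with a writer that names its charset, for example `new OutputStreamWriter(new FileOutputStream(htmlFileName), "UTF-8")`.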