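As I said in my earlier comment, you can use a Map (HashMap) to store the matched words and their frequencies.
I suggest encapsulating the program's functionality in smaller methods/classes, so that each method/class performs only one small task. That makes the code easier to read.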
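To show the Map idea in isolation, here is a minimal sketch. The class name `WordCountSketch` and the method `count` are hypothetical names chosen only for illustration, and it splits on whitespace instead of using a regex. It relies on `Map.merge` (Java 8+), which stores 1 for a new word and otherwise adds 1 to the stored count.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class for illustration only.
class WordCountSketch {

    // Counts how often each whitespace-separated token occurs in the text.
    static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : text.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                // new word -> 1, known word -> old count + 1
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = count("auto bush auto");
        System.out.println(counts.get("auto")); // 2
        System.out.println(counts.get("bush")); // 1
    }
}
```

The single `merge` call does the same job as an explicit `containsKey` check followed by `put`, so either style works.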
I assume your file will contain the string "auto bush trumped her tomato in the petunia auto".
Here is the code:
package how_to_calculate_the_frequency;

import java.io.File;
import java.io.FileNotFoundException;
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Project {

    // word -> number of occurrences
    private final Map<String, Integer> map = new HashMap<>();

    public static void main(String[] args) {
        Project project = new Project();
        Scanner inputText = project.readFile();
        project.analyse(inputText);
        project.showResults();
    }

    /**
     * logic to count the occurrences of words matched by REGEX in a scanner that
     * loaded some text
     *
     * @param scanner
     *            the scanner holding the text
     */
    public void analyse(Scanner scanner) {
        String pattern = "[a-zA-Z'-]+";
        Pattern r = Pattern.compile(pattern);
        while (scanner.hasNext()) {
            // read next token
            String candidate = scanner.next();
            Matcher matcher = r.matcher(candidate);
            // a token may hold more than one word (e.g. "foo/bar"), so count every match
            while (matcher.find()) {
                String matchedWord = matcher.group();
                // System.out.println(matchedWord); // check what is matched
                this.addWord(matchedWord);
            }
        }
        scanner.close(); // Close your Scanner.
    }

    /**
     * adds a word to the <word,count> Map. If the word is new, a new entry is
     * created, otherwise the count of this word is incremented
     */
    public void addWord(String matchedWord) {
        if (map.containsKey(matchedWord)) {
            // increment occurrence
            int occurrence = map.get(matchedWord);
            occurrence++;
            map.put(matchedWord, occurrence);
        } else {
            // add word and set occurrence to 1
            map.put(matchedWord, 1);
        }
    }

    /**
     * reads a file from disk and returns a scanner to analyse it
     *
     * @return the file from disk as scanner
     */
    public Scanner readFile() {
        Scanner scanner = null;
        /* use that for reading a file from disk
         * try {
         *     scanner = new Scanner(new File("moviereview.txt")).useDelimiter(" ");
         * } catch (FileNotFoundException e) {
         *     e.printStackTrace();
         * }
         */
        scanner = new Scanner("auto bush trumped her tomato in the petunia auto");
        return scanner;
    }

    /**
     * prints the matched words and their occurrences
     * in a readable way
     */
    public void showResults() {
        for (Map.Entry<String, Integer> matchedWord : map.entrySet()) {
            int occurrence = matchedWord.getValue();
            System.out.print("\"" + matchedWord.getKey() + "\" appears " + occurrence);
            if (occurrence > 1) {
                System.out.print(" times\n");
            } else {
                System.out.print(" time\n");
            }
        }
        // or as a Java 8 lambda expression
        // map.forEach((word, occurrence) ->
        //         System.out.println("\"" + word + "\" appears " + occurrence + " times"));
    }
}
// DONE separate reading a file, analysing the file and
// word-frequency-counting-logic in different
// methods
// DONE implement <word,count> Map and logic to add new and known (to the map)
// words
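This produces:
"the" appears 1 time
"auto" appears 2 times
"her" appears 1 time
"in" appears 1 time
"bush" appears 1 time
"trumped" appears 1 time
"tomato" appears 1 time
"petunia" appears 1 time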
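The order of the lines above is simply the HashMap's iteration order, which Java does not guarantee. If you want a stable listing, one option is to sort the entries before printing. The following is only a sketch under that assumption: the class `SortedResultsSketch` and the method `sortedByCount` are hypothetical names, not part of the program above. It sorts by count (highest first) and breaks ties alphabetically.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class for illustration only.
class SortedResultsSketch {

    // Returns the entries ordered by count (highest first); ties are ordered alphabetically.
    static List<Map.Entry<String, Integer>> sortedByCount(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return entries;
    }

    public static void main(String[] args) {
        // Build the counts for the sample sentence used in the answer.
        Map<String, Integer> counts = new HashMap<>();
        for (String word : "auto bush trumped her tomato in the petunia auto".split(" ")) {
            counts.merge(word, 1, Integer::sum);
        }
        // Print in the same "word appears N time(s)" format, now in a fixed order.
        for (Map.Entry<String, Integer> entry : sortedByCount(counts)) {
            int n = entry.getValue();
            System.out.println("\"" + entry.getKey() + "\" appears " + n + (n > 1 ? " times" : " time"));
        }
    }
}
```

If alphabetical order alone is enough, swapping the `HashMap` for a `java.util.TreeMap` gives sorted keys without any extra sorting step.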
Regards
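Can you be more specific? What happens now? We are not here to run your code for you, and we don't have your text file. –
I can't help you. I refuse to look at code when you can't even format (indent) it properly to show the code structure. – Andreas
Welcome to StackOverflow. You are most likely to get useful answers if you follow the guidelines in the Help Center. For example, this one: "Questions seeking debugging help ("why isn't this code working?") must include the desired behavior, a specific problem or error and the shortest code necessary to reproduce it in the question itself. Questions without a clear problem statement are not useful to other readers." –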