2011-03-15 34 views
0
for (a = 0; a < filename; a++) {
    try {
        System.out
                .println(" _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ ");
        System.out.println("\n");
        System.out.println("The word inputted : " + word2);
        File file = new File(
                "C:\\Users\\user\\fypworkspace\\TextRenderer\\abc" + a
                        + ".txt");
        System.out.println(" _________________");

        System.out.print("| File = abc" + a + ".txt | \t\t \n");

        for (int i = 0; i < array2.length; i++) {

            totalCount = 0;
            wordCount = 0;

            Scanner s = new Scanner(file);
            while (s.hasNext()) {
                totalCount++;
                if (s.next().equals(array2[i]))
                    wordCount++;
            }

            System.out.print(array2[i] + " --> Word count = "
                    + "\t " + "|" + wordCount + "|");
            System.out.print(" Total count = " + "\t " + "|"
                    + totalCount + "|");
            System.out.printf(" Term Frequency = | %8.4f |",
                    (double) wordCount / totalCount);

            System.out.println("\t ");

            double inverseTF = Math.log10((float) numDoc
                    / (numofDoc[i]));
            System.out.println(" --> IDF = " + inverseTF);

            double TFIDF = (((double) wordCount / totalCount) * inverseTF);
            System.out.println(" --> TF/IDF = " + TFIDF + "\n");
        }
    } catch (FileNotFoundException e) {
        System.out.println("File is not found");
    }
}
}

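This is my code for calculating the term frequency of each query word I enter. Now I am trying to add up the counts of all the query words for each file. How can I total the query word counts per file?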

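Sample output:

The number of files in this folder is: 11
Please enter query: how are you
how --> Number of files that contain this term: 3
are --> Number of files that contain this term: 7
you --> Number of files that contain this term: 7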


The word inputted : how are you


| File = abc0.txt |
how --> Word count = | 4 | Total count = | 957 | Term Frequency = | 0.0042 |
 --> IDF = 0.5642714398516419
 --> TF/IDF = 0.0023585013159943234

are --> Word count = | 7 | Total count = | 957 | Term Frequency = | 0.0073 |
 --> IDF = 0.1962946357308887
 --> TF/IDF = 0.00143580193324579

you --> Word count = | 10 | Total count = | 957 | Term Frequency = | 0.0104 |
 --> IDF = 0.1962946357308887
 --> TF/IDF = 0.002051145618922557

Example: the total frequency would be 4 + 7 + 10 = 21.


The word inputted : how are you


| File = abc1.txt |
how --> Word count = | 4 | Total count = | 959 | Term Frequency = | 0.0042 |
 --> IDF = 0.5642714398516419
 --> TF/IDF = 0.0023535826479734803

are --> Word count = | 7 | Total count = | 959 | Term Frequency = | 0.0073 |
 --> IDF = 0.1962946357308887
 --> TF/IDF = 0.0014328075600794795

you --> Word count = | 10 | Total count = | 959 | Term Frequency = | 0.0104 |
 --> IDF = 0.1962946357308887
 --> TF/IDF = 0.002046867942970685

How can I make it print the total of the 3 query word counts for each file?

Example: the total frequency would be 4 + 7 + 10 = 21.

+0

Possible duplicate of "How to sum the total value?" (http://stackoverflow.com/questions/5298489/how-to-sum總價值) – 2011-03-15 13:36:49

+0

No, this is a different problem I am facing. However, I have already figured it out. Thanks for your concern. – 2011-03-15 13:44:03

+0

If that is the case, then you are making it really hard to figure out what you are actually asking. – 2011-03-15 13:45:53

Answers

0

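The totalcount must be declared outside of your try block. Initialize it before the try and print it after the try has finished. There are a lot of concerns about the design of your Java program, and I hope you will think about those as well. For the time being, this may be all you need: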

for (a = 0; a < filename; a++) {
    int totalcount = 0; // per-file total, declared outside the try
    try {
        int wordcount = 0;
        for (...) {
            ...
        }
        // print wordcount
        totalcount += wordcount;
    } catch (Exception e) {
        ...
        return; // ensures that no total is printed if something goes wrong
    }
    // print totalcount
}
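The skeleton above can be filled in roughly as follows. This is only a minimal sketch under stated assumptions: the class name `QueryTotal`, the helpers `countTerms` and `sum`, and the variable `queryTotal` are hypothetical names that do not appear in the question, and a `String` stands in for the file contents so the example runs on its own (in the original program the scanner would be created with `new Scanner(file)`). Note that the per-file sum here is a different quantity from the question's `totalCount`, which counts every token in the file.

```java
import java.util.Scanner;

public class QueryTotal {

    /** Counts, in a single pass, how often each query term occurs in the scanned tokens. */
    static int[] countTerms(Scanner in, String[] terms) {
        int[] counts = new int[terms.length];
        while (in.hasNext()) {
            String token = in.next();
            for (int i = 0; i < terms.length; i++) {
                if (token.equals(terms[i])) {
                    counts[i]++;
                }
            }
        }
        return counts;
    }

    /** Adds up the per-term counts to get the total for one file. */
    static int sum(int[] counts) {
        int total = 0;
        for (int count : counts) {
            total += count;
        }
        return total;
    }

    public static void main(String[] args) {
        String[] terms = {"how", "are", "you"};
        // Assumption: this string stands in for the contents of one file.
        String sample = "how are you and how are they";

        int queryTotal = 0; // initialized before the try, as the answer suggests
        try (Scanner in = new Scanner(sample)) {
            int[] counts = countTerms(in, terms);
            for (int i = 0; i < terms.length; i++) {
                System.out.println(terms[i] + " --> Word count = " + counts[i]);
            }
            queryTotal = sum(counts);
        }
        // printed after the try block has finished
        System.out.println("Total query count = " + queryTotal);
    }
}
```

Reading the file once and updating all term counters in the same pass also avoids opening a new `Scanner` for every query term.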
0

You need to store the word counts (per file) in an array, or you can add them to some "sum" variable (which is initialized outside the loop).
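The array variant could look something like the sketch below. It rests on assumptions: `PerFileTotals`, `queryTotal` and `totalsPerDocument` are hypothetical names, and each string in `documents` stands in for the contents of one of the `abc<a>.txt` files, so the example runs without touching the file system.

```java
import java.util.Scanner;

public class PerFileTotals {

    /** Returns the summed occurrences of all query terms in one document. */
    static int queryTotal(Scanner in, String[] terms) {
        int sum = 0; // the "sum" variable, initialized outside the token loop
        while (in.hasNext()) {
            String token = in.next();
            for (String term : terms) {
                if (token.equals(term)) {
                    sum++;
                }
            }
        }
        return sum;
    }

    /** Stores one total per document, indexed like the files in the question. */
    static int[] totalsPerDocument(String[] documents, String[] terms) {
        int[] totals = new int[documents.length];
        for (int a = 0; a < documents.length; a++) {
            try (Scanner in = new Scanner(documents[a])) {
                totals[a] = queryTotal(in, terms);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        String[] terms = {"how", "are", "you"};
        // Assumption: each string stands in for the text of one file.
        String[] documents = {"how are you", "you are here are you"};

        int[] totals = totalsPerDocument(documents, terms);
        for (int a = 0; a < totals.length; a++) {
            System.out.println("abc" + a + ".txt --> Total query count = " + totals[a]);
        }
    }
}
```

Keeping the totals in an array makes them available after the loop, for example to compare or rank the files later.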

+0

Thanks. I have already figured it out. Thanks for your concern. – 2011-03-15 13:43:37