2011-03-14 135 views
0
for (a = 0; a < filename; a++) {

    try {
        System.out
                .println(" _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ ");
        System.out.println("\n");
        System.out.println("The word inputted : " + word2);
        File file = new File(
                "C:\\Users\\user\\fypworkspace\\TextRenderer\\abc" + a
                        + ".txt");
        System.out.println(" _________________");

        System.out.print("| File = abc" + a + ".txt | \t\t \n");

        for (int i = 0; i < array2.length; i++) {

            totalCount = 0;
            wordCount = 0;

            Scanner s = new Scanner(file);
            while (s.hasNext()) {
                totalCount++;
                if (s.next().equals(array2[i]))
                    wordCount++;
            }

            System.out.print(array2[i] + " --> Word count = "
                    + "\t " + "|" + wordCount + "|");
            System.out.print(" Total count = " + "\t " + "|"
                    + totalCount + "|");
            System.out.printf(" Term Frequency = | %8.4f |",
                    (double) wordCount / totalCount);

            System.out.println("\t ");

            double inverseTF = Math.log10((float) numDoc
                    / (numofDoc[i]));
            System.out.println(" --> IDF = " + inverseTF);

            double TFIDF = (((double) wordCount / totalCount) * inverseTF);
            System.out.println(" --> TF/IDF = " + TFIDF + "\n");
        }
    } catch (FileNotFoundException e) {
        System.out.println("File is not found");
    }
}
}

}

How do I sum up the total value?

This is the sample output:

The word inputted : how are you

| File = abc0.txt |

how --> Word count = | 4 | Total count = | 957 | Term Frequency = | 0.0042 |

--> IDF = 0.5642714398516419

--> TF/IDF = 0.0023585013159943234

are --> Word count = | 7 | Total count = | 957 | Term Frequency = | 0.0073 |

--> IDF = 0.1962946357308887

--> TF/IDF = 0.00143580193324579

you --> Word count = | 10 | Total count = | 957 | Term Frequency = | 0.0104 |

--> IDF = 0.1962946357308887

--> TF/IDF = 0.002051145618922557

How do I sum all 3 TF/IDF values for each text file?

Answers

1

Assuming you just want a running total that you can display, add something like this before your inner for loop (the one over array2):

double runningTfIDF = 0; 

Then, after the current TF/IDF has been computed, add the line

runningTfIDF += TFIDF; 

Then, after that for loop, you can add a line to print runningTfIDF.
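The three steps can be put together in a small standalone sketch. The class and method names below are made up for illustration, and the TF/IDF values are simply copied from the sample output for abc0.txt instead of being computed from a file.

```java
import java.util.Locale;

// Hypothetical helper class; the names are illustrative and not from the question.
class RunningTotalSketch {

    // Adds up per-word TF/IDF values the way the loop in the question would.
    static double sumTfIdf(double[] tfIdfPerWord) {
        double runningTfIDF = 0;            // declared before the loop
        for (double tfidf : tfIdfPerWord) {
            runningTfIDF += tfidf;          // added right after each TF/IDF is known
        }
        return runningTfIDF;                // complete once the loop has finished
    }

    public static void main(String[] args) {
        // TF/IDF for "how", "are", "you", copied from the sample output above.
        double[] sample = {
            0.0023585013159943234, 0.00143580193324579, 0.002051145618922557
        };
        System.out.printf(Locale.ROOT, " --> Total TF/IDF = %.6f%n", sumTfIdf(sample));
    }
}
```

For the three sample values the total comes out at roughly 0.005845. In the question's code, runningTfIDF would have to be reset to 0 for every file, otherwise the total carries over from abc0.txt into abc1.txt.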

Edited to include a more complete answer

// Needs java.math.BigDecimal, java.util.HashMap and java.util.Scanner.
// file, array2, numDoc, numofDoc, totalCount and wordCountVal are the
// variables from the question (wordCountVal replaces the int wordCount).
HashMap<String, BigDecimal> runningTfIDF = new HashMap<String, BigDecimal>();
HashMap<String, BigDecimal> wordCount = new HashMap<String, BigDecimal>();
HashMap<String, BigDecimal> frequency = new HashMap<String, BigDecimal>();
HashMap<String, BigDecimal> inverseTF = new HashMap<String, BigDecimal>();
for (int i = 0; i < array2.length; i++) {

    totalCount = 0;
    wordCountVal = 0;

    Scanner s = new Scanner(file);
    while (s.hasNext()) {
        totalCount++;
        if (s.next().equals(array2[i]))
            wordCountVal++;
    }
    s.close();

    // Store every per-word value so it can be printed after the loop.
    wordCount.put(array2[i], new BigDecimal(wordCountVal));

    BigDecimal frequencyVal = new BigDecimal((double) wordCountVal / totalCount);
    frequency.put(array2[i], frequencyVal);

    BigDecimal inverseTFVal = new BigDecimal(Math.log10((float) numDoc
            / (numofDoc[i])));
    inverseTF.put(array2[i], inverseTFVal);

    BigDecimal TFIDF = frequencyVal.multiply(inverseTFVal);
    runningTfIDF.put(array2[i], TFIDF);
}

// Print only after all words have been processed.
for (String word : wordCount.keySet()) {
    System.out.print(word + " --> word count "
            + "\t |" + wordCount.get(word) + "|");
    System.out.print(" Total count = " + "\t " + "|"
            + totalCount + "|");
    System.out.printf(" Term Frequency = | %8.4f |",
            frequency.get(word));

    System.out.println("\t ");

    System.out.println(" --> IDF = " + inverseTF.get(word));

    System.out.println(" --> TF/IDF = " + runningTfIDF.get(word) + "\n");
}

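This isn't the cleanest implementation right now, but in short: if you want the total to be displayed starting with the very first result, you need to store the information for each word and loop over the words only after the total has been computed. Does that make sense?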
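A minimal two-pass sketch of that idea follows. It is not the code from the question: the class, method and parameter names are hypothetical, the text comes from a string instead of a file, the document counts are made up, and plain double is used instead of BigDecimal to keep it short. It also assumes every document count is greater than zero.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Scanner;

// Hypothetical sketch; class, method and parameter names are illustrative.
class StoredTfIdfSketch {

    // First pass: compute and store TF/IDF per query word for one document.
    // docsContaining[i] is assumed to be the number of documents containing words[i].
    static Map<String, Double> tfIdfPerWord(String text, String[] words,
                                            int numDocs, int[] docsContaining) {
        Map<String, Double> result = new LinkedHashMap<>(); // keeps input order
        for (int i = 0; i < words.length; i++) {
            int totalCount = 0;
            int wordCount = 0;
            try (Scanner s = new Scanner(text)) {
                while (s.hasNext()) {
                    totalCount++;
                    if (s.next().equals(words[i])) {
                        wordCount++;
                    }
                }
            }
            double tf = totalCount == 0 ? 0.0 : (double) wordCount / totalCount;
            double idf = Math.log10((double) numDocs / docsContaining[i]);
            result.put(words[i], tf * idf);
        }
        return result;
    }

    // Sum of all stored values; only known once the first pass is complete.
    static double sum(Map<String, Double> tfIdf) {
        double total = 0;
        for (double value : tfIdf.values()) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        String text = "how are you and how are they"; // stand-in for a file's contents
        String[] words = {"how", "are", "you"};
        int[] docsContaining = {1, 5, 5};              // made-up document counts
        Map<String, Double> tfIdf = tfIdfPerWord(text, words, 10, docsContaining);
        double total = sum(tfIdf);

        // Second pass: the total already exists, so it can be shown under every word.
        for (Map.Entry<String, Double> e : tfIdf.entrySet()) {
            System.out.printf(Locale.ROOT, "%s --> TF/IDF = %.6f%n", e.getKey(), e.getValue());
            System.out.printf(Locale.ROOT, " --> Total TF/IDF = %.6f%n", total);
        }
    }
}
```

A LinkedHashMap is used here so the words print in the order they were entered; a plain HashMap gives no such guarantee.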

+0

Thank you, sir, but I need to total it and display it under each word's TF/IDF. Could you guide me, sir? –

+0

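Do you mean you would display the first word's TFIDF twice, then the second word's TFIDF, followed by the summed TFIDF? You can print runningTfIDF on each iteration, and it will give you the sum at that point in time. – dmcnelis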

+0

I mean showing the whole [sum of the 3 words] at the first word.. sorry for making it so confusing.. –