Comparing two HashMaps and counting the number of duplicate values

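I have created two HashMaps containing the strings from two separate txt files. Now I want to compare the two HashMaps and count how many duplicate values the files have in common. For example, if file1 and file2 both contain the string "hello" twice, my console should print: hello occurred 2 times.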

Here is my first HashMap:

// Requires java.util.*, java.util.Map.Entry and java.util.stream.Stream
List<String> word_list = new ArrayList<>();

// Load the words from the first file into word_list
while (INPUT_TEXT1.hasNext()) {
    String input_word = INPUT_TEXT1.next();
    word_list.add(input_word);
}

INPUT_TEXT1.close();

// Strip everything except letters and lower-case each word in place
String regexPattern = "[^a-zA-Z]";
int index = 0;
for (String s : word_list) {
    word_list.set(index++, s.replaceAll(regexPattern, "").toLowerCase());
}

// Find the unique words in the list
String[] uniqueWords = word_list.stream().distinct().toArray(String[]::new);
Map<String, Integer> wordsMap = new HashMap<>();
int frequency = 0;

// Load the words into the map, with each unique word as key and its frequency as value
for (String uniqueWord : uniqueWords) {
    frequency = Collections.frequency(word_list, uniqueWord);
    System.out.println(uniqueWord + " occurred " + frequency + " times");
    wordsMap.put(uniqueWord, frequency);
}

// Sort the entries by frequency (the map's values) in descending order and keep the top 5
Stream<Entry<String, Integer>> topWords = wordsMap.entrySet().stream()
        .sorted(Map.Entry.<String, Integer>comparingByValue().reversed())
        .limit(5);

// Print the top 5 words to the console
System.out.println("Top 5 Words:::");
topWords.forEach(System.out::println);

System.out.println("\n\n");

Here is my second HashMap:

List<String> wordList = new ArrayList<>();

// Load the words from the second file into wordList
while (INPUT_TEXT2.hasNext()) {
    String input_word1 = INPUT_TEXT2.next();
    wordList.add(input_word1);
}

INPUT_TEXT2.close();

// Strip everything except letters and lower-case each word in place
String regex = "[^a-zA-Z]";
int index1 = 0;
for (String s : wordList) {
    wordList.set(index1++, s.replaceAll(regex, "").toLowerCase());
}

String[] uniqueWords1 = wordList.stream().distinct().toArray(String[]::new);
Map<String, Integer> wordsMap1 = new HashMap<>();

// Load the words into the second map, with each unique word as key and its frequency as value
for (String uniqueWord : uniqueWords1) {
    frequency = Collections.frequency(wordList, uniqueWord);
    System.out.println(uniqueWord + " occurred " + frequency + " times");
    wordsMap1.put(uniqueWord, frequency); // was wordsMap.put, which left wordsMap1 empty
}

// Sort the entries by frequency (the map's values) in descending order and keep the top 5
Stream<Entry<String, Integer>> topWords1 = wordsMap1.entrySet().stream()
        .sorted(Map.Entry.<String, Integer>comparingByValue().reversed())
        .limit(5);

This is my original approach to finding the duplicate values:

boolean val = wordsMap.keySet().containsAll(wordsMap1.keySet()); 

    for (Entry<String, Integer> str : wordsMap.entrySet()) { 
     System.out.println("================= " + str.getKey()); 


     if(wordsMap1.containsKey(str.getKey())){ 
      System.out.println("Map2 Contains Map 1 Key"); 
     } 
    } 

    System.out.println("================= " + val);

Does anyone have other suggestions for achieving this? Thanks.

Edit: How can I count the number of occurrences of each individual value?

Source

2016-11-20 codeREXO

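Why doesn't your own code work? – ifly6

Wow! That is about the worst implementation of building a word-frequency map I have ever seen. A full scan of the list to get the unique words, then a full scan *for each* unique word. Yikes! Since you are using Java 8 streams, try 'stream().collect(Collectors.groupingBy(w -> w, Collectors.counting()))'. – Andreas

I focused on the last check, thinking the OP was asking how to improve it, and completely overlooked the first part. I agree with Andreas that the first part should be completely refactored. – user6904265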
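A minimal sketch of the single-pass approach Andreas describes, assuming the words have already been read into a list. The class name WordFrequency, the method name count, and the filter that drops words left empty after cleanup are illustrative additions, not part of the original code. Note that Collectors.counting() produces Long values, so the map is a Map<String, Long> rather than Map<String, Integer>.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class WordFrequency {

    // Builds a word -> count map in a single pass over the input words.
    // Non-letters are stripped and words are lower-cased, mirroring the
    // question's "[^a-zA-Z]" cleanup. Words that end up empty are dropped;
    // that filter is an assumption, the original code keeps them.
    static Map<String, Long> count(List<String> words) {
        return words.stream()
                .map(w -> w.replaceAll("[^a-zA-Z]", "").toLowerCase())
                .filter(w -> !w.isEmpty())
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("Hello,", "world", "hello", "--", "World!");
        // HashMap iteration order is unspecified, so the line order may vary
        count(words).forEach((word, n) -> System.out.println(word + " occurred " + n + " times"));
    }
}
```

Compared with distinct() followed by Collections.frequency for every unique word, this visits each word once instead of rescanning the whole list per unique word.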

I think your code works fine. If your goal is to find a better way to implement the last check, you could try this:

Set<String> keySetMap1 = new HashSet<String>(wordsMap.keySet()); 
Set<String> keySet2 = wordsMap1.keySet(); 
keySetMap1.retainAll(keySet2); 
keySetMap1.stream().forEach(x -> System.out.println("Map2 Contains Map 1 Key: "+x));

Source

2016-11-20 22:06:18 user6904265

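How would I go about counting the number of times each duplicate value occurs? – codeREXO

To answer the question of how to count the occurrences of each individual value, you can refactor the code as Andreas suggested: 'Map<String, Long> wordsMap = word_list.stream().collect(Collectors.groupingBy(w -> w, Collectors.counting()));' With this one line you can compute the word-frequency map. I hope we have answered all your questions. – user6904265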
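Once both files have a frequency map, the words they share can be reported together with a count. The sketch below is one possible way to do that, not the answerer's code: the class name CommonWords and the method commonCounts are hypothetical, and reporting the smaller of the two counts is an assumption, since the question does not say what to print when a word's counts differ between the files.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class CommonWords {

    // For every word present in both maps, records the smaller of the two
    // counts. Using the minimum is an assumption; the question only covers
    // the case where both files contain the word the same number of times.
    static Map<String, Long> commonCounts(Map<String, Long> first, Map<String, Long> second) {
        Map<String, Long> common = new TreeMap<>(); // sorted keys for stable output
        for (Map.Entry<String, Long> e : first.entrySet()) {
            Long other = second.get(e.getKey());
            if (other != null) {
                common.put(e.getKey(), Math.min(e.getValue(), other));
            }
        }
        return common;
    }

    public static void main(String[] args) {
        // Sample frequency maps standing in for the two files
        Map<String, Long> file1 = new HashMap<>();
        file1.put("hello", 2L);
        file1.put("world", 1L);
        Map<String, Long> file2 = new HashMap<>();
        file2.put("hello", 2L);
        file2.put("java", 3L);

        commonCounts(file1, file2)
                .forEach((word, n) -> System.out.println(word + " occurred " + n + " times"));
        // prints: hello occurred 2 times
    }
}
```

If both counts are wanted instead of the minimum, the loop body could print e.getValue() and other side by side rather than storing a single number.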
