下面的代碼是爲列表中的每個用戶運行一些代碼的實現。下面的代碼只是比較每個用戶和連接它們的屬性:將必要的解決方案轉換爲併發映射解決方案
case class UserObj(id: String, nCoordinate : String)
val userList = List(UserObj("a1" , "1234"),UserObj("a2" , "1234"), UserObj("a3" , "1234"))
val map1 = new java.util.concurrent.ConcurrentHashMap[String, Double]
userList.par.map(xUser => {
userList.par.map(yUser => {
if (!xUser.id.isEmpty() && !yUser.id.isEmpty()) {
println("Total is "+xUser.id+yUser.id+","+xUser.nCoordinate+yUser.nCoordinate)
map1.put(xUser.id + "," + yUser.id , getJaccardDistance(xUser.nCoordinate, yUser.nCoordinate))
}
})
println("")
}) //> Total is a1a1,12341234
//| Total is a3a1,12341234
//| Total is a2a1,12341234
//| Total is a3a2,12341234
//| Total is a1a2,12341234
//| Total is a3a3,12341234
//| Total is a2a2,12341234
//|
//| Total is a1a3,12341234
//| Total is a2a3,12341234
//|
//|
//| res0: scala.collection.parallel.immutable.ParSeq[Unit] = ParVector((),(), (
//|))
def getJaccardDistance(str1: String, str2: String) = {
val zipped = str1.zip(str2)
val numberOfEqualSequences = zipped.count(_ == ('1', '1')) * 2
val p = zipped.count(_ == ('1', '1')).toFloat * 2
val q = zipped.count(_ == ('1', '0')).toFloat * 2
val r = zipped.count(_ == ('0', '1')).toFloat * 2
val s = zipped.count(_ == ('0', '0')).toFloat * 2
(q + r)/(p + q + r)
}
這以前是勢在必行的解決方案:
for (xUser <- userList) {
for (yUser <- userList) {
if (!xUser.id.isEmpty() && !yUser.id.isEmpty()) {
println("Total is "+xUser.id+yUser.id+","+xUser.nCoordinate+yUser.nCoordinate)
}
}
println("")
}
但我想利用Scala的並行集合,我認爲使用地圖推薦實現這一點的方法。由於上面的命令式代碼可能導致多個線程運行相同的代碼。注意:正在執行的上述代碼:println("Total is "+xUser.id+yUser.id+","+xUser.nCoordinate+yUser.nCoordinate)
只是實際運行算法的一個更簡單的版本。
發佈在問題開始處的功能解決方案的行爲與預期相同,但是一旦該列表包含更多的3000個元素,它幾乎會暫停。爲什麼會發生?我的實現是否正確?
請參閱更新,香港專業教育學院張貼的算法 –
@藍天包含在循環中的Total的'println'是什麼? –
是的,但它不需要。我只是將它包括在內以使其更易於閱讀,看起來效果相反 –