Checking data from a CSV list

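Hi, I'm new to Python and I want to build up my knowledge by producing a working function. I'm trying to build a function that creates a list of 6 random numbers drawn from a set of numbers in the range 1 to 59. I've cracked that part; the next part is the tricky one. I now want to check a CSV file for the numbers in the random set and print a notification if two or more numbers from the set are found. So far I've tried print (df[df[0:].isin(luckyDip)]), with some success: it checks the DataFrame for the numbers and shows the ones that match, but it also shows the rest of the DataFrame as NaN, which is not very pleasing to look at and not what I want.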

I'm just looking for some pointers on what to do next, or even just what to search Google for. Below is the code I've been messing around with.

import random 
import pandas as pd 

url ='https://www.national-lottery.co.uk/results/euromillions/draw-history/csv' 
df = pd.read_csv(url, sep=',', na_values=".") 

lottoNumbers = [1,2,3,4,5,6,7,8,9,10, 
      11,12,13,14,15,16,17,18,19,20, 
      21,22,23,24,25,26,27,28,29,30, 
      31,32,33,34,35,36,37,38,39,40, 
      41,42,43,44,45,46,47,48,49,50, 
      51,52,53,54,55,56,57,58,59] 
luckyDip = random.sample(lottoNumbers, k=6) #Picks 6 numbers at random 
print (sorted(luckyDip))  
print (df[df[0:].isin(luckyDip)])

Source

2017-05-23 Mortgage1

If you just want to flatten the array and drop the NaN values, you can add this to the end of your code:

import numpy as np

# Keep only the matching cells, flatten to 1-D, then drop the NaN placeholders
matches = df[df[0:].isin(luckyDip)].values.flatten().astype(np.float64)
print(matches[~np.isnan(matches)])
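
To try this without downloading the file, here is a minimal sketch on a small made-up DataFrame. The column names, index labels and values are invented for illustration, and luckyDip is fixed rather than random so the result is repeatable.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the lottery CSV: three draws, three balls each.
df = pd.DataFrame(
    {"Ball 1": [3, 10, 7], "Ball 2": [19, 23, 41], "Ball 3": [23, 30, 50]},
    index=["draw_a", "draw_b", "draw_c"],
)
luckyDip = [19, 23, 41, 5, 6, 8]  # fixed pick, assumed for the example

# Indexing with a same-shaped boolean mask keeps the matches
# and turns every other cell into NaN.
masked = df[df.isin(luckyDip)]

# Flatten row by row, then drop the NaN placeholders.
flat = masked.values.flatten().astype(np.float64)
matched = flat[~np.isnan(flat)]
print(matched)
```

Note that flattening loses track of which draw each number came from, so this only tells you which numbers matched somewhere in the file.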

Source

2017-05-23 21:30:58 user2188329

Not as elegant as @ayhan's solution, but this works:

import random 
import pandas as pd 

url ='https://www.national-lottery.co.uk/results/euromillions/draw-history/csv' 
df = pd.read_csv(url, index_col=0, sep=',') 

lottoNumbers = range(1, 60) 

tries = 0 
while True: 
    tries+=1 
    luckyDip = random.sample(lottoNumbers, k=6) #Picks 6 numbers at random 

    # subset of balls 
    draws = df.iloc[:,0:7] 

    # True where there is match 
    matches = draws.isin(luckyDip) 

    # Gives the sum of Trues 
    sum_of_trues = matches.sum(1) 

    # you are looking for matches where sum_of_trues is 6 
    final = sum_of_trues[sum_of_trues == 6] 
    if len(final) > 0:
        print("Took", tries)
        print(final)
        break

The result looks like this:

Took 15545 
DrawDate 
16-May-2017 6 
dtype: int64
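
The loop above runs until it finds a full match, which can take a while. As a rough sketch of the same idea, here it is wrapped in a function with a cap on the number of attempts. The function name tries_until_match, its parameters and the tiny draws table are all made up for the example and are not part of the code above.

```python
import random
import pandas as pd


def tries_until_match(draws, pool, k, needed, max_tries=100_000, seed=None):
    """Sample k numbers until some row of `draws` contains at least `needed` of them.

    Returns (tries, pick, winners), or None if max_tries is reached first.
    """
    rng = random.Random(seed)
    for tries in range(1, max_tries + 1):
        pick = rng.sample(pool, k=k)
        # True where a cell is one of the picked numbers, summed per row
        hits = draws.isin(pick).sum(axis=1)
        winners = hits[hits >= needed]
        if not winners.empty:
            return tries, sorted(pick), winners
    return None


# Hypothetical data: two draws of three balls from the numbers 1 to 10.
draws = pd.DataFrame(
    {"Ball 1": [1, 2], "Ball 2": [4, 5], "Ball 3": [7, 9]},
    index=["draw_a", "draw_b"],
)
result = tries_until_match(draws, range(1, 11), k=3, needed=3, seed=0)
if result is not None:
    tries, pick, winners = result
    print("Took", tries, "tries; picked", pick)
    print(winners)
```

Setting needed=2 instead would stop at the first pick that shares two or more numbers with any draw, which is closer to what the question asks for.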

Source

2017-05-23 21:34:13 RicLeal

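You can build on what you already have by counting the non-null values in each row, then showing the rows where the number of matches is greater than or equal to 2.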

match_count = df[df[0:].isin(luckyDip)].notnull().sum(axis=1) 
print(match_count[match_count >= 2])

This gives you the index value of each matching row and its number of matches.
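
One thing to watch for: applying isin to the whole DataFrame also counts hits in columns that are not balls. Here is a small sketch that restricts the count to the ball columns, on made-up data. The column names, including DrawNumber, and all values are assumptions for illustration, so adjust the column filter to whatever the real file uses.

```python
import pandas as pd

# Made-up draws; "DrawNumber" is a non-ball column that should not be counted.
df = pd.DataFrame(
    {
        "Ball 1": [3, 10, 7, 19],
        "Ball 2": [19, 23, 41, 23],
        "Ball 3": [23, 30, 50, 34],
        "DrawNumber": [19, 20, 21, 22],
    },
    index=["01-May", "05-May", "09-May", "12-May"],
)
luckyDip = [19, 23, 34, 5, 6, 8]  # fixed pick, assumed for the example

# Only look at columns whose name starts with "Ball".
ball_cols = [c for c in df.columns if c.startswith("Ball")]

# isin gives a True/False table; summing across each row counts the matches.
match_count = df[ball_cols].isin(luckyDip).sum(axis=1)
hits = match_count[match_count >= 2]
print(hits)
```

Summing the boolean table directly also avoids building the NaN-filled intermediate DataFrame.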

If you also want the matching values from those rows, you can add:

index = match_count[match_count >= 2].index 
matches = [tuple(x[~pd.isnull(x)]) for x in df.loc[index][df[0:].isin(luckyDip)].values] 
print(matches)

Example output:

[(19.0, 23.0), (19.0, 41.0), (19.0, 23.0, 34.0), (23.0, 28.0)]
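
The values come back as floats because the unmatched cells were turned into NaN first. If you would rather keep whole numbers and know which draw each tuple belongs to, one possible variation is sketched below, again on invented data (the column names, index labels and fixed luckyDip are assumptions).

```python
import pandas as pd

# Made-up draws keyed by a made-up draw date.
df = pd.DataFrame(
    {
        "Ball 1": [3, 10, 7, 19],
        "Ball 2": [19, 23, 41, 23],
        "Ball 3": [23, 30, 50, 34],
    },
    index=["01-May", "05-May", "09-May", "12-May"],
)
luckyDip = [19, 23, 34, 5, 6, 8]  # fixed pick, assumed for the example
lucky = set(luckyDip)

# For every draw, collect the numbers that are also in the lucky dip.
# tolist() converts each row to plain Python ints.
matched = {
    idx: tuple(n for n in row if n in lucky)
    for idx, row in zip(df.index, df.to_numpy().tolist())
}

# Keep only the draws with two or more matching numbers.
matched = {idx: nums for idx, nums in matched.items() if len(nums) >= 2}
print(matched)
```

This loops over rows in Python, so it will be slower than the vectorised version on a very large file, but for a few thousand draws that should not matter.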

Source

2017-05-23 21:44:01
