2017-02-14 47 views

ValueError: Must pass DataFrame with boolean values only, when filtering on the 'REGION' column

In this data file, the United States is divided into four regions using the 'REGION' column.

Create a query that finds the counties that belong to region 1 or 2, whose name starts with 'Washington', and whose POPESTIMATE2015 is greater than their POPESTIMATE2014.

The function should return a 5x2 DataFrame with columns = ['STNAME','CTYNAME'] and the same index IDs as census_df (sorted ascending by index).

Code

def answer_eight(): 
    counties=census_df[census_df['SUMLEV']==50] 
    regions = counties[(counties[counties['REGION']==1]) | (counties[counties['REGION']==2])] 
    washingtons = regions[regions[regions['COUNTY']].str.startswith("Washington")] 
    grew = washingtons[washingtons[washingtons['POPESTIMATE2015']]>washingtons[washingtons['POPESTIMATES2014']]] 
    return grew[grew['STNAME'],grew['COUNTY']] 

outcome = answer_eight() 
assert outcome.shape == (5,2) 
assert list (outcome.columns)== ['STNAME','CTYNAME'] 
print(tabulate(outcome, headers=["index"]+list(outcome.columns),tablefmt="orgtbl")) 

Error

--------------------------------------------------------------------------- 
ValueError        Traceback (most recent call last) 
<ipython-input-77-546e58ae1c85> in <module>() 
     6  return grew[grew['STNAME'],grew['COUNTY']] 
     7 
----> 8 outcome = answer_eight() 
     9 assert outcome.shape == (5,2) 
    10 assert list (outcome.columns)== ['STNAME','CTYNAME'] 

<ipython-input-77-546e58ae1c85> in answer_eight() 
     1 def answer_eight(): 
     2  counties=census_df[census_df['SUMLEV']==50] 
----> 3  regions = counties[(counties[counties['REGION']==1]) | (counties[counties['REGION']==2])] 
     4  washingtons = regions[regions[regions['COUNTY']].str.startswith("Washington")] 
     5  grew = washingtons[washingtons[washingtons['POPESTIMATE2015']]>washingtons[washingtons['POPESTIMATES2014']]] 

/opt/conda/lib/python3.5/site-packages/pandas/core/frame.py in __getitem__(self, key) 
    1991    return self._getitem_array(key) 
    1992   elif isinstance(key, DataFrame): 
-> 1993    return self._getitem_frame(key) 
    1994   elif is_mi_columns: 
    1995    return self._getitem_multilevel(key) 

/opt/conda/lib/python3.5/site-packages/pandas/core/frame.py in _getitem_frame(self, key) 
    2066  def _getitem_frame(self, key): 
    2067   if key.values.size and not com.is_bool_dtype(key.values): 
-> 2068    raise ValueError('Must pass DataFrame with boolean values only') 
    2069   return self.where(key) 
    2070 

ValueError: Must pass DataFrame with boolean values only 

I'm at a loss. Where did I go wrong?

Thanks


This is wrong: counties[census_df['REGION']==1]. You're using a different df to build the mask. Either mask with the same df, counties[counties['REGION']==1], or mask the parent df only, census_df[census_df['REGION']==1] – EdChum


Thanks @EdChum, but I tried making that change and the same error appears again!


Edit your question with your updated attempt. Note that I said you shouldn't use a df of a different shape to mask another df – EdChum

Answer


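You're trying to mask your df with a df of a different shape, which is wrong. In addition, the way you pass the conditions is incorrect. Comparing a column (a Series) of a df with a scalar already produces a boolean mask, so you should pass just that condition to the indexer rather than indexing the df with it a second time.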

def answer_eight(): 
    counties=census_df[census_df['SUMLEV']==50] 
    # wrong: counties[counties['REGION']==1] is already a filtered df, not a boolean mask
    regions = counties[(counties[counties['REGION']==1]) | (counties[counties['REGION']==2])] 
    # same mistake: regions[regions['COUNTY']] indexes the df with a column's values
    washingtons = regions[regions[regions['COUNTY']].str.startswith("Washington")] 
    # and the same mistake again on both sides of the comparison
    grew = washingtons[washingtons[washingtons['POPESTIMATE2015']]>washingtons[washingtons['POPESTIMATES2014']]] 
    return grew[grew['STNAME'],grew['COUNTY']] 
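
The difference between a boolean mask and an already-filtered df can be seen on a tiny frame. This is only a sketch: the frame below is made up and is not the census data, and the exact wording of the error message depends on the pandas version.

```python
import pandas as pd

# Made-up data for illustration; not the real census file.
df = pd.DataFrame({"REGION": [1, 2, 3], "POP": [10, 20, 30]})

mask = df["REGION"] == 1   # boolean Series: one True/False per row
inner = df[mask]           # a DataFrame holding the matching rows, not a mask

try:
    df[inner]              # indexing a df with a non-boolean DataFrame
    nested_failed = False
except ValueError as exc:
    nested_failed = True
    print("nested form fails:", exc)

direct = df[mask]          # passing the condition itself works
print(direct)
```

The inner expression has already done the filtering, so wrapping it in another df[...] hands pandas a DataFrame of ordinary values where it expects booleans.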

You want:

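def answer_eight(): 
    # keep county-level rows only
    counties = census_df[census_df['SUMLEV'] == 50] 
    # each comparison is a boolean Series; combine them with | inside one indexer
    regions = counties[(counties['REGION'] == 1) | (counties['REGION'] == 2)] 
    # the county name is in CTYNAME (the task asks for that column)
    washingtons = regions[regions['CTYNAME'].str.startswith("Washington")] 
    # column names as given in the task: POPESTIMATE2015 and POPESTIMATE2014
    grew = washingtons[washingtons['POPESTIMATE2015'] > washingtons['POPESTIMATE2014']] 
    # a list of column names selects those columns
    return grew[['STNAME', 'CTYNAME']] 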
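
To try this approach without the real census file, here is a minimal sketch on a hand-made census_df. The rows and population numbers are invented for illustration, the function takes the frame as a parameter (an assumption, so the example is self-contained), and the conditions are combined into a single mask with & as a variation on the step-by-step version.

```python
import pandas as pd

# Invented rows for illustration only; not taken from the real census data.
census_df = pd.DataFrame({
    "SUMLEV": [40, 50, 50, 50, 50, 50],
    "REGION": [2, 2, 1, 3, 1, 2],
    "STNAME": ["Iowa", "Iowa", "Pennsylvania", "Texas", "Rhode Island", "Ohio"],
    "CTYNAME": ["Iowa", "Washington County", "Washington County",
                "Washington County", "Washington County", "Franklin County"],
    "POPESTIMATE2014": [100, 100, 200, 50, 90, 10],
    "POPESTIMATE2015": [110, 120, 190, 60, 95, 20],
})

def answer_eight(df):
    # Every comparison below is a boolean Series aligned with df's index.
    is_county = df["SUMLEV"] == 50
    in_region = (df["REGION"] == 1) | (df["REGION"] == 2)
    is_washington = df["CTYNAME"].str.startswith("Washington")
    grew = df["POPESTIMATE2015"] > df["POPESTIMATE2014"]
    # Combine the masks with & and index the frame once.
    selected = df[is_county & in_region & is_washington & grew]
    return selected[["STNAME", "CTYNAME"]]

result = answer_eight(census_df)
print(result)
```

Because every mask is built from the same frame, the original index IDs are preserved in the result, which is what the task asks for.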