2016-11-29 38 views
-3

Error 'an integer is required' when using str.replace

The code is below:

import sys
import pymysql
import pandas as pd
import numpy as np

conn = pymysql.connect(host='localhost', user='root', password=secret,
                       db='first_day', charset='utf8')
curs = conn.cursor(pymysql.cursors.DictCursor)
sql = "select * from first_day_datas"
curs.execute(sql)
rows = curs.fetchall()
df = pd.DataFrame(rows)
df = df[pd.notnull(df['longitude'])]
df.registerdate = df.registerdate.astype(str)  # change the column dtype
df2 = pd.to_datetime(df['registerdate'])
df2 = df2.dt.strftime('%Y%m')  # turns e.g. 2016-10-10 into 201610
df2_df = df2.to_frame()  # convert the Series to a DataFrame
df2_df.index.names = ['ID_']  # create the id column
df.index.names = ['ID_']
df = df.reset_index()  # fill in the id values
df2_df = df2.reset_index()
df3 = df.merge(df2_df, on='ID_')
df3.registerdate_y = df3.registerdate_y.astype(int)  # change the column dtype
df4 = df3[(df3['registerdate_y'] >= 201402) & (df3['registerdate_y'] < 201406)]  # filter rows by a condition on the column
df5 = df4[df4['address'].str.contains('한남동')]
df6 = df5['blogtext'].astype(str).replace('\n', '')  # replace \n
df7 = df6[(df6['blogtext'] != 'None')]  # filter rows by a condition on the column
df7.to_csv(r'E:\내논문자료\wordcloud\test1\1402_06.csv')
with open(r'E:\내논문자료\wordcloud\test1\1402_06.txt', 'w', encoding='utf-8') as f:
    for row in map(str, df7['blogtext']):
        f.write(row + "\n")

However, when I work with df6 I get an error:

Traceback (most recent call last):
  File "pandas\index.pyx", line 161, in pandas.index.IndexEngine.get_loc (pandas\index.c:4289)
  File "pandas\src\hashtable_class_helper.pxi", line 404, in pandas.hashtable.Int64HashTable.get_item (pandas\hashtable.c:8534)
TypeError: an integer is required

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "E:/빅데이터 캠퍼스/untitled1/handling data.py", line 101, in <module>
    df7 = df6[(df6['blogtext'] != 'None')]  # filter rows by a condition on the column
  File "C:\Python34\lib\site-packages\pandas\core\series.py", line 601, in __getitem__
    result = self.index.get_value(self, key)
  File "C:\Python34\lib\site-packages\pandas\indexes\base.py", line 2169, in get_value
    tz=getattr(series.dtype, 'tz', None))
  File "pandas\index.pyx", line 105, in pandas.index.IndexEngine.get_value (pandas\index.c:3567)
  File "pandas\index.pyx", line 113, in pandas.index.IndexEngine.get_value (pandas\index.c:3250)
  File "pandas\index.pyx", line 163, in pandas.index.IndexEngine.get_loc (pandas\index.c:4373)
KeyError: 'blogtext'

Process finished with exit code 1 

How can I solve this?
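
The KeyError can be reproduced without the database. The sketch below is a guess at what happens in the code above, not a confirmed diagnosis: the sample rows are made up, and it assumes that df5['blogtext'] returns a pandas Series. If so, df6 has no 'blogtext' column, and the string is looked up as a label in the integer index, which would explain the "an integer is required" message in the first traceback.

```python
import pandas as pd

# Hypothetical stand-in for df5; the real rows come from MySQL.
df5 = pd.DataFrame({"blogtext": ["first\nline", "None", "second"]})

# Selecting a single column yields a Series, not a DataFrame.
df6 = df5["blogtext"].astype(str).replace("\n", "")
is_series = isinstance(df6, pd.Series)

# The Series has no 'blogtext' column, so the string is treated as an
# index label; the index holds integers, and the lookup fails.
try:
    df6["blogtext"]
    raised_key_error = False
except KeyError:
    raised_key_error = True

print(is_series, raised_key_error)
```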

+4

Please provide a reproducible example of the error and the **entire error traceback**.

+0

Oh, okay. I made a mistake. – victory

Answers

3

Remove .str from your code. The Python documentation writes str to indicate that a string variable (or literal) should go there, not the literal text "str".
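
A short illustration of the notation this answer describes, using plain Python strings only. The variable names are made up for the example.

```python
text = "first\nline"

# The docs write str.replace(old, new); here `str` stands for a string
# value, so the method is called on the string itself.
cleaned = text.replace("\n", "")

# Equivalent explicit form: the string is passed as the first argument.
same = str.replace(text, "\n", "")

print(cleaned == same)
```

This covers built-in strings only. pandas separately provides a real .str accessor on a Series, so whether .str belongs in a given line depends on whether the object is a plain string or a Series.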

+0

Thanks, but I still have the same problem. I don't know why. T.T – victory

+0

Have you updated the code in your question? One of the lines still contains '.str'. This one: 'df5 = df4[df4['address'].str.contains('한남동')]'

+0

Oh, I solved it. Thank you! – victory
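
The thread does not record what the final fix was. One way the df6 and df7 lines might be written is sketched below. It is an assumption, not the asker's actual solution: the sample rows are hypothetical, and it assumes the goal is to strip newlines inside each text and then drop the rows whose text is the string 'None'.

```python
import pandas as pd

# Hypothetical sample rows standing in for df5.
df5 = pd.DataFrame({"blogtext": ["first\nline", "None", "second"]})

# Series.replace swaps whole cell values, so a newline inside a longer
# string is left untouched.
whole_value = df5["blogtext"].astype(str).replace("\n", "")

# The .str accessor applies the replacement inside each string.
df6 = df5["blogtext"].astype(str).str.replace("\n", "")

# df6 is a Series, so compare it directly instead of indexing by column name.
df7 = df6[df6 != "None"]

print(df7.tolist())
```

Because df7 is a Series here, later code would iterate over df7 itself rather than df7['blogtext'].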