Strip '\n' and whitespace from a pandas DataFrame index

After getting data from an HTML response and loading it into a pandas DataFrame with the code below, I transpose the data and print the resulting index.

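import bs4
import pandas

# r is the HTTP response fetched earlier (not shown here)
r1 = bs4.BeautifulSoup(r.text, 'lxml').prettify()
r3 = pandas.read_html(r1, header=None, index_col=None)[0]  # first table on the page
r3.dropna(inplace=True)

# Transpose so that the former column labels become the index
r4 = pandas.DataFrame.transpose(r3)

r5 = r4.index

print(r5)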

I get the result below.

Index(['\n      ', 
     '\n      2006-12\n      ', 
     '\n      2007-12\n      ', 
     '\n      2008-12\n      ', 
     '\n      2009-12\n      ', 
     '\n      2010-12\n      ', 
     '\n      2011-12\n      ', 
     '\n      2012-12\n      ', 
     '\n      2013-12\n      ', 
     '\n      2014-12\n      ', 
     '\n      2015-12\n      ', 
     '\n      TTM\n      '], 
     dtype='object')

How can I strip all the '\n' and whitespace from this index so that only the dates and TTM remain?

Source

2017-02-11 jake wong

IIUC, you can do it this way:

In [98]: i
Out[98]:
Index(['\n      ', '\n      2006-12\n      ', '\n      2007-12\n      ',
       '\n      2008-12\n      ', '\n      2009-12\n      ', '\n      2010-12\n      ',
       '\n      2011-12\n      ', '\n      2012-12\n      ', '\n      2013-12\n      ',
       '\n      2014-12\n      ', '\n      2015-12\n      ', '\n      TTM\n      '],
      dtype='object')

In [99]: i = i.str.replace(r'[\n\s]+', '', regex=True)  # regex=True is required on pandas 2.0+, where the default is literal matching

In [100]: i 
Out[100]: Index(['', '2006-12', '2007-12', '2008-12', '2009-12', '2010-12', '2011-12', '2012-12', '2013-12', '2014-12', '2015-12', 'TTM'], dtype='object')
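
One caveat with removing every whitespace run: it also deletes spaces inside a label, whereas `str.strip()` only trims the ends. The sketch below illustrates the difference; the 'Net Income' label is a made-up example and does not come from the original data.

```python
import pandas as pd

# Hypothetical labels for illustration; 'Net Income' is not from the original data.
idx = pd.Index(['\n      2006-12\n      ', '\n      Net Income\n      '])

# Removing every whitespace run also deletes the space inside a label.
removed_all = idx.str.replace(r'\s+', '', regex=True)

# Stripping only trims leading/trailing whitespace and keeps interior spaces.
trimmed = idx.str.strip()

print(list(removed_all))
print(list(trimmed))
```

For the labels in the question (dates and TTM) both approaches give the same result, so the choice only matters when labels can contain spaces.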

A better solution, as suggested by @Joe Lin:

In [103]: i.str.strip() 
Out[103]: Index(['', '2006-12', '2007-12', '2008-12', '2009-12', '2010-12', '2011-12', '2012-12', '2013-12', '2014-12', '2015-12', 'TTM'], dtype='object')
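
Note that `i.str.strip()` returns a new Index rather than modifying the existing one, so the result has to be assigned back to the DataFrame. A minimal sketch, assuming a small stand-in for the transposed frame `r4` (the `value` column and its numbers are invented for illustration):

```python
import pandas as pd

# Stand-in for the transposed frame r4; the values are made up for illustration.
r4 = pd.DataFrame(
    {'value': [0, 1, 2, 3]},
    index=['\n      ', '\n      2006-12\n      ',
           '\n      2015-12\n      ', '\n      TTM\n      '],
)

# Index objects are immutable, so assign the stripped labels back to the frame.
r4.index = r4.index.str.strip()

# Optionally drop the row whose label consisted only of whitespace.
cleaned = r4[r4.index != '']

print(list(cleaned.index))
```

Dropping the empty label is optional and depends on whether that first row carries anything useful in the real table.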

Source

2017-02-11 18:10:18 MaxU

'i.str.strip()' would be simpler. – j0e1in

@JoeLin, good point, thanks! I've added it to the answer. – MaxU

Thanks! This works great –
