How do I get a substring in a pandas DataFrame when a row contains a certain character?

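I have a DataFrame in which some rows contain the special character '#'. How can I extract a substring from those rows, depending on where the character appears?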

Here is my data, along with the index positions of '#' that I was able to find:

import pandas as pd
df = pd.DataFrame(data=['fig#abc', 'strawberry', 'applepie#efg'], columns=['fruitname'])
# Position of the first '#' in each value (-1 when '#' is not found)
ind = df.fruitname.str.find("#")
print(df)
print(ind)


    fruitname 
0 fig#abc 
1 strawberry 
2 applepie#efg 

0 3 
1 -1 
2 8

I want a new column that holds the characters before '#' if the index of '#' is greater than 4, and the original value otherwise:

fruitname_new 
0 fig#abc 
1 strawberry 
2 applepie

What is the best way to get this result?

Source

2017-05-27 mnnmountain

# Use apply to split fruitname on '#', then check the length of the first part before setting the new fruitname_new column.

df['fruitname_new'] = df.apply(lambda x: x.fruitname if len(x.fruitname.split('#')[0])<=4 else x.fruitname.split('#')[0], axis=1) 

df 
Out[484]: 
     fruitname fruitname_new 
0  fig#abc  fig#abc 
1 strawberry strawberry 
2 applepie#efg  applepie

Source

2017-05-27 06:41:56 Allen
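
For larger frames, a vectorized variant of the same idea may be worth trying. The following is only a sketch, not part of the answer above: it reuses the `str.find` positions from the question, and the names `pos` and `prefix` are made up for illustration. The threshold of 4 is taken from the question.

```python
import pandas as pd

df = pd.DataFrame(data=['fig#abc', 'strawberry', 'applepie#efg'], columns=['fruitname'])

# Position of the first '#' in each value; -1 when '#' is absent.
pos = df['fruitname'].str.find('#')

# Text before the first '#' (the whole string when there is no '#').
prefix = df['fruitname'].str.split('#', n=1).str[0]

# Series.where keeps the original value where the condition is True
# and takes the value from `prefix` where it is False.
df['fruitname_new'] = df['fruitname'].where(pos <= 4, prefix)

print(df)
```

Rows without '#' have a position of -1, so they satisfy `pos <= 4` and keep their original value, which matches the desired output in the question.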
