Python .replace runs twice

I'm still fairly new to Python and this is my first time using .replace. I've run into a strange problem.

url_base = 'http://sfbay.craigslist.org/search/eby/apa' 
params = dict(bedrooms=1, is_furnished=1) 
rsp = requests.get(url_base, params=params) 
# BS4 can quickly parse our text, make sure to tell it that you're giving   html 
html = bs4(rsp.text, 'html.parser') 

# BS makes it easy to look through a document 
#print(html.prettify()[:1000]) 

# BS4 can quickly parse our text, make sure to tell it that you're giving html 
html = bs4(rsp.text, 'html.parser') 

# BS makes it easy to look through a document 
print(html.prettify()[:1000]) 
# find_all will pull entries that fit your search criteria. 
# Note that we have to use brackets to define the `attrs` dictionary 
# Because "class" is a special word in python, so we need to give a string. 
apts = html.find_all('p', attrs={'class': 'row'}) 
print(len(apts)) 

# We can see that there's a consistent structure to a listing. 
# There is a 'time', a 'name', a 'housing' field with size/n_brs, etc. 
this_appt = apts[15] 
print(this_appt.prettify()) 

# So now we'll pull out a couple of things we might be interested in: 
# It looks like "housing" contains size information. We'll pull that. 
# Note that `findAll` returns a list, since there's only one entry in 
# this HTML, we'll just pull the first item. 
size = this_appt.findAll(attrs={'class': 'housing'})[0].text 
print(size) , 'this is the size' 

def find_size_and_brs(size): 
    split = size.strip('/- ').split(' - ') 
    print len(split) 
    if 'br' in split[0] and 'ft2' in split[0]: 
     print 'We made it into 1' 
     n_brs = split[0].replace('br -', '',) 
     this_size = split[0].replace('ft2 -', '') 
    elif 'br' in split[0]: 
     print 'we are in 2' 
     # It's the n_bedrooms 
     n_brs = split[0].replace('br', '') 
     this_size = np.nan 
    elif 'ft2' in split[0]: 
     print 'we are in 3' 
     # It's the size 
     this_size = split[0].replace('ft2', '') 
     n_brs = np.nan 
     print n_brs 
     print this_size 
    return float(this_size), float(n_brs) 
this_size, n_brs = find_size_and_brs(size)

This outputs:

We made it into 1 

      1 
      800ft2 - 


      1br - 
      800

I can't figure out why it prints the data twice, with only one of the two replacements applied in each copy.

Any ideas? Thanks.

Source

2016-10-05 Peter Hartnett

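What do you mean by "replacing the data a single time"? What exact output do you expect? – BrenBarn

It doesn't work for me. I get 'ValueError: invalid literal for float(): 1br - 800'. Are you sure you got this result with this code? Maybe you ran different code? – furas

@BrenBarn I'm expecting an output of 1 800. Basically the data without the br or ft2. Does that make sense?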

Now it works for me. I made some changes using strip and split, and marked them with the comment # <- here

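# Imports inferred from the names used below; the `bs4` alias is an assumption.
import requests
import numpy as np
from bs4 import BeautifulSoup as bs4

url_base = 'http://sfbay.craigslist.org/search/eby/apa' 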
params = dict(bedrooms=1, is_furnished=1) 
rsp = requests.get(url_base, params=params) 
# BS4 can quickly parse our text, make sure to tell it that you're giving   html 
html = bs4(rsp.text, 'html.parser') 

# BS makes it easy to look through a document 
#print(html.prettify()[:1000]) 

# BS4 can quickly parse our text, make sure to tell it that you're giving html 
html = bs4(rsp.text, 'html.parser') 

# BS makes it easy to look through a document 
#print(html.prettify()[:1000]) 
# find_all will pull entries that fit your search criteria. 
# Note that we have to use brackets to define the `attrs` dictionary 
# Because "class" is a special word in python, so we need to give a string. 
apts = html.find_all('p', attrs={'class': 'row'}) 
#print(len(apts)) 

# We can see that there's a consistent structure to a listing. 
# There is a 'time', a 'name', a 'housing' field with size/n_brs, etc. 
this_appt = apts[15] 
#print(this_appt.prettify()) 

# So now we'll pull out a couple of things we might be interested in: 
# It looks like "housing" contains size information. We'll pull that. 
# Note that `findAll` returns a list, since there's only one entry in 
# this HTML, we'll just pull the first item. 
size = this_appt.findAll(attrs={'class': 'housing'})[0].text 
#print(size) , 'this is the size' 

def find_size_and_brs(size):
    split = size.strip().split(' - ')  # <- here strip()
    #print len(split)
    if 'br' in split[0] and 'ft2' in split[0]:
        print 'We made it into 1'
        two = split[0].split('\n')  # <- here split()
        n_brs = two[0].replace('br -', '').strip()  # <- here two[0] and strip()
        this_size = two[1].replace('ft2 -', '').strip()  # <- here two[1] and strip()
        #print '>', n_brs, '<'
        #print '>', this_size, '<'
    elif 'br' in split[0]:
        print 'we are in 2'
        # It's the n_bedrooms
        n_brs = split[0].replace('br', '')
        this_size = np.nan
    elif 'ft2' in split[0]:
        print 'we are in 3'
        # It's the size
        this_size = split[0].replace('ft2', '')
        n_brs = np.nan
        print n_brs
        print this_size
    return float(this_size), float(n_brs)

this_size, n_brs = find_size_and_brs(size)
print '>', this_size, '<'
print '>', n_brs, '<'
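
For anyone wondering why the original looked like .replace ran twice, here is a minimal sketch. It assumes the housing text holds both values on separate lines; the sample string is hypothetical, guessed from the output in the question, and the exact whitespace may differ. It is written in Python 3 syntax, unlike the code above. Each .replace call works on the same full string and removes only its own substring, so each result still contains the other value.

```python
# Hypothetical sample: the question's output suggests the 'housing' text
# spans two lines, roughly like this (exact whitespace is an assumption).
size = "\n      1br -\n      800ft2 -\n"

# str.replace returns a new string with only the given substring removed;
# everything else in the string is left in place.
n_brs_raw = size.replace("br -", "")       # still contains "800ft2 -"
this_size_raw = size.replace("ft2 -", "")  # still contains "1br -"

# Splitting into lines first isolates each value before cleaning it up.
lines = size.strip().split("\n")
n_brs = float(lines[0].replace("br -", "").strip())
this_size = float(lines[1].replace("ft2 -", "").strip())

print(n_brs, this_size)
```

So nothing runs twice: the two printed blocks are two different strings, each with one replacement applied.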

PS. I put > and < around the values in print to make whitespace visible.
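
A related trick, as a small sketch: repr() is another way to make hidden whitespace visible, since it shows newlines and similar characters as escape sequences. The value below is a hypothetical example, and the snippet uses Python 3 syntax.

```python
value = "  800\n"  # hypothetical string with hidden leading/trailing whitespace

# Wrapping the value in markers, as in the print calls above.
marked = ">" + value + "<"

# repr() shows whitespace characters explicitly as escape sequences.
shown = repr(value)

print(marked)
print(shown)
```

The markers show where the string starts and ends, while repr() also distinguishes a newline from a plain space.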

Source

2016-10-05 03:08:57 furas

Awesome! Also, I love the tip! It makes the printed output look really nice!
