2017-06-16 84 views

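I want to collect all available Google reviews for a set of businesses. For example, more than 10 urologists in Georgia are listed with Google reviews, but when I run this code, the CSV file it produces contains information for only 4-5 of them. I need the information for every business that is listed with at least one rating or review. What should I change in this code? Thanks. Why doesn't this code print the reviews for all available businesses?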

import requests
import csv
import pprint

#sending get request.
main_api = "https://maps.googleapis.com/maps/api/place/textsearch/json?"
parameters = {"query":"Urologists, Georgia",
              "key":" "} #enter api key here.
resp = requests.get(main_api, parameters).json()


#it selects the places with at least one rating, and puts their place id in place_id.
place_id = []
for i in range(len(resp['results'])-1):
    if 'rating' in resp['results'][i]:
        place_id.append(resp['results'][i]['place_id'])

#creating a csv file and with headings.
with open("Urologists_FunGeorgia_Google.csv", "w") as toWrite:
    writer=csv.writer(toWrite)
    writer.writerow(['Date Collected', 'Health Care Provider', 'HCP location', 'Website Review is From', 'Specialty', 'Reviewer Name',
                     'Date of Review', 'Reviewer Demographics(gender/race)', 'Star Rating', 'How Many Stars', 'Other Meta-Data', 'Review', 'URL'])
    #getting responses using place ids collected in place_id.
    for ids in place_id:
        details_api = "https://maps.googleapis.com/maps/api/place/details/json?"
        parameters = {"placeid": ids,
                      "key":" "} #api key here.
        detail_resp = requests.get(details_api, parameters)
        resp1 = detail_resp.json()
        reviewss = resp1['result']['reviews']
        doc_name=resp1['result']['name']
        doc_url = resp1['result']['url']
        city_state = resp1['result']['formatted_address']
        website = 'GOOGLE'
        specialty = 'Urologists'
        date_collected = 'June 15 2017'
        total_poss = '5'
        #gets multiple reviews of the physician(if any).
        for i in range(len(reviewss)-1):
            rating = resp1['result']['reviews'][i]['rating']
            revname = resp1['result']['reviews'][i]['author_name']
            rev = resp1['result']['reviews'][i]['text']
            date_review = resp1['result']['reviews'][i]['relative_time_description']
            rev_url = resp1['result']['reviews'][i]['author_url']

            writer.writerow([date_collected, doc_name, city_state, website, specialty, revname, date_review, rev_url, rating, total_poss, '', rev, doc_url])
Why are you subtracting 1 from the length of the list you're looping over? – Barmar

The pythonic way to loop over a list is 'for item in list:', not 'for i in range(len(list)):'. – Barmar

@Barmar, so that it doesn't go out of range. – kandal

Answer

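Because you subtract 1 from len(reviewss) in the inner loop, you skip the last review of every place. If a urologist has only one review, you skip that urologist entirely, so only urologists with more than one review end up in your CSV file.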
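The effect of the -1 can be reproduced without calling the API. The sketch below uses made-up review lists (the names one_review, three_reviews and both helper functions are illustrative, not part of the original script) in place of result['reviews']:

```python
# Made-up review lists standing in for result['reviews'].
one_review = [{"author_name": "A"}]
three_reviews = [{"author_name": "A"}, {"author_name": "B"}, {"author_name": "C"}]


def visited_with_minus_one(reviews):
    # Mirrors the original loop: range(len(reviews) - 1) stops one index early.
    return [reviews[i]["author_name"] for i in range(len(reviews) - 1)]


def visited_directly(reviews):
    # Iterating over the list itself visits every element and cannot go out of range.
    return [review["author_name"] for review in reviews]


print(visited_with_minus_one(one_review))     # []
print(visited_with_minus_one(three_reviews))  # ['A', 'B']
print(visited_directly(three_reviews))        # ['A', 'B', 'C']
```

range(len(reviews)) already ends at the last valid index, so subtracting 1 is not needed to stay in range.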

Get rid of the -1 in both for loops. Better still, switch to the pythonic for item in list: syntax, and use a list comprehension when building place_id.
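As a rough illustration of the difference, the sketch below runs both versions of the place_id selection on a made-up stand-in for the Text Search response (sample_resp and its ids are invented, and only the keys the script reads are included):

```python
# Made-up stand-in for the Text Search response; real responses carry more fields.
sample_resp = {
    "results": [
        {"place_id": "id-1", "rating": 4.5},
        {"place_id": "id-2"},                 # no rating, should be filtered out
        {"place_id": "id-3", "rating": 3.0},  # last result
    ]
}

# Original loop: the -1 drops the last result before it is even checked.
ids_with_minus_one = []
for i in range(len(sample_resp["results"]) - 1):
    if "rating" in sample_resp["results"][i]:
        ids_with_minus_one.append(sample_resp["results"][i]["place_id"])

# List comprehension: every result is checked.
place_id = [result["place_id"] for result in sample_resp["results"] if "rating" in result]

print(ids_with_minus_one)  # ['id-1']
print(place_id)            # ['id-1', 'id-3']
```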

Because some reviews have no author_url attribute, you need to supply a default value. You can do this with review.get('author_url', '').
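The default-value lookup can be tried on made-up data first. In this sketch the reviews list is invented; only the author_url key name comes from the script:

```python
# Made-up reviews; the second one has no 'author_url' key.
reviews = [
    {"author_name": "A", "author_url": "https://example.com/a", "rating": 5},
    {"author_name": "B", "rating": 4},
]

# dict.get returns the default ('' here) instead of raising when the key is absent.
rev_urls = [review.get("author_url", "") for review in reviews]
print(rev_urls)  # ['https://example.com/a', '']

# Plain indexing on the same review raises KeyError.
try:
    reviews[1]["author_url"]
    raised = False
except KeyError:
    raised = True
print(raised)  # True
```

The same pattern works for any other field that turns out to be optional in the responses you receive.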

import requests
import csv
import pprint

#sending get request.
main_api = "https://maps.googleapis.com/maps/api/place/textsearch/json?"
parameters = {"query":"Urologists, Georgia",
              "key":" "} #enter api key here.
resp = requests.get(main_api, parameters).json()

#it selects the places with at least one rating, and puts their place id in place_id.
place_id = [result['place_id'] for result in resp['results'] if 'rating' in result]

#creating a csv file and with headings.
with open("Urologists_FunGeorgia_Google.csv", "w") as toWrite:
    writer=csv.writer(toWrite)
    writer.writerow(['Date Collected', 'Health Care Provider', 'HCP location', 'Website Review is From', 'Specialty', 'Reviewer Name',
                     'Date of Review', 'Reviewer Demographics(gender/race)', 'Star Rating', 'How Many Stars', 'Other Meta-Data', 'Review', 'URL'])
    #getting responses using place ids collected in place_id.
    for ids in place_id:
        details_api = "https://maps.googleapis.com/maps/api/place/details/json?"
        parameters = {"placeid": ids,
                      "key":" "} #api key here.
        detail_resp = requests.get(details_api, parameters)
        result = detail_resp.json()['result']
        #defensive guard (an assumption): fall back to an empty list if 'reviews' is absent.
        reviewss = result.get('reviews', [])
        doc_name=result['name']
        doc_url = result['url']
        city_state = result['formatted_address']
        website = 'GOOGLE'
        specialty = 'Urologists'
        date_collected = 'June 15 2017'
        total_poss = '5'
        #gets multiple reviews of the physician(if any).
        for review in reviewss:
            rating = review['rating']
            revname = review['author_name']
            rev = review['text']
            date_review = review['relative_time_description']
            #some reviews have no author_url, so default to an empty string.
            rev_url = review.get('author_url', '')

            writer.writerow([date_collected, doc_name, city_state, website, specialty, revname, date_review, rev_url, rating, total_poss, '', rev, doc_url])
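I made the changes you suggested. Now it prints only a few results (fewer than before) and raises this error: 'KeyError: 'author_url''. – kandal

That means some reviews don't have an 'author_url' field. You need to check for it and provide a default value. – Barmar

I've updated the answer to show how to fix this. – Barmar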