2015-07-10 82 views

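How to extract information from a page using regular expressions

I'm having trouble capturing the value of "name": on other pages it often appears before "pluralName". Is there a better way to do this (ideally the best one in terms of performance)? Thanks for your help!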

Note: I'm using Python.

Here is the block of the page that contains the information I need:

{"count":0,"items":[]},"shortUrl":"http:\/\/4sq.com\/11nP13T","likes":{"count":22,"groups":[{"type":"others","count":22,"items":[]}],"summary":"22 Likes"},"ratingColor":"FF9600","id":"5172311be4b0ecc0a12a9953","canonicalPath":"\/v\/kee-hiong-klang-bak-kut-teh\/5172311be4b0ecc0a12a9953","canonicalUrl":"https:\/\/foursquare.com\/v\/kee-hiong-klang-bak-kut-teh\/5172311be4b0ecc0a12a9953","rating":5.3,"categories":[**{"pluralName":"Chinese Restaurants","name":"Chinese Restaurant",**"icon":{"prefix":"https:\/\/ss3.4sqi.net\/img\/categories_v2\/food\/asian_","mapPrefix":"https:\/\/ss3.4sqi.net\/img\/categories_map\/food\/chinese","suffix":".png"},"id":"4bf58dd8d48988d145941735","shortName":"Chinese","primary":true},{"pluralName":"Asian Restaurants","name":"Asian Restaurant","icon":{"prefix":"https:\/\/ss3.4sqi.net\/img\/categories_v2\/food\/asian_","mapPrefix":"https:\/\/ss3.4sqi.net\/img\/categories_map\/food\/asian","suffix":".png"},"id":"4bf58dd8d48988d142941735","shortName":"Asian"}],"createdAt":1366438171,"tips":{"count":25,"groups":[{"count":25,"items":[{"logView":true,"text":"Portion is quite small and expensive. Service attitude is so so. The BKT taste is not my preference.One of the up car restaurants in SS2 which I'll never go back again. 👎","likes":{"count":1,"groups":[{"type":"others","count":1,"items":[{"photo":{"prefix":"https:\/\/irs0.4sqi.net\/img\/user\/","suffix":"\/43964080-5LYADRF2EEP2RWPL.jpg"},"lastName":".w","firstName":"Jackie","id":"43964080","canonicalPath":"\/user\/43964080","canonicalUrl":"https:\/\/foursquare.com\/user\/43964080","gender":"female"}]}],"summary":"1 like"},"id":"541c2b73498eb0cfe1f76b9e","canonicalPath":"\/item\/541c2b73498eb0cfe1f76b9e","canonicalUrl":"https:\/\/foursquare.com\/item\/541c2b73498eb0cfe1f76b9e","createdAt":1.411132275E9,"todo":{"count":0},"user":{"photo":{"prefix":"https:\/\/irs1.4sqi.net\/img\/user\/","suffix":"\/5765949-NW4BAJWFBCVLRR1M.jpg"} 

What exactly do you need to match? –

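Can you provide the expected output? – The6thSense

I want to match "Asian Restaurant" in this example; however, I will run into other pages that have different values for the "name" tag. – user2905427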

Answers

(?:"pluralName":"[^"]*","name":"([^"]*))|(?:"name":"([^"]*)","pluralName") 

Try this with re.findall. See the demo:

https://regex101.com/r/hR7tH4/4

import re
print(re.findall(r'(?:"pluralName":"[^"]*","name":"([^"]*))|(?:"name":"([^"]*)","pluralName")', test_str))
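
Because the pattern has two capture groups, re.findall returns one tuple per match, with the unused group left as an empty string. The following is a minimal sketch of collapsing those tuples into plain names; the helper `category_names` and the two sample strings are made up for illustration and are not part of the original answer.

```python
import re

# Pattern from the answer: group 1 captures when "pluralName" comes first,
# group 2 captures when "name" comes first.
PATTERN = re.compile(
    r'(?:"pluralName":"[^"]*","name":"([^"]*))'
    r'|(?:"name":"([^"]*)","pluralName")'
)

def category_names(text):
    # findall yields (group1, group2) tuples; at most one side is non-empty
    return [first or second for first, second in PATTERN.findall(text)]

# Hypothetical samples covering both key orders
sample_a = '{"pluralName":"Chinese Restaurants","name":"Chinese Restaurant","icon":{}}'
sample_b = '{"name":"Asian Restaurant","pluralName":"Asian Restaurants"}'

print(category_names(sample_a))
print(category_names(sample_b))
```

Compiling the pattern once and reusing it avoids re-parsing the expression on every page, which may help a little when many pages are processed.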

Thank you. There are some extra results in the output, but these results are easier to work with. – user2905427

Don't use a regular expression for this.

Instead, use a JSON parser and access the resulting object. That is more robust.

import json  # part of the Python standard library
o = json.loads(text)  # text: the complete JSON document as a string
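
As a rough illustration of how the parsed object might be used, here is a small sketch. It assumes the page yields a complete, well-formed JSON document with a top-level "categories" array; the `text` value below is a hypothetical excerpt modelled on the block in the question, not the real page.

```python
import json

# Hypothetical, well-formed excerpt modelled on the "categories" array in the question
text = '''{"rating": 5.3, "categories": [
  {"pluralName": "Chinese Restaurants", "name": "Chinese Restaurant", "primary": true},
  {"name": "Asian Restaurant", "pluralName": "Asian Restaurants"}
]}'''

o = json.loads(text)

# Key order inside each object no longer matters once the text is parsed
names = [category["name"] for category in o.get("categories", [])]
print(names)
```

This sidesteps the ordering problem from the question entirely, since "name" is looked up by key rather than by its position relative to "pluralName".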

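Nice answer! How about showing an example of how to use 'o', and explaining why this answer is better? Maybe even in the context of the posted question. – tsroten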

The JSON fragment he shared is FUBAR and cannot be repaired. –
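
When only a truncated fragment is available, one possible workaround is to parse just the "categories" value with json.JSONDecoder.raw_decode, which reads a single JSON value starting at a given index and ignores whatever follows. This is only a sketch under assumptions: the helper `extract_category_names` and the shortened `fragment` are hypothetical, and the approach relies on the array itself being intact even though the surrounding text is not.

```python
import json

def extract_category_names(page_text):
    """Return the "name" of each entry in the first "categories" array found."""
    key = '"categories":'
    start = page_text.find(key)
    if start == -1:
        return []
    pos = start + len(key)
    # raw_decode does not skip leading whitespace, so do that by hand
    while pos < len(page_text) and page_text[pos].isspace():
        pos += 1
    # Parse exactly one JSON value starting at pos; trailing text is ignored
    categories, _end = json.JSONDecoder().raw_decode(page_text, pos)
    return [c["name"] for c in categories if isinstance(c, dict) and "name" in c]

# Hypothetical fragment: invalid as a whole, but the "categories" array is complete
fragment = ('{"count":0,"items":[]},"rating":5.3,"categories":['
            '{"pluralName":"Chinese Restaurants","name":"Chinese Restaurant"},'
            '{"name":"Asian Restaurant","pluralName":"Asian Restaurants"}],'
            '"createdAt":1366438171,"tips":{"count":25')

print(extract_category_names(fragment))
```

If the array itself is cut off, raw_decode raises json.JSONDecodeError, so a caller may want to catch that and fall back to the regular expression.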
