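You can do it with just requests. The POST goes to http://ipindiaservices.gov.in/publicsearch/resources/webservices/search.php with one query parameter (`_dc` in the code below), a timestamp that we create with time.time. Each value in "field[]" should match the corresponding value in "fieldvalue[]", which in turn matches the entry in "operator[]", whether you choose *AND*, *OR* or *NOT*. The [] after each key specifies that we are passing an array of value(s); without it, nothing will work: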

    import requests
    from time import time

    data = {
        "publication_type_published": "on",
        "publication_type_granted": "on",
        "fieldDate": "APD",
        "datefieldfrom": "19120101",
        "datefieldto": "20160906",
        "operatordate": " AND ",
        "field[]": ["PA"],  # claims, description, patent-number codes go here
        "fieldvalue[]": ["chris*"],  # matching values for ^^ go here
        "operator[]": [" AND "],  # matching sql logic for ^^ goes here
        "page": "1",  # gives you next page results
        "start": "0",  # not sure what effect this actually has.
        "limit": "25",  # not sure how this relates as len(r.json()[u'record']) stays 25 regardless
    }

    # _dc is the current timestamp with the decimal point stripped out
    post = "http://ipindiaservices.gov.in/publicsearch/resources/webservices/search.php?_dc={}".format(
        str(time()).replace(".", ""))

    with requests.Session() as s:
        # load the search page first, then send the search as an AJAX-style POST
        s.get("http://ipindiaservices.gov.in/publicsearch/")
        s.headers.update({"X-Requested-With": "XMLHttpRequest"})
        r = s.post(post, data=data)
        print(r.json())

The output will look like the following; I cannot add all of it as there is too much data to post:
{u'success': True, u'record': [{u'Publication_Status': u'Published', u'appDate': u'2016/06/16', u'pubDate': u'2016/08/31', u'title': u'ACTUATOR FOR DEPLOYABLE IMPLANT', u'sourceID': u'inpat', u'abstract': u'\n Systems and methods are provided for usin.............
If you use the record key, you get a list of dicts like:
{u'Publication_Status': u'Published', u'appDate': u'2015/01/27', u'pubDate': u'2015/06/26', u'title': u'CORRUGATED PALLET', u'sourceID': u'inpat', u'abstract': u'\n A corrugated paperboard pallet is produced from two flat blanks which comprise a pallet top and a pallet bottom. The two blanks are each folded to produce only two parallel vertically extending double thickness ribs three horizontal panels two vertical side walls and two horizontal flaps. The ribs of the pallet top and pallet bottom lock each other from opening in the center of the pallet by intersecting perpendicularly with notches in the ribs. The horizontal flaps lock the ribs from opening at the edges of the pallet by intersecting perpendicularly with notches and the vertical sidewalls include vertical flaps that open inward defining fork passages whereby the vertical flaps lock said horizontal flaps from opening.\n ', u'Assignee': u'OLVEY Douglas A., SKETO James L., GUMBERT Sean G., DANKO Joseph J., GABRYS Christopher W., ', u'field_of_invention': u'FI10', u'publication_no': u'26/2015', u'patent_no': u'', u'application_no': u'642/DELNP/2015', u'UCID': u'WVJ4NVVIYzFLcUQvVnJsZGczcVRmSS96Vkh3NWsrS1h3Qk43S2xHczJ2WT0%3D', u'Publication_Type': u'A'}
That is your patent info.
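As a small sketch of working with that list, assuming the response keeps the shape shown above (a `record` key holding a list of dicts), you can pull out just the fields you care about. The helper name `summarize_records` and the choice of fields are my own, not anything the site defines:

```python
def summarize_records(response_json, keys=("application_no", "title", "appDate")):
    """Return one small dict per patent record, keeping only `keys`."""
    records = response_json.get("record", [])
    return [{k: rec.get(k, "") for k in keys} for rec in records]


# Sample trimmed from the output shown above; with a live response, pass r.json().
sample = {
    "success": True,
    "record": [
        {
            "Publication_Status": "Published",
            "appDate": "2015/01/27",
            "title": "CORRUGATED PALLET",
            "application_no": "642/DELNP/2015",
        }
    ],
}

for row in summarize_records(sample):
    print(row["application_no"], row["appDate"], row["title"])
```

Using `.get` with a default means a record that lacks one of the keys yields an empty string rather than raising a `KeyError`.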
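If we select a few values in the browser, you can see that all of the values in fieldvalue, field and operator line up. AND is the default, so that is what you see for each option.
So figure out the codes, pick what you want, and post.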
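One way to keep the three arrays lined up is to build them from a single list of triples. This is only a sketch: `build_search_payload` is a hypothetical helper, the fixed keys are copied from the `data` dict above, and the second search value (`smith*`) is made up for illustration. `PA` is the only field code taken from the original request, so look up the others in the browser.

```python
def build_search_payload(criteria, date_from="19120101", date_to="20160906", page=1):
    """Build the form data for search.php from (field, value, operator) triples."""
    fields, values, operators = [], [], []
    for field, value, operator in criteria:
        if operator not in ("AND", "OR", "NOT"):
            raise ValueError("operator must be AND, OR or NOT, got %r" % (operator,))
        fields.append(field)
        values.append(value)
        # The original request pads the operator with spaces, e.g. " AND ".
        operators.append(" {} ".format(operator))
    return {
        "publication_type_published": "on",
        "publication_type_granted": "on",
        "fieldDate": "APD",
        "datefieldfrom": date_from,
        "datefieldto": date_to,
        "operatordate": " AND ",
        "field[]": fields,
        "fieldvalue[]": values,
        "operator[]": operators,
        "page": str(page),
        "start": "0",
        "limit": "25",
    }


# Two criteria on the same field code; the second value is hypothetical.
payload = build_search_payload([("PA", "chris*", "AND"), ("PA", "smith*", "OR")])
print(payload["field[]"], payload["fieldvalue[]"], payload["operator[]"])
```

The result can be passed as `data=payload` to `s.post`. requests encodes a list value as a repeated form key, so each `[]` key is sent once per entry.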
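Well, you have the HTML that you requested. However, this page seems to act as a web application, with everything handled through JavaScript (in 'app.js'), so your approach will most likely not work. You may want to check whether the site offers an API you can use – UnholySheep
Yes, I did look for that kind of information. It does not seem to exist. I also tried a few online web scrapers. Is there any way I can scrape this website? –
As I said, it is more of a webapp than a website (since it is driven entirely by JavaScript). You might be able to do something with Selenium, but I have never used it. – UnholySheep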