I am using the read_html function in pandas to extract data from some HTML tables, but read_html does not seem to store the complete data.
For example:
0 RECKITT BENCKISER INDIA PRIVATE LIMITED Vs.ST...
1 SMT. SONY AND ANOTHER Vs. STATE OF UTTARAKHA...
2 BHATIA BHAWAN DHARAMSHALA Vs. STATE OF UTTAR...
3 MOHD. YASEEN AND OTHERS Vs. STATE OF UTTARAK...
4 DR. ADITYA PRAKASH SINGH Vs. STATE OF UTTARA...
5 DR. MANOJ KUMAR UNIYAL Vs. STATE OF UTTARAKH...
6 DR. LALIT MOHAN PANDEY Vs. STATE OF UTTARAKH...
7 SUBHAM SAINI AND ANOTHER Vs. STATE OF UTTARA...
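In each of these rows the table should contain "STATE OF UTTARAKHAND" (plus more data) from the HTML source, but for some reason the output is cut off after a certain length. The source for the first row looks like this: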
<span class="style2">RECKITT BENCKISER INDIA PRIVATE LIMITED
</span><br><span class="style4"> Vs.</span><br><span
class="style2">STATE OF UTTARAKHAND AND ANOTHER
</span></td><td width="20%"
How can I solve this?
This is all I am doing:
df = pd.read_html(test,flavor='html5lib',header=0)
print (df)
Please include the URL you are getting the table from. – MYGz
The URL is part of a response; here is the HTML: http://pastebin.com/raw/p7vfb2JG – Shrey
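One possible explanation, offered as a guess rather than a confirmed diagnosis: the "..." in the printed output may only be pandas shortening long cells for display, while the DataFrame itself still holds the full text. The sketch below illustrates that behaviour on a hand-built, hypothetical one-cell DataFrame (the `case` column name and the sample string are assumptions, not taken from the real page). It uses the `display.max_colwidth` option, which accepts `None` in pandas 1.0 and later.

```python
import pandas as pd

# Hypothetical stand-in for one parsed table cell (not the real page data).
full_text = ("RECKITT BENCKISER INDIA PRIVATE LIMITED Vs. "
             "STATE OF UTTARAKHAND AND ANOTHER")
df = pd.DataFrame({"case": [full_text]})  # "case" is an assumed column name

# The cell value itself is stored in full.
stored = df.loc[0, "case"]

# With a 50-character column limit (the pandas default), the printed
# form of the frame shortens the cell and ends it with "...".
with pd.option_context("display.max_colwidth", 50):
    default_repr = repr(df)

# Lifting the limit changes only how the frame is printed.
with pd.option_context("display.max_colwidth", None):
    full_repr = repr(df)

print(default_repr)
print(full_repr)
```

If this is what is happening with the real table, calling `pd.set_option('display.max_colwidth', None)` before printing, or inspecting a single cell with `.loc` or `.iloc`, should show the complete text. Note also that `pd.read_html` returns a list of DataFrames, so `df` in the question is a list and the table of interest is most likely `df[0]`.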