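I'm trying to pull license information for pip packages from PyPI and load it into a pandas DataFrame. I have done a similar example before, loading a list comprehension into pandas, but I can't figure out how to load the data into pandas in this case.
This is what I have written so far: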
from requests import get
import pandas as pd
import pip
url = 'https://pypi.python.org/pypi'
# packages_list = ['numpy','twisted']
installed_packages = pip.get_installed_distributions()
installed_packages_list = sorted(["%s==%s" % (i.key, i.version)
                                  for i in installed_packages])
packages = []
licenses = []
summarys = []
for index, package in enumerate(installed_packages_list):
    package = package.split("==")[0]
    full_url = url+'/'+ package +'/json'
    #print 'url is ' + full_url
    page = get(url+'/'+package+'/json').json()
    #print 'Package: ' + package + ', license is:' + page['info']['license'] + '. ' + page['info']['summary']
    packages.append(package)
    licenses.append(page['info']['license'])
    summarys.append(page['info']['summary'])
print packages
pd_packages = pd.DataFrame(
    {
        "packages": [packages],
        "licenses": [licenses],
        "summarys": [summarys]
    })
print pd_packages
What exactly is the problem here? –
It shows something like 0 [MIT, , MPL-2.0, LGPL, UNKNOWN, BSD-like, BSD, ... packages \ 0 [beautifulsoup4, bs4, certifi, chardet, get, i ... summarys 0 [Screen-scraping library, Dummy package for ... – vkk07
I want to get this data into a table-like form and dump it to CSV using pandas – vkk07
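A minimal sketch of one way to get the table-like form, assuming the three lists are filled as in the loop in the question. The output shown in the comment has a single row because each list is wrapped in another list (`[packages]`), so pandas puts the whole list into one cell. Passing the lists directly should give one row per package. The sample values and the helper name `build_license_table` are placeholders for illustration, not real query results or part of the original code.

```python
import pandas as pd


def build_license_table(packages, licenses, summarys):
    """Build a DataFrame with one row per package from three parallel lists."""
    # Pass the lists themselves, not [packages]: wrapping a list in another
    # list makes pandas store the entire list in a single cell.
    return pd.DataFrame({
        "packages": packages,
        "licenses": licenses,
        "summarys": summarys,
    })


# Placeholder data standing in for the lists built in the loop (hypothetical values).
packages = ["beautifulsoup4", "bs4", "certifi"]
licenses = ["MIT", "", "MPL-2.0"]
summarys = ["example summary 1", "example summary 2", "example summary 3"]

pd_packages = build_license_table(packages, licenses, summarys)
print(pd_packages)

# With no path argument, to_csv returns the CSV text as a string.
csv_text = pd_packages.to_csv(index=False)
print(csv_text)
```

To write a file instead of a string, pass a path, for example `pd_packages.to_csv("pip_licenses.csv", index=False)`; the file name here is only an example.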