2016-07-28

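Using Python and Beautiful Soup to insert table data from a website into a table on my own website

I wrote some code to scrape the numbers I need from this website, but I don't know what to do next.

It scrapes the numbers in the bottom table: the values for calving ease, birth weight, weaning weight, yearling weight, milk, and total maternal.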

#!/usr/bin/python 
import urllib2 
from bs4 import BeautifulSoup 
import pyperclip 

def getPageData(url):
    '''Return the list of <td> cells from the EPD table, or -1 on failure.'''
    # Only accept pages from the expected site
    if 'abri.une.edu.au' not in url:
        return -1
    webpage = urllib2.urlopen(url).read()
    soup = BeautifulSoup(webpage, "html.parser")

    # This finds the EPD table on the page
    pedTreeTable = soup.find('table', {'class': 'TablesEBVBox'})
    if pedTreeTable is None:
        # The page has no table with that class
        return -1

    # This puts all of the EPDs into a list:
    # it collects everything in pedTreeTable with a td tag.
    pageData = pedTreeTable.findAll('td')
    pageData.pop(7)  # remove the cell at index 7
    return pageData

def createPedigree(animalPageData):
    ''' make animalPageData much more useful. Strip the text out and put it in a dict.'''
    # Pull the text out of every <td> cell
    animals = [animal.text for animal in animalPageData]

    # Cells 18 to 23 hold the six values of interest
    prettyPedigree = {
        'calving_ease': animals[18],
        'birth_weight': animals[19],
        'wean_weight': animals[20],
        'year_weight': animals[21],
        'milk': animals[22],
        'total_mat': animals[23]
    }

    # Strip all-digit words from every value except year_weight
    for animalKey in prettyPedigree:
        if animalKey != 'year_weight':
            prettyPedigree[animalKey] = stripRegNumber(prettyPedigree[animalKey])
    return prettyPedigree

def stripRegNumber(animal):
    '''returns the animal with its registration number (any all-digit word) stripped'''
    # Keep only the words that are not made up entirely of digits
    return " ".join(word for word in animal.split() if not word.isdigit())

def prettify(pedigree):
    ''' Takes the pedigree and returns it as a string, one value per line '''
    # Order in which the values are written out
    keys = ['calving_ease', 'birth_weight', 'wean_weight',
            'year_weight', 'milk', 'total_mat']

    s = ''
    for key in keys:
        # Right-aligning a value to its own length leaves it unchanged,
        # so each value is simply written on its own line
        s += pedigree[key] + '\n'
    return s

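if __name__ == '__main__':
    while True:
        url = raw_input('Input a url you want to use to make life easier '
                        '(leave blank to quit): \n')
        if not url:
            break
        pageData = getPageData(url)
        if pageData == -1:
            # getPageData signals a rejected url or a missing table with -1
            print 'could not read the table from that url'
            continue
        s = prettify(createPedigree(pageData))
        pyperclip.copy(s)
        if len(s) > 0:
            print 'the easy string has been copied to your clipboard'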

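So far I have only used this code to make copying and pasting easier. All I have to do is enter the URL, and the numbers are saved to my clipboard.

Now I want to use this code on my website; I want to be able to insert a URL into my HTML code and have these numbers displayed in a table on the page.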

My questions are:

  1. How do I use Python code on a website?
  2. How do I insert the collected data into an HTML table?
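
For the second question, the step from scraped values to markup might look like the sketch below. It is written for Python 3 with only the standard library, and it assumes a dict shaped like the one returned by createPedigree. The function name render_epd_table, the display labels, and the sample values are hypothetical and made up for illustration.

```python
from html import escape

# Display labels for the keys used in createPedigree (the labels are assumptions)
LABELS = [
    ("calving_ease", "Calving Ease"),
    ("birth_weight", "Birth Weight"),
    ("wean_weight", "Weaning Weight"),
    ("year_weight", "Yearling Weight"),
    ("milk", "Milk"),
    ("total_mat", "Total Maternal"),
]


def render_epd_table(pedigree):
    """Return an HTML <table> string with one header row and one value row."""
    header_cells = "".join(
        "<th>{}</th>".format(escape(label)) for _, label in LABELS
    )
    # Missing keys become empty cells; values are escaped before insertion
    value_cells = "".join(
        "<td>{}</td>".format(escape(str(pedigree.get(key, "")).strip()))
        for key, _ in LABELS
    )
    return "<table>\n<tr>{}</tr>\n<tr>{}</tr>\n</table>".format(
        header_cells, value_cells
    )


if __name__ == "__main__":
    # Made-up sample values, not real data from the site
    sample = {
        "calving_ease": "+1.0",
        "birth_weight": "+2.0",
        "wean_weight": "+30",
        "year_weight": "+50",
        "milk": "+10",
        "total_mat": "+25",
    }
    print(render_epd_table(sample))
```

The returned string can be written into any page that a Python program generates. Escaping the values matters here because the text comes from a third-party site.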

What framework are you using on your website?

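I'd like to be able to implement this on multiple websites. Also, I don't fully understand the different types of frameworks. I'm very much an amateur, and so is my website. I don't know whether this information helps, but I use HTML, CSS, and JavaScript.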

Answer

It sounds like you want to use something like Django. The learning curve is a bit steep, but it's worth it, and it (of course) supports Python.
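
To give a rough idea of what "running Python on a website" involves before committing to a framework, here is a minimal sketch using only the Python 3 standard library (wsgiref). It is not Django and not a production setup. The names fetch_epds, app, and serve are hypothetical, and fetch_epds returns made-up values as a stand-in for the scraper in the question, which would need porting to Python 3 first.

```python
from html import escape
from urllib.parse import parse_qs
from wsgiref.simple_server import make_server


def fetch_epds(url):
    """Hypothetical stand-in for the scraper; returns made-up values."""
    return {"calving_ease": "+1.0", "birth_weight": "+2.0"}


def app(environ, start_response):
    """WSGI app: reads ?url=... and answers with an HTML table."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    url = query.get("url", [""])[0]

    # Same site check as getPageData in the question
    if "abri.une.edu.au" not in url:
        start_response("400 Bad Request",
                       [("Content-Type", "text/plain; charset=utf-8")])
        return [b"Pass ?url=<page on abri.une.edu.au>"]

    # One table row per value, with the text escaped for HTML
    rows = "".join(
        "<tr><th>{}</th><td>{}</td></tr>".format(escape(key), escape(value))
        for key, value in fetch_epds(url).items()
    )
    body = "<table>{}</table>".format(rows).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return [body]


def serve(port=8000):
    """Run a local development server until interrupted."""
    make_server("localhost", port, app).serve_forever()
```

Calling serve() and opening http://localhost:8000/?url= followed by a page address would show the table in a browser. A framework such as Django takes over the roles of app and serve here, and adds templates, URL routing, and deployment support on top.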