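I tried crawling images from Google Image Search with Python 3.6.
1. Open the Chrome driver with Selenium
2. Scroll down to the end of the page
3. Use BeautifulSoup to get the image URLs and save the images
The problem is that the images I get this way are too small.
Looking further, I found that the URL of the original image (ending in ".jpg") is in the src attribute of the img element with class irc_mi, but I don't know how to pull it out.
I tried using find_all with the class name, but it failed.
What should I do?
Here is the source code: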
from time import sleep
import urllib.request

from bs4 import BeautifulSoup
from selenium import webdriver


def Remainder_All_ImagesURLs_Google(searchText):
    def scroll_page():
        # Scroll to the bottom repeatedly so more thumbnails get loaded.
        for i in range(7):
            driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
            sleep(3)

    def click_button():
        # Click the element with id "smb" to load more results.
        more_imgs_button_xpath = "//*[@id='smb']"
        element = driver.find_element_by_xpath(more_imgs_button_xpath)
        element.click()
        sleep(3)

    def create_soup():
        html_source = driver.page_source
        soup = BeautifulSoup(html_source, 'html.parser')
        return soup

    def find_imgs():
        soup = create_soup()
        imgs_urls = []
        for img in soup.find_all('img'):
            try:
                if img['src'].startswith('http'):
                    imgs_urls.append(img['src'])
            except KeyError:
                # Skip <img> tags that have no src attribute.
                pass
        return imgs_urls

    driver = webdriver.Chrome('C:/chromedriver.exe')
    driver.maximize_window()
    sleep(2)

    searchUrl = "https://www.google.com/search?q={}&site=webhp&tbm=isch".format(searchText)
    driver.get(searchUrl)

    try:
        scroll_page()
        click_button()
        scroll_page()
    except Exception:
        # If anything above fails, try the button and one more scroll pass.
        click_button()
        scroll_page()

    imgs_urls = find_imgs()
    driver.close()
    return imgs_urls


def download_image(url, filename):
    full_name = str(filename) + ".jpg"
    # As written, this saves to C:/Python/Picture<filename>.jpg.
    urllib.request.urlretrieve(url, 'C:/Python/Picture' + full_name)
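
On the irc_mi part of the question, here is a minimal sketch of filtering img tags by class and collecting their src values. To keep it runnable without a browser or extra installs, it uses the standard library's html.parser instead of BeautifulSoup, and it works on a made-up HTML snippet. ClassSrcCollector, collect_src_by_class, sample_html and the example.com URLs are hypothetical names for illustration only. It assumes the HTML passed in actually contains img tags with that class; whether driver.page_source does at the point where the soup is created is not something this sketch checks.

```python
from html.parser import HTMLParser


class ClassSrcCollector(HTMLParser):
    """Collect the src of every <img> whose class list contains a given name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        # An attribute without a value is reported as None, so fall back to "".
        classes = (attr_map.get("class") or "").split()
        src = attr_map.get("src")
        if self.class_name in classes and src:
            self.urls.append(src)


def collect_src_by_class(html_source, class_name="irc_mi"):
    # Hypothetical helper: parse the HTML and return the matching src values.
    parser = ClassSrcCollector(class_name)
    parser.feed(html_source)
    parser.close()
    return parser.urls


# Made-up sample markup: a thumbnail, a full-size image, and an img with no src.
sample_html = """
<div>
  <img class="rg_ic rg_i" src="https://example.com/thumb_small.jpg">
  <img class="irc_mi" src="https://example.com/original_large.jpg">
  <img class="irc_mi">
</div>
"""

original_urls = collect_src_by_class(sample_html)
print(original_urls)
```

With BeautifulSoup, the same filter would be written as soup.find_all('img', class_='irc_mi'), reading img.get('src') from each result so that tags without a src are skipped instead of raising.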