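Python - Why doesn't the findall regular expression find this particular text?
EDIT: Please don't downvote without saying why you are downvoting. I'm doing my best to write this post!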
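I'm trying to print the URL links for all of the watches on a website. I've got all of them printing except one, even though that one meets exactly the same regular expression conditions as the others. Can someone please explain why it isn't printing? Have I got some syntax wrong somewhere? The code below should be able to be pasted into a Python editor (i.e. IDLE) and run.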
## Import required modules
from urllib import urlopen
from re import findall
import re
## Provide URL
dennisov_url = 'https://denissov.ru/en/'
## Open and read URL as string named 'dennisov_html'
dennisov_html = urlopen(dennisov_url).read()
## Find all of the links opened when each watch is clicked (those with the
## preceding text 'window.open', then any character occurring zero or more
## times, then the text '/en/'). Exclude matches containing the word "history"
## and stop at any " symbol in the URL.
watch_link_urls = findall('window.open.*(/en/[^history][^"]*/)', dennisov_html)
## For every URL, convert it into a string on a new line and add the domain
for link in watch_link_urls:
    link = 'https://denissov.ru' + link
    ## Print out the full URLs
    print link
## This code should show the link https://denissov.ru/en/speedster/ yet
## it isn't showing. It has exactly the same preceding text as the other
## links that are printing and is in the same div container. If you inspect
## the website, then search for 'en/barracuda_mechanical/' and then
## 'en/speedster/', you will see that the speedster link is only a few lines
## below barracuda mechanical and there is nothing different about the
## preceding text of the two, so speedster should be printing
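Oh yes, the '[^history][^"]*' part is messed up: it means any single character except h, i, s, t, o, r or y, followed by any number of characters other than '"'. Since 'speedster' starts with 's', that link is rejected.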
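To make the comment above concrete, here is a minimal sketch comparing the original pattern with one possible fix that uses a negative lookahead, `(?!history)`, in place of the character class. The `sample_html` string is a hypothetical stand-in for the page source (the real markup on the site may differ), and `fixed_pattern` is only one way to express "not followed by the word history", not necessarily what the commenter had in mind.

```python
import re

# Hypothetical stand-in for the page source (an assumption for illustration;
# the real markup on the site may look different).
sample_html = '''
<a onclick='window.open("/en/barracuda_mechanical/")'>Barracuda</a>
<a onclick='window.open("/en/speedster/")'>Speedster</a>
<a onclick='window.open("/en/history/")'>History</a>
'''

# Pattern from the question: [^history] is a character class, so it rejects
# any path whose FIRST letter is one of h, i, s, t, o, r, y.
# "speedster" starts with "s", so it is dropped along with "history".
buggy_pattern = r'window.open.*(/en/[^history][^"]*/)'
buggy_matches = re.findall(buggy_pattern, sample_html)

# Possible fix: (?!history) is a negative lookahead that rejects only paths
# beginning with the whole word "history". The dot in "window.open" is
# escaped and .*? is made lazy so the match stays close to each call.
fixed_pattern = r'window\.open.*?(/en/(?!history)[^"]*/)'
fixed_matches = re.findall(fixed_pattern, sample_html)

print(buggy_matches)  # ['/en/barracuda_mechanical/']
print(fixed_matches)  # ['/en/barracuda_mechanical/', '/en/speedster/']
```

With the lookahead, a watch whose name merely starts with one of the letters in "history" is no longer excluded, while the history page itself still is.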