Python string strip

I am trying to use strip() to remove the end of some HTML. The idea is to eventually build this into a loop, but for now I just want to figure out how to make this work:

httpKey=("<a href=\"http://www.uaf.edu/academics/degreeprograms/index.html>Degree Programs</a>") 
httpKeyEnd=">" 

#the number in the httpKey that the httpKey end is at 
stripNumber=(httpKey.find(httpKeyEnd)) 
#This is where I am trying to strip the rest of the information that I do not need. 
httpKey.strip(httpKey.find(httpKeyEnd)) 
print (httpKey)

The end result should be that printing httpKey shows only this part:

a href="http://www.uaf.edu/academics/degreeprograms/index.html

Source

2014-11-02 akagent49

What is your end goal? Are you trying to extract the href? If so, your life will be much easier with an HTML parsing library such as [Beautiful Soup](http://www.crummy.com/software/BeautifulSoup/). – senshin 2014-11-02 06:57:48

Yes, that is the goal. However, this is a homework assignment, and I am only allowed to use string operations. – akagent49 2014-11-02 06:59:34

OK, then the next question: is the missing closing quote after the href intentional, or is it a typo? – senshin 2014-11-02 07:01:29

For your case, this will work:

>>> httpKey=("<a href=\"http://www.uaf.edu/academics/degreeprograms/index.html>Degree Programs</a>") 
>>> httpKey[1:httpKey.index('>')] 
'a href="http://www.uaf.edu/academics/degreeprograms/index.html'

Source

2014-11-02 07:26:26

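find returns the index (a number) at which the substring is found, and strip removes characters from the ends of the string; it does not remove "everything from this point on".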
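To make the difference concrete, here is a minimal sketch using the string from the question. The variable names (`http_key`, `end_index`, `strip_error`, `trimmed`) are illustrative and not part of the original code, and the behaviour shown assumes Python 3.

```python
http_key = '<a href="http://www.uaf.edu/academics/degreeprograms/index.html>Degree Programs</a>'

# find() returns an integer index, or -1 if the substring is absent.
end_index = http_key.find(">")

# strip() expects a string of characters to trim from both ends, not an index,
# so passing the integer from find() raises TypeError.
try:
    http_key.strip(end_index)
    strip_error = None
except TypeError as exc:
    strip_error = exc

# strip() also returns a new string; the original is never modified in place,
# so the result has to be assigned to a name to be used.
trimmed = "<<abc>>".strip("<>")

print(end_index)
print(type(strip_error).__name__)
print(trimmed)
```

Even with a valid argument, calling strip() without using its return value leaves httpKey unchanged, which is why the print in the question still shows the full string.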

You want to use string slicing instead:

>>> s = 'hello there: world!' 
>>> s.index(':') 
11 
>>> s[s.index(':')+1:] 
' world!'
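Applied to the homework constraint (string operations only), the same slicing idea could look roughly like the sketch below. `extract_href` is a hypothetical helper, not something from the question, and it assumes the attribute is always written as `href="`; it stops at the closing quote if there is one and otherwise at the tag's `>`, since the sample string has no closing quote.

```python
def extract_href(tag):
    """Return the href value of a single <a ...> tag, using only str methods.

    Hypothetical helper: assumes the attribute is written exactly as href="
    and returns None when it is not found.
    """
    prefix = 'href="'
    start = tag.find(prefix)
    if start == -1:
        return None
    start += len(prefix)
    # End at the closing quote if present, otherwise at the '>' that ends the tag.
    candidates = [i for i in (tag.find('"', start), tag.find(">", start)) if i != -1]
    end = min(candidates) if candidates else len(tag)
    return tag[start:end]


http_key = '<a href="http://www.uaf.edu/academics/degreeprograms/index.html>Degree Programs</a>'
print(extract_href(http_key))
```

Because the function takes one tag at a time, it can later be called from a loop over several tags without changes.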

If you just want to find out what the link is, use a library like BeautifulSoup:

>>> from bs4 import BeautifulSoup as bs 
>>> doc = bs('<a href="http://www.uaf.edu/academics/degreeprograms/index.html">Degree Programs</a>', 'html.parser') 
>>> for link in doc.find_all('a'): 
...  print(link.get('href')) 
... 
http://www.uaf.edu/academics/degreeprograms/index.html
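If installing a third-party package is not an option, the standard library's html.parser module can do a similar job. This is only a sketch of that alternative, not what the answer above uses; the class name `LinkCollector` and its `links` attribute are made up for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag's attributes.
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)


collector = LinkCollector()
collector.feed('<a href="http://www.uaf.edu/academics/degreeprograms/index.html">Degree Programs</a>')
print(collector.links)
```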

Source

2014-11-02 07:02:36

So that would slice off everything after it; I need everything before character number 63. – akagent49 2014-11-02 07:23:45
