How do I extract a style attribute with Scrapy?

Given the HTML element below, how can I extract the style attribute with Scrapy?

<div style="width: 80.42%;" class="classA"></div>

With this code I can extract the whole style attribute:

response.xpath("//div[@class='classA']").xpath("@style").extract()

But I only want the width value from the style attribute, i.e. 80.42%. How can I do that?


2017-03-08 Julian Zhang

You can use .re(), like this:

response.xpath("//div[@class='classA']").xpath("@style").re(r'width: (\d+\.\d+%)')

This works: .re() returns a list of the captured strings.
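The pattern above only matches values that contain a decimal point and have exactly one space after the colon, so it would miss something like width:80%. Below is a minimal sketch of a more tolerant pattern, tried out with Python's standard re module on the raw attribute string rather than through Scrapy. The names WIDTH_RE and extract_width are made up for illustration.

```python
import re

# Hypothetical, more tolerant pattern: optional whitespace after the colon
# and an optional decimal part. The (?<![\w-]) lookbehind keeps it from
# matching inside "max-width" or "min-width".
WIDTH_RE = re.compile(r'(?<![\w-])width:\s*(\d+(?:\.\d+)?%)')


def extract_width(style):
    """Return the percentage width from an inline style string, or None."""
    match = WIDTH_RE.search(style)
    return match.group(1) if match else None


print(extract_width("width: 80.42%;"))
print(extract_width("max-width: 50%; width:80%"))
print(extract_width("color: red"))
```

The same pattern can be passed to .re() on the selector in place of the original one.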


2017-03-08 01:54:57 jhao104

You can use cssutils. First install it:

$ pip install cssutils

Then use it in your code:

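import cssutils
...

# extract_first() returns a single string; extract() would return a list,
# and parseStyle() expects a string
css_style = response.xpath("//div[@class='classA']/@style").extract_first()
parsed_css = cssutils.parseStyle(css_style)
print(parsed_css.width)  # 80.42%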
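If adding a dependency is not an option, a simple inline style can also be split into a dictionary by hand. This is a rough sketch and not a real CSS parser: the helper name parse_inline_style is hypothetical, and it assumes plain declarations with no semicolons inside the values.

```python
def parse_inline_style(style):
    """Split an inline style string into a {property: value} dict.

    Assumes declarations are separated by ';' and that no value
    itself contains a ';'.
    """
    declarations = {}
    for part in style.split(";"):
        name, sep, value = part.partition(":")
        if sep:  # skip empty or malformed chunks that have no colon
            declarations[name.strip().lower()] = value.strip()
    return declarations


styles = parse_inline_style("width: 80.42%; Color: red;")
print(styles.get("width"))
```

For anything beyond simple declarations like this, a proper parser such as cssutils is the safer choice.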


2017-03-08 01:07:37 eLRuLL

I would just treat it as a text string and split it up as needed:

text = '<div style="width: 80.42%;" class="classA"></div>'

if "width:" in text:
    # split at the first occurrence of "width:" and take everything after it
    text = text.split("width:", 1)[1]
    # split at the first semicolon and take everything before it
    text = text.split(";", 1)[0]
    # collapse runs of whitespace and strip both ends
    text = " ".join(text.split())

print(text)

>>>80.42%

Or split on the percent sign instead of the semicolon:

text = '<div style="width: 80.42%;" class="classA"></div>'

if "width:" in text:
    # take everything after "width:"
    text = text.split("width:", 1)[1]
    # take everything before the percent sign
    text = text.split("%", 1)[0]
    # add the percent sign back
    text += '%'
    # collapse runs of whitespace and strip both ends
    text = " ".join(text.split())

print(text)

>>>80.42%

Or, more concisely:

text = '<div style="width: 80.42%;" class="classA"></div>'

if "width:" in text:
    text = " ".join((text.split("width:", 1)[1].split("%", 1)[0] + '%').split())

print(text)

>>>80.42%
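The snippets above overwrite text and rely on a semicolon or percent sign following the value. Here is a small sketch that wraps the same split idea in a function, returns None when there is no width, and also copes with a style that has no trailing semicolon. The name width_after_split is hypothetical, and the code assumes the attribute is double-quoted.

```python
def width_after_split(text):
    """Return the width value found in an HTML snippet, or None."""
    if "width:" not in text:
        return None
    # everything after the first "width:"
    after = text.split("width:", 1)[1]
    # cut at the first semicolon, if there is one
    value = after.split(";", 1)[0]
    # without a semicolon the value runs into the closing quote,
    # so cut there as well (assumes a double-quoted attribute)
    value = value.split('"', 1)[0]
    return value.strip()


print(width_after_split('<div style="width: 80.42%;" class="classA"></div>'))
print(width_after_split('<div style="width: 80%" class="classA"></div>'))
print(width_after_split('<div class="classA"></div>'))
```

Like the original snippets, this searches for the substring width:, so it would also pick up max-width: or min-width: if one of those appears first.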


2017-03-08 03:43:41 litepresence
