What is the appropriate Nokogiri XPath to get a range of rows?


<tr class="style6"><td>SomeStuff</td></tr> 
<tr><td>Some other stuff</td></tr> 
<tr><td>Some other stuff</td></tr> 
<tr><td>Some other stuff</td></tr> 
<tr><td>Some other stuff</td></tr> 
<tr><td>Some other stuff</td></tr> 
<tr class="style6"><td>SomeStuff</td></tr> 
<tr><td>Some other stuff</td></tr> 
<tr><td>Some other stuff</td></tr> 
<tr><td>Some other stuff</td></tr> 
<tr><td>Some other stuff</td></tr> 
<tr><td>Some other stuff</td></tr>

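I want to split the rows into blocks, each starting at a row with class style6 and ending at the last row before the next style6 row, so that I can iterate over the groups. Is there a way to split the rows into blocks like this? I know about the XPath position function, but I'm not sure it makes sense in this case.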

Any ideas?

Source

2016-09-16 Nick ONeill


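One useful pattern is to count the preceding occurrences of <tr class="style6"><td>SomeStuff</td></tr>. A row belongs to group n when exactly n such rows precede it among its siblings.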

For the first group in your example, this would be:

//tr[not(@class="style6")][count(preceding-sibling::tr[@class="style6"])=1]

For the second group:

//tr[not(@class="style6")][count(preceding-sibling::tr[@class="style6"])=2]

And so on.
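
Because these expressions differ only in the number being compared, the expression for any group can be built from its index. Here is a minimal sketch in Python; the helper name group_xpath and its marker_class parameter are made up for illustration and are not part of any library. The resulting string should work with any XPath 1.0 engine, such as lxml's xpath() or Nokogiri's xpath method.

```python
def group_xpath(n, marker_class="style6"):
    """Build the XPath that selects the rows of the n-th group (1-based).

    group_xpath is a hypothetical helper name used only for this sketch.
    """
    if n < 1:
        raise ValueError("group numbers start at 1")
    marker = f'tr[@class="{marker_class}"]'
    # Non-marker rows that have exactly n marker rows before them.
    return (
        f'//tr[not(@class="{marker_class}")]'
        f"[count(preceding-sibling::{marker})={n}]"
    )


print(group_xpath(1))
print(group_xpath(2))
```

Looping n from 1 up to the number of style6 rows would then give one query per group.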

I don't use Nokogiri, so here is an example using Python and lxml:

>>> import lxml.html 
>>> from pprint import pprint 

>>> doc = lxml.html.fromstring('''<tr class="style6"><td>SomeStuff</td></tr> 
... <tr><td>Some other stuff group 1</td></tr> 
... <tr><td>Some other stuff group 1</td></tr> 
... <tr><td>Some other stuff group 1</td></tr> 
... <tr><td>Some other stuff group 1</td></tr> 
... <tr><td>Some other stuff group 1</td></tr> 
... <tr class="style6"><td>SomeStuff</td></tr> 
... <tr><td>Some other stuff group 2</td></tr> 
... <tr><td>Some other stuff group 2</td></tr> 
... <tr><td>Some other stuff group 2</td></tr> 
... <tr><td>Some other stuff group 2</td></tr> 
... <tr><td>Some other stuff group 2</td></tr> 
... <tr class="style6"><td>SomeStuff</td></tr> 
... <tr><td>Some other stuff group 3</td></tr> 
... <tr><td>Some other stuff group 3</td></tr> 
... <tr><td>Some other stuff group 3</td></tr> 
... <tr><td>Some other stuff group 3</td></tr> 
... <tr><td>Some other stuff group 3</td></tr>''') 

>>> pprint(list(lxml.html.tostring(row) 
...   for row in doc.xpath(''' 
...     //tr[not(@class="style6")] 
...      [count(preceding-sibling::tr[@class="style6"])=1]'''))) 
[b'<tr><td>Some other stuff group 1</td></tr>\n', 
b'<tr><td>Some other stuff group 1</td></tr>\n', 
b'<tr><td>Some other stuff group 1</td></tr>\n', 
b'<tr><td>Some other stuff group 1</td></tr>\n', 
b'<tr><td>Some other stuff group 1</td></tr>\n'] 
>>> pprint(list(lxml.html.tostring(row) 
...   for row in doc.xpath(''' 
...     //tr[not(@class="style6")] 
...      [count(preceding-sibling::tr[@class="style6"])=2]'''))) 
[b'<tr><td>Some other stuff group 2</td></tr>\n', 
b'<tr><td>Some other stuff group 2</td></tr>\n', 
b'<tr><td>Some other stuff group 2</td></tr>\n', 
b'<tr><td>Some other stuff group 2</td></tr>\n', 
b'<tr><td>Some other stuff group 2</td></tr>\n'] 
>>> pprint(list(lxml.html.tostring(row) 
...   for row in doc.xpath(''' 
...     //tr[not(@class="style6")] 
...      [count(preceding-sibling::tr[@class="style6"])=3]'''))) 
[b'<tr><td>Some other stuff group 3</td></tr>\n', 
b'<tr><td>Some other stuff group 3</td></tr>\n', 
b'<tr><td>Some other stuff group 3</td></tr>\n', 
b'<tr><td>Some other stuff group 3</td></tr>\n', 
b'<tr><td>Some other stuff group 3</td></tr>'] 
>>>
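
If the number of groups is not known in advance, the same idea (a row's group number is how many style6 rows come before it) can also be applied in a single pass instead of running one query per group. The following is only a sketch under a few assumptions: it uses Python's standard-library xml.etree.ElementTree rather than lxml or Nokogiri, it wraps the rows in a table element so the markup parses as well-formed XML, and the function name group_rows is hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened sample markup; wrapping it in <table> is an assumption made
# so that the fragment is well-formed XML for ElementTree.
HTML = """<table>
<tr class="style6"><td>SomeStuff</td></tr>
<tr><td>Some other stuff group 1</td></tr>
<tr><td>Some other stuff group 1</td></tr>
<tr class="style6"><td>SomeStuff</td></tr>
<tr><td>Some other stuff group 2</td></tr>
</table>"""


def group_rows(table, marker_class="style6"):
    """Split the table's rows into lists, one list per marker row."""
    groups = []
    for row in table.findall("tr"):
        if row.get("class") == marker_class:
            groups.append([])        # a marker row opens a new group
        elif groups:
            groups[-1].append(row)   # rows before the first marker are skipped
    return groups


groups = group_rows(ET.fromstring(HTML))
for number, rows in enumerate(groups, start=1):
    print(number, [row.findtext("td") for row in rows])
```

The marker rows themselves are left out of each group, which matches what the not(@class="style6") predicate does in the XPath version.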

Source

2016-09-17 11:56:11
