2016-03-26

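I have the following HTML snippet. I want to scrape the page to get the topics and subtopics and store them in an object. How can I build a hierarchical object from sibling tags using jQuery selectors?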

The desired result:

{ 
'topic': 'Java Basics', 
'subtopics':['Define the scope of variables', 'Define the structure of a Java class', ...] 
} 

I tried to get it working with jsdom on Node.js and jQuery:

var jsdom = require('jsdom');
var fs = require("fs");

var topicos = fs.readFileSync("topic.html", "utf-8");

jsdom.env(topicos, ["http://code.jquery.com/jquery.js"], function (error, window) {
    var $ = window.$;
    var length = $('div ~ ').each(function() {
        //???
        var topic = $(this);
        var text = topic.text();
        console.log(text.trim())
    });
})

But because of my lack of experience with jQuery, I can't get the hierarchy organized correctly.

HTML snippet:

<div> 
    <strong>Java Basics&nbsp;</strong></div> 
<ul> 
    <li> 
     Define the scope of variables&nbsp;</li> 
    <li> 
     Define the structure of a Java class 
    </li> 
    <li> 
     Create executable Java applications with a main method; run a Java program from the command line; including 
     console output. 
    </li> 
    <li> 
     Import other Java packages to make them accessible in your code 
    </li> 
    <li> 
     Compare and contrast the features and components of Java such as: 
     platform independence, object orientation, encapsulation, etc. 
    </li> 
</ul> 
<div> 
    <strong>Working With Java Data Types&nbsp;</strong></div> 
<ul> 
    <li> 
     Declare and initialize variables (including casting of primitive data types) 
    </li> 
    <li> 
     Differentiate between object reference variables and primitive variables 
    </li> 
    <li> 
     Know how to read or write to object fields 
    </li> 
    <li> 
     Explain an Object's Lifecycle (creation, "dereference by reassignment" and garbage collection) 
    </li> 
    <li> 
     Develop code that uses wrapper classes such as Boolean, Double, and Integer. &nbsp;</li> 
</ul> 
... 

Answer


Here is a working snippet (fiddle):

var topicos = [];

// Each topic is a <div> whose immediately following sibling is the <ul> of subtopics.
jQuery('div').each(function () {
    var jDiv = jQuery(this);
    var data = {
        // trim() also strips the trailing &nbsp; left in the markup
        topic: jDiv.find('strong').text().trim(),
        subtopics: []
    };
    jDiv.next('ul').find('li').each(function () {
        data.subtopics.push(jQuery(this).text().trim());
    });
    topicos.push(data);
});

console.log(topicos);
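
The pairing that `.next('ul')` relies on, a `div` heading followed by its sibling list, can also be pictured as a single pass over a flat list of siblings. The sketch below is plain JavaScript with no jQuery. The `{ tag, text }` and `{ tag, items }` node shapes and the name `groupSiblings` are made up for illustration and are not part of any library. Unlike `.next('ul')`, which only looks at the immediately following sibling, this version attaches every list that appears after a heading until the next heading.

```javascript
// Hypothetical flat representation of the sibling elements on the page:
// a 'div' node carries the topic title, a 'ul' node carries its items.
function groupSiblings(nodes) {
    var topics = [];
    var current = null;
    nodes.forEach(function (node) {
        if (node.tag === 'div') {
            // A heading starts a new topic.
            current = { topic: node.text, subtopics: [] };
            topics.push(current);
        } else if (node.tag === 'ul' && current) {
            // A list belongs to the most recent heading; lists before any heading are ignored.
            current.subtopics = current.subtopics.concat(node.items);
        }
    });
    return topics;
}

var siblings = [
    { tag: 'div', text: 'Java Basics' },
    { tag: 'ul', items: ['Define the scope of variables', 'Define the structure of a Java class'] },
    { tag: 'div', text: 'Working With Java Data Types' },
    { tag: 'ul', items: ['Know how to read or write to object fields'] }
];

console.log(JSON.stringify(groupSiblings(siblings), null, 2));
```

The result is an array with one `{ topic, subtopics }` object per heading, which is the same shape the jQuery loop builds.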

However, I would strongly recommend adding classes to your markup and using those as selectors instead of tag names:

<div class="js-topic-data">
    <div>
        <strong class="js-topic">Java Basics&nbsp;</strong>
    </div>
    <ul>
        <li class="js-sub-topic">
            Define the scope of variables&nbsp;</li>
        <li class="js-sub-topic">
            Define the structure of a Java class
        </li>
    </ul>
</div>

Then you can do something like:

var topicos = [];

jQuery('.js-topic-data').each(function () {
    var jWrapper = jQuery(this);
    var data = {
        topic: jWrapper.find('.js-topic').text().trim(),
        subtopics: []
    };
    // The subtopics are descendants of the wrapper, so use find() rather than next().
    jWrapper.find('.js-sub-topic').each(function () {
        data.subtopics.push(jQuery(this).text().trim());
    });
    topicos.push(data);
});

This is more robust against markup changes and the like.
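
One detail applies to either version: `.text()` returns the element's raw text, so `&nbsp;` characters and the line breaks inside multi-line `<li>` items can end up in the strings unless they are normalized. A minimal sketch of a cleanup step, assuming a hypothetical helper named `cleanText` (not a jQuery function), might look like this:

```javascript
// Hypothetical helper: collapse every run of whitespace into one space, then trim.
// In JavaScript regular expressions, \s also matches the non-breaking space
// (U+00A0) that &nbsp; decodes to.
function cleanText(raw) {
    return raw.replace(/\s+/g, ' ').trim();
}

console.log(cleanText('Java Basics\u00a0'));
console.log(cleanText('\n     Compare and contrast the features\n     of Java. \n'));
```

It would be applied wherever the text is read, for example `data.subtopics.push(cleanText(jQuery(this).text()));`.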