2012-10-05 32 views
5

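I'm trying to do some HTML scraping with cheerio (I can't use jsdom because of a dependency problem: errors with contextify, etc.), but I can't get the meta tags "og:type", "og:title" and so on. How can I access the OpenGraph meta tags with cheerio?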

request(Url, function(error, response, body) { 
var $ = cheerio.load(body); 
    $('meta').each(function() { 
     console.log( $('meta').attr('content')); 
    }); 
}); 

I only get the content of the first meta tag, "text/html; charset=UTF-8". Do you know how to access the others?

Answers

3

You have to work with the keys of the $('meta') object and check whether the key you need exists before reading the result.

Try this code:

var cheerio = require('cheerio');
var request = require('request');

// Url must hold the address of the page to scrape
request(Url, function(error, response, body) {
    if (error) { return console.error(error); }

    var $ = cheerio.load(body);

    var meta = $('meta');
    // The keys are the element indexes plus non-element keys such as 'length'
    var keys = Object.keys(meta);

    var ogType;
    var ogTitle;

    keys.forEach(function(key) {
        var attribs = meta[key] && meta[key].attribs;
        // Skip keys that are not elements or that lack a property attribute
        if (!attribs || !attribs.property) { return; }
        if (attribs.property === 'og:type') { ogType = attribs.content; }
        if (attribs.property === 'og:title') { ogTitle = attribs.content; }
    });

    console.log(ogType);
    console.log(ogTitle);
});

Hi, what is .attribs.property? I can't find it in cheerio. Is it native to JavaScript? – MkM


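'.attribs' holds the element's attributes; '.attribs.property' is the 'property' attribute, if it exists. So check that '.attribs' and '.attribs.property' exist before checking the value. Please test this code with the following URL: 'http://geekli.st/goranhalusa/micro/23359'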
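To make the role of '.attribs' concrete, here is a minimal sketch of the same existence checks applied to plain objects. The helper name pickMeta and the sample data are hypothetical and only mimic the shape of parsed meta elements; they are not part of cheerio.

```javascript
// Hypothetical helper (not a cheerio API): `nodes` is an array of parsed
// elements that expose an `attribs` object, like the elements in a selection.
function pickMeta(nodes, wanted) {
  var found = {};
  nodes.forEach(function (node) {
    var attribs = node && node.attribs;
    // Skip anything without attributes or without a property attribute
    if (!attribs || !attribs.property) { return; }
    if (wanted.indexOf(attribs.property) !== -1) {
      found[attribs.property] = attribs.content;
    }
  });
  return found;
}

// Made-up stand-in data shaped like parsed <meta> elements
var sample = [
  { attribs: { 'http-equiv': 'Content-Type', content: 'text/html; charset=UTF-8' } },
  { attribs: { property: 'og:type', content: 'article' } },
  { attribs: { property: 'og:title', content: 'Example title' } }
];

console.log(pickMeta(sample, ['og:type', 'og:title']));
```

With cheerio, $('meta').toArray() should give an array of elements of this shape, so the same helper could be fed real data.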

1

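Expanding on Herman's answer:

I found the node-crawler + cheerio combination a bit more manageable. The code below makes it a little easier to keep track of which tag attributes you are looking for, and it can easily be adjusted to include other tags. Here is what I did: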

var crawler = require('crawler');

var c = new crawler({
    maxConnections: 10,
    callback: function(error, response, $) {
        if (error) { return console.error(error); }

        // Tags to collect; add or remove keys as needed
        var data = {
            'og:type': null,
            'og:title': null,
            'og:description': null,
            'og:image': null,
            'twitter:title': null,
            'twitter:image': null,
            'twitter:description': null,
            'twitter:site': null,
            'twitter:creator': null
        };
        var meta = $('meta');
        var keys = Object.keys(meta);
        for (var s in data) {
            keys.forEach(function(key) {
                var attribs = meta[key] && meta[key].attribs;
                // Keep the content only when the property attribute matches
                if (attribs && attribs.property === s) {
                    data[s] = attribs.content;
                }
            });
        }
        console.log(data);
    }
});
c.queue([ 'YOUR URL HERE' ]);
6

A simpler solution, if you know which property you want (assuming you want to get the title):

var $ = cheerio.load(html); 
var result = $('meta[property="og:title"]').attr('content');
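
The selector approach can be extended to several tags at once. The following is a rough sketch, not a definitive implementation: the helper name readMeta is hypothetical, and the fallback to the name attribute is an assumption based on Twitter card tags commonly being written as meta name="twitter:..." rather than property="...".

```javascript
// Hypothetical helper: `$` is the function returned by cheerio.load(html).
function readMeta($, names) {
  var out = {};
  names.forEach(function (name) {
    // OpenGraph tags use property="..."
    var value = $('meta[property="' + name + '"]').attr('content');
    if (value === undefined) {
      // Assumption: fall back to name="...", common for Twitter card tags
      value = $('meta[name="' + name + '"]').attr('content');
    }
    out[name] = value === undefined ? null : value;
  });
  return out;
}

// Usage (assumes cheerio is installed and `html` holds the page source):
// var cheerio = require('cheerio');
// var $ = cheerio.load(html);
// console.log(readMeta($, ['og:type', 'og:title', 'twitter:title']));
```

Tags that are missing from the page come back as null, so the caller can tell "not present" apart from an empty content value.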