2014-02-14

Get the contents of a remote txt file with CasperJS, without the HTML tags

Using CasperJS, when I run this code:

var casper = require('casper').create(); 
var url = 'https://www.youtube.com/robots.txt'; 

casper.start(url, function() { 
    var js = this.evaluate(function() { 
     return document; 
    }); 
    this.echo(js.all[0].innerHTML); 
}); 
casper.run(); 

instead of getting this:

# robots.txt file for YouTube 
# Created in the distant future (the year 2000) after 
# the robotic uprising of the mid 90's which wiped out all humans. 

User-agent: Mediapartners-Google* 
Disallow: 

User-agent: * 
Disallow: /bulletin 
Disallow: /comment 
Disallow: /forgot 
Disallow: /get_video 
Disallow: /get_video_info 
Disallow: /login 
Disallow: /results 
Disallow: /signup 
Disallow: /t/terms 
Disallow: /t/privacy 
Disallow: /verify_age 
Disallow: /videos 
Disallow: /watch_ajax 
Disallow: /watch_popup 
Disallow: /watch_queue_ajax 

I get this result:

<head></head><body><pre style="word-wrap: break-word; white-space: pre-wrap;"># robots.txt file for YouTube 
# Created in the distant future (the year 2000) after 
# the robotic uprising of the mid 90's which wiped out all humans. 

User-agent: Mediapartners-Google* 
Disallow: 

User-agent: * 
Disallow: /bulletin 
Disallow: /comment 
Disallow: /forgot 
Disallow: /get_video 
Disallow: /get_video_info 
Disallow: /login 
Disallow: /results 
Disallow: /signup 
Disallow: /t/terms 
Disallow: /t/privacy 
Disallow: /verify_age 
Disallow: /videos 
Disallow: /watch_ajax 
Disallow: /watch_popup 
Disallow: /watch_queue_ajax 
</pre></body> 

CasperJS seems to add HTML tags. How can I get the plain txt file, exactly as the source serves it?

Answers

What about the download function?

The script becomes

var casper = require('casper').create(); 
var url = 'https://www.youtube.com/robots.txt'; 

casper.start(url, function() { 
    // Save the raw response body to robots.txt in the working directory
    this.download(url, 'robots.txt'); 
}); 
casper.run(); 
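
If the page is already loaded and only the wrapper has to go, the markup shown in the question can also be unwrapped by hand. The tags come from the browser engine, which displays a plain-text response inside a generated `<pre>` element. Below is a minimal sketch, assuming the wrapper always has the single-`<pre>` shape shown in the question; `unwrapPlainText` is a hypothetical helper name, not part of CasperJS.

```javascript
// Hypothetical helper (not a CasperJS API): strips the <pre> wrapper that the
// browser engine puts around a text/plain response and undoes the entity
// escaping that innerHTML applies to text content.
function unwrapPlainText(html) {
    // Take the contents of the first <pre>...</pre>; fall back to the input.
    var match = /<pre[^>]*>([\s\S]*?)<\/pre>/i.exec(html);
    var text = match ? match[1] : html;
    // Decode &amp; last so that "&amp;lt;" becomes "&lt;" and not "<".
    return text
        .replace(/&lt;/g, '<')
        .replace(/&gt;/g, '>')
        .replace(/&quot;/g, '"')
        .replace(/&#39;/g, "'")
        .replace(/&amp;/g, '&');
}
```

Inside the `casper.start` callback this could be applied to the string returned by `this.getHTML()`. Reading the element's text directly with `this.fetchText('pre')` may be simpler, since it should avoid the entity decoding step altogether.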

UPDATE

If you want to store the remote file's contents in a string, use base64encode

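var casper = require('casper').create(); 
var url = 'https://www.youtube.com/robots.txt'; 
var contents; 

casper.start(url, function() { 
    // base64encode() fetches the URL and returns its body base64-encoded;
    // atob() decodes that back into a string, so nothing is written to disk.
    contents = atob(this.base64encode(url)); 
    console.log(contents); 
}); 

casper.run(); 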
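
One caveat about the decoding step: `atob` returns one character per byte, which is fine for an ASCII file such as robots.txt but would garble non-ASCII characters in a UTF-8 file. Below is a minimal sketch of a decoder for that case, assuming the remote file is UTF-8 and that `atob` is available in the environment; `base64ToUtf8` is a hypothetical helper name, not part of CasperJS.

```javascript
// Hypothetical helper (not a CasperJS API): decodes a base64 string and
// interprets the resulting bytes as UTF-8 text.
function base64ToUtf8(b64) {
    var binary = atob(b64); // one character per decoded byte
    var escaped = '';
    for (var i = 0; i < binary.length; i++) {
        var hex = binary.charCodeAt(i).toString(16);
        // Percent-encode every byte, padding to two hex digits.
        escaped += '%' + (hex.length < 2 ? '0' + hex : hex);
    }
    // decodeURIComponent reads percent-encoded bytes as UTF-8 and throws a
    // URIError if the byte sequence is not valid UTF-8.
    return decodeURIComponent(escaped);
}
```

In the script above it would replace the plain `atob` call, as in `contents = base64ToUtf8(this.base64encode(url));`.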
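Actually, it works, but it downloads the file, whereas I was trying to put its contents directly into a string. If I launch several instances, I would have to deal with overwrites with this solution :/ – mattspain

OK, check my edit – Cybermaxs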