How can I download all posts from a phpBB3 forum if I'm not an administrator?

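I post my thoughts on a forum and have started to worry that I will lose them if it shuts down. Do you know a good way to download an entire phpBB3 forum (other people's thoughts are good too!) into a database? Is there software available, or do I have to write it myself?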

UPDATE1：

OK, I can write it myself. It's not a very hard problem, is it? I just don't want to waste time reinventing the wheel.

UPDATE2：

There is an answer on Super User: How can I download an entire (active) phpbb forum?

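However, I preferred to write a Ruby script that backs up the forum. It's not a complete solution, but it's enough for me. And yes, in case you are worried, it does not violate any TOS.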

require 'rubygems'
require 'hpricot'
require 'open-uri'
require 'uri'
require 'cgi'
#require 'sqlite3-ruby'

class PHPBB
  def initialize(base_url)
    @base_url = base_url
    @forums, @topics = Array.new(2) { {} }
    # Walk the board top-down: index page -> each forum -> each topic.
    parse_main_page 'main', 'index.php'
    @forums.keys.each do |f|
      parse_forum "forum.#{f}", "viewforum.php?f=#{f}"
    end
    @topics.keys.each do |t|
      parse_topic "topic.#{t}", "viewtopic.php?t=#{t}"
    end
  end

  # Return the page from the local cache, downloading and saving it first if needed.
  def read_file(cached, remote)
    local = "%s.%s.html" % [__FILE__, cached]
    return File.read(local) if File.exist?(local)

    puts "load #{remote}"
    # URI.open is required on Ruby 3+, where Kernel#open no longer accepts URLs.
    content = URI.open(@base_url + remote).read
    File.write(local, content)
    content
  end

  # Collect every forum listed on the index page.
  def parse_main_page(local, remote)
    doc = Hpricot(read_file(local, remote))
    doc.search('ul.forums/li.row').each do |li|
      fa = li.search('a.forumtitle').first # forum anchor
      next if fa.nil?
      f = parse_anchor(fa)['f']
      @forums[f] = {
        forum_id: f,
        title: fa.inner_html,
        description: li.search('dl/dt').first.inner_html.split('<br />').last.strip
      }
      _ua, pa = li.search('dd.lastpost/span/a') # user anchor, post anchor
      q = parse_anchor(pa)
      last_post f, q['p'] unless q.nil?
    end
  end

  # Remember the newest post id seen so far.
  def last_post(f, p)
    @last_post = { forum_id: f, post_id: p } if @last_post.nil? || p.to_i > @last_post[:post_id].to_i
  end

  # Not implemented yet.
  def last_topic(f, t)
  end

  # Collect every topic listed on a forum page.
  def parse_forum(local, remote, start = nil)
    doc = Hpricot(read_file(local, remote))
    doc.search('ul.topics/li.row').each do |li|
      ta = li.search('a.topictitle').first # topic anchor
      q = parse_anchor(ta)
      next if q.nil?
      f = q['f']
      t = q['t']
      u = parse_anchor(li.search('dl/dt/a').last)['u']
      @topics[t] = {
        forum_id: f,
        topic_id: t,
        user_id: u,
        title: ta.inner_html
      }
    end
  end

  # Load a topic page; on the first page, also follow the pagination links.
  def parse_topic(local, remote, start = nil)
    doc = Hpricot(read_file(local, remote))
    if start.nil?
      starts = doc.search('div.pagination/span/a').collect { |p| parse_anchor(p)['start'] }
      starts.compact.uniq.each do |p|
        parse_topic "#{local}.start.#{p}", "#{remote}&start=#{p}", true
      end
    end
    doc.search('div.postbody').each do |li|
      # do something
    end
  end

  # Turn the query string of a link into a plain Hash (last value wins per key).
  # A plain Hash is built so that a missing key yields nil rather than
  # CGI.parse's default empty array.
  def parse_url(href)
    query = URI.parse(href).query.to_s
    CGI.parse(query).each_with_object({}) { |(k, v), h| h[k] = v.last }
  end

  # Query parameters of an anchor element, or nil if the element is missing.
  def parse_anchor(hp)
    parse_url hp.attributes['href'] unless hp.nil?
  end
end
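Two of the script's building blocks, caching downloaded pages and pulling ids out of phpBB links, can be tried on their own without Hpricot or a network connection. The following is only a minimal sketch using the standard library. The names `cached_page` and `link_params` are hypothetical and not part of the script above, `URI.decode_www_form` is swapped in for `CGI.parse`, and the `&amp;` handling assumes the HTML parser may hand back the attribute with entities still escaped.

```ruby
require 'uri'

# Hypothetical helper (not in the original script): return the cached copy of
# a page if it exists; otherwise call the block to fetch it and store the result.
def cached_page(cache_dir, name)
  path = File.join(cache_dir, "#{name}.html")
  return File.read(path) if File.exist?(path)

  content = yield            # the caller decides how the page is downloaded
  File.write(path, content)  # save it so the next run skips the download
  content
end

# Hypothetical helper: query parameters of a phpBB link as a plain Hash.
# Returns an empty Hash for a missing link or a link without a query string.
def link_params(href)
  return {} if href.nil?

  # Assumption: hrefs may still contain "&amp;" as written in the page source.
  query = URI.parse(href.gsub('&amp;', '&')).query
  return {} if query.nil?

  # decode_www_form yields [[key, value], ...]; to_h keeps the last value
  # when a key is repeated.
  URI.decode_www_form(query).to_h
end
```

For example, `cached_page('cache', 'topic.15') { URI.open(url).read }` (with `require 'open-uri'` and `url` standing in for the topic address) would hit the forum only on the first run, and `link_params(href)['t']` would give the topic id of a `viewtopic.php` link.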

Source

2010-09-01 Andrei

http://superuser.com/questions/116201/how-can-i-download-an-entire-active-phpbb-forum – Andrei 2010-09-01 23:22:09

Use [Offline Explorer](http://www.softpedia.com/get/Internet/Offline-Browsers/Offline-Explorer-Pro.shtml) – 2010-09-01 18:34:06

That was my idea too, but I would prefer a SQLite database containing only the useful information. – Andrei 2010-09-01 19:02:32

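This would violate the terms of service and may be illegal.

Secondly, if the StackOverflow community starts solving these kinds of web-scraping problems, then you know...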

Source

2010-09-01 18:15:20 shamittomar

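No, this does not violate the TOS. What makes you think so? To me, this is a standard problem of parsing and sorting data. Do you have a problem with that? – Andrei 2010-09-01 19:08:02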

@Andrei, can you provide the forum's URL? – shamittomar 2010-09-01 19:09:50

Hmm... I will decline to do that. Is it starting to look more evil? – Andrei 2010-09-01 19:14:47
