2016-06-07

Getting a subreddit as a PRAW Subreddit object from a link in a submission's self text

I am using PRAW with Python, and I want to be able to:

  1. Go through the "new" posts on a subreddit
  2. Detect whether a post's self text contains a link to a subreddit
  3. If there is a subreddit link, get that subreddit as a PRAW object to be used later.

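I can do step 1, but detecting whether there is a subreddit link and then getting that subreddit is the part I am having difficulty with. Here is what I have so far: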

#! python3
# Reply with subreddit info from subreddit in text body

import praw

# Bot login details
USERNAME = "AutoMobBot"
PASSWORD = "<redacted>"

UA = "[Subreddit Info Provider (Update 0) by /u/MatthewMob]"
r = praw.Reddit(UA)
r.login(USERNAME, PASSWORD, disable_warning=True)

submissions = r.get_subreddit("matthewmob_csstesting").get_new(limit=10)

for submission in submissions:
    # Check every whitespace-separated word of the self text
    for word in submission.selftext.lower().split():
        if word.startswith("/r/"):
            print("Found subreddit in:", submission.title)
            print(submission.selftext_html)

print("Done...")
input()

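This just gets the submissions, splits the selftext into words, and prints something if one of the words starts with /r/. Obviously this will not work all the time, for example if the user links the subreddit only as r/askreddit or www.reddit.com/r/askreddit. Even then, if they link to /r/askreddit/top (with something on the end), how would I get that subreddit as a PRAW object? I have been trying to find a regular expression to help me do this, but have not found one.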

My main question is: what is the best way to get a subreddit from a link in a user's selftext, and how would I do it?

If you need more clarification, I am happy to provide more information.

Answer


I have now found my own answer. Here is the code that works for me:

#! python3
# Reply with subreddit info from subreddit in text body

import re

import bs4
import praw

# Bot login details
USERNAME = "AutoMobBot"
PASSWORD = "<Password>"

UA = "[Subreddit Info Provider (Update 4) by /u/MatthewMob]"
r = praw.Reddit(UA)
r.login(USERNAME, PASSWORD, disable_warning=True)

submissions = r.get_subreddit("matthewmob_csstesting").get_new(limit=3)

for submission in submissions:
    subs = []  # unique subreddit names linked from this submission
    # selftext_html may be None (for example, on link posts), so fall back to ""
    soup = bs4.BeautifulSoup(submission.selftext_html or "", "lxml")
    for a in soup.find_all("a", href=True):
        # Append "/" so a link that ends in the subreddit name still matches
        href = a["href"] + "/"
        match = re.search(r"/r/([^/]+)/", href)
        # re.search returns None when the link does not point at a subreddit
        if match and match.group(1) not in subs:
            subs.append(match.group(1))
            print("\nTitle:", submission.title)
            print("\nSubreddits Found:", len(subs))
            print("\nSubreddit Found:", subs[-1] + "\n")

print("Done...")
input()
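
For the cases raised in the question (r/askreddit without the leading slash, www.reddit.com/r/askreddit, and /r/askreddit/top with something after the name), a regular expression over the raw selftext is another possible approach. The following is only a minimal sketch: `SUBREDDIT_RE` and `find_subreddits` are hypothetical names, and the pattern's idea of a valid subreddit name (2 to 21 letters, digits, or underscores, not starting with an underscore) is an assumption rather than something taken from the code above.

```python
import re

# Assumption: a subreddit name is 2-21 characters of letters, digits or
# underscores and does not start with an underscore.
# "r/" must appear at the start of the text or right after whitespace, "/",
# "(" or "[", so that words which merely end in "r/" are not picked up.
SUBREDDIT_RE = re.compile(
    r"(?:^|[\s/(\[])r/([A-Za-z0-9][A-Za-z0-9_]{1,20})(?![A-Za-z0-9_])"
)


def find_subreddits(text):
    """Return the unique subreddit names mentioned in text, in order."""
    found = []
    seen = set()
    for name in SUBREDDIT_RE.findall(text):
        key = name.lower()  # treat names as case-insensitive when de-duplicating
        if key not in seen:
            seen.add(key)
            found.append(name)
    return found


if __name__ == "__main__":
    sample = "see r/askreddit and www.reddit.com/r/AskReddit/top plus /r/python/"
    print(find_subreddits(sample))
```

Each returned name could then be passed to `r.get_subreddit(name)`, the same call the code above uses to fetch the test subreddit, to obtain the PRAW object for step 3.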