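How do I get only the text part of tweets fetched with Tweepy?

I am working on a research project along the lines of sentiment analysis. I used Tweepy to pull tweets from Twitter. The data I got looks like this: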

{"created_at":"Sat Apr 22 07:28:47 +0000 2017","id":855684794939842560,"id_str":"855684794939842560","text":"#PL | FIXTURES - 22 April 2017 \nWest Ham v Everton 16:00\nHull v Watford\nSwansea v Stoke \nBournemouth v Middlesbrough #CCFMSport","source":"\u003ca href=\"http:\/\/twitter.com\" rel=\"nofollow\"\u003eTwitter Web Client\u003c\/a\u003e","truncated":false,"in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"in_reply_to_user_id_str":null,"in_reply_to_screen_name":null,"user":{"id":256051042,"id_str":"256051042","name":"Ayanda Frances Felem","screen_name":"AyandaFelemZA","location":"Cape Town, South Africa","url":"http:\/\/ccfm.org.za","description":"Sports Producer\/Reporter for @RadioCCFm, Views are my own. [email protected]","protected":false,"verified":false,"followers_count":446,"friends_count":1648,"listed_count":23,"favourites_count":1625,"statuses_count":16110,"created_at":"Tue Feb 22 15:15:38 +0000 2011","utc_offset":7200,"time_zone":"Pretoria","geo_enabled":true,"lang":"en","contributors_enabled":false,"is_translator":false,"profile_background_color":"000000","profile_background_image_url":"http:\/\/abs.twimg.com\/images\/themes\/theme11\/bg.gif","profile_background_image_url_https":"https:\/\/abs.twimg.com\/images\/themes\/theme11\/bg.gif","profile_background_tile":false,"profile_link_color":"DD2E44","profile_sidebar_border_color":"000000","profile_sidebar_fill_color":"000000","profile_text_color":"000000","profile_use_background_image":false,"profile_image_url":"http:\/\/pbs.twimg.com\/profile_images\/850335374446665728\/BvVIo7oB_normal.jpg","profile_image_url_https":"https:\/\/pbs.twimg.com\/profile_images\/850335374446665728\/BvVIo7oB_normal.jpg","profile_banner_url":"https:\/\/pbs.twimg.com\/profile_banners\/256051042\/1491570881","default_profile":false,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":null,"contributors":null
,"is_quote_status":false,"retweet_count":0,"favorite_count":0,"entities":{"hashtags":[{"text":"PL","indices":[0,3]},{"text":"CCFMSport","indices":[117,127]}],"urls":[],"user_mentions":[],"symbols":[]},"favorited":false,"retweeted":false,"filter_level":"low","lang":"en","timestamp_ms":"1492846127625"}

Now I want to extract only the tweet "text" from that file. This is what I tried:

import json

tweets_data_path = 'twitter_streaming.txt'
tweets_data = []
tweets_file = open(tweets_data_path, "r")

json_load = json.load(tweets_file)
texts = json_load['text']
coded = texts.encode('utf-8')
s = str(coded)
tweets_data.append(s[1:-2])
print(tweets_data)

But I get an error that says:

json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

I tried to find the cause of this error but did not find anything specific.

What am I doing wrong? Is there a better way?

Source

2017-04-23 Liam

# Map the JSON literals to Python values so the dict literal below evaluates
null, false, true = None, False, True
a = {"created_at":"Sat Apr 22 07:28:47 +0000 2017","id":855684794939842560,"id_str":"855684794939842560","text":"#PL | FIXTURES - 22 April 2017 \nWest Ham v Everton 16:00\nHull v Watford\nSwansea v Stoke \nBournemouth v Middlesbrough #CCFMSport","source":"\u003ca href=\"http:\/\/twitter.com\" rel=\"nofollow\"\u003eTwitter Web Client\u003c\/a\u003e","truncated":false,"in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"in_reply_to_user_id_str":null,"in_reply_to_screen_name":null,"user":{"id":256051042,"id_str":"256051042","name":"Ayanda Frances Felem","screen_name":"AyandaFelemZA","location":"Cape Town, South Africa","url":"http:\/\/ccfm.org.za","description":"Sports Producer\/Reporter for @RadioCCFm, Views are my own. [email protected]","protected":false,"verified":false,"followers_count":446,"friends_count":1648,"listed_count":23,"favourites_count":1625,"statuses_count":16110,"created_at":"Tue Feb 22 15:15:38 +0000 2011","utc_offset":7200,"time_zone":"Pretoria","geo_enabled":true,"lang":"en","contributors_enabled":false,"is_translator":false,"profile_background_color":"000000","profile_background_image_url":"http:\/\/abs.twimg.com\/images\/themes\/theme11\/bg.gif","profile_background_image_url_https":"https:\/\/abs.twimg.com\/images\/themes\/theme11\/bg.gif","profile_background_tile":false,"profile_link_color":"DD2E44","profile_sidebar_border_color":"000000","profile_sidebar_fill_color":"000000","profile_text_color":"000000","profile_use_background_image":false,"profile_image_url":"http:\/\/pbs.twimg.com\/profile_images\/850335374446665728\/BvVIo7oB_normal.jpg","profile_image_url_https":"https:\/\/pbs.twimg.com\/profile_images\/850335374446665728\/BvVIo7oB_normal.jpg","profile_banner_url":"https:\/\/pbs.twimg.com\/profile_banners\/256051042\/1491570881","default_profile":false,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":null,"contributors":
null,"is_quote_status":false,"retweet_count":0,"favorite_count":0,"entities":{"hashtags":[{"text":"PL","indices":[0,3]},{"text":"CCFMSport","indices":[117,127]}],"urls":[],"user_mentions":[],"symbols":[]},"favorited":false,"retweeted":false,"filter_level":"low","lang":"en","timestamp_ms":"1492846127625"} 
print(a["text"])

I just used that one line of code, and it gave me the following output.

#PL | FIXTURES - 22 April 2017 
West Ham v Everton 16:00 
Hull v Watford 
Swansea v Stoke 
Bournemouth v Middlesbrough #CCFMSport

The question is not entirely clear, but is this the text you are looking for?
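If the tweet is available as a raw JSON string rather than pasted into the source as a dict literal, the standard `json` module may save the manual renaming, since `json.loads` converts `null`, `false` and `true` to `None`, `False` and `True` by itself. A minimal sketch, where `raw` is a shortened, hypothetical stand-in for one tweet and not the full record above:

```python
import json

# Hypothetical, shortened stand-in for one tweet as a raw JSON string.
raw = (
    '{"text": "#PL | FIXTURES - 22 April 2017 \\nWest Ham v Everton 16:00", '
    '"truncated": false, "geo": null, "user": {"geo_enabled": true}}'
)

# json.loads maps JSON null/false/true to Python None/False/True.
tweet = json.loads(raw)

print(tweet["text"])
```

With this approach the original data does not need to be edited at all.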

Source

2017-04-23 14:01:56

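Yes, this is exactly what I was looking for, but could you tell me where you used that print of a["text"]? Thank you very much. – Liam

OK, I see what you did, but for that I had to change every false to False and every null to None. – Liam

@Liam Yes, I edited my answer so that it is more helpful to you! –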
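For reading the file itself, one possible approach is to parse it line by line instead of calling `json.load` on the whole file. `json.load` expects the file to contain exactly one JSON document, and "Expecting value: line 1 column 1 (char 0)" usually means the parser found nothing it could parse at the very start, for example because the file is empty. The sketch below assumes the streaming file holds one tweet object per line, possibly with blank lines or non-tweet messages in between; `extract_texts` and the sample data are hypothetical names and values used only for illustration.

```python
import io
import json


def extract_texts(lines):
    """Return the "text" field of every line that parses as a JSON object."""
    texts = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between tweets
        try:
            tweet = json.loads(line)
        except ValueError:
            # json.JSONDecodeError is a subclass of ValueError
            continue  # skip lines that are not valid JSON
        if isinstance(tweet, dict) and "text" in tweet:
            texts.append(tweet["text"])
    return texts


# Demo on in-memory data (hypothetical sample, not real tweets).
sample = io.StringIO(
    '{"text": "first tweet", "geo": null}\n'
    '\n'
    '{"limit": {"track": 1}}\n'
    '{"text": "second tweet", "retweeted": false}\n'
)
print(extract_texts(sample))

# With the file from the question (assumed to be UTF-8 encoded):
# with open("twitter_streaming.txt", "r", encoding="utf-8") as tweets_file:
#     tweets_data = extract_texts(tweets_file)
```

Because the function only needs an iterable of lines, it works the same way on an open file object as on the in-memory sample.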
