2012-12-29

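Filter fetched JSON data based on a specific condition in Python

I am new to Python. I am trying to fetch JSON data from MusicBrainz with urllib, and I can already parse some of it. However, for some artists certain fields/keys are missing from the JSON, presumably because they are optional. I don't know how to filter the fetched JSON data on further criteria.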

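In the example below, I only want to keep the entries of the fetched JSON where release-list >> release >> release-group >> type is 'Single'. The fetched JSON can contain up to 50 entries, but I only want to narrow it down to the 'Single' category. Please let me know how this can be done. Thanks!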

Sample JSON:

{ 
    "created": "2012-12-27T13:22:55.834Z", 
    "recording-list": { 
     "count": 3, 
     "offset": 0, 
     "recording": [{ 
      "score": "100", 
      "title": "Stronger", 
      "artist-credit": { 
       "name-credit": [{ 
        "artist": { 
         "name": "Britney Spears", 
         "sort-name": "Spears, Britney" 
        } 
       }] 
      }, 
      "release-list": { 
       "release": [{ 
        "id": "13c5511f-1f99-4ffe-97d5-562c05e9d8d5", 
        "title": "Hit Hammer 2001 (disc 1)", 
        "status": "Official", 
        "artist-credit": { 
         "name-credit": [{ 
          "artist": { 
           "id": "89ad4ac3-39f7-470e-963a-56509c546377", 
           "name": "Various Artists" 
          } 
         }] 
        }, 
         "release-group": { 
         "id": "6c4c2cc3-3d8e-3a19-9d46-da076c34b6e9", 
         "type": "Compilation", 
         "primary-type": "Album", 
         "secondary-type-list": { 
          "secondary-type": ["Compilation"] 
         } 
        }, 
        "medium-list": { 
         "track-count": 20, 
         "medium": [{ 
          "position": 1, 
          "track-list": { 
           "count": 20, 
           "offset": 0, 
           "track": [{ 
            "number": "1", 
            "title": "Stronger", 
            "length": 203266 
           }] 
          } 
         }] 
        } 
       }] 
      } 
     }, { 
      "id": "feb9acbf-1d3d-4395-9512-bfbdcfa72eb9", 
      "score": "100", 
      "title": "Stronger", 
      "artist-credit": { 
       "name-credit": [{ 
        "joinphrase": "", 
        "artist": { 
         "name": "Britney Spears", 
         "sort-name": "Spears, Britney" 
        } 
       }] 
      }, 
      "release-list": { 
       "release": [{ 
        "id": "45e2a271-2f6b-4029-b11e-b6d94d169f9a", 
        "title": "Stronger: The Remixes", 
        "status": "Official", 
        "release-group": { 
         "id": "4d018ba8-f05e-4817-8c70-34307161a0fc", 
         "type": "Single", 
         "primary-type": "Single" 
        }, 
        "date": "2000-12-12", 
        "country": "US", 
        "medium-list": { 
         "track-count": 6, 
         "medium": [{ 
          "position": 1, 
          "format": "CD", 
          "track-list": { 
           "count": 6, 
           "offset": 0, 
           "track": [{ 
            "number": "1", 
            "title": "Stronger", 
            "length": 203000 
           }] 
          } 
         }] 
        } 
       }] 
      }, 
      "puid-list": { 
       "puid": [{ 
        "id": "28550845-c68a-314d-90c1-010dff730f4a" 
       }] 
      } 
     }] 
    } 
} 

Python code:

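import json
import urllib.parse
import urllib.request

def get_mbid(artist, song):
    # URL-encode the search terms before building the query string
    artist = urllib.parse.quote_plus(artist)
    song = urllib.parse.quote_plus(song)
    recording_url = 'http://search.musicbrainz.org/ws/2/recording/?&fmt=json&query=artist:"'+artist+'"%20AND%20recording:"'+song+'"'
    search_results = urllib.request.urlopen(recording_url)

    # Use a name other than `json` so the module is not shadowed
    data = json.loads(search_results.read())
    search_results.close()
    if data['recording-list']['count'] == 0:
        return get_mbid_artist(artist)  # helper defined elsewhere (not shown)
    else:
        recordings = data['recording-list']['recording']
        for recording in recordings:
            # 'id' is not always present, so .get() avoids a KeyError
            mbid = recording['artist-credit']['name-credit'][0]['artist'].get('id')
            print(mbid)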

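You really want to filter on 'primary-type'; 'type' is deprecated and is being replaced by 'primary-type' and 'secondary-type'. Why not add the filter to the query URL? Adding '&primary-type=Single' should do it.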


Hi Martin, thanks for the suggestion, but it does not seem to work: the JSON dump still looks the same even when I pass primary-type in the URL. http://musicbrainz.org/ws/2/recording?&fmt=json&query=artist%3A"Britney+Spears"+AND+recording%3A"Stronger"&primary-type=Single


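Right, I suspect it returns any recording that has *at least* one such release. What kind of filtering do you need? You can always loop over the releases and only process the ones that are singles, right?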
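
The loop suggested in the comment above could be sketched roughly as follows. This is only an illustration, assuming the response has the shape of the sample JSON in the question; the function names `single_releases` and `filter_singles` are made up for this example, and `sample` is a trimmed-down copy of the question's data. Every lookup goes through `dict.get`, so recordings that lack an optional key are skipped instead of raising a `KeyError`.

```python
def single_releases(recording):
    """Return the releases of one recording whose release group is a Single."""
    releases = recording.get("release-list", {}).get("release", [])
    singles = []
    for release in releases:
        group = release.get("release-group", {})
        # Prefer 'primary-type'; fall back to the deprecated 'type' key.
        kind = group.get("primary-type", group.get("type"))
        if kind == "Single":
            singles.append(release)
    return singles


def filter_singles(data):
    """Keep only recordings that have at least one Single release."""
    recordings = data.get("recording-list", {}).get("recording", [])
    result = []
    for recording in recordings:
        singles = single_releases(recording)
        if singles:
            # Shallow-copy the recording and keep only its Single releases.
            trimmed = dict(recording)
            trimmed["release-list"] = {"release": singles}
            result.append(trimmed)
    return result


# Trimmed-down version of the sample response from the question.
sample = {
    "recording-list": {
        "count": 2,
        "recording": [
            {
                "title": "Stronger",
                "release-list": {"release": [{
                    "title": "Hit Hammer 2001 (disc 1)",
                    "release-group": {"type": "Compilation",
                                      "primary-type": "Album"},
                }]},
            },
            {
                "id": "feb9acbf-1d3d-4395-9512-bfbdcfa72eb9",
                "title": "Stronger",
                "release-list": {"release": [{
                    "title": "Stronger: The Remixes",
                    "release-group": {"type": "Single",
                                      "primary-type": "Single"},
                }]},
            },
        ],
    }
}

for rec in filter_singles(sample):
    print(rec["title"], [r["title"] for r in rec["release-list"]["release"]])
```

Filtering on the client like this works regardless of what the search endpoint returns, at the cost of downloading recordings that are later thrown away.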

Answer


http://musicbrainz.org/ws/2/recording?&query=artist%3A%22Britney+Spears%22+AND+recording%3A%22Stronger%22+AND+primarytype%3ASingle

will give you the recordings whose primarytype is Single.

That is, without the URL encoding:

artist:"Britney Spears" AND recording:"Stronger" AND primarytype:Single

See: Web Service Search. Of course, you can add fmt=json again. I only removed that part because XML is easier to read in a browser.
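
As a rough sketch, the encoded URL can be built from the plain query string with the standard library instead of by hand. The helper name `build_search_url` and its parameters are hypothetical; the endpoint and the `primarytype` search field are the ones used in this answer. Note that artist or song names containing double quotes would need extra escaping inside the query, which this sketch does not attempt.

```python
from urllib.parse import urlencode

# Endpoint taken from the answer above.
BASE_URL = "http://musicbrainz.org/ws/2/recording"


def build_search_url(artist, song, primary_type="Single", fmt="json"):
    """Build a recording search URL with a primarytype clause in the query."""
    query = 'artist:"{}" AND recording:"{}" AND primarytype:{}'.format(
        artist, song, primary_type
    )
    # urlencode percent-encodes quotes and colons and turns spaces into '+'.
    return BASE_URL + "?" + urlencode({"query": query, "fmt": fmt})


url = build_search_url("Britney Spears", "Stronger")
print(url)
```

The resulting URL can then be passed to `urllib.request.urlopen` and the body decoded with `json.loads`, as in the question's code.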


I should also note that there is no python library for the current version of the XML Web Service.