2015-02-24

Parsing a JavaScript array into Python dictionaries

I have a public web page that contains code similar to the following:

var arrayA = new Array(); 
arrayA[0] = new customItem("1","Name1","description1",1.000,2.000);arrayA[1] = new customItem("2","Name2","description2",4.000,8.000); 

What I want is for Python to read this page and convert the data into two dictionaries, with name + description as the key.

dict1["Name1Description1"] = 1.000 

dict2["Name1Description1"] = 2.000 

dict1["Name2Description2"] = 4.000 

dict2["Name2Description2"] = 8.000 

Is there a simple way to do this, or do we pretty much have to parse it like any other string? Obviously the array can be of arbitrary length.

Thanks!

Answers

Yes, this can be done with a regular expression.

import re 

st = ''' 
var arrayA = new Array(); 
arrayA[0] = new customItem("1","Name1","description1",1.000,2.000);arrayA[1] = new customItem("2","Name2","description2",4.000,8.000); 
''' 

dict1, dict2 = {}, {} 
# Raw string so the backslashes reach the regex engine unchanged; the dots are
# escaped so that they only match a literal decimal point.
matches = re.findall(r'\"(\d+)\",\"(.*?)\",\"(.*?)\",(\d+\.\d+),(\d+\.\d+)', st, re.DOTALL)
for m in matches:
    key = m[1] + m[2]
    dict1[key] = float(m[3])
    dict2[key] = float(m[4])

print(dict1)
print(dict2)

# {'Name1description1': 1.0, 'Name2description2': 4.0}
# {'Name1description1': 2.0, 'Name2description2': 8.0}

The logic of the regular expression is:

\" - Match a double quote
\"(\d+)\" - Match one or more digits between two double quotes
\"(.*?)\" - Match as few characters as possible between two double quotes
(\d+\.\d+) - Match one or more digits, a literal dot, then one or more digits
, - Match a comma
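
If it helps to see that breakdown next to the pattern itself, here is a minimal sketch of the same regex written with re.VERBOSE and named groups. The group names (item_id, name, description, value1, value2) are made up for illustration; nothing on the page defines them.

```python
import re

# Same pattern as above, spelled out with re.VERBOSE so each piece is labelled.
# The group names are hypothetical and chosen only for this sketch.
ITEM_RE = re.compile(r'''
    "(?P<item_id>\d+)",          # one or more digits between double quotes
    "(?P<name>.*?)",             # as few characters as possible between quotes
    "(?P<description>.*?)",      # as few characters as possible between quotes
    (?P<value1>\d+\.\d+),        # digits, a literal dot, digits
    (?P<value2>\d+\.\d+)         # digits, a literal dot, digits
''', re.VERBOSE)

js = ('arrayA[0] = new customItem("1","Name1","description1",1.000,2.000);'
      'arrayA[1] = new customItem("2","Name2","description2",4.000,8.000);')

# Each match becomes a dict keyed by group name; all values are still strings.
rows = [m.groupdict() for m in ITEM_RE.finditer(js)]
print(rows[0]["name"], rows[0]["value1"])
```

Named groups avoid the m[1], m[3] index bookkeeping, at the cost of a longer pattern.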

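So the regex will match JS input that follows this expected pattern. Note that I assumed the JS has no spaces around the commas. If it does, you can strip those spaces first and then run it.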
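
As an alternative to stripping spaces beforehand, the pattern itself can allow optional whitespace around each comma with \s*. This is only a sketch: parse_items is a hypothetical helper name, and the sample input with extra spaces is invented to show the behaviour.

```python
import re

def parse_items(js_text):
    """Return (dict1, dict2), both keyed by name + description.

    parse_items is a hypothetical helper for this sketch.
    """
    # \s* before and after each comma tolerates optional whitespace.
    pattern = re.compile(
        r'"(\d+)"\s*,\s*"(.*?)"\s*,\s*"(.*?)"\s*,\s*(\d+\.\d+)\s*,\s*(\d+\.\d+)'
    )
    dict1, dict2 = {}, {}
    for _item_id, name, description, first, second in pattern.findall(js_text):
        key = name + description
        dict1[key] = float(first)
        dict2[key] = float(second)
    return dict1, dict2

# Invented sample with spaces around the commas and inside a name.
spaced = 'arrayA[0] = new customItem("1", "Name 1", "description1" , 1.000, 2.000);'
print(parse_items(spaced))
# ({'Name 1description1': 1.0}, {'Name 1description1': 2.0})
```

This keeps spaces that are part of a name or description intact, which a blanket removal of all spaces would not.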