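Extracting directory structure from URLs
I want to extract the directory hierarchy from a website's URLs. Not every website follows a directory structure. For sites that do (see below), I would like to build a Python dictionary that mirrors the directory hierarchy (also shown below). How can I write a Python script that extracts this structure from the URLs into a dictionary?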
Raw data:
http://www.ex.com
http://www.ex.com/product_cat_1/
http://www.ex.com/product_cat_1/item_1
http://www.ex.com/product_cat_1/item_2
http://www.ex.com/product_cat_2/
http://www.ex.com/product_cat_2/item_1
http://www.ex.com/product_cat_2/item_2
http://www.ex.com/terms_and_conditions/
http://www.ex.com/Media_Center
Example output:
{'url':'http://www.ex.com', 'sub_dir':[
{'url':'http://www.ex.com/product_cat_1/', 'sub_dir':[
{'url':'http://www.ex.com/product_cat_1/item_1'}, {'url':'http://www.ex.com/product_cat_1/item_2'}]},
{'url':'http://www.ex.com/product_cat_2/', 'sub_dir':[
{'url':'http://www.ex.com/product_cat_2/item_1'},
{'url':'http://www.ex.com/product_cat_2/item_2'}]},
{'url':'http://www.ex.com/terms_and_conditions/'},
{'url':'http://www.ex.com/Media_Center'},
]}
What have you tried? http://mattgemmell.com/2008/12/08/what-have-you-tried/ – FlavorScape
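One possible approach, offered as a minimal sketch rather than a definitive answer: split each URL's path into segments with `urllib.parse.urlsplit`, process shallow URLs first, and attach each URL to its nearest listed ancestor. The names `path_key` and `build_tree` are hypothetical helpers made up for this illustration. The sketch assumes all URLs belong to one host and that the root URL is in the list.

```python
from urllib.parse import urlsplit


def path_key(url):
    """Return the non-empty path segments of a URL as a tuple."""
    return tuple(seg for seg in urlsplit(url).path.split("/") if seg)


def build_tree(urls):
    """Nest URLs into {'url': ..., 'sub_dir': [...]} dicts by path depth.

    Assumes a single host and that the site root is among the URLs.
    """
    nodes = {}  # path segments -> node dict
    # Shallow URLs first, so a parent exists before its children arrive.
    # sorted() is stable, so input order is kept within each depth.
    for url in sorted(urls, key=lambda u: len(path_key(u))):
        key = path_key(url)
        if key in nodes:  # e.g. '/a' and '/a/' name the same directory
            continue
        node = {"url": url}
        nodes[key] = node
        if not key:
            continue  # the site root has no parent
        # Walk up until an ancestor that was actually listed is found.
        parent_key = key[:-1]
        while parent_key and parent_key not in nodes:
            parent_key = parent_key[:-1]
        if parent_key not in nodes:
            raise ValueError("no root URL listed for " + url)
        nodes[parent_key].setdefault("sub_dir", []).append(node)
    return nodes.get(())  # None if the list was empty
```

Calling `build_tree` on the raw data above should yield a dictionary with the same shape as the example output. Leaf pages simply have no `'sub_dir'` key. A URL whose direct parent is missing from the list is attached to the closest ancestor that is present, which is a design choice of this sketch and may not suit every site.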