Need the file name behind a URL full of junk in order to name the download. (Advanced BASH)

http://romhustler.net/file/54654/RFloRzkzYjBxeUpmSXhmczJndVZvVXViV3d2bjExMUcwRmdhQzltaU5UUTJOVFE2TVRrM0xqZzNMakV4TXk0eU16WTZNVE01TXpnME1UZ3pPRHBtYVc1aGJGOWtiM2R1Ykc5aFpGOXNhVzVy <- the URL whose file name needs to be identified

http://romhustler.net/rom/ps2/final-fantasy-x-usa <- parent URL

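If you copy and paste this link into a browser, you will see that the browser recognizes the file name. How can I get a bash script to do the same?

I need to WGET the first URL, but since there will be 100+ items I can't copy and paste each URL by hand.

I already have the menu set up for all the files. I just don't know how to download each file in bulk, because the file URLs follow no matching pattern.

*Part of my working menu: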

    # Raw game list grabber
    w3m http://romhustler.net/roms/ps2 | cat | egrep "/5" > rawmenu.txt

    # Split the raw list into files (games00, games01, ...) of 10 lines each.
    # -d gives the output files numeric suffixes.
    split -l 10 -d rawmenu.txt games

    # s/ /_/g   - replaces spaces with underscores
    # s/__.*//g - removes everything from the first double underscore onward
    select opt in \
        $(sed -e 's/ /_/g' -e 's/__.*//g' "games0$num") \
        "Next" \
        "Quit"
    do
        if [[ "$opt" =~ "${lines[0]}" ]]
        then
            ### Here the URL needs to be grabbed ###
            :
        fi
    done

This is all being done in BASH. Is this possible?


2014-03-03 Infinite

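It seems romhustler.net uses some JavaScript on the full download page to hide the final download link until a few seconds after the page loads, possibly to prevent exactly this kind of web scraping.

However, if they used direct links to, say, the ZIP files, we could do something like this: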

    # Use curl to get the HTML of the page and egrep to match the hyperlinks to each ROM
    curl -s http://romhustler.net/roms/ps2 | egrep -o "/rom/ps2/[a-zA-Z0-9_-]+" | sort -u > rawmenu.txt

    # Loop through each of those links and extract the full download link
    while read -r LINK
    do
        # Extract the full download link (first match only)
        FULLDOWNLOAD=$(curl -s "http://romhustler.net$LINK" | egrep -o "/download/[0-9]+/[a-zA-Z0-9]+" | head -n 1)
        # Download the file, skipping pages where no link was found
        [ -n "$FULLDOWNLOAD" ] && wget "http://romhustler.net$FULLDOWNLOAD"
    done < "rawmenu.txt"
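
On the naming part of the question: a browser usually takes the file name from the `Content-Disposition` response header rather than from the URL. The sketch below assumes the server sends that header for the `/file/...` URL, which has not been confirmed for romhustler.net. The helper name `filename_from_headers` is made up for illustration; it only parses header text read from standard input.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of any tool): read raw HTTP response headers
# on stdin and print the file name given in Content-Disposition, if present.
filename_from_headers() {
    # Strip carriage returns, pull out the value after filename= (quoted or
    # not), and keep the last match in case redirects produced several
    # header blocks.
    tr -d '\r' |
    sed -n 's/^[Cc]ontent-[Dd]isposition:.*filename="\{0,1\}\([^";]*\)"\{0,1\}.*$/\1/p' |
    tail -n 1
}
```

It could be used as `NAME=$(curl -sIL "$URL" | filename_from_headers)` followed by `wget -O "$NAME" "$URL"`, with `$URL` holding the final download link. If the header is there, `wget --content-disposition "$URL"` or `curl -OJ "$URL"` should pick the same name without any parsing.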


2014-03-03 10:24:01 Jon

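Thanks for the feedback! Are you saying there is no other way than to decipher the JavaScript? And that even that may not be possible? – Infinite

To get the final download URL, you would need a CLI browser that can execute the JavaScript on the page. A quick search turned up http://phantomjs.org/, but I think you could also (in theory) use the WebKit engine. From a quick analysis of the page, it looks like the download link is obtained from an AJAX request. – Jon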
