2014-03-27 72 views
0

Extracting data between two strings

I have the JSON file below, and I want to extract the data between "transcript" and "explain".

"1010320": {
    "transcript": [
        "1012220",
        "to build. so three is not correct."
    ],
    "explain": "Describing&Interpreting"
},
"1019660": {
    "transcript": [
        "1031920",
        "The moment disturbance comes, if this control strategy is to be implemented properly, the moment disturbance comes, it is picked up immediately, and corrective action done immediately."
    ],
    "explain": "Describing&Interpreting"
},
"1041600": {
    "transcript": [
        "1044860",
        "this is also not correct because it will take some time."
    ],
    "explain": "Describing&Interpreting"
},
"1053100": {
    "transcript": [
        "1073800",
    ],
    "explain": "Describing&Interpreting"
},
"2082920": {
    "transcript": [
        "2089000",
        "45 minutes i.e., whereas this taken around 15seconds or something. Is that ok?"
    ],
    "explain": "Describing&Interpreting"
},

I want to sort out the strings from the numbers.

The output should be:

"to build. so three is not correct." 

"The moment disturbance comes, if this control strategy is to be implemented properly, the moment disturbance comes, it is picked up immediately, and corrective action done immediately." 

"this is also not correct because it will take some time." 

"45 minutes i.e., whereas this taken around 15seconds or something. Is that ok?" 

Is this possible?

+4

You seem to be looking for a JSON parser. – devnull

Answers

0
sed -n -e '/",[[:blank:]]*$/,/^[[:blank:]]*],/ { 
    /^[[:blank:]]*".*"[[:blank:]]*$/ { 
     G;p 
     } 
    }' YourFile 

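Based on your sample structure, this takes each block that starts at a line ending with `",` and ends at a line starting with `],`, and prints only the lines that consist solely of a quoted string. I only added allowance for a few whitespace characters (`[:blank:]` matches spaces and tabs).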

+0

Thanks.. but it doesn't work for me. I don't get any output. – user1862399

+0

Try with the option '-r' (maybe '-e' is enough) and '--posix'. It works on my AIX/KSH/non-GNU sed, and your system most likely has GNU sed. – NeronLeVelu

+0

Done. 'sort filename.json | uniq -uc' 'cat filename.json | tr -d'[]''{},''[0-9]'' – user1862399

0

This might work for you (GNU sed):

sed -n '/^\s*"transcript": \[/,/^\s*\],/{/^\s*"[^"]*"\s*$/p}' file 

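It uses sed's grep-like mode (the `-n` option suppresses automatic printing) and prints only the lines within a transcript clause that begin and end with double quotes.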
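For readers who want to see the range-then-filter logic spelled out, here is a rough Python sketch of the same idea. It is a line-based port and not a JSON parser, so it depends on the file being laid out one item per line as in the question. The names `transcript_lines`, `START`, `END`, `QUOTED` and the inline `sample` are made up for illustration, and unlike sed it strips the surrounding whitespace from each printed line.

```python
import re

# Patterns mirroring the sed address range and the inner filter.
START = re.compile(r'^\s*"transcript": \[')   # opens a transcript block
END = re.compile(r'^\s*\],')                  # closes the block
QUOTED = re.compile(r'^\s*"[^"]*"\s*$')       # line holding only a quoted string


def transcript_lines(lines):
    """Yield lines inside a transcript block that hold only a quoted string."""
    inside = False
    for line in lines:
        line = line.rstrip("\n")
        if not inside:
            if START.search(line):
                inside = True
            continue
        if END.search(line):
            inside = False
        elif QUOTED.search(line):
            yield line.strip()


# Small inline sample (an assumption, shaped like the question's data).
sample = [
    '"1010320": {',
    '    "transcript": [',
    '     "1012220", ',
    '     "to build. so three is not correct." ',
    '    ], ',
    '    "explain": "Describing&Interpreting" ',
    '}, ',
]

for text in transcript_lines(sample):
    print(text)
```

The timestamp line is skipped because it ends with a comma, which is the same reason the sed filter drops it. To run this on a real file, pass an open file object instead of `sample`.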

+0

Oh, this works perfectly fine.. thanks a lot.. :-) – user1862399

+0

Hi, can we do this in Python? – user1862399
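One possible way to do this in Python is with the standard `json` module, which is also what the JSON-parser comment on the question points towards. This is only a sketch under assumptions: it assumes the whole file is a single valid JSON object (the fragment in the question has no enclosing braces and contains a trailing comma in one entry, both of which `json` would reject), and it assumes the unwanted entries are the purely numeric strings. The function name `spoken_lines` and the inline `sample` are hypothetical.

```python
import json


def spoken_lines(data):
    """Return transcript entries that are not purely numeric timestamps."""
    result = []
    for entry in data.values():
        for item in entry.get("transcript", []):
            if not item.isdigit():
                result.append(item)
    return result


# Inline sample (an assumption): a cleaned-up, valid subset of the question's data.
sample = json.loads("""
{
  "1010320": {
    "transcript": ["1012220", "to build. so three is not correct."],
    "explain": "Describing&Interpreting"
  },
  "1053100": {
    "transcript": ["1073800"],
    "explain": "Describing&Interpreting"
  },
  "1041600": {
    "transcript": ["1044860", "this is also not correct because it will take some time."],
    "explain": "Describing&Interpreting"
  }
}
""")

for line in spoken_lines(sample):
    print(line)
```

For a file on disk, `json.load` on an open file object would replace the inline `json.loads` call. Entries keep their file order because Python dictionaries preserve insertion order (3.7 and later).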
