2013-07-09 50 views
0

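Regex: getting the URL and title from a JSON string

I have a lot of files saved by the Firefox Session Manager, named *.session, and I want to export the URLs and titles from them. I wrote this regular expression: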

(?<=entries":\[{"url":"(?<link>.*?(?="))","title":"(?<content>.*?)(?=",")) 

But it doesn't seem to work: it matches far too much.

The session file:

[SessionManager v2] 
name=jjjjjjjjjjjjjjjjjj 
timestamp=1368030038170 
autosave=false count=1/49 screensize=1366x768 
{"windows":[{"tabs":[{"entries":[{"url":"http://blog.csdn.net/gisfarmer/article/details/4135975?1357376310","title":"圖像相似度算法的C#實現及測評 - 老駱駝空間站 - 博客頻道 - CSDN.NET","ID":1673113085,"docshellID":36,"referrer":"http://blog.csdn.net/gisfarmer/article/details/4135975","docIdentifier":80,"children":[{"url":"about:blank","ID":1673113086,"docshellID":34,"docIdentifier":81},{"url":"about:blank","ID":1673113087,"docshellID":168,"docIdentifier":82},{"url":"about:blank","ID":1673113088,"docshellID":55,"docIdentifier":83},{"url":"about:blank","ID":1673113089,"docshellID":37,"owner_b64":"CbflmEkNQj+opi5sTsh3UAAAAAAAAAAAwAAAAAAAAEYB3pRy0IA0EdOTmQAQS6D9QDlf4EV9GErbo/2vmMihrxEAAAAC/////wAAAFABAAAAQWh0dHA6Ly9ibG9nLmNzZG4ubmV0L2dpc2Zhcm1lci9hcnRpY2xlL2RldGFpbHMvNDEzNTk3NT8xMzU3Mzc2MzEwAAAAAAAAAAQAAAAHAAAADQAAAAf/////AAAAB/////8AAAAHAAAADQAAABQAAAAtAAAAFAAAACIAAAAUAAAAGwAAAC8AAAAHAAAAL/////8AAAAA/////wAAADcAAAAKAAAAFP////8BAAAAAAAAAAAAAQAAAAAAAA==","docIdentifier":84},{"url":"about:blank","ID":1673113090,"docshellID":31,"owner_b64":"CbflmEkNQj+opi5sTsh3UAAAAAAAAAAAwAAAAAAAAEYB3pRy0IA0EdOTmQAQS6D9QDlf4EV9GErbo/2vmMihrxEAAAAC/////wAAAFABAAAAQWh0dHA6Ly9ibG9nLmNzZG4ubmV0L2dpc2Zhcm1lci9hcnRpY2xlL2RldGFpbHMvNDEzNTk3NT8xMzU3Mzc2MzEwAAAAAAAAAAQAAAAHAAAADQAAAAf/////AAAAB/////8AAAAHAAAADQAAABQAAAAtAAAAFAAAACIAAAAUAAAAGwAAAC8AAAAHAAAAL/////8AAAAA/////wAAADcAAAAKAAAAFP////8BAAAAAAAAAAAAAQAAAAAAAA==","docIdentifier":85},{"url":"about:blank","ID":1673113091,"docshellID":63,"owner_b64":"CbflmEkNQj+opi5sTsh3UAAAAAAAAAAAwAAAAAAAAEYB3pRy0IA0EdOTmQAQS6D9QDlf4EV9GErbo/2vmMihrxEAAAAC/////wAAAFABAAAAQWh0dHA6Ly9ibG9nLmNzZG4ubmV0L2dpc2Zhcm1lci9hcnRpY2xlL2RldGFpbHMvNDEzNTk3NT8xMzU3Mzc2MzEwAAAAAAAAAAQAAAAHAAAADQAAAAf/////AAAAB/////8AAAAHAAAADQAAABQAAAAtAAAAFAAAACIAAAAUAAAAGwAAAC8AAAAHAAAAL/////8AAAAA/////wAAADcAAAAKAAAAFP////8BAAAAAAAAAAAAAQAAAAAAAA==","docIdentifier":86},{"url":"about:blank","ID":1673113092,"docshellID":22,"owner_b64":"CbflmEkNQj+opi5sTsh3UAAAAAAAAAAAwAAAAAAAAEYB3pRy0IA0EdOTmQAQS6D9QDlf4EV9GErbo/
2vmMihrxEAAAAC/////wAAAFABAAAAQWh0dHA6Ly9ibG9nLmNzZG4ubmV0L2dpc2Zhcm1lci9hcnRpY2xlL2RldGFpbHMvNDEzNTk3NT8xMzU3Mzc2MzEwAAAAAAAAAAQAAAAHAAAADQAAAAf/////AAAAB/////8AAAAHAAAADQAAABQAAAAtAAAAFAAAACIAAAAUAAAAGwAAAC8AAAAHAAAAL/////8AAAAA/////wAAADcAAAAKAAAAFP////8BAAAAAAAAAAAAAQAAAAAAAA==","docIdentifier":87},{"url":"about:blank","ID":1673113093,"docshellID":118,"owner_b64":"CbflmEkNQj+opi5sTsh3UAAAAAAAAAAAwAAAAAAAAEYB3pRy0IA0EdOTmQAQS6D9QDlf4EV9GErbo/2vmMihrxEAAAAC/////wAAAFABAAAAQWh0dHA6Ly9ibG9nLmNzZG4ubmV0L2dpc2Zhcm1lci9hcnRpY2xlL2RldGFpbHMvNDEzNTk3NT8xMzU3Mzc2MzEwAAAAAAAAAAQAAAAHAAAADQAAAAf/////AAAAB/////8AAAAHAAAADQAAABQAAAAtAAAAFAAAACIAAAAUAAAAGwAAAC8AAAAHAAAAL/////8AAAAA/////wAAADcAAAAKAAAAFP////8BAAAAAAAAAAAAAQAAAAAAAA==","docIdentifier":88},{"url":"about:blank","ID":1673113094,"docshellID":59,"owner_b64":"CbflmEkNQj+opi5sTsh3UAAAAAAAAAAAwAAAAAAAAEYB3pRy0IA0EdOTmQAQS6D9QDlf4EV9GErbo/2vmMihrxEAAAAC/////wAAAFABAAAAQWh0dHA6Ly9ibG9nLmNzZG4ubmV0L2dpc2Zhcm1lci9hcnRpY2xlL2RldGFpbHMvNDEzNTk3NT8xMzU3Mzc2MzEwAAAAAAAAAAQAAAAHAAAADQAAAAf/////AAAAB/////8AAAAHAAAADQAAABQAAAAtAAAAFAAAACIAAAAUAAAAGwAAAC8AAAAHAAAAL/////8AAAAA/////wAAADcAAAAKAAAAFP////8BAAAAAAAAAAAAAQAAAAAAAA==","docIdentifier":89},{"url":"about:blank","ID":1673113095,"docshellID":137,"owner_b64":"CbflmEkNQj+opi5sTsh3UAAAAAAAAAAAwAAAAAAAAEYB3pRy0IA0EdOTmQAQS6D9QDlf4EV9GErbo/2vmMihrxEAAAAC/////wAAAFABAAAAQWh0dHA6Ly9ibG9nLmNzZG4ubmV0L2dpc2Zhcm1lci9hcnRpY2xlL2RldGFpbHMvNDEzNTk3NT8xMzU3Mzc2MzEwAAAAAAAAAAQAAAAHAAAADQAAAAf/////AAAAB/////8AAAAHAAAADQAAABQAAAAtAAAAFAAAACIAAAAUAAAAGwAAAC8AAAAHAAAAL/////8AAAAA/////wAAADcAAAAKAAAAFP////8BAAAAAAAAAAAAAQAAAAAAAA==","docIdentifier":90},{"url":"about:blank","ID":1673113096,"docshellID":254,"owner_b64":"CbflmEkNQj+opi5sTsh3UAAAAAAAAAAAwAAAAAAAAEYB3pRy0IA0EdOTmQAQS6D9QDlf4EV9GErbo/2vmMihrxEAAAAC/////wAAAFABAAAAQW 

And the result: (screenshot)

Can anybody help?

{"entries":\[{"url":"(.*)","title":"([^"]+)

Answers

3

Matt Bryant's approach seems to be the best. As for your regex problem, you can simply use:

"url":"(?<link>[^"]+)","title":"(?<content>[^"]+) 

or, to be safer (this also allows escaped quotes inside the values):

"url":"(?<link>(?>[^"]+|(?<=\\)")+)","title":"(?<content>(?>[^"]+|(?<=\\)")+) 

Thanks, it works perfectly!
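
For anyone who wants to try the patterns from this answer outside a .NET-style regex engine, here is a minimal Python sketch. It assumes Python's `re` module, which spells named groups `(?P<name>...)`. The second pattern is not the answer's exact atomic-group version but a common equivalent for skipping backslash-escaped characters. The sample strings and the `extract_pairs` helper are made up for illustration.

```python
import re

# Simple pattern from the answer, with named groups in Python syntax.
ENTRY_RE = re.compile(r'"url":"(?P<link>[^"]+)","title":"(?P<content>[^"]+)')

# Assumption: a substitute for the "safer" pattern. Each value is a run of
# characters that are neither a quote nor a backslash, or a backslash
# followed by any character (for example \").
SAFE_RE = re.compile(
    r'"url":"(?P<link>(?:[^"\\]|\\.)+)","title":"(?P<content>(?:[^"\\]|\\.)+)"'
)

def extract_pairs(text, pattern=ENTRY_RE):
    """Return (url, title) for every "url" directly followed by a "title"."""
    return [(m.group("link"), m.group("content")) for m in pattern.finditer(text)]

# Hypothetical, shortened sample shaped like the session data in the question.
sample = (
    '{"windows":[{"tabs":[{"entries":[{"url":"http://example.com/a",'
    '"title":"Page A","ID":1,"children":[{"url":"about:blank","ID":2}]}]}]}]}'
)

# Hypothetical entry whose title contains escaped quotes.
escaped = r'{"url":"http://example.com/b","title":"say \"hi\"","ID":3}'

for url, title in extract_pairs(sample):
    print(url, "|", title)
for url, title in extract_pairs(escaped, SAFE_RE):
    print(url, "|", title)
```

The `about:blank` children are skipped because their `"url"` is not immediately followed by a `"title"`, which both patterns require.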

7

Why not parse the JSON and loop over it, instead of using a regex?

Because parsing the JSON would extract too many objects, and I don't know what those objects mean.
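
Parsing the JSON does not require understanding every object in it: the code only has to read the keys it cares about and can ignore the rest. Below is a rough Python sketch of that idea. It assumes the layout shown in the question (a few plain-text header lines, then the whole JSON document on a single line, nested as `windows` > `tabs` > `entries`). The helper names and the sample data are hypothetical.

```python
import json

def load_session_json(text):
    """Parse the first line that starts with '{' as JSON.

    Assumption: the header lines never start with '{' and the JSON
    document is not split across several lines.
    """
    for line in text.splitlines():
        if line.lstrip().startswith("{"):
            return json.loads(line)
    raise ValueError("no JSON line found")

def iter_entries(session):
    """Yield (url, title) for each history entry of each tab."""
    for window in session.get("windows", []):
        for tab in window.get("tabs", []):
            for entry in tab.get("entries", []):
                # Only the two keys of interest are read; everything else
                # (ID, docshellID, children, ...) is simply ignored.
                yield entry.get("url"), entry.get("title", "")

# Hypothetical, shortened sample in the same shape as the question's file.
sample_file = "\n".join([
    "[SessionManager v2]",
    "name=example",
    "timestamp=1368030038170",
    "autosave=false count=1/1 screensize=1366x768",
    '{"windows":[{"tabs":[{"entries":[{"url":"http://example.com/a",'
    '"title":"Page A","ID":1,"children":[{"url":"about:blank","ID":2}]}]}]}]}',
])

for url, title in iter_entries(load_session_json(sample_file)):
    print(url, "|", title)
```

Unlike the regex, this also copes with escaped quotes and with keys appearing in a different order, since `json.loads` handles both.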