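Searching for a string using Python

I would like to know how to search for specific strings using Python. I have opened a markdown file that contains a table like the one below: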

| --------- | -------- | --------- | 
|**propped**| - | -a flashlight in one hand and a large leather-bound book (A History of Magic by Bathilda Bagshot) propped open against the pillow. | 
|**Pointless**| - | -「Witch Burning in the Fourteenth Century Was Completely Pointless — discuss.」| 
|**unscrewed**| - | -Slowly and very carefully he unscrewed the ink bottle, dipped his quill into it, and began to write,| 
|**downtrodden**| - | -For years, Aunt Petunia and Uncle Vernon had hoped that if they kept Harry as downtrodden as possible, they would be able to squash the magic out of him.| 
|**sheets,**| - | -As long as he didn’t leave spots of ink on the sheets, the Dursleys need never know that he was studying magic by night.| 
|**flinch**| - | -But he hoped she’d be back soon — she was the only living creature in this house who didn’t flinch at the sight of him.|

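I have to get the strings decorated with |** from each row, such as:

propped
Pointless
unscrewed
downtrodden
sheets
flinch

I tried using a regular expression but failed to extract them.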

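Source

2017-02-26 Marco Mei

The original markdown file content is as follows: --------- | -------- | --------- | | **propped** | - | - a flashlight in one hand and a large leather-bound book (A History of Magic by Bathilda Bagshot) propped open against the pillow. | | **Pointless** | - | - "Witch Burning in the Fourteenth Century Was Completely Pointless - discuss." –

There are [online regex testers](https://regex101.com/) that use Python-flavored regular expressions; they are very useful for fine-tuning patterns. – wwii

Are the '**' characters part of the text you are searching? – wwii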

Try using the regular expression below:

(?<=\|)(?!\s).*?(?!\s)(?=\|)

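See the demo/explanation.

Source

2017-02-26 16:54:36 m87

Thank you very much. And the website you shared is very useful. –

Sorry, I'm new here... –

I thought I could accept more than one answer... sorry –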
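A minimal sketch of how this pattern might be applied with Python's re module, assuming the table is held in a string. The variable names and the shortened sample rows are illustrative and not from the original post. Because the lookarounds anchor only on the pipes, each match still carries the surrounding asterisks, so they are stripped afterwards.

```python
import re

# Shortened sample rows based on the question's table (illustrative only).
table = (
    "| --------- | -------- | --------- | \n"
    "|**propped**| - | -a flashlight in one hand. | \n"
    "|**sheets,**| - | -spots of ink on the sheets, the Dursleys need never know.| \n"
    "|**flinch**| - | -who didn't flinch at the sight of him.|\n"
)

pattern = re.compile(r"(?<=\|)(?!\s).*?(?!\s)(?=\|)")

# Each match is text between two pipes that does not start with whitespace.
raw_matches = pattern.findall(table)

# Drop the markdown bold markers; a comma inside the markers is kept.
words = [m.strip("*") for m in raw_matches]
print(words)
```

Note that with this pattern the comma in the "sheets," cell survives, so it would need a further `rstrip(",")` if it is unwanted.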

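If the asterisks are part of the text you are searching and you don't want the comma after sheets, the pattern would be a pipe followed by two asterisks, then anything that is not an asterisk or a comma.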

\|\*{2}([^*,]+)

If you can live with the comma, or if there may be commas that you do want to capture:

\|\*{2}([^*]+)

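Use either pattern with re.findall or re.finditer to capture the text you want.

If you use the second pattern, you will need to iterate over the groups and strip the unwanted commas.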
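The two patterns might be used roughly as follows. This is a sketch on two made-up rows (the `rows` string and the variable names are assumptions for illustration), showing re.findall with the first pattern and re.finditer plus comma stripping with the second.

```python
import re

# Two illustrative rows; the second has a comma inside the bold markers.
rows = "|**propped**| - | -text |\n|**sheets,**| - | -more text |\n"

# First pattern: the group stops at the first asterisk or comma.
no_comma = re.findall(r"\|\*{2}([^*,]+)", rows)

# Second pattern: the group stops only at an asterisk,
# so trailing commas are stripped from each captured group afterwards.
with_comma = [
    m.group(1).rstrip(",")
    for m in re.finditer(r"\|\*{2}([^*]+)", rows)
]

print(no_comma)
print(with_comma)
```

Both lists come out the same for this sample; the second approach only differs when a comma in the middle of the bold text should be preserved.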

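Source

2017-02-26 17:42:32 wwii

Yes, of course, I would be glad to do that, but I don't know how to accept it, because this is my first time posting a question here. Would you mind telling me how? –

Thank you, wwii. –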

import re

# Raw string so the backslashes reach the regex engine unchanged.
y = r'(?<=\|\*{2}).+?(?=,{0,1}\*{2}\|)'
reg = re.compile(y)
a = '| --------- | -------- | --------- | |**propped**| - | -a flashlight in one hand and a large leather-bound book (A History of Magic by Bathilda Bagshot) propped open against the pillow. | |**Pointless**| - | -「Witch Burning in the Fourteenth Century Was Completely Pointless — discuss.」|'
print(reg.findall(a))  # ['propped', 'Pointless']

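Explanation of the regular expression (y) above:

(?<=\|\*{2}) - a positive lookbehind: it matches only if the current position in the string is preceded by a match for \|\*{2}, i.e. |**.

.+? - matches any character (except a newline) one or more times. Adding ? after the quantifier makes the match non-greedy, or minimal; as few characters as possible will be matched.

(?=,{0,1}\*{2}\|) - a positive lookahead: ?= matches only if the text that follows matches the given expression, without consuming it. Here the expression is ,{0,1}\*{2}\|, which means zero or one comma, then two * and a closing |.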
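To illustrate the explanation, here is a small sketch on a shortened version of the "sheets," row (the sample text and variable names are illustrative). It shows that the lookarounds are zero-width, so the match span covers only the word itself, and compares the result with a capture-group variant that is offered purely as a hypothetical alternative, not as part of the answer above.

```python
import re

row = "|**sheets,**| - | -spots of ink on the sheets, the Dursleys need never know.|"

lookaround = re.compile(r"(?<=\|\*{2}).+?(?=,{0,1}\*{2}\|)")

# The lookbehind and lookahead are checked but not consumed,
# so the span covers only the word, without |** or ,**|.
m = lookaround.search(row)
print(m.group(0), m.span())

# Hypothetical alternative: consume the delimiters and capture the word.
grouped = re.compile(r"\|\*{2}(.+?),?\*{2}\|")
print(grouped.findall(row))
```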

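Source

2017-02-26 17:52:20

Thanks, Dhruv Baveja. –

@MarcoMei If the solution works for you, could you please upvote it? Thanks –

Hi, Dhruv Baveja. Thank you for your kind and useful help; it works very well. Of course I would like to upvote it, but when I try, it says "Thanks for the feedback! Votes cast by those with less than 15 reputation are recorded, but do not change the publicly displayed post score." Is there anything else I can do? –

I have written the program below to produce the desired output. I created a file named string_test into which I copied all the original strings: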

import re

# Match a leading |** and capture everything up to the next asterisk or comma.
a = re.compile(r"^\|\*\*([^*,]+)")
with open("string_test", "r") as file1:
    for i in file1:
        match = a.search(i)
        if match:
            print(match.group(1))
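If the file contents are already in memory as a single string, the same anchored pattern could be applied in one call with the re.MULTILINE flag instead of looping over lines. This is only a sketch: the `text` value below is an illustrative stand-in for the contents of string_test.

```python
import re

# Illustrative stand-in for the contents of the string_test file.
text = (
    "| --------- | -------- | --------- | \n"
    "|**propped**| - | -a flashlight in one hand. | \n"
    "|**sheets,**| - | -spots of ink on the sheets. | \n"
)

# re.MULTILINE makes ^ match at the start of every line, not just the first.
words = re.findall(r"^\|\*\*([^*,]+)", text, flags=re.MULTILINE)
print(words)
```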

Source

2017-02-27 09:10:13
