2015-06-18 56 views

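Regular expression doesn't work for multiple pattern occurrences

I want to capture every occurrence of the string that starts with "genome_" and ends just before ",(", and replace it with a specific string, say "XXX".

In the following text: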

(ID_Bxylanisolvens_NLAE-ZL-C182_genome_orf00003 ____ Bxylanisolvens_NLAE -.._ 843_unknown ___ 1278-2120_1 _ ^^ neighbours_ID_Bxylanisolvens_NLAE-ZL-C182_genome_orf00002_1__ID_Bxylanisolvens_NLAE-ZL-C182_genome_orf00004_1__neighbour_genes_Bxylanisolvens_NLAE -.._ Bxylanisolvens_NLAE- ..:0.00000230914009336068,((ID_Bxylanisolvens_NLAE-ZL-G421_genome_orf00003 ____ Bxylanisolvens_NLAE -.._ 843_unknown ___ 1315-2157_1 _ ^^ neighbours_ID_Bxylanisolvens_NLAE-ZL-G421_genome_orf00002_1__ID_Bxylanisolvens_NLAE-ZL-G421_genome_orf00004_1__neighbour_genes_Bxylanisolvens_NLAE -.._ Bxylanisolvens_NLAE- ..:0.00000230914009336068,ID_Bxylanisolvens_NLAE-ZL-C339_genome_orf00003 ____ Bxylanisolvens_NLAE -.._ 843_unknown ___ 1084- 1926_1 _ ^^ neighbours_ID_Bxylanisolvens_NLAE-ZL-C339_genome_orf00002_1__ID_Bxylanisolvens_NLAE-ZL-C339_genome_orf00004_1__neighbour_genes_Bxylanisolvens_NLAE -.._ Bxylanisolvens_NLAE- ..:0.00000230914009336068)28:0.00000230914009336068,(

Desired result:

(ID_Bxylanisolvens_NLAE-ZL-C182_XXX,((ID_Bxylanisolvens_NLAE-ZL-G421_XXX,(

Desired result: (ID_Bxylanisolvens_NLAE-ZL-C182_XXX,((ID_Bxylanisolvens_NLAE-ZL-G421_XXX,( – ap88

What flavor of regular expression are you using (PCRE, Python, JavaScript)?

What have you tried? – Jota

I'm using Python's re module. I have already tried some patterns: '_genome_.*\,\(' and '_genome_.*?\,\(' – ap88

Answers

1

Based on your sample data and desired output, lookarounds (a positive lookbehind plus a positive lookahead) should help:

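(?<=ID_Bxylanisolvens_NLAE-ZL-[A-Z]\d{3}_)(genome.*?)(?=,\()
  • (?<=ID_Bxylanisolvens_NLAE-ZL-[A-Z]\d{3}_) looks behind and checks for a specific character sequence without consuming it. It may need adjusting depending on how variable the real data is. Note that it is case-sensitive, so "ZL" must match the case used in the data.
  • (genome.*?) captures the bit to replace; the question mark makes the match non-greedy, so it stops at the first place the lookahead succeeds.
  • (?=,\() looks ahead for the character combination that delimits the end of the part to be replaced, again without consuming it.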
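Since the question mentions Python's re module, here is a minimal sketch of how the pattern could be applied with re.sub. The `text` value is a shortened, hypothetical stand-in for the sample data in the question (same prefixes and delimiters, but abbreviated middles), and the variable names are only illustrative.

```python
import re

# Shortened, hypothetical stand-in for the sample text in the question.
text = (
    "(ID_Bxylanisolvens_NLAE-ZL-C182_genome_orf00003_neighbours:0.0000023,"
    "((ID_Bxylanisolvens_NLAE-ZL-G421_genome_orf00003_neighbours:0.0000023,"
    "ID_Bxylanisolvens_NLAE-ZL-C339_genome_orf00003_neighbours:0.0000023)28:0.0000023,("
)

# The lookbehind has a fixed width, which Python's re module requires.
# The lookahead stops the non-greedy match right before the next ",(".
pattern = re.compile(r"(?<=ID_Bxylanisolvens_NLAE-ZL-[A-Z]\d{3}_)genome.*?(?=,\()")

# re.sub replaces every non-overlapping match, not just the first one.
result = pattern.sub("XXX", text)
print(result)
```

Because both lookarounds are zero-width, the "_" before "genome" and the ",(" after the match stay in the output, and only the part in between is replaced. The second match runs through the C339 entry as well, because no ",(" occurs before it; that agrees with the desired result in the question.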

See it in action: RegEx101
Please leave a comment if you need further details or adjustments.