2014-04-01

Alternative to: cut -d <string>?

When I type ls, I get:

aedes_aegypti_upstream_dremeready_all_simpleMasked_random.fasta 
anopheles_albimanus_upstream_dremeready_all_simpleMasked_random.fasta 
anopheles_arabiensis_upstream_dremeready_all_simpleMasked_random.fasta 
anopheles_stephensi_upstream_dremeready_all_simpleMasked_random.fasta 
culex_quinquefasciatus_upstream_dremeready_all_simpleMasked_random.fasta 

I want to pipe this into cut (or use some alternative) so that I get only:

aedes_aegypti 
anopheles_albimanus 
anopheles_arabiensis 
anopheles_stephensi 
culex_quinquefasciatus 

If cut accepted a string (multiple characters) as its delimiter, I could use:

cut -d "_upstream_" -f1 

But this is not allowed, because cut accepts only a single character as the delimiter.

Answers

awk does allow a string as the delimiter:

$ awk -F"_upstream_" '{print $1}' file 
aedes_aegypti 
anopheles_albimanus 
anopheles_arabiensis 
anopheles_stephensi 
culex_quinquefasciatus 
drosophila_melanogaster 

Note that for the given input, you can also use cut with _ as the delimiter and print the first two fields:

$ cut -d'_' -f-2 file 
aedes_aegypti 
anopheles_albimanus 
anopheles_arabiensis 
anopheles_stephensi 
culex_quinquefasciatus 
drosophila_melanogaster 
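
The cut approach works here only because every name has exactly two underscore-separated parts before _upstream_. A minimal sketch of where it would break, using a hypothetical three-part name (made up for illustration, not from the original listing) and comparing it with the awk version:

```shell
# Hypothetical file name with three parts before "_upstream_" (an assumption for illustration).
name='homo_sapiens_neanderthalensis_upstream_dremeready_all_simpleMasked_random.fasta'

# cut keeps only the first two "_"-separated fields, so the third part is lost.
cut_result=$(printf '%s\n' "$name" | cut -d'_' -f-2)

# awk splits on the whole string "_upstream_", so the full prefix survives.
awk_result=$(printf '%s\n' "$name" | awk -F'_upstream_' '{print $1}')

echo "cut: $cut_result"
echo "awk: $awk_result"
```

If the names might ever have a different number of parts, splitting on the full _upstream_ string is the safer choice.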

sed and grep can also do it. For example, this grep uses a look-ahead to print everything from the start of the line until _upstream is found:

$ grep -Po '^\w*(?=_upstream)' file 
aedes_aegypti 
anopheles_albimanus 
anopheles_arabiensis 
anopheles_stephensi 
culex_quinquefasciatus 
drosophila_melanogaster 
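
This answer mentions sed but shows only grep, and grep -P (Perl-style regular expressions) is a GNU grep feature that may be missing elsewhere. As a rough sketch of a sed equivalent that prints to standard output and leaves its input untouched (the two sample names are taken from the question; the variable names are arbitrary):

```shell
# Two sample names from the listing in the question.
names='aedes_aegypti_upstream_dremeready_all_simpleMasked_random.fasta
culex_quinquefasciatus_upstream_dremeready_all_simpleMasked_random.fasta'

# Delete everything from "_upstream_" to the end of each line.
prefixes=$(printf '%s\n' "$names" | sed 's/_upstream_.*//')

printf '%s\n' "$prefixes"
```

In practice the same sed command could sit at the end of a pipe, for example after ls.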

That was such a quick answer, @fedoriqui, +1. SO is so powerful...

Glad to read that, happy it worked for you :) – fedorqui

If you only want the first field, you can do this in pure bash:

ls | while IFS= read -r line; do echo "${line%%_upstream_*}"; done 
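
The expansion ${line%%_upstream_*} removes the longest suffix matching the pattern _upstream_*, whereas a single % removes the shortest. A small sketch of the difference, using a hypothetical name that contains _upstream_ twice (made up for illustration; none of the original names do):

```shell
# Hypothetical name containing "_upstream_" twice (an assumption for illustration).
name='aedes_aegypti_upstream_set1_upstream_random.fasta'

longest="${name%%_upstream_*}"   # strip the longest matching suffix
shortest="${name%_upstream_*}"   # strip the shortest matching suffix

echo "$longest"
echo "$shortest"
```

For the file names in the question both forms give the same result, since _upstream_ appears only once in each.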

So many alternative approaches, and I learned a little from each one. Thanks!

@hello_there_andy No problem, that's what it's all about.

You can also use sed:

sed -i.bak 's/_upstream.*//' file 

Result:

aedes_aegypti 
anopheles_albimanus 
anopheles_arabiensis 
anopheles_stephensi 
culex_quinquefasciatus 
drosophila_melanogaster 

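Note: -i.bak edits file in place, so the result above is the new content of file, and a backup of the original is kept as file.bak.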
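
Because -i rewrites the file rather than printing anything, it can help to see what ends up where. A minimal sketch in a throwaway directory (the mktemp directory and the variable names are assumptions for illustration):

```shell
# Work in a throwaway directory so nothing real is modified.
tmpdir=$(mktemp -d)
printf '%s\n' 'aedes_aegypti_upstream_dremeready_all_simpleMasked_random.fasta' > "$tmpdir/file"

# Edit in place; the suffix attached to -i names the backup copy.
sed -i.bak 's/_upstream.*//' "$tmpdir/file"

edited=$(cat "$tmpdir/file")       # now holds the stripped name
backup=$(cat "$tmpdir/file.bak")   # still holds the original line

echo "file:     $edited"
echo "file.bak: $backup"

rm -r "$tmpdir"
```

Dropping -i.bak makes sed print the result to standard output and leave the file unchanged, which is usually what you want when the input comes from a pipe.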

Similar to @Tom Fenech's answer, using bash parameter expansion/substring removal, but with a for loop:

$ ls 
aedes_aegypti_upstream_dremeready_all_simpleMasked_random.fasta 
anopheles_albimanus_upstream_dremeready_all_simpleMasked_random.fasta 
anopheles_arabiensis_upstream_dremeready_all_simpleMasked_random.fasta 
anopheles_stephensi_upstream_dremeready_all_simpleMasked_random.fasta 
culex_quinquefasciatus_upstream_dremeready_all_simpleMasked_random.fasta 
drosophila_melanogaster_upstream_dremeready_all_simpleMasked_random.fasta 

$ for file in *; do 
> echo "${file%%_upstream_*}" 
> done 
aedes_aegypti 
anopheles_albimanus 
anopheles_arabiensis 
anopheles_stephensi 
culex_quinquefasciatus 
drosophila_melanogaster