2012-03-27 52 views
2

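Handling long edit lists in XMLStarlet

The XMLStarlet versions found in current Linux distributions are limited to 128 operations per xmlstarlet ed invocation, and all versions are limited by the operating system's maximum command-line length. How can this be worked around?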
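
For reference, the operating-system limit can be inspected with getconf ARG_MAX. The snippet below is only a rough sketch: the edit shown is a hypothetical example, and the estimate ignores the environment and any per-argument overhead the kernel adds, so the real ceiling is lower.

```shell
# Limit on the combined size of arguments and environment for one exec call.
arg_max=$(getconf ARG_MAX)

# Hypothetical edit: one subnode insertion, passed as eight separate arguments.
one_edit='-s /root -t elem -n line -v some-value'
# Each argument is stored with a trailing NUL; the seven spaces plus one extra
# byte stand in for those eight terminators.
bytes_per_edit=$(( ${#one_edit} + 1 ))

echo "ARG_MAX: ${arg_max} bytes"
echo "roughly $(( arg_max / bytes_per_edit )) edits of this size would fit in one invocation"
```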


Is this limit actually a problem for you in practice? – npostavs 2012-03-28 14:17:38


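@npostavs Yes. See my answer to http://stackoverflow.com/questions/9880808/shell-script-to-parse-csv-to-an-xml-query/9882015 for a case that needs to handle more than a handful of input lines. I have also run into this in commercial, production code (though that latter example was rewritten to do the relevant processing in XQuery rather than bash + xmlstarlet). – 2012-03-28 15:42:03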

Answers

3

The following breaks a long xmlstarlet edit list into a pipeline of shorter operations:

xmlstarlet_max_commands=100 # max per instance; see http://sourceforge.net/tracker/?func=detail&aid=3488240&group_id=66612&atid=515106
shopt -s extglob # enable +([0-9]) as an equivalent to the regex ^[[:digit:]]+$

# Usage: xmlstarlet_ed GLOBAL_COUNT [GLOBAL_ARG...] [LEN CMD_ARG...]...
# Reads XML on stdin, writes the edited XML to stdout. This is a recursive
# shell function (underscore), distinct from the real command "xmlstarlet ed".
xmlstarlet_ed() {
    declare -a global_parameters=()
    declare -a parameters=()
    declare -i num_commands=0
    declare -i global_parameters_remaining
    declare cmd_len # not -i: integer coercion would turn e.g. "-s" into 0 and defeat the check below

    global_parameters_remaining=$1; shift

    while ((global_parameters_remaining)); do
        global_parameters+=("$1"); shift
        ((global_parameters_remaining--))
    done

    while (($#)); do
        cmd_len=$1; shift
        if ! [[ $cmd_len = +([0-9]) ]]; then
            echo "ERROR: xmlstarlet_ed commands must be prefixed by run length" >&2
            return 1
        fi

        if ((num_commands < xmlstarlet_max_commands)); then
            parameters+=("${@:1:$cmd_len}")
            num_commands+=1
            shift "$cmd_len"
        else
            # batch is full: run it and pipe its output into a recursive call
            # that handles the current command and everything after it
            xmlstarlet ed "${global_parameters[@]}" "${parameters[@]}" \
                | xmlstarlet_ed "${#global_parameters[@]}" "${global_parameters[@]}" "$cmd_len" "$@"
            return # propagate the exit status of the pipeline
        fi
    done

    if ((${#parameters[@]} > 0)); then
        xmlstarlet ed "${global_parameters[@]}" "${parameters[@]}"
    else
        cat # no commands left: pass the input through unchanged
    fi
}
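
As a rough illustration of where the split points fall, the batching can be traced without xmlstarlet installed. The sketch below is not part of the function above: show_batches is a hypothetical helper that only prints the xmlstarlet ed command line each batch would become. Arguments are joined with spaces purely for display, so quoting is lost.

```shell
# Hypothetical dry-run helper: first argument is the maximum number of commands
# per batch, the rest is a length-prefixed command list as used by xmlstarlet_ed
# (without the global parameters). Prints one line per batch.
show_batches() {
    local max count batch len i
    max=$1; shift
    count=0
    batch=
    while [ "$#" -gt 0 ]; do
        len=$1; shift
        # start a new batch once the current one is full
        if [ "$count" -ge "$max" ]; then
            printf '%s\n' "xmlstarlet ed$batch"
            count=0
            batch=
        fi
        # move the next $len arguments into the current batch
        i=0
        while [ "$i" -lt "$len" ]; do
            batch="$batch $1"; shift
            i=$((i + 1))
        done
        count=$((count + 1))
    done
    if [ "$count" -gt 0 ]; then
        printf '%s\n' "xmlstarlet ed$batch"
    fi
}

# three two-argument commands, at most two commands per batch
show_batches 2  2 -d /a  2 -d /b  2 -d /c
```

With a limit of two, the first two deletions land in one batch and the third in a second batch; in the real function each batch is one stage of the pipeline.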

It can be invoked like this:

# first list passed is global parameters; first the count, then the values 
# pass only a 0 if no global parameters are desired 
global_parameters=(2 -N "xhtml=http://www.w3.org/1999/xhtml") 

# build up the parameter list as length/command pairs; the lengths are used 
# to determine the potential split points between subprocesses 
parameters=() 
while read -r; do # -r keeps backslashes in the input literal
    parameters+=(8 -s /xhtml:html/xhtml:body -t elem -n line -v "$REPLY") 
done 

# ...and actually invoke: 
xmlstarlet_ed "${global_parameters[@]}" "${parameters[@]}" \ 
<<<"<html xmlns='http://www.w3.org/1999/xhtml'><body/></html>" 
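
Counting the arguments by hand (the 8 above) is easy to get wrong once different edit operations are mixed. One possible way around that, shown here only as a sketch, is a small wrapper that prepends the count itself; add_cmd is a hypothetical name and the XPath expressions are made-up examples.

```shell
parameters=()

# Hypothetical helper: append one edit command to the list, prefixed by the
# number of arguments it consists of.
add_cmd() {
    parameters+=("$#" "$@")
}

add_cmd -d '//comment()'                           # stored with prefix 2
add_cmd -u '/root/@version' -v 2                   # stored with prefix 4
add_cmd -s /root -t elem -n line -v 'hello world'  # stored with prefix 8

# show the resulting list, one element per line
printf '%s\n' "${parameters[@]}"
```

The resulting array can then be passed to xmlstarlet_ed after the global parameters, exactly as in the invocation above.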

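+1. On first reading I did not notice the difference between 'xmlstarlet_ed' and 'xmlstarlet ed'. I feel a short note saying that 'xmlstarlet_ed' is a recursive function would improve readability. – 2014-08-05 14:13:55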