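Script in Unix to remove XML tags and content from a file

I basically need to remove the parties element (and everything between its tags) from a set of individual XML files, each named number.xml. I tried the following, but it doesn't quite produce everything I need: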
cat test.xml | sed "s;<parties>;\do_opentag ;" | sed "s;</parties>;\do_closetag ;" | awk 'BEGIN { doPrint = 1; } /do_opentag/ { doPrint = 0; print $0; } /do_closetag/ { doPrint = 1; } { if (doPrint) print $0; }' | grep -v 'do_opentag\|do_closetag'
<?xml version="1.0" encoding="UTF-8"?>
<patent-document xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" pid="58326519" doc-generation-date="2016-10-11">
<bibliographic-data>
<application-reference>
<pan>46422</pan>
</application-reference>
<publication-reference>
<publication-office>KR</publication-office>
<patent-publication-date>
<year>2016</year>
<month>10</month>
<day>11</day>
</patent-publication-date>
</publication-reference>
<parties>
<applicants>
<applicant sequence="1">
<name lang="EN"></name>
<address>
<location-of-work>KR</location-of-work>
</address>
</applicant>
</applicants>
</parties>
</bibliographic-data>
<vendor>Any</vendor>
<document-translation-date>2016-11-24</document-translation-date>
<invention-title lang="EN">Cell preservation container for liquid-based cell inspection</invention-title>
<abstract lang="EN">The present invention relates to a liquid for discharging liquid containing cells and cell may be a sampling which is simply eminent generated in </abstract>
<comment lang="EN"></comment>
</patent-document>
Thanks. Almost there. For some reason, I get a message saying "missing newline at end of file test.xml", and the closing `</patent-document>` tag is dropped. Is there any way to fix this? – Cinda
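Since the last line doesn't contain a terminating newline, sed never processes it. I've never seen this problem myself, but [the second answer here](http://unix.stackexchange.com/questions/31947/how-to-add-a-newline-to-the-end-of-a-file) seems reasonable: `echo >> test.xml; sed -e '/<parties>/,/<\/parties>/d' test.xml` – stevesliva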
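The fix discussed in the comments above could be wrapped up roughly as follows. This is a minimal sketch, assuming a POSIX shell with `mktemp` available. `strip_parties` is a hypothetical helper name, not something from the thread. It appends a newline only when the file's last byte is not already one, then deletes the line range from `<parties>` through `</parties>`. It also assumes the opening and closing tags sit on separate lines, as in the sample file.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption): strip_parties FILE
# Deletes every line from the one containing <parties> through the one
# containing </parties>, rewriting FILE with the result.
strip_parties() {
    file=$1
    tmp=$(mktemp) || return 1

    # Command substitution strips a trailing newline, so the result is
    # non-empty only when the last byte is something other than a newline.
    # In that case, terminate the last line so sed sees it as complete.
    if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
        echo >> "$file"
    fi

    # Write the filtered output to a temp file, then copy it back so the
    # original file keeps its permissions.
    if sed '/<parties>/,/<\/parties>/d' "$file" > "$tmp"; then
        cat "$tmp" > "$file"
    fi
    rm -f "$tmp"
}
```

To process a whole directory, something like `for f in *.xml; do strip_parties "$f"; done` should work. Unlike an unconditional `echo >> test.xml`, the check on the last byte avoids adding an extra blank line each time the script is rerun.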