2012-06-15 10 views
4

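Insert a string using a batch file

I want to use a batch file to insert a string in place of the blanks in specific columns. Say I have an input.txt like the one below: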

field1      field2           field3
AAAAA       BBBBB            CCCCC
DDDDD                        EEEEE
FFFFF
GGGGG       HHHHH

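I need to insert the string "NULL" into every empty field. Field 1 is guaranteed to be non-empty, while fields 2 and 3 will sometimes be empty. Also, the spacing between field1 and field2 is different from the spacing between field2 and field3.

output.txt: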

field1      field2           field3
AAAAA       BBBBB            CCCCC
DDDDD       NULL             EEEEE
FFFFF       NULL             NULL
GGGGG       HHHHH            NULL

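I need this as a batch file script. I tried writing the code (field 2 always starts 12 characters from the left, and field 3 always starts 29 characters from the left):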

@echo off 

set line= 
for /F in (input.txt)do 
if "!line:~12" equ " " 
write "NULL" >> (i am not sure whether this work) 

if "!line:~29" equ " " 
write "NULL" 

echo .>> output.txt 

Perhaps someone can correct my mistakes? Thanks!!

+0

What language is that? –

+0

It is a Windows script, DOS. – cheeseng

Answers

1

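As promised, here is a Python solution. This program works in Python 3.x or Python 2.7. If you are very new to programming, I suggest Python 3.x, as I think it is easier to learn. You can get Python for free here: http://python.org/download/

The latest version of Python is 3.2.3; I suggest you get that one.

Save the Python code in a file called add_null.py and run it with this command: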

python add_null.py input_file.txt output_file.txt 

The code, with lots of comments:

# import brings in "modules" which contain extra code we can use. 
# The "sys" module has useful system stuff, including the way we can get 
# command-line arguments. 
import sys 

# sys.argv is an array of command-line arguments. We expect 3 arguments: 
# the name of this program (which we don't care about), the input file 
# name, and the output file name. 
if len(sys.argv) != 3: 
    # If we didn't get the right number of arguments, print a message and exit. 
    print("Usage: python add_null.py <input_file> <output_file>") 
    sys.exit(1) 

# Unpack the arguments into variables. Use '_' for any argument we don't 
# care about. 
_, input_file, output_file = sys.argv 


# Define a function we will use later. It takes two arguments, a string 
# and a width. 
def s_padded(s, width): 
    if len(s) >= width: 
        # if it is already wide enough, return it unchanged
        return s
    # Not wide enough! Figure out how many spaces we need to pad it. 
    len_padding = width - len(s) 
    # Return string with spaces appended. Use the Python "string repetition" 
    # feature to repeat a single space, len_padding times. 
    return s + ' ' * len_padding 


# These are the column numbers we will use for splitting, plus a width. 
# Numbers put together like this, in parentheses and separated by commas, 
# are called "tuples" in Python. These tuples are: (low, high, width) 
# The low and high numbers will be used for ranges, where we do use the 
# low number but we stop just before the high number. So the first pair 
# will get column 0 through column 11, but will not actually get column 12. 
# We use 999 to mean "the end of the line"; if the line is too short, it will 
# not be an error. In Python "slicing", if the full slice can't be done, you 
# just get however much can be done. 
# 
# If you want to cut off the end of lines that are too long, change 999 to 
# the maximum length you want the line ever to have. Longer than 
# that will be chopped short by the "slicing". 
# 
# So, this tells the program where the start and end of each column is, and 
# the expected width of the column. For the last column, the width is 0, 
# so if the last column is a bit short no padding will be added. If you want 
# to make sure that the lines are all exactly the same length, change the 
# 0 to the width you want for the last column. 
columns = [ (0, 12, 12), (12, 29, 17), (29, 999, 0) ] 
num_columns = len(columns) 

# Open input and output files in text mode. 
# Use a "with" statement, which will close the files when we are done. 
with open(input_file, "rt") as in_f, open(output_file, "wt") as out_f: 
    # read the first line that has the field headings 
    line = in_f.readline() 
    # write that line to the output, unchanged 
    out_f.write(line) 

    # now handle each input line from input file, one at a time 
    for line in in_f: 
        # strip off only the line ending
        line = line.rstrip('\n')

        # start with an empty output line string, and append to it
        output_line = ''
        # handle each column in turn
        for i in range(num_columns):
            # unpack the tuple into convenient variables
            low, high, width = columns[i]
            # use "slicing" to get the columns we want
            field = line[low:high]
            # Strip removes spaces and tabs; check to see if anything is left.
            if not field.strip():
                # Nothing was left after spaces removed, so put "NULL".
                field = "NULL"

            # Append field to output_line. field is either the original
            # field, unchanged, or else it is a "NULL". Either way,
            # append it. Make sure it is the right width.
            output_line += s_padded(field, width)

        # Add a line ending to the output line.
        output_line += "\n"
        # Write the output line to the output file.
        out_f.write(output_line)

Output from running this program:

field1      field2           field3
AAAAA       BBBBB            CCCCC
DDDDD       NULL             EEEEE
FFFFF       NULL             NULL
GGGGG       HHHHH            NULL
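
As a small aside, the two Python features that the code comments lean on, slices that run past the end of a string and string repetition, can be tried on their own. This is only an illustrative sketch: the sample values are made up for the demonstration, and `str.ljust` is shown as a built-in alternative to the `s_padded` helper, not as something the program above uses.

```python
# A short line, like the "FFFFF" row in the sample input (assumed value).
line = "FFFFF"

# Slicing past the end of a string is not an error; you get what exists.
print(repr(line[0:12]))    # 'FFFFF'
print(repr(line[12:29]))   # ''
print(repr(line[29:999]))  # ''

# String repetition pads a short field out to a fixed width.
field = "NULL"
padded = field + " " * (17 - len(field))
print(len(padded))         # 17

# str.ljust does the same padding and leaves longer strings unchanged.
print(field.ljust(17) == padded)  # True
```

An empty slice is what makes the `if not field.strip()` test work even when a line ends before the column begins.
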
+0

Hi Stevaha, thank you very much.. your solution is really helpful! – cheeseng

+0

I'm glad it worked for you! :-) Python is much easier than Microsoft's "batch" language, so this is better for the future too. – steveha

+0

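Hi Stevaha, I will try to pick up this language! My work usually involves batch file scripting, so I need to call the Python file from a cmd batch file. How do you call it from a batch file? If that is not possible, how can it be automated? Say the scenario is to read a folder and run addnull.py on every .txt file in it. I tried Googling but did not fully understand.. – cheeseng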

0

I don't think you want to do this in a Microsoft "batch" script. But there is a full set of string operators documented here:

http://www.dostips.com/DtTipsStringManipulation.php

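But batch files are horrible, and I hope you can use something better. If you want a Python solution or an AWK one, I can help you.

If I were you and I really had to do this in a "batch" script, I would break each line into three substrings, using the ~x,y substring syntax (where x is the starting offset and y is the number of characters to take). Then check whether each substring is only spaces, and replace the ones that are only spaces with "NULL". Then rejoin the substrings into one string and print it. Do this in a loop, and you have your program.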
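
The steps above can be sketched as follows. Note that this is written in Python rather than batch, purely to show the logic in a compact, testable form; `COLUMNS` and `fill_nulls` are hypothetical names, and the offsets 12 and 29 are assumed from the question.

```python
# Assumed layout from the question: field 2 starts at offset 12 and
# field 3 at offset 29. None means "to the end of the line".
COLUMNS = [(0, 12), (12, 29), (29, None)]


def fill_nulls(line):
    """Return the line with every blank field replaced by NULL."""
    parts = []
    for low, high in COLUMNS:
        # Step 1: cut the substring for this column.
        field = line[low:high]
        # Step 2: a field that is empty or only whitespace becomes NULL.
        if not field.strip():
            field = "NULL"
        # Pad every column except the last so later fields stay aligned.
        if high is not None:
            field = field.ljust(high - low)
        parts.append(field)
    # Step 3: rejoin the substrings into one string.
    return "".join(parts)


print(fill_nulls("DDDDD" + " " * 24 + "EEEEE"))
print(fill_nulls("FFFFF"))
```

Calling `fill_nulls` on each line of the file, in a loop, gives the whole program.
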

+0

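Hi Stevaha, thanks for your suggestion.. I'm not very familiar with Python; if I use Python, can it be run as a scheduled task in Windows? Actually I was also thinking of using VBScript, but I am new to coding.. any suggestions are welcome! I hope you can help me solve this problem.. – cheeseng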

+0

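I don't know VBScript, but there will be others on StackOverflow who do. Python does not come standard with Windows, so you would have to install it. If VBScript is more standard at your work, you may want to use it. But if you do install Python, then you can certainly run it as a scheduled task. – steveha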

+0

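Hi Stevaha, thanks, I will try installing Python first.. and explore the new way of coding.. Could you show me how to use Python to carry out this task? An explanation of how your code works would be appreciated, as I am not familiar with Python... :) – cheeseng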