2016-10-05 45 views

Parsing messy data

I'm fairly new to Python and was wondering whether I could get some help parsing my data so that it is easier to analyze.


My data is in the following form (each entry is one whole line):

20160930-07:06:54.481737|I|MTP_4|CL:BF K7-M7-N7 Restrict for maxAggressive: -4.237195 
20160930-07:06:54.481738|I|MTP_4|CL:BF K7-M7-N7 BidPrice: -5.0 mktBestBid: -5.0 bidTheo: -4.096774 bidSeedEdge: 0.195028 bidUnseedEdge: CL:BF K7-M7-N7 = 0.14042 Min Bid: -6.0 Max Ticks Offset: 1 Max Aggressive Ticks: 1 

Here is my code so far:

import os
from itertools import islice

# Output file
output_filename = os.path.normpath("Mypath/testList.log")

# Open the output file in "w" mode, which truncates any existing content
with open(output_filename, "w") as out_file:
    # Open input file in 'read' mode
    with open("mypath/tradedata.log", "r") as in_file:
        # Loop over the log lines in the range that holds the necessary data
        for line in islice(in_file, 177004, 8349710):
            out_file.write(line)

Would the easiest approach be to just go through the file and pull the values out by keyword (bidUnseedEdge, mktBestBid, etc.)?

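We need more context. Data is usually parsed to gain something, such as changing its representation or finding/filtering elements. What is the goal of your parsing? Skipping lines that meet certain criteria? Just changing the representation? If so, to what type? – sal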

@Alex, what is the actual output you need? – Prabhakar

I'm trying to grab the data for our product K7-M7-N7, along with the corresponding values of bidTheo and maxAggressive, so that I can analyze the data. – Alex

Answer

infilename = "path/data.log"
outfilename = "path/OutputData.csv"

with open(infilename, "r") as infile, open(outfilename, "w") as outfile:
    for lineCounter, line in enumerate(infile, 1):
        # Print progress every million lines
        if lineCounter % 1000000 == 0:
            print(lineCounter)
        # data[0] is the timestamp, data[3] is the message text
        data = line.split("|")
        if len(data) < 4:
            continue
        # A bid line contains exactly one "bidTheo:" marker
        bidsplit = data[3].split("bidTheo:")
        namebid = data[3].split("BidPrice:")
        if len(bidsplit) == 2:
            bid = float(bidsplit[1].strip().split()[0])
            # The product name is the text before "BidPrice:"
            bidname = namebid[0].strip().split(",")[0]
            outfile.write("bidTheo," + data[0] + "," + bidname + "," + str(bid) + "\n")
        # Same approach for offer lines
        offersplit = data[3].split("offerTheo:")
        nameoffer = data[3].split("AskPrice:")
        if len(offersplit) == 2:
            offer = float(offersplit[1].strip().split()[0])
            offername = nameoffer[0].strip().split(",")[0]
            outfile.write("offerTheo," + data[0] + "," + offername + "," + str(offer) + "\n")

print("Done")
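
The code above does not pick up maxAggressive, which was also asked for in the comments. A regular expression can pull both keywords out in one pass. The following is only a minimal sketch: the function name `extract`, the `PRODUCT` constant and the `VALUE_RE` pattern are made up for illustration, and it assumes every line has the same `timestamp|level|source|message` layout as the two sample lines in the question.

```python
import re

# Assumption: lines look like the two samples in the question,
# i.e. four "|"-separated fields with the message text last.
PRODUCT = "K7-M7-N7"  # product named in the question's comments

# Matches e.g. "maxAggressive: -4.237195" or "bidTheo: -4.096774"
VALUE_RE = re.compile(r"\b(bidTheo|maxAggressive):\s*(-?\d+(?:\.\d+)?)")


def extract(line, product=PRODUCT):
    """Return a list of (timestamp, key, value) tuples found in one log line."""
    parts = line.rstrip("\n").split("|", 3)
    # Skip malformed lines and lines about other products
    if len(parts) < 4 or product not in parts[3]:
        return []
    timestamp, message = parts[0], parts[3]
    return [(timestamp, key, float(value))
            for key, value in VALUE_RE.findall(message)]


# The two sample lines from the question (second one shortened)
sample = [
    "20160930-07:06:54.481737|I|MTP_4|CL:BF K7-M7-N7 Restrict for maxAggressive: -4.237195 ",
    "20160930-07:06:54.481738|I|MTP_4|CL:BF K7-M7-N7 BidPrice: -5.0 mktBestBid: -5.0 bidTheo: -4.096774 bidSeedEdge: 0.195028",
]
rows = [row for line in sample for row in extract(line)]
for row in rows:
    print(row)
```

On real data you would loop over the open file instead of `sample` and write each tuple to the CSV. The pattern is case-sensitive, so the "Max Aggressive Ticks" field in the sample is not picked up by mistake.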