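Merging two files with the same column name but different rows using pandas in Python

I have two data files, a.csv and b.csv, which can be obtained from Pastebin. The first file, a.csv, has 4 columns and some comments: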
# coating file for detector A/R
# column 1 is the angle of incidence (degrees)
# column 2 is the wavelength (microns)
# column 3 is the transmission probability
# column 4 is the reflection probability
14.2 531.0 0.0618 0.9382
14.2 532.0 0.07905 0.92095
14.2 533.0 0.09989 0.90011
14.2 534.0 0.12324 0.87676
14.2 535.0 0.14674 0.85326
14.2 536.0 0.16745 0.83255
14.2 537.0 0.1837 0.8163
#
# 171 lines, 5 comments, 166 data
The second file, b.csv, has two columns, one of which it shares with a.csv, and a different number of rows:
# Version 2.0 - nm, [email protected] to 1, burrows+2006c91.21_T1350_g4.7_f100_solar
# Wavelength(nm) Flambda(ergs/cm^s/s/nm)
300.0 1.53345164121e-32
300.1 1.53345164121e-32
300.2 1.53345164121e-32
# total lines = 20003, comment lines = 2, data lines = 20001
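Now I want to merge these two files on the common column, which is the second column of a.csv and the first column of b.csv (the wavelength should be the same in both files).
The output should look like: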
# coating file for detector A/R
# column 1 is the angle of incidence (degrees)
# column 2 is the wavelength (microns)
# column 3 is the transmission probability
# column 4 is the reflection probability
# Version 2.0 - nm, [email protected] to 1, burrows+2006c91.21_T1350_g4.7_f100_solar
# Wavelength(nm) Flambda(ergs/cm^s/s/nm)
14.2 531.0 0.0618 0.9382 1.14325276212
14.2 532.0 0.07905 0.92095 1.14557732058
Note: the comments are also merged.
In b.csv, the wavelength 531.0 is at line number 2313.
How can we do this in Python?
My initial attempt was:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Author : Bhishan Poudel
# Date : Jun 17, 2016
# Imports
from __future__ import print_function
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# read in dataframes
#======================================================================
# read in a file
#
infile = 'a.csv'
colnames = ['angle', 'wave','trans','refl']
print('{} {} {} {}'.format('\nreading file : ', infile, '',''))
df1 = pd.read_csv(infile, sep=r'\s+', header=None, skiprows=0,
                  comment='#', names=colnames, usecols=(0, 1, 2, 3))
print('{} {} {} {}'.format('df.head \n', df1.head(),'',''))
#------------------------------------------------------------------
#======================================================================
# read in a file
#
infile = 'b.csv'
colnames = ['wave', 'flux']
print('{} {} {} {}'.format('\nreading file : ', infile, '',''))
df2 = pd.read_csv(infile, sep=r'\s+', header=None, skiprows=0,
                  comment='#', names=colnames, usecols=(0, 1))
print('{} {} {} {}'.format('df.head \n', df2.head(),'','\n'))
#----------------------------------------------------------------------
# DataFrame.append was removed in pandas 2.0; pd.concat stacks the rows the same way
result = pd.concat([df1, df2], ignore_index=True)
print(result.head())
print("\n")
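Stacking the two frames row by row does not line the flux values up with the matching wavelengths. One possible approach, shown here only as a minimal sketch, is an inner join on the shared `wave` column with `pd.merge`. The in-memory strings below are small made-up stand-ins for a.csv and b.csv (the flux values are placeholders, not the real data), and the sketch assumes the wavelengths are written identically in both files, since the join compares floats exactly.

```python
import io

import pandas as pd

# Made-up stand-ins for a.csv and b.csv (placeholder values, not the real data)
a_text = """# coating file for detector A/R
14.2 531.0 0.0618 0.9382
14.2 532.0 0.07905 0.92095
14.2 533.0 0.09989 0.90011
"""
b_text = """# Wavelength(nm) Flambda
530.9 1.0
531.0 2.0
531.1 3.0
532.0 4.0
"""

# Lines starting with '#' are skipped because of comment='#'
df1 = pd.read_csv(io.StringIO(a_text), sep=r'\s+', header=None, comment='#',
                  names=['angle', 'wave', 'trans', 'refl'])
df2 = pd.read_csv(io.StringIO(b_text), sep=r'\s+', header=None, comment='#',
                  names=['wave', 'flux'])

# Inner join: keep only the wavelengths that occur in both frames
merged = pd.merge(df1, df2, on='wave', how='inner')
print(merged)
```

With `how='inner'`, rows of a.csv whose wavelength has no counterpart in b.csv are dropped (533.0 in this sample); `how='left'` would keep them with a missing flux instead.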
Some useful links are:
How to merge data frame with same column names
http://pandas.pydata.org/pandas-docs/stable/merging.html
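For the merged comment header shown in the desired output, pandas discards `#` lines while reading, so they would have to be collected separately. The following is a rough sketch under a few assumptions: `leading_comments` is a hypothetical helper (not a pandas function), only the comment lines at the top of each file are kept (as in the desired output), and the sample strings stand in for the real files with placeholder flux values.

```python
import io
from itertools import takewhile

import pandas as pd


def leading_comments(lines):
    """Hypothetical helper: return the '#' lines found before the first data line."""
    return [ln.rstrip('\n') for ln in takewhile(lambda ln: ln.startswith('#'), lines)]


# Made-up stand-ins for a.csv and b.csv (placeholder values, not the real data)
a_text = """# coating file for detector A/R
# column 2 is the wavelength
14.2 531.0 0.0618 0.9382
14.2 532.0 0.07905 0.92095
# 2 data lines
"""
b_text = """# Wavelength(nm) Flambda
531.0 2.0
532.0 4.0
"""

# Header comments of both files, first a.csv then b.csv
header = leading_comments(io.StringIO(a_text)) + leading_comments(io.StringIO(b_text))

df1 = pd.read_csv(io.StringIO(a_text), sep=r'\s+', header=None, comment='#',
                  names=['angle', 'wave', 'trans', 'refl'])
df2 = pd.read_csv(io.StringIO(b_text), sep=r'\s+', header=None, comment='#',
                  names=['wave', 'flux'])
merged = pd.merge(df1, df2, on='wave', how='inner')

# Write the comments first, then the space-separated data without column names
out = io.StringIO()
out.write('\n'.join(header) + '\n')
merged.to_csv(out, sep=' ', header=False, index=False)
result_text = out.getvalue()
print(result_text)
```

To write to disk instead, the `io.StringIO()` buffer could be replaced with an opened output file; the trailing summary comments of each input are deliberately left out here.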