Copying MultiIndex dataframes with pd.read_clipboard?

Given a dataframe like this:

How do I read it in using pd.read_clipboard? I've tried this:

df = pd.read_clipboard(index_col=[0, 1])

But it throws an error:

ParserError: Error tokenizing data. C error: Expected 2 fields in line 3, saw 3

How can I fix this?
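For context, the error can be reproduced without a clipboard. The sketch below assumes the copied text looked like the frame printed later in this thread (the `TEXT` sample and the `try_parse` helper are hypothetical, not from the original post) and parses it with `pd.read_csv` and a whitespace separator, which is what `pd.read_clipboard` falls back to by default. The header row has one token, the index-name row has two, and the first data row has three, so the C tokenizer gives up on line 3.

```python
from io import StringIO

import pandas as pd
from pandas.errors import ParserError

# Hypothetical clipboard text, modelled on the frame shown later in the thread.
TEXT = (
    "          C\n"
    "A   B\n"
    "1.1 111  20\n"
    "    222  31\n"
    "3.3 222  24\n"
)


def try_parse(text):
    """Parse whitespace-separated text; return the frame or the error message."""
    try:
        return pd.read_csv(StringIO(text), sep=r"\s+", index_col=[0, 1])
    except ParserError as exc:
        return str(exc)


result = try_parse(TEXT)
print(result)
```

The blank cells in the first index level are the deeper problem: a whitespace separator cannot tell an empty cell from padding, which is why a fixed-width reader is a better fit for this layout.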

Other pd.read_clipboard questions:

How do you handle column names having spaces in them when using pd.read_clipboard?
How to handle custom named index when copying a dataframe using pd.read_clipboard?

Source

2017-08-17 cᴏʟᴅsᴘᴇᴇᴅ

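hhahahhhahhhaaahahhaahaaaaa..... You can't! Seriously, I wish you could. Feel free to write the parser and contribute it to the pandas project. (-: – piRSquared

Wow... really? :( Like Scott, I always move the columns down and then set the index manually. I thought I was the only one doing that redundant work, huh. –

If you could collect all of these cases in a notebook, I would really appreciate it :) – Wen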

UPDATE: it now parses the clipboard directly, i.e. there is no need to save it beforehand

def read_clipboard_mi(index_names_row=None, **kwargs):
    """Read a MultiIndex DataFrame from the clipboard using read_fwf."""
    from io import StringIO
    from pandas import read_fwf
    from pandas.io.clipboard import clipboard_get

    encoding = kwargs.pop('encoding', 'utf-8')

    # only utf-8 is valid for passed value because that's what clipboard
    # supports
    if encoding is not None and encoding.lower().replace('-', '') != 'utf8':
        raise NotImplementedError(
            'reading from clipboard only supports utf-8 encoding')

    data = clipboard_get()

    # decode only if the clipboard backend handed back bytes
    if isinstance(data, bytes):
        data = data.decode(encoding or 'utf-8')

    index_names = None
    if index_names_row is not None:
        if not isinstance(index_names_row, int):
            raise TypeError('[index_names_row] must be of [int] data type')
        # the row holding the index names is read separately and skipped
        index_names = data.splitlines()[index_names_row].split()
        kwargs['skiprows'] = [index_names_row]

    df = read_fwf(StringIO(data), **kwargs)
    unnamed_cols = df.columns[df.columns.str.contains(r'Unnamed:')].tolist()

    if index_names:
        idx_cols = df.columns[:len(index_names)].tolist()
    elif unnamed_cols:
        idx_cols = df.columns[:len(unnamed_cols)].tolist()
        index_names = [None] * len(idx_cols)
    else:
        # no index columns detected: return the frame as parsed
        return df

    # blank cells in the index columns repeat the value above them
    df[idx_cols] = df[idx_cols].ffill()
    df = df.set_index(idx_cols).rename_axis(index_names)

    return df

Test with a MultiIndex DF without index names:

In [231]: read_clipboard_mi()
Out[231]:
          C
1.1 111  20
    222  31
3.3 222  24
    333  65
5.5 333  22
6.6 777  74

Test with a MultiIndex DF with index names:

In [232]: read_clipboard_mi(index_names_row=1)
Out[232]:
          C
A   B
1.1 111  20
    222  31
3.3 222  24
    333  65
5.5 333  22
6.6 777  74
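To check the same parsing steps without touching the clipboard, the core of the function can be run on a string. This is only a minimal sketch: `TEXT` and `parse_mi_text` are hypothetical names, the sample mirrors the first test above (no index names), and it assumes `read_fwf` labels the header-less index columns `Unnamed: N`, as the function above also relies on.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for the clipboard contents (no index names).
TEXT = (
    "          C\n"
    "1.1 111  20\n"
    "    222  31\n"
    "3.3 222  24\n"
    "    333  65\n"
    "5.5 333  22\n"
    "6.6 777  74\n"
)


def parse_mi_text(text):
    """Fixed-width parse, forward-fill the index columns, then set the index."""
    df = pd.read_fwf(StringIO(text))
    # Columns without a header cell are assumed to be the index levels.
    idx_cols = [c for c in df.columns if str(c).startswith("Unnamed:")]
    # A blank cell in an index column repeats the value above it.
    df[idx_cols] = df[idx_cols].ffill()
    return df.set_index(idx_cols).rename_axis([None] * len(idx_cols))


df = parse_mi_text(TEXT)
print(df)
```

Fixed-width parsing is what makes this work: column boundaries are inferred from character positions, so an empty first-level cell stays an empty cell instead of shifting the row to the left.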

NOTE:

1. it's not very well tested
2. it doesn't support MultiLevel columns
3. see point 1 ;-)

NOTE2: please feel free to use this code or to create a pull request on Pandas GitHub

Source

2017-08-17 17:54:06 MaxU

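Wow, this looks amazing! You should consider submitting a PR to pandas. –

Nice, looking forward to the new pandas feature~ :) – Wen

Thank you all! I don't think I have enough time and patience to write all the necessary tests... – MaxU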
