2016-11-16 50 views
0

Map Reduce: why do I need to specify "python" before piping to a .py file?

I am running MapReduce locally.

My command line is as follows:

cat testfile | python ./mapper.py | python ./reducer.py 

This works fine. However, when I run the following command instead:

cat testfile | ./mapper.py | ./reducer.py 

I get the following errors:

./mapper.py: line 1: import: command not found 
./mapper.py: line 3: syntax error near unexpected token `(' 
./mapper.py: line 3: `def mapper(): 

This makes sense: the shell is reading my Python file as a bash script and is confused by the Python syntax.

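But all the online examples I have seen (e.g. http://www.michael-noll.com/tutorials/writing-an-hadoop-mapreduce-program-in-python/) do not put python before the .py files. How do I configure my machine so that the pipeline runs without specifying python before mapper.py and reducer.py?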

In case it helps, here is my mapper code:

import sys

def mapper():
    for line in sys.stdin:
        data = line.strip().split('\t')
        if len(data) == 6:
            category = data[3]
            sales = data[4]
            print '{0}\t{1}'.format(category, sales)

if __name__ == "__main__":
    mapper()

And here is my reducer code:

import sys

def reducer():
    current_total = 0
    old_key = None

    for line in sys.stdin:
        data = line.strip().split('\t')
        if len(data) == 2:
            current_key, sales = data
            sales = float(sales)

            # Key changed: emit the total for the previous key and reset.
            if old_key and current_key != old_key:
                print "{0}\t{1}".format(old_key, current_total)
                current_total = 0
            old_key = current_key
            current_total += sales

    # Emit the last key, unless there was no valid input at all.
    if old_key is not None:
        print "{0}\t{1}".format(old_key, current_total)

if __name__ == "__main__":
    reducer()

My data looks like this:

2012-01-01  09:01 Anchorage  DVDs 6.38 Amex 
2012-01-01  09:01 Aurora Electronics 117.81 MasterCard 
2012-01-01  09:01 Philadelphia DVDs 351.31 Cash 

Add a hashbang line '#!/usr/bin/env python' at the beginning of your Python script

Add a shebang and set the execute attribute using 'chmod +x script.py' – furas

Answers

3

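Because your file does not know its interpreter. With python ./myfile you specify the interpreter explicitly. If you don't want to specify it explicitly, you can put a shebang on the first line of the file, which is basically the path to the interpreter. For Python, the shebang looks like one of these: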

#!/usr/bin/env python 

#!/usr/local/bin/python 
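
To make "the file does not know its interpreter" concrete, here is a minimal sketch in Python 3 (the scripts in the question are Python 2). It assumes a Unix-like system, Python 3.7 or later, and a temporary directory that allows execution. The helper names `write_executable`, `try_exec` and `demo` and the file names are made up for illustration, and the shebang points at `sys.executable` rather than `/usr/bin/env python` only so that the sketch does not depend on what `python` resolves to. It executes the same script body directly, with no shell and no explicit interpreter, once without and once with a shebang line.

```python
import errno
import os
import stat
import subprocess
import sys
import tempfile

BODY = 'print("hello from the script")\n'


def write_executable(path, text):
    """Write text to path and set the execute bits (like `chmod +x`)."""
    with open(path, "w") as f:
        f.write(text)
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


def try_exec(path):
    """Execute path directly: no shell, no interpreter named on the command line."""
    try:
        done = subprocess.run([path], capture_output=True, text=True, check=True)
        return done.stdout.strip()
    except OSError as exc:
        # Report the symbolic errno name when the OS refuses to execute the file.
        return "failed: " + errno.errorcode.get(exc.errno, str(exc.errno))


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        bare = os.path.join(tmp, "bare.py")                   # hypothetical name
        with_shebang = os.path.join(tmp, "with_shebang.py")   # hypothetical name
        write_executable(bare, BODY)
        # Assumption: sys.executable stands in for /usr/bin/env python.
        write_executable(with_shebang, "#!" + sys.executable + "\n" + BODY)
        return try_exec(bare), try_exec(with_shebang)


if __name__ == "__main__":
    for result in demo():
        print(result)
```

On Linux I would expect the first call to be rejected by the operating system with an "exec format error", while the second runs normally. An interactive shell hides that failure by falling back to interpreting the file as a shell script, which is where the "import: command not found" message in the question comes from.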

For more details, see the shebang wiki, which says:

Under Unix-like operating systems, when a script with a shebang is run as a program, the program loader parses the rest of the script's initial line as an interpreter directive; the specified interpreter program is run instead, passing to it as an argument the path that was initially used when attempting to run the script
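
Putting the shebang and the `chmod +x` step from the comments together, the pipeline from the question can be sketched end to end. This is only an illustration under stated assumptions: the mapper and reducer below are simplified Python 3 stand-ins for the scripts in the question (the reducer sums into a dict instead of streaming over sorted keys), the sample rows are assumed to be tab-separated, `write_file` and `run_pipeline` are hypothetical helper names, and the shebang again uses `sys.executable` in place of `/usr/bin/env python`.

```python
import os
import stat
import subprocess
import sys
import tempfile

# Assumption: sys.executable stands in for the /usr/bin/env python shebang.
SHEBANG = "#!" + sys.executable + "\n"

# Simplified Python 3 stand-in for the question's mapper.
MAPPER = SHEBANG + r'''import sys

for line in sys.stdin:
    data = line.strip().split("\t")
    if len(data) == 6:
        print("{0}\t{1}".format(data[3], data[4]))
'''

# Simplified stand-in for the reducer: sums per key without needing sorted input.
REDUCER = SHEBANG + r'''import sys

totals = {}
for line in sys.stdin:
    data = line.strip().split("\t")
    if len(data) == 2:
        totals[data[0]] = totals.get(data[0], 0.0) + float(data[1])
for key in sorted(totals):
    print("{0}\t{1}".format(key, totals[key]))
'''

# The three sample rows from the question, assumed to be tab-separated.
SAMPLE = (
    "2012-01-01\t09:01\tAnchorage\tDVDs\t6.38\tAmex\n"
    "2012-01-01\t09:01\tAurora\tElectronics\t117.81\tMasterCard\n"
    "2012-01-01\t09:01\tPhiladelphia\tDVDs\t351.31\tCash\n"
)


def write_file(path, text, executable=False):
    with open(path, "w") as f:
        f.write(text)
    if executable:
        # Equivalent of `chmod +x path`.
        mode = os.stat(path).st_mode
        os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


def run_pipeline():
    with tempfile.TemporaryDirectory() as tmp:
        write_file(os.path.join(tmp, "testfile"), SAMPLE)
        write_file(os.path.join(tmp, "mapper.py"), MAPPER, executable=True)
        write_file(os.path.join(tmp, "reducer.py"), REDUCER, executable=True)
        # Same command line as in the question, with no "python" in front.
        done = subprocess.run(
            "cat testfile | ./mapper.py | ./reducer.py",
            shell=True, cwd=tmp, capture_output=True, text=True, check=True,
        )
        return done.stdout


if __name__ == "__main__":
    print(run_pipeline(), end="")
```

With both files carrying a shebang and the execute bit, the shell hands each script to the named interpreter instead of trying to parse it as shell code.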

Awesome! Works perfectly, thanks. – bigmacboy78
