2014-11-17 48 views

How to use hadoop streaming cmdenv on Oozie?

I have a Hadoop streaming job that takes the parameter:

-cmdenv TEXT_DIR=cachetextdir 

How do I specify this in an Oozie workflow?

I assume I can point Oozie at cachetextdir with:

<archive>hdfs://localhost:54310/user/vm/textinput/cachetextdir.tar.gz#cachetextdir</archive> 

Answer


It looks like the `<env>` element of the streaming syntax:

  <streaming> 
      <mapper>[MAPPER-PROCESS]</mapper> 
      <reducer>[REDUCER-PROCESS]</reducer> 
      <record-reader>[RECORD-READER-CLASS]</record-reader> 
      <record-reader-mapping>[NAME=VALUE]</record-reader-mapping> 
      ... 
      <env>[NAME=VALUE]</env> 
      ... 
     </streaming> 

The `<env>` element shown above should do the job.

UPDATE: Yes, it does:

<streaming>
    <mapper>python smspipelineHadoop.py</mapper>
    <env>TEXT_DIR=cachetextdir</env>
</streaming>
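
For context, here is a rough sketch of how that `<streaming>` block might sit inside a complete map-reduce action together with the `<archive>` from the question. Only the mapper command, the `TEXT_DIR` value, and the archive URI come from this thread. The workflow and node names, the schema version, the `${jobTracker}`, `${nameNode}`, `${inputDir}`, and `${outputDir}` parameters, and the `<file>` entry that ships the script are assumptions for illustration, so adjust them to your setup.

```xml
<!-- Sketch only: names and ${...} parameters are placeholders, not from the original post -->
<workflow-app xmlns="uri:oozie:workflow:0.4" name="streaming-cmdenv-wf">
    <start to="streaming-node"/>
    <action name="streaming-node">
        <map-reduce>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <streaming>
                <mapper>python smspipelineHadoop.py</mapper>
                <!-- Plays the role of: -cmdenv TEXT_DIR=cachetextdir -->
                <env>TEXT_DIR=cachetextdir</env>
            </streaming>
            <configuration>
                <property>
                    <name>mapred.input.dir</name>
                    <value>${inputDir}</value>
                </property>
                <property>
                    <name>mapred.output.dir</name>
                    <value>${outputDir}</value>
                </property>
            </configuration>
            <!-- Assumed: the script lives in the workflow application directory -->
            <file>smspipelineHadoop.py#smspipelineHadoop.py</file>
            <!-- The part after # is the link name, which TEXT_DIR points at -->
            <archive>hdfs://localhost:54310/user/vm/textinput/cachetextdir.tar.gz#cachetextdir</archive>
        </map-reduce>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Streaming job failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

The mapper can then read `TEXT_DIR` from its environment and open files under that relative path, since the archive should be unpacked under the `cachetextdir` link in the task's working directory.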