2012-11-06

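Accessing SQL Server in Hadoop

What I want to do: the program is meant to load regional sales data from SQL Server 2008 and run a simple statistical calculation in MapReduce to get the total sales for each region. The error I get says the program cannot find the sqljdbc4.jar file, even though the file has been copied to the location specified in the code.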

  1. The code looks like this:

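    // fileName: MRExp.java
    package mrexp;

    import java.io.IOException;

    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.FileOutputFormat;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.lib.db.DBConfiguration;
    import org.apache.hadoop.mapred.lib.db.DBInputFormat;

    public class MRExp {
        public static void main(String[] args) throws IOException {
            JobConf conf = new JobConf(MRExp.class);
            // Add the JDBC driver jar (copied to HDFS beforehand) to the classpath.
            DistributedCache.addFileToClassPath(new Path("/userX/sqljdbc4.jar"), conf);

            conf.setMapperClass(MRMapper.class);
            conf.setReducerClass(MRReducer.class);

            conf.setMapOutputKeyClass(Text.class);
            conf.setMapOutputValueClass(LongWritable.class);

            conf.setOutputKeyClass(LongWritable.class);
            conf.setOutputValueClass(Text.class);

            // Read input records from the database; write results to args[0].
            conf.setInputFormat(DBInputFormat.class);
            FileOutputFormat.setOutputPath(conf, new Path(args[0]));

            // JDBC driver class, connection URL, user name, and password.
            DBConfiguration.configureDB(
                    conf,
                    "com.microsoft.sqlserver.jdbc.SQLServerDriver",
                    "jdbc:sqlserver://MyDbServerAddr:1433;databaseName=ThisDb;integratedSecurity=true;",
                    "db_userName", "db_Pws");

            DBInputFormat.setInput(conf, InfoUnit.class,
                    "SELECT R_NAME,L_ORDERKEY from dbo.United10MB ;" /* inputQuery */,
                    "SELECT COUNT(L_ORDERKEY) from dbo.United10MB" /* inputCountQuery */);

            try {
                JobClient.runJob(conf);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

    // The definitions of MRMapper, MRReducer, and InfoUnit follow. InfoUnit implements Writable and DBWritable.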

  2. File locations:

    [root@test MRExp]# pwd
    /root/MRExp
    [root@test MRExp]# ls
    classes hadoop-0.20.2-core.jar MRExp.java sqljdbc4.jar

  3. Then compile MRExp.java:

    [root@test MRExp]# javac -classpath hadoop-0.20.2-core.jar -d classes/ MRExp.java
    [root@test MRExp]# jar -cvf MRExp.jar -C classes/ .

Also copy sqljdbc4.jar to HDFS:

    [root@test MRExp]# hadoop dfs -copyFromLocal sqljdbc4.jar /userX

So we now have:

    [root@test MRExp]# ls
    classes hadoop-0.20.2-core.jar MRExp.jar MRExp.java sqljdbc4.jar

  4. After the above, launch the MapReduce job:

    [root@test MRExp]# hadoop jar MRExp.jar mrexp.MRExp /userX/output

But the program reports:

17:02:50 WARN mapreduce.JobSubmitter: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same. 
12/10/28 17:02:50 INFO mapreduce.JobSubmitter: Cleaning up the staging area /tmp/hadoop-yarn/staging/root/.staging/job_1350984913454_0009 
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: com.microsoft.sqlserver.jdbc.SQLServerDriver 
     at org.apache.hadoop.mapreduce.lib.db.DBInputFormat.setConf(DBInputFormat.java:165) 
     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:70) 
     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:130) 
     at org.apache.hadoop.mapred.JobConf.getInputFormat(JobConf.java:607) 
     at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:476) 
     at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:468) 
     at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:359) 
     at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1226) 
     at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1223) 
     at java.security.AccessController.doPrivileged(Native Method) 
     at javax.security.auth.Subject.doAs(Subject.java:396) 
     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232) 
     at org.apache.hadoop.mapreduce.Job.submit(Job.java:1223) 
     at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:609) 
     at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:604) 
     at java.security.AccessController.doPrivileged(Native Method) 
     at javax.security.auth.Subject.doAs(Subject.java:396) 
     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232) 
     at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:604) 
     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:880) 
     at mrexp.MRExp.main(MRExp.java:70) 
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) 
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) 
     at java.lang.reflect.Method.invoke(Method.java:597) 
     at org.apache.hadoop.util.RunJar.main(RunJar.java:208) 
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: com.microsoft.sqlserver.jdbc.SQLServerDriver 
     at org.apache.hadoop.mapreduce.lib.db.DBInputFormat.getConnection(DBInputFormat.java:191) 
     at org.apache.hadoop.mapreduce.lib.db.DBInputFormat.setConf(DBInputFormat.java:159) 
     ... 25 more 
Caused by: java.lang.ClassNotFoundException: com.microsoft.sqlserver.jdbc.SQLServerDriver 
     at java.net.URLClassLoader$1.run(URLClassLoader.java:202) 
     at java.security.AccessController.doPrivileged(Native Method) 
     at java.net.URLClassLoader.findClass(URLClassLoader.java:190) 
     at java.lang.ClassLoader.loadClass(ClassLoader.java:306) 
     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) 
     at java.lang.ClassLoader.loadClass(ClassLoader.java:247) 
     at java.lang.Class.forName0(Native Method) 
     at java.lang.Class.forName(Class.java:169) 
     at org.apache.hadoop.mapreduce.lib.db.DBConfiguration.getConnection(DBConfiguration.java:148) 
     at org.apache.hadoop.mapreduce.lib.db.DBInputFormat.getConnection(DBInputFormat.java:185) 
     ... 26 more 
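
Reading the trace from the bottom up: the ClassNotFoundException is thrown by Class.forName, called from DBConfiguration.getConnection while JobSubmitter.writeSplits is running under RunJar.main. In other words, the lookup fails in the JVM started by hadoop jar during job submission. A minimal sketch for checking whether a class is visible to the JVM it runs in is shown below; DriverProbe is a hypothetical helper name, not part of the original post or of Hadoop.

```java
// Hypothetical helper (an assumption, not from the original post): reports
// whether a class name can be resolved by the classloader of the current JVM.
public class DriverProbe {

    /** Returns true if className can be loaded; the class is not initialized. */
    public static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, DriverProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Default to the driver class named in the stack trace.
        String name = args.length > 0
                ? args[0]
                : "com.microsoft.sqlserver.jdbc.SQLServerDriver";
        System.out.println(name + " loadable: " + isLoadable(name));
    }
}
```

Running it once with sqljdbc4.jar on the classpath and once without should show the difference. If the driver is only missing on the submitting side, one commonly used way to add a jar there is the HADOOP_CLASSPATH environment variable; whether that applies to this setup is an assumption.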

Answer

Include the sqljdbc4.jar JAR through the "-libjars" command-line option of the hadoop jar … command.

Please read this from Cloudera for more information.
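
One caveat: "-libjars" is a generic option that Hadoop's GenericOptionsParser handles (typically through Tool and ToolRunner), which is what the WARN line in the question's log refers to. The posted main method reads args[0] directly as the output path, so without that parsing args[0] would be the literal string "-libjars". The sketch below is a simplified, hypothetical stand-in for that argument handling, written against the plain JDK only; LibJarsArgs is an invented name, and the real parser does considerably more.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified, hypothetical stand-in for the argument handling that Hadoop's
// GenericOptionsParser performs. This is NOT Hadoop code.
public class LibJarsArgs {
    /** Jar paths taken from the -libjars option. */
    public final List<String> libJars = new ArrayList<>();
    /** Everything else, i.e. what the job itself should treat as its arguments. */
    public final List<String> remaining = new ArrayList<>();

    public LibJarsArgs(String[] args) {
        for (int i = 0; i < args.length; i++) {
            if ("-libjars".equals(args[i]) && i + 1 < args.length) {
                // The option value is a comma-separated list of jar paths.
                libJars.addAll(Arrays.asList(args[++i].split(",")));
            } else {
                remaining.add(args[i]);
            }
        }
    }

    public static void main(String[] args) {
        LibJarsArgs parsed = new LibJarsArgs(
                new String[] {"-libjars", "sqljdbc4.jar", "/userX/output"});
        System.out.println("jars: " + parsed.libJars);
        System.out.println("job args: " + parsed.remaining);
    }
}
```

In a real driver, implementing Tool and launching through ToolRunner.run would delegate this step to Hadoop, so that the output path is read from the remaining arguments rather than from the raw command line.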

UPDATE:

Do the following:

    [root@test MRExp]# hadoop dfs -ls /userX

Copy the absolute path of sqljdbc4.jar in the file system and use it in the following line:

    DistributedCache.addFileToClassPath(new Path("<Absolute Path>/sqljdbc4.jar"), conf);

This will solve the problem.