2015-10-07 60 views
Hadoop MapReduce practice

Input data file:

name,month,category,expenditure

hitesh,1,A1,10020 
hitesh,2,A2,10300 
hitesh,3,A3,10400 
hitesh,4,A4,11000 
hitesh,5,A1,21000 
hitesh,6,A2,5000 
hitesh,7,A3,9000 
hitesh,8,A4,1000 
hitesh,9,A1,111000  
hitesh,10,A2,12000 
hitesh,11,A3,71000 
hitesh,12,A4,177000  
kuwar,1,A1,10700 
kuwar,2,A2,17000 
kuwar,3,A3,10070 
kuwar,4,A4,10007 

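I need the person-wise total expenditure and the count of unique categories each person spent on. (The output should look like: name, total expenditure, total number of unique categories.)

This is what I have tried so far.

Person-wise total expenditure: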

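import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class Emp
{
    public static class MyMap extends Mapper<LongWritable,Text,Text,IntWritable>
    {
        public void map(LongWritable k, Text v, Context con)
        throws IOException, InterruptedException
        {
            // each input line is name,month,category,expenditure
            String line = v.toString();
            String[] w = line.split(",");
            String person = w[0];
            int exp = Integer.parseInt(w[3]);
            con.write(new Text(person), new IntWritable(exp));
        }
    }
    public static class MyRed extends Reducer<Text,IntWritable,Text,IntWritable>
    {
        public void reduce(Text k, Iterable<IntWritable> vlist, Context con)
        throws IOException, InterruptedException
        {
            // sum every expenditure emitted for this person
            int tot = 0;
            for (IntWritable v : vlist)
                tot += v.get();
            con.write(k, new IntWritable(tot));
        }
    }
    public static void main(String[] args) throws Exception
    {
        Configuration c = new Configuration();
        // deprecated in Hadoop 2, where Job.getInstance(c, "person-wise") replaces it
        Job j = new Job(c, "person-wise");
        j.setJarByClass(Emp.class);
        j.setMapperClass(MyMap.class);
        j.setReducerClass(MyRed.class);
        j.setOutputKeyClass(Text.class);
        j.setOutputValueClass(IntWritable.class);
        Path p1 = new Path(args[0]);
        Path p2 = new Path(args[1]);
        FileInputFormat.addInputPath(j, p1);
        FileOutputFormat.setOutputPath(j, p2);
        System.exit(j.waitForCompletion(true) ? 0 : 1);
    }

}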

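How can I get the total number of unique categories in this program, and how do I make the output look like name, total expenditure, total number of unique categories?

Thanks

Answers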

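You can create a custom Writable that holds a pair, an IntWritable for the expenditure and a Text for the category, and use it as the map output value. Alternatively, join the expenditure and the category into a single string with a separator, and split that string again on the reducer side.

Once the reducer receives those values, loop over them to sum the total expenditure and, inside the same loop, put every category into a Java Set. Then use set.size() to get the number of unique categories and emit it in context.write. To emit the reduce-side value you can use the same technique that was used to pass the map value.

On the mapper side, use a StringBuilder to join the category and the expenditure, and pass the result as the map value. The mapper's output value type then has to be Text instead of IntWritable.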

StringBuilder sb = new StringBuilder(); 
String sep=":"; 
sb.append(w[2]); 
sb.append(sep); 
sb.append(w[3]); 

con.write(new Text(person), new Text(sb.toString())); 
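
The encoding step can be tried outside Hadoop first. The following is a minimal sketch in plain Java, assuming the four-column layout from the question. The class name MapValueEncoder and the method encode are hypothetical, and the trim() calls are an added precaution against stray whitespace in the input, not part of the snippet above.

```java
public class MapValueEncoder {
    // Same separator the mapper snippet uses.
    static final String SEP = ":";

    // Turns a line such as "hitesh,1,A1,10020" into "A1:10020".
    static String encode(String line) {
        String[] w = line.split(",");
        if (w.length < 4) {
            throw new IllegalArgumentException("expected 4 fields: " + line);
        }
        StringBuilder sb = new StringBuilder();
        sb.append(w[2].trim());   // category
        sb.append(SEP);
        sb.append(w[3].trim());   // expenditure
        return sb.toString();
    }

    public static void main(String[] args) {
        // Trailing space on purpose, to show that trim() absorbs it.
        System.out.println(encode("hitesh,1,A1,10020 ")); // prints A1:10020
    }
}
```

Inside the real mapper, the returned string would be wrapped in new Text(...) exactly as in the con.write call above.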

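On the reducer side, split each value with the same separator that was used on the map side, sum the expenditure, and take the size of the category set. The code below is untested, so add any casts or conversions I may have missed.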

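// needs java.util.Set and java.util.HashSet imported;
// the enclosing class must extend Reducer<Text,Text,Text,Text>
public void reduce(Text k, Iterable<Text> vlist, Context con)
     throws IOException, InterruptedException
     {
     int tot = 0;
     String myval;
     String[] split_val;
     Set<String> myset = new HashSet<String>();
     int uniq_category;
     StringBuilder sb1 = new StringBuilder();
     for (Text v : vlist)
     {
     myval = v.toString();
     split_val = myval.split(":");            // [0] = category, [1] = expenditure
     myset.add(split_val[0]);                 // a Set keeps each category once
     tot += Integer.parseInt(split_val[1]);
     }
     uniq_category = myset.size();
     String sep = " ";
     // total first, then the unique-category count, as the question asks
     sb1.append(tot);
     sb1.append(sep);
     sb1.append(uniq_category);
     con.write(k, new Text(sb1.toString()));
     }
}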
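
To check the reduce-side arithmetic without a cluster, the same loop can be run over an in-memory list. This is only a sketch in plain Java: the names ReduceCheck and summarize are hypothetical, and the values are hitesh's twelve rows from the input file, already encoded as category:expenditure.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ReduceCheck {
    // Mirrors the reducer body: sum the spend and collect categories in a Set.
    static String summarize(List<String> values) {
        int tot = 0;
        Set<String> categories = new HashSet<String>();
        for (String v : values) {
            String[] parts = v.split(":");
            categories.add(parts[0]);
            tot += Integer.parseInt(parts[1]);
        }
        return tot + "," + categories.size();
    }

    public static void main(String[] args) {
        List<String> hitesh = Arrays.asList(
            "A1:10020", "A2:10300", "A3:10400", "A4:11000",
            "A1:21000", "A2:5000", "A3:9000", "A4:1000",
            "A1:111000", "A2:12000", "A3:71000", "A4:177000");
        System.out.println("hitesh," + summarize(hitesh)); // prints hitesh,448720,4
    }
}
```

Each category appears three times for hitesh, yet the set size is 4, which is the deduplication the reducer relies on.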

Alternatively, create a pair type holding an IntWritable and a Text, as mentioned earlier, and use it for the map and reduce values.
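
A rough sketch of such a pair type follows. It assumes only the shape of Hadoop's Writable interface, which consists of write(DataOutput) and readFields(DataInput). The class name CategorySpend, its fields, and the roundTrip helper are hypothetical, and the class deliberately does not declare "implements Writable" so that it compiles without Hadoop on the classpath. In a real job that declaration would have to be added.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical value type carrying one category and one expenditure.
// For use in a job: add "implements org.apache.hadoop.io.Writable".
public class CategorySpend {
    private String category = "";
    private int spend;

    // Hadoop creates values reflectively, so a no-arg constructor is needed.
    public CategorySpend() {}

    public CategorySpend(String category, int spend) {
        this.category = category;
        this.spend = spend;
    }

    // Serialize the fields in a fixed order.
    public void write(DataOutput out) throws IOException {
        out.writeUTF(category);
        out.writeInt(spend);
    }

    // Read the fields back in the same order they were written.
    public void readFields(DataInput in) throws IOException {
        category = in.readUTF();
        spend = in.readInt();
    }

    public String getCategory() { return category; }
    public int getSpend() { return spend; }

    // Local check only: serialize to bytes and deserialize into a new object.
    static CategorySpend roundTrip(CategorySpend original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            original.write(new DataOutputStream(bytes));
            CategorySpend copy = new CategorySpend();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        CategorySpend copy = roundTrip(new CategorySpend("A1", 10020));
        System.out.println(copy.getCategory() + ":" + copy.getSpend()); // prints A1:10020
    }
}
```

Compared with the separator-joined string, a typed pair avoids parsing in the reducer, at the cost of one extra class. Because it is only used as a value and never as a key, it does not need to be comparable.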

@aashish_soni I have finished editing the post.

@aashish_soni Did this help solve your problem?

I have made the changes in your code. Hope this is useful.

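import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class Emp
{
    public static class MyMap extends Mapper<LongWritable,Text,Text,Text>
    {
        public void map(LongWritable k, Text v, Context con)
        throws IOException, InterruptedException
        {
            String line = v.toString();
            String[] w = line.split(",");
            String person = w[0];
            // pass the whole line through; the reducer picks out category and spend
            con.write(new Text(person), new Text(line));
        }
    }
    public static class MyRed extends Reducer<Text,Text,Text,Text>
    {
        public void reduce(Text k, Iterable<Text> vlist, Context con)
        throws IOException, InterruptedException
        {
            int tot = 0;
            Set<String> cat = new HashSet<String>();
            for (Text v : vlist) {
                String data = v.toString();
                String[] dataArray = data.split(",");
                // trim() tolerates stray whitespace around a field
                tot += Integer.parseInt(dataArray[3].trim()); // calculating the total spend
                cat.add(dataArray[2].trim()); // collecting the unique categories
            }
            // writing the total spend and the number of unique categories for this name;
            // TextOutputFormat puts a tab between the key and this value by default
            con.write(k, new Text(tot + "," + cat.size()));
        }
    }
    public static void main(String[] args) throws Exception
    {
        Configuration c = new Configuration();
        Job j = new Job(c, "person-wise");
        j.setJarByClass(Emp.class);
        j.setMapperClass(MyMap.class);
        j.setReducerClass(MyRed.class);
        j.setOutputKeyClass(Text.class);
        j.setOutputValueClass(Text.class); // mapper and reducer both emit Text values now
        Path p1 = new Path(args[0]);
        Path p2 = new Path(args[1]);
        FileInputFormat.addInputPath(j, p1);
        FileOutputFormat.setOutputPath(j, p2);
        System.exit(j.waitForCompletion(true) ? 0 : 1);
    }

}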