0
我想知道是否可以根據條件組合列值。讓我來解釋......Hive根據條件組合列值
讓說我的數據看起來像這樣
Id name offset
1 Jan 100
2 Janssen 104
3 Klaas 150
4 Jan 160
5 Janssen 164
的我的輸出應該是這樣的
Id fullname offsets
1 Jan Janssen [ 100, 160 ]
我想的名字值在兩行合併的地方兩行的偏移不再相隔1個字符。
我的問題是,如果這種類型的數據操作是可能的,是否有人可以共享一些代碼和解釋?
請溫柔,但是這小小的一段代碼返回這片一些我想要的東西是什麼?
ArrayList<String> persons = new ArrayList<String>();
// write your code here
String _previous = "";
//Sample output form entities.txt
//USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4,Berkowitz,PERSON,9,10660
//USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4,Marottoli,PERSON,9,10685
File file = new File("entities.txt");
try {
//
// Create a new Scanner object which will read the data
// from the file passed in. To check if there are more
// line to read from it we check by calling the
// scanner.hasNextLine() method. We then read line one
// by one till all line is read.
//
Scanner scanner = new Scanner(file);
while (scanner.hasNextLine()) {
if(_previous == "" || _previous == null)
_previous = scanner.nextLine();
String _current = scanner.nextLine();
//Compare the lines, if there offset is = 1
int x = Integer.parseInt(_previous.split(",")[3]) + Integer.parseInt(_previous.split(",")[4]);
int y = Integer.parseInt(_current.split(",")[4]);
if(y-x == 1){
persons.add(_previous.split(",")[1] + " " + _current.split(",")[1]);
if(scanner.hasNextLine()){
_current = scanner.nextLine();
}
}else{
persons.add(_previous.split(",")[1]);
}
_previous = _current;
}
} catch (Exception e) {
e.printStackTrace();
}
for(String person : persons){
System.out.println(person);
}
工作樣本數據
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4,Richard,PERSON,7,2732
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4,Marottoli,PERSON,9,2740
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4,Marottoli,PERSON,9,2756
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4,Marottoli,PERSON,9,3093
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4,Marottoli,PERSON,9,3195
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4,Berkowitz,PERSON,9,3220
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4,Berkowitz,PERSON,9,10660
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4,Marottoli,PERSON,9,10685
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4,Lea,PERSON,3,10858
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4,Lea,PERSON,3,11063
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4,Ken,PERSON,3,11186
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4,Marottoli,PERSON,9,11234
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4,Berkowitz,PERSON,9,17073
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4,Lea,PERSON,3,17095
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4,Stephanie,PERSON,9,17330
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4,Putt,PERSON,4,17340
其中產生這樣的輸出
Richard Marottoli
Marottoli
Marottoli
Marottoli
Berkowitz
Berkowitz
Marottoli
Lea
Lea
Ken
Marottoli
Berkowitz
Lea
Stephanie Putt
親切的問候
我對輸出如何派生有點困惑,但我認爲這是非常類似於[這個問題](http://stackoverflow.com/questions/14028796/reduce-a-set-of-rows -hive-to-another-set-of-rows)我在配置單元中使用自定義映射/減少來回答。你只需要提供適當的reduce腳本。 – libjack
我用一段java代碼編寫我的問題,樣本數據和輸出。我想將我的java代碼轉換爲配置單元代碼。任何想法,如果這是可能的? – Tinuz
抱歉,您的其他代碼仍未明確說明您要完成的任務。較新的代碼/數據看起來像要將表加載到配置單元並提取列(這很可能),而前者以某種方式組合行。 – libjack