2013-10-08

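The manual and documentation make extensive use of the terms "inner bag" and "outer bag" (for example: http://pig.apache.org/docs/r0.11.1/basic.html), but I still cannot pin down the exact definition of either term. What is the difference between an "outer bag" and an "inner bag" in Pig Latin?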

For example, all of these closely related questions:

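  • If I hand you a bag "foo", what would you need to know in order to label foo as an "inner bag" versus an "outer bag"?
  • Is the outermost bag the "outer bag", and any bag nested within it an "inner bag"?
  • Are the labels "inner" and "outer" always mutually exclusive?
  • In Pig Latin, are all bags "relations", or is only the outermost bag a relation (so that inner bags are not relations)?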

To give a concrete example for discussion:

grunt> dump A;  
(1,2,3) 
(4,2,1) 
(8,3,4) 
(4,3,3) 


grunt> W1 = GROUP A ALL;   
grunt> W2 = GROUP W1 ALL; 
grunt> W3 = GROUP W2 ALL; 
grunt> W4 = GROUP W3 ALL; 

grunt> describe W4; 
W4: {group: chararray,W3: {(group: chararray,W2: {(group: chararray,W1: {(group: chararray,A: {(f1: int,f2: int,f3: int)})})})}} 


grunt> illustrate W4; 
(1,2,3) 
--------------------------------------------------- 
| A  | f1:int  | f2:int  | f3:int  | 
--------------------------------------------------- 
|  | 1   | 2   | 3   | 
|  | 8   | 3   | 4   | 
--------------------------------------------------- 
------------------------------------------------------------------------------------------------ 
| W1  | group:chararray  | A:bag{:tuple(f1:int,f2:int,f3:int)}       | 
------------------------------------------------------------------------------------------------ 
|  | all     | {(1, 2, 3), (8, 3, 4)}          | 
------------------------------------------------------------------------------------------------ 
----------------------------------------------------------------------------------------------------------------------------------------------- 
| W2  | group:chararray  | W1:bag{:tuple(group:chararray,A:bag{:tuple(f1:int,f2:int,f3:int)})}           | 
----------------------------------------------------------------------------------------------------------------------------------------------- 
|  | all     | {(all, {(1, 2, 3), (8, 3, 4)})}                    | 
----------------------------------------------------------------------------------------------------------------------------------------------- 
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 
| W3  | group:chararray  | W2:bag{:tuple(group:chararray,W1:bag{:tuple(group:chararray,A:bag{:tuple(f1:int,f2:int,f3:int)})})}              | 
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 
|  | all     | {(all, {(all, {(1, 2, 3), (8, 3, 4)})})}                             | 
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 
| W4  | group:chararray  | W3:bag{:tuple(group:chararray,W2:bag{:tuple(group:chararray,W1:bag{:tuple(group:chararray,A:bag{:tuple(f1:int,f2:int,f3:int)})})})}                  | 
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 
|  | all     | {(all, {(all, {(all, {(1, 2, 3), (8, 3, 4)})})})}                                       | 
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 

grunt> dump W4; 
(all,{(all,{(all,{(all,{(1,2,3),(4,2,1),(8,3,4),(4,3,3)})})})}) 

Of the bags W1, W2, W3 and W4, which are inner and which are outer?

Answer

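The outer bag is actually the relation A. That sounds a little odd, but it becomes clear once you know what an inner bag is. For readability, let's look only at W1, since the additional levels of nesting do not change the answer.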

Schema and output of W1:

W1: {group:chararray, A:bag{:tuple(f1:int,f2:int,f3:int)}} 
(all,{(1, 2, 3), (8, 3, 4)}) 

We can see that W1 has a field named A, and that this field is a bag. It is an inner bag, because the bag is a field within a relation.
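As a rough illustration of what "a bag that is a field" means in practice, the sketch below operates on the inner bag A from inside a FOREACH over W1. The file name `data.txt` and the aliases C, D and big are hypothetical and not part of the original question; the schema is the one shown in the question's `describe` output.

```pig
-- Hypothetical input: 'data.txt' holds the rows of A, tab-delimited
-- (tab is the default delimiter of the default loader).
A = LOAD 'data.txt' AS (f1:int, f2:int, f3:int);
W1 = GROUP A ALL;

-- Inside FOREACH, A refers to the inner bag stored in each tuple of W1,
-- so COUNT is applied to that bag.
C = FOREACH W1 GENERATE group, COUNT(A) AS n;

-- A nested block can filter the inner bag before it is used.
D = FOREACH W1 {
    big = FILTER A BY f1 > 3;
    GENERATE group, COUNT(big) AS n_big;
};
```

In both statements the thing being iterated over (W1) is the relation, while A is only reachable as a field of each tuple.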

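Remember that a bag is just an unordered collection of tuples, which is exactly what we see in the output of W1. Now look at the output of relation A: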

(1,2,3) 
(4,2,1) 
(8,3,4) 
(4,3,3) 

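Pig does not guarantee the order of these tuples (unless you use ORDER or something similar). So, if you think about it, relation A is really just an unordered collection of tuples. That is an outer bag.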
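One way to see that the inner bag and the outer bag hold the same kind of data is to un-nest the inner bag with FLATTEN, which should yield a relation with one tuple per original row of A. This is only a sketch: the file name `data.txt` and the alias F are assumptions introduced for illustration.

```pig
-- Hypothetical input file containing the rows of A, tab-delimited.
A = LOAD 'data.txt' AS (f1:int, f2:int, f3:int);

-- A is a relation, i.e. an outer bag. After grouping, the same tuples
-- sit in an inner bag, as the field A of W1's single tuple.
W1 = GROUP A ALL;

-- FLATTEN un-nests the inner bag, so F is again a relation (outer bag)
-- whose tuples are the tuples that were inside the inner bag.
F = FOREACH W1 GENERATE FLATTEN(A);

DESCRIBE F;
DUMP F;
```

As with A itself, the order of the tuples printed by DUMP F is not guaranteed.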

You can find some examples of this here.

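This helped, thanks. I think I get it now: a bag that is not wrapped in any other bag is an "outer bag", which also happens to be a "relation". If it contains any bags, each of those is an "inner bag" (and not an "outer bag").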