Concat every n rows in Pentaho

I want to build a transformation in which I need to concatenate the values of every 10 rows.

Step 1: Table input (query against a Postgres DB: select id from tablename)

Sample result of the above query:

    id 
00000191-555c-11e4-922d-29fb57a42e4c 
00000192-555c-11e4-922d-29fb57a42e4c 
00000193-555c-11e4-922d-29fb57a42e4c 
00000194-555c-11e4-922d-29fb57a42e4c 
00000195-555c-11e4-922d-29fb57a42e4c 
00000196-555c-11e4-922d-29fb57a42e4c 
00000197-555c-11e4-922d-29fb57a42e4c 
00000198-555c-11e4-922d-29fb57a42e4c 
00000199-555c-11e4-922d-29fb57a42e4c 
0000019a-555c-11e4-922d-29fb57a42e4c 
000001a3-3cf2-11e4-b398-e52ee0ec6a4c 
000002ad-3768-4242-88cf-96f27d0263af 
000003ea-26e3-11e4-ace7-15c7d609fa6e 
00000684-73fb-4d65-a502-87c4eb6607c1 
0000087a-f587-44fa-8e88-7bcae5bcb22c 
00000889-39c5-11e4-bd0e-c3f9d65ac856 
0000094c-be98-4456-8b49-6357a36581aa 
00000987-2f19-4574-ab85-6744a65ee4e3 
00000cd0-4097-11e4-a4e6-af71a3d902c0 
00000e1e-3b55-11e4-9897-d958d55e6784

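Here I have to concat the ids of every 10 rows into a single row, e.g. one row with the ids of rows 1-10, another row with the ids of rows 11-20, and so on.

Expected output: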

ids 

00000191-555c-11e4-922d-29fb57a42e4c,00000192-555c-11e4-922d-29fb57a42e4c,00000193-555c-11e4-922d-29fb57a42e4c,00000194-555c-11e4-922d-29fb57a42e4c,00000195-555c-11e4-922d-29fb57a42e4c,00000196-555c-11e4-922d-29fb57a42e4c,00000197-555c-11e4-922d-29fb57a42e4c,00000198-555c-11e4-922d-29fb57a42e4c,00000199-555c-11e4-922d-29fb57a42e4c,0000019a-555c-11e4-922d-29fb57a42e4c 
000001a3-3cf2-11e4-b398-e52ee0ec6a4c,000002ad-3768-4242-88cf-96f27d0263af,000003ea-26e3-11e4-ace7-15c7d609fa6e,00000684-73fb-4d65-a502-87c4eb6607c1,0000087a-f587-44fa-8e88-7bcae5bcb22c,00000889-39c5-11e4-bd0e-c3f9d65ac856,0000094c-be98-4456-8b49-6357a36581aa,00000987-2f19-4574-ab85-6744a65ee4e3,00000cd0-4097-11e4-a4e6-af71a3d902c0,00000e1e-3b55-11e4-9897-d958d55e6784

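I know that the Group by or Memory group by step can concatenate rows, but can I use it in this case, and if so, how?

Please help me with this. Thanks in advance!

Source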

2017-03-06 Arunraj

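Edit your question and provide sample data and desired results. –

@GordonLinoff Added sample data and expected output. I hope it is understandable now. – Arunraj

Something like this?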

t=# \x 
Expanded display is on. 
t=# with a as 
(
    select ntile(2) over (order by id),id from tablename 
) 
select 
    string_agg(id,',') 
from a 
group by ntile; 
-[ RECORD 1 ]------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 
string_agg | 00000191-555c-11e4-922d-29fb57a42e4c, 00000192-555c-11e4-922d-29fb57a42e4c, 00000193-555c-11e4-922d-29fb57a42e4c, 00000194-555c-11e4-922d-29fb57a42e4c, 00000195-555c-11e4-922d-29fb57a42e4c, 00000196-555c-11e4-922d-29fb57a42e4c, 00000197-555c-11e4-922d-29fb57a42e4c, 00000198-555c-11e4-922d-29fb57a42e4c, 00000199-555c-11e4-922d-29fb57a42e4c, 0000019a-555c-11e4-922d-29fb57a42e4c 
-[ RECORD 2 ]------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 
string_agg | 000001a3-3cf2-11e4-b398-e52ee0ec6a4c, 000002ad-3768-4242-88cf-96f27d0263af, 000003ea-26e3-11e4-ace7-15c7d609fa6e, 00000684-73fb-4d65-a502-87c4eb6607c1, 0000087a-f587-44fa-8e88-7bcae5bcb22c, 00000889-39c5-11e4-bd0e-c3f9d65ac856, 0000094c-be98-4456-8b49-6357a36581aa, 00000987-2f19-4574-ab85-6744a65ee4e3, 00000cd0-4097-11e4-a4e6-af71a3d902c0, 00000e1e-3b55-11e4-9897-d958d55e6784

Source

2017-03-06 13:19:36

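Won't ntile(2) aggregate a different number of rows depending on the size of the data set? – user4637357

Sure - I used ntile(20/10) as an example. –

@VaoTsun How does the above query perform when the table has millions of records? – Arunraj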
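On the ntile point raised in the comments: ntile splits the rows into a fixed number of buckets of nearly equal size, so it only yields groups of exactly 10 when the row count is a multiple of 10 and the bucket count is row count / 10. A minimal Python sketch of the difference, with made-up row counts (the helper names are hypothetical, and this only simulates the SQL semantics, it does not touch a database):

```python
def ntile(n_rows, buckets):
    """Return the 1-based bucket number of each row, following SQL ntile
    semantics: buckets are as equal as possible, larger buckets first."""
    base, extra = divmod(n_rows, buckets)
    labels = []
    for bucket in range(1, buckets + 1):
        size = base + (1 if bucket <= extra else 0)
        labels.extend([bucket] * size)
    return labels


def fixed_chunks(n_rows, size):
    """Return the 1-based group number of each row for groups of `size` rows."""
    return [(i // size) + 1 for i in range(n_rows)]


def group_sizes(labels):
    """Count how many rows fall into each group, ordered by group number."""
    sizes = {}
    for label in labels:
        sizes[label] = sizes.get(label, 0) + 1
    return [sizes[key] for key in sorted(sizes)]


# 25 rows: ntile(3) spreads them evenly, fixed chunks keep 10 per group.
ntile_sizes = group_sizes(ntile(25, 3))
chunk_sizes = group_sizes(fixed_chunks(25, 10))
print(ntile_sizes)  # [9, 8, 8]
print(chunk_sizes)  # [10, 10, 5]
```

With the 20 sample rows from the question both approaches agree, which is why ntile(2) produces the expected output there.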

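If you don't have a suitable field to group your ids by, create one yourself.

In this case, I would add a row number in the query and divide it by 10 to get a decent and easily configurable group.

select (row_number() OVER (ORDER BY id) - 1) / 10 + 1 as rnum, id from tablename ORDER BY rnum

This should give you 10 rows with rnum 1, 10 rows with rnum 2, and so on. Configure this field as the group-by field and you're done.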
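To check the arithmetic behind the group field: with 1-based row numbers, the group is (row number - 1) integer-divided by 10, plus 1. Dividing the row number itself would put row 10 into the second group and leave only 9 rows in the first. A rough Python sketch of the same grouping and concatenation, using made-up ids rather than the real table (the function name and sample values are hypothetical):

```python
GROUP_SIZE = 10


def concat_every_n(ids, n=GROUP_SIZE):
    """Sort the ids, number them from 1, and join each run of n ids with commas."""
    groups = {}
    for row_number, value in enumerate(sorted(ids), start=1):
        # Subtract 1 first so rows 1..n land in group 1, rows n+1..2n in group 2.
        rnum = (row_number - 1) // n + 1
        groups.setdefault(rnum, []).append(value)
    return [",".join(groups[rnum]) for rnum in sorted(groups)]


# Hypothetical stand-ins for the 20 ids in the question.
sample_ids = [f"id{i:02d}" for i in range(1, 21)]
rows = concat_every_n(sample_ids)
for row in rows:
    print(row)
```

Changing GROUP_SIZE is the only edit needed to concatenate every n rows instead of every 10.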

Source

2017-03-06 13:17:43 Cyrus

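Alternatively, you can create the group field in PDI by combining an "Add sequence" step with a Calculator, UDJC, JavaScript step, or Java expression that divides the counter by 10. The PDI way is clumsier, but it may be useful if you need to work with other data sources. – user4637357

By the way, you need to make sure the input of the Group By step is sorted on the group field to get a correct aggregation. So it is best to include ORDER BY id or ORDER BY rnum in the query above. I don't think PostgreSQL gives any ordering guarantee for row_number values in the general case. – user4637357

@user4637357 I will add it to the answer, but the sort happens during the window function, so it should be unnecessary. With no other operation specified, the optimizer won't shuffle the records again. At least, I have never seen that happen in other RDBMSs. – Cyrus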
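The sorting remark in the comments above is easier to see with a small analogy. Python's itertools.groupby behaves like a streaming group-by: it starts a new group whenever the key changes from the previous row, so unsorted input produces fragmented groups. This is only an illustration of that behaviour with made-up rows, not PDI code:

```python
from itertools import groupby


def group_consecutive(rows):
    """Concatenate ids of consecutive (rnum, id) rows that share the same rnum."""
    return [
        (key, ",".join(id_ for _, id_ in group))
        for key, group in groupby(rows, key=lambda row: row[0])
    ]


# Hypothetical (rnum, id) rows: once sorted on rnum, once interleaved.
sorted_rows = [(1, "a"), (1, "b"), (2, "c"), (2, "d")]
shuffled_rows = [(1, "a"), (2, "c"), (1, "b"), (2, "d")]

sorted_result = group_consecutive(sorted_rows)
shuffled_result = group_consecutive(shuffled_rows)
print(sorted_result)    # [(1, 'a,b'), (2, 'c,d')]
print(shuffled_result)  # [(1, 'a'), (2, 'c'), (1, 'b'), (2, 'd')]
```

An explicit ORDER BY on the group field costs little and removes the dependency on how the database happens to return the rows.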

I think the solution is:

select string_agg(id, ',') 
from (select t.*, row_number() over (order by id) - 1 as seqnum 
     from t 
    ) t 
group by floor(seqnum/10);

This uses string_agg(), although I would probably use an array for the result.

Source

2017-03-07 02:22:50
