2016-07-28 59 views
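OrientDB ETL loading CSV with vertices in one file and edges in another

I have some data in 2 CSV files, one containing the vertices and the other containing the edges. I'm working out how to set this up using ETL and am close but not quite there yet: it mostly works, but my edges have properties and I'm not sure they're being loaded correctly. This question was helpful, but I'm still missing something...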

Here's my data:

vertices.csv

label,data,date 
v01,0.1234,2015-01-01 
v02,0.5678,2015-01-02 
v03,0.9012,2015-01-03 

edges.csv

u,v,weight,date 
v01,v02,12.4,2015-06-17 
v02,v03,17.9,2015-09-14 

I'm importing my vertices with this:

commonVertices.json

{ 
"begin": [ 
      { "let": { "name":  "$filePath", 
         "expression": "$fileDirectory.append($fileName)" 
         } 
      } 
     ], 
"config": { "log": "info"}, 
"source": { "file": { "path": "$filePath" } }, 
"extractor": { "csv": { "ignoreEmptyLines": true, 
         "nullValue": "N/A", 
         "dateFormat": "yyyy-mm-dd" 
         } 
      }, 
"transformers": [ 
        { "vertex": { "class": "myVertex" } }, 
        { "code": { "language": "Javascript", 
            "code":  "print(' Current record: ' + record); record;" } 
        } 
       ], 
"loader": { "orientdb": { 
      "dbURL": "plocal:my_orientdb", 
      "dbType": "graph", 
      "batchCommit": 1000, 
      "classes": [ { "name": "myVertex", "extends": "V" } 
         ], 
      "indexes": [] 
      } 
      } 
} 

vertices.json

{ "config": { "log":   "info", 
       "fileDirectory": "./", 
       "fileName":  "vertices.csv" 
      } 
} 

commonEdges.json

{ 
    "begin": [ 
     { "let": { "name": "$filePath", 
        "expression": "$fileDirectory.append($fileName)" 
       } 
     } 
    ], 

    "config": { "log": "info" 
       }, 

    "source": { "file": { "path": "$filePath" } }, 

    "extractor": { "csv": { "ignoreEmptyLines": true, 
          "nullValue": "N/A", 
          "dateFormat": "yyyy-mm-dd" 
          } 
       }, 

    "transformers": [ 
      { "merge": { "joinFieldName": "u", "lookup": "myVertex.label" } }, 
      { "edge": { "class":   "myEdge", 
          "joinFieldName": "v", 
          "lookup":  "myVertex.label", 
          "direction":  "out", 
          "unresolvedLinkAction": "NOTHING" 
         } 
      }, 
      { "field": { "fieldNames": ["u", "v"], "operation": "remove" } } 
     ], 

    "loader": { 
     "orientdb": { 
      "dbURL": "plocal:my_orientdb", 
      "dbType": "graph", 
      "batchCommit": 1000, 
      "useLightweightEdges": false, 
      "classes": [ 
       { "name": "myEdge", "extends": "E" } 
      ], 
      "indexes": [] 
     } 
    } 
} 

edges.json

{ 
    "config": { 
     "log": "info", 
     "fileDirectory": "./", 
     "fileName": "edges.csv" 
    } 
} 

I run it with oetl.sh like this:

$ oetl.sh vertices.json commonVertices.json 
$ oetl.sh edges.json commonEdges.json 

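Everything runs, but when I query the edges I don't see the weight and date fields. I'm new to OrientDB, so maybe this isn't the right way to get at the properties on my edges: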

orientdb {db=my_orientdb}> SELECT FROM myEdge 
+----+-----+------+-----+-----+ 
|# |@RID |@CLASS|out |in | 
+----+-----+------+-----+-----+ 
|0 |#33:0|myEdge|#25:0|#26:0| 
|1 |#34:0|myEdge|#26:0|#27:0| 
+----+-----+------+-----+-----+ 

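The vertex table contains the [weight] field from my edges.csv, and the [date] field is getting clobbered in a weird way. The day of the month is being overwritten with the day from the edges.csv file, which is undesirable, but oddly the month itself isn't changing as well: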

orientdb {db=my_orientdb}> SELECT FROM myVertex 
+----+-----+--------+------+-------------------+-----+------+----------+---------+ 
|# |@RID |@CLASS |data |date    |label|weight|out_myEdge|in_myEdge| 
+----+-----+--------+------+-------------------+-----+------+----------+---------+ 
|0 |#25:0|myVertex|0.1234|2015-01-17 00:06:00|v01 |12.4 |[#33:0] |   | 
|1 |#26:0|myVertex|0.5678|2015-01-14 00:09:00|v02 |17.9 |[#34:0] |[#33:0] | 
|2 |#27:0|myVertex|0.9012|2015-01-03 00:01:00|v03 |  |   |[#34:0] | 
+----+-----+--------+------+-------------------+-----+------+----------+---------+ 

I'm sure this is probably a simple tweak; any help would be great!

Answer

Use edgeFields in the edge transformer to bind properties to the edge. For example:

"transformers": [ 
      { "merge": { "joinFieldName": "u", "lookup": "myVertex.label" } }, 
      { "edge": { "class":   "myEdge", 
          "joinFieldName": "v", 
          "lookup":  "myVertex.label", 
          "edgeFields": { "weight": "${input.weight}", "date": "${input.date}" }, 
          "direction":  "out", 
          "unresolvedLinkAction": "NOTHING" 
         } 
      }, 
      { "field": { "fieldNames": ["u", "v"], "operation": "remove" } } 
     ], 
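
As for the odd dates, a likely cause is the `dateFormat` pattern rather than the edge transformer. Assuming the CSV extractor hands this pattern to Java's `SimpleDateFormat` (an assumption on my part, though the output in the question is consistent with it), lowercase `mm` means minutes and uppercase `MM` means month. Parsing `2015-06-17` with `yyyy-mm-dd` then gives January 17 at 00:06, which matches the `2015-01-17 00:06:00` shown for v01. A minimal sketch of the extractor block with the month pattern corrected, to be used in both commonVertices.json and commonEdges.json:

```json
"extractor": { "csv": { "ignoreEmptyLines": true,
                        "nullValue": "N/A",
                        "dateFormat": "yyyy-MM-dd"
                      }
             },
```

This only changes how the dates are parsed. It does not by itself stop the `merge` step from copying the edge row's `weight` and `date` onto the matched vertex, which appears to be what fills those fields in the vertex table.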

Hope it helps.

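Thanks, this solves one of the two problems I ran into in this question. – TxAG98

I posted a follow-up specifically about the date-field behavior as [another question](http://stackoverflow.com/questions/38702959/edge-properties-clobbering-vertex-properties-in-orientdb-from-etl)... – TxAG98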
