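I have a file containing a mix of data and text. I want to read it so that only the lines with three coordinates are kept, that is, lines formatted like 490353.36, 3755632.81, 109.73. In other words, I want to keep the data that follows each SURFACE LINE: header. The data are x, y and z coordinates at different cross-sections. I want to read the data in R only where a line has three columns.
Sample data are shown below: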
ENDSTREAMNETWORK:
BEGIN CROSS-SECTIONS:
CROSS-SECTION:
STREAM ID:Sipsey Fork
REACH ID:Sipsey Fork
STATION:13.60
NODE NAME:
CUT LINE:
490353.358391478 , 3755632.80772044
490254.511677942 , 3755640.28160111
490229.8 , 3755642.15
490205.088314326 , 3755644.01839947
490130.953109393 , 3755649.62143546
SURFACE LINE:
490353.36, 3755632.81, 109.73
490341.00, 3755633.74, 103.63
490331.74, 3755634.44, 97.54
490276.13, 3755638.65, 91.44
490263.78, 3755639.58, 85.34
490254.51, 3755640.28, 79.25
490254.51, 3755640.28, 79.25
490242.16, 3755641.22, 75.59
490229.80, 3755642.15, 75.59
490217.44, 3755643.08, 75.59
490205.09, 3755644.02, 79.25
490205.09, 3755644.02, 79.25
490186.55, 3755645.42, 85.34
490177.29, 3755646.12, 91.44
490158.75, 3755647.52, 97.54
490146.40, 3755648.45, 103.63
490130.95, 3755649.62, 109.73
END:
CROSS-SECTION:
STREAM ID:Sipsey Fork
REACH ID:Sipsey Fork
STATION:13.552*
NODE NAME:
CUT LINE:
490348.236792825 , 3755554.44864345
490248.581497463 , 3755561.99219479
490223.87626427 , 3755563.8637565
490199.171038808 , 3755565.73531763
490122.732478269 , 3755571.5258566
SURFACE LINE:
490348.24, 3755554.45, 109.73
490335.78, 3755555.39, 103.68
490332.73, 3755555.62, 101.72
490326.44, 3755556.10, 97.65
490321.09, 3755556.50, 96.98
490279.74, 3755559.63, 92.42
490270.38, 3755560.34, 91.35
490262.42, 3755560.94, 87.53
490258.64, 3755561.23, 85.56
490257.92, 3755561.29, 85.22
490253.65, 3755561.61, 82.50
490248.58, 3755561.99, 79.27
490248.58, 3755561.99, 79.27
490245.75, 3755562.21, 78.40
490243.64, 3755562.37, 77.73
490236.08, 3755562.94, 75.58
490223.88, 3755563.86, 75.58
490212.36, 3755564.74, 75.58
490209.15, 3755564.98, 76.44
490206.21, 3755565.20, 77.24
490200.50, 3755565.63, 78.84
490199.17, 3755565.74, 79.26
490199.17, 3755565.74, 79.26
490197.66, 3755565.85, 79.78
490193.00, 3755566.20, 81.22
490186.72, 3755566.68, 83.20
490182.06, 3755567.03, 84.83
490180.06, 3755567.18, 85.47
490170.51, 3755567.91, 91.44
490170.23, 3755567.93, 91.52
490151.40, 3755569.35, 97.45
490141.55, 3755570.10, 102.06
490138.66, 3755570.32, 103.48
490133.49, 3755570.71, 105.53
490122.73, 3755571.53, 109.73
END:
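I have thousands of lines like those shown above. I just want to collect all the rows that have three comma-separated columns and save them as a data frame in R.
The desired output for the sample above is shown below. The commas should also be removed.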
490353.36, 3755632.81, 109.73
490341.00, 3755633.74, 103.63
490331.74, 3755634.44, 97.54
490276.13, 3755638.65, 91.44
490263.78, 3755639.58, 85.34
490254.51, 3755640.28, 79.25
490254.51, 3755640.28, 79.25
490242.16, 3755641.22, 75.59
490229.80, 3755642.15, 75.59
490217.44, 3755643.08, 75.59
490205.09, 3755644.02, 79.25
490205.09, 3755644.02, 79.25
490186.55, 3755645.42, 85.34
490177.29, 3755646.12, 91.44
490158.75, 3755647.52, 97.54
490146.40, 3755648.45, 103.63
490130.95, 3755649.62, 109.73
490348.24, 3755554.45, 109.73
490335.78, 3755555.39, 103.68
490332.73, 3755555.62, 101.72
490326.44, 3755556.10, 97.65
490321.09, 3755556.50, 96.98
490279.74, 3755559.63, 92.42
490270.38, 3755560.34, 91.35
490262.42, 3755560.94, 87.53
490258.64, 3755561.23, 85.56
490257.92, 3755561.29, 85.22
490253.65, 3755561.61, 82.50
490248.58, 3755561.99, 79.27
490248.58, 3755561.99, 79.27
490245.75, 3755562.21, 78.40
490243.64, 3755562.37, 77.73
490236.08, 3755562.94, 75.58
490223.88, 3755563.86, 75.58
490212.36, 3755564.74, 75.58
490209.15, 3755564.98, 76.44
490206.21, 3755565.20, 77.24
490200.50, 3755565.63, 78.84
490199.17, 3755565.74, 79.26
490199.17, 3755565.74, 79.26
490197.66, 3755565.85, 79.78
490193.00, 3755566.20, 81.22
490186.72, 3755566.68, 83.20
490182.06, 3755567.03, 84.83
490180.06, 3755567.18, 85.47
490170.51, 3755567.91, 91.44
490170.23, 3755567.93, 91.52
490151.40, 3755569.35, 97.45
490141.55, 3755570.10, 102.06
490138.66, 3755570.32, 103.48
490133.49, 3755570.71, 105.53
490122.73, 3755571.53, 109.73
If you are on Linux or otherwise have awk available, this one-liner may help: awk 'BEGIN {FS = ","} {if (NF == 3) print}' raw_text – dickoa
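The one-liner above keeps every line with three comma-separated fields. A slightly stricter variant is sketched below: it only keeps rows between a SURFACE LINE: header and the next END:, and prints them without commas. This is a minimal sketch, not a tested solution for the full file. The function name extract_surface_xyz is hypothetical, and it assumes every surface block is closed by an END: line, as in the sample.

```shell
#!/bin/sh
# Hypothetical helper: print the x y z rows found between "SURFACE LINE:"
# and "END:", separated by spaces instead of commas.
# Reads the files given as arguments, or standard input if there are none.
extract_surface_xyz() {
  awk '
    /^[ \t]*SURFACE LINE:/ { in_surface = 1; next }  # a surface block starts
    /^[ \t]*END:/          { in_surface = 0; next }  # the cross-section ends
    in_surface {
      gsub(/^[ \t]+|[ \t\r]+$/, "")                  # trim blanks and a trailing CR
      # keep the row only if it splits into exactly three comma-separated fields
      if (split($0, f, "[ \t]*,[ \t]*") == 3) print f[1], f[2], f[3]
    }
  ' "$@"
}

# Demo on a few lines copied from the sample data in the question.
extract_surface_xyz <<'EOF'
CUT LINE:
490353.358391478 , 3755632.80772044
SURFACE LINE:
490353.36, 3755632.81, 109.73
490341.00, 3755633.74, 103.63
END:
EOF
# prints:
# 490353.36 3755632.81 109.73
# 490341.00 3755633.74 103.63
```

The output could be redirected to a file, for example extract_surface_xyz raw_text > xyz.txt, and then loaded in R with read.table("xyz.txt", col.names = c("x", "y", "z")). The file name xyz.txt and the column names are placeholders.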