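C++ file reading and splitting columns

I'm not familiar with file reading in C++, but I have done a lot of it with PySpark. I have a txt file whose contents look like this: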
1 52 Hayden Smith 18:16 15 M Berlin
2 54 Mark Puleo 18:25 15 M Berlin
3 97 Peter Warrington 18:26 29 M New haven
4 305 Matt Kasprzak 18:53 33 M Falls Church
5 272 Kevin Solar 19:17 16 M Sterling
6 394 Daniel Sullivan 19:35 26 M Sterling
7 42 Kevan DuPont 19:58 18 M Boylston
8 306 Chris Goethert 20:00 43 M Falls Church
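As you can see, there are 8 columns and 351 rows (I'm only showing 8 of the rows). For each row, [0] is the rank, [1] is the BIB, [2] is the first name, [3] is the last name, [4] is the time, [5] is the age, [6] is the gender, and [7] is the town. For example, in the first row the rank is 1, the BIB is 52, the name is Hayden Smith, the time is 18:16, the age is 15, M means male, and the town is Berlin.
I have a sorted linked structure, let's call it class SortedLinked, and an item type class, let's call it class Runner.
You don't need to worry about the SortedLinked class.
The Runner class has four private attributes: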
string name, int age, int min, int sec
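For reference, a class with those four attributes might look roughly like the sketch below. Only the attribute names and the four-argument constructor come from the question; the accessor names (name, age, minutes, seconds) and the totalSeconds helper are hypothetical and only there for illustration.

```cpp
#include <string>
#include <utility>

// Hypothetical sketch of the Runner class. Only the four private
// attributes and the constructor arguments come from the question.
class Runner {
public:
    Runner(std::string name, int age, int min, int sec)
        : name_(std::move(name)), age_(age), min_(min), sec_(sec) {}

    // Assumed read-only accessors.
    const std::string& name() const { return name_; }
    int age() const { return age_; }
    int minutes() const { return min_; }
    int seconds() const { return sec_; }

    // Assumed helper: total finishing time in seconds, usable as a sort key.
    int totalSeconds() const { return min_ * 60 + sec_; }

private:
    std::string name_;
    int age_;
    int min_;
    int sec_;
};
```

Storing the time as two ints, as the question does, keeps parsing simple; a single seconds value would make comparisons inside a sorted list a little easier, which is why the sketch adds totalSeconds.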
In my driver file, I can do this:
SortedLinked mylist; // initialize a sorted list
Runner M("Jordan", 22, 20, 20); // initialize a Runner called Jordan, who is 22 years old, and finished the race in 20 mins and 20 sec
mylist.add(M); // add Runner M into my sorted list
So I need to read the text file and, for each runner, create a Runner object from the runner's name, age, minutes, and seconds, then insert that Runner into the sorted linked list.
If this were PySpark, I could do this:
file = sc.textFile("hdfs")  # we usually use hdfs in pyspark
newfile = file.map(lambda line: line.split('\t'))  # the columns are separated by tabs, except columns [2] and [3], which are separated by a space
ColumnIneed = newfile.map(lambda r: [r[2], r[3], r[4], r[5]])  # I only need columns [2][3][4][5]
mylist = ColumnIneed.collect()  # transform the RDD into a list
Then I can just transform every row into a Runner object.
But in C++, this is all I know:
ifstream infile;
string s, sAll;
if(infile.is_open())
{
    while(getline(infile, s))
    {
        s = s.rstrip('\n') //does NOT work in C++
        name, age, time = s.split('\t') // Does NOT work in C++ and I dont need all the columns
    }
}
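As a rough sketch of what could stand in for rstrip and split here: std::getline already discards the trailing '\n', and a std::istringstream combined with the three-argument std::getline can break a line on tabs. The helper names splitFields and stripCarriageReturn are made up for this example. Note that if the first and last name really are separated by a space rather than a tab, a tab split yields 7 fields and the full name arrives as a single field at index 2.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split one line into fields on a delimiter (tab by default).
std::vector<std::string> splitFields(const std::string& line, char delim = '\t') {
    std::vector<std::string> fields;
    std::istringstream iss(line);
    std::string field;
    // The three-argument getline reads up to (and discards) each delimiter.
    while (std::getline(iss, field, delim)) {
        fields.push_back(field);
    }
    return fields;
}

// Hypothetical helper: drop a trailing '\r' left over from Windows line endings.
// The '\n' itself is already removed by std::getline when reading the file.
std::string stripCarriageReturn(std::string line) {
    if (!line.empty() && line.back() == '\r') {
        line.pop_back();
    }
    return line;
}
```

Inside the read loop this would be used as something like `auto fields = splitFields(stripCarriageReturn(s));`, after which only the needed indices are read.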
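So, my questions:
1. I need to access each line and strip the newline character.
2. I only need columns [2], [3], [4], and [5]. // the columns are separated by tabs
3. Column [4] is the time, which is a string in the text file; I need to split it on ':' and store the parts as minutes and seconds.
4. Columns [2] and [3] are the first and last name; I need to combine them into the string name.
5. Columns [2] and [3] are separated by a space.
Ideally, I would like to do something like this: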
while(I need a loop)
{
eachline = access each line;
eachline.strip('\n') //strip newline
eachline.split('\t') //split Tabs
string name = eachline[2][3];
string time = eachline[4];
int min;
int sec;
min, sec = time.split(':')
int age = eachline[5];
Runner M(name, age, min, sec) //I don't know if this works, because it looks like you are overwriting the Runner M each time you access a new line.
mylist.add(M) //add M into my linkedlist, this step you don't need to worry, I already finished.
}
If you have a better way to do this, I would really appreciate it.
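One possible way to write that loop in C++ is sketched below, under a few assumptions: every name is exactly two words, the time always has the form minutes:seconds, and a plain struct (RunnerRecord) plus std::vector stand in for the real Runner and SortedLinked classes, which are not shown in the question. The function names parseRunner and readRunners are made up for this example. It uses operator>> instead of an explicit split, because formatted extraction skips tabs and spaces alike.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Stand-in for the question's Runner class (public fields for brevity).
struct RunnerRecord {
    std::string name;
    int age;
    int min;
    int sec;
};

// Parse one line such as "1 52 Hayden Smith 18:16 15 M Berlin".
// Returns false if the line does not match the expected layout.
// Assumes every name is exactly two words (first and last).
bool parseRunner(const std::string& line, RunnerRecord& out) {
    std::istringstream iss(line);
    int rank = 0, bib = 0;
    std::string first, last;
    char colon = '\0';
    // operator>> skips tabs and spaces alike, so the separator does not matter.
    // Reading an int from "18:16" stops at ':', which is then read as a char.
    if (!(iss >> rank >> bib >> first >> last
              >> out.min >> colon >> out.sec >> out.age)) {
        return false;
    }
    if (colon != ':') {
        return false;
    }
    out.name = first + " " + last;  // combine first and last name
    return true;
}

// Read every line from the stream and collect the runners that parse.
// std::vector stands in for SortedLinked here.
std::vector<RunnerRecord> readRunners(std::istream& in) {
    std::vector<RunnerRecord> runners;
    std::string line;
    while (std::getline(in, line)) {  // getline discards the '\n'
        RunnerRecord r{};             // a fresh object on every iteration
        if (parseRunner(line, r)) {
            runners.push_back(r);     // the container keeps its own copy
        }
    }
    return runners;
}
```

With the real classes, the push_back line would become something like `Runner M(r.name, r.age, r.min, r.sec); mylist.add(M);`, and readRunners would be called with a std::ifstream opened on the text file. Declaring M inside the loop is not a problem as long as add stores a copy of the Runner, which is an assumption about SortedLinked.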
Please fix the formatting. – muXXmit2X
A similar question was asked earlier today. It might help. http://stackoverflow.com/questions/35786613/populating-a-string-vector-with-tab-delimited-text – user4581301