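I compiled a Linux program on Windows with MinGW, but the output is wrong. The C++ program opens a file correctly on Linux but not on Windows
Error description:
The program's output looks different on Windows than on Linux. This is what it looks like on Windows: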
>tig_2
CAATCTTCAGAGTCCAGAGTGGGAGGCACAGACTACAGAAAATGAGCAGCGGGGCTGGTA
>cluster_1001_conTTGGTGAAGAGAATTTGGACATGGATGAAGGCTTGGGCTTGACCATGCGAAGG
Expected output:
>cluster_1001_contig2
CAATCTTCAGAGTCCAGAGTGGGAGGCACAGACTACAGAAAATGAGCAGCGGGGCTGGTA
>cluster_1001_contig1
TTGGTGAAGAGAATTTGGACATGGATGAAGGCTTGGGCTTGACCATGCGAAGG
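(Note: the real output is too large to paste here, so the example above is pseudo-real data.)
Possible cause:
I observed that if I convert the line endings of the input file from Linux (LF) to Windows (CRLF), it almost works: only the first character in the file (>) is lost. Without any conversion of the input, the same code runs perfectly on Linux. So the problem must be in the function that parses the input, not in the one that writes the output: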
seq_db.Read(db_in.c_str(), options);
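Since the behaviour changes with the line endings of the input, it may help to confirm which endings a given file really contains before blaming the parser. The following is only a minimal sketch: `LineEndingCount` and `CountLineEndings` are hypothetical names that do not exist in the original program. The file is opened in binary mode so the C runtime does no newline translation while counting.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper type (not part of the original program):
// tallies how many lines end in a bare LF and how many in CRLF.
struct LineEndingCount {
    long lf = 0;    // lines terminated by "\n" only
    long crlf = 0;  // lines terminated by "\r\n"
};

// Counts line endings in the file at `path`.
// The file is opened with "rb" so every byte is seen exactly as stored.
// Returns false if the file cannot be opened.
bool CountLineEndings(const std::string &path, LineEndingCount &out)
{
    FILE *fin = std::fopen(path.c_str(), "rb");
    if (fin == NULL) return false;

    int prev = 0;  // previously read byte
    int c;
    while ((c = std::fgetc(fin)) != EOF) {
        if (c == '\n') {
            if (prev == '\r') out.crlf += 1;
            else out.lf += 1;
        }
        prev = c;
    }
    std::fclose(fin);
    return true;
}
```

Calling `CountLineEndings` on the input before and after the LF to CRLF conversion should show whether the conversion did what was intended.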
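Source code:
This is the piece of code that parses the input file. I could be wrong, though, and the bug may be somewhere else. In case it is needed, the full source code is here :)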
void SequenceDB::Read(const char *file, const Options & options)
{
    Sequence one;
    Sequence dummy;
    Sequence des;
    Sequence *last = NULL;
    FILE *swap = NULL;
    FILE *fin = fopen(file, "r");
    char *buffer = NULL;
    char *res = NULL;
    size_t swap_size = 0;
    int option_l = options.min_length;
    if(fin == NULL) bomb_error("Failed to open the database file");
    if(options.store_disk) swap = OpenTempFile(temp_dir);
    Clear();
    dummy.swap = swap;
    buffer = new char[ MAX_LINE_SIZE+1 ];
    while (not feof(fin) || one.size) { /* do not break when the last sequence is not handled */
        buffer[0] = '>';
        if ((res=fgets(buffer, MAX_LINE_SIZE, fin)) == NULL && one.size == 0) break;
        if(buffer[0] == '+'){
            int len = strlen(buffer);
            int len2 = len;
            while(len2 && buffer[len2-1] != '\n'){
                if ((res=fgets(buffer, MAX_LINE_SIZE, fin)) == NULL) break;
                len2 = strlen(buffer);
                len += len2;
            }
            one.des_length2 = len;
            dummy.des_length2 = len;
            fseek(fin, one.size, SEEK_CUR);
        }else if (buffer[0] == '>' || buffer[0] == '@' || (res==NULL && one.size)) {
            if (one.size) { // write previous record
                one.dat_length = dummy.dat_length = one.size;
                if(one.identifier == NULL || one.Format()){
                    printf("Warning: from file \"%s\",\n", file);
                    printf("Discarding invalid sequence or sequence without identifier and description!\n\n");
                    if(one.identifier) printf("%s\n", one.identifier);
                    printf("%s\n", one.data);
                    one.size = 0;
                }
                one.index = dummy.index = sequences.size();
                if(one.size > option_l) {
                    if (swap) {
                        swap_size += one.size;
                        // so that size of file < MAX_BIN_SWAP about 2GB
                        if (swap_size >= MAX_BIN_SWAP) {
                            dummy.swap = swap = OpenTempFile(temp_dir);
                            swap_size = one.size;
                        }
                        dummy.size = one.size;
                        dummy.offset = ftell(swap);
                        dummy.des_length = one.des_length;
                        sequences.Append(new Sequence(dummy));
                        one.ConvertBases();
                        fwrite(one.data, 1, one.size, swap);
                    }else{
                        //printf("==================\n");
                        sequences.Append(new Sequence(one));
                        //printf("------------------\n");
                        //if(sequences.size() > 10) break;
                    }
                    //if(sequences.size() >= 10000) break;
                }
            }
            one.size = 0;
            one.des_length2 = 0;
            int len = strlen(buffer);
            int len2 = len;
            des.size = 0;
            des += buffer;
            while(len2 && buffer[len2-1] != '\n'){
                if ((res=fgets(buffer, MAX_LINE_SIZE, fin)) == NULL) break;
                des += buffer;
                len2 = strlen(buffer);
                len += len2;
            }
            size_t offset = ftell(fin);
            one.des_begin = dummy.des_begin = offset - len;
            one.des_length = dummy.des_length = len;
            int i = 0;
            if(des.data[i] == '>' || des.data[i] == '@' || des.data[i] == '+') i += 1;
            if(des.data[i] == ' ' or des.data[i] == '\t') i += 1;
            if(options.des_len and options.des_len < des.size) des.size = options.des_len;
            while(i < des.size and (des.data[i] != '\n')) i += 1;
            des.data[i] = 0;
            one.identifier = dummy.identifier = des.data;
        } else {
            one += buffer;
        }
    }
#if 0
    int i, n = 0;
    for(i=0; i<sequences.size(); i++) n += sequences[i].bufsize + 4;
    cout<<n<<"\t"<<sequences.capacity() * sizeof(Sequence)<<endl;
    int i;
    scanf("%i", & i);
#endif
    one.identifier = dummy.identifier = NULL;
    delete[] buffer;
    fclose(fin);
}
The input file has this format:
> comment
ACGTACGTACGTACGTACGTACGTACGTACGT
> comment
ACGTACGTACGTACGTACGTACGTACGTACGT
> comment
ACGTACGTACGTACGTACGTACGTACGTACGT
etc
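Try opening it with "rb". – 0x499602D2 2014-09-06 01:05:18
"rb" opens the file in binary mode. If you don't know what that means, that is probably why you are having this problem. When you open a file with "r", you let the runtime perform some funky CR/LF translation on the data you are reading, outside your control. With "rb" there is no translation: every character that makes up the file is read "as is". – PaulMcKenzie 2014-09-06 01:36:02
Oh, and also: opening a file with "r", as your code does, means that Ctrl-Z (ASCII 26) marks the end of the file on Windows. So if the file contains a Ctrl-Z, reading stops there. – PaulMcKenzie 2014-09-06 01:40:13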
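To illustrate what the comments above suggest, here is a minimal sketch of reading a file in binary mode and handling the line endings by hand. It is not the original program's parser: `ReadLinesBinary` is a hypothetical helper introduced only for this example, and it assumes each line fits in memory. Because the file is opened with "rb", a CRLF pair arrives as two bytes and a Ctrl-Z byte arrives as ordinary data, so the code strips the trailing '\r' itself.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper (not part of the original program).
// Reads all lines of `path` in binary mode and returns them without
// their "\n" or "\r\n" terminators. Returns an empty vector if the
// file cannot be opened.
std::vector<std::string> ReadLinesBinary(const std::string &path)
{
    std::vector<std::string> lines;
    FILE *fin = std::fopen(path.c_str(), "rb");  // "rb": no newline translation
    if (fin == NULL) return lines;

    std::string line;
    int c;
    while ((c = std::fgetc(fin)) != EOF) {
        if (c == '\n') {
            // Drop the '\r' of a Windows-style "\r\n" terminator, if present.
            if (!line.empty() && line.back() == '\r') line.pop_back();
            lines.push_back(line);
            line.clear();
        } else {
            line.push_back(static_cast<char>(c));
        }
    }
    // Keep a final line that has no terminating newline.
    if (!line.empty()) lines.push_back(line);

    std::fclose(fin);
    return lines;
}
```

With this approach the same input gives the same lines whether it was saved with LF or CRLF endings. Whether switching the `fopen` call in `SequenceDB::Read` to "rb" is enough on its own would still have to be tested against the real data.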