Split a char array into tokens where the delimiter is the NUL char

I want to split a char array into tokens using the NUL char as the delimiter.

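I have a char array that I received over the network from a recv call, so I know the length of the char array. Inside that char array there are a bunch of strings separated by the NUL char (\0).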

Since the delimiter is the NUL char, I can't use strtok, because it uses NUL for its own purposes (to mark the end of the string it scans).

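So I want to iterate over all the strings starting from byte 8 (the strings are preceded by two 32-bit integers).

I suppose I could walk through all the characters looking for the \0 character and then memcpy the length found so far, but I figure there must be a better way than this.

What other approaches can I take?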

Source

2016-05-15 Joel Pearson

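If you don't need to reuse the buffer for something else, you can leave the strings where they are and simply use pointers to where each one starts. They are already NUL-terminated. –

@ThomasPadron-McCarthy I haven't done much C in a while; how do I move the pointer's position along again? –

Or, if you do need to reuse the buffer, you can find the first character of each string and strcpy() or strdup() it. Either way, make sure the last string is null-terminated too, or handle it as a special case. –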

Here is some simple code showing how to get at the strings contained in the buffer:

#include <stdio.h>
#include <string.h>

int main(void) {
    char recbuf[7] = {'a', 'b', 'c', '\0', 'd', 'e', '\0'};
    int recbuf_size = 7;
    int j = 0;
    char* p = recbuf;
    while (j < recbuf_size)
    {
        printf("%s\n", p); // print the string found
                           // Here you could copy the string if needed, e.g.
                           // strcpy(mySavedStrings[stringCount++], p);

        int t = (int)strlen(p); // get the length of the string just printed
        p += t + 1;             // move to next string - add 1 to include string termination
        j += t + 1;             // remember how far we are
    }
    return 0;
}

Output:

abc 
de

If you need to skip some bytes at the start of the buffer, then just do:

int number_of_bytes_to_skip = 4; 
int j = number_of_bytes_to_skip; 
char* p = recbuf + number_of_bytes_to_skip;
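For the layout described in the question (two 32-bit integers, then the strings from byte 8 on), the skip can be combined with decoding the header. This is only a rough sketch: the helper names `read_be32` and `skip_header` are made up for illustration, and it assumes the integers arrive in network byte order (big-endian), which the question does not actually state.

```c
#include <stddef.h>
#include <stdint.h>

#define HEADER_SIZE 8 /* two 32-bit integers, as described in the question */

/* Assumption: the integers arrive in network byte order (big-endian). */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical helper: decodes the two header fields and returns a pointer
   to the first string, or NULL if the buffer is too short for the header. */
static const char *skip_header(const char *buf, size_t len,
                               uint32_t *first, uint32_t *second)
{
    if (len < HEADER_SIZE)
        return NULL;
    *first  = read_be32((const unsigned char *)buf);
    *second = read_be32((const unsigned char *)buf + 4);
    return buf + HEADER_SIZE;
}
```

The returned pointer plays the role of `p`, and `HEADER_SIZE` is the starting value for `j`. If the sender uses a different byte order, only `read_be32` would need to change.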

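Note:

The parsing loop above assumes that the receive buffer is always correctly terminated with '\0'. In real-world code you should check that before running the loop and add error handling, e.g.: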

if (recbuf[recbuf_size-1] != '\0') 
{ 
    // Some error handling... 
}
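An alternative to checking only the last byte is to never read past the known length at all. The sketch below uses memchr(), which takes an explicit byte count, instead of strlen(). The helper name `next_string` and its return convention are my own, not part of the answer above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: finds the next NUL-terminated string in buf[pos..len).
   On success stores the string start in *str, advances *pos past the
   terminator and returns 1. Returns 0 at the end of the buffer, or when the
   remaining bytes contain no terminator (a truncated final string). */
static int next_string(const char *buf, size_t len, size_t *pos,
                       const char **str)
{
    if (*pos >= len)
        return 0;
    const char *start = buf + *pos;
    const char *nul = memchr(start, '\0', len - *pos);
    if (nul == NULL)
        return 0; /* unterminated tail: caller decides how to handle it */
    *str = start;
    *pos = (size_t)(nul - buf) + 1;
    return 1;
}
```

A caller would loop with `while (next_string(recbuf, recbuf_size, &pos, &s))` and print or copy `s` each time. If `pos` is still smaller than the buffer length once the loop ends, the tail was not terminated and can be treated as an error.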

Source

2016-05-15 12:36:47 4386427

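I reuse recbuf for multiple calls to recv. Does that mean I need to reset the pointer back to the start with this approach? If I simply copy recbuf before walking through it, the original pointer should be fine? –

@JoelPearson - When you receive a new buffer, you need to set the pointer 'p' to the address of the new buffer (plus the number of bytes to skip). If you reuse the same receive buffer, you need to reset the pointer 'p' each time before parsing the buffer. – 4386427

Ah I see, char *p solved my problem, thanks! –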

The NUL separation actually makes your job quite easy.

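char *DestStrings[MAX_STRINGS];
int j = 0;          /* number of NUL terminators skipped so far */
int length = 0;     /* total characters of all strings seen so far */
int prevLength = 0; /* value of length before the current string */
int offset = 8;     /* the strings start after two 32-bit integers */
for (int i = 0; i < MAX_STRINGS; i++)
{
    /* the current string starts at j + offset + prevLength */
    length += strlen(&srcbuffer[j + offset + length]);
    if (length == prevLength)
    {
        break; /* empty string: treat it as the end of the data */
    }
    else
    {
        DestStrings[i] = malloc(length - prevLength + 1);
        /* copy from the start of the string, not from its terminator */
        strcpy(DestStrings[i], &srcbuffer[j + offset + prevLength]);
        prevLength = length;
        j++;
    }
}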

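You need to add some extra checks to avoid potential buffer overflow errors. Hopefully this code gives you some idea of how to proceed.

Edit 1: This code is only meant to show how to advance the index; it is not a complete solution.

Edit 2: Since the length of the received data buffer is known, append a NUL to the received data to make this code work. Alternatively, the length of the received data itself can be compared against the length copied so far.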
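Appending the NUL mentioned in Edit 2 only works if the buffer has one spare byte. A minimal sketch of that step follows; the helper name `terminate_received` is hypothetical, and `received` stands for whatever byte count recv() reported.

```c
#include <stddef.h>

/* Hypothetical helper: `received` is the byte count returned by recv(),
   `capacity` is the full size of buf. Returns 1 and writes the extra
   terminator if there is room, otherwise returns 0 and leaves buf alone. */
static int terminate_received(char *buf, size_t capacity, size_t received)
{
    if (received >= capacity)
        return 0; /* no spare byte: receive at most capacity - 1 bytes */
    buf[received] = '\0';
    return 1;
}
```

One way to guarantee the spare byte is to pass `sizeof buf - 1` as the length argument of recv(), so the last element of the buffer is always free for the terminator.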

Source

2016-05-15 12:13:14 Vagish

Assume this input data:

char input[] = { 
    0x01, 0x02, 0x0a, 0x0b, /* A 32bit integer */ 
    'h', 'e', 'l', 'l', 'o', 0x00, 
    'w', 'o', 'r', 'l', 'd', 0x00, 
    0x00 /* Necessary to mark the end of the payload. */
};

with a 32-bit integer at the start, which gives:

const size_t header_size = sizeof (uint32_t);

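Parsing the input can be done by identifying the first character of a "string" and storing a pointer to it, then moving on by exactly the length of the string found (+1), and starting over until the end of the input is reached.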

size_t strings_elements = 1; /* Set this to whichever start size you like. */
size_t delta = 1; /* 1 is conservative and slow for larger input,
                     increase as needed. */

/* Result as array of pointers to "string"; one extra element is
   allocated for the NULL stopper: */
char ** strings = malloc((strings_elements + 1) * sizeof *strings);
if (NULL == strings)
{
    perror("malloc() failed");
    exit(EXIT_FAILURE);
}

{
    char * pc = input + header_size;
    size_t strings_found = 0;
    /* Parse input, if necessary increase result array, and populate its elements: */
    while ('\0' != *pc)
    {
        if (strings_found >= strings_elements)
        {
            strings_elements += delta;
            void * pvtmp = realloc(
                strings,
                (strings_elements + 1) * sizeof *strings /* Allocate one more to have a
                                                            stopper, being set to NULL as a sentinel. */
            );

            if (NULL == pvtmp)
            {
                perror("realloc() failed");
                exit(EXIT_FAILURE);
            }

            strings = pvtmp;
        }

        strings[strings_found] = pc;
        ++strings_found;

        pc += strlen(pc) + 1;
    }

    strings[strings_found] = NULL; /* Set a stopper element.
                                      NULL terminate the pointer array. */
}

/* Print result: */
{
    char ** ppc = strings;
    for(; NULL != *ppc; ++ppc)
    {
        printf("%zu: '%s'\n", (size_t)(ppc - strings + 1), *ppc);
    }
}

/* Clean up: */
free(strings);

If you need to copy the strings while splitting, use

strings[strings_found] = strdup(pc);

in place of the line

strings[strings_found] = pc;

and add clean-up code after use and before free()ing strings:

{
    char ** ppc = strings;
    for(; NULL != *ppc; ++ppc)
    {
        free(*ppc);
    }
}

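The code above assumes that at least one '\0' (NUL, a.k.a. the null character) follows the payload.

If that condition is not met, you need to define some other terminating sequence, or you need to know the size of the input from some other source. If you have neither, your problem is not solvable.

The parsing code above needs the following headers: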

#include <inttypes.h> /* for uint32_t */
#include <stdio.h> /* for printf(), perror() */
#include <string.h> /* for strlen(), strdup() */
#include <stdlib.h> /* for malloc(), realloc(), free(), exit() */

and it might also need one of the following defines:

#define _POSIX_C_SOURCE 200809L 

#define _GNU_SOURCE

or whatever else your C compiler requires to make strdup() available.
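For the case where the input size is known from another source (as with recv() in the question), here is a rough sketch that does not depend on an extra '\0' after the payload. It makes two passes, one to count the terminators and one to record the string starts, so the pointer array is allocated once. The function name `split_known_size` is hypothetical, and unlike the loop above it keeps empty strings instead of stopping at the first one.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: splits buf[0..len) into its NUL-terminated strings
   without relying on an extra terminator after the payload. Returns a
   malloc()ed, NULL-terminated array of pointers into buf (free() it when
   done), or NULL on allocation failure. Bytes after the last '\0' (an
   unterminated tail) are ignored. */
static char **split_known_size(char *buf, size_t len, size_t *count)
{
    size_t n = 0;
    for (size_t i = 0; i < len; ++i) /* pass 1: count the terminators */
        if (buf[i] == '\0')
            ++n;

    char **strings = malloc((n + 1) * sizeof *strings);
    if (strings == NULL)
        return NULL;

    size_t found = 0;
    char *pc = buf;
    while (found < n) /* pass 2: record where each string starts */
    {
        strings[found++] = pc;
        pc += strlen(pc) + 1; /* safe: a terminator is known to follow */
    }
    strings[found] = NULL; /* stopper element */
    *count = found;
    return strings;
}
```

The caller would pass the address just past the header and the received length minus the header size. The returned array points into the original buffer, so it is only valid until that buffer is reused.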

Source

2016-05-15 12:29:54 alk

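I suggest implementing a tokenizer structure to do this kind of work. It will be easier to read and maintain because it looks similar to object-oriented code. It also isolates the memcpy, so I consider it "better".

First, the headers I will use: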

#include <stdio.h> 
#include <stdlib.h> 
#include <string.h>

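The tokenizer structure has to remember the beginning of the string (so that we can free the memory once it is no longer needed), the current index, and the end index used to check whether we have parsed the whole string: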

struct Tokenizer { 
    char *string; 
    char *actual_index; 
    char *end_index; 
};

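I suggest using a factory-like function to create the tokenizer. It is constructed here by copying the input string with memcpy, because the string.h string functions stop at the first '\0' character.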

struct Tokenizer getTokenizer(char string[], unsigned length) { 
    struct Tokenizer tokenizer; 
    tokenizer.string = (char *)malloc(length); 
    tokenizer.actual_index = tokenizer.string; 
    tokenizer.end_index = tokenizer.string + length; 
    memcpy(tokenizer.string, string, length); 
    return tokenizer; 
}

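Now the function responsible for getting a token. It returns newly allocated strings that have a '\0' character at the end. It also changes the address that actual_index points to. It takes the address of the tokenizer as its argument so that it can change the tokenizer's values: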

char * getNextToken(struct Tokenizer *tokenizer) {
    char * token;
    size_t length;
    if(tokenizer->actual_index >= tokenizer->end_index)
        return NULL; // the whole input has been consumed
    length = strlen(tokenizer->actual_index);
    token = (char *)malloc(length + 1);
    // + 1 because the '\0' character has to fit in
    if(token == NULL)
        return NULL;
    memcpy(token, tokenizer->actual_index, length + 1);
    // step over the token and its terminating '\0'
    tokenizer->actual_index += length + 1;
    return token;
}

Sample use of the tokenizer, showing how to handle the memory allocation and how to use it:

int main() { 
    char c[] = "Lorem\0ipsum dolor sit amet,\0consectetur" 
     " adipiscing elit. Ut\0rhoncus volutpat viverra."; 
    char *temp; 
    struct Tokenizer tokenizer = getTokenizer(c, sizeof(c)); 
    while((temp = getNextToken(&tokenizer))) { 
     puts(temp); 
     free(temp); 
    } 
    free(tokenizer.string); 
    return 0; 
}
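If allocating a copy per token is more than you need, a variant can hand out pointers into the tokenizer's own buffer. This is only a sketch of that idea: the function name `nextTokenView` is hypothetical, and the structure is repeated so the snippet compiles on its own. It uses memchr() bounded by end_index, so it never reads past the copied input.

```c
#include <stddef.h>
#include <string.h>

/* Same layout as the Tokenizer structure above, repeated here so the
   sketch compiles on its own. */
struct Tokenizer {
    char *string;
    char *actual_index;
    char *end_index;
};

/* Hypothetical variant of getNextToken(): returns a pointer into the
   tokenizer's own buffer instead of a fresh copy, so there is nothing to
   free per token. Returns NULL when the input is exhausted or the
   remaining bytes have no terminating '\0'. */
static const char *nextTokenView(struct Tokenizer *tokenizer)
{
    size_t remaining = (size_t)(tokenizer->end_index - tokenizer->actual_index);
    if (remaining == 0)
        return NULL;
    char *nul = memchr(tokenizer->actual_index, '\0', remaining);
    if (nul == NULL)
        return NULL;
    const char *token = tokenizer->actual_index;
    tokenizer->actual_index = nul + 1; /* step over the terminator */
    return token;
}
```

The returned pointers stay valid only until tokenizer.string is freed, so this trades the per-token free() for a lifetime rule the caller has to respect.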

Source

2016-05-15 13:51:38

I come from a Java background, so this approach reads much better to me, thanks. –
