2010-03-28 133 views
0

How can I split a string using a delimiter longer than a single character?

Suppose I have this:

"foo bar 1 and foo bar 2" 

How can I split it into:

foo bar 1 
foo bar 2 

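I tried strtok() and strsep(), but neither worked. They do not treat "and" as a delimiter; they treat 'a', 'n' and 'd' as separate delimiters.

Is there a function that can help me with this, or will I have to split on whitespace and do some string manipulation?

Answers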

5

You can use strstr() to find the first "and", then tokenize the string yourself by skipping forward that many characters and repeating.
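To make the idea concrete, here is a minimal sketch of such a strstr() loop. The names next_piece and print_pieces are made up for illustration and are not from any library; an empty separator is simply treated as "no separator". It does not copy anything, it only reports where each piece starts and how long it is.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: return the start of the next piece of *cursor and
   store its length in *len. *cursor is advanced past the separator, or
   set to NULL once the last piece has been handed out. Returns NULL when
   there is nothing left. */
const char *next_piece(const char **cursor, const char *sep, size_t *len)
{
    const char *start;
    const char *hit;

    assert(cursor != NULL && sep != NULL && len != NULL);
    start = *cursor;
    if (start == NULL)
        return NULL;                    /* the tail was already returned */

    hit = strstr(start, sep);
    if (hit == NULL || *sep == '\0') {  /* no separator left: rest is one piece */
        *len = strlen(start);
        *cursor = NULL;
    } else {
        *len = (size_t)(hit - start);
        *cursor = hit + strlen(sep);    /* skip the whole separator */
    }
    return start;
}

/* Print every piece of input on its own line. */
void print_pieces(const char *input, const char *sep)
{
    const char *cursor = input;
    const char *piece;
    size_t len;

    while ((piece = next_piece(&cursor, sep, &len)) != NULL)
        printf("%.*s\n", (int)len, piece);
}
```

Calling print_pieces("foo bar 1 and foo bar 2", " and ") prints the two pieces from the question; using " and " with the surrounding spaces as the separator avoids leftover blanks at the edges of each piece.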

2

Here is a nice short example I just wrote that shows how to use strstr to split a string at a given substring:

#include <string.h> 
#include <stdio.h> 

void split(char *phrase, char *delimiter) 
{ 
    char *loc = strstr(phrase, delimiter); 
    if (loc == NULL) 
    { 
     printf("Could not find delimiter\n"); 
    } 
    else 
    { 
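     char buf[256]; /* malloc would be much more robust here */ 
     size_t length = strlen(delimiter); 
     size_t before = (size_t)(loc - phrase); 
     if (before >= sizeof buf) 
      before = sizeof buf - 1; /* truncate instead of overflowing buf */ 
     strncpy(buf, phrase, before); 
     buf[before] = '\0'; /* strncpy does not add the terminator here */ 
     printf("Before delimiter: '%s'\n", buf); 
     printf("After delimiter: '%s'\n", loc+length); 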
    } 
} 

int main() 
{ 
    split("foo bar 1 and foo bar 2", "and"); 
    printf("-----\n"); 
    split("foo bar 1 and foo bar 2", "quux"); 
    return 0; 
} 

Output:

 
Before delimiter: 'foo bar 1 ' 
After delimiter: ' foo bar 2' 
----- 
Could not find delimiter 

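Of course, I have not tested it thoroughly, and a fixed-size buffer like this is a classic source of buffer overflow problems related to string length; but it is at least a demonstrable example.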
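As the comment in the code hints, malloc removes the dependence on a fixed buffer size. A rough sketch of that variant follows; copy_before is a hypothetical name, and the caller is assumed to free the result.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc'd copy of the part of phrase that
   comes before the first occurrence of delimiter. Returns NULL if the
   delimiter is not found or if allocation fails. The caller frees it. */
char *copy_before(const char *phrase, const char *delimiter)
{
    const char *loc;
    size_t n;
    char *buf;

    assert(phrase != NULL && delimiter != NULL);
    loc = strstr(phrase, delimiter);
    if (loc == NULL)
        return NULL;

    n = (size_t)(loc - phrase);
    buf = malloc(n + 1);        /* exactly the needed size, plus '\0' */
    if (buf == NULL)
        return NULL;
    memcpy(buf, phrase, n);
    buf[n] = '\0';
    return buf;
}
```

The text after the delimiter needs no copy at all: it is already a '\0'-terminated string starting at loc + strlen(delimiter).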

5

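The main problem with splitting strings in C is that it inevitably involves some dynamic memory management, and the standard library tends to avoid that wherever possible. That is why none of the standard C functions deal with dynamically allocated memory; only malloc/calloc/realloc do.

But it is not hard to do it yourself. Let me walk you through it.

We need to return a number of strings, and the easiest way to do that is to return an array of pointers to strings, terminated by a NULL item. Except for the final NULL, each element of the array points to a dynamically allocated string.

First we need a couple of helper functions to deal with such arrays. The simplest one computes the number of strings (the elements before the final NULL):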

/* Return length of a NULL-delimited array of strings. */ 
size_t str_array_len(char **array) 
{ 
    size_t len; 

    for (len = 0; array[len] != NULL; ++len) 
     continue; 
    return len; 
} 

Another easy one is the function that frees the array:

/* Free a dynamic array of dynamic strings. */ 
void str_array_free(char **array) 
{ 
    if (array == NULL) 
     return; 
    for (size_t i = 0; array[i] != NULL; ++i) 
     free(array[i]); 
    free(array); 
} 

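Somewhat more complicated is the function that adds a copy of a string to the array. It needs to handle a few special cases, such as the array not existing yet (the whole array is NULL). It also needs to handle strings that are not terminated with '\0', so that our actual splitting function can easily append just a part of the input string.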

/* Append an item to a dynamically allocated array of strings. On failure, 
    return NULL, in which case the original array is intact. The item 
    string is dynamically copied. If the array is NULL, allocate a new 
    array. Otherwise, extend the array. Make sure the array is always 
    NULL-terminated. Input string might not be '\0'-terminated. */ 
char **str_array_append(char **array, size_t nitems, const char *item, 
         size_t itemlen) 
{ 
    /* Make a dynamic copy of the item. */ 
    char *copy; 
    if (item == NULL) 
     copy = NULL; 
    else { 
     copy = malloc(itemlen + 1); 
     if (copy == NULL) 
      return NULL; 
     memcpy(copy, item, itemlen); 
     copy[itemlen] = '\0'; 
    } 

    /* Extend array with one element. Except extend it by two elements, 
     in case it did not yet exist. This might mean it is a teeny bit 
     too big, but we don't care. */ 
    array = realloc(array, (nitems + 2) * sizeof(array[0])); 
    if (array == NULL) { 
     free(copy); 
     return NULL; 
    } 

    /* Add copy of item to array, and return it. */ 
    array[nitems] = copy; 
    array[nitems+1] = NULL; 
    return array; 
} 

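This is an interesting one. For really good style, it would have been better to split the making of a dynamic copy of the input item into its own function, but I will leave that as an exercise for the reader.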
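For what it is worth, one possible take on that exercise might look like the sketch below. The name str_copy_n is my own invention, not part of the answer's code or of the standard library.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: make a '\0'-terminated dynamic copy of the first
   len bytes of s. The source does not have to be '\0'-terminated.
   Returns NULL if allocation fails. The caller frees the copy. */
char *str_copy_n(const char *s, size_t len)
{
    char *copy;

    assert(s != NULL);
    copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, s, len);
    copy[len] = '\0';
    return copy;
}
```

With this in place, str_array_append would keep its own check for item == NULL and call str_copy_n(item, itemlen) instead of doing the malloc and memcpy inline.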

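Finally, we have the actual splitting function. It, too, needs to handle some special cases:

  • The input string may start or end with the separator.
  • There may be separators right next to each other.
  • The input string may not contain the separator at all.

I have chosen to add an empty string to the result if a separator is right next to the start or end of the input string, or right next to another separator. If you need something else, you will need to tweak the code.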
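If empty items are not wanted, one alternative to changing the splitter itself is to filter them out of the result afterwards. This is only a sketch under the array convention described above (NULL-terminated, every string individually allocated); str_array_drop_empty is a hypothetical name.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: remove empty strings, in place, from a
   NULL-terminated array of dynamically allocated strings. Returns the
   number of strings that remain. */
size_t str_array_drop_empty(char **array)
{
    size_t kept = 0;

    assert(array != NULL);
    for (size_t i = 0; array[i] != NULL; ++i) {
        if (array[i][0] == '\0')
            free(array[i]);             /* discard the empty item */
        else
            array[kept++] = array[i];   /* compact toward the front */
    }
    array[kept] = NULL;                 /* terminate the shorter array */
    return kept;
}
```

The array is not shrunk with realloc, so it may end up slightly larger than needed, which is the same trade-off the append function already accepts.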

Apart from the special cases and some error handling, the splitting is now fairly straightforward.

/* Split a string into substrings. Return dynamic array of dynamically 
    allocated substrings, or NULL if there was an error. Caller is 
    expected to free the memory, for example with str_array_free. */ 
char **str_split(const char *input, const char *sep) 
{ 
    size_t nitems = 0; 
    char **array = NULL; 
    const char *start = input; 
    char *next = strstr(start, sep); 
    size_t seplen = strlen(sep); 
    const char *item; 
    size_t itemlen; 

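    /* An empty separator would match at every position without ever 
     advancing, so treat it as an error instead of looping forever. */ 
    if (seplen == 0) 
     return NULL; 

    for (;;) { 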
     next = strstr(start, sep); 
     if (next == NULL) { 
      /* Add the remaining string (or empty string, if input ends with 
       separator). */ 
      char **new = str_array_append(array, nitems, start, strlen(start)); 
      if (new == NULL) { 
       str_array_free(array); 
       return NULL; 
      } 
      array = new; 
      ++nitems; 
      break; 
     } else if (next == input) { 
      /* Input starts with separator. */ 
      item = ""; 
      itemlen = 0; 
     } else { 
      item = start; 
      itemlen = next - item; 
     } 
     char **new = str_array_append(array, nitems, item, itemlen); 
     if (new == NULL) { 
      str_array_free(array); 
      return NULL; 
     } 
     array = new; 
     ++nitems; 
     start = next + seplen; 
    } 

    if (nitems == 0) { 
     /* Input does not contain separator at all. */ 
     assert(array == NULL); 
     array = str_array_append(array, nitems, input, strlen(input)); 
    } 

    return array; 
} 

Here is the entire program in one piece. It also contains a main program to run some test cases.

#include <assert.h> 
#include <stdbool.h> 
#include <stdio.h> 
#include <stdlib.h> 
#include <string.h> 


/* Append an item to a dynamically allocated array of strings. On failure, 
    return NULL, in which case the original array is intact. The item 
    string is dynamically copied. If the array is NULL, allocate a new 
    array. Otherwise, extend the array. Make sure the array is always 
    NULL-terminated. Input string might not be '\0'-terminated. */ 
char **str_array_append(char **array, size_t nitems, const char *item, 
         size_t itemlen) 
{ 
    /* Make a dynamic copy of the item. */ 
    char *copy; 
    if (item == NULL) 
     copy = NULL; 
    else { 
     copy = malloc(itemlen + 1); 
     if (copy == NULL) 
      return NULL; 
     memcpy(copy, item, itemlen); 
     copy[itemlen] = '\0'; 
    } 

    /* Extend array with one element. Except extend it by two elements, 
     in case it did not yet exist. This might mean it is a teeny bit 
     too big, but we don't care. */ 
    array = realloc(array, (nitems + 2) * sizeof(array[0])); 
    if (array == NULL) { 
     free(copy); 
     return NULL; 
    } 

    /* Add copy of item to array, and return it. */ 
    array[nitems] = copy; 
    array[nitems+1] = NULL; 
    return array; 
} 


/* Free a dynamic array of dynamic strings. */ 
void str_array_free(char **array) 
{ 
    if (array == NULL) 
     return; 
    for (size_t i = 0; array[i] != NULL; ++i) 
     free(array[i]); 
    free(array); 
} 


/* Split a string into substrings. Return dynamic array of dynamically 
    allocated substrings, or NULL if there was an error. Caller is 
    expected to free the memory, for example with str_array_free. */ 
char **str_split(const char *input, const char *sep) 
{ 
    size_t nitems = 0; 
    char **array = NULL; 
    const char *start = input; 
    char *next = strstr(start, sep); 
    size_t seplen = strlen(sep); 
    const char *item; 
    size_t itemlen; 

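    /* An empty separator would match at every position without ever 
     advancing, so treat it as an error instead of looping forever. */ 
    if (seplen == 0) 
     return NULL; 

    for (;;) { 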
     next = strstr(start, sep); 
     if (next == NULL) { 
      /* Add the remaining string (or empty string, if input ends with 
       separator). */ 
      char **new = str_array_append(array, nitems, start, strlen(start)); 
      if (new == NULL) { 
       str_array_free(array); 
       return NULL; 
      } 
      array = new; 
      ++nitems; 
      break; 
     } else if (next == input) { 
      /* Input starts with separator. */ 
      item = ""; 
      itemlen = 0; 
     } else { 
      item = start; 
      itemlen = next - item; 
     } 
     char **new = str_array_append(array, nitems, item, itemlen); 
     if (new == NULL) { 
      str_array_free(array); 
      return NULL; 
     } 
     array = new; 
     ++nitems; 
     start = next + seplen; 
    } 

    if (nitems == 0) { 
     /* Input does not contain separator at all. */ 
     assert(array == NULL); 
     array = str_array_append(array, nitems, input, strlen(input)); 
    } 

    return array; 
} 


/* Return length of a NULL-delimited array of strings. */ 
size_t str_array_len(char **array) 
{ 
    size_t len; 

    for (len = 0; array[len] != NULL; ++len) 
     continue; 
    return len; 
} 


#define MAX_OUTPUT 20 


int main(void) 
{ 
    struct { 
     const char *input; 
     const char *sep; 
     char *output[MAX_OUTPUT]; 
    } tab[] = { 
     /* Input is empty string. Output should be a list with an empty 
      string. */ 
     { 
      "", 
      "and", 
      { 
       "", 
       NULL, 
      }, 
     }, 
     /* Input is exactly the separator. Output should be two empty 
      strings. */ 
     { 
      "and", 
      "and", 
      { 
       "", 
       "", 
       NULL, 
      }, 
     }, 
     /* Input is non-empty, but does not have separator. Output should 
      be the same string. */ 
     { 
      "foo", 
      "and", 
      { 
       "foo", 
       NULL, 
      }, 
     }, 
     /* Input is non-empty, and does have separator. */ 
     { 
      "foo bar 1 and foo bar 2", 
      " and ", 
      { 
       "foo bar 1", 
       "foo bar 2", 
       NULL, 
      }, 
     }, 
    }; 
    const int tab_len = sizeof(tab)/sizeof(tab[0]); 
    bool errors; 

    errors = false; 

    for (int i = 0; i < tab_len; ++i) { 
     printf("test %d\n", i); 

     char **output = str_split(tab[i].input, tab[i].sep); 
     if (output == NULL) { 
      fprintf(stderr, "output is NULL\n"); 
      errors = true; 
      break; 
     } 
     size_t num_output = str_array_len(output); 
     printf("num_output %lu\n", (unsigned long) num_output); 

     size_t num_correct = str_array_len(tab[i].output); 
     if (num_output != num_correct) { 
      fprintf(stderr, "wrong number of outputs (%lu, not %lu)\n", 
        (unsigned long) num_output, (unsigned long) num_correct); 
      errors = true; 
     } else { 
      for (size_t j = 0; j < num_output; ++j) { 
       if (strcmp(tab[i].output[j], output[j]) != 0) { 
        fprintf(stderr, "output[%lu] is '%s' not '%s'\n", 
          (unsigned long) j, output[j], tab[i].output[j]); 
        errors = true; 
        break; 
       } 
      } 
     } 

     str_array_free(output); 
     printf("\n"); 
    } 

    if (errors) 
     return EXIT_FAILURE; 
    return 0; 
} 
+0

Thank you very much for this amazingly in-depth, working piece of code. – 2015-05-01 03:58:28

0

If you know the type of delimiter, for example a comma or a semicolon, you can try this:

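#include <stdio.h> 

int main(void) 
{ 
    int i = 0, temp = 0, temp1 = 0, temp2 = 0; 
    char buff[12] = "123;456;789"; 

    /* First field: digits up to the first ';'. */ 
    for (i = 0; buff[i] != ';'; i++) 
    { 
        temp = temp * 10 + (buff[i] - '0'); 
    } 
    /* Second field: step over the ';' and read up to the next one. */ 
    for (i++; buff[i] != ';'; i++) 
    { 
        temp1 = temp1 * 10 + (buff[i] - '0'); 
    } 
    /* Third field: step over the ';' and read to the end of the string. */ 
    for (i++; buff[i] != '\0'; i++) 
    { 
        temp2 = temp2 * 10 + (buff[i] - '0'); 
    } 
    printf("temp=%d temp1=%d temp2=%d\n", temp, temp1, temp2); 
    return 0; 
} 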

Output:

temp=123 temp1=456 temp2=789
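When the delimiter really is a single known character, strtok (which the question found unsuitable for multi-character delimiters) does the job with less bookkeeping. A small sketch, assuming the fields are decimal integers; parse_ints is a hypothetical name. Note that strtok modifies the buffer it scans and silently skips empty fields.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse up to max integers from text, where fields
   are separated by any character in delims. text is modified by strtok.
   Returns the number of values stored in out. */
size_t parse_ints(char *text, const char *delims, long *out, size_t max)
{
    size_t n = 0;
    char *tok;

    assert(text != NULL && delims != NULL && out != NULL);
    for (tok = strtok(text, delims); tok != NULL && n < max;
         tok = strtok(NULL, delims))
        out[n++] = strtol(tok, NULL, 10);   /* base-10 conversion */
    return n;
}
```

For the buffer "123;456;789" and the delimiter string ";", this stores 123, 456 and 789. The input must be a writable array, not a string literal.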