2015-12-18 40 views
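Simple data scraping with a PHP loop / foreach

I have some code that extracts the string between two other strings (a "sandwich"). It works, but I need to loop over several different "sandwich" strings.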

//needle in haystack 
$result = 'sandwich: Today is a nice day. 
    sandwich: Today is a cloudy day. 
    sandwich: Today is a rainy day. 
    sandwich type 2: Yesterday I had an awesome time. 
    sandwich type 2: Yesterday I had an great time.'; 

$beginString = 'Today is a'; // strpos() is case-sensitive, so the case must match the text 
$endString = 'day'; 

function extract_unit($haystack, $keyword1, $keyword2) { 
    $return = array(); 
    $a = 0; // search offset; must be initialised before the first strpos() call 

    while(($a = strpos($haystack, $keyword1, $a)) !== false) { // loop until strpos() returns FALSE 
     $a+=strlen($keyword1);     // move the offset to just after $keyword1 

     if($b = strpos($haystack, $keyword2, $a)) { // if $keyword2 is found after that offset 
      $return[] = trim(substr($haystack, $a, $b-$a)); // add the text in between to $return 
     } 
    } 
    return $return; 
} 

$text = $result; 
$unit = extract_unit($text, $beginString, $endString); 
print_r($unit); 

// $unit contains: nice, cloudy, rainy 

I need to loop over the different sentence types (sandwiches) and capture all of the adjectives (nice, cloudy, rainy, awesome, great):

//needle in haystack 
$result = 'sandwich: Today is a nice day. 
    sandwich: Today is a cloudy day. 
    sandwich: Today is a rainy day. 
    sandwich type 2: Yesterday I had an awesome time. 
    sandwich type 2: Yesterday I had an great time.'; 

$beginString1 = 'Today is a'; 
$endString1 = 'day'; 
$beginString2 = 'Yesterday I had an'; 
$endString2 = 'time'; 

[scraping code with loop...] 
print_r($unit); 

The goal is to end up with this array:

Array ([0] => nice [1] => cloudy [2] => rainy [3] => awesome [4] => great) 

Any ideas? Much appreciated.

Answers

You can use regular expressions to extract the strings. If you don't mind using arrays instead of separate strings, this sample code would do it:

$starts = array('Today is a', 'Yesterday I had an'); 
$ends = array('day', 'time'); 

$haystack = array(
    'Today is a nice day.', 
    'Today is a cloudy day.', 
    'Today is a rainy day.', 
    'Yesterday I had an awesome time.', 
    'Yesterday I had an great time.' 
); 

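function extract_unit($haystack, $starts, $ends){ 

    $return = array(); // initialise so an empty result is still an array 

    // one alternation for all start markers, one for all end markers 
    $reg = '/.*?(?:' . implode('|', $starts) . ')(.*?)(?:' . implode('|', $ends) . ').*/'; 

    foreach($haystack as $str){ 

     // the capture keeps the spaces around the word, so trim() it 
     if(preg_match($reg, $str, $m)) $return[] = trim($m[1]); 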

    } 

    return $return; 

} 

print_r (extract_unit($haystack, $starts, $ends)); 
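
The same idea can be applied to the single-string haystack from the question. The following is only a sketch under a few assumptions: `extract_all` is a hypothetical name, PHP 7 or later is assumed for the type declarations, and the markers are passed through `preg_quote()` so that characters such as `.` or `?` in a marker are matched literally. Note that start and end markers are not paired here: any start may be closed by any end.

```php
<?php
// Hypothetical helper (not part of the original answer): builds one pattern
// from all start/end markers and collects every match from a single string.
function extract_all(string $haystack, array $starts, array $ends): array
{
    // Escape regex metacharacters in each marker; '/' is the pattern delimiter.
    $quote = function (string $marker): string {
        return preg_quote($marker, '/');
    };

    $reg = '/(?:' . implode('|', array_map($quote, $starts)) . ')'
         . '\s*(.*?)\s*'
         . '(?:' . implode('|', array_map($quote, $ends)) . ')/';

    // $matches[1] holds the text captured between the markers, in order of appearance.
    preg_match_all($reg, $haystack, $matches);

    return $matches[1];
}

$result = 'sandwich: Today is a nice day.
    sandwich: Today is a cloudy day.
    sandwich: Today is a rainy day.
    sandwich type 2: Yesterday I had an awesome time.
    sandwich type 2: Yesterday I had an great time.';

print_r(extract_all($result, ['Today is a', 'Yesterday I had an'], ['day', 'time']));
```

For the sample text this yields nice, cloudy, rainy, awesome, great, in the order the sentences appear.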

Edit

Below I changed the code based on @ven's comments; it is now more precise:

//---Array with all sandwiches 
$between = array(
    array('hay1=', 'hay=Gold'), 
    array('hay2=', 'hay=Silver') 
); 

$haystack = 'Data set 1: hay2= this is a bunch of hay hay1= Gold_Needle hay=Gold 
      Data Set 2: hay2=Silver_Needle hay=Silver'; 

function extract_unit($haystack, $between){ 

    $return = array(); 

    foreach($between as $item){ 

     $reg = '/.*?' . $item[0] . '\s*(.*?)\s*' . $item[1] . '.*?/'; 

     preg_match_all($reg, $haystack, $finded); 

     $return = array_merge($return, $finded[1]); 

    } 

    return $return; 

} 

print_r (extract_unit($haystack, $between)); 

The result will be:

Array 
(
    [0] => Gold_Needle 
    [1] => Silver_Needle 
) 
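
For comparison, here is a regex-free sketch that keeps the `strpos`-style loop from the question and ties each start marker to its own end marker. The names `extract_between` and `$pairs` are made up for illustration, `stripos()` is used so that letter case does not matter, and PHP 7.1 or later is assumed for the `[$start, $end]` destructuring.

```php
<?php
// Hypothetical helper: returns every substring found between $start and $end.
function extract_between(string $haystack, string $start, string $end): array
{
    $found = [];
    $offset = 0;

    // Compare against false explicitly so a match at position 0 still counts.
    while (($a = stripos($haystack, $start, $offset)) !== false) {
        $a += strlen($start);              // first character after the start marker
        $b = stripos($haystack, $end, $a); // end marker must come after it
        if ($b === false) {
            break;                         // no closing marker left
        }
        $found[] = trim(substr($haystack, $a, $b - $a));
        $offset = $b + strlen($end);       // continue after this sandwich
    }

    return $found;
}

$result = 'sandwich: Today is a nice day.
    sandwich: Today is a cloudy day.
    sandwich: Today is a rainy day.
    sandwich type 2: Yesterday I had an awesome time.
    sandwich type 2: Yesterday I had an great time.';

// Each entry pairs one start marker with its own end marker.
$pairs = [
    ['Today is a', 'day'],
    ['Yesterday I had an', 'time'],
];

$unit = [];
foreach ($pairs as [$start, $end]) {
    $unit = array_merge($unit, extract_between($result, $start, $end));
}

print_r($unit);
```

Because the pairs are processed one after another, the results are grouped by pair rather than by position in the text; for the sample data both orders happen to coincide.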

Here is a sample of the code on Ideone.

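@ideone - Thanks a lot! ...But I get an error on: preg_match_all($reg, $haystack, $return); which says "Parse error: syntax error, unexpected 'preg_match_all' (T_STRING) in". I guess it works for you? I tried the edited version, where the haystack is a single string. – ven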

At the end of the answer there is an online example; you can take the code apart and run your own tests. Which PHP version are you using? – ElChiniNet

I forgot the ";" at the end of the previous line in the sample code; that was my mistake. I have fixed it. – ElChiniNet