2013-05-31 66 views
0

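How to get the name from JSON using a regular expression in PHP

I need to extract JSON content that sits inside some HTML tags. How can I extract the values of the "name" key from the JSON below using a regular expression?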

<div id="gwt_products_display_results" class="gwt_products_display_results"> 
       <span class="JSON" style="display: none;"> 
{ 
    "products": [ 
     { 
      "targetURL": "/athena-mineral-fabric-by-the-yard/262682", 
      "listIndex": "0", 
      "minimumPrice": 20, 
      "categoryOnSale": "false", 
      "mfPartNumber": "FF010ATM", 
      "hasAtLeastOneBuyableAndPublishedItem": "true", 
      "attributes": [], 
      "partNumber": "b_FF010ATM", 
      "itemAsProduct": "true", 
      "iapAttribute": "", 
      "productDetailTargetURL": "/athena-mineral-fabric-by-the-yard/262682", 
      "iapAttributeCode": "", 
      "beanType": "bundle", 
      "name": "Athena Mineral Fabric by the Yard", 
      "maxListPrice": 0, 
      "thumbNail": "null", 
      "hasSaleSKUs": false, 
      "productId": "262682", 
      "currencyCode": "USD", 
      "hasMoreColors": false, 
      "xPriceLabel": "null", 
      "minListPrice": 0, 
      "maximumPrice": 20, 
      "iapAttributeDisplayName": "", 
      "shortDescription": "null", 
      "listId": "SEARCHRESULTS", 
      "categoryId": "null" 
     }, 
     { 
      "targetURL": "/athena-slate-fabric-by-the-yard/262683", 
      "listIndex": "1", 
      "minimumPrice": 20, 
      "categoryOnSale": "false", 
      "mfPartNumber": "FF010ATS", 
      "hasAtLeastOneBuyableAndPublishedItem": "true", 
      "attributes": [], 
      "partNumber": "b_FF010ATS", 
      "itemAsProduct": "true", 
      "iapAttribute": "", 
      "productDetailTargetURL": "/athena-slate-fabric-by-the-yard/262683", 
      "iapAttributeCode": "", 
      "beanType": "bundle", 
      "name": "Athena Slate Fabric by the Yard", 
      "maxListPrice": 0, 
      "thumbNail": "null", 
      "hasSaleSKUs": false, 
      "productId": "262683", 
      "currencyCode": "USD", 
      "hasMoreColors": false, 
      "xPriceLabel": "null", 
      "minListPrice": 0, 
      "maximumPrice": 20, 
      "iapAttributeDisplayName": "", 
      "shortDescription": "null", 
      "listId": "SEARCHRESULTS", 
      "categoryId": "null" 
     }, 
     { 
      "targetURL": "/typewriter-keys-giclee/261307", 
      "listIndex": "2", 
      "minimumPrice": 259, 
      "categoryOnSale": "false", 
      "mfPartNumber": "WD813", 
      "hasAtLeastOneBuyableAndPublishedItem": "true", 
      "attributes": [ 
       { 
        "S7 - Overlay 1": "blank" 
       } 
      ], 
      "partNumber": "p_WD813", 
      "itemAsProduct": "true", 
      "iapAttribute": "", 
      "productDetailTargetURL": "/typewriter-keys-giclee/261307", 
      "iapAttributeCode": "", 
      "beanType": "product", 
      "name": "Typewriter Keys Giclee", 
      "maxListPrice": 0, 
      "thumbNail": "null", 
      "hasSaleSKUs": false, 
      "productId": "261307", 
      "currencyCode": "USD", 
      "hasMoreColors": false, 
      "xPriceLabel": "null", 
      "minListPrice": 0, 
      "maximumPrice": 259, 
      "iapAttributeDisplayName": "", 
      "shortDescription": "null", 
      "listId": "SEARCHRESULTS", 
      "categoryId": "null" 
     } 
    ] 
} 
</span> 
</div> 

What I have tried so far is:

<span class="JSON" style="display: none;">([\s\S]+?)<\/span> 
+4

Why??? Just use `json_decode`. – enenen

+4

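Why, for the love of all that is holy, would you want to use a regex on a data structure like JSON? __Parse__ it into an object/array and access the values you want, either directly or in a loop. – CBroe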

+0

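If you plan to ditch `json_decode()` and write your own fully featured JSON parser, you will probably need more than a regex, because JSON allows arbitrary elements to be nested to unlimited depth. What are you trying to do with this? –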

Answers

1

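Why a regex? As others here have mentioned, you can use json_decode to parse it into an array and work with that.

But if you insist on a regex, I would say /"(.+?)":/ will match all keys, provided your JSON has exactly the format shown.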
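Since the question asks for the values of the "name" key rather than the key names, a regex aimed at that one key may be closer to what is wanted. The following is only a sketch: the `$json` sample is a trimmed-down, hypothetical copy of the data in the question, and the pattern assumes every "name" value is a plain JSON string. It matches "name" at any nesting level and leaves JSON escape sequences undecoded, which is why json_decode remains the safer route.

```php
<?php
// Hypothetical sample: a trimmed-down copy of the JSON from the question.
$json = '{
    "products": [
        { "name": "Athena Mineral Fabric by the Yard", "productId": "262682" },
        { "name": "Athena Slate Fabric by the Yard", "productId": "262683" }
    ]
}';

// Capture the string value that follows every "name" key.
// (?:[^"\\]|\\.)* consumes ordinary characters and backslash escapes,
// so an escaped quote inside a value does not end the match early.
preg_match_all('/"name"\s*:\s*"((?:[^"\\\\]|\\\\.)*)"/', $json, $matches);

// $matches[1] holds the captured values, one per match.
$names = $matches[1];
print_r($names);
```

The captured strings are raw JSON text, so a value containing `\"` or `\u00e9` would still need decoding before use.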

UPDATE

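So you are getting it from an HTML string. Assuming the variable is $html, and since you insist on a regex, use the regex to pull the JSON out of the HTML and then decode it. To get the keys, use array_keys().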

preg_match('/<span.*?class="JSON".*?>(.+?)<\/span>/s', $html, $matches); 

$decoded_array = json_decode($matches[1], true); 

print_r($decoded_array); 

$keys = array_keys($decoded_array['products'][0]); 

print_r($keys); 
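A slightly more defensive variant of the same idea might check that the regex matched and that decoding succeeded before reading the result, and then collect the "name" values directly. This is a sketch under assumptions: the `$html` string is a hypothetical stand-in for the fetched page, and `array_column()` requires PHP 5.5 or later.

```php
<?php
// Hypothetical stand-in for the fetched page.
$html = '<div><span class="JSON" style="display: none;">
{"products": [{"name": "Typewriter Keys Giclee", "productId": "261307"}]}
</span></div>';

$names = [];

// [^>]* keeps each part of the match inside the opening <span ...> tag.
if (preg_match('/<span[^>]*class="JSON"[^>]*>(.+?)<\/span>/s', $html, $matches)) {
    $decoded = json_decode($matches[1], true);

    // Only read the result if the captured text was valid JSON.
    if (json_last_error() === JSON_ERROR_NONE && isset($decoded['products'])) {
        // Collect the "name" field of every product.
        $names = array_column($decoded['products'], 'name');
    }
}

print_r($names);
```

If the span is missing or its content is not valid JSON, `$names` simply stays empty instead of triggering notices.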
+0

Actually, I am talking about data inside an HTML structure, so it is hard to use json_decode –

+0

Then extract the JSON from the HTML first, and decode it afterwards. Updated the answer. – Jithin

+0

@SunithSaga: Please check the answer above. – Jithin

4

You can convert it to an array and then get the names using array_keys();
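$array = json_decode($json, true); // true: decode to associative arrays instead of objects

$keys = array_keys($array['products'][0]); // key names of the first product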
0

You can use DOMDocument and DOMXPath to find the span element that contains the JSON, and then json_decode it. Here is a rough example to get you on your way:

<?php 
$html = ' 
<html> 
    <head> 
     <title>Example</title> 
    </head> 
    <body> 
     <div id="gwt_products_display_results" class="gwt_products_display_results"> 
      <span class="JSON" style="display: none;"> 
      { 
       "products": [ 
        { 
         "targetURL": "/athena-mineral-fabric-by-the-yard/262682", 
         "listIndex": "0", 
         "minimumPrice": 20, 
         "categoryOnSale": "false", 
         "mfPartNumber": "FF010ATM", 
         "hasAtLeastOneBuyableAndPublishedItem": "true", 
         "attributes": [], 
         "partNumber": "b_FF010ATM", 
         "itemAsProduct": "true", 
         "iapAttribute": "", 
         "productDetailTargetURL": "/athena-mineral-fabric-by-the-yard/262682", 
         "iapAttributeCode": "", 
         "beanType": "bundle", 
         "name": "Athena Mineral Fabric by the Yard", 
         "maxListPrice": 0, 
         "thumbNail": "null", 
         "hasSaleSKUs": false, 
         "productId": "262682", 
         "currencyCode": "USD", 
         "hasMoreColors": false, 
         "xPriceLabel": "null", 
         "minListPrice": 0, 
         "maximumPrice": 20, 
         "iapAttributeDisplayName": "", 
         "shortDescription": "null", 
         "listId": "SEARCHRESULTS", 
         "categoryId": "null" 
        }, 
        { 
         "targetURL": "/athena-slate-fabric-by-the-yard/262683", 
         "listIndex": "1", 
         "minimumPrice": 20, 
         "categoryOnSale": "false", 
         "mfPartNumber": "FF010ATS", 
         "hasAtLeastOneBuyableAndPublishedItem": "true", 
         "attributes": [], 
         "partNumber": "b_FF010ATS", 
         "itemAsProduct": "true", 
         "iapAttribute": "", 
         "productDetailTargetURL": "/athena-slate-fabric-by-the-yard/262683", 
         "iapAttributeCode": "", 
         "beanType": "bundle", 
         "name": "Athena Slate Fabric by the Yard", 
         "maxListPrice": 0, 
         "thumbNail": "null", 
         "hasSaleSKUs": false, 
         "productId": "262683", 
         "currencyCode": "USD", 
         "hasMoreColors": false, 
         "xPriceLabel": "null", 
         "minListPrice": 0, 
         "maximumPrice": 20, 
         "iapAttributeDisplayName": "", 
         "shortDescription": "null", 
         "listId": "SEARCHRESULTS", 
         "categoryId": "null" 
        } 
       ] 
      } 
      </span> 
     </div> 
    </body>  
</html> 
'; 

$document = new DOMDocument(); 
$document->loadHTML($html); // loadHTML() is an instance method; the static call is an error on PHP 8 
$xpath  = new DOMXPath($document); 
$spans  = $xpath->query('//div/span[@class="JSON"]'); 

foreach ($spans as $span) { 
    $catalog = json_decode($span->nodeValue); 
    printf("We found %d products.\n", count($catalog->products)); 
    foreach ($catalog->products as $index => $product) { 
     printf("Product #%d - %s.\n", ++$index, $product->name); 
    } 
} 

/* 
    We found 2 products. 
    Product #1 - Athena Mineral Fabric by the Yard. 
    Product #2 - Athena Slate Fabric by the Yard. 
*/ 
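The query above matches only when the class attribute is exactly "JSON". If the span might carry additional classes, a token-based class test is a common XPath idiom. The sketch below assumes a hypothetical span with an extra `hidden` class and uses `libxml_use_internal_errors()` to silence parser warnings about imperfect markup.

```php
<?php
// Hypothetical markup: the span carries an extra class besides "JSON".
$html = '<div><span class="JSON hidden">'
      . '{"products":[{"name":"Athena Slate Fabric by the Yard"}]}'
      . '</span></div>';

// Collect parser warnings internally instead of emitting them.
libxml_use_internal_errors(true);
$document = new DOMDocument();
$document->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($document);
// Match "JSON" as a whole class token, regardless of other classes.
$query = '//span[contains(concat(" ", normalize-space(@class), " "), " JSON ")]';

$names = [];
foreach ($xpath->query($query) as $span) {
    $catalog = json_decode($span->textContent, true);
    // Skip spans whose content is not the expected JSON structure.
    if (is_array($catalog) && isset($catalog['products'])) {
        foreach ($catalog['products'] as $product) {
            $names[] = $product['name'];
        }
    }
}

print_r($names);
```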
+0

Sorry to say, I need a regex rather than DOMXPath –