2016-02-08

Converting a flat list of products and categories into a tree structure

I currently have items in the following structure:

[{ 
    "category" => ["Alcoholic Beverages", "Wine", "Red Wine"], 
    "name" => "Robertson Merlot", 
    "barcode" => '123456789-000', 
    "wine_farm" => "Robertson Wineries", 
    "price" => 60.00 
}] 

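I made up this data, but the data I am working with has the same structure, and I cannot change the incoming data.

I have more than 100,000 of these.

Each product is nested in between 1 and n (unlimited) categories.

Because of the tabular nature of this data, categories are repeated. I want to use a tree structure to prevent this repetition and reduce the file size by 25% to 30%.

The tree structure I am aiming for looks like this: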

{ 
    "type" => "category", 
    "properties" => { 
     "name" => "Alcoholic Beverages" 
    }, 
    "children" => [{ 
         "type" => "category", 
         "properties" => { 
          "name" => "Wine" 
         }, 
         "children" => [{ 
              "type" => "category", 
              "properties" => { 
               "name" => "Red Wine" 
              }, 
              "children" => [{ 
                  "type" => "product", 
                  "properties" => { 
                   "name" => "Robertson Merlot", 
                   "barcode" => '123456789-000', 
                   "wine_farm" => "Robertson Wineries", 
                   "price" => 60.00 
                  } 
                 }] 

             }] 
        }] 
} 
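  1. I can't seem to come up with an efficient algorithm to get this right. I would appreciate a pointer in the right direction.

  2. Should I generate an ID for each node and add a parent ID? I am concerned that using IDs would increase the length of the text, which I am trying to shorten.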

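What is the logic here? Why does the 'children' node depend on "Red Wine" and not on "Merlot"? What exactly are you trying to do? – Surya

That was a mistake. Fixed. – Steve

No. I don't see the update that includes Merlot as the last child. I would ask you to look at the structure again. – Surya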

Answers

Although I have simplified it a little from the structure you asked for, you can use the logic to get an idea of how to do it:

require 'pp' 
x = [{ 
    "category" => ["Alcoholic Beverages", "Wine", "Red Wine"], 
    "name" => "Robertson Merlot", 
    "barcode" => '123456789-000', 
    "wine_farm" => "Robertson Wineries", 
    "price" => 60.00 
}] 

result = {} 

x.each do |entry| 

    # Save current level in a variable 
    current_level = result 

    # We want some special logic for the last item, so let's store that. 
    item = entry['category'].pop 


    # For each category, check if it exists, else add a category hash. 
    entry['category'].each do |category| 
        unless current_level.has_key?(category) 
            current_level[category] = {'type' => 'category', 'children' => {}, 'name' => category} 
        end 
        current_level = current_level[category]['children'] # Set the new current level of the hash. 
    end 

    # Finally add the item: 
    entry.delete('category') 
    entry['type'] = 'product' 
    current_level[item] = entry 

end 

pp result 

which gives us:

{"Alcoholic Beverages"=> 
    {"type"=>"category", 
    "children"=> 
    {"Wine"=> 
     {"type"=>"category", 
     "children"=> 
     {"Red Wine"=> 
      {"name"=>"Robertson Merlot", 
      "barcode"=>"123456789-000", 
      "wine_farm"=>"Robertson Wineries", 
      "price"=>60.0, 
      "type"=>"product"}}, 
     "name"=>"Wine"}}, 
    "name"=>"Alcoholic Beverages"}} 
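In the output of the first answer above, the last category ("Red Wine") ends up as the key of the product itself, so a second product in the same category would overwrite the first. Below is a minimal sketch of one way around that, assuming every entry in `category` should remain a category node and that products are collected in a separate `products` array. The helper name `build_category_hash`, the `products` key, and the second sample product ("Example Shiraz") are introduced here for illustration and do not come from the question.

```ruby
require 'pp'

# Hypothetical helper (not from the original answer): builds a nested hash
# keyed by category name. It does not mutate the input entries.
def build_category_hash(entries)
  result = {}

  entries.each do |entry|
    level = result
    node = nil

    # Walk down the category path, creating missing category nodes.
    entry['category'].each do |category|
      node = (level[category] ||= {
        'type' => 'category',
        'name' => category,
        'children' => {},
        'products' => []
      })
      level = node['children']
    end

    # An entry without any category is skipped in this sketch.
    next if node.nil?

    # Store the product under its deepest category, without the path.
    product = entry.reject { |key, _| key == 'category' }
    node['products'] << product.merge('type' => 'product')
  end

  result
end

entries = [
  { 'category' => ['Alcoholic Beverages', 'Wine', 'Red Wine'],
    'name' => 'Robertson Merlot', 'price' => 60.00 },
  # Made-up second product to show that siblings no longer collide.
  { 'category' => ['Alcoholic Beverages', 'Wine', 'Red Wine'],
    'name' => 'Example Shiraz', 'price' => 55.00 }
]

tree = build_category_hash(entries)
pp tree
```

Keeping `children` as a hash keyed by name makes the "does this category exist yet" check a single hash lookup per level, at the cost of a shape that differs from the array-based structure in the question.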

There may be a simpler way to do this, but this is what I can come up with for now, and it should match your structure.

require 'json' 

# Initial set up, it seems the root keys are always the same looking at your structure 
products = { 
    'type' => 'category', 
    'properties' => { 
        'name' => 'Alcoholic Beverages' 
    }, 
    'children' => [] 
} 

data = [{ 
    "category" => ['Alcoholic Beverages', 'Wine', 'Red Wine'], 
    "name" => 'Robertson Merlot', 
    "barcode" => '123456789-000', 
    "wine_farm" => 'Robertson Wineries', 
    "price" => 60.00 
}] 

data.each do |item| 
    # Make sure we set the current to the top-level again 
    curr = products['children'] 

    # Remove first entry as it's always 'Alcoholic Beverages' 
    item['category'].shift 

    item['category'].each do |category| 
        # Get the index for the category if it exists 
        index = curr.index {|x| x['type'] == 'category' && x['properties']['name'] == category} 

        # If it exists then change current hash level to the child of that category 
        if index 
            curr = curr[index]['children'] 

        # Else add it in 
        else 
            curr << { 
                'type' => 'category', 
                'properties' => { 
                    'name' => category 
                }, 
                'children' => [] 
            } 

            # We can use last as we know it'll be the last index. 
            curr = curr.last['children'] 
        end 
    end 

    # Delete category from the item itself 
    item.delete('category') 

    # Add the item as product type to the last level of the hash 
    curr << { 
        'type' => 'product', 
        'properties' => item 
    } 
end 

puts JSON.pretty_generate(products) 
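
The `curr.index` lookup above scans the children array once for every category of every product, and the first category is hard-coded as the root. With an input of the size mentioned in the question (more than 100,000 products), one option is to keep a side index from category path to node, so that each lookup is a hash access. The sketch below assumes that approach. The names `build_tree` and `index`, the choice to return an array of root nodes, and the made-up sample products are illustrations and not part of the original answer.

```ruby
require 'json'

# Hypothetical helper: builds the question's target structure in one pass.
# `index` maps a full category path (an array of names) to its node, so
# finding an existing category is a hash lookup instead of an array scan.
def build_tree(items)
  roots = []
  index = {}

  items.each do |item|
    siblings = roots
    path = []

    item['category'].each do |name|
      path += [name] # builds a new array, so stored hash keys never change
      node = index[path]
      unless node
        node = {
          'type' => 'category',
          'properties' => { 'name' => name },
          'children' => []
        }
        index[path] = node
        siblings << node
      end
      siblings = node['children']
    end

    # Copy the product without its category path; the input is not mutated.
    properties = item.reject { |key, _| key == 'category' }
    siblings << { 'type' => 'product', 'properties' => properties }
  end

  roots
end

data = [
  { 'category' => ['Alcoholic Beverages', 'Wine', 'Red Wine'],
    'name' => 'Robertson Merlot', 'barcode' => '123456789-000',
    'wine_farm' => 'Robertson Wineries', 'price' => 60.00 },
  # The two entries below are made up for illustration.
  { 'category' => ['Alcoholic Beverages', 'Wine', 'Red Wine'],
    'name' => 'Example Shiraz', 'price' => 55.00 },
  { 'category' => ['Snacks'], 'name' => 'Example Crisps', 'price' => 12.50 }
]

forest = build_tree(data)
puts JSON.pretty_generate(forest)

# Compare serialized sizes on real data before committing to the format.
puts "flat: #{JSON.generate(data).bytesize} bytes"
puts "tree: #{JSON.generate(forest).bytesize} bytes"
```

Whether the tree is actually smaller depends on how long the category paths are and how many products share them, since each node also carries the `type`, `properties`, and `children` keys. Measuring both serializations on a real sample, as the last two lines do, is a cheap way to check. The nesting already encodes each node's parent, so this layout needs no explicit ID or parent ID fields.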