2017-05-03 27 views
4

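Parsing a JSON string that is larger than memory

The platform I'm working on has very strict memory limits, and I'm trying to find a way to parse large JSON strings without ever loading more than a few hundred bytes into memory. The JSON string is stored in a file on a much bigger chip (flash memory).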

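There are two things I can't really find a good solution for:

  1. Accessing a certain value by specifying a "path" like foo["bar"][2]. (If the value is an array/object, we should only return the fact that it is an array/object, and possibly whether it is empty or not.)
  2. Iterating over any object/array within the JSON.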

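So basically I need functions that, when called, parse the JSON step by step and keep only the parts we actually need in order to continue parsing.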

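For the interface, I don't think it is possible to have something like exampleJson["aa"][2]["gg"], but I managed to get really close: exampleJson["aa"][2]["gg"](). This causes a function to be called, which can then easily access {'aa',2,'gg'} and read/parse the JSON from the file.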

This is my code so far, but I really don't know how to continue:
https://repl.it/HfwS/2

-- Looks complicated, but is pretty simple. Using meta tables we create a json interface that can almost be accessed as if it was a lua table. 
-- E.g. example["aa"][2]["gg"]() ; the only difference is that we have to use parentheses at the end 
-- The problematic part starts where it says `THIS IS WHERE THE JSON PARSING WOULD HAPPEN` 
json = {} 
setmetatable(json, { 
    __call = function(_, filePath) -- __call receives the called table first, then the arguments 
     local jsonFile = _file.open(filePath) 
     local fileLen = jsonFile:stat().size 

     local patternTable = {} -- Will store `{'aa',2,'gg'}` for `example['aa'][2]['gg']()` 

     local fakeJson = {} 
     setmetatable(fakeJson, { 
      __index = function (t, k) 
       patternTable[#patternTable+1] = k 
       return fakeJson 
      end; 
      __call = function() 

       -- THIS IS WHERE THE JSON PARSING WOULD HAPPEN -- 

       -- The patternTable contains {'aa',2,'gg'} at this point 

       -- Loop through the json file char by char 
       local valueToReturn = '' 
       local filePos = 0 
       for i=1, fileLen do 
        jsonFile:seek("set", filePos) 
        local currentChar = jsonFile:read(1) -- read character at current position 
        filePos = filePos + 1 
        -- print(currentChar) 

        -- Now the question is, how do we parse the json? 
        print('Magic to parse the json') 
        -- valueToReturn = ? 
       end 

       patternTable = {} -- Reset the patternTable 
       return valueToReturn 
      end; 
     }) 
     return fakeJson 
    end; 
}) 


local fakeParsedJson = json('example.json') 
local value = fakeParsedJson["aa"][2]["gg"]() -- Notice the `()` in the end 

print(value) 
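
The path-recording part of the snippet above can be tried on its own, without any file access. The following is a minimal sketch under that assumption: `path_proxy` and `resolve` are hypothetical names, and the resolver here only joins the recorded path into a string instead of parsing anything.

```lua
-- Returns a proxy that records every index applied to it. Calling the proxy
-- hands the recorded path to `resolve` and starts a fresh path.
local function path_proxy(resolve)
  local path = {}
  local proxy = {}
  setmetatable(proxy, {
    __index = function(_, key)
      path[#path + 1] = key
      return proxy
    end,
    -- __call receives the called table itself as its first argument
    __call = function(_)
      local recorded = path
      path = {}
      return resolve(recorded)
    end,
  })
  return proxy
end

-- Stand-in resolver: join the path instead of parsing a file
local example = path_proxy(function(path)
  local parts = {}
  for i, key in ipairs(path) do
    parts[i] = tostring(key)
  end
  return table.concat(parts, '/')
end)

print(example["aa"][2]["gg"]()) -- prints aa/2/gg
```

Because the path is reset inside `__call`, the same proxy can be reused for the next lookup.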

Answers

0

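I spent some time thinking about how this could be done and finally managed to pull it off. Retrieving values and iterating over arrays/objects works like a charm. If you know a better way, please let me know. (I'm not too happy with the code; it looks like it could be much cleaner.) But it works.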

If you want to try it, here is a fiddle: https://repl.it/HfwS/31

json = {}
setmetatable(json, {
    -- __call receives the called table first, so the file path is the second argument
    __call = function(_, filePath)
        local jsonFile = _file.open(filePath) -- `_file` is the platform's file API (not defined here)
        local fileLen = jsonFile:stat().size

        local jsonPath = {} -- Would store `{'aa',2,'gg'}` for `example['aa'][2]['gg']()`

        local fakeJson = {}
        setmetatable(fakeJson, {
            __index = function (t, k)
                jsonPath[#jsonPath+1] = k
                return fakeJson
            end;
            __call = function()
                local path = jsonPath -- contains {'aa',2,'gg'} at this point
                jsonPath = {} -- Reset the jsonPath for the next lookup

                -- Read the file one character at a time. `cur` is the character
                -- under the cursor (nil at the end of the file).
                local filePos = 0
                local cur
                local function advance()
                    if filePos < fileLen then
                        jsonFile:seek("set", filePos)
                        cur = jsonFile:read(1)
                        filePos = filePos + 1
                    else
                        cur = nil
                    end
                end
                local function skipSpace()
                    while cur and string.find(cur, '%s') do advance() end
                end
                -- Consume a string token starting at its opening quote and return
                -- its text when `keep` is set. Escapes are only stepped over:
                -- the escaped character is taken literally (no \n or \uXXXX decoding).
                local function readString(keep)
                    local text = keep and '' or nil
                    advance() -- opening quote
                    while cur and cur ~= '"' do
                        if cur == '\\' then advance() end
                        if keep and cur then text = text .. cur end
                        advance()
                    end
                    advance() -- closing quote
                    return text
                end
                -- Step over one complete value (scalar, object or array) without storing it
                local function skipValue()
                    skipSpace()
                    local depth = 0
                    repeat
                        if cur == nil then return end
                        if cur == '"' then
                            readString(false)
                        else
                            if cur == '{' or cur == '[' then
                                depth = depth + 1
                            elseif cur == '}' or cur == ']' then
                                depth = depth - 1
                            end
                            advance()
                        end
                    until depth == 0 and (cur == nil or string.find(cur, '[,%]}%s]'))
                end
                -- After a value: move past a following comma, if there is one
                local function nextItem()
                    skipSpace()
                    if cur ~= ',' then return false end
                    advance()
                    skipSpace()
                    return true
                end

                -- Follow the jsonPath one dimension at a time
                advance()
                for _, key in ipairs(path) do
                    skipSpace()
                    local found = false
                    if type(key) == 'string' and cur == '{' then -- Parsing { object
                        advance()
                        skipSpace()
                        while cur == '"' do -- loop over keys until we find it
                            local name = readString(true)
                            skipSpace()
                            advance() -- the ':'
                            if name == key then
                                found = true
                                break
                            end
                            skipValue()
                            if not nextItem() then break end
                        end
                    elseif type(key) == 'number' and cur == '[' then -- Parsing [ array
                        advance()
                        skipSpace()
                        local index = 1
                        while cur and cur ~= ']' do
                            if index == key then
                                found = true
                                break
                            end
                            skipValue()
                            if not nextItem() then break end
                            index = index + 1
                        end
                    end
                    if not found then return nil end -- the path does not exist
                end

                -- jsonPath followed. Now we can extract the value.
                skipSpace()
                if cur == '"' then -- string
                    return readString(true)
                elseif cur == '{' then -- object: return its keys, the values are placeholders
                    local keys = {}
                    advance()
                    skipSpace()
                    while cur == '"' do
                        keys[readString(true)] = 0
                        skipSpace()
                        advance() -- the ':'
                        skipValue()
                        if not nextItem() then break end
                    end
                    return keys
                elseif cur == '[' then -- array: one placeholder per element
                    local items = {}
                    advance()
                    skipSpace()
                    while cur and cur ~= ']' do
                        items[#items+1] = 0
                        skipValue()
                        if not nextItem() then break end
                    end
                    return items
                elseif cur == 't' then -- true
                    return true
                elseif cur == 'f' then -- false
                    return false
                elseif cur == nil or cur == 'n' then -- null
                    return nil
                end
                local rawValue = '' -- numbers 0.3, .3, 99 etc
                while cur and not string.find(cur, '[,%]}%s]') do
                    rawValue = rawValue .. cur
                    advance()
                end
                return tonumber(rawValue)
            end;
        })
        return fakeJson
    end;
})



local example = json('example.json') 

-- Read a value 
local value = example["aa"][2]['k1']() 
print(value) 

-- Loop over a key value table and print the keys and values 
-- (tostring is needed because a value can be a table, a boolean or nil) 
for key, value in pairs(example["aa"][2]()) do 
    print('key: ' .. key, 'value: ' .. tostring(example["aa"][2][key]())) 
end 

The JSON validation could be better, but if you supply invalid JSON data you shouldn't expect anything anyway.
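
The code above relies on a platform-specific `_file` object that is not defined in the snippet. To run it with a desktop Lua interpreter, one option is a small stand-in built on the standard `io` library. This is only a sketch: it assumes `_file` needs nothing beyond the calls that appear above (`open`, `stat().size`, `seek` and `read`), and the `close` method is an addition for convenience. It has to be defined before `json(...)` is called.

```lua
-- Hypothetical stand-in for the platform's `_file` API, built on standard `io`.
-- Only the calls used by the answer's code are provided.
_file = {}

function _file.open(path)
  local handle = assert(io.open(path, 'rb'))
  local wrapper = {}

  function wrapper:stat()
    local pos = handle:seek('cur')   -- remember the current position
    local size = handle:seek('end')  -- seeking to the end returns the file size
    handle:seek('set', pos)          -- restore the position
    return { size = size }
  end

  function wrapper:seek(whence, offset)
    return handle:seek(whence, offset)
  end

  function wrapper:read(n)
    return handle:read(n)            -- nil at the end of the file
  end

  function wrapper:close()
    return handle:close()
  end

  return wrapper
end
```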

0

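If you want to decode a single JSON element (object, array, etc.) instead of decoding the whole JSON, your JSON library needs two features:

  • "traverse" functionality (a dry-run decode that does not create Lua objects)
  • the ability to pass the JSON in as a sequence of small parts (instead of preloading the whole JSON as one huge Lua string).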
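
The two features can be illustrated without any JSON library. The sketch below is not the module used in this answer. `char_reader`, `dry_run` and `string_loader` are hypothetical names, and the scanner only tracks nesting and string tokens instead of doing a real decode. It shows a loader that hands over the text in small chunks, and a pass over the document that creates no Lua tables for the JSON values.

```lua
-- Turns a chunk loader (a function returning the next piece of text, or nil
-- at the end) into a function that returns one character at a time.
local function char_reader(loader)
  local chunk, pos = '', 1
  return function()
    while pos > #chunk do
      chunk = loader()
      if not chunk then
        chunk, pos = '', 1
        return nil
      end
      pos = 1
    end
    local c = chunk:sub(pos, pos)
    pos = pos + 1
    return c
  end
end

-- Walks the whole document without building any Lua tables. Returns the
-- maximum nesting depth, the number of string tokens, and whether every
-- opened bracket was closed.
local function dry_run(loader)
  local next_char = char_reader(loader)
  local depth, max_depth, strings = 0, 0, 0
  local c = next_char()
  while c do
    if c == '"' then
      strings = strings + 1
      c = next_char()
      while c and c ~= '"' do
        if c == '\\' then next_char() end -- step over the escaped character
        c = next_char()
      end
    elseif c == '{' or c == '[' then
      depth = depth + 1
      if depth > max_depth then max_depth = depth end
    elseif c == '}' or c == ']' then
      depth = depth - 1
    end
    c = next_char()
  end
  return max_depth, strings, depth == 0
end

-- Loader that serves a Lua string in pieces of `size` bytes (for testing;
-- on a real device this would read from the file instead)
local function string_loader(text, size)
  local pos = 1
  return function()
    if pos > #text then return nil end
    local piece = text:sub(pos, pos + size - 1)
    pos = pos + size
    return piece
  end
end

local text = '{"aa":["qq",{"k1":23,"gg":"YAY","Fermat_primes":[3, 5, 17, 257, 65537]}]}'
print(dry_run(string_loader(text, 8))) -- max depth 4, 6 strings, balanced = true
```

Only one chunk of at most `size` bytes is held at a time, so the memory needed does not grow with the size of the document.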

Example:
How to partially decode JSON using this module:

-- This is content of data.txt file: 
-- {"aa":["qq",{"k1":23,"gg":"YAY","Fermat_primes":[3, 5, 17, 257, 65537]}]} 
-- We want to extract as Lua values only "Fermat_primes" array and "gg" string 
local json = require('json') 

-- Open file 
local file = assert(io.open('data.txt', 'r')) 

-- Define loader function which will read the file in 64-byte chunks 
local function my_json_loader() 
    return file:read(64) 
end 

local FP, gg 
-- Prepare callback function for traverse with partial decode 
local function my_callback (path, json_type, value) 
    path = table.concat(path, '/') 
    if path == "aa/2/Fermat_primes" then 
     FP = value 
     return true -- we want to decode this array instead of traverse through it 
    elseif path == "aa/2/gg" then 
     gg = value 
    end 
end 

json.traverse(my_json_loader, my_callback) 

-- Close file 
file:close() 

-- Display the results 
print('aa.2.gg = '..gg) 
print('aa.2.Fermat_primes:') 
for k, v in ipairs(FP) do print(k, v) end 

Output:

aa.2.gg = YAY 
aa.2.Fermat_primes: 
1 3 
2 5 
3 17 
4 257 
5 65537