Elasticsearch shows wrong data when searching

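I have a dataset of more than one million rows. I have integrated Elasticsearch with MySQL using Logstash. When I request the following URL in Postman, the search returns the wrong data: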

http://localhost:9200/persondetails/Document/_search?q=*

I get the following:

{ 
"took": 1, 
"timed_out": false, 
"_shards": { 
    "total": 5, 
    "successful": 5, 
    "failed": 0 
}, 
"hits": { 
    "total": 2, 
    "max_score": 1, 
    "hits": [ 
     { 
      "_index": "persondetails", 
      "_type": "Document", 
      "_id": "%{idDocument}", 
      "_score": 1, 
      "_source": { 
       "iddocument": 514697, 
       "@timestamp": "2017-08-31T05:18:46.916Z", 
       "author": "vaibhav", 
       "expiry_date": null, 
       "@version": "1", 
       "description": "ly that", 
       "creation_date": null, 
       "type": 1 
      } 
     }, 
     { 
      "_index": "persondetails", 
      "_type": "Document_count", 
      "_id": "AV4o0J3OJ5ftvuhV7i0H", 
      "_score": 1, 
      "_source": { 
       "query": { 
        "term": { 
         "author": "rishav" 
        } 
       } 
      } 
     } 
    ] 
}

}

This is wrong: my table has more than one million rows, yet the response reports a total of only 2. I can't find what is wrong here.

When I request http://localhost:9200/_cat/indices?v it shows:

health: yellow
status: open
index: persondetails
uuid: 4FiGngZcQfS0Xvu6IeHIfg
pri: 5
rep: 1
docs.count: 2
docs.deleted: 1054
store.size: 125.4kb
pri.store.size: 125.4kb

This is my logstash.conf file:

input { 
jdbc { 
    jdbc_connection_string => "jdbc:mysql://127.0.0.1:3306/persondetails" 
    jdbc_user => "root" 
    jdbc_password => "" 
    schedule => "* * * * *" 
    jdbc_validate_connection => true 
    jdbc_driver_library => "/usr/local/Cellar/logstash/5.5.2/mysql-connector-java-3.1.14/mysql-connector-java-3.1.14-bin.jar" 
    jdbc_driver_class => "com.mysql.jdbc.Driver" 
    statement => "SELECT * FROM Document" 
    type => "persondetails" 
} 
} 
output { 
elasticsearch { 
    #protocol=>http 
    index =>"persondetails" 
    document_type => "Document" 
    document_id => "%{idDocument}" 
    hosts => ["http://localhost:9200"] 
    stdout{ codec => rubydebug} 
} 
}


2017-08-31 Vaibhav Savala

Where in this response do you see a total of 1? – Val

Sorry, it is actually 2. You can see the total is 2, but my table has 10 lakh (one million) rows. –

I see you have different mapping types. What do you get when you run 'GET http://localhost:9200/persondetails/_search?q=*'? – Val

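From your results, it looks like there is a problem in your Logstash configuration: the document_id is not being generated, so each document overwrites the previous one. Effectively, the index holds only one document, with the document ID "%{idDocument}".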

See the following _source snippet from the result of the search query you provided:

"_source": { 
      "iddocument": 514697, 
      "@timestamp": "2017-08-31T05:18:46.916Z", 
      "author": "vaibhav", 
      "expiry_date": null, 
      "@version": "1", 
      "description": "ly that", 
      "creation_date": null, 
      "type": 1 
}

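The small size of the index also suggests that it contains no other documents. You should check whether your jdbc input actually provides the "idDocument" field.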
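
As a rough sketch of that check: the _source above carries the key "iddocument" rather than "idDocument", which would be consistent with the jdbc input lowercasing column names (its default behavior). When a "%{...}" reference names a field that does not exist on the event, Logstash leaves the text as a literal string, so every row would be written under the same _id. Assuming that is what is happening here, the output could reference the lowercased field instead. This is illustrative, not a verified fix for this exact setup:

```conf
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "persondetails"
    document_type => "Document"
    # Assumption: the jdbc input has lowercased the column name to "iddocument".
    # A reference to a missing field stays as literal text and becomes the _id.
    document_id => "%{iddocument}"
  }
  # stdout is its own output plugin, so it sits beside elasticsearch, not inside it.
  stdout { codec => rubydebug }
}
```

The rubydebug output prints each event to the console, so the exact field names the jdbc input produces can be read directly from there.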


2017-08-31 08:10:56 Animesh

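Yes, thank you. I changed it from idDocument to iddocument in my conf file, and it works. I don't know why this happens, because the column name in my table is idDocument. It looks like the jdbc input is changing it to iddocument. –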

Hi @VaibhavSavala, you're welcome. You can use "lowercase_column_names => false" in the jdbc input definition to stop this from happening. The flag is documented here: https://www.elastic.co/guide/en/logstash/current/plugins-inputs-jdbc.html#plugins-inputs-jdbc-lowercase_column_names. If this answers your question, please consider accepting it (https://meta.stackexchange.com/q/5234/179419) – Animesh
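
For reference, a minimal sketch of where that flag would go, reusing the connection string and statement from the question. With it set, the event should keep the column's original casing, so the existing document_id => "%{idDocument}" in the output could stay unchanged. This has not been tested against the asker's database:

```conf
input {
  jdbc {
    jdbc_connection_string => "jdbc:mysql://127.0.0.1:3306/persondetails"
    statement => "SELECT * FROM Document"
    # Keep column names exactly as the database returns them.
    # The default is true, which lowercases idDocument to iddocument.
    lowercase_column_names => false
    # ... remaining settings (user, driver, schedule, type) as in the question ...
  }
}
```

Documents already written under the literal ID "%{idDocument}" are not cleaned up by this change, so the stale entry may still appear in search results until it is deleted.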
