2016-06-01 81 views

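SOLR: How to limit the searched content in a Solr query

I want to search for a word only up to a specific line, not in the whole content, using a Solr query. I tried a proximity match, but it did not work. My data looks like this: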

"Date: Thu, 24 Jul 2014 09:36:44 GMT\nCache-Control: private\nContent-Type: application/json; charset=utf-8\nContent-Encoding: gzip\nVary: Accept-Encoding\nP3P: CP=%20CURo TAIo IVAo IVO ONL UNI COM NAV INT DEM STA OUR%20\nX-Powered-By: ASP.NET\nContent-Length: 570\nKeep-Alive: timeout=120\nConnection: Keep-Alive\n\n[{%20rows%20:[],%20index%20:[],%20folders%20:[[%20Inbox%20,%20Inbox%20,%20%20,1,1,0,0,0,%20Inbox%20,0,0,%20none%20,0],[%20Drafts%20,%20Drafts%20,%20%20,1,1,0,0,0,%20Drafts%20,0,0,%20none%20,0],[%20Sent%20,%20Sent%20,%20%20,1,1,0,0,11,%20Sent%20,1,0,%20none%20,0],[%20Spam%20,%20Spam%20,%20%20,1,1,0,0,0,%20Spam%20,1,0,%20none%20,0],[%20Deleted%20,%20Trash%20,%20%20,1,1,0,7,9,%20Deleted%20,1,0,%20none%20,0],[%20Saved%20,%20Saved Mail%20,%20%20,1,1,0,0,0,%20Saved%20,1,0,%20none%20,0],[%20SavedIMs%20,%20Saved Chats%20,%20Saved%20,2,1,0,0,0,%20SavedIMs%20,1,0,%20none%20,0]],%20fcsupport%20:true,%20hasNewMsg%20:false,%20totalItems%20:0,%20isSuccess%20:true,%20foldersCanMoveTo%20:[%20Sent%20,%20Spam%20,%20Deleted%20,%20Saved%20,%20SavedIMs%20],%20indexStart%20:}} POST /38664-816/aol-6/en-us/common/rpc/RPC.aspx?user=hl1lkgReIh&transport=xmlhttp&r=0.019667088333411797&a=GetMessageList&l=31211 HTTP/1.1\nHost: mail.aol.com\nUser-Agent: Mozilla/5.0 (Windows NT 5.1; rv:31.0) Gecko/20100101 Firefox/31.0\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\nAccept-Language: en-US,en;q=0.5\nAccept-Encoding: gzip, deflate\nContent-Type: application/x-www-form-urlencoded; charset=UTF-8\nX-Requested-With: XMLHttpRequest\nReferer: http://mail.aol.com/38664-816/aol-6/en-us/Suite.aspx\nContent-Length: 452\nCookie: mbox=PC#1405514778803-136292.22_06#1407395182|session#1406185366924-436868#1406187442|check#true#1406185642; s_pers=%20s_fid%3D55C638B5F089E6FB-19ACDEED1644FD86%7C1469344726539%3B%20s_getnr%3D1406186326569-Repeat%7C1469258326569%3B%20s_nrgvo%3DRepeat%7C1469258326571%3B; s_vi=[CS]V1|29E33A0D051D366F-60000105200097FF[CE]; UNAUTHID=1.5efb4a11934a40b8b5272557263dadfe.88c5; RSP_COOKIE=type=3&
name=LTState=ver:5&LAV:22&UN:*UQo5AwAnAytffwJSYg%3D%3D&SN:*UQo5AwAnAytffwJSYg%3D%3D&UV:AOL&LC:EN-US&UD:aol.com&EA:*UQo5AwAnAytffwJSCAsnWWoJASZL&PRMC:825345&MT:6&AMS:1&CMAI:365&SNT:0&vnop:False&MH:core-mia002b.r1000.mail.aol.com&峯br:100&WM:mail.aol.com&CKD:.mail.aol.com&ckp:%2f&ha:1NGRuUTRRxGFF2s5A4JwkuCT43Q%3d&; aolweatherlocation=10003; DataLayer=cons%3D6.107%26coms%3D629; grvinsights=69f3a2bb86ed3cd31aa1d14a1ce9e845; CUNAUTHID=1.5efb4a11934a40b8b5272557263dadfe.88c5; s_sess=%20s_cc%3Dtrue%3B%20s_sq%3Daolcmp%253D%252526pid%25253Dcmp%2525253A%25252520Help%25252520%2525257C%25252520View%25252520Article%2525253A%25252520Clear%25252520cookies%2525252C%25252520cache%2525252C%25252520history%25252520and%25252520footprints%252526pidt%25253D1%252526oid%25253Dhttp%2525253A%2525252F%2525252Fwebmail.aol.com%2525252F%2525253F_AOLLOCAL%2525253Dmail%252526ot%25253DA%2526aolsnssignin%253D%252526pid%25253Dsso%25252520%2525253A%25252520login%252526pidt%25253D1%252526oid%25253DSign%25252520In%252526oidt%25253D3%252526ot%25253DSUBMIT%3B; L7Id=31211; Context=ver:3&sid:923f783b-bc6e-4edf-87c9-e52f19b3ce67&rt:STANDARD&i:f&ckd:.mail.aol.com&ckp:%2f&ha:X80Ku4ffRKsOVSwgmEVPCfpfxeU%3d&; IDP_A=S-1-V0c3QiuO6BzQ5S6_u3s0brfUqMCktezAz7sWlVfHD90omIijDXRrMJkSM-9-xcnUcSTnXbcZ1aUCgvfuToVeJihcftKY5KtsC_nB7Y9qf6P0xUnNfCIAmWVtRf4ctSQ9JwRIzHa40dhFuULwYLu3NUPTxckeFUFAzcSS4hrmb4grhEtyOGp0qV5rIKtjs4u8; MC_CMP_ESK=NonSense; SNS_AA=asrc=2&sst=1406185424&type=0; _utd=GD#MzRb%2FjjHIe8odpr%2FfxZR2g%3D%3D|PR#a|ST#sns.webmail.aol.com|UID#; AUTH=ver:22&UAS:*UQo5AwAnAytffwJSZAskRiwLBSIDWVpVXxVTVwJCLFxdSnpHUWBbeV1jcikERgl6CEYLJUweGUhdFQQLW1h%2bBAZRcllWfVl8VH4DUmRaZARoPhw%2bBFBA&IDL:0&UN:*UQo5AwAnAytffwJSYg%3D%3D&at:SNS&SN:*UQo5AwAnAytffwJSYg%3D%3D&WIM:%252FwQCAAAAAAAEk2ihy%252BE4MMebm4R1jvxY07zNZhFOHSz2EFBnsNdOAUsl8QyZceo54kWYZ4vwVayLFF7w&sty:0&UD:aol.com&UID:hl1lkgReIh&SS:635417678271359104&SVS:SNS_AA%7c1406185424&
LA:635417687268954835&AAT:A&act:M&峯br:100&CBR:AOL&MT:&pay:0&MBT:摹&UV:AOL&LC:EN-US&bid:1&ACD:1403348988&PIX:3829&PRMC:825345&RELM:AOL&麻將:%2\nConnection: keep-alive\n"

I want to search for Content-Type: application/json only in the data up to that line, not in anything after it. I have tried

http://192.168.0.164:8983/solr/collection_with_all_details/select?q=Content%3AContent-Type+json*&wt=json&indent=true

but it searches the entire content. I need to limit the part of the content that is searched.

Answers

0

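I don't think this is possible in this case. You could look at the highlighter, so that the highlighted part of the response returns only the first 200 characters.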
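As a rough illustration of the highlighter suggestion, the sketch below builds a request URL that asks Solr for highlighted fragments of about 200 characters. The host, collection, and `Content` field are taken from the question; the phrase query, the helper name `build_highlight_url`, and the value of `hl.fragsize` are assumptions. Highlighting only shortens what comes back in the response; it does not restrict which part of the field is searched.

```python
from urllib.parse import urlencode

# Endpoint taken from the question; adjust to your own host and collection.
BASE = "http://192.168.0.164:8983/solr/collection_with_all_details/select"


def build_highlight_url(phrase: str, field: str = "Content", frag_size: int = 200) -> str:
    """Build a select URL with highlighting enabled (hypothetical helper)."""
    params = {
        "q": f'{field}:"{phrase}"',      # phrase query on the searched field
        "wt": "json",
        "indent": "true",
        "hl": "true",                    # turn highlighting on
        "hl.fl": field,                  # field to build fragments from
        "hl.fragsize": str(frag_size),   # approximate fragment size in characters
    }
    return BASE + "?" + urlencode(params)


url = build_highlight_url("Content-Type: application/json")
print(url)
```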

You may need to write a custom response writer, which could help with this.

Another option would be to create one more field with indexed="false" stored="true", which would be more efficient.

Define your original field with indexed="true" stored="false", and your index size will shrink. The new copy field will be indexed="false" stored="true".

<copyField source="text" dest="textShort" maxChars="200"/> 

Check whether this works for you.

0

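You should really preprocess your data so that only the part you want to use gets indexed. Doing this after the fact is not a good solution, because most of the content is already in your index, and the delimiter you are looking for is not located at a fixed character position (a fixed cutoff is all that maxChars would be able to handle).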
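A small sketch of why a fixed cutoff does not fit here: the end of the Content-Type line falls at a different offset as soon as the headers in front of it differ. The two sample documents and the helper `end_of_marker_line` are made up for illustration, and real newlines are assumed where the question's data shows `\n`.

```python
def end_of_marker_line(text: str, marker: str) -> int:
    """Return the offset just past the first line containing marker, or -1."""
    start = text.find(marker)
    if start == -1:
        return -1
    end = text.find("\n", start)
    return len(text) if end == -1 else end


# Two hypothetical documents; the second has one extra header line.
doc_a = (
    "Date: Thu, 24 Jul 2014 09:36:44 GMT\n"
    "Content-Type: application/json; charset=utf-8\n"
    "Vary: Accept-Encoding\n\n[]"
)
doc_b = (
    "Date: Thu, 24 Jul 2014 09:36:44 GMT\n"
    "Cache-Control: private\n"
    "Content-Type: application/json; charset=utf-8\n"
    "Vary: Accept-Encoding\n\n[]"
)

offsets = [end_of_marker_line(d, "Content-Type:") for d in (doc_a, doc_b)]
print(offsets)  # the two offsets differ, so no single cutoff suits both
```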

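Depending on how you index, you can do this in the indexing step (regextransformer, SolrJ in your own code, etc.), or in the analysis step with something like a patternreplacefilter. That lets you remove everything after the header you are looking for.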
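The pattern-replace idea can be tried out in plain Python before moving it into the indexing or analysis step. This is only a sketch of the regular expression, not Solr configuration; the pattern, the function name, and the sample text are assumptions, and real newlines are assumed in the data.

```python
import re

# Keep the first Content-Type line (group 1) and drop everything after it.
# re.DOTALL lets ".*" run across line breaks to the end of the text.
AFTER_CONTENT_TYPE = re.compile(r"(Content-Type:[^\n]*).*", re.DOTALL)


def strip_after_content_type(text: str) -> str:
    """Remove all text that follows the first Content-Type line."""
    return AFTER_CONTENT_TYPE.sub(r"\1", text, count=1)


sample = (
    "Date: Thu, 24 Jul 2014 09:36:44 GMT\n"
    "Content-Type: application/json; charset=utf-8\n"
    "Content-Encoding: gzip\n\n"
    "[{...}] POST /rpc HTTP/1.1\nContent-Type: application/x-www-form-urlencoded"
)
trimmed = strip_after_content_type(sample)
print(trimmed)
```

Text without a Content-Type line is returned unchanged, so documents that lack the header are still indexed in full.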

That way you should be able to index the content into, for example, a header field and a body field, depending on your needs.
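A minimal sketch of that split, done before the document is sent to Solr. It assumes the headers end at the first blank line and that the data contains real newlines; the field names `header` and `body` follow the suggestion above, while `split_header_body` and the sample text are hypothetical.

```python
def split_header_body(raw: str) -> dict:
    """Split raw text at the first blank line into header and body fields."""
    header, _sep, body = raw.partition("\n\n")
    return {"header": header, "body": body}


raw = (
    "Date: Thu, 24 Jul 2014 09:36:44 GMT\n"
    "Content-Type: application/json; charset=utf-8\n"
    "\n"
    '[{"rows": []}]'
)
doc = split_header_body(raw)
print(doc["header"])
```

A query such as `header:"Content-Type: application/json"` would then only look at the header part, provided a `header` field with a suitable type is defined in the schema.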