Return the first number from a function if a keyword matches

I'm processing data and have it set up to spit out the items I need. Example:

LOT OF 4 American motor vinegar 
Lot of (6) 808 metal/steel/G/N LWAP 
LOT 12 product number 57838290

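I want it to spit out the lot quantity for each item whenever 'lot', in lowercase or uppercase, is found in the text. I think my code is half built, but since the value isn't in a set position, I don't know how to retrieve it. Also, the list above comes from text strings, so the numbers aren't recognized as integers.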

def auction(title):
    for word in title.split():
        if word.startswith('lot'):
            return  # not sure what to return (from the example the answer would be 4, 6 and 12)


2015-06-22 57JWL

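*Since the value isn't in a set position, I don't know how to retrieve it* Loop over the string until you find a digit, then stop the loop at the first character after it that is not a digit. –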

You could rewrite it as follows:

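def auction(title):
    found = False
    for word in title.split():
        # Start looking for a number once a word beginning with "lot" is seen.
        if word.upper().startswith('LOT'):
            found = True
        if found:
            # Strip parentheses so that "(6)" also counts as a number.
            token = word.strip('()')
            if token.isdigit():
                return int(token)
    return None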

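The basis is the same as yours: once a word starting with LOT is found (in any mix of upper and lower case), we set a boolean to True. From then on we check whether each word is a number and, if it is, return its value.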


2015-06-22 22:17:38 MCSH

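The problem is that it's pulling from a text string, so it doesn't recognize integers. It finds 'lot', but then there is no number to find, so it always returns None when it runs. – 57JWL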

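@57JWL I checked it and it works fine. The only problem was that upper in word.upper().startswith('LOT') was missing its (). For your first example it returns 4, and so on. The isdigit() function works on strings and checks whether the string could be a number, which I think is what you were worried about. – MCSH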
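To illustrate the point about isdigit(), here is a small sketch using one of the titles from the question. It only shows how the method behaves on string tokens; the variable names are made up for the illustration. Note that a token wrapped in parentheses, such as "(6)", is not treated as a number unless the parentheses are stripped first:

```python
tokens = "Lot of (6) 808 metal/steel/G/N LWAP".split()

# str.isdigit() is True only when every character in the string is a digit.
plain = [tok for tok in tokens if tok.isdigit()]

# Stripping surrounding parentheses first lets "(6)" be recognised as well.
stripped = [tok.strip("()") for tok in tokens if tok.strip("()").isdigit()]

print(plain)     # ['808']
print(stripped)  # ['6', '808']
```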

You can use a list comprehension to see whether you need to parse the string at all:

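num = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']

def parse_the_string(the_string):
    # Return the first run of digits after "LOT" as a string ('' if none).
    the_string = the_string.upper()
    start = the_string.find("LOT")
    if start == -1:
        return ''
    the_number = ''
    for n in the_string[start:]:
        if n.isdigit():
            the_number += n
        elif the_number:
            # The first non-digit after the number ends it.
            break
    return the_number

t = 'this is a lot of 10'
# Only parse when the string contains at least one digit.
if [e for e in t if e in num]:
    print(parse_the_string(t))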


2015-06-22 22:33:50

You can use a regular expression on the string:

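import re

def auction(title):
    for word in title.split():
        # Compare case-insensitively so "LOT", "Lot" and "lot" all match.
        if word.lower().startswith('lot'):
            # Take the first run of digits anywhere in the title.
            search_result = re.search(r'([0-9]+)', title)
            if search_result:
                return int(search_result.groups()[0])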


2015-06-22 22:34:53

Some people dislike regular expressions, but in a case like this they're very handy. I might try something like this:

import re 

inputs = [ 
    "LOT OF 4 CISCO AIRONET 4800 AIR-LM4800 DSSS WLAN PC CARD", 
    "Lot of (6) CISCO AIRONET AIR-LAP1252AG-A-K9 DUAL BAND 802.11A/G/N LWAP", 
    "LOT 12 Cisco Systems Aironet 1200 Wireless Access Point AIR-AP1231G-A-K9 MP21G", 
    "CISCO AIRONET 4800 AIR-LM4800 DSSS WLAN PC CARD lot of 4", 
    "Ocelot 4800 AIR-LM4800"] 

patterns = [ 
    r'\blot(?:\s+of|)\s+(\d+)', 
    r'\blot(?:\s+of|)\s+\((\d+)\)'] 

for a in inputs:
    for pattern in patterns:
        m = re.search(pattern, a, flags=re.IGNORECASE)
        if m:
            print("lot size = %s" % m.group(1))
            break
    else:
        # Runs only when no pattern matched (the inner loop never hit break).
        print("No lot size found!")

Output:

lot size = 4 
lot size = 6 
lot size = 12 
lot size = 4 
No lot size found!

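The patterns here look a bit scary, but all they say is: look for the word "lot", optionally followed by the word "of", and then some digits. Or, in the second case, some digits surrounded by literal parentheses.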
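As a rough sketch of the same idea, the two patterns could be folded into a single one by making the parentheses optional. The names LOT_PATTERN and lot_size are hypothetical, chosen only for this illustration, and the function returns None rather than printing a message when nothing matches:

```python
import re

# Hypothetical combined pattern: "lot", an optional "of", then digits
# that may or may not be wrapped in literal parentheses.
LOT_PATTERN = re.compile(r'\blot(?:\s+of)?\s+\(?(\d+)\)?', re.IGNORECASE)

def lot_size(title):
    """Return the lot size as an int, or None when no lot size is found."""
    m = LOT_PATTERN.search(title)
    return int(m.group(1)) if m else None

print(lot_size("Lot of (6) 808 metal/steel/G/N LWAP"))  # 6
print(lot_size("Ocelot 4800 AIR-LM4800"))               # None
```

One trade-off of this shortcut is that it also accepts unbalanced input such as "lot of (6", which the two separate patterns would not match.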

Since this is free text you're parsing, you will probably run into some errors, which may need to be corrected by hand or by adding more patterns.


2015-06-22 22:39:02 cbare
