2016-03-23 45 views
0

Hi, I have a compression task for which I need to write Python code that assigns a number to each word in a string. If the input is

'hello its me, hello can you hear me, hello are you listening'

then the output should be

1,2,3,1,4,5,6,3,1,7,5,8

Basically, each word is assigned a number, and if a word repeats, it gets the same number again. The code should be in Python. Please help, thank you.

+3

Have you tried anything? StackOverflow is not a code-writing service.

Answers

3

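A simple approach is to use a dictionary: when you find a new word, add a key/value pair using an incrementing variable, and when you see a word you have seen before, just yield its value from the dict: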

s = 'hello its me, hello can you hear me, hello are you listening' 


def cyc(s):
    # set i to 1
    i = 1
    # split into words on whitespace
    it = s.split()
    # create first key/value pair
    seen = {it[0]: i}
    # yield 1 for first word
    yield i
    # for all words after the first
    for word in it[1:]:
        # if we have seen this word already, use its value from our dict
        if word in seen:
            yield seen[word]
        # else first time seeing it, so increment count
        # and create new k/v pairing
        else:
            i += 1
            yield i
            seen[word] = i


print(list(cyc(s))) 

Output:

[1, 2, 3, 1, 4, 5, 6, 3, 1, 7, 5, 8] 
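
The same bookkeeping can be written more compactly. As a minimal sketch (the function name `word_ids` is made up for illustration and is not part of the code above), `dict.setdefault` stores the next number only when the word is new and returns the stored number either way:

```python
def word_ids(s):
    """Return a list of numbers, one per whitespace-separated word in s."""
    seen = {}
    # len(seen) + 1 is the next unused number; setdefault only stores it
    # when the word is not already a key, and always returns the stored value.
    return [seen.setdefault(word, len(seen) + 1) for word in s.split()]


s = 'hello its me, hello can you hear me, hello are you listening'
print(word_ids(s))  # [1, 2, 3, 1, 4, 5, 6, 3, 1, 7, 5, 8]
```

Unlike the generator above, this version does not special-case the first word, so an empty string simply gives an empty list.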

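You can also avoid the slice by using iter and calling next to pop off the first word. If you also want foo to equal foo!, you need to strip any punctuation from each word, which can be done with str.rstrip: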

from string import punctuation


def cyc(s):
    i = 1
    it = iter(s.split())
    seen = {next(it).rstrip(punctuation): i}
    yield i
    for word in it:
        word = word.rstrip(punctuation)
        if word in seen:
            yield seen[word]
        else:
            i += 1
            yield i
            seen[word] = i
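
To check the effect of stripping punctuation, here is a small self-contained sketch. The helper name `cyc_stripped` and the sample string are made up for illustration. It uses `str.strip` rather than `str.rstrip`, which is my own assumption about what is wanted, so that leading punctuation such as an opening quote is removed as well:

```python
from string import punctuation


def cyc_stripped(s):
    """Yield one number per word, ignoring punctuation at either end of a word."""
    seen = {}
    for word in s.split():
        # Remove punctuation characters from both ends of the word.
        word = word.strip(punctuation)
        if word not in seen:
            seen[word] = len(seen) + 1
        yield seen[word]


print(list(cyc_stripped('foo bar foo! "bar" baz')))  # [1, 2, 1, 2, 3]
```

Punctuation inside a word (for example the apostrophe in `it's`) is left alone, since only the ends are stripped.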
2

How about building a dict that maps each item to its index:

>>> s 
'hello its me, hello can you hear me, hello are you listening' 
>>> 
>>> l = s.split() 
>>> d = {} 
>>> i = 1 
>>> for x in l:
...     if x not in d:
...         d[x] = i
...         i += 1
...
>>> d 
{'its': 2, 'listening': 8, 'hear': 6, 'hello': 1, 'are': 7, 'you': 5, 'me,': 3, 'can': 4} 
>>> for x in l:
...     print(x, d[x])
...
hello 1 
its 2 
me, 3 
hello 1 
can 4 
you 5 
hear 6 
me, 3 
hello 1 
are 7 
you 5 
listening 8 
>>> 

If you don't want any punctuation in your split list, you can do:

>>> import re 
>>> l = re.split(r'(?:,|\s)\s*', s) 
>>> l 
['hello', 'its', 'me', 'hello', 'can', 'you', 'hear', 'me', 'hello', 'are', 'you', 'listening'] 
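
Putting the two steps together, a rough sketch (the variable names `words` and `numbers` are my own) that splits with the regular expression above and then looks each word up in the dict:

```python
import re

s = 'hello its me, hello can you hear me, hello are you listening'

# Split on a comma or a whitespace character, plus any whitespace after it.
words = re.split(r'(?:,|\s)\s*', s)

# Number each distinct word in order of first appearance.
d = {}
for w in words:
    if w not in d:
        d[w] = len(d) + 1

numbers = [d[w] for w in words]
print(numbers)  # [1, 2, 3, 1, 4, 5, 6, 3, 1, 7, 5, 8]
```

One caveat: with this pattern, a string that ends in a comma produces an empty string as the last item, which would then get a number of its own.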
1
import re 
from collections import OrderedDict 

text = 'hello its me, hello can you hear me, hello are you listening' 
words = re.sub(r"[^\w]", " ", text).split()
uniq_words = list(OrderedDict.fromkeys(words)) 
res = [uniq_words.index(w) + 1 for w in words] 

print(res) # [1, 2, 3, 1, 4, 5, 6, 3, 1, 7, 5, 8]
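
`uniq_words.index(w)` scans the list on every lookup, which is fine for a short sentence but can become slow for long texts. A possible variant, sketched here under the assumption of Python 3.7 or later (where a plain dict keeps insertion order), builds a word-to-number dict once and looks words up in it. The name `index` and the use of `re.findall` are my own choices:

```python
import re

text = 'hello its me, hello can you hear me, hello are you listening'

# Collect runs of word characters, which drops commas and other punctuation.
words = re.findall(r'\w+', text)

# dict.fromkeys keeps first-seen order, so enumerate numbers each
# distinct word in order of first appearance, starting at 1.
index = {w: i for i, w in enumerate(dict.fromkeys(words), start=1)}
res = [index[w] for w in words]

print(res)  # [1, 2, 3, 1, 4, 5, 6, 3, 1, 7, 5, 8]
```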