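Python named entity recognition error: IndexError: list index out of range

Hi, I'm new to Python and am trying to run a script (https://github.com/detuvoldo/tagger). Because I'm on Windows 10 and ran into path-related problems, I replaced 2 lines in utils.py: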

models_path = u"\\\\?\\" + os.path.abspath(u".\\models") 
eval_path = os.path.abspath(u".\\evaluation") 

The error is:

run train.py --train lstm/fold1/train --dev lstm/fold1/dev --test lstm/fold1/test 
WARNING (theano.sandbox.cuda): The cuda backend is deprecated and will be removed in the next release (v0.10). Please switch to the gpuarray backend. You can get more information about how to switch at this URL: 
https://github.com/Theano/Theano/wiki/Converting-to-the-new-gpu-back-end%28gpuarray%29 

Using gpu device 0: GeForce GT 620M (CNMeM is enabled with initial size: 85.0% of memory, cuDNN not available) 
Model location: \?\E:\New-Code\tagger-master\tagger-master\models\tag_scheme=iob,lower=False,zeros=False,char_dim=25,char_lstm_dim=25,char_bidirect=True,word_dim=100,word_lstm_dim=100,word_bidirect=True,pre_emb=,all_emb=False,cap_dim=0,crf=True,dropout=0.3,lr_method=sgd-lr_.005 
Found 2573 unique words (48986 in total) 
Found 64 unique characters 
Found 27 unique named entity tags 
858/289/286 sentences in train/dev/test. 
Saving the mappings to disk... 
Compiling... 
Starting epoch 0... 
50, cost average: 101.645935 
100, cost average: 83.234520 
150, cost average: 82.757523 
200, cost average: 69.019493 
250, cost average: 64.411346 
300, cost average: 62.836563 
350, cost average: 60.969635 
400, cost average: 58.851826 
450, cost average: 49.994457 
ID NE Total O I-LOC B-CTT B-OBJ B-LOC B-ACR B-INT B-PRC I-FACE I-PRC I-ACR I-OBJ B-FNUM I-FNUM I-DDIR B-FACEI-BEDNUM I-CTT B-DDIR I-INTB-BEDNUMB-BATHNUMI-BATHNUM I-FPOS B-FPOS I-BDIR B-BDIR Percent 
0 O 9314 9175 0 63 14 0 0 62 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 98.508 
1 I-LOC 2604 2602 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.000 
2 B-CTT 478 245 0 233 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 48.745 
3 B-OBJ 464 282 0 0 177 0 0 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 38.147 
4 B-LOC 439 439 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.000 
5 B-ACR 346 334 0 1 1 0 7 2 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2.023 
6 B-INT 339 126 0 0 32 0 0 181 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 53.392 
7 B-PRC 233 232 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.000 
8 I-FACE 218 218 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.000 
9 I-PRC 232 225 0 7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.000 
10 I-ACR 214 203 0 0 2 0 1 0 0 0 0 7 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3.271 
11 I-OBJ 201 198 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.000 
12 B-FNUM 170 156 0 0 5 0 0 8 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.000 
13 I-FNUM 166 157 0 0 8 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.000 
14 I-DDIR 170 169 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.000 
15 B-FACE 120 120 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.000 
16I-BEDNUM 103 103 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.000 
17 I-CTT 103 98 0 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.000 
18 B-DDIR 83 83 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.000 
19 I-INT 57 56 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.000 
20B-BEDNUM 57 57 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.000 
21B-BATHNUM 44 44 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.000 
22I-BATHNUM 45 44 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.000 
23 I-FPOS 42 42 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.000 
24 B-FPOS 37 36 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.000 
25 I-BDIR 22 22 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.000 
26 B-BDIR 6 6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.000 
9780/16307 (59.97424%) 
Traceback (most recent call last): 

File "E:\New-Code\tagger-master\tagger-master\train.py", line 221, in <module>
dev_data, id_to_tag, dico_tags, epoch) 

File "utils.py", line 284, in evaluate 
return float(eval_lines[1].strip().split()[-1]) 

IndexError: list index out of range 

Could you please suggest something that would help me resolve this error? I have been stuck on it for the last 2 months. Thanks.

It looks like 'eval_lines' has only one element, so 'eval_lines[1]' is what raises the IndexError. I can't say more, because I can't see how 'eval_lines' is generated. – Reti43

Thanks for the reply, @Reti43. Could you please suggest what I should replace eval_lines[1] with to avoid the error? –

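I don't have much time to look at it right now, but here is a pointer until another kind soul comes along. 'eval_path' seems to be involved in the definition of 'eval_temp', which in turn affects 'scores_path' and finally 'eval_lines'. So the error may be related to your change to 'eval_path'. – Reti43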
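Replacing 'eval_lines[1]' with something else would only hide the problem, since the evaluation output itself appears to be too short. A minimal sketch of a more defensive parse is shown below, assuming (based only on the traceback) that the score is the last whitespace-separated field on the second line. The helper name parse_overall_score and the sample lines are hypothetical and are not part of the tagger code.

```python
def parse_overall_score(eval_lines):
    """Return the last field of the second line as a float.

    Mirrors float(eval_lines[1].strip().split()[-1]) from the traceback,
    but fails with a message that shows what was actually read.
    """
    if len(eval_lines) < 2:
        raise ValueError(
            "expected at least 2 lines of evaluation output, got %d: %r"
            % (len(eval_lines), eval_lines)
        )
    fields = eval_lines[1].strip().split()
    if not fields:
        raise ValueError("second line of the evaluation output is empty")
    return float(fields[-1])


if __name__ == "__main__":
    # Made-up sample lines, for illustration only.
    sample = ["first line of output", "score: 12.50"]
    print(parse_overall_score(sample))
```

With a guard like this, an empty or one-line scores file produces an error that prints the file's contents, which usually points at whether the evaluation script ran at all.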

Answer

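I assume you are running the script from the E:\New-Code\tagger-master\tagger-master\ directory, and that "models" and "evaluation" sit directly inside it. In that case, the paths should be specified as: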

models_path = "models" 
eval_path = "evaluation" 
eval_temp = os.path.join(eval_path, "temp") 
eval_script = os.path.join(eval_path, "conlleval") 

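If you still see this error with this setup, the problem is in one of your "eval.*.scores" files, not in the path specification. I can't say for sure what the file must contain, but please at least post its actual content.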
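One way to collect that content is sketched below. It assumes the scores files are written to the "temp" folder under "evaluation" (as eval_temp above suggests) and match the pattern "eval.*.scores". The function name read_scores_files is hypothetical.

```python
import glob
import io
import os


def read_scores_files(eval_temp):
    """Map each eval.*.scores file name in eval_temp to its text content."""
    contents = {}
    pattern = os.path.join(eval_temp, "eval.*.scores")
    for path in sorted(glob.glob(pattern)):
        # io.open accepts an encoding on both Python 2 and Python 3.
        with io.open(path, "r", encoding="utf-8", errors="replace") as handle:
            contents[os.path.basename(path)] = handle.read()
    return contents


if __name__ == "__main__":
    # Assumed location, relative to the directory the script is run from.
    found = read_scores_files(os.path.join("evaluation", "temp"))
    for name, text in found.items():
        print("%s (%d characters)" % (name, len(text)))
        print(text if text else "<empty file>")
```

An empty file, or one holding a single line, would be consistent with 'eval_lines' having fewer than two elements.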

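Note that the maximum length of a DOS path is 260 characters, unless the script runs under Python 3.6+ on a Windows 10 system with long paths enabled. The OP is using a library that creates long filenames of more than 200 characters, and is therefore forced to use an absolute Unicode path prefixed with u"\\\\?\\". On all supported versions of Windows, such paths can have up to 32760 characters. – eryksun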
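The prefixing described in this comment can be sketched as a small helper. This is only an illustration: the name to_extended_path is hypothetical, and the function does plain string handling via ntpath, so it neither touches the file system nor checks that the path exists.

```python
import ntpath

EXTENDED_PREFIX = "\\\\?\\"         # the 4 characters \\?\
UNC_PREFIX = "\\\\?\\UNC\\"         # extended-length form for \\server\share paths


def to_extended_path(abs_path):
    """Return abs_path in Windows extended-length form."""
    if abs_path.startswith(EXTENDED_PREFIX):
        return abs_path  # already prefixed
    if not ntpath.isabs(abs_path):
        # The prefix is only valid on absolute paths.
        raise ValueError("path must be absolute: %r" % (abs_path,))
    if abs_path.startswith("\\\\"):
        # \\server\share\... becomes \\?\UNC\server\share\...
        return UNC_PREFIX + abs_path[2:]
    return EXTENDED_PREFIX + abs_path


if __name__ == "__main__":
    print(to_extended_path("E:\\New-Code\\tagger-master\\tagger-master\\models"))
```

Because the prefix requires an absolute path, a relative path such as ".\\models" has to go through os.path.abspath first, as in the OP's modified lines.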