我正在訓練斯坦福NER CRF模型,在自定義數據集上,但是用於訓練模型的迭代次數現在已經達到了迭代次數 - 即,這個培訓過程現在已經過去了幾個小時。 下面是在該終端打印的信息 - 文件被使用在下面給出限制斯坦福大學NER的迭代次數
Iter 335 evals 400 <D> [M 1.000E0] 2.880E3 38054.87s |5.680E1| {6.652E-6} 4.488E-4 -
Iter 336 evals 401 <D> [M 1.000E0] 2.880E3 38153.66s |1.243E2| {1.456E-5} 4.415E-4 -
-
性質 - 是有一些方法我可以限制迭代次數說20.
location of the training file
trainFile = TRAIN5000.tsv
#location where you would like to save (serialize to) your
#classifier; adding .gz at the end automatically gzips the file,
#making it faster and smaller
serializeTo = ner-model_TRAIN5000.ser.gz
#structure of your training file; this tells the classifier
#that the word is in column 0 and the correct answer is in
#column 1
map = word=0,answer=1
#these are the features we'd like to train with
#some are discussed below, the rest can be
#understood by looking at NERFeatureFactory
useClassFeature=true
useWord=true
useNGrams=true
#no ngrams will be included that do not contain either the
#beginning or end of the word
noMidNGrams=true
useDisjunctive=true
maxNGramLeng=6
usePrev=true
useNext=true
useSequences=true
usePrevSequences=true
maxLeft=1
#the next 4 deal with word shape features
useTypeSeqs=true
useTypeSeqs2=true
useTypeySequences=true
wordShape=chris2useLC
saveFeatureIndexToDisk = true
printFeatures=true
flag useObservedSequencesOnly=true
featureDiffThresh=0.05
我試過這個,它沒有工作 – arop