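Genia tagger file not found error in Anaconda/NLTK

I need to use NLTK for text pre-processing tasks such as sentence splitting, tokenization, and tagging, and I want to use the GENIA tagger for the tagging step. I am using Anaconda version 3.10 and installed geniatagger with the following command.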
python setup.py install
In the IPython console, I entered the following code.
import geniatagger
tagger = geniatagger.GeniaTagger('C:\Users\dell\Anaconda\geniatagger\geniatagger')
print tagger.parse('Welcome to natural language processing!')
When I press Enter, I get the following error message.
---------------------------------------------------------------------------
WindowsError Traceback (most recent call last)
<ipython-input-2-52e4d65c2d02> in <module>()
----> 1 tagger = geniatagger.GeniaTagger('C:\Users\dell\Anaconda\geniatagger\geniatagger')
2 print tagger.parse('Welcome to natural language processing!')
3
C:\Users\dell\Anaconda\lib\site-packages\geniatagger_python-0.1-py2.7.egg\geniatagger.pyc in __init__(self, path_to_tagger)
19 self._tagger = subprocess.Popen('./'+os.path.basename(path_to_tagger),
20 cwd=self._dir_to_tagger,
---> 21 stdin=subprocess.PIPE, stdout=subprocess.PIPE)
22
23 def parse(self, text):
C:\Users\dell\Anaconda\lib\subprocess.pyc in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags)
708 p2cread, p2cwrite,
709 c2pread, c2pwrite,
--> 710 errread, errwrite)
711 except Exception:
712 # Preserve original exception in case os.close raises.
C:\Users\dell\Anaconda\lib\subprocess.pyc in _execute_child(self, args, executable, preexec_fn, close_fds, cwd, env, universal_newlines, startupinfo, creationflags, shell, to_close, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite)
956 env,
957 cwd,
--> 958 startupinfo)
959 except pywintypes.error, e:
960 # Translate pywintypes.error to WindowsError, which is
WindowsError: [Error 2] The system cannot find the file specified
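Why am I getting this error, and how can I fix it?
If I use this tagger directly, does it also perform the tokenization step?
Note: the geniatagger Python file is located inside the 'geniatagger' folder.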
I tried this in cmd and the output is 3.0.3.
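
To narrow down the "cannot find the file specified" error in the traceback, a check along these lines could show what the wrapper is being asked to launch. This is only a sketch: `describe_tagger_path` is a hypothetical helper, not part of geniatagger-python, and it assumes the wrapper's working directory is simply `os.path.dirname` of the path passed in (the traceback only shows `self._dir_to_tagger`). The `.exe` check is likewise an assumption about how a Windows build of the tagger would be named.

```python
import os


def describe_tagger_path(path_to_tagger):
    """Summarise what exists on disk for the path given to GeniaTagger.

    Hypothetical debugging helper; not part of geniatagger-python.
    """
    # Assumption: the wrapper uses the parent folder as its working directory.
    directory = os.path.dirname(path_to_tagger)
    # The traceback shows the wrapper launching './' + basename(path_to_tagger).
    name = os.path.basename(path_to_tagger)
    return {
        "directory": directory,
        "name": name,
        "directory_exists": os.path.isdir(directory),
        "file_exists": os.path.isfile(path_to_tagger),
        # Assumption: a Windows executable would carry an .exe suffix.
        "exe_exists": os.path.isfile(path_to_tagger + ".exe"),
    }


if __name__ == "__main__":
    # Raw string so the backslashes in the Windows path stay literal.
    info = describe_tagger_path(r"C:\Users\dell\Anaconda\geniatagger\geniatagger")
    for key in sorted(info):
        print("%s: %s" % (key, info[key]))
```

If `file_exists` and `exe_exists` both come back False, the path does not point at a tagger binary on disk, which would be consistent with the WindowsError above.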