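- Remove punctuation
- Split the text into words at newlines and spaces, then store the words in an array
- Check the text file for spelling errors using the function file checkSpelling.m
- Sum up the total number of errors in the article
- Assume that "no suggestion" means no error, and return -1
- If the sum of errors > 20, return 1
- If the sum of errors <= 20, return -1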
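The steps listed above can be sketched without Word at all, which makes the punctuation and splitting parts easy to try in isolation. This is only a minimal sketch under stated assumptions: the sample text, the toy `dictionary`, and the `isMisspelled` helper are hypothetical stand-ins for the real article and for checkSpelling.m.

```matlab
% Hypothetical sample text standing in for one article.
txt = sprintf('Hello, world! This is a tset.\nAnother line; with punctuation...');

% Step 1: remove punctuation (anything that is not a letter, digit, or whitespace).
clean = regexprep(txt, '[^a-zA-Z0-9\s]', '');

% Step 2: split on runs of whitespace (spaces and newlines) into a flat cell array.
words = regexp(strtrim(clean), '\s+', 'split');

% Step 3: count errors. A toy dictionary replaces the real spell checker here.
dictionary = {'hello','world','this','is','a','another','line','with','punctuation'};
isMisspelled = @(w) ~any(strcmpi(w, dictionary));
numErrors = sum(cellfun(isMisspelled, words));

% Step 4: threshold on the error count, as in the list above.
if numErrors > 20
    result = 1;
else
    result = -1;
end
```

Splitting on `\s+` in a single pass keeps `words` as a flat cell array of strings, so each element can be handed to a per-word checker directly.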
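I want to check a paragraph for spelling errors, but I am having trouble getting rid of the punctuation. The problem may also have other causes; running the code returns an error. How can I get rid of the punctuation and check for spelling errors?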
The code that reads my data2 file is below:
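checkSpelling.m
function suggestion = checkSpelling(word)
% CHECKSPELLING Check one word with the Microsoft Word spell checker.
%   Returns [] if the word is spelled correctly, a cell array of
%   suggestions if there are any, and 'no suggestion' otherwise.
h = actxserver('word.application'); % start a Word COM server (Windows only)
h.Documents.Add;                    % Documents is the collection on the Word Application object
correct = h.CheckSpelling(word);
if correct
    suggestion = []; % return empty if spelled correctly
else
    % Query the suggestions once and reuse the result
    suggestions = h.GetSpellingSuggestions(word);
    count = suggestions.Count;
    if count > 0
        % If incorrect and there are suggestions, return them in a cell array
        suggestion = cell(1, count);
        for i = 1:count
            suggestion{i} = suggestions.Item(i).get('name');
        end
    else
        % If incorrect but there are no suggestions, return this:
        suggestion = 'no suggestion';
    end
end
% Quit Word to release the server
h.Quit;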
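f19.m
for i = 1:1
    % Read the whole text file into a character row vector
    data2 = fopen(strcat('DATA\PRE-PROCESS_DATA\F19\', int2str(i), '.txt'), 'r');
    CharData = fread(data2, '*char')';
    fclose(data2);
    % Remove punctuation; '[', ']' and '\' are escaped and '-' is placed last so they are literal
    word_punctuation = regexprep(CharData, '[`~!@#$%^&*()_=+\[{\]}\\|;:''<,>.?/-]', '');
    % Split on runs of whitespace (spaces and newlines) into a flat cell array of words
    word = regexp(strtrim(word_punctuation), '\s+', 'split');
    word = word(~cellfun(@isempty, word)); % drop empty strings (e.g. from an empty file)
    % Check each word; each result is [], a cell array of suggestions, or 'no suggestion'
    suggestion = cellfun(@checkSpelling, word, 'UniformOutput', false);
    % Count words that have suggestions; 'no suggestion' is treated as no error
    A19(i) = sum(cellfun(@iscell, suggestion));
    % Feature is 1 for more than 20 errors, otherwise -1
    if A19(i) > 20
        feature19(i) = 1;
    else
        feature19(i) = -1;
    end
end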
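
The counting step at the end of f19.m can be tried without Word by feeding it hand-made results. The sketch below is only an illustration under assumptions: the `suggestion` values are invented rather than produced by Word, and `toFeature` is a hypothetical helper. It shows that counting non-empty entries also counts 'no suggestion', whereas counting cell entries does not, which matters if 'no suggestion' is meant to be treated as no error.

```matlab
% Hand-made results in the three shapes checkSpelling can return
% (hypothetical values, not produced by Word).
suggestion = {[], {'test', 'tent'}, 'no suggestion', [], {'word'}};

% Every non-empty result, including the 'no suggestion' string.
numNonEmpty = sum(~cellfun(@isempty, suggestion));

% Only results that carry a cell array of suggestions.
numWithSuggestions = sum(cellfun(@iscell, suggestion));

% Hypothetical helper: map an error count to the +1 / -1 feature
% using the "more than 20 errors" rule from the list of steps.
toFeature = @(n) 2 * (n > 20) - 1;
featureLow  = toFeature(20);
featureHigh = toFeature(21);
```

Writing the threshold as a function of the count, and storing the result per file index, avoids relying on logical indexing with a scalar condition.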