Why doesn't grep() work after readLines()?

I developed a program in R to read reports that are available online, and its first two lines are:

page1 <- readLines("http://reportviewer.tce.mg.gov.br/default.aspx?server=noruega&relatorio=SICOM_Consulta/2013_2014/Modulo_AM/UC03-LeisOrc-RL&municipioSelecionado=3100203&exercicioSelecionado=2014") 
line1 <- grep("Leis Autorizativas",page1)

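The rest of the program works fine and I get the data I need. I then tried to adapt it to read a different report, but this time the second line does not work: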

page2 <- readLines("http://reportviewer.tce.mg.gov.br/default.aspx?server=noruega&relatorio=SICOM_Consulta/2013_2014/Modulo_AM/UC08-ConsultarDecretos-RL&municipioSelecionado=3101607&exercicioSelecionado=2013") 
line2 <- grep("Decretos de Alterações",page2)

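In the first case, page1 is a character vector, while in the second case page2 is a large character vector. Could this difference be causing the problem? If so, does anyone have a hint on how to solve it?

(Using htmltab() or readHTMLTable() did not produce good results.)

Thanks.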

Posted 2017-10-08 by ViniLima

I can't reproduce what you show on my end. – akrun

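This is because "Decretos de Alterações" is not made up entirely of ASCII characters. In the page source, ç and õ appear as HTML numeric character references (&#231; and &#245;), so a pattern containing the literal characters never matches.

If you try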

page2 <- readLines("http://reportviewer.tce.mg.gov.br/default.aspx?server=noruega&relatorio=SICOM_Consulta/2013_2014/Modulo_AM/UC08-ConsultarDecretos-RL&municipioSelecionado=3101607&exercicioSelecionado=2013") 

grep("Decretos de Altera&#231;&#245;es ", page2) 

[1] 366

it works.

To find out which number to substitute for a character:

utf8ToInt("ç") 
[1] 231

Then put the resulting number between &# and ; and use that in place of the non-ASCII character.
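The manual substitution above can be automated. The following is only a minimal sketch: to_html_entities() is a hypothetical helper name that is not part of the original answer, and sample_lines is made-up data standing in for page2, on the assumption that the page encodes every non-ASCII character as a decimal numeric reference.

```r
# Hypothetical helper (not from the original answer): replace each non-ASCII
# character in a single string with its decimal HTML reference, e.g. "&#231;".
to_html_entities <- function(x) {
  codes <- utf8ToInt(enc2utf8(x))                  # one code point per character
  chars <- vapply(codes, intToUtf8, character(1))  # back to single characters
  non_ascii <- codes > 127
  chars[non_ascii] <- paste0("&#", codes[non_ascii], ";")
  paste(chars, collapse = "")
}

# "\u00e7" is ç and "\u00f5" is õ; escapes avoid source-file encoding issues.
pattern <- to_html_entities("Decretos de Altera\u00e7\u00f5es")
print(pattern)
# [1] "Decretos de Altera&#231;&#245;es"

# Made-up lines standing in for page2 (assumption, not the real page content).
sample_lines <- c("<td>Leis</td>", "<td>Decretos de Altera&#231;&#245;es</td>")

# fixed = TRUE matches the pattern as a literal string rather than a regex.
hit <- grep(pattern, sample_lines, fixed = TRUE)
print(hit)
# [1] 2
```

With the real data, the same call would be grep(pattern, page2, fixed = TRUE). This only helps if the page uses decimal references throughout; named entities such as &ccedil; would need a different approach.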

Best,

Colin

Answered 2017-10-08 20:36:05

Great, Colin, spot on! Thank you very much. – ViniLima
