2011-09-20 47 views
3

Using "cat" to write non-English characters to an .html file (in R)

Here is code that shows the problem:

myPath = getwd() 
cat("abcd", append = T, file =paste(myPath,"temp1.html", sep = "\\")) # This is fine 
cat("<BR/><BR/><BR/>", append = T, file =paste(myPath,"temp1.html", sep = "\\")) # This is fine 
cat("שלום", append = F, file =paste(myPath,"temp1.html", sep = "\\")) # This text gets garbled when the HTML file is opened in Google Chrome on Windows 7. 
cat("שלום", append = F, file =paste(myPath,"temp1.txt", sep = "\\")) # But if I open this file in a text editor, the text looks fine. 

# The text in the HTML file would look as if I were to run this in R: 
(x <- iconv("שלום", from = "CP1252", to = "UTF8")) 
# But if I were to try and put it into the file, it wouldn't put anything in: 
cat(x, append = T, file =paste(myPath,"temp1.html", sep = "\\")) # empty 

Edit: I also tried setting the encoding on the connection, as follows (without success):

ff <-file(paste(myPath,"temp1.html", sep = "\\"), encoding="CP1252") 
cat("שלום", append = F, file =ff) 
ff<-file(paste(myPath,"temp1.html", sep = "\\"), encoding="utf-8") 
cat("שלום", append = F, file =ff) 
ff<-file(paste(myPath,"temp1.html", sep = "\\"), encoding="ANSI_X3.4-1986") 
cat("שלום", append = F, file =ff) 
ff<-file(paste(myPath,"temp1.html", sep = "\\"), encoding="iso8859-8") 
cat("שלום", append = F, file =ff) 

Any suggestions? Thanks.

+0

It looks like you need some sleep... =) `Sys.sleep(sample(3600 * 1.5:8.5, 1))` – aL3xa

+0

Take a look at this question [about saving a csv with UTF-8 encoding](http://stackoverflow.com/q/7402307/168747). – Marek

+0

Hi Marek, when I try to use it, my text turns into "\xf9\xec\xe5\xed" –
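The four escaped bytes in that comment happen to match the Windows-1255 (Hebrew) code points for שלום, which suggests the string was held in the native Windows code page rather than in UTF-8. A minimal sketch of checking this, assuming the platform's `iconv` knows the name "CP1255" (the variable names are only illustrative):

```r
# Bytes quoted in the comment, built from raw values so the source file
# itself stays plain ASCII.
native_bytes <- rawToChar(as.raw(c(0xf9, 0xec, 0xe5, 0xed)))

# Re-encode from the assumed Windows Hebrew code page to UTF-8.
decoded <- iconv(native_bytes, from = "CP1255", to = "UTF-8")

# The same word written with Unicode escapes, for comparison.
expected <- "\u05e9\u05dc\u05d5\u05dd"

# Compare the underlying bytes rather than the printed form.
same_bytes <- identical(charToRaw(decoded), charToRaw(expected))
print(same_bytes)
```

If `iconv` returns `NA` here, the encoding name is probably not supported on that system; `iconvlist()` shows the names that are.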

Answers

1

Your code is a bit redundant. Is temp1.txt on line 5 a typo (for .html)? Anyway, maybe you should set the charset inside a <meta> tag.

Take this as an example:

<html> 
<head> 
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"> 
</head> 
<body> 
<% 
cat("abcd") 
cat("<BR/><BR/><BR/>") 
cat("שלום") 
cat("שלום") 
(x <- iconv("שלום", from = "CP1252", to = "UTF8")) 
cat(x) 
-%> 
</body> 
</html> 

This is brew code, so if you go ahead and brew it, you'll get the correct result. Long story short, the keyword is charset.

1

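The problem is not with R (R correctly generates UTF-8 encoded output)... it's just that, when no encoding is explicitly specified, your web browser assumes the wrong one. Just use the following snippet (from inside R) instead: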

<html> 
    <head> 
     <meta http-equiv="content-type" content="text/html; charset=utf-8"> 
    </head> 
    <body> 
     שלום 
    </body> 
</html> 

This specifies the correct encoding (UTF-8) and therefore causes the browser to treat the text that follows correctly.
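As a rough sketch of producing that page from inside R: the snippet below builds the lines, converts them to UTF-8, and writes the bytes unchanged. Using `writeLines()` with `useBytes = TRUE` instead of `cat()` is an assumption on my part, not something stated in the answer; it is meant to keep R from re-encoding the text to the native code page on Windows. The output path in `tempdir()` is only for illustration.

```r
# The Hebrew word, written with Unicode escapes so this script is ASCII-only.
hebrew <- "\u05e9\u05dc\u05d5\u05dd"

html <- c(
  "<html>",
  "<head>",
  '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">',
  "</head>",
  "<body>",
  hebrew,
  "</body>",
  "</html>"
)

# Illustrative output location; replace with the real target path.
out_path <- file.path(tempdir(), "temp1.html")

# Open without an 'encoding' argument and write the UTF-8 bytes as they are.
con <- file(out_path, open = "w")
writeLines(enc2utf8(html), con, useBytes = TRUE)
close(con)
```

The `<meta>` tag and the bytes on disk have to agree: declaring `charset=utf-8` only helps if the file really contains UTF-8.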

+0

Damn, I was 2 minutes late! =/ – aL3xa

1

Try it this way:

cat("abcd", file = (con <- file("temp1.html", "w", encoding="UTF-8"))); close(con) 
+0

Thanks gd047, but it doesn't work. It leaves me with this: ש××××. instead of שלום –
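When the page still looks garbled, it can help to separate two possible causes: the file holding the wrong bytes, or the browser decoding correct bytes with the wrong charset. A minimal diagnostic sketch, assuming a throwaway file in `tempdir()` (the names `check_path` and `starts_with_utf8` are hypothetical), compares the bytes on disk with the expected UTF-8 bytes:

```r
# Text under test, written with Unicode escapes.
txt <- "\u05e9\u05dc\u05d5\u05dd"

# Write it to a scratch file in binary mode so no translation takes place.
check_path <- file.path(tempdir(), "check.html")
con <- file(check_path, open = "wb")
writeLines(enc2utf8(txt), con, useBytes = TRUE)
close(con)

# Read the raw bytes back and compare them with the expected UTF-8 bytes.
on_disk  <- readBin(check_path, what = "raw", n = file.size(check_path))
expected <- charToRaw(enc2utf8(txt))
starts_with_utf8 <- identical(on_disk[seq_along(expected)], expected)

print(on_disk)
print(starts_with_utf8)
```

If the bytes match, the file itself is fine and the remaining suspect is the charset the browser assumes, which is what the `<meta>` tag in the other answers addresses.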