Scala: Replace double quotes with single quotes

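How do I replace double quotes with single quotes in Scala? I have a data file with records that contain values such as "abc" (in double quotes). I need to replace these double quotes with single quotes and load the result into a DataFrame.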

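import org.apache.spark.sql.types.{StringType, StructField, StructType}

// All three columns are read as nullable strings.
val customSchema_1 =
    StructType(Array(
    StructField("ID", StringType, true),
    StructField("KEY", StringType, true),
    StructField("CODE", StringType, true)))

// Read the ¦-delimited file with the spark-csv package.
val df_1 = sqlContext.read
    .format("com.databricks.spark.csv")
    .option("delimiter", "¦")
    .schema(customSchema_1)
    .load("example")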


2017-01-03 SFatima

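Which column has the double quotes? And which Spark version are you using? – mrsrinivas

I'm using Spark core 1.6.0. The quoted data is scattered: some values in a column have quotes, while others do not. – SFatima

This sounds like a problem that might be easier to solve with a bash script, but basically you need to write a regular expression that finds all double quotes inside double quotes (for your column strings) and replaces them with single quotes. –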
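
One possible reading of the comment above is: when a field is wrapped in double quotes, keep the outer pair and turn only the quotes between them into single quotes. The sketch below does this with a plain string check rather than a single regular expression. The object name `InnerQuotes`, the helper `fixInnerQuotes`, and the sample field are hypothetical and not part of the original thread.

```scala
object InnerQuotes {
  // Hypothetical helper: if the field starts and ends with a double quote,
  // keep that outer pair and replace the double quotes in between with
  // single quotes. Any other field is returned unchanged.
  def fixInnerQuotes(field: String): String =
    if (field.length >= 2 && field.startsWith("\"") && field.endsWith("\""))
      "\"" + field.substring(1, field.length - 1).replace("\"", "'") + "\""
    else field

  def main(args: Array[String]): Unit = {
    // Sample field invented for this example.
    println(fixInnerQuotes("\"he said \"hi\" there\""))
  }
}
```

If every double quote should become a single quote, including the outer pair, the simpler whole-string replacement in the answer that follows is enough.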

Read the lines and apply the following example to each of them:

val text: String = """Here is a lot of text and "quotes" so you may think that everything is ok until you see something "special" or "weird" 
""" 

text.replaceAll("\"", "'")

This gives you a new string value with single quotes in place of the double quotes.
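
As a sketch of "read the lines and apply this to each of them", the plain-Scala snippet below cleans each line and then splits it on the `¦` delimiter from the question. The object name `QuoteCleanup`, the helpers `normalizeQuotes` and `parseLine`, and the sample records are assumptions made for illustration. With Spark, the same function could presumably be mapped over the lines of the input file before the DataFrame is built.

```scala
import java.util.regex.Pattern

object QuoteCleanup {
  // Swap every double quote for a single quote.
  // String.replace is a literal replacement, so no regex escaping is needed.
  def normalizeQuotes(line: String): String = line.replace("\"", "'")

  // Split a cleaned line on the "¦" delimiter used in the question.
  // Pattern.quote keeps the delimiter literal; -1 keeps trailing empty fields.
  def parseLine(line: String): Array[String] =
    normalizeQuotes(line).split(Pattern.quote("¦"), -1)

  def main(args: Array[String]): Unit = {
    // Sample records invented for this example (ID ¦ KEY ¦ CODE).
    val lines = Seq("1¦\"abc\"¦X1", "2¦def¦\"Y2\"")
    lines.map(parseLine).foreach(fields => println(fields.mkString(" | ")))
  }
}
```

Cleaning the text before splitting keeps the quote handling independent of how the fields are parsed afterwards.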


2017-01-03 21:00:23

Thanks for the suggestion! How would you do this if you are working with a DataFrame? Is there a DataFrame function that allows it? – SFatima

You can create a simple UDF to replace double quotes with single quotes.

Here is a simple example:

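import org.apache.spark.sql.functions.udf
import sqlContext.implicits._ // enables the $"colName" column syntax

// Replaces every double quote in a string with a single quote.
val removeDoubleQuotes = udf((x: String) => x.replace("\"", "'"))

// If df is the DataFrame, apply the UDF to colName to replace " with '
df.withColumn("colName", removeDoubleQuotes($"colName"))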
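
One caveat: the schema in the question declares every column as nullable, and a UDF that takes a String is called with null for null values, so calling `replace` directly would throw a NullPointerException. Below is a minimal sketch of a null-safe function body. The names `QuoteUdfBody` and `replaceQuotes` are hypothetical, and the `udf(...)` wrapper is left out so the snippet runs without Spark.

```scala
object QuoteUdfBody {
  // Hypothetical function that could be wrapped as udf(replaceQuotes).
  // Option(x) turns a null input into None, so null is returned as null
  // instead of throwing a NullPointerException.
  val replaceQuotes: String => String =
    x => Option(x).map(_.replace("\"", "'")).orNull

  def main(args: Array[String]): Unit = {
    println(replaceQuotes("say \"hi\""))
    println(replaceQuotes(null))
  }
}
```

Spark's `org.apache.spark.sql.functions` also provides `regexp_replace`, which may make the UDF unnecessary, for example `df.withColumn("colName", regexp_replace($"colName", "\"", "'"))`. It is worth checking this against the Spark version you are using.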

Hope this helps!


2017-07-08 05:32:14
