1
我正在處理twitter文本c#庫,並且Twitter已經在其一致性測試中添加了雙字unicode字符測試。YamlDotNet沒有正確地反序列化雙字unicode字符
https://github.com/twitter/twitter-text-conformance/blob/master/validate.yml
這裏有一個NUnit的測試方法對上述文件運行。
[Test]
public void TestDoubleWordUnicodeYamlRetrieval()
{
var yamlFile = "validate.yml";
Assert.IsTrue(File.Exists(conformanceDir + yamlFile), "Yaml file " + conformanceDir + yamlFile + " does not exist.");
var stream = new StreamReader(Path.Combine(conformanceDir, yamlFile));
var yaml = new YamlStream();
yaml.Load(stream);
var root = yaml.Documents[0].RootNode as YamlMappingNode;
var testNode = new YamlScalarNode("tests");
Assert.IsTrue(root.Children.ContainsKey(testNode), "Document is missing test node.");
var tests = root.Children[testNode] as YamlMappingNode;
Assert.IsNotNull(tests, "Test node is not YamlMappingNode");
var typeNode = new YamlScalarNode("lengths");
Assert.IsTrue(tests.Children.ContainsKey(typeNode), "Test type lengths not found in tests.");
var typeTests = tests.Children[typeNode] as YamlSequenceNode;
Assert.IsNotNull(typeTests, "lengths tests are not YamlSequenceNode");
var list = new List<dynamic>();
var count = 0;
foreach (YamlMappingNode item in typeTests)
{
var text = ConvertNode<string>(item.Children.Single(x => x.Key.ToString() == "text").Value) as string;
var description = ConvertNode<string>(item.Children.Single(x => x.Key.ToString() == "description").Value) as string;
Assert.DoesNotThrow(() => {text.Normalize(NormalizationForm.FormC);}, String.Format("Yaml couldn't parse a double word unicode string at test {0} - {1}.", count, description));
count++;
}
}
這是產生的誤差: Vocus.TwitterText.Tests.ConformanceTest.TestDoubleWordUnicodeYamlRetrieval: YAML未能在試驗5解析一個雙字unicode字符串 - 計數基本多語種平面之外的unicode字符(雙字)。 意外的異常信息:System.ArgumentException
對不起在回答這麼晚了,但是這並沒有幫助。 有問題的特定線實際上不是UTF8字符,但unicoded字符表示: 文本:「\ U00010000 \ U0010ffff」 使用流讀取器時,輸出的文件爲一個字符串,字符是正確的。使用yaml檢索節點時,輸出爲\ 0。 –