2016-11-25 69 views
2

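At the moment, I pass a block of text to the following function to make sure that the first letter of each sentence is capitalised. The goal is sentence case that ignores HTML elements in the text.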

function sentenceCase(string) {
    // Split into sentences on full stops
    var n = string.split(".");
    var vfinal = "";
    for (var i = 0; i < n.length; i++) {
        var spaceput = "";
        // Count the leading whitespace so it can be restored afterwards
        var spaceCount = n[i].replace(/^(\s*).*$/, "$1").length;
        n[i] = n[i].replace(/^\s+/, "");
        // Capitalise the first character of the sentence
        var newstring = n[i].charAt(0).toUpperCase() + n[i].slice(1);
        for (var j = 0; j < spaceCount; j++) spaceput = spaceput + " ";
        vfinal = vfinal + spaceput + newstring + ".";
    }
    // Drop the extra "." appended after the last segment
    vfinal = vfinal.substring(0, vfinal.length - 1);
    return vfinal;
}

This works well if the text contains no elements and every sentence simply needs its first letter capitalised.

var str1 = 'he always has a positive contribution to make to the class. in class, he behaves well, but he should aim to complete his homework a little more regularly.'; 
console.log(sentenceCase(str1)); 

Returns >>> He always has a positive contribution to make to the class. In class, he behaves well, but he should aim to complete his homework a little more regularly. 

However, if the text contains a <span> element wrapping the first word of a sentence, it obviously causes a problem, as shown below.

var str2 = '<span class="pronoun subjective">he</span> always has a positive contribution to make to the class. in class, <span class="pronoun subjective">he</span> behaves well, but <span class="pronoun subjective">he</span> should aim to complete <span class="pronoun possessive">his</span> homework a little more regularly.'; 
console.log(sentenceCase(str2)); 

Returns >>> <span class="pronoun subjective">he</span> always has a positive contribution to make to the class. In class, <span class="pronoun subjective">he</span> behaves well, but <span class="pronoun subjective">he</span> should aim to complete <span class="pronoun possessive">his</span> homework a little more regularly. 
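The cause can be seen by looking at the first split segment on its own: its first character is the `<` of the tag, which has no upper-case form, so the segment comes back unchanged. A minimal sketch, with variable names chosen only for illustration:

```javascript
// First segment produced by string.split(".") for str2
var segment = '<span class="pronoun subjective">he</span> always has a positive contribution to make to the class';

// The character that sentenceCase tries to capitalise is the tag's "<"
var firstChar = segment.charAt(0);
var upper = firstChar.toUpperCase();

console.log(firstChar);           // "<"
console.log(upper === firstChar); // true: "<" has no upper-case form
```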

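My regex skills are far from stellar, so I am not sure where to go from here. Any suggestions on how to ignore the elements themselves when converting the text to sentence case would be much appreciated.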

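Edit: to clarify - the output should still keep the elements - they just need to be ignored when working out which letter of the sentence to upper-case.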


Should your output still contain the HTML elements, or should they be removed? As in, are you looking for he ... or just he ......... –


You can sanitise the string before passing it to the function: 'string = string.replace(/<.*?>/g, '');' This will remove the HTML tags. – sideroxylon


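Sorry - to clarify - the output should still keep the elements - they just need to be ignored when working out which letter of the sentence to upper-case. – user6790086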

Answers

4

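This is not a trivial problem. Doing it purely with regular expressions is a bad idea, because you can easily run into edge cases and mess things up: JS regexp is simply not powerful enough to handle full HTML syntax.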

However, the browser already has a way of handling HTML.

var str2 = '<span class="pronoun subjective">he</span> always has a positive contribution to make to the class. in class, <span class="pronoun subjective">he</span> behaves well, but <span class="pronoun subjective">he</span> should aim to complete <span class="pronoun possessive">his</span> homework a little more regularly.';

function capitalise(html) {
    // HTML DOM parser: engage!
    var div = document.createElement('div');
    div.innerHTML = html;

    // assume the start of the string is also a start of a sentence
    var boundary = true;

    // go through every text node
    var walker = document.createTreeWalker(div, NodeFilter.SHOW_TEXT);
    while (walker.nextNode()) {
        var node = walker.currentNode;
        var text = node.textContent;

        // whitespace-only nodes neither start nor end a sentence
        if (!/\S/.test(text)) continue;

        // if we are between sentences, capitalise the first letter,
        // unless the first letter is already a capital
        if (boundary) {
            text = text.replace(/^([^a-zA-Z]*)([a-z])/, function(_, lead, letter) {
                return lead + letter.toUpperCase();
            });
        }

        // capitalise for any internal punctuation
        text = text.replace(/([.?!]\s+)([a-z])/g, function(_, punct, letter) {
            return punct + letter.toUpperCase();
        });

        // If the current node ends in punctuation, we're back at sentence boundary
        boundary = /[.?!]\s*$/.test(text);

        // change the current node's text
        node.textContent = text;
    }
    return div.innerHTML;
}

console.log(capitalise(str2));
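Where no DOM is available (for example in a plain Node.js script), the same boundary-tracking idea can be approximated by splitting the string into tags and text. This is only a sketch, not a replacement for a real HTML parser: the helper name `capitaliseTokens` is made up for illustration, and it assumes that no tag contains `>` inside an attribute value.

```javascript
// Hypothetical helper: sentence-cases the text between tags without a DOM.
// Assumes tags never contain ">" inside attribute values.
function capitaliseTokens(html) {
    // Splitting on a capturing group keeps the tags in the result array
    var tokens = html.split(/(<[^>]*>)/);
    var boundary = true; // the start of the string starts a sentence

    for (var i = 0; i < tokens.length; i++) {
        var text = tokens[i];
        // Skip tags and whitespace-only pieces: they do not change the state
        if (text.charAt(0) === '<' || !/\S/.test(text)) continue;

        // At a sentence boundary, capitalise the first letter if it is lower case
        if (boundary) {
            text = text.replace(/^([^a-zA-Z]*)([a-z])/, function (_, lead, letter) {
                return lead + letter.toUpperCase();
            });
        }
        // Capitalise after punctuation inside this piece of text
        text = text.replace(/([.?!]\s+)([a-z])/g, function (_, punct, letter) {
            return punct + letter.toUpperCase();
        });
        // A piece ending in punctuation puts us back at a sentence boundary
        boundary = /[.?!]\s*$/.test(text);
        tokens[i] = text;
    }
    return tokens.join('');
}

var sample = '<span class="pronoun subjective">he</span> always helps. in class, <span class="pronoun subjective">he</span> behaves well. <span class="pronoun possessive">his</span> homework is late.';
console.log(capitaliseTokens(sample));
```

The tags are copied through untouched, so the output keeps every element and only the text between them changes.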


This is a great solution - thank you. – user6790086

1

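Another approach - if a split segment starts with <, find the first letter following a closing > and replace it with its capital. This works even when there are multiple tags.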

var string = '<span class="pronoun subjective"><strong = ">95">he</strong></span> always has a positive contribution to make to the class. in class, <span class="pronoun subjective">he</span> behaves well, but <span class="pronoun subjective">he</span> should aim to complete. <span class="pronoun possessive">his</span> homework a little more regularly.';

var n = string.split(".");
var vfinal = "";
for (var i = 0; i < n.length; i++) {
    // Remember the leading whitespace so it can be restored afterwards
    var leading = n[i].match(/^\s*/)[0];
    var body = n[i].slice(leading.length);
    var newstring;
    if (body.charAt(0) == '<') {
        // Capitalise the first letter that directly follows a closing > or ">
        newstring = body.replace(/("?>)([a-zA-Z])/, function(_, close, letter) {
            return close + letter.toUpperCase();
        });
    } else {
        newstring = body.charAt(0).toUpperCase() + body.slice(1);
    }
    vfinal = vfinal + leading + newstring + ".";
}
// Drop the extra "." appended after the last segment
vfinal = vfinal.substring(0, vfinal.length - 1);
console.log(vfinal);


Doesn't work for ' 95%">this doesn't work.'. Getting a regexp to do this correctly is not trivial. – Amadan


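Fair call. Updated to allow a closing '>' or '">'. This may not be an ideal solution, but if the input is consistent and predictable, then perhaps a simpler approach can work. – sideroxylon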
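For input that really is consistent and predictable, the "simpler approach" could be as small as one replace call that skips over any tags sitting between a sentence boundary and its first letter. The following is only a sketch under stated assumptions: the function name `sentenceCaseSkippingTags` is invented for illustration, sentences are assumed to end in ".", "?" or "!" followed by whitespace, and no tag may contain ">" or sentence punctuation inside an attribute value.

```javascript
// Hypothetical one-pass variant for consistent, predictable input.
function sentenceCaseSkippingTags(html) {
    // Group 1: start of string, or sentence-ending punctuation plus whitespace
    // Group 2: any run of tags (with optional whitespace) before the first letter
    // Group 3: the lower-case letter to capitalise
    var pattern = /(^\s*|[.?!]\s+)((?:<[^>]*>\s*)*)([a-z])/g;
    return html.replace(pattern, function (_, start, tags, letter) {
        return start + tags + letter.toUpperCase();
    });
}

var input = '<span class="pronoun subjective">he</span> always helps. in class, <span class="pronoun subjective">he</span> behaves well. <span class="pronoun possessive">his</span> homework is late.';
console.log(sentenceCaseSkippingTags(input));
```

It will miss a sentence whose closing punctuation sits directly before a closing tag with no whitespace after it, which is the kind of case where the DOM-based answer is the safer choice.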