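What is the fastest way to get the last part of a URL path?

Given a URL such as

"http://www.someco.com/news/2016-01-03/waterloo-station"

which never contains a query string,

what is the cleanest way to extract the string "waterloo-station"?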

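Of course, I can use the following code:

url.substring(url.lastIndexOf('/') + 1)

but I am not completely satisfied with it, because it has to perform a last-index search and then take the substring. I would like to know whether there is a better way (using a regex?) to get the same result in a single step.

Of course, the solution should be noticeably faster when executed billions of times.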
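To make the comparison concrete, here is a minimal sketch that puts the approach from the question next to one possible regex equivalent. The class name `LastSegmentDemo`, the method names, and the pattern `[^/]*$` are illustrative assumptions, not something taken from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LastSegmentDemo {
    // Compiled once and reused; compiling a Pattern on every call adds avoidable cost.
    private static final Pattern LAST_SEGMENT = Pattern.compile("[^/]*$");

    // The approach from the question: one backward scan, then one substring.
    static String viaSubstring(String url) {
        return url.substring(url.lastIndexOf('/') + 1);
    }

    // One possible regex alternative: the trailing run of non-slash characters.
    static String viaRegex(String url) {
        Matcher m = LAST_SEGMENT.matcher(url);
        return m.find() ? m.group() : "";
    }

    public static void main(String[] args) {
        String url = "http://www.someco.com/news/2016-01-03/waterloo-station";
        System.out.println(viaSubstring(url)); // waterloo-station
        System.out.println(viaRegex(url));     // waterloo-station
    }
}
```

Both variants return the same text. The regex version is a single call from the caller's point of view, but the engine still has to locate the segment and build a new string internally.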

Source

2016-01-16 John Henry

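I call this massive premature optimization. – chrylis

Why do you think this can be done any faster? The delimiter has to be found, and a new string has to be built. There is no way around that. –

A regex is far more complex than simply iterating over 17 elements of a char array. This is as fast as it gets, and also as simple and readable as it gets. It is very unlikely to be the cause of a performance problem in your application: if you have to do this billions of times, you will need to read those URLs from disk somehow, and that will be several orders of magnitude slower than the substring. –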
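For anyone who wants to check this on their own machine, the following is a rough timing sketch, not a rigorous benchmark (a harness such as JMH would be the more reliable tool). The class `RoughTiming`, its method names, and the regex are assumptions made for illustration; results will vary by JVM and hardware, so no numbers are claimed here.

```java
import java.util.function.UnaryOperator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RoughTiming {
    private static final Pattern LAST_SEGMENT = Pattern.compile("[^/]*$");

    static String viaSubstring(String url) {
        return url.substring(url.lastIndexOf('/') + 1);
    }

    static String viaRegex(String url) {
        Matcher m = LAST_SEGMENT.matcher(url);
        return m.find() ? m.group() : "";
    }

    // Applies the extractor `iterations` times and returns the elapsed nanoseconds.
    // The summed result lengths go into sink[0] so the work is not optimized away.
    static long time(UnaryOperator<String> extractor, String url, int iterations, long[] sink) {
        long total = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            total += extractor.apply(url).length();
        }
        long elapsed = System.nanoTime() - start;
        sink[0] += total;
        return elapsed;
    }

    public static void main(String[] args) {
        String url = "http://www.someco.com/news/2016-01-03/waterloo-station";
        long[] sink = new long[1];
        int n = 1_000_000;

        // Warm-up runs so the JIT has compiled both paths before measuring.
        time(RoughTiming::viaSubstring, url, n, sink);
        time(RoughTiming::viaRegex, url, n, sink);

        long substringNanos = time(RoughTiming::viaSubstring, url, n, sink);
        long regexNanos = time(RoughTiming::viaRegex, url, n, sink);

        System.out.println("substring: " + substringNanos / 1_000_000 + " ms");
        System.out.println("regex:     " + regexNanos / 1_000_000 + " ms");
        System.out.println("checksum:  " + sink[0]);
    }
}
```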

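I don't think it can be improved. The short answer is that searching for the last index is a simple operation, so it can be implemented with a fast algorithm (directly in the String class!), and a regex can hardly be that fast. As you can see below, the second access to the string could hardly cost less: it is just the initialization of the new string.

It could only have been faster if a dedicated method had been implemented directly in the String class.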
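As a purely hypothetical illustration of what such a dedicated method might look like, here is a sketch written as a static helper over a `char[]`, since the String class itself cannot be modified. The class `TailAfterLast` and its method are invented for this example and do not exist in the JDK.

```java
final class TailAfterLast {
    // Hypothetical helper: scan backward for the delimiter and build the
    // result string in the same method, without an intermediate index call.
    static String tailAfterLast(char[] chars, char delimiter) {
        for (int i = chars.length - 1; i >= 0; i--) {
            if (chars[i] == delimiter) {
                // Copy only the characters after the delimiter.
                return new String(chars, i + 1, chars.length - i - 1);
            }
        }
        // No delimiter found: the whole input is the "last part".
        return new String(chars);
    }

    public static void main(String[] args) {
        char[] url = "http://www.someco.com/news/2016-01-03/waterloo-station".toCharArray();
        System.out.println(tailAfterLast(url, '/')); // waterloo-station
    }
}
```

Even in this form the method still does the same two things, finding the delimiter and constructing a string, so any gain over `lastIndexOf` plus `substring` would be limited to call overhead.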

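If you want more details, you can look at the code in the JDK yourself. It is copied here for your convenience.

The following code is the implementation of the lastIndexOf() method in my JDK: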

public int lastIndexOf(int ch, int fromIndex) {
    int min = offset;
    char v[] = value;

    int i = offset + ((fromIndex >= count) ? count - 1 : fromIndex);

    if (ch < Character.MIN_SUPPLEMENTARY_CODE_POINT) {
        // handle most cases here (ch is a BMP code point or a
        // negative value (invalid code point))
        for (; i >= min; i--) {
            if (v[i] == ch) {
                return i - offset;
            }
        }
        return -1;
    }

    int max = offset + count;
    if (ch <= Character.MAX_CODE_POINT) {
        // handle supplementary characters here
        char[] surrogates = Character.toChars(ch);
        for (; i >= min; i--) {
            if (v[i] == surrogates[0]) {
                if (i + 1 == max) {
                    break;
                }
                if (v[i+1] == surrogates[1]) {
                    return i - offset;
                }
            }
        }
    }
    return -1;
}
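The two branches above (BMP characters and supplementary code points) can be observed from the public API. The snippet below is a small sketch using only `String.lastIndexOf` and `Character.toChars`; the class name `LastIndexOfDemo` and the sample strings are made up for illustration.

```java
final class LastIndexOfDemo {
    // Builds "a" + U+1F600 + "b"; the middle code point occupies two chars.
    static String withSupplementary() {
        return "a" + new String(Character.toChars(0x1F600)) + "b";
    }

    public static void main(String[] args) {
        String path = "/news/2016-01-03/waterloo-station";

        // BMP branch: a plain backward scan over the chars.
        System.out.println(path.lastIndexOf('/'));     // 16
        // fromIndex limits where the backward scan starts.
        System.out.println(path.lastIndexOf('/', 15)); // 5
        // Not found.
        System.out.println(path.lastIndexOf('?'));     // -1

        // Supplementary branch: the match is reported at the high surrogate.
        String s = withSupplementary();
        System.out.println(s.length());                // 4
        System.out.println(s.lastIndexOf(0x1F600));    // 1
    }
}
```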

Because it is implemented directly in the String class, it can access the class's private members:

/** The value is used for character storage. */ 
private final char value[]; 

/** The offset is the first index of the storage that is used. */ 
private final int offset; 

/** The count is the number of characters in the String. */ 
private final int count;

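It does not operate on substrings; it scans the backing array directly. At the same time, the substring method is very fast in this JDK, because it does not create a new char array: it simply creates a new String object with a different offset and count. (This applies to the JDK version shown here; since Java 7u6, substring copies the characters instead of sharing the array.)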

public String substring(int beginIndex, int endIndex) {
    if (beginIndex < 0) {
        throw new StringIndexOutOfBoundsException(beginIndex);
    }
    if (endIndex > count) {
        throw new StringIndexOutOfBoundsException(endIndex);
    }
    if (beginIndex > endIndex) {
        throw new StringIndexOutOfBoundsException(endIndex - beginIndex);
    }
    return ((beginIndex == 0) && (endIndex == count)) ? this :
        new String(offset + beginIndex, endIndex - beginIndex, value);
}

// Package private constructor which shares value array for speed. 
String(int offset, int count, char value[]) { 
    this.value = value; 
    this.offset = offset; 
    this.count = count; 
}
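The sharing technique used by that package-private constructor can be sketched outside the JDK with a small stand-in class. `CharView`, `subView`, and `sharesStorageWith` are hypothetical names invented for this example; only the offset/count idea comes from the code above.

```java
final class CharView {
    private final char[] value; // shared backing storage, never copied by subView
    private final int offset;   // first index of the storage that this view uses
    private final int count;    // number of characters in this view

    CharView(char[] value) {
        this(0, value.length, value);
    }

    // Mirrors the shape of the package-private String constructor shown above.
    private CharView(int offset, int count, char[] value) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    // Constant-time "substring": a new view with adjusted offset and count.
    CharView subView(int beginIndex, int endIndex) {
        if (beginIndex < 0 || endIndex > count || beginIndex > endIndex) {
            throw new IndexOutOfBoundsException("begin " + beginIndex + ", end " + endIndex);
        }
        return new CharView(offset + beginIndex, endIndex - beginIndex, value);
    }

    boolean sharesStorageWith(CharView other) {
        return value == other.value;
    }

    @Override
    public String toString() {
        return new String(value, offset, count);
    }

    public static void main(String[] args) {
        CharView whole = new CharView("a/b/waterloo".toCharArray());
        CharView tail = whole.subView(4, 12);
        System.out.println(tail);                          // waterloo
        System.out.println(tail.sharesStorageWith(whole)); // true
    }
}
```

The trade-off of this design is that a small view keeps the entire backing array reachable, which is one reason later JDK versions moved to copying in substring.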

Source

2016-01-16 17:26:31
