Remove everything before the first slash in a URL?

Using a regular expression, how can I remove everything before the first path / in a URL?

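Example URL: https://www.example.com/some/page?user=1&[email protected]

From that, I just want /some/page?user=1&[email protected]

If it's just the root domain (i.e. https://www.example.com/), then I just want / returned.

The domain may or may not have a subdomain, and it may or may not use a secure protocol. Ultimately I just want to strip out anything before the first path slash.

If it matters, I'm running Ruby 1.9.3.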

2013-07-18 Shpigford

**Regular expressions are not a magic wand you wave at every problem that happens to involve strings.** You probably want to use existing code that has already been written, tested, and debugged. In PHP, use the [`parse_url`](http://php.net/manual/en/function.parse-url.php) function. Perl: [`URI` module](http://search.cpan.org/dist/URI/). Ruby: [`URI` module](http://www.ruby-doc.org/stdlib-1.9.3/libdoc/uri/rdoc/URI.html). .NET: [`Uri` class](http://msdn.microsoft.com/en-us/library/txt7706a.aspx)

Don't use a regular expression for this. Use the URI class. You can write:

require 'uri' 

u = URI.parse('https://www.example.com/some/page?user=1&[email protected]') 
u.path #=> "/some/page" 
u.query #=> "user=1&[email protected]" 

# All together - this will only return path if query is empty (no ?) 
u.request_uri #=> "/some/page?user=1&[email protected]"
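
As a minimal sketch of how `request_uri` behaves across the cases the question lists (subdomain or not, http or https, bare root), the snippet below runs it over a few sample URLs. The URLs are made up for illustration and are not taken from the question.

```ruby
require 'uri'

# Hypothetical sample URLs, chosen to cover the cases in the question.
urls = [
  'https://www.example.com/some/page?user=1', # subdomain, https, query string
  'http://example.com/some/page',             # no subdomain, plain http
  'https://www.example.com/',                 # root with trailing slash
  'https://www.example.com'                   # root without trailing slash
]

# request_uri returns the path plus "?query" when a query is present,
# and falls back to "/" when the path is empty.
paths = urls.map { |url| URI.parse(url).request_uri }
paths.each { |path| puts path }
```

Both root forms should come back as `/`, which appears to match what the question asks for.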


2013-07-18 21:29:46

+1, you beat me by 3 minutes :) – Tilo

require 'uri' 

uri = URI.parse("https://www.example.com/some/page?user=1&[email protected]") 

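# Note: uri.query is nil when the URL has no query string,
# in which case this concatenation raises a TypeError.
uri.path + '?' + uri.query
# => "/some/page?user=1&[email protected]"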

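As Gavin also mentioned, using a regular expression for this is not a good idea, tempting as it is. Your URLs may contain special characters, even Unicode characters, that you did not anticipate when writing the regexp. This is especially likely in the query string. Using the URI library is the safer approach.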
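
One thing to watch when joining `path` and `query` by hand: `query` is `nil` for a URL without a `?`, and `path` is an empty string for a bare domain. A possible guard, as a sketch only (the helper name `path_with_query` and the sample URLs are hypothetical):

```ruby
require 'uri'

# Hypothetical helper: joins path and query, tolerating a missing query
# and an empty path.
def path_with_query(url)
  uri = URI.parse(url)
  path = uri.path.empty? ? '/' : uri.path      # bare domain => "/"
  uri.query ? "#{path}?#{uri.query}" : path    # only add "?" when a query exists
end

puts path_with_query('https://www.example.com/some/page?user=1')
puts path_with_query('https://www.example.com/some/page')
puts path_with_query('http://example.com')
```

This is roughly what `request_uri` already does, so the helper mainly makes sense if you need the two parts separately as well.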


2013-07-18 21:32:48 Tilo

The same can be done using String#index:

index(substring [, offset])

str = "https://www.example.com/some/page?user=1&[email protected]" 
offset = str.index("//") # => 6 
str[str.index('/',offset + 2)..-1] 
# => "/some/page?user=1&[email protected]"
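
The snippet above assumes the string contains both `//` and a later path slash; for a bare domain such as `http://test.com`, the inner `index` returns `nil` and the slice raises. A slightly more defensive sketch follows. The method name `strip_before_path` is hypothetical, and it assumes a well-formed URL whose host is never followed directly by a double slash.

```ruby
# Hypothetical helper built on String#index only.
def strip_before_path(str)
  first = str.index('/')
  return '/' unless first                 # no slash at all => treat as root

  # If the first slash is immediately followed by another one, it is the
  # "//" after the scheme, so look for the next slash after the host.
  first = str.index('/', first + 2) if str[first + 1] == '/'

  first ? str[first..-1] : '/'            # no path slash => treat as root
end

puts strip_before_path('https://www.example.com/some/page?user=1')
puts strip_before_path('http://test.com')
puts strip_before_path('www.example.com/a?next=http://x')
```

Checking the character after the first slash, rather than searching for `//` anywhere, keeps a `//` inside the query string from being mistaken for the scheme separator.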


2013-07-18 22:04:18

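I strongly agree with the advice to use the URI module in this case, and I don't consider myself good at regular expressions. Still, it seems worthwhile to demonstrate one possible way to do what you asked.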

test_url1 = 'https://www.example.com/some/page?user=1&[email protected]' 
test_url2 = 'http://test.com/' 
test_url3 = 'http://test.com' 

regex = /^https?:\/\/[^\/]+(.*)/ 

regex.match(test_url1)[1] 
# => "/some/page?user=1&[email protected]" 

regex.match(test_url2)[1] 
# => "/" 

regex.match(test_url3)[1] 
# => ""

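Note that in the last case the URL has no trailing '/', so the result is an empty string.

The regular expression (/^https?:\/\/[^\/]+(.*)/) says: the string begins (^) with http (http), optionally followed by s (s?), followed by :// (:\/\/), followed by at least one non-slash character ([^\/]+), followed by zero or more characters, which we want to capture ((.*)).

I hope you find the example and explanation educational. Again, I recommend against actually using a regular expression in this case. The URI module is simpler to use and much more robust.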
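
Since the question wants `/` back for a root domain, the empty capture can be normalized in a small wrapper. This is only a sketch around the same pattern; the constant `URL_PATH_REGEX` and the method `path_via_regex` are hypothetical names, and returning `nil` for non-matching input is an assumption about what the caller would want.

```ruby
# Same pattern as above, stored in a constant for reuse.
URL_PATH_REGEX = /^https?:\/\/[^\/]+(.*)/

# Hypothetical wrapper: returns "/" when nothing follows the host,
# and nil when the input does not look like an http(s) URL.
def path_via_regex(url)
  match = URL_PATH_REGEX.match(url)
  return nil unless match

  match[1].empty? ? '/' : match[1]
end

puts path_via_regex('https://www.example.com/some/page?user=1')
puts path_via_regex('http://test.com/')
puts path_via_regex('http://test.com')
```

In Ruby, `^` anchors at the start of a line rather than the start of the string, so `\A` may be the better anchor if the input could contain newlines.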


2013-07-19 06:17:14
