Filtering log file URLs like "/fcfc/fcf/fc.php"

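In my Apache access log I am getting a lot of invalid requests, probably from bots.

All of the invalid URLs follow the same pattern, and I would like to filter them out with a regular expression.

Here are some samples: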

/oaoa/oao/oa.php 
/fcfc/fcf/fc.php 
/mcmc/mcm/mc.php 
/rxrx/rxr/rx.php 
/wlwl/wlw/wl.php 
/nini/nin/ni.php 
/gigi/gig/gi.php 
/jojo/joj/jo.php 
/okok/oko/ok.php

I can see the pattern, but I don't know how to build a (PHP) regular expression that matches it without also matching things like these. :-(

/help/one/xy.php 
/some/oth/er.php

I hope someone here knows a solution, if one is possible.

Source

2015-01-20 Neb Rehtlaw

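Welcome to SO! I removed your signature, [please don't sign your posts](http://stackoverflow.com/help/behavior) - we know who you are! ;) – georg 2015-01-20 22:31:56

Nice question, but you should try to show what you have attempted. – HamZa 2015-01-20 23:25:23

If this is your exact input, the following regex should do the trick: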

/\/(.)(.)\1\2\/\1\2\1\/\1\2\.php/

https://regex101.com/r/rU2sE6/2
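To check the pattern against the samples from the question, here is a minimal sketch in Python rather than PHP; the backreference syntax is the same, and Python needs no `/` delimiters, so the slashes are left unescaped. The helper name `is_bot_url` is made up for illustration.

```python
import re

# The pattern from this answer, without the PHP "/" delimiters.
# (.)(.) captures two characters; \1 and \2 must repeat them exactly.
BOT_URL = re.compile(r"/(.)(.)\1\2/\1\2\1/\1\2\.php")

# Samples copied from the question.
bot_samples = ["/oaoa/oao/oa.php", "/fcfc/fcf/fc.php", "/okok/oko/ok.php"]
normal_samples = ["/help/one/xy.php", "/some/oth/er.php"]


def is_bot_url(path: str) -> bool:
    """Return True when the whole path has the repeated two-character shape."""
    return BOT_URL.fullmatch(path) is not None


for path in bot_samples + normal_samples:
    print(path, is_bot_url(path))
```

`fullmatch` is used so that a longer URL that merely contains such a fragment is not flagged; whether that is the right choice depends on what the bots actually request.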

Source

2015-01-20 22:29:13 georg

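For the very specific cases you listed, here is a simple regex that will match them:

\/([a-z])([a-z])\1\2\/\1\2\1\/\1\2\.php

The \1 and \2 are backreferences to the first and second groups. The forward slashes need to be escaped when / is also the pattern delimiter, and the dot is escaped so that it only matches a literal period. In essence this says: match one character, then a second one, then the first matched character again, then the second matched character again, followed by a slash, and so on.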
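As a rough sketch of how such a pattern could be applied to the access log itself, the Python snippet below drops lines whose requested path matches. It assumes the request appears as the first double-quoted field (as in Apache's common log format); the log lines, the `REQUEST` pattern, and the `keep_line` helper are all made up for illustration.

```python
import re

# The pattern from this answer with the dot escaped; Python needs no delimiters.
BOT_PATH = re.compile(r"/([a-z])([a-z])\1\2/\1\2\1/\1\2\.php")

# Assumed layout: the request line is the first double-quoted field,
# for example "GET /path HTTP/1.1".
REQUEST = re.compile(r'"[A-Z]+ (\S+) [^"]*"')


def keep_line(line: str) -> bool:
    """Return False for lines whose requested path matches the bot pattern."""
    request = REQUEST.search(line)
    if request is None:
        return True  # not a recognisable request line; leave it alone
    path = request.group(1).split("?", 1)[0]  # ignore any query string
    return BOT_PATH.fullmatch(path) is None


# Hypothetical log lines, invented for this example.
log_lines = [
    '203.0.113.5 - - [20/Jan/2015:22:00:01 +0000] "GET /fcfc/fcf/fc.php HTTP/1.1" 404 209',
    '203.0.113.9 - - [20/Jan/2015:22:00:02 +0000] "GET /help/one/xy.php HTTP/1.1" 200 512',
]

filtered = [line for line in log_lines if keep_line(line)]
print(filtered)
```

Lines that do not look like requests are kept rather than dropped, so a mismatch in the assumed log format errs on the side of not losing data.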

Source

2015-01-20 22:29:10

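Note: interesting question, although you should have shown us what you tried. That is why I made this answer community wiki, so that I gain no reputation from it.

So the trick is to capture the characters in a group and then assert that they are present in the next block. I am guessing at the implied rule, but here is a regex for it: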

^     # Assert begin of line 
(?:    # Non-capturing group 
    (    # Capturing group 1 
    /   # Match a forward slash 
     [^/]+  # Match anything not a forward slash one or more times 
    )    # End of capturing group 1 
    [^/]   # Match anything not a forward slash one time 
    (?=\1)   # Assert that what we've matched in group 1 is ahead of us 
        # (ie: a forward slash + the characters - the last character) 
)+    # End of non-capturing group, repeat this one or more times 
\1\.php   # Match what we've matched in group 1 followed by a dot and "php" 
$     # Assert end of line

Don't forget to use the m modifier and the x modifier.

Online demo
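The same pattern can be tried in Python, where `re.VERBOSE` and `re.MULTILINE` stand in for the x and m modifiers. This is only a sketch: the name `SHRINKING` and the third test path are hypothetical. That third path illustrates that this pattern encodes a looser rule than the other answers, since it only requires each segment to be the previous one minus its last character.

```python
import re

# The pattern from this answer; re.VERBOSE plays the role of the x modifier
# and re.MULTILINE the role of the m modifier.
SHRINKING = re.compile(
    r"""
    ^               # start of line
    (?:
        (/[^/]+)    # group 1: a slash plus the segment minus its last character
        [^/]        # the last character of the segment
        (?=\1)      # the next segment must start with what group 1 captured
    )+
    \1\.php         # the shortened segment once more, then .php
    $               # end of line
    """,
    re.VERBOSE | re.MULTILINE,
)

# One path per line, as if already extracted from the log (assumed input shape).
paths = "\n".join([
    "/fcfc/fcf/fc.php",   # sample from the question
    "/help/one/xy.php",   # counter-example from the question
    "/abcd/abc/ab.php",   # hypothetical: no repeated pair, but segments shrink
])

matches = [m.group(0) for m in SHRINKING.finditer(paths)]
print(matches)
```

If only the exact two-character shape from the question should be filtered, the stricter patterns in the other answers are the safer choice.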

Source

2015-01-21 13:30:07 HamZa
