Searching for the most common occurrences

I have a text log file whose lines contain data separated by "|".

For example:

date | time | ip | geo-location (city) | page viewed ......

I need to find the 10 most frequently occurring "page views" in the text file.

Each page view is logged as:

//pageurl

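As the log entries are on separate lines, I assume I would search between

// [url name] \r\n

for the page URL. How would I write my code to search for and list the top 10 URLs, and put them into an array?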

For example:

$url[0] <<this would be the most occurring url
$url[1] <<this would be the second most occurring url

and so on, until I can list them all the way down to:

$url[9] <<which would be the 10th most common url

I'm not sure how I would search between the "//" and the "\r\n", and then convert the top 10 most common occurrences into an array.

Thanks in advance for your help :)

Edit: here are 2 lines from my log, in case that helps:

sunday, january 22, 2012 | 16:14:36 | 82.**.***.*** | bolton | //error 
sunday, january 22, 2012 | 17:12:52 | 82.**.***.*** | bolton | //videos

Thanks

Source

2012-01-22 DJ-P.I.M.P

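What language or tool, on what platform, do you want the solution to work with? Or do you just want pseudocode? – rene

I'm coding in PHP on an Apache server under Windows, thanks –

$data = "$time | $ip | $city | $location" . "$end"; <<<<< this is the code I use to write the data to the text file. $end = "\r\n"; <<<<< the variable $end writes the new line, so maybe it helps define the end point of the search. I thought it was \n, but forgot that I had to change it to \r\n so that it actually creates a new line –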

Based on the information given, here is a fairly primitive approach:

/* get the contents of the log file */ 
$log_file = file_get_contents(__DIR__.'/log.txt'); 

/* split the log into an array of lines */ 
$log_lines = preg_split('/\r\n|\r|\n/', $log_file, -1, PREG_SPLIT_NO_EMPTY); /* any line ending, skipping empty lines */

/* we don't need the log file anymore, so free up some memory */ 
unset($log_file); 

/* loop through each line */ 
$page_views = array(); 
foreach ($log_lines as $line) { 
    /* get the text after the last pipe character (the page view), minus the ' //' */ 
    $fields = explode('|', $line); /* end() needs a real variable, not a function result */
    $page_views[] = ltrim(end($fields), ' /'); 
} 

/* we don't need the array of lines either, so free up that memory */ 
unset($log_lines); 

/* count the frequency of each unique occurrence */ 
$urls = array_count_values($page_views); 

/* sort highest to lowest (array_count_values does not sort, so this step is needed) */ 
arsort($urls, SORT_NUMERIC); 

print_r($urls); 
/* [page_url] => num page views, ... */ 

/* that gives you occurrences, but you want a numerical 
    indexed array for a top ten, so... */ 

$top_ten = array(); 
$i = 0; 
/* loop through the array, and store the keys in a new one until we have 10 of them */ 
foreach ($urls as $url => $views) { 
    if ($i >= 10) break; 
    $top_ten[] = $url; 
    $i++; 
} 

print_r($top_ten); 
/* [0] => page url, ... */

**Script output:**

Array 
(
    [videos] => 1 
    [error ] => 1 
) 
Array 
(
    [0] => videos 
    [1] => error 
)
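As a side note, the final loop that collects the top ten can be written more compactly with array_slice. A minimal sketch, assuming `$urls` has the same shape as above (page name => view count); the sample counts here are made up purely for illustration:

```php
<?php
/* hypothetical counts, shaped like the result of array_count_values() */
$urls = array('videos' => 7, 'error' => 2, 'home' => 12);

/* sort by view count, highest first, keeping the page names as keys */
arsort($urls, SORT_NUMERIC);

/* take the first ten keys; array_slice returns fewer if fewer exist */
$top_ten = array_slice(array_keys($urls), 0, 10);

print_r($top_ten);
/* [0] => home, [1] => videos, [2] => error */
```

The result is the same numerically indexed array as the loop produces, so `$top_ten[0]` is still the most viewed page.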

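This is not the most efficient solution: the larger the log file, the longer it will take. For that, you would be better off logging to a database and querying that instead.

Source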
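If the file grows but a database is not an option yet, memory use can at least be kept flat by reading one line at a time rather than loading the whole file. The following is only a sketch under a few assumptions: the function name `top_pages` is a placeholder I made up, and the page is assumed to always be the last "|"-separated field, as in the sample lines from the question.

```php
<?php
/**
 * Return the $limit most viewed pages from a pipe-delimited log.
 * top_pages() is a hypothetical helper, not part of the code above.
 */
function top_pages($path, $limit = 10)
{
    $counts = array();
    $handle = fopen($path, 'r');
    if ($handle === false) {
        return array();
    }

    /* fgets reads a single line, so only one line is held in memory at a time */
    while (($line = fgets($handle)) !== false) {
        $pos = strrpos($line, '|');
        if ($pos === false) {
            continue; /* blank or malformed line */
        }
        /* text after the last pipe, without surrounding whitespace or the leading '//' */
        $page = ltrim(trim(substr($line, $pos + 1)), '/');
        if ($page === '') {
            continue;
        }
        $counts[$page] = isset($counts[$page]) ? $counts[$page] + 1 : 1;
    }
    fclose($handle);

    /* highest count first, then keep only the page names */
    arsort($counts, SORT_NUMERIC);
    return array_slice(array_keys($counts), 0, $limit);
}
```

Calling it as `$url = top_pages(__DIR__ . '/log.txt');` would give `$url[0]` as the most viewed page, `$url[1]` as the second, and so on. Unlike the version above, this one also trims trailing whitespace, so "error " and "error" are counted as the same page.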

2012-01-22 17:38:46 mrlee

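Thanks, trying it now :) –

I created a new PHP file with this and it just loads a blank page (have I got this code wrong?). I also tried inserting echo "hello"; at the start to see whether there was any output, and got nothing. This code is too complex for me to error-check lol, apologies –

I spotted a minor syntax error, which I have since fixed. You'll want to add `ini_set('display_errors', 1); error_reporting(E_ALL);` at the start to display any errors that do occur. – mrlee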
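For reference, a sketch of where those two calls would go: at the very top of the script, before any other code. Note that a parse error in the same file stops the whole file from running, including these lines, which would also explain why an echo at the top printed nothing. In that case the setting has to be made in php.ini, or the server's error log checked instead.

```php
<?php
/* show errors in the browser while debugging; remove this on a live site */
ini_set('display_errors', '1');
error_reporting(E_ALL);

/* ... the rest of the script follows here ... */
```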
