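I'm working on a script that sorts video files into folders by checking the file names for known keywords. As the number of keywords has grown out of control, the script has become very slow and now takes several seconds to process each file. I'm sorting files by keyword and need a more database-y solution.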
@echo off
cd /d d:\videos\shorts
if /i not "%cd%"=="d:\videos\shorts" echo invalid shorts dir. && exit /b
:: auto detect folder name via anchor file
for /r %%i in (*spirit*science*chakras*) do set conspiracies=%%~dpi
if not exist "%conspiracies%" echo conspiracies dir missing. && pause && exit /b
for /r %%i in (*modeselektor*evil*) do set musicvideos=%%~dpi
if not exist "%musicvideos%" echo musicvideos dir missing. && pause && exit /b
:: pass the file to :count as its first argument (read there as the name without extension)
for %%s in (*) do set "file=%%~nxs" & set "full=%%s" & call :count "%%s"
for %%v in (*) do echo can't sort "%%~nv"
exit /b
:count
set oldfile="%file%"
set newfile=%oldfile:&=and%
if not %oldfile%==%newfile% ren "%full%" %newfile%
set count=0
set words= & rem
echo "%~n1" | findstr /i /c:"music" >nul && set words=%words%, music&& set /a count+=1
echo "%~n1" | findstr /i /c:"official video" >nul && set words=%words%, official video&& set /a count+=2
set words=%words:has, =has %
set words=%words: , =%
if not %count%==0 echo "%file%" has "%words%" %count%p for music videos
set musicvideoscount=%count%
set count=0
set words= & rem
echo "%~n1" | findstr /i /c:"misinform" >nul && set words=%words%, misinform&& set /a count+=1
echo "%~n1" | findstr /i /c:"antikythera" >nul && set words=%words%, antikythera&& set /a count+=2
set words=%words:has, =has %
set words=%words: , =%
if not %count%==0 echo "%file%" has "%words%" %count%p for conspiracies
set conspiraciescount=%count%
set wanted=3
set winner=none
:loop
:: count points and set winner (in case of tie lowest in this list wins, sort accordingly)
if %conspiraciescount%==%wanted% set winner=%conspiracies%
if %musicvideoscount%==%wanted% set winner=%musicvideos%
set /a wanted+=1
if not %wanted%==15 goto loop
if not "%winner%"=="none" move "%full%" "%winner%" >nul && echo "%winner%%file%" && echo.
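Note the "weight value" of each keyword. The script totals the points for each category, finds the highest total, and moves the file to the folder assigned to that category. It also displays the words it found, and at the end it lists the files it could not sort, so that I can add keywords or adjust the weights.
I have reduced the number of folders and keywords in this example to a minimum. The full script has six folders, and all the keywords together come to 64k in size (and growing).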
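For comparison, the weighted scoring described above can be expressed as a lookup table plus one loop. The following is only a rough sketch in Python, not a drop-in replacement: the names `KEYWORDS`, `MIN_SCORE`, `score`, `pick_winner` and `sort_folder` are made up for illustration, and the table holds just the four keywords from the reduced example. Because the substring checks run in-process instead of starting `findstr` once per keyword, this approach would likely scale better as the keyword list grows.

```python
from pathlib import Path
import shutil

# Hypothetical keyword table: category -> {keyword: weight}.
# In a real script this could be loaded from a text file or a database.
KEYWORDS = {
    "conspiracies": {"misinform": 1, "antikythera": 2},
    "musicvideos": {"music": 1, "official video": 2},
}

MIN_SCORE = 3  # same starting threshold as `set wanted=3` in the batch script


def score(filename, keywords=KEYWORDS):
    """Return {category: (points, matched_words)} for one file name."""
    stem = Path(filename).stem.lower()  # name without extension, case-insensitive
    result = {}
    for category, words in keywords.items():
        hits = [word for word in words if word in stem]
        result[category] = (sum(words[word] for word in hits), hits)
    return result


def pick_winner(scores, min_score=MIN_SCORE):
    """Return the best category with at least min_score points, or None.

    On a tie the category listed later in the table wins, which mirrors
    the "lowest in this list wins" rule of the batch script.
    """
    winner, best = None, min_score
    for category, (points, _hits) in scores.items():
        if points >= best:
            winner, best = category, points
    return winner


def sort_folder(source, folders):
    """Move each file in `source` to folders[winner]; return the unsorted names.

    `folders` is an assumed mapping of category -> destination directory.
    """
    unsorted = []
    for path in list(Path(source).iterdir()):
        if not path.is_file():
            continue
        winner = pick_winner(score(path.name))
        if winner is None:
            unsorted.append(path.name)
        else:
            shutil.move(str(path), str(Path(folders[winner]) / path.name))
    return unsorted
```

Since the file name is never passed through a command line, characters such as `&` should not need special handling in this form.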
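If you want to do this in PowerShell, you first need to write some basic code yourself, and if you run into problems, ask a *specific* question about what isn't working. From what I can see, the main problem with the existing batch code is performance, right? – gravity
I see. That's right, performance. I suspect this is a prime example of doing things the wrong way. The only actual problem I've run into is with special characters. – bricktop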