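GSSF callback called thousands of times for audio
I am using two GSSF filters to feed a DirectShow graph: one sends BGRA frames and the other sends 16-bit, 48 kHz PCM audio samples.
The video filter calls its callback at the correct rate, roughly 30 ms apart, since I am working at 29.97 fps. For audio, however, the callback is called more than 5000 times as soon as the graph starts.
Video setup: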
BitmapInfoHeader bmi = new BitmapInfoHeader();
bmi.Size = Marshal.SizeOf(typeof(BitmapInfoHeader));
bmi.Width = width; //1920
bmi.Height = -height; // height is 1080; a negative value marks a top-down bitmap
bmi.Planes = 1;
bmi.BitCount = (short)bpp; //32
bmi.Compression = 0;
bmi.ImageSize = (bmi.BitCount/8) * bmi.Width * Math.Abs(bmi.Height); //8294400 (Height is negative, so use its absolute value)
bmi.XPelsPerMeter = 0;
bmi.YPelsPerMeter = 0;
bmi.ClrUsed = 0;
bmi.ClrImportant = 0;
int hr = ssi.SetMediaTypeFromBitmap(bmi, (long)fps); // (long)(10000000/29.97)
DsError.ThrowExceptionForHR(hr);
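The two numbers in the comments above can be checked with a few lines of plain arithmetic. The following is only a sketch: `VideoFormatMath`, `FrameDuration` and `ImageSize` are hypothetical names that are not part of DirectShowLib or the GSSF sample. It assumes the frame duration is expressed in 100-nanosecond units, as in the `(long)(10000000/29.97)` comment, and it takes the absolute value of the height because a negated (top-down) height would otherwise make the computed size negative.

```csharp
using System;

// Hypothetical helper, not part of DirectShowLib or the GSSF sample.
static class VideoFormatMath
{
    // DirectShow reference time is counted in 100-nanosecond units.
    public const long UnitsPerSecond = 10000000;

    // Duration of one frame in 100-ns units, truncated like the (long) cast in the question.
    public static long FrameDuration(double framesPerSecond)
    {
        return (long)(UnitsPerSecond / framesPerSecond);
    }

    // Size in bytes of one uncompressed frame. The height may be negative
    // (top-down bitmap), so its absolute value is used.
    public static int ImageSize(int width, int height, int bitsPerPixel)
    {
        return (bitsPerPixel / 8) * width * Math.Abs(height);
    }
}
```

For the values in the question, `VideoFormatMath.FrameDuration(29.97)` gives 333667 and `VideoFormatMath.ImageSize(1920, -1080, 32)` gives 8294400.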
Audio setup:
WaveFormatEx wfex = new WaveFormatEx();
wfex.wFormatTag = 1; //PCM
wfex.nSamplesPerSec = samplerate; //48000;
wfex.wBitsPerSample = (ushort)bps; //16
wfex.nChannels = (ushort)numChannels; //2
wfex.nAvgBytesPerSec = samplerate * (bps * numChannels/8); //192000
wfex.nBlockAlign = (ushort)(numChannels * bps/8); //4
wfex.cbSize = 0;
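// Keep the format values for the timestamp calculation
bytesPerSample = wfex.nAvgBytesPerSec; // note: despite the name, this holds bytes per second (192000)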
frequency = samplerate;
channels = numChannels;
bits = bps;
AMMediaType amt = new AMMediaType();
amt.majorType = MediaType.Audio;
amt.subType = MediaSubType.PCM;
amt.formatType = FormatType.WaveEx;
amt.temporalCompression = false;
amt.fixedSizeSamples = true;
amt.sampleSize = wfex.nBlockAlign;
amt.formatSize = Marshal.SizeOf(wfex);
amt.formatPtr = Marshal.AllocCoTaskMem(amt.formatSize);
Marshal.StructureToPtr(wfex, amt.formatPtr, false);
int hr = ssa.SetMediaTypeEx(amt, wfex.nAvgBytesPerSec); // second argument is the buffer size in bytes: 192000 = one second of audio
DsError.ThrowExceptionForHR(hr);
Tools.FreeAMMediaType(amt);
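To make the audio numbers easier to verify, here is a small sketch of the same arithmetic in isolation. `AudioFormatMath` and its methods are hypothetical names, not part of DirectShowLib. It assumes that the second argument of `SetMediaTypeEx` is the buffer size in bytes, as in the GSSF sample's interface; under that assumption a 192000-byte buffer at this format holds one second of audio.

```csharp
using System;

// Hypothetical helper, not part of DirectShowLib or the GSSF sample.
static class AudioFormatMath
{
    // Bytes occupied by one audio frame (one sample for every channel).
    public static int BlockAlign(int channels, int bitsPerSample)
    {
        return channels * bitsPerSample / 8;
    }

    // Bytes of PCM data consumed per second of playback.
    public static int AvgBytesPerSec(int sampleRate, int channels, int bitsPerSample)
    {
        return sampleRate * BlockAlign(channels, bitsPerSample);
    }

    // How many milliseconds of audio fit into a buffer of the given size.
    public static double BufferMilliseconds(int bufferBytes, int avgBytesPerSec)
    {
        return 1000.0 * bufferBytes / avgBytesPerSec;
    }
}
```

With the question's format (48000 Hz, 2 channels, 16 bits), `BlockAlign` is 4, `AvgBytesPerSec` is 192000, and `BufferMilliseconds(192000, 192000)` is 1000.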
This is how I set the timestamps:
For video:
// fps is (long)(10000000/29.97)
DsLong rtStart = new DsLong(frameNumber * fps);
DsLong rtStop = new DsLong(rtStart + fps);
int hr = pSample.SetTime(rtStart, rtStop);
frameNumber++;
For audio:
//size is the number of bytes of audio data written
//bits = 16
//channels = 2
//frequency = 48000
//timeUnit = 10000000
// lastTime starts from 0
long sampleCount = size * 8/bits/channels;
long frameLength = timeUnit * sampleCount/frequency;
DsLong rtStart = new DsLong(lastTime);
lastTime = rtStart + frameLength;
DsLong rtStop = new DsLong(lastTime);
int hr = pSample.SetTime(rtStart, rtStop);
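One detail of the audio calculation above: at 48 kHz a single frame lasts 10000000/48000 = 208.33 units, so `frameLength` is truncated whenever the buffer's frame count is not a multiple of 3, and summing truncated lengths into `lastTime` lets the error accumulate. A possible variant, shown here only as a sketch, derives each timestamp from the running frame count instead. `AudioClock` and `Next` are hypothetical names; the resulting values would still be passed to `pSample.SetTime` as in the question.

```csharp
using System;

// Hypothetical helper, not part of DirectShowLib or the GSSF sample.
sealed class AudioClock
{
    private const long TimeUnit = 10000000; // 100-ns units per second
    private readonly int frequency;
    private readonly int channels;
    private readonly int bits;
    private long totalFrames; // frames delivered so far

    public AudioClock(int frequency, int channels, int bits)
    {
        this.frequency = frequency;
        this.channels = channels;
        this.bits = bits;
    }

    // Computes start/stop times for a buffer of sizeInBytes bytes.
    // Both are derived from the cumulative frame count, so truncation
    // in one buffer does not carry over into later ones.
    public void Next(int sizeInBytes, out long start, out long stop)
    {
        long frames = (long)sizeInBytes * 8 / bits / channels;
        start = TimeUnit * totalFrames / frequency;
        totalFrames += frames;
        stop = TimeUnit * totalFrames / frequency;
    }
}
```

For example, after 1000 one-frame buffers this clock reports a stop time of 208333, whereas adding a truncated 208 per buffer would reach only 208000.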
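I have not posted the full code because most of it is the same as the GSSF sample, but I can post anything you think is necessary.
Does anyone have any idea why this happens?
Thanks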
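This sounds like behavior by design. It is not necessarily wrong. –
@RomanR. What do you mean by behavior by design? Is it normal for the callback to be called this many times? Shouldn't I be able to throttle it somehow? – tweellt
For compressed audio, frame lengths of 20-30 ms are normal, and a decoder will produce 40-50 callbacks per second. Unlike video, however, there is no specific callback frequency: samples may, for whatever reason, be regrouped into anything up to 48000 callbacks per second (for 48 kHz audio) without breaking the smoothness and quality of playback. So the statement that you get 5000 calls does not, on its own, mean that something is wrong. –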
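The relationship between buffer length and callback rate mentioned in the last comment can be sketched as plain arithmetic. `CallbackRate` and its methods are hypothetical names used only for illustration, and the sketch assumes each callback delivers one buffer with a fixed number of frames.

```csharp
using System;

// Hypothetical helper for illustration only.
static class CallbackRate
{
    // Number of audio frames in a buffer of the given duration.
    public static int FramesForMilliseconds(int sampleRate, int milliseconds)
    {
        return sampleRate * milliseconds / 1000;
    }

    // Callbacks needed per second of audio when every callback
    // delivers framesPerBuffer frames.
    public static double CallbacksPerSecond(int sampleRate, int framesPerBuffer)
    {
        return (double)sampleRate / framesPerBuffer;
    }
}
```

At 48 kHz, a 20 ms buffer holds 960 frames and needs 50 callbacks per second, while a buffer holding a single frame needs 48000.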