mfcc.m
Resource name: speech.rar
Uploaded by: ay_070428
Upload date: 2014-12-04
Resource size: 11427k
File size: 3k
Source category: Speech synthesis and recognition
Development platform: Matlab
function ccc = MFCC(x,Fs,nM,nC,nfft,fl,fh,ovlp)
% MFCC  Calculate MFCC features of the input data 'x'.
% With the default settings each output row has 24 values:
% 12 MFCC followed by 12 △MFCC.
%
% Input Parameters:
%   x:    input data, must be a vector
%   Fs:   sample frequency in Hz, default 8000
%   nM:   number of cepstral coefficients per frame, default 12
%   nC:   number of mel filters, default 24
%   nfft: FFT length (also the frame length), a power of 2, default 256
%   fl:   the lowest frequency = fl * Fs, default 0
%   fh:   the highest frequency = fh * Fs, so fh <= 0.5, default 0.5
%   ovlp: default 0.5; the code uses it as the frame shift, as a fraction
%         of nfft, which equals the overlap only when ovlp = 0.5
%
% Output Parameter:
%   ccc: feature matrix, one row per frame, [MFCC △MFCC] (2*nM columns);
%        the first two and last two frames are dropped
%
% Requires melbankm.m (VoiceBox) and hamming (Signal Processing Toolbox).
%
% xiupingping@smth provided a matlab program to generate MFCC
% patterns. This program is derived from it.
% No copyright reserved
% QingRen
% gly.nkioa#gmail.com

if nargin < 2, Fs = 8000; end
if nargin < 3, nM = 12; end
if nargin < 4, nC = 24; end
if nargin < 5, nfft = 256; end
if nargin < 6, fl = 0; end
if nargin < 7, fh = 0.5; end
if nargin < 8, ovlp = 0.5; end
x = x(:);

% ----- mel filter bank -----
% melbankm.m etc. are available in the toolbox "VoiceBox"
% http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html
bank = melbankm(nC,nfft,Fs,fl,fh);
bank = full(bank);               % melbankm returns a sparse matrix
bank = bank / max(bank(:));      % scale so the largest weight is 1

% ----- DCT coefficients, an (nM×nC) matrix -----
% References
% [1] "Generalized Mel frequency cepstral coefficients for large-vocabulary
%     speaker-independent continuous-speech recognition" IEEE 1996
% [2] "Comparison of parametric representations for monosyllabic word
%     recognition" IEEE Trans Acoust.Speech.Signal Processing 1980
j = 0:nC-1;
dctcoef = zeros(nM,nC);
for k = 1:nM
    dctcoef(k,:) = cos(k * (j+0.5) * pi / nC);
end

% ----- Cepstral lifter -----
% Reference: HTK book v3.1
w = 1 + (nM/2) * sin(pi*(1:nM)./nM);
w = w / max(w);

% ----- Optional pre-emphasis filtering (disabled) -----
% x = double(x);
% x = filter([1 -0.95],1,x);

% ----- Calculate MFCC coefficients of each frame -----
L_x = length(x);
nOvlp = floor(nfft*ovlp);                       % frame shift in samples
nFrames = max(floor((L_x-nfft)/nOvlp) + 1, 0);  % frames that fit entirely in x
m = zeros(nFrames,nM);
win = hamming(nfft);
for i = 0:nFrames-1
    in = x(i*nOvlp+1:i*nOvlp+nfft);
    peak = max(abs(in));
    if peak > 0                  % skip all-zero frames to avoid 0/0 = NaN
        in = in ./ peak;
    end
    in = in .* win;
    % --- Calculate the energy spectrum ---
    s = abs(fft(in));
    t = s.^2;
    t = t + 2*realmin;           % keep log10 finite for empty bins
    t = t(1:nfft/2+1);           % keep bins 0 .. Fs/2
    % --- Calculate the energy in each channel ---
    t1 = bank * t;
    % --- Calculate MFCC ---
    t1 = log10(t1);
    c1 = dctcoef * t1;
    c2 = c1.*w';
    m(i+1,:) = c2';
end
% ccc = m;

% ----- get △mfcc -----
% Weighted difference over two frames on each side, scaled by 1/3
% as in the original program.
dtm = zeros(size(m));
for i = 3:size(m,1)-2
    dtm(i,:) = -2*m(i-2,:) - m(i-1,:) + m(i+1,:) + 2*m(i+2,:);
end
dtm = dtm/3;

% ----- combine mfcc and △mfcc -----
% Drop the two frames at each end, where △mfcc is not defined.
ccc = [m dtm];
ccc = ccc(3:size(m,1)-2,:);
return
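
The framing loop and the final trim together fix the size of ccc. The sketch below only repeats that arithmetic for the default nfft, ovlp and nM; the one-second input length is a hypothetical value chosen for illustration, and the snippet calls neither MFCC nor VoiceBox.

```matlab
% Hypothetical input: one second of audio at the default sample rate.
Fs   = 8000;          % sample frequency in Hz
L_x  = Fs;            % number of samples in x (assumed for this example)
nfft = 256;           % frame length, default of MFCC
ovlp = 0.5;           % default of MFCC
nM   = 12;            % cepstral coefficients per frame, default of MFCC

nOvlp   = floor(nfft * ovlp);                     % frame shift: 128 samples
nFrames = max(floor((L_x - nfft) / nOvlp) + 1, 0); % frames that fit entirely in x
nRows   = max(nFrames - 4, 0);                    % two frames trimmed at each end
nCols   = 2 * nM;                                 % MFCC plus delta MFCC

fprintf('%d frames -> ccc is %d x %d\n', nFrames, nRows, nCols);
% prints "61 frames -> ccc is 57 x 24"
```

An input shorter than five frames yields an empty ccc, because every remaining row would need neighbours that do not exist.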