Voicebox: Speech Processing Toolbox for MATLAB

Audio File Input/Output
readwav - Read a WAV file
writewav - Write a WAV file
readhtk - Read HTK waveform files
writehtk - Write HTK waveform files
readsfs - Read SFS files
readsph - Read SPHERE/TIMIT waveform files
readaif - Read AIFF (Audio Interchange File Format) files
readcnx - Read BT Connex database files
readau - Read AU files (from SUN)
readflac - Read FLAC files

Frequency Scales
frq2mel - Convert Hertz to mel scale
mel2frq - Convert mel scale to Hertz
frq2erb - Convert Hertz to erb rate scale
erb2frq - Convert erb rate scale to Hertz
frq2bark - Convert Hz to the Bark frequency scale
bark2frq - Convert the Bark frequency scale to Hz
frq2midi - Convert Hertz to midi scale of semitones
midi2frq - Convert midi scale of semitones to Hertz
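
As a rough illustration of what a Hertz-to-mel conversion involves, the sketch below uses a widely quoted mel formula written as two anonymous functions. The names hz2mel and mel2hz are hypothetical helpers introduced only for this example, and the constants are an assumption; the toolbox's own frq2mel and mel2frq may use a slightly different scaling.

```matlab
% Minimal sketch of a Hertz <-> mel mapping.
% ASSUMPTION: the common formula mel = 2595*log10(1 + f/700);
% hz2mel and mel2hz are hypothetical names, not toolbox functions.
hz2mel = @(f) 2595 * log10(1 + f / 700);
mel2hz = @(m) 700 * (10 .^ (m / 2595) - 1);

f = [100 440 1000 4000];     % test frequencies in Hz
m = hz2mel(f);               % corresponding mel values
f_back = mel2hz(m);          % map back to Hz

% Round-trip error; should be negligible compared with the inputs
err = max(abs(f_back - f));
disp(err);
```

Under this formula 1000 Hz maps to roughly 1000 mel, which is the usual anchoring point of the scale.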

Fourier/DCT/Hartley Transforms
rfft - FFT of real data
irfft - Inverse of FFT of real data
rsfft - FFT of real symmetric data
rdct - DCT of real data
irdct - Inverse of DCT of real data
rhartley - Hartley transform of real data
zoomfft - Calculate the FFT over a portion of the spectrum with any resolution
sphrharm - Calculate forward and inverse spherical harmonic transformations

Probability Distributions
randvec - Generate random vectors
randiscr - Generate discrete random values with prescribed probabilities
rnsubset - Select a random subset
randfilt - Generate filtered random noise without transients
stdspectrum - Generate standard audio and speech spectra
gausprod - Calculate the product of multiple Gaussians
maxgauss - Calculate the mean and variance of max(x) where x is a Gaussian vector
gaussmix - Fit a Gaussian mixture model to data values
gaussmixp - Calculate full and marginal probability density from a Gaussian mixture
gaussmixd - Calculate marginal and conditional density distributions and perform inference
gmmlpdf - Prob density function of a multivariate Gaussian mixture
lognmpdf - Prob density function of a lognormal distribution
histndim - N-dimensional histogram (+ plot 2-D histogram)
usasi - Generate USASI noise (obsolete: use stdspectrum instead)

Vector Distances
disteusq - Calculate Euclidean/Mahalanobis distances between two sets of vectors
distchar - COSH spectral distance between AR coefficient sets
distitar - Itakura spectral distance between AR coefficient sets
distisar - Itakura-Saito spectral distance between AR coefficient sets
distchpf - COSH spectral distance between power spectra
distitpf - Itakura spectral distance between power spectra
distispf - Itakura-Saito spectral distance between power spectra

Speech Analysis
activlev - Calculate the active level of speech (ITU-T P.56)
dypsa - Estimate glottal closure instants from a speech waveform
enframe - Divide a speech signal into frames for frame-based processing
correlogram - Calculate a 3-D correlogram
ewgrpdel - Energy-weighted group delay waveform
fram2wav - Interpolate frame-based values to a waveform
filtbankm - Determine matrix for a linear/mel/erb/bark-spaced filterbank
fxpefac - PEFAC pitch tracker
fxrapt - RAPT pitch tracker
gammabank - Calculate a bank of IIR gammatone filters
importsii - Calculate the SII importance function (ANSI S3.5-1997)
modspect - Calculate the modulation spectrogram
overlapadd - Reconstitute an output waveform after frame-based processing
psycdigit - Experimental estimation of monotonic/unimodal psychometric function using TIDIGITS
psycest - Experimental estimation of monotonic psychometric function
psycestu - Experimental estimation of unimodal psychometric function
psychofunc - Psychometric functions
sigma - Identify glottal closure and opening instants from Lx or EGG waveform
snrseg - Segmental SNR and Global SNR calculation
soundspeed - Returns the speed of sound in air as a function of temperature
spgrambw - Spectrogram with many options
txalign - Align two sets of time markers
vadsohn - Voice activity detector

LPC Analysis of Speech
lpcauto - LPC analysis: autocorrelation method
lpccovar - LPC analysis: covariance method
lpc--2-- - Convert between alternative LPC representations
lpcrr2am - Matrix with all LPC filters up to order p
lpcconv - Arbitrary conversion between LPC representations
lpcbwexp - Bandwidth expansion of LPC filter
ccwarpf - Warp complex cepstrum coefficients
lpcifilt - Inverse filter a speech signal
lpcrand - Create random stable filters

Speech Synthesis
sapisynth - Text-to-speech synthesis of a string or matrix
glotros - Rosenberg model of glottal waveform
glotlf - Liljencrants-Fant model of glottal waveform

Speech Enhancement
estnoisem - Estimate the noise spectrum from noisy speech using minimum statistics
specsub - Speech enhancement using spectral subtraction
ssubmmse - Speech enhancement using MMSE estimate of spectral amplitude or log amplitude
specsubm - (obsolete algorithm) Spectral subtraction

Speech Coding
lin2pcmu - Convert linear PCM to mu-law PCM
pcma2lin - Convert A-law PCM to linear PCM
pcmu2lin - Convert mu-law PCM to linear PCM
lin2pcma - Convert linear PCM to A-law PCM
kmeans - Vector quantisation: k-means algorithm
kmeanlbg - Vector quantisation: LBG algorithm
kmeanhar - Vector quantization: K-harmonic means
potsband - Create telephone bandwidth filter

Speech Recognition
melbankm - Mel filterbank transformation matrix
melcepst - Mel cepstrum frontend for recogniser
cep2pow - Convert mel cepstrum means & variances to power domain
pow2cep - Convert power domain means & variances to mel cepstrum
ldatrace - Constrained Linear Discriminant Analysis to maximize trace(W\B)

Signal Processing
ditherq - Add dither and quantize a signal
findpeaks - Find peaks in a signal or spectrum
filterbank - Apply a bank of IIR filters to a signal
maxfilt - Running maximum filter
meansqtf - Output power of a filter with white noise input
momfilt - Generate running moments
schmitt - Pass a signal through a Schmitt trigger
sigalign - Align a clean reference with a noisy signal
teager - Calculate the Teager energy waveform
windinfo - Calculate window properties and figures of merit
windows - Window function generation
zerocros - Find interpolated zero crossings
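
For readers unfamiliar with the Teager energy mentioned above, the following is a minimal sketch of the discrete Teager energy operator, psi(n) = x(n)^2 - x(n-1)*x(n+1), applied to a test sinusoid. It is written directly in base MATLAB and is not the toolbox's teager routine, which may treat the end points or offer options differently; all variable names are illustrative.

```matlab
% Minimal sketch of the discrete Teager energy operator.
% ASSUMPTION: plain three-point form, interior samples only;
% this is not the toolbox's teager function.
n = 0:99;                 % sample indices
w = 0.3;                  % normalised angular frequency (rad/sample)
x = cos(w * n);           % unit-amplitude test sinusoid

% psi(n) = x(n)^2 - x(n-1)*x(n+1), evaluated for interior samples
psi = x(2:end-1).^2 - x(1:end-2) .* x(3:end);

% For A*cos(w*n) the operator gives the constant A^2*sin(w)^2
expected = sin(w)^2;
disp(max(abs(psi - expected)));
```

The output being constant for a pure tone is what makes the operator useful as a joint amplitude and frequency indicator.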

Information Theory
huffman - Generate Huffman code
entropy - Calculate entropy and conditional entropy

Computer Vision
imagehomog - Apply a homography transformation to an image with bilinear interpolation
rot--2-- - Convert between different representations of rotations
qrabs - Absolute value of a real quaternion
qrmult - Multiply two real quaternions
qrdivide - Divide two real quaternions (or invert one)
polygonarea - Calculate the area of a polygon
polygonwind - Test if points are inside or outside a polygon
polygonxline - Find where a line crosses a polygon
sphrharm - Forward and inverse spherical harmonic transform using uniform, Gaussian or arbitrary inclination (elevation) grids and a uniform azimuth grid
upolyhedron - Calculate the vertex coordinates and other characteristics of a uniform polyhedron

Printing and Display functions
xticksi - Label x-axis tick marks using SI multipliers
yticksi - Label y-axis tick marks using SI multipliers
xyzticksi - Helper function for xticksi and yticksi
figbolden - Make a figure bold for printing clearly
cblabel - Add a label onto the colorbar
sprintsi - Print a value with an SI multiplier
frac2bin - Convert numbers to fixed-point binary strings

Voicebox Parameters and System Interface
voicebox - Global installation-dependent parameters
unixwhich - Search the WINDOWS system path for an executable program (like UNIX which)
winenvar - Obtain WINDOWS environment variables

Utility Functions
atan2sc - Arctangent function that returns the sin and cos of the angle
bitsprec - Rounds values to a precision of n bits
choosenk - All choices of k elements out of 1:n without replacement
choosrnk - All choices of k elements out of 1:n with replacement
dlyapsq - Solve the discrete Lyapunov equation
dualdiag - Simultaneously diagonalise two Hermitian matrices
finishat - Estimate the finishing time of a long loop
fopenmkd - Like FOPEN() but creates any missing directories/folders
logsum - Calculates log(sum(exp(x))) without overflow/underflow
m2htmlpwd - Create HTML documentation of MATLAB routines in the current directory
nearnonz - Replace each zero element with the nearest non-zero element
permutes - All n! permutations of 1:n
quadpeak - Find quadratically-interpolated peak in a 2D array
rotation - Generate rotation matrices
skew3d - Generate 3x3 skew symmetric matrices
zerotrim - Remove empty trailing rows and columns
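
The logsum entry describes computing log(sum(exp(x))) without overflow or underflow. One standard way to achieve this, shown here as a sketch in base MATLAB rather than as the toolbox's own code, is to subtract the maximum before exponentiating. The variable names are illustrative only.

```matlab
% Minimal sketch of a numerically stable log(sum(exp(x))).
% ASSUMPTION: the usual max-subtraction trick; the toolbox's logsum
% may be implemented differently and accepts further arguments.
x = [-1000 -1001 -1002];          % log-domain values far below underflow

naive = log(sum(exp(x)));         % exp underflows to 0, so this is -Inf

xmax = max(x);                    % largest term sets the scale
stable = xmax + log(sum(exp(x - xmax)));   % exponents are now <= 0

disp(naive);
disp(stable);
```

Because the largest exponent becomes exactly zero, at least one term in the sum equals 1, so the logarithm is always taken of a value between 1 and the number of elements.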

