Basis Function Models

Often we want to model data y that arise from some underlying function f(x) of independent variables x, so that we can accurately predict the output for future inputs. There are various methods for devising such a model, all of which make particular assumptions about the types of functions the model can emulate. In this post we’ll focus on one set of methods called Basis Function Models (BFMs).

Basis Sets and Linear Independence

The idea behind BFMs is to model the complex target function f(x) as a linear combination of a set of simpler functions, for which we have closed form expressions. This set of simpler functions is called a basis set, and it works in a similar manner to the bases that span vector spaces in linear algebra. For instance, any vector in the 2D spatial coordinate system (the vector space \mathbb R^2) can be written as a linear combination of unit vectors along the x and y directions. This is demonstrated in the figures below:

Illustration of basis vectors along the x (blue) and y (red) directions, along with a target vector (black)

Above we see a target vector in black pointing from the origin (at xy coordinates (0,0)) to the xy coordinates (2,3), and the coordinate basis vectors b^{(x)} and b^{(y)}, each of which points one unit along the x- (in blue) or y- (in red) direction.

We can compose the target vector as a linear combination of the x- and y- basis vectors. Namely, the target vector can be composed by adding (in the vector sense) 2 times the basis b^{(x)} to 3 times the basis b^{(y)}:

Composing the target vector as a linear combination of the basis vectors
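
As a quick numerical check of this composition, here is a minimal sketch. The variable names bx, by, and target are ours, chosen for illustration; they are not part of the plotting code in this post. It builds the target from the two bases and then recovers the weights by solving the corresponding linear system:

```matlab
% BASIS VECTORS AS COLUMNS (ILLUSTRATIVE NAMES)
bx = [1; 0];
by = [0; 1];
target = [2; 3];

% COMPOSE THE TARGET AS 2*bx + 3*by
composed = 2*bx + 3*by;
isSame = isequal(composed, target)

% RECOVER THE WEIGHTS BY SOLVING A*w = target
A = [bx, by];
w = A \ target
```

Because the two bases are independent, the weights in w are the only ones that reproduce the target.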

One thing that is important to note about the bases b^{(x)} and b^{(y)} is that they are linearly independent. This means that no matter how hard you try, you can’t compose the basis vector b^{(x)} as a linear combination of the other basis vector b^{(y)}, and vice versa. In the 2D vector space, we can easily see this because the red and blue lines are perpendicular to one another (a condition called orthogonality; orthogonal nonzero vectors are always independent, although independent vectors need not be orthogonal). More formally, we can determine whether two (column) vectors are independent by calculating the (column) rank of a matrix A formed by concatenating the two vectors:

A = [b^{(x)}, b^{(y)}] = \begin{bmatrix} 1&0 \\ 0&1 \end{bmatrix}

The rank of a matrix is the number of linearly independent columns in the matrix. If the rank of A equals the number of columns in the matrix, then the columns of A form a linearly independent set of vectors. The rank of A above is 2, and so is the number of columns. Therefore the basis vectors b^{(x)} and b^{(y)} are indeed linearly independent. We can use the same rank-based test to check whether vectors of much higher dimension are independent. Linear independence of the basis set is important if we want the model weights to be uniquely determined.

%% EXAMPLE OF COMPOSING A VECTOR OF BASIS VECTORS
figure;
% EACH ROW IS A POINT: ROW 1 IS THE ORIGIN, ROW 2 IS THE TIP
targetVector = [0 0; 2 3];
basisX = [0 0; 1 0];
basisY = [0 0; 0 1];
hv = plot(targetVector(:,1),targetVector(:,2),'k','Linewidth',2);
hold on;
hx = plot(basisX(:,1),basisX(:,2),'b','Linewidth',2);
hy = plot(basisY(:,1),basisY(:,2),'r','Linewidth',2);
xlim([-4 4]); ylim([-4 4]);
xlabel('x-direction'), ylabel('y-direction')
axis square
grid
legend([hv,hx,hy],{'Target','b^{(x)}','b^{(y)}'},'Location','bestoutside');

figure
hv = plot(targetVector(:,1),targetVector(:,2),'k','Linewidth',2);
hold on;
hx = plot(2*basisX(:,1),2*basisX(:,2),'b','Linewidth',2);
hy = plot(3*basisY(:,1),3*basisY(:,2),'r','Linewidth',2);
xlim([-4 4]); ylim([-4 4]);
xlabel('x-direction'), ylabel('y-direction');
axis square
grid
legend([hv,hx,hy],{'Target','2b^{(x)}','3b^{(y)}'},'Location','bestoutside')

A = [1 0;
0 1];

% TEST TO SEE IF basisX AND basisY ARE
% LINEARLY INDEPENDENT
isIndependent = rank(A) == size(A,2)
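
To see the rank test fail as well as pass, here is a small sketch using vectors we made up for illustration (v1, v2, v3 and the matrices B and C are not from the original example). The third column of B is the sum of the first two by construction, so B should be reported as dependent, while the columns of C are independent even though they are not orthogonal:

```matlab
% THREE VECTORS IN R^4; THE THIRD IS THE SUM OF THE FIRST TWO
v1 = [1; 0; 2; 0];
v2 = [0; 1; 0; 3];
v3 = v1 + v2;
B = [v1, v2, v3];

% RANK IS 2, BUT THERE ARE 3 COLUMNS, SO THE SET IS DEPENDENT
isIndependentB = rank(B) == size(B,2)

% TWO NON-ORTHOGONAL VECTORS THAT ARE STILL INDEPENDENT
C = [1 1;
     0 1];
dotProduct = C(:,1)' * C(:,2)
isIndependentC = rank(C) == size(C,2)
```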

Modeling Functions with Linear Basis Sets

In a similar fashion to creating arbitrary vectors with vector bases, we can compose arbitrary functions in “function space” as a linear combination of simpler basis functions (note that basis functions are also sometimes called kernels). One such set of basis functions is the set of polynomials:

b^{(i)} = x^i

Here each basis function is a polynomial of order i. We can then compose a basis set of D+1 functions, b^{(0)} through b^{(D)}, and model the function f(x) as a linear combination of these polynomial bases:

f(x) = \beta_0 b^{(0)} + \beta_1 b^{(1)} + \dots + \beta_D b^{(D)}

where \beta_i is the weight on the i-th basis function. In matrix format this model takes the form

f(x) = A \beta
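
Written out for the polynomial basis, the matrix form looks as follows. This is a sketch under one assumption that the text leaves implicit: the function is evaluated at N input values, which we label x_1 through x_N, so that f(x) is a vector of N function values and A has one row per input and one column per basis function.

```latex
\begin{bmatrix} f(x_1) \\ f(x_2) \\ \vdots \\ f(x_N) \end{bmatrix}
=
\begin{bmatrix}
1 & x_1 & x_1^2 & \cdots & x_1^D \\
1 & x_2 & x_2^2 & \cdots & x_2^D \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & x_N & x_N^2 & \cdots & x_N^D
\end{bmatrix}
\begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_D \end{bmatrix}
```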

Here, again, the matrix A is formed by placing each polynomial basis function, evaluated at the input values x, into its own column. What we then want to do is determine the weights \beta such that A\beta is as close as possible to the observed data y (noisy samples of f(x)). We can do this by using Ordinary Least Squares (OLS) regression, which was discussed in earlier posts. The optimal solution for the weights under OLS is:

\hat \beta = (A^T A)^{-1}A^T y
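
The formula above requires A^T A to be invertible, which holds exactly when the columns of A are linearly independent. Here is a minimal sketch of the computation on a toy problem; the cubic target and its coefficients in betaTrue are made up for illustration. Since the data are noiseless, OLS should recover the weights almost exactly:

```matlab
% INPUTS AND A KNOWN CUBIC TARGET (ILLUSTRATIVE COEFFICIENTS)
x = linspace(-1, 1, 50)';
betaTrue = [1; -2; 0.5; 3];      % WEIGHTS ON x.^0, x.^1, x.^2, x.^3
D = numel(betaTrue) - 1;

% BASIS MATRIX: COLUMN i+1 HOLDS THE BASIS FUNCTION x.^i
A = bsxfun(@power, x, 0:D);
y = A * betaTrue;                % NOISELESS DATA

% OLS VIA THE NORMAL EQUATIONS
betaHat = (A' * A) \ (A' * y);

% SAME LEAST SQUARES SOLUTION USING MATLAB'S BACKSLASH OPERATOR
betaHatBackslash = A \ y;

maxErr = max(abs(betaHat - betaTrue))
```

In practice the backslash form is usually preferred, since it avoids forming A^T A explicitly and so tends to be better behaved numerically.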

Let’s take a look at a concrete example, where we use a set of polynomial basis functions to model a complex data trend.

Example: Modeling f(x) with Polynomial Basis Functions

In this example we model a set of data y whose underlying function f(x) is:

f(x) = \cos(x/2) + \sin(x)

In particular we’ll create a basis set of 10 polynomial functions (orders 0 through 9) and fit the \beta weights using OLS. The Matlab code for this example and the resulting graphical output are below:

Left: Basis set of 10 (scaled) polynomial functions. Center: estimated model weights for basis set. Right: Underlying model f(x) (blue), data sampled from the model (black circles), and the linear basis model fit (red).

%% EXAMPLE: MODELING A TARGET FUNCTION
x = [0:.1:20]';
f = @(x) cos(.5*x) + sin(x); % ANONYMOUS FUNCTION (inline IS DEPRECATED)

% CREATE A POLYNOMIAL BASIS SET
polyBasis = [];
nPoly = 10;
px = linspace(-10,10,numel(x))';
for iP = 1:nPoly
	polyParams = zeros(1,nPoly);
	polyParams(iP) = 1;
	polyBasis = [polyBasis,polyval(polyParams,px)];
end

% SCALE THE BASIS SET TO HAVE MAX AMPLITUDE OF 1 (COLUMNS ORDERED 0 TO nPoly-1)
polyBasis = fliplr(bsxfun(@rdivide,polyBasis,max(abs(polyBasis))));

% CHECK LINEAR INDEPENDENCE
isIndependent = rank(polyBasis) == size(polyBasis,2)

% SAMPLE SOME DATA FROM THE TARGET FUNCTION
randIdx = randperm(numel(x));
xx = x(randIdx(1:30));
y = f(xx) + randn(size(xx))*.2;

% FIT THE POLYNOMIAL BASIS MODEL TO THE DATA (USING polyfit.m);
% DEGREE nPoly-1 GIVES nPoly WEIGHTS, ONE PER BASIS FUNCTION
basisWeights = polyfit(xx,y,nPoly-1);

% MODEL OF TARGET FUNCTION
yHat = polyval(basisWeights,x);

% DISPLAY BASIS SET AND MODEL
subplot(131)
plot(polyBasis,'Linewidth',2)
axis square
xlim([0,numel(px)])
ylim([-1.2 1.2])
title(sprintf('Polynomial Basis Set\n(%d Functions)',nPoly))

subplot(132)
bar(fliplr(basisWeights));
axis square
xlim([0 nPoly + 1]); colormap hot
xlabel('Basis Function')
ylabel('Estimated Weight')
title('Model Weights on Basis Functions')

subplot(133);
hy = plot(x,f(x),'b','Linewidth',2); hold on
hd = scatter(xx,y,'ko');
hh = plot(x,yHat,'r','Linewidth',2);
xlim([0,max(x)])
axis square
legend([hy,hd,hh],{'f(x)','y','Model'},'Location','Best')
title('Model Fit')
hold off;

First off, let’s make sure that the polynomial basis is indeed linearly independent. As above, we compute the rank of the matrix whose columns are the basis functions. The rank of the basis matrix is 10, which is also the number of columns of the matrix (see the isIndependent check in the code above). This confirms that the basis functions are linearly independent.

We fit the model using Matlab’s built-in function polyfit, which builds the matrix of polynomial terms from the sampled inputs and solves the corresponding least squares problem. We see that the basis set of 10 polynomial functions (including the zeroth-order bias term) does a pretty good job of modeling a fairly complex function f(x). We essentially get to model a highly nonlinear function using simple linear regression (i.e. OLS), because the model is nonlinear in x but linear in the weights \beta.
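
To make the connection between polyfit and OLS on a basis matrix concrete, here is a sketch that fits the same kind of model both ways and compares the weights. Two choices here are ours rather than the post's: the samples are deterministic and noiseless so that the result is reproducible, and the inputs are rescaled to [-1, 1] (the variable xs) to keep the high-order powers well conditioned.

```matlab
% DETERMINISTIC SAMPLES OF THE TARGET FUNCTION (NO NOISE)
f = @(x) cos(.5*x) + sin(x);
xx = linspace(0, 20, 30)';
y = f(xx);

% RESCALE INPUTS TO [-1, 1] (AN ASSUMPTION MADE FOR CONDITIONING)
xs = (xx - 10) / 10;

% BASIS MATRIX WITH HIGHEST ORDER FIRST, MATCHING polyfit'S ORDERING
nPoly = 10;
V = bsxfun(@power, xs, (nPoly-1):-1:0);

% LEAST SQUARES ON THE BASIS MATRIX VS. polyfit
wExplicit = V \ y;
wPolyfit = polyfit(xs, y, nPoly-1)';

relDiff = max(abs(wExplicit - wPolyfit)) / max(abs(wPolyfit))
```

The two weight vectors should agree up to numerical round-off, which is what we would expect if polyfit is solving the same least squares problem.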

Wrapping up

Though the polynomial basis set works well in many modeling problems, it may be a poor fit for some applications. Luckily we aren’t limited to using only polynomial basis functions. Other basis sets include Gaussian basis functions, sigmoid basis functions, and finite impulse response (FIR) basis functions, just to name a few (in a future post, we’ll demonstrate how the FIR basis set can be used to model the hemodynamic response function (HRF) of an fMRI voxel measured from the brain).
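
As a rough sketch of how another basis set slots into the same framework, the snippet below swaps the polynomial columns for Gaussian bumps and fits the weights with the same least squares machinery. The number of centers and the width sigma are arbitrary values picked for illustration, not tuned or recommended settings, and the data are noiseless samples of the f(x) used in the example above.

```matlab
% TARGET FUNCTION FROM THE EXAMPLE, SAMPLED WITHOUT NOISE
f = @(x) cos(.5*x) + sin(x);
x = linspace(0, 20, 201)';
y = f(x);

% GAUSSIAN BASIS: ONE COLUMN PER CENTER (ILLUSTRATIVE CENTERS AND WIDTH)
centers = linspace(0, 20, 12);
sigma = 2;
G = exp(-bsxfun(@minus, x, centers).^2 / (2*sigma^2));

% SAME LINEAR MODEL AS BEFORE, ONLY THE BASIS MATRIX HAS CHANGED
w = G \ y;
yHat = G * w;

relErr = norm(y - yHat) / norm(y)
```

Only the construction of the basis matrix changes; the fitting step is identical to the polynomial case.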

About dustinstansbury

I recently received my PhD from UC Berkeley where I studied computational neuroscience and machine learning.

Posted on December 9, 2012, in Regression.
