Mercer’s Theorem and SVMs

Patterns of Ideas

In a funny coincidence, this post has the same basic structure as my previous one: proving some technical result, and then looking at an application to machine learning. This time it’s Mercer’s theorem from functional analysis, and the kernel trick for SVMs. The proof of Mercer’s theorem mostly follows Lax’s Functional Analysis.

1. Mercer’s Theorem

Consider a real-valued function $latex {K(s,t)}&fg=000000$, and the corresponding integral operator $latex {mathbf{K}: L^2[0,1]rightarrow L^2[0,1]}&fg=000000$ given by

$latex displaystyle (mathbf{K} u)(s)=int_0^1 K(s,t) u(t), dt.&fg=000000$

We begin with two facts connecting the properties of $latex {K}&fg=000000$ to the properties of $latex {mathbf{K} }&fg=000000$.

Proposition 1. If $latex {K}&fg=000000$ is continuous, then $latex {mathbf{K} }&fg=000000$ is compact.

Proof: Consider a bounded sequence $latex {{f_n}_{n=1}^{infty} subset L^2[0,1]}&fg=000000$. We wish to show that the image of this sequence, $latex {{mathbf{K} f_n}_{n=1}^{infty}}&fg=000000$, has a convergent subsequence. We show that $latex {{mathbf{K} f_n}}&fg=000000$ is equicontinuous, and Arzela-Ascoli then gives a…

View original post 848 more words

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s