Convexity in Subspace Approximation & Simple Convexity Limit Theorem

by Soren Larson

In this blog post I focus on some interesting characteristics of convexity that we didn’t get to cover in class, borrowing problems and examples from Kreyszig’s “Introductory Functional Analysis with Applications” and Rockafellar’s famous “Convex Analysis” text. I’m particularly excited to discuss a theorem from Rockafellar’s text, which has been cited in virtually every analysis course I’ve taken, though until now I’ve not had the opportunity to study it. I assume the reader is familiar with the basic notions of analysis.

I borrow this example from Problem 3 of Kreyszig’s Chapter 6.2. Let X be a normed space and Y \subseteq X a finite-dimensional subspace; given x \in X, we want to best approximate x by an element y \in Y. The most natural approach is to fix a basis \{e_1,\ldots, e_n\} for Y and approximate x by a linear combination \sum \alpha_j e_j, with the scalars \alpha_j chosen to minimize the distance between x and this approximation. So, putting

f(\alpha) := \left\|x - \sum_{j=1}^n \alpha_j e_j\right\|,

we look to find \alpha that minimizes f. Before we proceed, we should check that f is continuous in \alpha, so that f cannot `jump’ near a candidate minimizer (compare: \min\{(0,1]\} is not well defined).
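As an aside, the setup can be made concrete with a quick numerical sketch. The space, norm, and basis below are my illustrative choices (X = \mathbb{R}^3 with the Euclidean norm and a two-element basis), not Kreyszig’s:

```python
import numpy as np

# Hypothetical concrete instance: X = R^3 with the Euclidean norm,
# Y spanned by the two rows of E (chosen purely for illustration).
x = np.array([1.0, 2.0, 3.0])
E = np.array([[1.0, 0.0, 0.0],   # e_1
              [0.0, 1.0, 0.0]])  # e_2

def f(alpha):
    """f(alpha) = || x - sum_j alpha_j e_j ||."""
    return np.linalg.norm(x - alpha @ E)

# f at alpha = 0 is just ||x||, since no approximation is subtracted.
print(f(np.array([0.0, 0.0])))
```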

So, fix an arbitrary \epsilon > 0, write \alpha = (\alpha_1,\ldots, \alpha_n) and \alpha' = (\alpha_1',\ldots, \alpha_n'), and put \delta = \epsilon / \sum_j \|e_j\|. Then whenever \max_k |\alpha_k - \alpha_k'| < \delta, the (reverse) triangle inequality gives

|f(\alpha) - f(\alpha')|  = \left| \left\|x - \sum_{j=1}^n \alpha_j e_j\right\| - \left\|x - \sum_{j=1}^n \alpha_j' e_j\right\|\right|  \le \left\|\sum_{j=1}^n (\alpha_j - \alpha_j')e_j \right\|  \le \max_k |\alpha_k - \alpha_k'| \sum_{j=1}^n \|e_j\|  < \delta \sum_{j=1}^n \|e_j\| = \frac{\epsilon}{\sum_{j=1}^n \|e_j\|} \sum_{j=1}^n \|e_j\|  = \epsilon.

But \epsilon was arbitrary, so we’re done: f is continuous in \alpha, as desired.
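The proof actually establishes the Lipschitz-type bound |f(\alpha) - f(\alpha')| \le \max_k |\alpha_k - \alpha_k'| \sum_j \|e_j\|, which we can sanity-check numerically. The random space and basis below are my own choices for illustration:

```python
import numpy as np

# Numerically check the bound from the continuity proof:
#   |f(a) - f(b)| <= max_k |a_k - b_k| * sum_j ||e_j||.
# The space, norm, and basis are illustrative, not from the text.
rng = np.random.default_rng(0)
x = rng.normal(size=4)
E = rng.normal(size=(3, 4))            # rows are e_1, e_2, e_3
C = sum(np.linalg.norm(e) for e in E)  # sum_j ||e_j||

def f(alpha):
    return np.linalg.norm(x - alpha @ E)

for _ in range(1000):
    a, b = rng.normal(size=3), rng.normal(size=3)
    assert abs(f(a) - f(b)) <= np.max(np.abs(a - b)) * C + 1e-12
print("bound holds on 1000 random pairs")
```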

Now we proceed to the minimization of f. Recall that for a convex function, any local minimum is also a global minimum, which guarantees that any minimum we find is the minimum.

I claim that f is convex in \alpha. That is, I plan to show for \alpha = (\alpha_1,\ldots, \alpha_n) and \beta = (\beta_1,\ldots, \beta_n), we have for fixed \lambda \in [0,1] that f(\lambda \alpha + (1- \lambda)\beta) \le \lambda f(\alpha) + (1- \lambda) f(\beta). To see this, note that x = \lambda x + (1 - \lambda) x, and apply the triangle inequality:

f(\lambda \alpha + (1 - \lambda) \beta)  = \left\| x - \sum_{j=1}^n (\lambda \alpha_j + (1 - \lambda) \beta_j) e_j \right\|  = \left\| x - \lambda \sum_{j=1}^n \alpha_j e_j - (1 - \lambda) \sum_{j=1}^n \beta_j e_j\right\|  = \left\|\lambda x - \lambda \sum_{j=1}^n \alpha_j e_j + (1 - \lambda)x - (1 - \lambda) \sum_{j=1}^n \beta_j e_j\right\|  = \left\| \lambda \left(x - \sum_{j=1}^n \alpha_j e_j\right) + (1 - \lambda) \left(x - \sum_{j=1}^n \beta_j e_j \right)\right\|  \le \lambda \left\| x - \sum_{j=1}^n \alpha_j e_j \right\| + (1 - \lambda) \left\| x - \sum_{j=1}^n \beta_j e_j \right\|  = \lambda f(\alpha) + (1 - \lambda) f(\beta),

as desired! This result is particularly useful, e.g., in linear regression, where the cost function is a norm of the residual.
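To see the minimization in action: when the norm is Euclidean, minimizing f(\alpha) is an ordinary least-squares problem, so `np.linalg.lstsq` finds the minimizer directly, and convexity guarantees it is the global one. The concrete x and basis here are my own illustrative choices:

```python
import numpy as np

# With the Euclidean norm, minimizing f(alpha) = ||x - alpha @ E|| is
# least squares in the columns of E.T. Illustrative data, not Kreyszig's.
x = np.array([1.0, 2.0, 3.0, 4.0])
E = np.array([[1.0, 1.0, 0.0, 0.0],
              [0.0, 1.0, 1.0, 1.0]])  # rows are e_1, e_2

def f(alpha):
    return np.linalg.norm(x - alpha @ E)

# Solve min_alpha ||E.T @ alpha - x||.
alpha_star, *_ = np.linalg.lstsq(E.T, x, rcond=None)

# Convexity: the minimum found is global, so random perturbations
# of alpha_star can only increase f.
rng = np.random.default_rng(1)
for _ in range(100):
    assert f(alpha_star) <= f(alpha_star + 0.1 * rng.normal(size=2)) + 1e-12
print("alpha_star is a global minimizer")
```

The least-squares solution is characterized by the residual being orthogonal to every basis vector, which is a convenient independent check of the answer.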

I now change gears from functional analysis to convex analysis. Indeed, students of analysis often like to know if properties of functions are preserved when taken to the limit. In this way, it’d be particularly nice to know that \emph{convexity} is preserved in the limit. I first make this claim precise, and then I prove it.

Take f_1,f_2,\ldots to be a sequence of finite convex functions on an open convex set C of a finite-dimensional normed space. I claim that if f_n \to f pointwise (i.e., for arbitrary fixed \epsilon > 0 and x \in C, we may find N \in \mathbb{N} large enough that |f_n(x) - f(x)| \le \epsilon whenever n \ge N), then f is convex. I now prove this claim.

Let \lambda \in [0,1] be arbitrary, and take x,y \in C. Since f_n \to f pointwise, we have

f_n(\lambda x + (1 - \lambda) y) \rightarrow f(\lambda x + (1 - \lambda) y)


and

\lambda f_n(x) + (1 - \lambda) f_n(y) \rightarrow \lambda f(x) + (1 - \lambda) f(y).

But since each f_n is convex, we have

f_n(\lambda x + (1 - \lambda)y ) \le \lambda f_n(x) + (1 - \lambda) f_n(y)

and so, letting n \to \infty on both sides (non-strict inequalities are preserved in the limit),

f(\lambda x + (1 - \lambda) y) \le \lambda f(x) + (1 - \lambda) f(y),

as desired. So, as we’d expect, convexity is preserved when taken to the limit. Rockafellar proves a more general form of this statement (his Theorem 10.8), using weaker assumptions and further ideas from analysis. But since most of those ideas are outside the scope of this course, and the assumptions I use are reasonable, `practical’ ones, I leave it here.
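A concrete instance of the theorem (my example, not Rockafellar’s): f_n(t) = \sqrt{t^2 + 1/n} is convex for each n and converges pointwise to f(t) = |t|, which is again convex — even though the limit loses differentiability at 0.

```python
import numpy as np

# Each f_n(t) = sqrt(t^2 + 1/n) is smooth and convex; the pointwise
# limit is f(t) = |t|, convex but not differentiable at 0.
def f_n(t, n):
    return np.sqrt(t**2 + 1.0 / n)

def f(t):
    return abs(t)

# Pointwise convergence at a sample point.
assert abs(f_n(0.5, 10**8) - f(0.5)) < 1e-4

# Convexity inequality for the limit, on random triples.
rng = np.random.default_rng(2)
for _ in range(1000):
    x_pt, y_pt = rng.normal(size=2)
    lam = rng.uniform()
    assert f(lam * x_pt + (1 - lam) * y_pt) <= lam * f(x_pt) + (1 - lam) * f(y_pt) + 1e-12
print("convexity preserved in the limit (numerically)")
```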