Abstract
In this survey we introduce the general Theory of Approximation of functions in (quasisemi)normed spaces. The exposition starts with an explanation of the main problem: we fix a certain family of subspaces as our approximants, and we seek a description of the subspace(s) that are approximated by this family with a given approximation order. We also introduce some of the background and the basic tools most often used to solve this kind of problem. Approximation Theory improves considerably when effort is put into the effective construction of the approximants in each given example, rather than simply stating their existence; this is what we call “Constructive Approximation”. The fact that we can handle actual functions allows us to obtain further properties of the approximants. It is implicit throughout the exposition how Approximation Theory benefits from other branches of Mathematics, but also how Constructive Approximation can be used to prove results in those other subjects. Finally, we include extensive examples that help us better understand how all this can be achieved.
Let $(X,\|\cdot\|_X)$ be a (quasisemi)normed linear space. Consider a countable family of spaces in $X$, $\{X_n\}_n$, with associated error functionals $E(\cdot,X_n)_X=\inf_{g\in X_n}\|\cdot-g\|_X$, satisfying the following properties:
We say $g\in Y$ is an element of best approximation to $f\in X$ if $\|f-g\|_X=E(f,Y)_X$. A near-best approximation element to $f$ from $Y$ is by definition any function $g\in Y$ such that $\|f-g\|_X\le\lambda E(f,Y)_X$ for some value $\lambda>0$. In that case, we will often refer to such a function $g$ as a $\lambda$-near-best approximation element.
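The two notions above can be illustrated with a minimal sketch in a toy setting that is not taken from the text: $X=\mathbb{R}^3$ with the Euclidean norm and $Y$ a line through the origin, where the best approximation is the orthogonal projection. The function names and the parameter `lam` (the near-best constant) are hypothetical choices made for this illustration only.

```python
import math

def norm2(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(c * c for c in v))

def best_approx_on_line(f, v):
    """Best Euclidean approximation to f from Y = span{v} (orthogonal projection)."""
    coef = sum(a * b for a, b in zip(f, v)) / sum(b * b for b in v)
    return [coef * b for b in v]

def error(f, g):
    """Distance ||f - g|| in the Euclidean norm."""
    return norm2([a - b for a, b in zip(f, g)])

def is_near_best(f, g, v, lam):
    """True when g satisfies ||f - g|| <= lam * E(f, Y) for Y = span{v}."""
    best_error = error(f, best_approx_on_line(f, v))
    return error(f, g) <= lam * best_error

f = [1.0, 2.0, 2.0]
v = [1.0, 0.0, 0.0]
g_best = best_approx_on_line(f, v)   # attains the error E(f, Y)
g_other = [2.0, 0.0, 0.0]            # another element of Y, not the best one
E = error(f, g_best)
```

In this toy case `g_other` misses the best error by a factor of about 1.06, so it qualifies as near-best for `lam = 1.1` but not for `lam = 1.05`.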
We will call $\{X_n\}$ a family of approximants. For any such family, and given parameter values $\alpha,q>0$, consider the following (quasi)seminorms and associated subspaces:
We call these the Approximation Spaces associated to the family of approximants $\{X_n\}$. They consist of those functions $f\in X$ that are approximated in $X$ by elements of $X_n$ with error of order $O(n^{-\alpha})$ (i.e., there exists $C>0$ such that $E(f,X_n)_X\le Cn^{-\alpha}$ for all $n$) and “smoothness” $q$.
Lemma 1.1 If the sequence $n\mapsto E(f,X_n)_X$ is monotone decreasing, then the (quasi)seminorms $|\cdot|_{A_q^\alpha(X,X_n)}$ are equivalent to the following:
Proof. For any $k>0$ and any $2^{k-1}\le n\le2^k$, we have the estimates
Lemma 1.2 Under the same hypothesis as before, and for any value $\alpha>0$, the following inclusion holds for all $0<q<p\le\infty$:
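Proof. This is just an application of the well-known inclusions $\ell_q\subset\ell_p$ for $0<q<p\le\infty$: Consider the measure space $(\mathbb{N},\mu)$, where $\mu(\{n\})=1$ for all $n\in\mathbb{N}$, and consider for each function $f\in X$ the (measurable in $(\mathbb{N},\mu)$) functions $\varphi_f:\mathbb{N}\ni k\mapsto2^{k\alpha}E(f;X_{2^k})_X\in\mathbb{R}^+$; then,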
Remark. In the following sections we will learn to find descriptions of these spaces in terms of classical spaces. The main tools used in this sense are given in the following order:
Theorem 1 (Hardy) Given $\theta>0$ and $1\le q<\infty$, the following inequalities hold for each nonnegative measurable function $\phi$:
For $q=\infty$, the integral is replaced by the $L_\infty$ norms:
Proof. Let us prove the estimate (1.5). For any value $t>0$ we first estimate the interior integral using Hölder’s Inequality; let $p$ be the conjugate exponent of $q$:
The remaining estimates can be obtained from this one by changes of variable or taking limits, so we skip their proofs.
Remark. It is possible to discretize integrals of the kind 0q when the functions (t) are nonnegative and monotone, using the same technique we used in the proof of Lemma 1.1:
Lemma 1.3 (Discrete Hardy’s Inequalities) Let a = (an)n, b = (bn)n be two nonnegative sequences such that there exist C0,, > 0 so that for all n, either
Proof. Let us assume that the first condition is satisfied. From the inclusions for 0 < < <, we infer that it must also be bn < C0 k=nak for all < . We can therefore assume that < q (if it is not, then we can certainly pick < q < ). We are now able to use Hölder’s Inequality; let 0 < < , and let r > 0 so that /q + /r = 1:
A similar proof serves to show that the second condition also gives the same estimate, but this time only for values 0 < < .
Theorem 2 Given > 0 and 0 < q < 1, there exists C > 0 that depends at most on and q such that the following inequalities hold for each nonnegative monotone function :
Proof. Given $t>0$, there exists $n\in\mathbb{Z}$ such that $2^{-(n+1)}<t\le2^{-n}$; therefore,
Definition 1 Given a (quasi)normed space $(X,\|\cdot\|_X)$ and a subspace $Y\subset X$, we say an operator $P:X\to Y$ is of best approximation if $\|f-P(f)\|_X=E(f,Y)_X$ for all $f\in X$. Similarly, for a given $\lambda>0$, we say $P$ is an operator of near-best approximation if $\|f-P(f)\|_X\le\lambda E(f,Y)_X$ for all $f\in X$.
Proof. Consider for each $n$ an element $g_n\in Y$ such that $d(f,g_n)<\inf_{g\in Y}d(f,g)+1/n$. Any sequence in a compact set has at least one limit point $g_0\in Y$; by continuity of the distance, $d(f,g_0)=\inf_{g\in Y}d(f,g)$, so this element is a best approximation to $f$ from $Y$ by definition.
Theorem 3 Let $X$ be a (quasi)normed space. For each finite dimensional subspace $X_d$ of $X$ and each $f\in X$, there is a best approximation to $f$ from $X_d$.
Proof. If Xd = {0} then there is nothing to prove. Otherwise, consider for each f X the set Y f = . Y f is a closed set:
Proof. Given $x,y\in X$, assume $E(x-y,Z)_X\le E(y,Z)_X$ (this is no loss of generality, since one can switch the elements). Let $z\in Z$ be a near-best approximation to $x$ from $Z$, and $z'',z'''\in Z$ elements of best approximation from $Z$ to $y-x$ and $y$ respectively. We have then, for $z'=z''+z$,
Example. The spaces $L_p$ are unicity spaces for $1<p<\infty$, but not for $0<p\le1$ nor $p=\infty$. A characterization of spaces (at least normed) with the unicity property can be made through the use of “strict convexity”:
Definition 3 A normed space is said to be strictly convex if the following property holds:
Proof. Assume the result is not true, and there exist a function $f\in X$ and two different elements $g_1,g_2\in Y$ such that $\|f-g_1\|_X=\|f-g_2\|_X=E(f,Y)_X>0$. In that case,
Proof. Let $P:X\to X$ be the operator of best approximation, and let $C_X\ge1$ be the constant in the (quasi)triangular inequality of the (quasi)norm $\|\cdot\|_X$. Given $\varepsilon>0$, let $\delta=\varepsilon(3C_X^2)^{-1}$. Notice that, if $f,g\in X$ verify $\|f-g\|_X<\delta$, then we have
Definition 4 A compatible couple is a pair of (quasisemi)normed linear spaces, $(X,\|\cdot\|_X)$ and $(Y,\|\cdot\|_Y)$, continuously embedded in a Hausdorff topological linear space $H$.
We define the sum and intersection of such a couple as
An intermediate space $Z$ of a compatible couple $(X,Y)$ is an interpolation space for the couple if $T(Z)\subset Z$ for every admissible operator $T$.
Remark. In the 1970s there were primarily two methods for constructing interpolation spaces of a compatible couple: the complex method of Calderón [Cal1], and the real method of Lions and Peetre [Peet]. We are mainly interested in the latter, since it uses as building blocks quasi-seminorms similar to the ones in the description of the Approximation Spaces above.
Definition 7 We define the K-functional of a compatible couple (X,Y ) as follows:
Proof. Notice that, given $f\in X+Y$, the function $K(f,\cdot;X,Y)$ is trivially nonnegative and monotone nondecreasing. Its concavity is proved in the following way: Given $t_1,t_2>0$ and $0<\mu<1$, and any decomposition $f=f_X+f_Y$, we have, for $t=\mu t_1+(1-\mu)t_2$,
Proof. $|\cdot|_{(X,Y)_{\theta,q}}$ is trivially positively homogeneous and nonnegative. The (quasi)triangular inequality is directly inferred from part (ii) in Lemma 1.7.
Proposition 2 Given a compatible couple $(X,\|\cdot\|_X)$, $(Y,\|\cdot\|_Y)$, and parameters $0<\theta<1$, $0<q<\infty$, the $(\theta,q)$ spaces
Proof. This is immediate from (iii) in Lemma 1.7.
Lemma 1.9 Given a (quasisemi)normed space $(X,\|\cdot\|_X)$, and a continuously embedded subspace $Y\subset X$ with $\|f\|_X\le C_Y\|f\|_Y$ for some $C_Y>0$ and all $f\in Y$: (i) The K-functional of the compatible couple $(X,Y)$ can also be written as
Proof. Part (i) is trivial. As for part (ii), we start noticing that
Theorem 4 (Holmstedt (1970)) Let $(X_1,\|\cdot\|_{X_1})$, $(X_2,\|\cdot\|_{X_2})$ be a compatible couple of (quasi)normed spaces, let $0<\theta_1<\theta_2<1$ and $0<q_1,q_2<\infty$, and consider the interpolation spaces $Y_1=(X_1,X_2)_{\theta_1,q_1}$, $Y_2=(X_1,X_2)_{\theta_2,q_2}$; then, for $f\in Y_1+Y_2$ and $\delta=\theta_2-\theta_1$, we have the following equivalence:
Definition 8 Given a (quasisemi)normed space $(X,\|\cdot\|_X)$ and a (quasisemi)normed continuously embedded subspace $(Y,|\cdot|_Y)\subset X$: Jackson Inequality: We say the family of approximants $\{X_n\}_n$ verifies a Jackson Inequality with respect to $Y$ if there exist $r,C>0$ such that
Remark. In the literature of Approximation Theory, results that state Jackson’s Inequalities are referred to as “direct theorems”, whereas Bernstein’s inequalities are identified as “inverse theorems”.
Proposition 3 Given a (quasisemi)normed space $(X,\|\cdot\|_X)$ and a family of approximants $\{X_n\}_n$ satisfying both Jackson and Bernstein inequalities with respect to a (quasisemi)normed continuously embedded linear subspace $(Y,|\cdot|_Y)\subset X$, the following estimates hold:
Proof. To prove (1.11), given $f\in X$, consider any $g\in Y$ and a best approximation $g_n$ to $g$ in $X$ from each $X_n$; then,
The second step is often provided by classical results in the Theory of Interpolation. The first step is the difficult one from the viewpoint of approximation; Theorem 6 provides a good start. Finally, Theorem 7 (whose proof is not given here; see [CDeH]) provides, in a sense, a converse to Corollary 3.1.
Corollary 3.1 If the family of approximants $\{X_n\}_n$ satisfies the Jackson and Bernstein inequalities with respect to $Y$ with exponent $r>0$, and the sequence of errors $E(f,X_n)$ is monotone decreasing, then for each $0<\alpha<r$ and $0<q<\infty$, $A_q^\alpha(X,X_n)=(X,Y)_{\alpha/r,q}$ with equivalent norms.
Proof. Estimate (1.11) gives us Aq(X,Xn) (X,Y )/r,q trivially; for example, for 0 < p < , and r that makes good both Jackson’s and Bernstein’s Inequalities, if f (X,Y )/r,q, then
Theorem 6 (DeVore, Popov) For any (quasisemi)normed space $(X,\|\cdot\|_X)$ and family of approximants $\{X_n\}\subset X$ such that $X_n\subset X_{n+1}$ for all $n$, as well as for any $r>0$ and $0<p<\infty$, the spaces $X_n$ verify the Jackson and Bernstein inequalities for the exponent $r$ with respect to $Y=A_p^r(X,X_n)$. Therefore, for any $0<\alpha<r$ and $0<q<\infty$, we have
Proof. It is enough to show that $\{X_n\}_n$ verifies both the Jackson and Bernstein inequalities for the exponent $r>0$ with respect to $A_p^r(X,X_n)$:
The rest of the statement follows immediately.
Theorem 7 (Cohen, DeVore, Hochmuth) Let $X$, $Y$, $\{X_n\}_n$ be as before, and suppose that $\{X_n\}_n$ satisfies the Jackson and Bernstein inequalities for $r>0$. Suppose further that the sequence of operators $\{T_n\}_n$ verifies: (i) $T_n:X\to X_n$ (not necessarily linear). (ii) There exists $C>0$ such that $\|f-T_nf\|_X\le C\,E(f,X_n)_X$ for all $f\in X$. (iii) $|T_nf|_Y\le C|f|_Y$ for all $n$ and $f\in X$.
Then, {Tn}n realizes the K-functional; that is,
In this section we want to exemplify how to obtain the approximation spaces in the following case: given the unit cube $\Omega\subset\mathbb{R}^d$, $X=L_p(\Omega)$ for any choice of $0<p<\infty$, and $X_n$ a linear space of box-splines with coordinate order $r$, maximal smoothness, and associated to the dyadic $n$-th partition of the cube $\Omega$.
In the search for the spaces $A_q^\alpha(X,X_n)$, we will go through different levels of abstraction: from the low-level construction of best and near-best polynomial approximation to functions in $L_p(\Omega)$ on cubes, to the high-level description of the K-functionals that will lead us to further results involving interpolation spaces for compatible couples of Besov spaces. The logical step-by-step exposition is summarized in the following table:
Lemma 2.1 Given r > 0, a cube d and 0 < q < p <, there is a constant C > 0 depending at most on p, q and d such that
Proof. Consider for all p > 0 the (quasi)norms |||.|||Lp() = ||-1/p||.||Lp() in (r), and apply Theorem 27 (page 101).
Lemma 2.2 Given r > 0, cubes I J d such that |J|< |I| for some > 0, and 0 < q <, there are constants C1,C2 > 0 depending at most on q and d such that
Proof. Consider the (quasi)norms ||.||I,Lq() = |I|-1/q||.||Lq() in (r) and apply Theorem 27 again.
Lemma 2.3 Let be a cube in d, and f Lp(). If g (r) is a near-best Lq() approximation to f for any 0 < q < p, then it is also a C near-best Lp() approximation to f, for some C > 0 that depends on d, p, q, r and , but does not depend on the size of .
Proof. Let P be the best Lp() approximation element to f from (r); then we have:
Lemma 2.4 Let I J be cubes in d such that |J|< |I| for some a > 0. Let f Lp(J), and g (r) a near-best Lq(I) approximation to f for any 0 < q < p. Then g is also a C near-best Lp(J) approximation to f, where C > 0 depends on , , d, p and q.
Proof. Let P be the best Lp(J) approximation element to f from (r). First, notice that for any cube I J,
Theorem 8 (Szegö (1928)) For each trigonometric polynomial Tr of order r,
Definition 9 Given a function f : and a finite collection of real numbers {t0,t1,...,tn}, we denote with | (f;t0,t1,...,tn) the leading coefficient of the polynomial of degree n that interpolates f at t0,...,tn. We call it the n-th divided difference of f. Divided differences are computed recursively as follows:
Proof. Given f : , consider the following interpolation polynomials for each k = 0,...,n - 1: Qk(x) = Pf(x;t0,...,tk) (k), Qk+1(x) = Pf(x;t0,...,tk+1) (k + 1). Notice that g = Qk+1 - Qk (k + 1) vanishes at the knots t0,...,tk, and by definition its leading coefficient is the divided difference | (f;t0,...,tk+1); hence,
Proof. This is a direct consequence of Rolle’s Theorem.
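As an illustration of the recursive computation of divided differences, here is a minimal sketch for distinct knots, using exact rational arithmetic. The function name `divided_difference` is a hypothetical choice for this example, and the recursion used is the classical one, $[t_0,\dots,t_n]f=([t_1,\dots,t_n]f-[t_0,\dots,t_{n-1}]f)/(t_n-t_0)$, which is assumed to agree with the one intended in the text.

```python
from fractions import Fraction

def divided_difference(f, knots):
    """n-th divided difference of f at the distinct knots t_0, ..., t_n.

    Builds the usual triangular table: each pass replaces the differences
    of order k-1 by the differences of order k.
    """
    knots = [Fraction(t) for t in knots]
    table = [Fraction(f(t)) for t in knots]          # order 0: function values
    for order in range(1, len(knots)):
        table = [(table[i + 1] - table[i]) / (knots[i + order] - knots[i])
                 for i in range(len(table) - 1)]
    return table[0]

cube = lambda t: t ** 3
square = lambda t: t ** 2

lead = divided_difference(cube, [0, 1, 2, 3])        # leading coefficient of t^3
vanishing = divided_difference(square, [0, 1, 2, 3]) # degree 2 < 3, so this is 0
second = divided_difference(cube, [0, 1, 2])         # equals f''(xi)/2! for some xi
```

For $f(t)=t^3$ the second divided difference at $0,1,2$ equals $3=f''(\xi)/2!$ with $\xi=1$, an interior point, which is consistent with the mean value property derived from Rolle's Theorem.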
Lemma 2.7 (Leibniz Formula for Divided Differences) Given functions $f,g$ and knots $t_0,\dots,t_n$, the $n$-th divided difference of their product is given by the following formula:
| (2.1) |
Proof. Assume all the knots are different; consider the polynomial of interpolation of h = fg at those knots (using Newton’s expression with the divided differences as coefficients).
Notice now that the leading coefficient of Ph(x;t0,...,tn) is | (fg;t0,...,tn), and the leading coefficient of the expression in (2.2) is the right-hand side of (2.1).
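The Leibniz formula can be checked numerically. The sketch below assumes the classical form of the identity, $[t_0,\dots,t_n](fg)=\sum_{k=0}^n[t_0,\dots,t_k]f\,[t_k,\dots,t_n]g$, for distinct knots; the helper names are hypothetical, and exact rational arithmetic is used so that the comparison is an equality rather than an approximation.

```python
from fractions import Fraction

def divided_difference(f, knots):
    """Divided difference of f at distinct knots, by the triangular recursion."""
    knots = [Fraction(t) for t in knots]
    table = [Fraction(f(t)) for t in knots]
    for order in range(1, len(knots)):
        table = [(table[i + 1] - table[i]) / (knots[i + order] - knots[i])
                 for i in range(len(table) - 1)]
    return table[0]

def leibniz_sum(f, g, knots):
    """Sum over k of [t_0..t_k]f * [t_k..t_n]g (classical Leibniz formula)."""
    n = len(knots) - 1
    return sum(divided_difference(f, knots[:k + 1]) *
               divided_difference(g, knots[k:])
               for k in range(n + 1))

f = lambda t: t * t + 1          # a polynomial factor
g = lambda t: 1 / (t + 2)        # a rational factor, smooth on the knots below
fg = lambda t: f(t) * g(t)

knots = [0, 1, 3, 4]
lhs = divided_difference(fg, knots)
rhs = leibniz_sum(f, g, knots)
```

Both sides agree exactly for this choice of knots; repeated knots would require the confluent (derivative) version of the divided differences, which this sketch does not cover.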
Definition 10 (Schoenberg spaces: knots-multiplicity form) Given an interval $A=[a,b]$ in $\mathbb{R}$, we define initially spaces of splines in the following way: Fix $r>0$ and let $\{a<t_1<t_2<\dots<t_n<b\}$ be a partition of the interval, with multiplicities $0<m_k\le r$ associated to these knots. We denote $t=(t_1,\dots,t_n)$, $m=(m_1,\dots,m_n)$, and
Classically, these are called Schoenberg spaces on A.
Remarks. For instance, m = 1 gives one degree of freedom: the location of the image of f(t). In that case, the
smoothness of f at t is r - 1, which is the maximum possible degree. In particular, this shows that
(r) = Sr(t,1;A), where 1 = (1,...,1).
On the other hand, if m = r, then we have all possible degrees of freedom: we can choose location, and all
derivatives (both sides); this leaves us piecewise polynomials with possible discontinuities on each
knot.
With a slight abuse of notation, we can write Sr(t,r;A) =
k=1n(r)|(tk,tk+1), where r = (r,...,r).
Proposition 4 The space Sr(t,m;[a,b]) has the basis
Example. Consider $A=[0,1]$ and the basic knot sequence $t=\{0,0,0,1/2,1/2,1/2,1,1,1\}$. In this case, we have
Definition 12 (puB-splines) If $t_k\le\dots\le t_{k+r}$ is a sequence of $r+1$ knots with $t_k<t_{k+r}$, we define the puB-spline $N_{k,r}$ as follows:
Proof. Notice that $N$ is by definition a linear combination of truncated powers $(t_j-x)_+^{r-k_j}$, where $k_j$ is the number of repetitions $t_i=t_j$ for $i\le j$; therefore, it is a spline function. Furthermore, since any $r$-th order divided difference of a polynomial of degree $r-1$ is zero, $N_{k,r}$ vanishes identically when $x<t_k$ or $x>t_{k+r}$ (think leading coefficients).
The recurrence formula is a direct consequence of the recurrence formula for divided differences and the Leibniz formula for the divided difference of a product of two functions:
Remarks. In his class notes [deBo], Carl de Boor expresses the previous recurrence formula in the following way:
Theorem 10 (Curry, Schoenberg (1966), de Boor, Fix (1973)) Given $a<b$, $r>0$, basic knots $t=\{a<t_1<\dots<t_n<b\}$ and $2r$ auxiliary knots $\{t_{1-r}\le\dots\le t_0\le a\}$, $\{b\le t_{n+1}\le\dots\le t_{n+r}\}$, the puB-splines $N_{k,r}(x)=N(x\mid t_k,\dots,t_{k+r})$ for $k=1-r,\dots,n$ form a basis of $S_r(t;[a,b])$.
Proof. Although the result was first proved by Curry and Schoenberg [CuSc] in 1966, we will offer here a different proof by de Boor and Fix [dBFi], based on the Marsden identities:
Notice first that the previous Marsden’s identities can be completed with the expression of any power ( -x) for < r - 1 by differentiation:
It only remains to prove that a different choice of leads to the same coefficients, and therefore the functionals k,r : (r) do not depend on this choice; let ', and write each derivative P() in Taylor expansion around ':
Conclusion: We have expressions of every basic element of Sr(t;[a,b]) in terms of the constructed puB-splines. This means that they span the Schoenberg space, and because of their cardinality, they must form a basis of the space.
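The B-splines of this section can be evaluated with the standard Cox-de Boor recurrence. The sketch below is an illustration under stated assumptions: it uses the classical recurrence for B-splines normalized to form a partition of unity, 0-based list indices instead of the indices $k=1-r,\dots,n$ of the text, and a uniform knot sequence on $[0,1]$ chosen only for the example. The function name `bspline` is hypothetical.

```python
def bspline(k, r, t, x):
    """Value at x of the B-spline of order r (degree r - 1) with knots t[k..k+r].

    Cox-de Boor recurrence; terms with a zero denominator (repeated knots)
    are dropped, following the usual convention.
    """
    if r == 1:
        return 1.0 if t[k] <= x < t[k + 1] else 0.0
    left = 0.0
    if t[k + r - 1] > t[k]:
        left = (x - t[k]) / (t[k + r - 1] - t[k]) * bspline(k, r - 1, t, x)
    right = 0.0
    if t[k + r] > t[k + 1]:
        right = (t[k + r] - x) / (t[k + r] - t[k + 1]) * bspline(k + 1, r - 1, t, x)
    return left + right

r = 3                                   # order 3, i.e. piecewise quadratics
# r auxiliary knots up to a = 0, basic knots 1/4, 1/2, 3/4, r auxiliary knots from b = 1
knots = [-0.5, -0.25, 0.0, 0.25, 0.5, 0.75, 1.0, 1.25, 1.5]
count = len(knots) - r                  # n + r = 6 splines, matching the basis size

def total(x):
    """Sum of all B-splines at x; expected to be 1 on [0, 1)."""
    return sum(bspline(k, r, knots, x) for k in range(count))

peak = bspline(2, r, knots, 0.375)      # centre of the spline on knots 0, 1/4, 1/2, 3/4
```

With three basic knots and order three the code produces six splines, in agreement with the count $n+r$ of the theorem, and their sum is 1 at every point of $[0,1)$.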
Remark. Consider the dual functionals associated to the basis of Sr(t;[a,b]) given by the puB-splines; let us denote them k,r. There are different ways of expressing these functionals; de Boor and Fix offer the most useful for our purposes: For each k = 1 - r,...,n, choose k,r supp(Nk,r) = (tk,tk+r) [a,b], and write
Consider the functional Qt : Sr(t,[a,b]) given by
It is not hard to show that the projector Qt is a bounded operator on the Schoenberg spaces; given S Sr(t;[a,b]), we have
Lemma 2.9 There exists $C>0$ (depending at most on $r$), such that for all $0<p<\infty$, $k=1-r,\dots,n$ and $S\in S_r(t;[a,b])$,
Proof. We will use Lemma 2.1 (page 30), Markov’s Theorem (page 33), and the fact that the functions $g_{k,r}$ and their derivatives are polynomials, hence bounded on any compact set. Consider any point $\xi_{k,r}\in J_{k,r}$:
Proposition 5 For all $1\le p<\infty$ and all $f\in L_p[a,b]$, there exists a constant $C>0$ that depends at most on $p$ and the order $r$, such that the following local and global estimates hold:
Proof. Estimate (2.3) is a direct consequence of the remark after Lemma 2.9, the “partition of unity” property of the puB-splines, and the fact that $|J_{k,r}|\ge(t_{k+r}-t_k)/r$ and $J_{k,r}\subset I_{j,r}$ for every suitable index $j$:
Remark. Unfortunately, one cannot use Hahn-Banach to find the same kind of results for $0<p<1$, although similar approximation results will remain valid. In §2.5 we will show how, in a more general setup.
Definition 13 A tensor product puB-spline $N:\mathbb{R}^d\to\mathbb{R}$ of coordinate order $r$ (coordinate degree $<r$) is a product of univariate puB-splines of order $r$, each of them in a different variable: $N(x_1,\dots,x_d)=N_1(x_1)\cdots N_d(x_d)$.
We have all the necessary ingredients to pose the problem of approximation on cubes of $\mathbb{R}^d$ by dyadic splines. Let $\Omega=[0,1]^d$ be the unit cube in $\mathbb{R}^d$, and let $X=L_p(\Omega)$ with the corresponding (quasi)norm for all $0<p<\infty$. The family of approximants will be constructed as spaces spanned by tensor product puB-splines. The construction of those spaces starts on the real line:
Consider for each $n$ the basic knots in $[0,1]$ given by $t_n=\{k2^{-n}:0<k<2^n\}$, and $t_0=\emptyset$ by definition. Associated to these basic knots we will use the following Schoenberg spaces:
For each $n>0$, denote $N_{0,r}^{[n]}(x)=N_{0,r}(2^nx)$ (the dyadic dilation of order $2^{-n}$), and then for each $k=1-r,\dots,2^n-1$, $N_{k,r}^{[n]}(x)=N_{0,r}(2^nx-k)$ (horizontal shifts of length $k2^{-n}$).
Notice that {Nk,r[n]}k=1-r2n-1 is a basis for Sr(tn,[0,1]); let us denote k,r[n] the corresponding dual functionals.
Let us move into d dimensions, where we will write x = (x1,...,xd) d. Consider for each multi-index k = (k1,...,kd), the tensor product puB-spline Nk,r[n](x) = Nk1,r[n](x1)Nkd,r[n](xd), and the functionals k,r[n] = k1,r[n] oo kd,r[n].
Lemma 2.10 For each S Xn, any tensor product puB-spline Nk,r[n] = Nk1,rNkd,r, and any point k,r = (k1,r,...,kd,r) supp(Nk,r[n]) , (kj,r supp(Nkj,r) projj()),
Proof. Given a multi-index k = (k1,...,kd), and a tensor product spline S Xn, write
It follows that these are the dual functionals of the constructed tensor product puB-splines, and therefore the functions Nk,r[n] are linearly independent in Lp(). Let us denote Xn = span{Nk,r[n]}, where k has all its indices between 1 - r and 2n - 1. This is the space of all piecewise polynomials with coordinate order r and maximal smoothness on dyadic subcubes of size 2-n in the cube (since DNk,r[n] L() for all multi-index 0 < < r - 1, and DNk,r[n] is continuous for multi-indices 0 < < r - 2). As in the univariate case, each tensor product puB-spline Nk,r[n] can be obtained from N0,r[0] by shifts and dilations:
The family {Xn}n is our family of approximants:
We would now like to have a projector, but this is not an easy task. The quasi-interpolant of §2.4.2 does not work for $0<p<1$, but is still useful. The trick is to find first intermediate spaces $X_n\subset Y_n\subset X$ for each $n$, where the quasi-interpolants can be easily extended, and such that we can effectively compute (near-)best approximations from $Y_n$ to elements of $X$. In the case we are studying here, the obvious choice works just fine:
For each multi-index j = (j1,...,jd) with 1 < ji < 2n, consider the dyadic cubes of , j,n = [(j1 - 1)2-n,j12-n] ×× [(jd - 1)2-n,jd2-n], and Dn = {j,r : 1 < j < 2n} the family of those cubes. Let us denote Y n = j=12n (r;j,n) the space of piecewise polynomials of coordinate order r associated to the dyadic partition Dn.
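To make the idea of piecewise polynomial approximation on the dyadic partition concrete, here is a minimal one-dimensional sketch. It is deliberately simplified and the simplifications are assumptions of the example, not the construction of the text: it takes $d=1$, $p=2$ and order $r=1$ (piecewise constants), where the best local approximation is the local mean, and it replaces integrals by midpoint sums. All function names are hypothetical.

```python
import math

def nodes(j, n, samples):
    """Midpoint sample points inside the j-th dyadic interval of length 2**-n."""
    h = 2.0 ** (-n)
    return [(j + (i + 0.5) / samples) * h for i in range(samples)]

def dyadic_means(f, n, samples=64):
    """Local means of f on each of the 2**n dyadic subintervals of [0, 1]."""
    return [sum(f(x) for x in nodes(j, n, samples)) / samples
            for j in range(2 ** n)]

def l2_error(f, n, samples=64):
    """Approximate L2[0,1] distance from f to its piecewise constant of local means."""
    h = 2.0 ** (-n)
    total = 0.0
    for j, c in enumerate(dyadic_means(f, n, samples)):
        total += sum((f(x) - c) ** 2 for x in nodes(j, n, samples)) * h / samples
    return math.sqrt(total)

identity = lambda x: x
errors = [l2_error(identity, n) for n in range(1, 6)]
ratios = [errors[i + 1] / errors[i] for i in range(len(errors) - 1)]
```

For $f(x)=x$ the error is halved at each dyadic refinement, the decay rate $2^{-n}$ that a Jackson-type estimate with $r=1$ would predict for a smooth function.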
As we did before for univariate puB-splines, we need to consider, for each tensor product puB-spline $N_{k,r}^{[n]}$, two special cubes:
In order to obtain the de Boor-Fix expression of the dual functionals of the tensor product puB-splines Nk,r[n], we will choose canonically k,r to be the center of the cubes Jk,r. In that case, one realizes that these functionals can also be applied to any function f Lp() which is differentiable enough on each of the points k,r; in particular, any piecewise polynomial P Y n:
Lemma 2.11 For any 0 < p < and any piecewise polynomial P Y n, there exists a constant C > 0 which depends at most on the order r, such that
The proof of this lemma follows the same steps as the one for Lemma 2.9 (page 47) and the remark after it. We can also construct quasi-interpolant operators $Q_n:Y_n\to X_n$, which act naturally as projectors.
Proposition 6 For 0 < p < and any piecewise polynomial P Y n, there is a constant c > 0 depending at most on r, d and p, such that the following estimates hold:
The proof is trivial; it uses the previous lemma, and follows the same steps as the proof of Proposition 5 and the remark after it (page 48).
We are ready to construct the method of approximation: Given > 0, consider any operator of near-best Lp approximation by elements of Y n, say n : X Y n such that Lp() < E(f,Y n)p. Notice that such operators may be constructed by collecting the restriction on cubes of the near-best Lp approximations to f by polynomials in (r) on each cube j,n Dn, and patching them together. Let then
Proposition 7 Given > 0, there exists a constant C > 0 which depends at most on p, r d and such that ||Tn(f)||Lp() < C||f||Lp() for all f Lp().
Proof. Notice first that, for all subcube j,n Dn, we have
Proof. Given f Lp(), let Sn Xn be its best Lp approximation from Xn; then, we have the estimate:
Remark. Notice that now we may use interchangeably, for each $f\in L_p(\Omega)$, either $E(f,X_n)_p$ or $\|f-T_n(f)\|_{L_p(\Omega)}$, since they are equivalent. Moreover, when searching for the approximation spaces, we may use the (quasi)seminorm functions
In this section we introduce two important tools in the Theory of Sobolev Spaces: mollifiers and infinitely differentiable partitions of unity. We also illustrate how to use the former to construct plateau functions and prove the density of $C_0^\infty(G)$ in $L_p(G)$ for $1\le p<\infty$ on domains $G\subset\mathbb{R}^d$.
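Definition 14 We call a mollifying kernel any nonnegative, real-valued function $\varphi\in C_0^\infty(\mathbb{R}^d)$ such that $\varphi(x)=0$ for $|x|\ge1$ and $\int_{\mathbb{R}^d}\varphi(x)\,dx=1$; we call a mollifier any function $\varphi_\varepsilon(x)=\varepsilon^{-d}\varphi(x/\varepsilon)$ for any $\varepsilon>0$.
Given a function $f\in M(\mathbb{R}^d,|\cdot|)$ for which the integral $\int_{\mathbb{R}^d}\varphi_\varepsilon(x-y)f(y)\,dy$ makes sense, we call the convolution $(\varphi_\varepsilon*f)(x)$ a mollification or regularization of $f$.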
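A small numerical sketch may help to see what a mollification does. The example below works in dimension $d=1$ and assumes the familiar bump profile $\exp(-1/(1-x^2))$ on $(-1,1)$, normalized numerically to have integral 1; the midpoint quadrature, the parameter name `eps`, and all function names are choices made for this illustration only.

```python
import math

def bump(x):
    """Unnormalized smooth bump supported in [-1, 1]."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def integrate(g, a, b, n=2000):
    """Midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

MASS = integrate(bump, -1.0, 1.0)        # normalizing constant of the kernel

def mollifier(eps):
    """Rescaled kernel x -> eps**-1 * kernel(x / eps), with integral 1."""
    return lambda x: bump(x / eps) / (eps * MASS)

def mollify(f, eps, x, n=2000):
    """Value at x of the convolution of the mollifier with f."""
    phi = mollifier(eps)
    # the integrand vanishes outside [x - eps, x + eps]
    return integrate(lambda y: phi(x - y) * f(y), x - eps, x + eps, n)

step = lambda y: 1.0 if y >= 0.0 else 0.0   # discontinuous at 0

far_right = mollify(step, 0.1, 0.5)     # unchanged away from the jump
far_left = mollify(step, 0.1, -0.5)
at_jump = mollify(step, 0.1, 0.0)       # the jump is smoothed to the mean value
```

Away from the discontinuity (at distance larger than `eps`) the step function is reproduced, while at the jump the mollification takes the average of the two one-sided values.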
Example. An example of mollifying kernels are the “bump” functions d : d given by
Proof. Assume f M(d,|.|). Let (sn)n be a monotonically increasing sequence of nonnegative simple functions converging pointwise to f. As p > 1, we have 0 < sn(x)p < f(x)p a.e., and therefore, it must be sn Lp(G), and furthermore, by the Dominated convergence Theorem, limn||f - sn||Lp(G) = 0 (since |f(x) - sn(x)|p < f(x)p for all x). Given > 0, find sn such that ||f - sn||Lp(G) < /2. Use now Lusin’s Theorem to find a continuous function g C(G) such that |g(x)|<||sn||L(G) and more importantly,
Lemma 2.12 Given a domain G d, a mollification kernel and a function f M(d,|.|) such that f(x) = 0 for x / G, the following holds: (i) If f L1loc(G), then * f C(d) for all > 0. (ii) If also supp(f) is compact, then * f C0(G) for all 0 < < dist(supp(f),G). (iii) If f Lp(G) for any 1 < p < , then * f Lp(G); moreover,
Proof. Let : d be a mollifying kernel, and consider = *K 3/2, the mollification of K3/2 with . This is the function we are looking for.
Proof. This is a direct consequence of Theorem 11 and parts (ii) and (v) of the previous Lemma.
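Theorem 14 Given an arbitrary subset $A\subset\mathbb{R}^d$ and an open cover $\mathcal{O}$ of this set, there exists a collection $\Psi$ of functions $\psi\in C_0^\infty(\mathbb{R}^d)$ with the following properties: (i) $0\le\psi(x)\le1$ for all $\psi\in\Psi$ and all $x\in\mathbb{R}^d$. (ii) Given a compact subset $K\subset A$, all but possibly finitely many $\psi\in\Psi$ vanish identically on $K$. (iii) Given $\psi\in\Psi$, there exists $U\in\mathcal{O}$ such that $\operatorname{supp}(\psi)\subset U$. (iv) $\sum_{\psi\in\Psi}\psi(x)=1$ for all $x\in A$.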
Definition 15 Given a domain $G\subset\mathbb{R}^d$, consider the space $D(G)$ consisting of those functions $g\in C_0^\infty(G)$ such that there exists a compact set $K\subset G$ and a sequence $(g_n)_n$ in $C_0^\infty(G)$ so that $\operatorname{supp}(g-g_n)\subset K$ for all $n$, and $\lim_nD^kg_n(x)=D^kg(x)$ uniformly on $K$ for each multi-index $k$.
The dual space $D'(G)$ is called the space of (Schwartz) distributions if it is given the weak-star topology as dual of $D(G)$: $\lim_nT_n=T$ in $D'(G)$ if and only if $\lim_nT_n(g)=T(g)$ in the scalar field for every $g\in D(G)$.
Remark. The space L1loc(G) can be identified with a subspace of D'(G) as follows: given f L1loc(G), let Tf : C0(G) g Gf(x)g(x)dx . These functionals are trivially linear. Notice that it is also continuous: Given a sequence (gn)n in C0(G) such that there exists a compact K G so that supp(g -gn) K for all n, and limngn(x) = g(x) uniformly on K; we have
Definition 16 Given $G\subset\mathbb{R}^d$, a multi-index $k\in\mathbb{N}^d$ and a distribution $T\in D'(G)$, we define its distributional $k$-th derivative $D^kT$ by $(D^kT)(g)=(-1)^{|k|}T(D^kg)$ for all $g\in C_0^\infty(G)$.
Similarly, for $f\in L_1^{\mathrm{loc}}(G)$ and $k\in\mathbb{N}^d$, we say $v\in L_1^{\mathrm{loc}}(G)$ is a weak $k$-th derivative of $f$ if $T_v$ is the distributional $k$-th derivative of $T_f$. This weak derivative might not exist, but in case it does, it must be unique a.e.; we denote it $D_w^kf$.
Definition 17 Given a domain G d, and r {0}, we define the following functionals:
Functionals (2.8) and (2.9) are trivially seminorms, and (2.10), (2.11) are norms. Associated to these functionals, we define the following spaces:
Remark. We have trivially Wp0(G) = Lp(G) for 1 < p <, and Wp0(G) = Lp(G) for 1 < p < (by Theorem 13). Notice also the chain of (continuous) embeddings for all r :
Proof. Let (fn)n be a Cauchy sequence in Wpr(G) Lp; then trivially (Dkfn)n are Cauchy sequences in Lp(G) for all multi-index k with 0 <|k|< r. Let f, (k) Lp(G) be such that limnfn = f and limnDkf = (k) both in Lp(G). As Lp(G) L1loc(G), each of those functions determines distributions Tf,T (k) D'(G). For any g D(G), we have then (let q be the conjugate exponent of p):
For functions of one variable, both ordinary and generalized derivatives produce the same space. This is proved in the following two results:
Lemma 2.14 Let A be an open interval, and r {0}. If f L1loc(A) verifies Afg(r)dx = 0 for all g C0(), then f is a.e. a polynomial of order r.
Remark. In §2.7.2 we will make use of the K-functional of compatible couples (Lp(G),Wpr(G)) for G d. We can use the previous results to illustrate how to compute it in one dimension. We will base the proof in the availability of the Taylor polynomial for functions in Sobolev Spaces: For each f Wpr(A), consider the Taylor polynomial centered in c A:
Lemma 2.15 Given an open interval A R, 1 < p,q <, we have the estimate
Proof. Let p' be the conjugate exponent of p; let’s apply Hölder’s Inequality:
Proof. Given 0 < t <|A|, we have for all x At = {x A | x + t A},
| (2.12) |
since |At|< t trivially.
Consider now for our choice of 0 < t < |A|, x A and > 0 such that x + t A. In this case, we
have
Consider the difference operators: for each $h\in\mathbb{R}^d$ and measurable function $f\in M(\Omega,|\cdot|)$ for any subset $\Omega\subset\mathbb{R}^d$, let $\Delta_h(f,\cdot)=f(\cdot+h)-f(\cdot)$, and $\Delta_h^r=\Delta_h(\Delta_h^{r-1})$ for $r>1$. It follows from the binomial theorem that
$$\Delta_h^r(f,x)=\sum_{k=0}^r(-1)^{r-k}\binom{r}{k}f(x+kh)\qquad(2.13)$$
for all $x\in\Omega(rh)=\{x\in\Omega\mid x+kh\in\Omega\text{ for all }1\le k\le r\}$.
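The two descriptions of the $r$-th difference, the recursive definition and the binomial expansion, can be compared directly. The sketch below does so in one variable with exact rational arithmetic; the function names are hypothetical, and the binomial form used is the classical one, $\sum_{k=0}^r(-1)^{r-k}\binom{r}{k}f(x+kh)$, assumed to be the content of (2.13).

```python
from fractions import Fraction
from math import comb

def diff_recursive(f, h, r, x):
    """r-th difference from the definition: apply f(. + h) - f(.) r times."""
    if r == 0:
        return f(x)
    return diff_recursive(f, h, r - 1, x + h) - diff_recursive(f, h, r - 1, x)

def diff_binomial(f, h, r, x):
    """r-th difference from the binomial expansion."""
    return sum((-1) ** (r - k) * comb(r, k) * f(x + k * h) for k in range(r + 1))

poly = lambda x: x ** 3 - 2 * x          # a cubic, i.e. a polynomial of order 4
h = Fraction(1, 3)
x0 = Fraction(1, 2)

third = diff_binomial(poly, h, 3, x0)    # 3! * h**3 times the leading coefficient
fourth = diff_binomial(poly, h, 4, x0)   # differences of order 4 annihilate cubics
```

The fourth difference of the cubic vanishes and the third one equals $3!\,h^3$, which is the one-variable instance of the fact that $r$-th differences annihilate polynomials of degree less than $r$.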
Definition 18 Given a rearrangement-invariant space $(X,\|\cdot\|_X)$ over the measure space $(\Omega,|\cdot|)$, we define the $r$-th modulus of smoothness of $f\in X$ by
The general setup is fairly complicated, and many different properties are to be taken into account in order to produce any general result on these functionals. We will focus on the spaces we are going to use in this survey: $L_p(\Omega)$ for $0<p<\infty$, and $C(\Omega)$ for $p=\infty$, where $\Omega\subset\mathbb{R}^d$ is a compact cube.
Lemma 2.16 For any $t>0$, the modulus of smoothness is a seminorm for $1\le p\le\infty$ and a quasi-seminorm for $0<p<1$.
Proof. Notice that r(f,t)p <||r(f,t)p, and r(0,t)p = 0 trivially for all 0 < p <. As for the (quasi)triangular inequality, we have also trivially r(f + g,t)p < C, with the same constant from the (quasi)triangular inequality in Lp((rh)). The kernel of the r-th modulus of smoothness is precisely the set of polynomials (r) of coordinate order r.
Also, from the fact that ||f + g||Lpp <||f||Lpp + ||g||Lpp for all 0 < p < 1, we obtain similarly r(f + g,t)pp < r(f,t)pp + r(g,t)pp.
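A rough numerical approximation of the modulus of smoothness helps to build intuition for the statements above. The sketch below is restricted to one variable on $\Omega=[0,1]$ and rests on several assumptions made only for the example: the modulus is taken as the supremum over steps $|h|\le t$ of the $L_p$ norm of the $r$-th difference on $\Omega(rh)$, the supremum is replaced by a maximum over a finite grid of positive steps (in one variable negative steps give the same norms), and integrals are computed with the midpoint rule. The function names and grid sizes are hypothetical.

```python
from math import comb

def rth_difference(f, h, r, x):
    """r-th forward difference of f with step h at the point x."""
    return sum((-1) ** (r - k) * comb(r, k) * f(x + k * h) for k in range(r + 1))

def modulus_of_smoothness(f, r, t, p, n_h=50, n_x=400):
    """Grid approximation of the r-th modulus of smoothness of f in L_p[0, 1]."""
    best = 0.0
    for i in range(1, n_h + 1):
        h = t * i / n_h                  # steps 0 < h <= t
        length = 1.0 - r * h             # Omega(rh) = [0, 1 - r*h]
        if length <= 0.0:
            continue
        dx = length / n_x
        integral = sum(abs(rth_difference(f, h, r, (j + 0.5) * dx)) ** p
                       for j in range(n_x)) * dx
        best = max(best, integral ** (1.0 / p))
    return best

line = lambda x: x
first = modulus_of_smoothness(line, 1, 0.25, 2.0)    # exact value: t * sqrt(1 - t)
second = modulus_of_smoothness(line, 2, 0.25, 2.0)   # order 2 annihilates degree 1
```

For $f(x)=x$ the first modulus at $t=1/4$ in $L_2$ is $t\sqrt{1-t}$, while the second modulus vanishes up to rounding, in agreement with the description of the kernel as the polynomials of order $r$.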
Lemma 2.17 (Properties of the modulus of smoothness in Lp) Given f Lp(), t > 0, the following estimates hold:
Proof. The first estimate can be proved directly from the identity $\Delta_h^r(f,x)=\Delta_h^{r-1}(f,x+h)-\Delta_h^{r-1}(f,x)$, which follows easily from (2.13). That gives $\omega_r(f,t)_p^{\min(1,p)}\le2\,\omega_{r-1}(f,t)_p^{\min(1,p)}$, and from this the statement follows.
As for the second estimate, notice first that
Weak inverses to the first estimate in the previous lemma are offered by Marchaud and Timan. A proof of the following result, that is known as Marchaud’s Inequalities, can be read in chapter 2 of [DeLo].
Theorem 18 (Marchaud (1927),Timan (1958)) Given r > 2 and f Lp(), we have the following estimates for all 1 < k < r, t > 0 and 0 < p <:
Sketch of the Proof. The proof is similar to the proof of part (ii) in Lemma 1.9 (page 21): we start showing that the seminorm above is equivalent to the one obtained replacing the integral (or the supremum) over (0,) by one over (0,1), using the fact that, as is compact, then r(f,t)p < r(f,||)p for all t > ||. After that, discretization of the latter integral with partition {2-n | n } is applied, using the fact that r(f,.)p is a nondecreasing function. .
Remark. Unfortunately, the moduli of smoothness are not always suitable for applications because it is not easy to add up several such estimates over different intervals. New related (and equivalent) moduli of smoothness can be constructed by averaging:
Definition 20 We define the r-th averaged modulus of smoothness on the subcube I , for f Lp() and t > 0 by
Remark. Notice that, for I,J , we have (I J)(h) I(h) J(h) for all suitable h d; therefore, wr(f,t;I J)pp > wr(f,t;I)pp + wr(f,t;J)pp. We will prove now the equivalence with the moduli of smoothness.
Lemma 2.19 For all f Lp() and suitable s d, the following holds for all x d
Proof. Notice that
Proof. The left inequality is trivial:
Proof. This proof is considerably more involved than the previous ones.
Theorem 20 (Whitney (1957)) E(f,(r))p r(f,)p for all f Lp(), (0 < p <), t > 0 and r , where is the largest of the sides of .
Proof. For p > 1, let g Wpr() be arbitrary, and P (r) be the Taylor polynomial of g associated to one of the points in the boundary of . By some result in §2.6, we have
The right inequality is trivial for all p, since r(f,t)p = r(f -P,t)p < 2r0(f -P,t) = 2r||f -P||Lp() for all P (r).
Proposition 11 For any 0 < p < , r and > 0, given f Lp(), there exists C > 0 that depends at most on p, r and such that the following estimate holds for all n:
Proof. For each dyadic subcube j,n, denote j,n = n(f)|j,n the restriction of the piecewise polynomial of near-best approximation we get from the operator n. In that case,
Corollary 11.1 For any 0 < p < , r and > 0, given f Lp(), there exists C > 0 that depends at most on p, r and such that the following estimate holds for all n:
Proof. The left-hand side is immediate; given x , let n(x) = {k d | x supp(Nk,r[n])} (notice that this value does not depend on n, but on r and d):
Proof. Let $S_n$ be an element of best $L_p(\Omega)$ approximation to $f$ from $X_n$ for all $n$, and let $s_1 = S_1$, $s_n = S_n - S_{n-1} \in X_n$ for $n \ge 2$. We can then write $f = f - S_n + \sum_{k=1}^{n} s_k$, and use this inside the difference operator; for suitable $h \in \mathbb{R}^d$,
We need to estimate the terms in the sum on the right, but trying to introduce coefficients 2k on each of the estimates, so that later application of Lemma 1.3 in page 10 is possible; let x (rh): Now it all relies on estimates of the differences of the basic tensor product puB-splines; these depend heavily on the location of the point x and the size of |h|; let (rh) = ', where x if x and x + rh both belong to the same subcube i,k, and ' = (rh) \ :In order to estimate ||DhrNj,r[k]||L (segment[x,x+rh]), we need to use some basic multivariate calculus: Consider the univariate polynomial j,r,x,h[k] : [0,1] defined by the composition of Nj,r[k](x) = N0,r[0] ok,j ox,h, where k,j : d x 2kx - j d, and x,h : [0,1] t x + rht segment[x,x + rh] d.
We have then,
Remark. Abusing notation, we can denote $X_0 = \{0\}$, and hence $E(f,X_0) = \|f\|_{L_p(\Omega)}$, and we may simplify the previous estimate to read
Theorem 21 Given r , and 0 < p,q <; then for all 0 < < , the following quasinorms are equivalent to the Besov quasinorms ||.||Bq(Lp()) = |.|Bq(Lp()) + ||.||Lp():
Proof. The equivalence of $N_1$, $N_2$ and $\|\cdot\|_{B_q^\alpha(L_p(\Omega))}$ follows directly from Proposition 11, Corollary 11.1, Proposition 12 and the Discrete Hardy’s Inequalities (Lemma 1.3 on page 10). As for the third quasinorm, notice that on one side,
Remark. We have now reached the goal of this chapter: we have precisely determined the approximation spaces in $L_p(\Omega)$ associated to the family of approximants $X_n$ for all q > 0, and 0 < < :
Corollary 21.1 Given r , the following spaces are identical (with equivalent (quasi)norms) for all 0 < < :
Remark. Theorem 21 also offers the possibility of representing functions in Besov spaces by means of a sequence of nonnegative real functions satisfying certain properties. In the next section we will use this representation to find an equivalent expression for the K-functional of couples of Besov spaces and, with it, to compute the interpolation spaces for such couples.
Notice that n=1tn(f) = limnTn(f) = f a.e.; and as tn(f) Xn for each n > 1, we may write tn(f) = k=1-r2n-1 j,r[n](tn(f))Nj,r[n], and furthermore
| (2.24) |
This atomic decomposition of functions in Bq(Lp()) leads to yet another equivalent (quasi)norm:
Corollary 21.2 Given p,q,r, as before, f Lp() is in Bq(Lp()) if and only if f can be represented as in (2.24), with
Given a sequence of functions a = (fn)n in a (quasisemi)normed space (X,||.||X), consider for parameters ,q > 0 the (quasisemi)norms
Consider also, the following operator in Lp():
Theorem 22 Given r , 0 < p1,q1,p2,q2 < , 0 < 1,2 < r, denote Bi = Bqii(Lp i()), and i = qii(Lp i()); then, for all f B1 + B2, there exist constants C1,C2 > 0 which depend at most on r,d,1 and 2 such that for all t > 0,
Proof. Let us prove the left inequality: Given f B1 + B2, let a1 = (an[1])n 1 such that a2 = (an[2])n = Tf -a1 2; we have K(Tf,t;1,2) <||a1||1 + t||a2||2. From these sequences ai, we will construct functions fi Bi such that f = f1 + f2, and ||fi||Bi < C||ai||i.
We will be using the projectors Tn : L() Xn for both functions fi Bi; thus, we need to work in a space L() Lp1() Lp2(). As || = 1, use Jensen’s Inequality to realize that for each 0 < < min(p1,p2), and any function Lpi, we have || = /pi </pi = ||||L pi().
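The consequence of Jensen's Inequality used here is that, on a domain of total measure 1, the $L_p$ (quasi)norms are nondecreasing in the exponent. Below is a minimal sketch on a discrete stand-in for the domain: $n$ cells of measure $1/n$ each. The function name `lp_quasinorm`, the sample values and the exponent pairs are assumptions made for this illustration only.

```python
def lp_quasinorm(values, p):
    """(average of |v|^p)^(1/p): the L_p (quasi)norm when each of the n cells has measure 1/n."""
    return (sum(abs(v) ** p for v in values) / len(values)) ** (1.0 / p)

# Hypothetical cell values of a function on a domain of total measure 1.
samples = [0.0, 0.5, -1.0, 2.0, -3.5, 4.0, 0.25, -0.75]

# Each pair is (smaller exponent, larger exponent); exponents below 1 are allowed.
exponent_pairs = [(0.3, 0.7), (0.5, 1.0), (0.7, 2.0)]
for small, large in exponent_pairs:
    print(small, large, lp_quasinorm(samples, small), lp_quasinorm(samples, large))
```

The normalization by `len(values)` is what encodes total measure 1; without it the comparison between exponents can fail.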
For each n, let gn = Tn(an[1]) Xn. By the equivalence of quasinorms in finite dimensional spaces, and Proposition 7 in page 54, we know that it must be ||gn||Lp 1() < C||gn||L < C,r,||an[1]||L() < C,r,||an[1]||Lp 1(). Consider now g = n=1gn, which converges trivially in Lp1(); notice that for each n, with = min(p1,1).
Let us prove the right inequality: Let g B1 such that f - g B2. Given > 0 and 0 < < min(p1,p2), we construct near-best elements of L() approximation to f from Y n via the operator n : L() Y n, and using Lemma 1.5 in page 14, we can obtain as well elements of L() approximation to g from Y n, say hn(g) Y n, such that n(f) - hn(g) are near-best elements of L() approximation to f - g from Y n. By an argument similar to the proof of Lemma 8 in page 55, we realize that Un = Qn(hn(g)) is a near-best Lp1() approximation to g from Xn, and Rn = Tn(f) - Qn(hn(f)) is a near-best Lp2() approximation to f - g from Xn.
Let un = Un - Un-1 and rn = Rn - Rn-1 for n > 1 (being U0 = R0 = 0 trivially), and consider the sequences u = (un)n,r = (rn)n n=1Xn. Notice that
Corollary 22.1 Under the same conditions as in the Theorem above, and given 0 < < 1 and 0 < q <, we have f (B1,B2),q if and only if Tf (1,2),q.
Notice that this result allows us to compute the interpolation spaces for compatible couples of Besov Spaces. It all depends on the computation of the interpolation spaces ,q; these are easily defined in terms of the Lorentz spaces Lp,q(), so we will introduce them in the next section.
Given a totally $\sigma$-finite measure space $(\Omega,\mu)$, and values $0 < p,q < \infty$, consider the Lorentz functionals
and for $q = \infty$,
Lemma 2.21 We have the equivalence p,q(.) ||.||Lp,q(,) among Lorentz functionals for all p > 1 and 0 < q <.
Proof. For all 0 < q < we have trivially p,q(f) <||f||Lp,q(,), since f*(t) < f**(t) for all t > 0. On the other hand, for 0 < q < ,
Lemma 2.22 (Properties of Lorentz functionals) (i) Both p,q(f) < p,q(g) and ||f||Lp,q(,) <||g||Lp,q(,) for f,g M0(,) such that |f|<|g|. (ii) The functionals (2.26) and (2.28) are both (quasi)norms for all 1 < p < .
Proof. Part (i) is trivial. Using this, and the subadditivity property of the maximal functions $f^{**}$, we infer that the functionals (2.26) and (2.28) are both (quasi)norms for all $0 < q \le \infty$.
Remark. Notice that the lack of subadditivity of the decreasing rearrangements gives us that the functionals (2.25) and (2.27) cannot have any (quasi)triangular property; hence, they do not have (quasi)norm structure.
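A two-point example makes the contrast concrete. The sketch below assumes a discrete measure space with $n$ points of mass $1/n$, so that $f^*$ is a step function given by the sorted values of $|f|$ and $f^{**}(k/n)$ is the average of the $k$ largest values; the function names and the sample vectors are illustrative assumptions.

```python
def decreasing_rearrangement(f):
    """Values of f* on the cells [(k-1)/n, k/n): |f| sorted in decreasing order."""
    return sorted((abs(v) for v in f), reverse=True)

def maximal_function(f):
    """Values f**(k/n) for k = 1, ..., n: the average of the k largest |f_j|."""
    out, running = [], 0.0
    for k, v in enumerate(decreasing_rearrangement(f), start=1):
        running += v
        out.append(running / k)
    return out

# Two indicator functions with disjoint supports.
f = [1.0, 0.0]
g = [0.0, 1.0]
h = [a + b for a, b in zip(f, g)]

# (f + g)* exceeds f* + g* on the second cell, so f* is not subadditive.
print(decreasing_rearrangement(h), decreasing_rearrangement(f), decreasing_rearrangement(g))
# (f + g)** stays below f** + g** at every point, as subadditivity requires.
print(maximal_function(h), maximal_function(f), maximal_function(g))
```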
Definition 21 The Lorentz spaces Lp,q(,) are the Riesz spaces associated to the Lorentz (quasi)norms ||.||Lp,q(,):
Theorem 23 For all $f \in L_1(\Omega) + L_\infty(\Omega)$ and $t > 0$,
Proof. We first prove that the integral on the left-hand side is bounded above by the K-functional on the right-hand side. For this we use the subadditivity of the maximal functions ($(f + g)^{**}(t) \le f^{**}(t) + g^{**}(t)$ for all $t > 0$), and the fact that the spaces $L_p(\Omega)$ are rearrangement-invariant.
Given $f \in L_1(\Omega) + L_\infty(\Omega)$, and any decomposition $f = f_1 + f_\infty$ with $f_q \in L_q(\Omega)$, we have
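For the opposite direction, the standard choice is to truncate $f$ at the level $f^*(t)$. The following is a minimal sketch on a discrete measure space with $n$ points of mass $1/n$ and $t = k/n$; it assumes that the identity in question is the classical one relating $K(f,t;L_1,L_\infty)$ to $\int_0^t f^*(s)\,ds$, and all function names and sample data are hypothetical. It checks that this particular decomposition has cost exactly equal to the integral of $f^*$ over $(0,t)$.

```python
def decreasing_rearrangement(f):
    """|f| sorted in decreasing order: the values of f* on the cells [(j-1)/n, j/n)."""
    return sorted((abs(v) for v in f), reverse=True)

def truncation_split(f, k):
    """Split f = f_one + f_inf by clipping f to [-lam, lam], where lam = f*(k/n)."""
    star = decreasing_rearrangement(f)
    lam = star[k] if k < len(f) else 0.0
    f_inf = [max(-lam, min(lam, v)) for v in f]
    f_one = [v - w for v, w in zip(f, f_inf)]
    return f_one, f_inf

def decomposition_cost(f, k):
    """||f_one||_{L_1} + t * ||f_inf||_{L_inf} with t = k/n and point masses 1/n."""
    n = len(f)
    f_one, f_inf = truncation_split(f, k)
    l1_norm = sum(abs(v) for v in f_one) / n
    sup_norm = max(abs(v) for v in f_inf)
    return l1_norm + (k / n) * sup_norm

def integral_of_rearrangement(f, k):
    """Integral of f* over (0, k/n): the sum of the k largest |f_j|, divided by n."""
    return sum(decreasing_rearrangement(f)[:k]) / len(f)

f = [3.0, -1.0, 2.0, 0.5]
for k in range(len(f) + 1):
    print(k, decomposition_cost(f, k), integral_of_rearrangement(f, k))
```

Since the K-functional is an infimum over all decompositions, this shows that it is bounded above by the integral of $f^*$ in this discrete setting.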
Corollary 23.1 Given $0 < \theta < 1$ and $0 < q \le \infty$, we have $(L_1(\Omega),L_\infty(\Omega))_{\theta,q} = L_{p,q}(\Omega)$, where $1/p = 1-\theta$.
Proof. Use the previous Corollary and the Reiteration Theorem 5 (page 23).
Theorem 24 Let (X,||.||X), (X1,||.||X1) and (X2,||.||X2) be complete (quasi)normed spaces, and let 0 < 1 < 2, 0 < < 1 and 0 < q1,q2 <. Denote k(X) = qkk(X); then the following properties hold: (i) ,q = q(X) for all 0 < q <, where = (1 - )1 + 2. (ii) ,q = q, where = (1 - )1 + 2 and 1/q = (1 - )/q1 + /q2.
Lemma 2.23 Given p, > 0, r and d, consider > 0 defined by 1/ = /d + 1/p. Then for all n there exists C > 0 which depends at most on ****what?****, such that ||S||Lp() < 2npC||S||L() for all S Y n.
Throughout this section, d denotes the d-dimensional unit ball in d with respect to the Euclidean norm; its d-dimensional size is denoted d. d-1, the unit sphere, is the boundary of the previous set; and d-1 d-1 is the set of directions in d. We assume the latter to be a connected set for integration purposes.
Definition 22 Given d > 2, a univariate function f and a direction d-1, we define the d-dimensional ridge function on d generated by f with direction by
Proof.
If f Lq(d) and g L1(1), we may estimate,
Theorem 26 Let (X,||.||X) and (Y,||.||Y ) be (quasi)normed linear spaces and F : X Y a linear map. Then the following conditions are equivalent: (i) F is bounded on some closed ball about 0 of positive radius. (ii) F is continuous at 0. (iii) F is uniformly continuous on X. (iv) There exists > 0 such that ||F(x)||Y < ||x||X for all x X. (v) In particular, if Y = , with absolute value for a norm, then each of the above conditions is equivalent to the following: If F0, then the hyper-space Z(F) is closed in X.
Proof.
Remark. The previous result is of limited use when dealing with quasinorms: the existence of nonzero continuous linear functionals still has to be proved in the space of your choice. For instance, in $L_p[0,1]$ for $0 < p < 1$, the only continuous linear functional is the zero functional! A proof of this result (M. M. Day’s Theorem) can be found in [Torc].
Corollary 26.1 All linear functionals on a finite dimensional (quasi)normed linear space are continuous.
Proof. This is a direct consequence of (v) in Theorem 26 above.
Proof. Given two different (quasi)norms $\|\cdot\|_1$ and $\|\cdot\|_2$ in a finite dimensional linear space $X_d$, it is enough to prove that the identity map $F : (X_d,\|\cdot\|_1) \ni x \mapsto x \in (X_d,\|\cdot\|_2)$ is continuous. For that purpose, choose any basis $\{f_k\}_{k=1}^{d}$ of $X_d$, and decompose $F$ in terms of the projections onto the coordinate subspaces $\mathrm{span}(f_k)$: $F(\cdot) = \sum_{k=1}^{d} \mathrm{proj}_k(\cdot) f_k$. We have written $F$ as a finite sum of continuous maps (by the previous Corollary); hence, $F$ must be continuous. Now apply part (iv) of Theorem 26 to get the desired result.
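As a concrete instance of this equivalence, one can compare the $\ell_p$ quasinorm with $p < 1$ against the sup norm on $\mathbb{R}^d$, where the constants are explicit: $\|x\|_\infty \le \|x\|_p \le d^{1/p}\|x\|_\infty$. The sketch below samples random vectors to illustrate these bounds; the choice of the two (quasi)norms, the dimension and the function names are assumptions made for illustration.

```python
import random

def ell_p(x, p):
    """The l_p (quasi)norm (sum |x_k|^p)^(1/p); a quasinorm when 0 < p < 1."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)

def sup_norm(x):
    """The l_infinity norm max |x_k|."""
    return max(abs(v) for v in x)

random.seed(0)
d, p = 5, 0.5
upper_constant = d ** (1.0 / p)   # attained when all coordinates have equal size

ratios = []
for _ in range(1000):
    x = [random.uniform(-1.0, 1.0) for _ in range(d)]
    ratios.append(ell_p(x, p) / sup_norm(x))

# Every sampled ratio lies between the two equivalence constants 1 and d^(1/p).
print(min(ratios), max(ratios), upper_constant)
```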
Corollary 27.1 Every closed bounded set of a finite dimensional (quasi)normed linear space is compact.
Theorem 28 (Hahn-Banach Lemma) Let F : X be a sublinear function on a vector space X over a field , let Y be a subspace of X and let : Y be a linear functional such that |(x)|< F(x) for all x Y . Then there exists a linear map : X which extends and which is dominated by F on all of X.
Let (,) be a measure space; consider the following sets:
A mapping : M+ [0,] is called a quasinorm function, if for f,g M+(,), the following properties hold:
Given such a quasinorm function $\rho$ on $(\Omega,\mu)$, the collection $X(\rho) = \{f \in M(\Omega,\mu) \mid \rho(|f|) < \infty\}$ is called a Riesz space associated to $\rho$. Such spaces inherit special properties from their quasinorm function:
Definition 24 Given a measurable space (,), and f M(,), consider the associated functions:
We say two measurable functions $f,g$ are equimeasurable (and we write $f \sim g$) if their distribution functions coincide, $\mu_f = \mu_g$.
Remark. These three new functions associated to f may be used to perform integral operations on f, but in a simpler setup. The three subsequent results show us how:
Lemma 4.4 Given a simple nonnegative function g M(,), the following estimate holds:
Example. Consider $\Omega = \{1,\dots,n\}$ with measure $\mu : k \mapsto 1/n$ for all $k$. Notice that, given any measurable function $g : k \mapsto g_k \in \mathbb{R}$, any $\tilde g \sim g$ may be obtained by mere permutation of the elements (there exists a permutation $\pi \in S_n$ such that $\tilde g_k = g_{\pi(k)}$). In this case, equality is attained in Corollary 13.1: Given $f = (f_1,\dots,f_n)$, consider a permutation $\sigma$ such that $|f_{\sigma(k)}| \ge |f_{\sigma(k+1)}|$ for all $k$; then we have $\mu_f = \chi_{[0,|f_{\sigma(n)}|)} + \sum_{k=1}^{n-1} \frac{k}{n}\,\chi_{[|f_{\sigma(k+1)}|,|f_{\sigma(k)}|)}$, and $f^* = \sum_{k=1}^{n} |f_{\sigma(k)}|\,\chi_{[(k-1)/n,k/n)}$; therefore, for any given $g = (g_1,\dots,g_n)$, it suffices to find two permutations: first $\sigma'$ permutes the indices so that $|g_{\sigma'(k)}| \ge |g_{\sigma'(k+1)}|$, and then $\pi$ matches $\sigma'(k)$ with $\sigma(k)$. This gives us, for $\tilde g_k = g_{\pi(k)}$, that
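The construction in this example can be verified by brute force for small $n$. The sketch below assumes that the statement being saturated is the Hardy-Littlewood type inequality, in which the integral of $|f\tilde g|$ is at most the integral of $f^* g^*$; the function names and the sample vectors are hypothetical. It compares the largest value of $\frac{1}{n}\sum_k |f_k\, g_{\pi(k)}|$ over all permutations $\pi$ with the value obtained by pairing the sorted sequences.

```python
from itertools import permutations

def sorted_pairing(f, g):
    """(1/n) * sum of f*_k * g*_k: the integral of f* g* for point masses 1/n."""
    f_star = sorted((abs(v) for v in f), reverse=True)
    g_star = sorted((abs(v) for v in g), reverse=True)
    return sum(a * b for a, b in zip(f_star, g_star)) / len(f)

def best_permuted_pairing(f, g):
    """Maximum over all permutations pi of (1/n) * sum_k |f_k * g_pi(k)|."""
    n = len(f)
    return max(sum(abs(f[k] * g[pi[k]]) for k in range(n)) / n
               for pi in permutations(range(n)))

f = [3.0, -1.0, 2.0, 0.5]
g = [1.0, 4.0, -2.0, 0.0]

# The maximum over rearrangements of g equals the pairing of the sorted sequences.
print(best_permuted_pairing(f, g), sorted_pairing(f, g))
```

The brute-force search visits all $n!$ permutations, so it is only meant for very small $n$.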
Definition 25 We say a measure space (,) is resonant if
We will prove that any compact cube $\Omega \subset \mathbb{R}^d$ with the Lebesgue measure is a strongly resonant space, and therefore we may use the previous results to simplify the computation of integral operations on it:
Lemma 4.5 Let $\Omega \subset \mathbb{R}^d$ be a compact cube, and let $\mu = |\cdot|$ denote the Lebesgue measure. Given $f \in M_0(\Omega,\mu)$ and $t \in [0,|\Omega|]$, there exists a measurable subset $\Omega_t \subseteq \Omega$ with $|\Omega_t| = t$, and such that $\int_{\Omega_t} |f(x)|\,dx = \int_0^t f^*(s)\,ds$. Moreover, these sets can be chosen so that $s < t$ implies $\Omega_s \subseteq \Omega_t$.
Notice that the spaces $L_p$ for any $0 < p \le \infty$ are all rearrangement-invariant.
Definition 27 Let (X,||.||X) be a rearrangement-invariant function space over a resonant measure space (,). Consider the function X : [0,()] t||E||X , where E is any measurable subset with (E) = t (notice that, if F , FE and (F) = (E), then F ~ E and they have the same norm).
[Adam] R.A. Adams, “Sobolev Spaces”, Academic Press, New York, 1975.
[deBo] C. de Boor, “Class notes for Math/CS 887, Spring’03”, http://www.cs.wisc.edu/~deboor.
[dBFi] C. de Boor and G.F. Fix, “Spline approximation by quasi-interpolants”, J. Approx. Theory 8 (1973), 19-45.
[BeSh] C. Bennett and R. Sharpley, “Interpolation of Operators”, Academic Press, New York, 1988.
[BrLu] L. Brown and B. Lucier, “Best approximations in L1 are near best in Lp, p < 1”, Proc. Amer. Math. Soc. 120 (1994), 97-100.
[Bure] V.I. Burenkov, “Sobolev Spaces in Domains”, http://www.cf.ac.uk/maths/people/Sobol.pdf.
[Cal1] A.P. Calderón, “Intermediate spaces and interpolation: the complex method”, Studia Math. 24 (1964), 113-190.
[Cal2] A.P. Calderón, “Spaces between $L^1$ and $L^\infty$ and the Theorem of Marcinkiewicz: the complex method”, Studia Math. 26 (1964), 273-279.
[CDeH] A. Cohen, R. DeVore and R. Hochmuth, “Restricted Approximation”, Constr. Approx. 16 (2000), no. 1, 85-113.
[CuSc] H.B. Curry and I.J. Schoenberg, “On Pólya frequency functions. IV. The fundamental spline functions and their limits”, J. Analyse Math. 17 (1966), 71-107.
[DeVo] R. DeVore, “Nonlinear Approximation”, Acta Numerica 7 (1998), 51-150.
[DeLo] R. DeVore and G. Lorentz, “Constructive Approximation”, Springer Grundlehren, Heidelberg, 1993.
[DeP1] R. DeVore and V. Popov, “Interpolation of Besov Spaces”, Trans. Amer. Math. Soc. 305 (1988), 397-414.
[DeP2] R. DeVore and V. Popov, “Interpolation spaces and nonlinear approximation”, Function Spaces and Applications (M. Cwikel et al., eds), Vol. 1302 of Lecture Notes in Mathematics, Springer, Berlin, 191-205.
[DeSh] R. DeVore and R. Sharpley, “Maximal Functions Measuring Smoothness”, Memoirs Vol. 293 (1984), American Mathematical Society, Providence, RI.
[Frie] A. Friedman, “Foundations of Modern Analysis”, Dover, New York, 1982.
[Peet] J. Peetre, “A Theory of Interpolation of Normed Spaces”, Course notes, University of Brasilia (1963).
[Petr] P. Petrushev, “Approximation by Ridge Functions and Neural Networks”, SIAM J. on Math Analysis, 30 (1998) 115-189.
[Torc] A. Torchinsky, “Real Variables”, Addison-Wesley, 1988.