Elements of Approximation Theory: Constructive Approximation and Examples

Francisco Blanco-Silva ^*

^*Department of Mathematics, Purdue University

Abstract

In this survey we introduce the general Theory of Approximation to functions in (quasisemi)normed spaces. The exposition starts with an explanation of the main problem: we impose a certain family of subspaces as our approximants, and we need to obtain a description of the subspace(s) that are approximated by this family with a given approximation order. We also introduce some of the background and basic tools most often used to solve this kind of problem. Approximation Theory improves considerably when some effort is put into the effective construction of the approximants in each given example, rather than simply stating their existence; this is what we call “Constructive Approximation”. The fact that we can handle actual functions allows us to obtain yet more properties of the approximants. It is implicit throughout the exposition how Approximation Theory benefits from other branches of Mathematics, but also how Constructive Approximation can be used to prove results from those other subjects. Finally, we include extensive examples that help us better understand how all this can be achieved.

1 Approximation Theory
1.1 Introduction
1.2 Hardy’s Inequalities
1.3 Best and Near-Best Approximation
1.4 Interpolation Spaces: The Real Method
1.5 Jackson and Bernstein Inequalities
2 Dyadic maximal-smoothness Spline Approximation
2.1 Best and Near-Best L_p Polynomial Approximation in cubes of ℝ^d
2.2 Markov’s Theorem
2.3 Divided Differences
2.4 Univariate Splines
  2.4.1 Definition and Basic Properties
  2.4.2 Quasi-Interpolant Operators
2.5 Tensor Product Splines: Description of the problem
2.6 Sobolev Spaces
  2.6.1 Mollifiers and Infinitely Differentiable Partitions of Unity
  2.6.2 Distributions and Weak Derivatives
  2.6.3 Sobolev Spaces W_p^r(G) and H_p^r(G)
  2.6.4 Properties of Sobolev Spaces in one dimension
2.7 Modulus of smoothness and Besov Spaces
  2.7.1 Definitions and Properties
  2.7.2 Whitney’s Theorem
2.8 Other Seminorms for Besov Spaces
2.9 Further results: K-functional of compatible couples of Besov Spaces
2.10 Lorentz spaces
2.11 Further results: Interpolation of Besov Spaces
2.12 Further Results: Embedding Theorems for Besov Spaces
3 Approximation by Ridge Functions on Dyadic maximal-smoothness splines
3.1 General Theory of Ridge Functions
4 Appendix
4.1 Elements of Functional Analysis
  4.1.1 Linear Transformations
  4.1.2 The Hahn-Banach Theorem
4.2 Rearrangement-Invariant spaces
  4.2.1 Riesz Spaces
  4.2.2 Resonant measure spaces
4.3 To Do

1 Approximation Theory

1.1 Introduction

Let $(X,\|\cdot\|_X)$ be a (quasisemi)normed linear space. Consider a countable family of spaces in $X$, $\{X_n\}_n$, with associated error functionals $E(\cdot,X_n)_X = \inf_{g \in X_n} \|\cdot - g\|_X$, satisfying the following properties:

Homogeneity: $\lambda X_n = X_n$ for all $n$ and all real $\lambda \neq 0$.
C-linearity: There exists $C \geq 1$ (independent of $n$) such that $X_n + X_n = X_{Cn}$ for all $n$.

Local (near)best approximation: Given $f \in X$, there exists an element of (near)best approximation to $f$ from $X_n$ for all $n$.

We say $g \in Y$ is an element of best approximation to $f \in X$ if $\|f - g\|_X = E(f,Y)_X$.

A near-best approximation element to $f$ from $Y$ is by definition any function $g \in Y$ such that $\|f - g\|_X \leq \tau E(f,Y)_X$ for some value $\tau > 0$. In that case, we will often refer to such a function $g$ as a $\tau$-near best approximation element.

Global best approximation: $\lim_n E(f,X_n)_X = 0$ for all $f \in X$.

We will call $\{X_n\}_n$ a family of approximants. For any such family, and given parameter values $\alpha, q > 0$, consider the following (quasi)seminorms and associated subspaces:

$$|f|_{A^\alpha_q(X,X_n)} = \Big\{ \sum_{n=1}^\infty \frac{1}{n} \big( n^\alpha E(f,X_n)_X \big)^q \Big\}^{1/q} \quad \text{for } 0 < q < \infty \tag{1.1}$$

$$|f|_{A^\alpha_\infty(X,X_n)} = \sup_{n \geq 1} \big\{ n^\alpha E(f,X_n)_X \big\} \tag{1.2}$$

$$A^\alpha_q(X,X_n) = \big\{ f \in X : |f|_{A^\alpha_q(X,X_n)} < \infty \big\}$$

We call these the Approximation Spaces associated to the family of approximants $\{X_n\}_n$. They consist of those functions $f \in X$ that are approximated in $X$ by elements of $X_n$ with error of order $O(n^{-\alpha})$ (i.e., there exists $C > 0$ such that $E(f,X_n)_X \leq C n^{-\alpha}$ for all $n$) and “smoothness” $q$.

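Lemma 1.1 If the sequence $(E(f,X_n)_X)_n$ is monotone decreasing, then the (quasi)seminorms $|\cdot|_{A^\alpha_q(X,X_n)}$ are equivalent to the following:

$$|f|_{A^\alpha_q(X,X_n)} \asymp \Big\{ \sum_{k=0}^\infty \big( 2^{k\alpha} E(f,X_{2^k})_X \big)^q \Big\}^{1/q} \quad \text{for } 0 < q < \infty \tag{1.3}$$

$$|f|_{A^\alpha_\infty(X,X_n)} \asymp \sup_{k \geq 0} 2^{k\alpha} E(f,X_{2^k})_X \tag{1.4}$$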

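Proof. For any $k \geq 1$ and any $2^{k-1} \leq n < 2^k$, the monotonicity of the errors gives the estimates

$$2^{(k-1)\alpha} E(f,X_{2^k})_X \leq n^\alpha E(f,X_n)_X \leq 2^{k\alpha} E(f,X_{2^{k-1}})_X;$$

therefore, since each block $2^{k-1} \leq n < 2^k$ has $2^{k-1}$ terms and $2^{-k} < 1/n \leq 2^{-(k-1)}$ on it,

$$\sum_{n=2^{k-1}}^{2^k-1} \frac{1}{n} \big( n^\alpha E(f,X_n)_X \big)^q \leq \sum_{n=2^{k-1}}^{2^k-1} \frac{1}{2^{k-1}} \big( 2^{k\alpha} E(f,X_{2^{k-1}})_X \big)^q = 2^{\alpha q} \big( 2^{(k-1)\alpha} E(f,X_{2^{k-1}})_X \big)^q,$$

and similarly,

$$\sum_{n=2^{k-1}}^{2^k-1} \frac{1}{n} \big( n^\alpha E(f,X_n)_X \big)^q \geq \sum_{n=2^{k-1}}^{2^k-1} \frac{1}{2^k} \big( 2^{(k-1)\alpha} E(f,X_{2^k})_X \big)^q = 2^{-\alpha q - 1} \big( 2^{k\alpha} E(f,X_{2^k})_X \big)^q.$$

Adding over $k \geq 1$ all the terms of the series that defines the seminorm associated to the spaces $A^\alpha_q(X,X_n)$, and applying the estimates above (the term $k = 0$ of (1.3) is the term $n = 1$ of (1.1)), we get the desired result for $0 < q < \infty$; the case $q = \infty$ is analogous. [#]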
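The equivalence in Lemma 1.1 can be checked numerically on a model sequence. The sketch below is only an illustration: the error sequence $E(f,X_n)_X = n^{-\beta}$, the truncation level and the names `seminorm_full` and `seminorm_dyadic` are assumptions made for this example and do not come from the text.

```python
def seminorm_full(errors, alpha, q):
    """Truncated version of (1.1): sum over n of (1/n) * (n^alpha * E_n)^q, to the power 1/q."""
    total = sum((n ** alpha * e) ** q / n for n, e in enumerate(errors, start=1))
    return total ** (1.0 / q)


def seminorm_dyadic(errors, alpha, q):
    """Truncated version of (1.3): only the dyadic indices n = 2^k are used."""
    total, k = 0.0, 0
    while 2 ** k <= len(errors):
        # errors[2**k - 1] is E_n for n = 2^k (the list is 0-indexed)
        total += (2 ** (k * alpha) * errors[2 ** k - 1]) ** q
        k += 1
    return total ** (1.0 / q)


# Hypothetical model sequence E_n = n^(-beta), monotone decreasing.
beta, alpha, q = 1.5, 1.0, 2.0
errors = [n ** (-beta) for n in range(1, 2 ** 14 + 1)]
full = seminorm_full(errors, alpha, q)
dyadic = seminorm_dyadic(errors, alpha, q)
ratio = full / dyadic
print(full, dyadic, ratio)
```

For this model sequence both quantities stay finite exactly when $\alpha < \beta$, and their quotient stays between the two constants produced in the proof.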

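Lemma 1.2 Under the same hypothesis as before, and for any value $\alpha > 0$, the following inclusion holds for all $0 < q < p \leq \infty$:

$$A^\alpha_q(X,X_n) \subset A^\alpha_p(X,X_n).$$

Proof. This is just an application of the well known inclusions $\ell_q \subset \ell_p$ for $0 < q < p \leq \infty$. Let $\mu$ be the counting measure on $\mathbb{N}$, and consider the function $\varphi(k) = 2^{k\alpha} E(f;X_{2^k})_X \in \mathbb{R}^+$; then,

$$|f|_{A^\alpha_q(X,X_n)} \asymp \Big\{ \sum_{k=0}^\infty \big( 2^{k\alpha} E(f;X_{2^k})_X \big)^q \Big\}^{1/q} = \Big( \int_{\mathbb{N}} \varphi^q \, d\mu \Big)^{1/q}$$

If $f \in A^\alpha_q$, then certainly $\varphi \in \ell_q \cap \ell_\infty$, and therefore, for $p < \infty$,

$$|f|_{A^\alpha_p(X,X_n)} \asymp \Big( \int_{\mathbb{N}} \varphi^p \, d\mu \Big)^{1/p} \leq \|\varphi\|_{\ell_\infty}^{1-q/p} \, \|\varphi\|_{\ell_q}^{q/p} \asymp |f|_{A^\alpha_\infty(X,X_n)}^{1-q/p} \, |f|_{A^\alpha_q(X,X_n)}^{q/p}$$ [#]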

Remark. In the following sections we will learn to find descriptions of these spaces in terms of classical spaces. The main tools used in this sense are given in the following order:

The (quasi)seminorms (1.1), (1.2), (1.3) and (1.4) are discretizations of p-norms over $\mathbb{R}^+$ with the Haar measure $dx/x$. We have included some useful results related to these measures in §1.2.
§1.3 deals with the existence and properties of general (abstract) best and near-best approximations.
Both §1.4 and §1.5 introduce us to Interpolation Spaces and how these are related to our problem of approximation via Jackson and Bernstein’s inequalities.

1.2 Hardy’s Inequalities

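Theorem 1 (Hardy) Given $\alpha > 0$ and $1 \leq q < \infty$, the following inequalities hold for each nonnegative measurable function $\varphi$:

$$\int_0^\infty \Big( t^{-\alpha} \int_0^t \varphi(s) \, \frac{ds}{s} \Big)^q \frac{dt}{t} \leq \frac{1}{\alpha^q} \int_0^\infty \big[ t^{-\alpha} \varphi(t) \big]^q \frac{dt}{t} \tag{1.5}$$

$$\int_0^\infty \Big( t^{\alpha} \int_t^\infty \varphi(s) \, \frac{ds}{s} \Big)^q \frac{dt}{t} \leq \frac{1}{\alpha^q} \int_0^\infty \big[ t^{\alpha} \varphi(t) \big]^q \frac{dt}{t} \tag{1.6}$$

For $q = \infty$, the integrals are replaced by $L_\infty$ norms:

$$\sup_{t>0} \Big\{ t^{-\alpha} \int_0^t \varphi(s) \, \frac{ds}{s} \Big\} \leq \frac{1}{\alpha} \, \| t^{-\alpha} \varphi(t) \|_{L_\infty(0,\infty)} \tag{1.7}$$

$$\sup_{t>0} \Big\{ t^{\alpha} \int_t^\infty \varphi(s) \, \frac{ds}{s} \Big\} \leq \frac{1}{\alpha} \, \| t^{\alpha} \varphi(t) \|_{L_\infty(0,\infty)} \tag{1.8}$$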

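Proof. Let us prove the estimate (1.5). For any value $\lambda > 0$ we estimate first the interior integral using Hölder’s Inequality; let $p$ be the conjugate exponent of $q$:

$$\int_0^t \varphi(s) \, \frac{ds}{s} = \int_0^t s^{-\lambda} \varphi(s) \, s^{\lambda-1} \, ds \leq \Big( \int_0^t \big[ s^{-\lambda} \varphi(s) \big]^q ds \Big)^{1/q} \Big( \int_0^t s^{p(\lambda-1)} \, ds \Big)^{1/p}$$

The second integral can be computed provided $\lambda < 1$; in that case we obtain

$$\int_0^t s^{-p(1-\lambda)} \, ds = C_{q,\lambda} \, s^{1-p(1-\lambda)} \Big]_{s=0}^{t}$$

and if $\lambda > 1/q$, we have $\int_0^t s^{-p(1-\lambda)} \, ds = C_{q,\lambda} \, t^{1-p(1-\lambda)}$. We can then estimate the left-hand side of (1.5) using this result and a change in the order of integration:

$$\begin{aligned}
\int_0^\infty t^{-\alpha q} \Big( \int_0^t \varphi(s) \, \frac{ds}{s} \Big)^q \frac{dt}{t}
&\leq C_{q,\lambda} \int_0^\infty t^{-\alpha q - 1} \, t^{q/p - q + \lambda q} \int_0^t \big[ s^{-\lambda} \varphi(s) \big]^q ds \, dt \\
&= C_{q,\lambda} \int_0^\infty t^{q(\lambda-\alpha)-2} \int_0^t \big[ s^{-\lambda} \varphi(s) \big]^q ds \, dt \\
&= C_{q,\lambda} \int_0^\infty \big[ s^{-\lambda} \varphi(s) \big]^q \int_s^\infty t^{q(\lambda-\alpha)-2} \, dt \, ds
\end{aligned}$$

The latter integral can be computed provided $\lambda < 1/q + \alpha$; we get in that case

$$\int_0^\infty \Big[ t^{-\alpha} \int_0^t \varphi(s) \, \frac{ds}{s} \Big]^q \frac{dt}{t} \leq C_{q,\alpha,\lambda} \int_0^\infty \big[ s^{-\alpha} \varphi(s) \big]^q \frac{ds}{s}$$

In order to get rid of the dependence on $\lambda$ in the constant, we may choose this parameter so that it depends solely on $\alpha$ and $q$, besides satisfying the constraints we have imposed. The obvious choice is $\lambda = 1/q + \alpha/p$, and in that case both integrals above contribute a factor that is a power of $1/\alpha$, so that $C_{q,\alpha} = \alpha^{-q/p} \alpha^{-1} = \alpha^{-q}$.

The remaining estimates can be obtained from this one by changes of variable or taking limits, so we skip their proofs. [#]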
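As a sanity check of (1.5), both sides can be approximated by a midpoint rule in the variable $u = \log t$, for which $dt/t = du$. This is only a numerical sketch: the test function $\varphi(t) = t^2 e^{-t}$, the truncation of the range to $[e^{-30}, e^{30}]$ and the name `hardy_sides` are assumptions chosen for the illustration.

```python
import math


def hardy_sides(phi, alpha, q, lo=-30.0, hi=30.0, steps=60000):
    """Midpoint-rule approximation of both sides of (1.5), in the variable u = log t."""
    du = (hi - lo) / steps
    inner = 0.0  # running value of the inner integral of phi(s) ds/s over (0, t)
    lhs = rhs = 0.0
    for i in range(steps):
        t = math.exp(lo + (i + 0.5) * du)
        inner += 0.5 * phi(t) * du  # first half of the current cell
        lhs += (t ** (-alpha) * inner) ** q * du
        rhs += (t ** (-alpha) * phi(t)) ** q * du
        inner += 0.5 * phi(t) * du  # second half of the current cell
    return lhs, rhs / alpha ** q


def phi(t):
    """Hypothetical test function, nonnegative and integrable against ds/s."""
    return t * t * math.exp(-t)


lhs, rhs = hardy_sides(phi, alpha=1.0, q=2.0)
print(lhs, rhs)
```

With these choices the right-hand side can be computed by hand, $\int_0^\infty t e^{-2t} \, dt = 1/4$, which gives an independent check of the quadrature.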

Remark. It is possible to discretize integrals of the kind $\int_0^\infty [t^{-\alpha} \varphi(t)]^q \, \frac{dt}{t}$ when the functions $\varphi(t)$ are nonnegative and monotone, using the same technique we used in the proof of Lemma 1.1:

$$\int_0^\infty \big[ t^{-\alpha} \varphi(t) \big]^q \frac{dt}{t} = \sum_{k \in \mathbb{Z}} \int_{2^k}^{2^{k+1}} \big[ t^{-\alpha} \varphi(t) \big]^q \frac{dt}{t} \asymp \sum_{k \in \mathbb{Z}} 2^{\alpha q k} \varphi(2^{-k})^q.$$

Associated to this discrete functional, we have the following result, which will allow us to extend the previous theorem to $0 < q < 1$ for nonnegative monotone functions.

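Lemma 1.3 (Discrete Hardy’s Inequalities) Let $a = (a_n)_n$, $b = (b_n)_n$ be two nonnegative sequences such that there exist $C_0, \mu, \lambda > 0$ so that for all $n$, either

$$b_n \leq C_0 \Big( \sum_{k=n}^\infty a_k^\mu \Big)^{1/\mu} \quad \text{or} \quad b_n \leq C_0 \, 2^{-n\lambda} \Big\{ \sum_{k=-\infty}^n \big( 2^{k\lambda} a_k \big)^\mu \Big\}^{1/\mu}$$

then, for all $q > 0$, and $\alpha > 0$ (in the first case) or $0 < \alpha < \lambda$ (in the second),

$$\sum_{n \in \mathbb{Z}} \big( 2^{n\alpha} b_n \big)^q \leq C_0^q \, C_{q,\alpha} \sum_{n \in \mathbb{Z}} \big( 2^{n\alpha} a_n \big)^q.$$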

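Proof. Let us assume that the first condition is satisfied. From the inclusions $\ell_\nu \subset \ell_\mu$ for $0 < \nu < \mu \leq \infty$, we infer that it must also be $b_n \leq C_0 \big( \sum_{k=n}^\infty a_k^\nu \big)^{1/\nu}$ for all $\nu < \mu$. We can therefore assume that $\mu < q$ (if it is not, then we can certainly pick $\nu < q \leq \mu$). We are now able to use Hölder’s Inequality; let $0 < \beta < \alpha$, and let $r > 0$ so that $\mu/q + \mu/r = 1$:

$$\sum_{k=n}^\infty a_k^\mu = \sum_{k=n}^\infty \big( 2^{\beta k} a_k 2^{-\beta k} \big)^\mu = \sum_{k=n}^\infty \big( 2^{\beta k} a_k \big)^\mu 2^{-\beta k \mu} \leq \Big\{ \sum_{k=n}^\infty \big( 2^{\beta k} a_k \big)^{\mu q/\mu} \Big\}^{\mu/q} \Big\{ \sum_{k=n}^\infty 2^{-\beta k \mu r/\mu} \Big\}^{\mu/r};$$

therefore,

$$b_n \leq C_0 \Big\{ \sum_{k=n}^\infty \big( 2^{\beta k} a_k \big)^q \Big\}^{1/q} \Big\{ \sum_{k=n}^\infty 2^{-\beta k r} \Big\}^{1/r} = C_0 \Big( \frac{2^{-\beta n r}}{1 - 2^{-\beta r}} \Big)^{1/r} \Big\{ \sum_{k=n}^\infty \big( 2^{\beta k} a_k \big)^q \Big\}^{1/q}.$$

We have then

$$\begin{aligned}
\sum_{n \in \mathbb{Z}} \big( 2^{n\alpha} b_n \big)^q
&\leq C_0^q \, C_{\beta,q,\mu} \sum_{n \in \mathbb{Z}} 2^{\alpha n q} 2^{-\beta n q} \sum_{k=n}^\infty \big( 2^{\beta k} a_k \big)^q \\
&= C_0^q \, C_{\beta,q,\mu} \sum_{n \in \mathbb{Z}} \sum_{k=n}^\infty 2^{nq(\alpha-\beta)} 2^{\beta k q} a_k^q \\
&= C_0^q \, C_{\beta,q,\mu} \sum_{k \in \mathbb{Z}} \sum_{n=-\infty}^{k} 2^{nq(\alpha-\beta)} \big( 2^{\beta k} a_k \big)^q \\
&= C_0^q \, C_{\beta,q,\mu} \sum_{k \in \mathbb{Z}} \frac{2^{q(\alpha-\beta)k}}{1 - 2^{-q(\alpha-\beta)}} \big( 2^{\beta k} a_k \big)^q \\
&= C_0^q \, C_{\alpha,\beta,q,\mu} \sum_{k \in \mathbb{Z}} \big( 2^{\alpha k} a_k \big)^q
\end{aligned}$$

As in the proof of the previous Theorem, we may choose $\beta$ depending only on $\alpha$, and $\mu$ depending solely on $q$, hence narrowing the dependence of the constants on parameters.

A similar proof serves to show that the second condition also gives the same estimate, but this time only for values $0 < \alpha < \lambda$. [#]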
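A small numerical instance of the first case of Lemma 1.3 (with $\mu = 1$ and $C_0 = 1$) may help to see what the statement says. Everything specific below is an assumption made for the example: the sequence $a_k = 4^{-k}$ for $k \geq 0$ (zero otherwise), the index window $[-40, 40]$ and the helper name `weighted_sum`.

```python
def weighted_sum(seq, alpha, q, n_min):
    """Sum over n of (2^(n*alpha) * seq_n)^q, where seq is indexed starting at n_min."""
    return sum((2.0 ** (n * alpha) * s) ** q for n, s in enumerate(seq, start=n_min))


n_min, n_max = -40, 40
a = [4.0 ** (-k) if k >= 0 else 0.0 for k in range(n_min, n_max + 1)]

# b_n = sum of a_k over k >= n: tail sums accumulated from the right
b, tail = [], 0.0
for value in reversed(a):
    tail += value
    b.append(tail)
b.reverse()

alpha, q = 1.0, 1.0
lhs = weighted_sum(b, alpha, q, n_min)
rhs = weighted_sum(a, alpha, q, n_min)
ratio = lhs / rhs
print(lhs, rhs, ratio)
```

Here the two sums can also be evaluated by hand (geometric series), and their quotient plays the role of the constant $C_{q,\alpha}$ for this particular pair of sequences.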

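Theorem 2 Given $\alpha > 0$ and $0 < q < 1$, there exists $C > 0$ that depends at most on $\alpha$ and $q$ such that the following inequalities hold for each nonnegative monotone function $\varphi$:

$$\int_0^\infty \Big( t^{-\alpha} \int_0^t \varphi(s) \, \frac{ds}{s} \Big)^q \frac{dt}{t} \leq C \int_0^\infty \big[ t^{-\alpha} \varphi(t) \big]^q \frac{dt}{t} \tag{1.9}$$

$$\int_0^\infty \Big( t^{\alpha} \int_t^\infty \varphi(s) \, \frac{ds}{s} \Big)^q \frac{dt}{t} \leq C \int_0^\infty \big[ t^{\alpha} \varphi(t) \big]^q \frac{dt}{t} \tag{1.10}$$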

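Proof. Given $t > 0$, there exists $n \in \mathbb{Z}$ such that $2^{-(n+1)} < t \leq 2^{-n}$; therefore, using that $\varphi$ is nondecreasing,

$$\int_0^t \varphi(s) \, \frac{ds}{s} \leq \int_0^{2^{-n}} \varphi(s) \, \frac{ds}{s} = \sum_{k=n}^\infty \int_{2^{-(k+1)}}^{2^{-k}} \varphi(s) \, \frac{ds}{s} \leq \sum_{k=n}^\infty \varphi(2^{-k}) \, \log(s) \Big]_{2^{-(k+1)}}^{2^{-k}} = (\log 2) \sum_{k=n}^\infty \varphi(2^{-k}).$$

Denote $b_n = \int_0^{2^{-n}} \varphi(s) \, \frac{ds}{s}$, $a_n = \varphi(2^{-n})$, and use the previous lemma to estimate the integral in the left-hand side of (1.9):

$$\begin{aligned}
\int_0^\infty \Big( t^{-\alpha} \int_0^t \varphi(s) \, \frac{ds}{s} \Big)^q \frac{dt}{t}
&= \sum_{n \in \mathbb{Z}} \int_{2^{-(n+1)}}^{2^{-n}} \Big( t^{-\alpha} \int_0^t \varphi(s) \, \frac{ds}{s} \Big)^q \frac{dt}{t} \\
&\leq \sum_{n \in \mathbb{Z}} \big( 2^{\alpha(n+1)} b_n \big)^q \int_{2^{-(n+1)}}^{2^{-n}} \frac{dt}{t} \\
&= (\log 2) \, 2^{\alpha q} \sum_{n \in \mathbb{Z}} \big( 2^{\alpha n} b_n \big)^q \\
&\leq C_{q,\alpha} \sum_{n \in \mathbb{Z}} \big( 2^{\alpha n} a_n \big)^q \quad \text{(use previous lemma)}
\end{aligned}$$

With the same technique, we find a lower estimate on the right-hand side of (1.9) in the same terms:

$$\int_0^\infty \big( t^{-\alpha} \varphi(t) \big)^q \frac{dt}{t} = \sum_{n \in \mathbb{Z}} \int_{2^{-n}}^{2^{-(n-1)}} \big( t^{-\alpha} \varphi(t) \big)^q \frac{dt}{t} \geq \frac{\log 2}{2^{\alpha q}} \sum_{n \in \mathbb{Z}} \big( 2^{\alpha n} a_n \big)^q.$$

This gives us the first estimate. For the second, a change of variables in this one suffices. [#]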

1.3 Best and Near-Best Approximation

Definition 1 Given a (quasi)normed space $(X,\|\cdot\|_X)$ and a subspace $Y \subset X$, we say an operator $\varphi : X \to Y$ is of best approximation if $\|f - \varphi(f)\|_X = E(f,Y)_X$ for all $f \in X$. Similarly, for a given $\tau > 0$, we say $\varphi$ is an operator of $\tau$ near-best approximation if $\|f - \varphi(f)\|_X \leq \tau E(f,Y)_X$ for all $f \in X$.

Lemma 1.4 Let $(X,d)$ be a metric space and let $Y$ be a compact subset of $X$; then for each $f \in X$ there exists an element of best approximation to $f$ from $Y$.

Proof. Consider for each $n$ an element $g_n \in Y$ such that $d(f,g_n) < \inf_{g \in Y} d(f,g) + 1/n$. Any sequence in a compact set has at least one limit point $g_0 \in Y$; by the continuity of the distance, $d(f,g_0) = \inf_{g \in Y} d(f,g)$, so this element is a best approximation to $f$ from $Y$. [#]

Theorem 3 Let $X$ be a (quasi)normed space. For each finite dimensional subspace $X_d$ of $X$ and each $f \in X$, there is a best approximation to $f$ from $X_d$.

Proof. If $X_d = \{0\}$ then there is nothing to prove. Otherwise, consider for each $f \in X$ the set $Y_f = \{ y \in X_d : \|f - y\|_X \leq \|f\|_X \}$. $Y_f$ is a closed set:

$$Y_f = F^{-1}(-\infty, \|f\|_X], \quad \text{where } F : X_d \ni y \mapsto \|f - y\|_X \in \mathbb{R}$$

$Y_f$ is also bounded, since for any $y \in Y_f$,

$$\|y\|_X = \|y - f + f\|_X \leq C_X \big( \|y - f\|_X + \|f\|_X \big) \leq 2 C_X \|f\|_X$$

By Corollary 27.1, $Y_f$ must be compact, and therefore a best approximation from $Y_f$ exists. Since $0 \in Y_f$ and every $y \in X_d \setminus Y_f$ satisfies $\|f - y\|_X > \|f\|_X$, this element is also a best approximation from $X_d$. [#]

Lemma 1.5 Let $(X,\|\cdot\|_X)$ be a (quasi)normed space, and $Z \subset X$ a linear subspace of $X$ such that there exists an element of best approximation from it to every element in $X$. If $x \in X$ and $z \in Z$ is a near-best approximation to $x$ from $Z$ with constant $\tau \geq 1$, then for each $y \in X$ there is an element $z' \in Z$ of near-best approximation to $y$ from $Z$ such that $z - z'$ is also a near-best approximation to $x - y$ from $Z$, with constant $\tau'$ depending at most on $X$ and $\tau$.

Proof. Given $x, y \in X$, assume $E(x-y,Z)_X \leq E(y,Z)_X$ (it is not a loss of generality, since one can switch elements). Let $z \in Z$ be a near-best approximation to $x$ from $Z$, and $z'', z''' \in Z$ elements of best approximation from $Z$ to $y - x$ and $y$ respectively. We have then for $z' = z'' + z$,

$$\begin{aligned}
\|y - z'\|_X &= \|y - x - z'' + x - z\|_X \\
&\leq C_X \big\{ \|y - x - z''\|_X + \|x - z\|_X \big\} \\
&\leq C_X \big\{ E(y - x,Z)_X + \tau E(x,Z)_X \big\} \\
&\leq C_{X,\tau} \big\{ E(y - x,Z)_X + \|x - (z''' - z'')\|_X \big\} \\
&= C_{X,\tau} \big\{ E(y - x,Z)_X + \|x - y + y + z'' - z'''\|_X \big\} \\
&\leq C_{X,\tau} \big\{ E(y - x,Z)_X + C_X \big[ \|x - y + z''\|_X + \|y - z'''\|_X \big] \big\} \\
&\leq C_{X,\tau} \big\{ E(x - y,Z)_X + E(y,Z)_X \big\} \\
&\leq C_{X,\tau} E(y,Z)_X
\end{aligned}$$

Moreover, $z - z' = -z''$ is a best approximation to $x - y$ from $Z$. [#]

Definition 2 A subset $Y$ of a (quasi)normed space $X$ is said to be a unicity space if for each $f \in X$ there exists a unique element of best approximation from $Y$.

Example. The spaces $L_p$ are unicity spaces for $1 < p < \infty$, but not for $0 < p \leq 1$ nor $p = \infty$. A characterization of spaces (at least normed) with the unicity property can be made through the use of “strict convexity”:

Definition 3 A normed space is said to be strictly convex if the following property holds:

$$\text{Given } f \neq g \text{ with } \|f\|_X = \|g\|_X = 1, \text{ and } 0 < \lambda < 1, \text{ then } \|\lambda f + (1 - \lambda) g\|_X < 1$$

Lemma 1.6 If $X$ is strictly convex, then the best approximation from a linear subspace $Y \subset X$ is unique.

Proof. Assume the result is not true, and there exists a function $f \in X$ and two different elements $g_1, g_2 \in Y$ such that $\|f - g_1\|_X = \|f - g_2\|_X = E(f,Y)_X > 0$. In that case, since $\frac12 (g_1 + g_2) \in Y$,

$$1 = \frac{E(f,Y)_X}{E(f,Y)_X} \leq \frac{1}{E(f,Y)_X} \Big\| f - \frac12 (g_1 + g_2) \Big\|_X = \Big\| \frac12 \Big( \frac{f - g_1}{E(f,Y)_X} \Big) + \frac12 \Big( \frac{f - g_2}{E(f,Y)_X} \Big) \Big\|_X < 1,$$

a contradiction. [#]
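The role of strict convexity can be seen already in $\mathbb{R}^2$. The sketch below searches, on a grid, for the best approximations to $f = (0,1)$ from the line $Y = \mathrm{span}\{(1,1)\}$ in the $\ell_1$ and $\ell_2$ norms. The choice of $f$ and $Y$, the grid and the names `error` and `minimisers` are assumptions made for this illustration only.

```python
def error(c, p):
    """l_p distance in R^2 from f = (0, 1) to the element c*(1, 1) of Y = span{(1, 1)}."""
    residual = (abs(0.0 - c), abs(1.0 - c))
    if p == float("inf"):
        return max(residual)
    return sum(r ** p for r in residual) ** (1.0 / p)


def minimisers(p, grid=1001, tol=1e-12):
    """Grid points c in [-1, 2] whose error is within tol of the smallest error found."""
    cs = [-1.0 + 3.0 * i / (grid - 1) for i in range(grid)]
    best = min(error(c, p) for c in cs)
    return [c for c in cs if error(c, p) <= best + tol]


for p in (1, 2):
    found = minimisers(p)
    print(p, len(found), min(found), max(found))
```

In the $\ell_1$ norm (not strictly convex) the error $|c| + |1 - c|$ is constant on $0 \leq c \leq 1$, so a whole segment of best approximations appears, whereas in the strictly convex $\ell_2$ norm only $c = 1/2$ survives.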

Proposition 1 Given a finite dimensional unicity subspace $X_d \subset X$, the operator of best approximation is continuous.

Proof. Let $\varphi : X \to X_d$ be the operator of best approximation, and let $C_X \geq 1$ be the constant in the (quasi)triangular inequality offered by the (quasi)norm $\|\cdot\|_X$. Given $\varepsilon > 0$, let $\delta = \varepsilon (3 C_X^2)^{-1}$. Notice that, if $f, g \in X$ verify $\|f - g\|_X < \delta$, then we have

$$\begin{aligned}
\|\varphi(f) - \varphi(g)\|_X &= \|\varphi(f) - f + f - g + g - \varphi(g)\|_X \\
&\leq C_X^2 \big\{ \|\varphi(f) - f\|_X + \|f - g\|_X + \|g - \varphi(g)\|_X \big\} \\
&\leq 3 C_X^2 \|f - g\|_X < \varepsilon
\end{aligned}$$ [#]

1.4 Interpolation Spaces: The Real Method

Definition 4 A compatible couple is a pair of (quasisemi)normed linear spaces, $(X,\|\cdot\|_X)$ and $(Y,\|\cdot\|_Y)$, continuously embedded in a Hausdorff topological linear space $H$.
We define the sum and intersection of such a couple as

$$(X + Y, \|\cdot\|_{X+Y}) \text{ with (quasisemi)norm } \|f\|_{X+Y} = \inf_{\substack{f = f_X + f_Y \\ f_X \in X, \, f_Y \in Y}} \big\{ \|f_X\|_X + \|f_Y\|_Y \big\}$$

$$(X \cap Y, \|\cdot\|_{X \cap Y}) \text{ with (quasisemi)norm } \|f\|_{X \cap Y} = \max \big\{ \|f\|_X, \|f\|_Y \big\}$$

Definition 5 Given a compatible couple $(X,Y)$, we say a linear operator $T : X + Y \to X + Y$ is admissible for the couple if
(i) $T(X) \subset X$, and $T|_X$ is bounded.
(ii) $T(Y) \subset Y$, and $T|_Y$ is bounded.

Definition 6 A (quasisemi)normed linear space $(Z,\|\cdot\|_Z)$ is intermediate between $X$ and $Y$ (or for the couple given by those spaces), if
(i) $X \cap Y \subset Z \subset X + Y$.
(ii) $X \cap Y$ is continuously embedded in $Z$: there exists $C_{X \cap Y} > 0$ such that $\|f\|_Z \leq C_{X \cap Y} \|f\|_{X \cap Y}$ for all $f \in X \cap Y$.
(iii) $Z$ is continuously embedded in $X + Y$: there exists $C_{X+Y} > 0$ such that $\|f\|_{X+Y} \leq C_{X+Y} \|f\|_Z$ for all $f \in Z$.

An intermediate space $Z$ of a compatible couple $(X,Y)$ is an interpolation space for the couple if $T(Z) \subset Z$ for all admissible operators $T$.

Remark. In the 70’s there were primarily two methods for constructing interpolation spaces of a compatible couple: the complex method of Calderón [Cal1], and the real method of Lions and Peetre [Peet]. We are mainly interested in the latter, since it uses as building blocks quasi-seminorms similar to the ones in the description of the Approximation Spaces above.

Definition 7 We define the K-functional of a compatible couple $(X,Y)$ as follows:

$$K(f,t;X,Y) = \inf_{\substack{f = f_X + f_Y \\ f_X \in X, \, f_Y \in Y}} \big\{ \|f_X\|_X + t \|f_Y\|_Y \big\}$$

for each $f \in X + Y$ and $t > 0$.

Lemma 1.7 (Properties of the K-functional) Let $(X,Y)$ be a compatible couple:
(i) $K(f,\cdot;X,Y)$ is a continuous subadditive convex-down monotone nondecreasing function of $t$ that verifies $K(f,t;X,Y) = t K(f,1/t;Y,X)$, and $K(f,nt;X,Y) \leq n K(f,t;X,Y)$ for all $n \in \mathbb{N}$ and $t > 0$.
(ii) For each $t > 0$, $K(\cdot,t;X,Y)$ is a (quasi)seminorm equivalent to $\|\cdot\|_{X+Y}$.
(iii) Let $T$ be an admissible operator; then for each $f \in X + Y$ and $t > 0$,

$$K(Tf,t;X,Y) \leq M \cdot K(f,t;X,Y),$$

where $M = \max \big\{ \|T|_X\|_{B(X,X)}, \|T|_Y\|_{B(Y,Y)} \big\}$.

Proof. Notice that, given $f \in X + Y$, the function $K(f,\cdot;X,Y)$ is trivially nonnegative and monotone nondecreasing. Its concavity is proved in the following way: Given $t_1, t_2 > 0$ and $0 < \lambda < 1$, and any decomposition $f = f_X + f_Y$, we have, for $t = \lambda t_1 + (1 - \lambda) t_2$,

$$\begin{aligned}
\|f_X\|_X + t \|f_Y\|_Y &= [\lambda + (1 - \lambda)] \|f_X\|_X + [\lambda t_1 + (1 - \lambda) t_2] \|f_Y\|_Y \\
&= \lambda \big\{ \|f_X\|_X + t_1 \|f_Y\|_Y \big\} + (1 - \lambda) \big\{ \|f_X\|_X + t_2 \|f_Y\|_Y \big\} \\
&\geq \lambda K(f,t_1;X,Y) + (1 - \lambda) K(f,t_2;X,Y)
\end{aligned}$$

and it is enough to take the infimum over all decompositions on the left-hand side (a more intuitive way to see that $K(f,\cdot;X,Y)$ is indeed a convex-down function is to realize that we construct it as the infimum of lines with slopes $\|f_Y\|_Y$ and y-intercepts $\|f_X\|_X$, where the previous functions come from the decompositions of $f$ in $X + Y$).
The identity $K(f,t;X,Y) = t K(f,1/t;Y,X)$ is also trivial:

$$K(f,t;X,Y) = \inf_{\substack{f = f_X + f_Y \\ f_X \in X, \, f_Y \in Y}} \big\{ \|f_X\|_X + t \|f_Y\|_Y \big\} = t \Big( \inf_{\substack{f = f_X + f_Y \\ f_X \in X, \, f_Y \in Y}} \Big\{ \frac{1}{t} \|f_X\|_X + \|f_Y\|_Y \Big\} \Big).$$

The subadditivity follows from the previous identity; let $g(s) = K(f,1/s;Y,X) = K(f,s;X,Y)/s$. As $K(f,\cdot;Y,X)$ is nondecreasing, $g$ must be nonincreasing, and therefore, if $t_1, t_2 > 0$,

$$\frac{K(f,t_k;X,Y)}{t_k} \geq \frac{K(f,t_1 + t_2;X,Y)}{t_1 + t_2} \quad \text{for } k = 1,2$$

$$K(f,t_1;X,Y) + K(f,t_2;X,Y) \geq t_1 \frac{K(f,t_1 + t_2;X,Y)}{t_1 + t_2} + t_2 \frac{K(f,t_1 + t_2;X,Y)}{t_1 + t_2} = K(f,t_1 + t_2;X,Y)$$

From the last estimate, we find $K(f,nt;X,Y) \leq n K(f,t;X,Y)$ trivially. The continuity also follows from the subadditivity, since given $h > 0$,

$$K(f,t + h;X,Y) - K(f,t;X,Y) \leq K(f,h;X,Y).$$

To prove property (ii), we realize first that both nonnegativity and homogeneity of $K(\cdot,t;X,Y)$ are trivial. Let $C = \max\{C_X, C_Y\} > 0$ (associated to the constants from the definition of the (quasisemi)norms of $X$ and $Y$); then, for each $f, g \in X + Y$ and any decomposition $f = f_X + f_Y$, $g = g_X + g_Y$:

$$K(f + g,t;X,Y) \leq \|f_X + g_X\|_X + t \|f_Y + g_Y\|_Y \leq C \big\{ (\|f_X\|_X + t \|f_Y\|_Y) + (\|g_X\|_X + t \|g_Y\|_Y) \big\}$$

The equivalence of $K(\cdot,t)$ with $\|\cdot\|_{X+Y}$ comes from:

$$\inf_{\substack{f = f_X + f_Y \\ f_X \in X, \, f_Y \in Y}} \big\{ \|f_X\|_X + t \|f_Y\|_Y \big\} \leq \inf_{\substack{f = f_X + f_Y \\ f_X \in X, \, f_Y \in Y}} \max\{1,t\} \big\{ \|f_X\|_X + \|f_Y\|_Y \big\} = \max\{1,t\} \|f\|_{X+Y}$$

The other estimate is similar.
Let us prove now (iii): Given $f \in X + Y$, decompose $f = f_X + f_Y$ and notice that

$$\|T(f_X)\|_X + t \|T(f_Y)\|_Y \leq M \big( \|f_X\|_X + t \|f_Y\|_Y \big);$$

therefore, $K(Tf,t;X,Y) \leq M \big( \|f_X\|_X + t \|f_Y\|_Y \big)$ for every decomposition, which proves the desired result after taking the infimum over all of them. [#]
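The properties in Lemma 1.7 can be observed on a concrete couple. The sketch below uses the couple $(\ell_1, \ell_\infty)$ on $\mathbb{R}^n$, which is not discussed in the text and is chosen here only because its K-functional is computable. It relies on two standard facts that are assumed rather than proved: the infimum is attained by a truncation of $f$ at some level $c$, and the resulting value equals the integral over $(0,t)$ of the decreasing rearrangement of $f$. The names `k_functional` and `k_rearranged` are hypothetical.

```python
def k_functional(f, t):
    """K(f, t; l_1, l_inf) on R^n via truncation: f_Y clips f at level c, f_X = f - f_Y.

    Assumption: the infimum over all decompositions is attained by such a truncation,
    with c equal to 0 or to one of the values |f_i|.
    """
    levels = [0.0] + [abs(x) for x in f]
    return min(sum(max(abs(x) - c, 0.0) for x in f) + t * c for c in levels)


def k_rearranged(f, t):
    """Integral over (0, t) of the decreasing rearrangement of f (a step function)."""
    total, remaining = 0.0, t
    for value in sorted((abs(x) for x in f), reverse=True):
        step = min(1.0, remaining)
        total += step * value
        remaining -= step
        if remaining <= 0:
            break
    return total


f = [3.0, -1.0, 2.0]
for t in (0.5, 1.0, 1.5, 2.0, 5.0):
    print(t, k_functional(f, t), k_rearranged(f, t))
```

The printed values are nondecreasing and concave in $t$, and they stop growing once $t \geq n$, where the whole of $f$ is placed in the $\ell_1$ component.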

Lemma 1.8 Given a compatible couple $(X,Y)$, and parameters $0 < \theta < 1$, $0 < q \leq \infty$, the following functionals are (quasi)seminorms in $X + Y$:

$$|f|_{(X,Y)_{\theta,q}} = \Big\{ \int_0^\infty \big( t^{-\theta} K(f,t;X,Y) \big)^q \frac{dt}{t} \Big\}^{1/q} \quad \text{for } 0 < q < \infty$$

$$|f|_{(X,Y)_{\theta,\infty}} = \sup_{t>0} \big\{ t^{-\theta} K(f,t;X,Y) \big\}$$

Proof. $|\cdot|_{(X,Y)_{\theta,q}}$ is trivially homogeneous and nonnegative. The (quasi)triangular inequality is directly inferred from part (ii) in Lemma 1.7. [#]

Proposition 2 Given a compatible couple $(X,\|\cdot\|_X)$, $(Y,\|\cdot\|_Y)$, and parameters $0 < \theta < 1$, $0 < q \leq \infty$, the $(\theta,q)$ spaces

$$(X,Y)_{\theta,q} = \big\{ f \in X + Y : |f|_{(X,Y)_{\theta,q}} < \infty \big\}$$

are interpolation spaces.

Proof. This is immediate from (iii) in Lemma 1.7. [#]

Lemma 1.9 Given a (quasisemi)normed space $(X,\|\cdot\|_X)$, and a continuously embedded subspace $Y \subset X$ with $\|f\|_X \leq C_Y \|f\|_Y$ for some $C_Y > 0$ and all $f \in Y$:
(i) The K-functional of the compatible couple $(X,Y)$ can also be written as

$$K(f,t;X,Y) = \inf_{g \in Y} \big\{ \|f - g\|_X + t \|g\|_Y \big\}.$$

(ii) The (quasi)seminorms $|\cdot|_{(X,Y)_{\theta,q}}$ are equivalent, for each $r > 0$, to the following discretizations:

$$|f|_{(X,Y)_{\theta,q}} \asymp \Big\{ \sum_{k=0}^\infty \big( 2^{k\theta r} K(f,2^{-kr};X,Y) \big)^q \Big\}^{1/q} \quad \text{for } 0 < q < \infty$$

$$|f|_{(X,Y)_{\theta,\infty}} \asymp \sup_{k \geq 0} \big\{ 2^{k\theta r} K(f,2^{-kr};X,Y) \big\}$$

Proof. Part (i) is trivial. As for part (ii), we start noticing that

$$|f|_{(X,Y)_{\theta,q}} \asymp \Big\{ \int_0^1 \big( t^{-\theta} K(f,t;X,Y) \big)^q \frac{dt}{t} \Big\}^{1/q},$$

since for each $g \in Y$, $K(f,t;X,Y) \leq \|f\|_X \leq C \{ \|f - g\|_X + \|g\|_X \} \leq C \{ \|f - g\|_X + C_Y \|g\|_Y \}$, and therefore, $K(f,t;X,Y) \leq C K(f,C_Y;X,Y)$ for all $t > 0$. So, we can estimate (for $0 < q < \infty$)

$$\int_0^1 \big( t^{-\theta} K(f,t;X,Y) \big)^q \frac{dt}{t} \leq |f|^q_{(X,Y)_{\theta,q}} \leq \int_0^1 \big( t^{-\theta} K(f,t;X,Y) \big)^q \frac{dt}{t} + C^q K(f,C_Y;X,Y)^q \int_1^\infty t^{-\theta q - 1} \, dt$$

Now, due to the monotonicity of the K-functional, we can discretize the previous integral, with an argument similar to the one employed in the proof of Lemma 1.1. [#]

Remark. Notice how similar these (quasi)seminorms are to the ones in Lemma 1.1. One of the tricks in Approximation Theory is, given $(X,\|\cdot\|_X)$ and a family of approximants $\{X_n\}_n$, to find a continuously embedded (quasisemi)normed subspace $(Y,\|\cdot\|_Y) \subset X$ so that the values $E(f,X_n)_X$ can be estimated in terms of the K-functionals K(f,2^m;X,Y ) and vice versa. In §1.5 we outline some results that help in this sense.
Another important result related to K-functionals and the $(\theta,q)$ interpolation spaces is the Reiteration Theorem, which states that no advantage is gained when applying successive interpolation to a given compatible couple.

Theorem 4 (Holmstedt (1970)) Let $(X_1,\|\cdot\|_{X_1})$, $(X_2,\|\cdot\|_{X_2})$ be a compatible couple of (quasi)normed spaces, let $0 < \alpha_1 < \alpha_2 < 1$ and $0 < q_1, q_2 \leq \infty$, and consider the interpolation spaces $Y_1 = (X_1,X_2)_{\alpha_1,q_1}$, $Y_2 = (X_1,X_2)_{\alpha_2,q_2}$; then, for $f \in Y_1 + Y_2$ and $\delta = \alpha_2 - \alpha_1$, we have the following equivalence:

$$K(f,t^\delta;Y_1,Y_2) \asymp \Big( \int_0^t \big[ s^{-\alpha_1} K(f,s;X_1,X_2) \big]^{q_1} \frac{ds}{s} \Big)^{1/q_1} + t^\delta \Big( \int_t^\infty \big[ s^{-\alpha_2} K(f,s;X_1,X_2) \big]^{q_2} \frac{ds}{s} \Big)^{1/q_2},$$

with constants of equivalency independent of $f$ and $t$. The usual change from integral to a supremum applies when either $q_1$ or $q_2$ is infinite.

Theorem 5 (Peetre’s Reiteration Theorem (1963)) Under the same hypothesis as the previous Theorem, and for any $0 < \theta < 1$, $0 < q \leq \infty$, we have $(Y_1,Y_2)_{\theta,q} = (X_1,X_2)_{\alpha',q}$, where $\alpha' = (1 - \theta) \alpha_1 + \theta \alpha_2$.

1.5 Jackson and Bernstein Inequalities

Definition 8 Given a (quasisemi)normed space $(X,\|\cdot\|_X)$, and a (quasisemi)normed continuously embedded subspace $(Y,|\cdot|_Y) \subset X$,

Jackson Inequality: We say the family of approximants $\{X_n\}_n$ verifies a Jackson inequality with respect to $Y$ if there exist $r, C > 0$ such that

$$E(f,X_n)_X \leq C n^{-r} |f|_Y \quad \text{for all } f \in Y$$

Bernstein Inequality: We say the family of approximants $\{X_n\}_n$ verifies a Bernstein inequality with respect to $Y$ if there exist $r, C > 0$ such that

$$|g_n|_Y \leq C n^r \|g_n\|_X \quad \text{for all } g_n \in X_n$$

Remark. In the literature of Approximation Theory, results that state Jackson’s inequalities are referred to as “direct theorems”, whereas Bernstein’s inequalities are also identified as “inverse theorems”.
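A classical instance of a Jackson inequality, stated here only as an illustration and not taken from the text, is approximation in the uniform norm on $[0,1]$ by piecewise constants on $n$ equal subintervals: the best constant on each subinterval is the midrange of $f$, so $E(f,X_n) \leq \frac{1}{2n} \|f'\|_{L_\infty}$, which is a Jackson inequality with $r = 1$, $C = 1/2$ and $|f|_Y = \|f'\|_{L_\infty}$. The function name `pc_error`, the sample count and the test function $f(x) = x^2$ are assumptions of this sketch.

```python
def pc_error(f, n, samples=200):
    """Uniform-norm error of the best piecewise constant approximation to f on
    n equal subintervals of [0, 1], estimated from samples in each subinterval.

    On each subinterval the best constant is the midrange (max + min) / 2,
    with error (max - min) / 2.
    """
    worst = 0.0
    for k in range(n):
        values = [f((k + j / samples) / n) for j in range(samples + 1)]
        worst = max(worst, (max(values) - min(values)) / 2.0)
    return worst


def square(x):
    """Test function with sup |f'| = 2 on [0, 1]."""
    return x * x


for n in (1, 2, 4, 8, 16):
    # Jackson bound for this f: (1 / (2n)) * 2 = 1 / n
    print(n, pc_error(square, n), 1.0 / n)
```

For this monotone $f$ the sampled oscillation is exact, and the computed errors stay below $1/n$ as the inequality predicts.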

Proposition 3 Given a (quasisemi)normed space $(X,\|\cdot\|_X)$ and a family of approximants $\{X_n\}_n$ satisfying both Jackson and Bernstein inequalities with respect to a (quasisemi)normed continuously embedded linear subspace $(Y,|\cdot|_Y) \subset X$, the following estimates hold:

$$E(f,X_n)_X \leq C K(f,n^{-r};X,Y) \tag{1.11}$$

$$K(f,2^{-nr};X,Y) \leq 2^{-nr} C \sum_{k=1}^{n+1} 2^{kr} E(f,X_{2^{k-1}})_X \tag{1.12}$$

Proof. To prove (1.11), given $f \in X$, consider any $g \in Y$ and a best approximation $g_n$ to $g$ in $X$ from each $X_n$; then,

$$\begin{aligned}
E(f,X_n)_X \leq \|f - g_n\|_X &\leq C \big\{ \|f - g\|_X + \|g - g_n\|_X \big\} \\
&= C \big\{ \|f - g\|_X + E(g,X_n)_X \big\} \\
&\leq C \big\{ \|f - g\|_X + C n^{-r} |g|_Y \big\};
\end{aligned}$$

therefore proving property (1.11) (use part (i) of Lemma 1.9). To prove (1.12), denote by $g_k$ the best approximation to $f$ from $X_{2^k}$, and $\psi_k = g_k - g_{k-1}$. We know that there exists a constant $C > 0$ such that $X_n + X_n = X_{Cn}$ for all $n$; this means in particular that $\psi_k \in X_{2^k C}$ for all $k$, and by the Bernstein property, $|\psi_k|_Y \leq 2^{kr} C \|\psi_k\|_X$. But now,

$$\|\psi_k\|_X = \|g_k - g_{k-1}\|_X \leq C \big( \|g_k - f\|_X + \|g_{k-1} - f\|_X \big) \leq C E(f,X_{2^{k-1}})_X;$$

We have then the estimate $|\psi_k|_Y \leq 2^{kr} C E(f,X_{2^{k-1}})_X$, which we can use to estimate the K-functional:

$$\begin{aligned}
K(f,2^{-nr};X,Y) &\leq \|f - g_n\|_X + 2^{-nr} |g_n|_Y \\
&= E(f,X_{2^n})_X + 2^{-nr} \Big| \sum_{k=0}^n \psi_k \Big|_Y \\
&\leq E(f,X_{2^n})_X + 2^{-nr} C \sum_{k=1}^n |\psi_k|_Y \quad \text{(since } \psi_0 = g_0 = 0\text{)} \\
&\leq E(f,X_{2^n})_X + 2^{-nr} C \sum_{k=1}^n 2^{kr} E(f,X_{2^{k-1}})_X \\
&\leq 2^{-nr} C \sum_{k=1}^{n+1} 2^{kr} E(f,X_{2^{k-1}})_X
\end{aligned}$$ [#]

Remark. This last result allows the link we were looking for between Interpolation and Approximation Theories. The following three results give a very good example: Corollary 3.1 tells us that the problem of characterizing approximation spaces by means of interpolation spaces is solved if we know two ingredients:

An appropriate (quasisemi)normed linear subspace $(Y,|\cdot|_Y) \subset X$ for which the family of approximants $\{X_n\}_n$ verifies the Jackson and Bernstein inequalities.
A characterization of the interpolation spaces $(X,Y)_{\theta,q}$.

The second step is often provided by classical results in the Theory of Interpolation. The first step is the difficult one from the viewpoint of approximation. Theorem 6 provides a good start. Finally, Theorem 7 (proof not offered here, read it in [CDeH]) provides somehow an inverse result to Corollary 3.1.

Corollary 3.1 If the family of approximants $\{X_n\}_n$ satisfies the Jackson and Bernstein inequalities with respect to $Y$ and exponent $r > 0$, and the sequence of errors $E(f,X_n)_X$ is monotone decreasing, then for each $0 < \gamma < r$ and $0 < q \leq \infty$, $A^\gamma_q(X,X_n) = (X,Y)_{\gamma/r,q}$ with equivalent norms.

Proof. Estimate (1.11) gives us $A^\gamma_q(X,X_n) \supset (X,Y)_{\gamma/r,q}$ trivially; for example, for $0 < q < \infty$, and $r$ that makes good both Jackson’s and Bernstein’s inequalities, if $f \in (X,Y)_{\gamma/r,q}$, then

$$\sum_{n=0}^\infty \big( 2^{\gamma n} E(f,X_{2^n})_X \big)^q \leq C \sum_{n=0}^\infty \big( 2^{nr(\gamma/r)} K(f,2^{-nr};X,Y) \big)^q < \infty$$

On the other hand, we may use estimate (1.12) together with Lemma 1.3 to obtain the other inclusion: Let $b_n = K(f,2^{-nr};X,Y)$, $a_n = E(f,X_{2^n})_X$, and $0 < \gamma < r$; we have then

$$\sum_{n=0}^\infty \big( 2^{nr(\gamma/r)} K(f,2^{-nr};X,Y) \big)^q = \sum_{n=0}^\infty \big( 2^{\gamma n} b_n \big)^q \leq C \sum_{n=0}^\infty \big( 2^{\gamma n} a_n \big)^q = C \sum_{n=0}^\infty \big( 2^{\gamma n} E(f,X_{2^n})_X \big)^q$$ [#]

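Theorem 6 (DeVore, Popov) For any (quasisemi)normed space $(X,\|\cdot\|_X)$ and family of approximants $\{X_n\}_n$ such that $X_n \subset X_{n+1}$ for all $n$, as well as for any $r > 0$ and $0 < p \leq \infty$, the family $\{X_n\}_n$ satisfies the Jackson and Bernstein inequalities for the exponent $r > 0$, with respect to $Y = A_p^r(X,X_n)$. Therefore, for any $0 < \alpha < r$ and $0 < q \leq \infty$, we have

$$A^\alpha_q(X,X_n) = \big( X, A^r_p(X,X_n) \big)_{\alpha/r,q}.$$

Proof. It is enough to show that $\{X_n\}_n$ verifies both Jackson and Bernstein’s inequalities for the exponent $r > 0$ with respect to $A_p^r(X,X_n)$:

Let $f \in A_p^r(X,X_n)$; we know that there exists $C > 0$ such that $\sum_{k=0}^\infty 2^{krp} E(f,X_{2^k})_X^p \leq C |f|^p_{A^r_p(X,X_n)}$. Given $n$, find $k$ such that $2^k \leq n < 2^{k+1}$; then, since the spaces are nested,
$$E(f,X_n)_X^p \leq E(f,X_{2^k})_X^p \leq 2^{-krp} C |f|^p_{A^r_p(X,X_n)} \leq 2^{rp} C n^{-rp} |f|^p_{A^r_p(X,X_n)}.$$
Given $g_n \in X_n$, we have $E(g_n,X_k)_X = 0$ for $k \geq n$ and $E(g_n,X_k)_X \leq \|g_n\|_X$ otherwise, so
$$\begin{aligned}
|g_n|^p_{A^r_p(X,X_n)} &= \sum_{k=1}^\infty \frac{1}{k} \big( k^r E(g_n,X_k)_X \big)^p \leq \sum_{k=1}^{n-1} \frac{1}{k} \big( k^r \|g_n\|_X \big)^p = \|g_n\|_X^p \sum_{k=1}^{n-1} k^{rp-1} \\
&\leq \|g_n\|_X^p \sum_{k=1}^{n-1} (n-1)^{rp-1} = \big[ (n-1)^r \|g_n\|_X \big]^p \leq \big( n^r \|g_n\|_X \big)^p.
\end{aligned}$$

The rest of the statement follows immediately. [#]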

Theorem 7 (Cohen, DeVore, Hochmuth) Let $X$, $Y$, $\{X_n\}_n$ be as before, and suppose that $\{X_n\}_n$ satisfies the Jackson and Bernstein inequalities for $r > 0$. Suppose further that the sequence of operators $\{T_n\}_n$ verifies:
(i) $T_n : X \to X_n$ (not necessarily linear).
(ii) There exists $C > 0$ such that $\|f - T_n f\|_X \leq C E(f,X_n)_X$ for all $f \in X$.
(iii) $|T_n f|_Y \leq C |f|_Y$ for all $n$ and $f \in Y$.

Then, $\{T_n\}_n$ realizes the K-functional; that is,

$$\|f - T_n f\|_X + n^{-r} |T_n f|_Y \leq C K(f,n^{-r};X,Y) \quad \text{for all } f \in X.$$ []

2 Dyadic maximal-smoothness Spline Approximation

In this section we want to exemplify how to obtain the approximation spaces in the following case: Given the unit cube $\Omega \subset \mathbb{R}^d$, $X = L_p(\Omega)$ for any choice of $0 < p < \infty$, and $X_n$ is a linear space of box-splines with coordinate order $r$, maximal smoothness, and associated to the dyadic $n$-th partition of the cube $\Omega$.
In the search for the spaces $A^\alpha_q(X,X_n)$, we will go through different levels of abstraction: from the low-level construction of best and near-best polynomial approximation to functions in $L_p(\Omega)$ on cubes, to the high-level description of the K-functionals that will lead us into further results involving interpolation spaces for compatible couples of Besov spaces. The logical step-by-step exposition is summarized in the following table:

Basic Pre-requisites:
- Construction and results on different norm estimations related to best and near-best $L_p(\Omega)$ polynomial approximation in subcubes (§2.1).
- Markov’s Theorem on norm estimation of derivatives of polynomials over compact intervals of the real line (§2.2).
Construction of the approximants and projectors:
- Divided differences: definition and basic results (§2.3).
- Univariate splines: definition, properties and description of the quasi-interpolants as projectors associated to the construction of the B-spline basis (§2.4).
- Multivariate tensor-product B-splines associated to dyadic partitions of the unit cube $\Omega$, the linear spaces spanned by them, and the generalized quasi-interpolant for these spaces (§2.5).
Search for good candidates for approximation spaces:
- Sobolev Spaces (§2.6).
- Besov Spaces (§2.7).
Solution of the problem of approximation and related results:
- Construction of equivalent seminorm functionals for Besov Spaces that link them with the approximation spaces we are looking for (§2.8).
- Interpolation of compatible couples of Besov Spaces (sections 2.9, 2.10 and 2.11).
- Embedding Theorems for Besov Spaces (§2.12).

2.1 Best and Near-Best L_p Polynomial Approximation in cubes of ℝ^d

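Lemma 2.1 Given $r > 0$, a cube $\Omega \subset \mathbb{R}^d$ and $0 < q \leq p \leq \infty$, there exists a constant $C > 0$ depending at most on $p$, $q$ and $d$ such that

$$\Big( \frac{1}{|\Omega|} \int_\Omega |g|^q \Big)^{1/q} \leq \Big( \frac{1}{|\Omega|} \int_\Omega |g|^p \Big)^{1/p} \leq C \Big( \frac{1}{|\Omega|} \int_\Omega |g|^q \Big)^{1/q}$$

for all $g \in \Pi(r)$.

Proof. Consider for all $p > 0$ the (quasi)norms $|||\cdot|||_{L_p(\Omega)} = |\Omega|^{-1/p} \|\cdot\|_{L_p(\Omega)}$ in $\Pi(r)$, and apply Theorem 27. [#]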

Lemma 2.2 Given $r > 0$, cubes $I \subset J \subset \mathbb{R}^d$ such that $|J| \leq \lambda |I|$ for some $\lambda > 0$, and $0 < q \leq \infty$, there are constants $C_1, C_2 > 0$ depending at most on $q$ and $d$ such that

$$C_1 \Big( \frac{1}{|I|} \int_I |g|^q \Big)^{1/q} \leq \Big( \frac{1}{|J|} \int_J |g|^q \Big)^{1/q} \leq C_2 \Big( \frac{1}{|I|} \int_I |g|^q \Big)^{1/q}$$

for all $g \in \Pi(r)$. In particular,

$$\|g\|_{L_q(J)} \leq \lambda^{1/q} C_2 \|g\|_{L_q(I)}.$$

Proof. Consider the (quasi)norms $\|\cdot\|_{I,L_q} = |I|^{-1/q} \|\cdot\|_{L_q(I)}$ in $\Pi(r)$ and apply Theorem 27 again. [#]
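The point of normalizing by $|\Omega|$ in Lemma 2.1 is that the comparison between the $L_q$ and $L_p$ averages of a polynomial does not deteriorate when the cube shrinks or grows. The one-dimensional sketch below illustrates this on intervals $[0,h]$ with a midpoint rule; the polynomials, the exponents $p = 4$, $q = 1$ and the names `normalized_norm` and `ratio` are assumptions made for the example.

```python
def normalized_norm(g, h, p, cells=10000):
    """Midpoint-rule value of ((1/h) * integral over [0, h] of |g|^p)^(1/p)."""
    step = h / cells
    total = sum(abs(g((i + 0.5) * step)) ** p for i in range(cells)) * step
    return (total / h) ** (1.0 / p)


def ratio(g, h, p=4.0, q=1.0):
    """Quotient of the normalized L_p and L_q norms of g on [0, h]."""
    return normalized_norm(g, h, p) / normalized_norm(g, h, q)


def linear(x):
    return x


def quadratic(x):
    return 1.0 + x * x


for h in (0.01, 1.0, 100.0):
    print(h, ratio(linear, h), ratio(quadratic, h))
```

For $g(x) = x$ the quotient equals $(q+1)^{1/q} / (p+1)^{1/p}$ for every $h$, and for the second polynomial it changes with $h$ but stays bounded below by 1, in line with the first inequality of the lemma.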

Lemma 2.3 Let $\Omega$ be a cube in $\mathbb{R}^d$, and $f \in L_p(\Omega)$. If $g \in \Pi(r)$ is a $\tau$ near-best $L_q(\Omega)$ approximation to $f$ for any $0 < q < p$, then it is also a $C$ near-best $L_p(\Omega)$ approximation to $f$, for some $C > 0$ that depends on $d$, $p$, $q$, $r$ and $\tau$, but does not depend on the size of $\Omega$.

Proof. Let $P$ be the best $L_p(\Omega)$ approximation element to $f$ from $\Pi(r)$; then we have:

$$\begin{aligned}
\|f - g\|_{L_p(\Omega)} &\leq C_{d,p} \big( \|f - P\|_{L_p(\Omega)} + \|P - g\|_{L_p(\Omega)} \big) \quad \text{(apply Lemma 2.1)} \\
&\leq C_{d,p} \big( E(f,\Pi(r);\Omega)_p + C_{d,q,r} |\Omega|^{1/p-1/q} \|P - g\|_{L_q(\Omega)} \big) \\
&\leq C_{d,p,q,r} \big( E(f,\Pi(r);\Omega)_p + |\Omega|^{1/p-1/q} \big[ \|P - f\|_{L_q(\Omega)} + \|f - g\|_{L_q(\Omega)} \big] \big) \\
&\leq C_{d,p,q,r} \big( E(f,\Pi(r);\Omega)_p + |\Omega|^{1/p-1/q} \big[ \|f - P\|_{L_q(\Omega)} + \tau E(f,\Pi(r);\Omega)_q \big] \big) \\
&\leq C_{d,p,q,r} \big( E(f,\Pi(r);\Omega)_p + 2 \max(1,\tau) |\Omega|^{1/p-1/q} \|f - P\|_{L_q(\Omega)} \big) \quad \text{(apply Hölder’s Inequality or Lemma 2.1 again)} \\
&\leq C_{d,q,p,r,\tau} \big( E(f,\Pi(r);\Omega)_p + \|f - P\|_{L_p(\Omega)} \big) \\
&= C_{d,q,p,r,\tau} E(f,\Pi(r);\Omega)_p.
\end{aligned}$$ [#]

Lemma 2.4 Let I ⊂ J be cubes in ℝ^d such that |J| ≤ c|I| for some c > 0. Let f ∈ L_p(J), and g ∈ Π(r) a τ near-best L_q(I) approximation to f for some 0 < q < p. Then g is also a C near-best L_p(J) approximation to f, where C > 0 depends on c, τ, d, p and q.

Proof. Let P be the best L_p(J) approximation element to f from Π(r). First, notice that for any cube I ⊂ J,

\[
\begin{aligned}
\|P-g\|_{L_p(I)}
&\le C_{p,d}\Big( \|f-P\|_{L_p(I)} + \|f-g\|_{L_p(I)} \Big) \\
&\qquad \text{(apply Lemma 2.1)} \\
&\le C_{p,d}\Big( \|f-P\|_{L_p(I)} + C_{d,q,r}\,|I|^{1/p-1/q}\,\|f-g\|_{L_q(I)} \Big) \\
&\le C_{d,p,q}\Big( \|f-P\|_{L_p(I)} + |I|^{1/p-1/q}\,\tau E(f,\Pi(r);I)_q \Big) \\
&\le C_{d,p,q}\Big( \|f-P\|_{L_p(I)} + |I|^{1/p-1/q}\,\tau \|f-P\|_{L_q(I)} \Big) \\
&\qquad \text{(apply Lemma 2.1 again)} \\
&\le C_{d,p,q}\Big( \|f-P\|_{L_p(I)} + \tau \|f-P\|_{L_p(I)} \Big)
 = C_{d,p,q,\tau}\,\|f-P\|_{L_p(I)} \\
&\le C_{d,p,q,\tau}\,\|f-P\|_{L_p(J)} = C_{d,p,q,\tau}\,E(f,\Pi(r);J)_p;
\end{aligned}
\]

This estimate is all we need to finish the proof:

\[
\begin{aligned}
\|f-g\|_{L_p(J)}
&\le C_{p,d}\Big( \|f-P\|_{L_p(J)} + \|P-g\|_{L_p(J)} \Big) \\
&\qquad \text{(apply Lemma 2.2)} \\
&\le C_{p,d}\Big( E(f,\Pi(r);J)_p + c^{1/p}\,C_{p,d}\,\|P-g\|_{L_p(I)} \Big) \\
&\le C_{d,p,q,\tau,c}\,E(f,\Pi(r);J)_p. \qquad [\#]
\end{aligned}
\]

2.2 Markov’s Theorem

Theorem 8 (Szegö (1928)) For each trigonometric polynomial T_r of order r,

\[
T_r'(x)^2 + r^2\,T_r(x)^2 \le r^2\,\|T_r\|_{L_\infty(\mathbb{T})}^2 \qquad \text{for all } x \in \mathbb{T}.
\]

Corollary 8.1 (Bernstein’s Inequalities)

\[
\|T_r'\|_{L_\infty(\mathbb{T})} \le r\,\|T_r\|_{L_\infty(\mathbb{T})}, \qquad
\|T_r^{(k)}\|_{L_\infty(\mathbb{T})} \le r^k\,\|T_r\|_{L_\infty(\mathbb{T})}.
\]

Corollary 8.2 For an algebraic polynomial P_r ∈ Π(r), and all x ∈ (-1,1),

\[
|P_r'(x)| \le \frac{r}{\sqrt{1-x^2}}\,\|P_r\|_{L_\infty[-1,1]}.
\]

Corollary 8.3 For an algebraic polynomial P_r ∈ Π(r) of order r with complex coefficients on the disk D = {z ∈ ℂ : |z| < 1},

\[
\|P_r'\|_{L_\infty(D)} \le r\,\|P_r\|_{L_\infty(D)}.
\]

Theorem 9 (Markov) For an algebraic polynomial P_r ∈ Π(r),

\[
\|P_r'\|_{L_\infty[-1,1]} \le r^2\,\|P_r\|_{L_\infty[-1,1]}.
\]
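Markov's inequality can be checked numerically on the classical extremal example. This is only an illustrative sketch: it assumes the Chebyshev polynomial T_n of degree n (for which the quotient of sup norms equals n²), approximates the sup norms on a uniform grid that contains the endpoints ±1, and uses hypothetical helper names (`chebyshev`, `derivative`, `evaluate`, `sup_norm`).

```python
def chebyshev(n):
    """Coefficients (ascending powers) of the Chebyshev polynomial T_n."""
    prev, cur = [1.0], [0.0, 1.0]
    if n == 0:
        return prev
    for _ in range(n - 1):
        nxt = [0.0] + [2.0 * c for c in cur]   # 2x * T_k
        for i, c in enumerate(prev):           # minus T_{k-1}
            nxt[i] -= c
        prev, cur = cur, nxt
    return cur


def derivative(p):
    """Coefficients of the derivative of the polynomial p."""
    return [i * c for i, c in enumerate(p)][1:]


def evaluate(p, x):
    """Horner evaluation of the polynomial p at x."""
    value = 0.0
    for c in reversed(p):
        value = value * x + c
    return value


def sup_norm(p, points=2001):
    """Maximum of |p| over a uniform grid on [-1, 1] (endpoints included)."""
    grid = [-1.0 + 2.0 * i / (points - 1) for i in range(points)]
    return max(abs(evaluate(p, x)) for x in grid)


n = 5
T = chebyshev(n)
markov_ratio = sup_norm(derivative(T)) / sup_norm(T)
print(markov_ratio, n * n)
```

The maximum of |T_n'| is attained at the endpoints, which is why the grid must contain them; Bernstein's pointwise bound of Corollary 8.2 is much smaller in the interior of the interval.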

2.3 Divided Differences

Definition 9 Given a function f : ℝ → ℝ and a finite collection of real numbers {t₀,t₁,...,t_n}, we denote with Δ(f;t₀,t₁,...,t_n) the leading coefficient of the polynomial of degree n that interpolates f at t₀,...,t_n. We call it the n-th divided difference of f. Divided differences are computed recursively as follows:

\[
\begin{aligned}
\Delta(f;t_0) &= f(t_0) \\
\Delta(f;t_0,t_1) &= \frac{f(t_1)-f(t_0)}{t_1-t_0} \quad \text{if } t_0 \ne t_1 \\
\Delta(f;\underbrace{t_0,\dots,t_0}_{k}) &= \frac{f^{(k-1)}(t_0)}{(k-1)!} \quad \text{if this derivative exists} \\
\Delta(f;t_0,\dots,t_{n-1},t_n) &= \frac{\Delta(f;t_1,\dots,t_n)-\Delta(f;t_0,\dots,t_{n-1})}{t_n-t_0}
\end{aligned}
\]

Lemma 2.5 (Newton’s Interpolation polynomial) Given a function f : ℝ → ℝ, and knots t₀ < ... < t_n, the interpolation polynomial of f at those knots, P_f(x;t₀,...,t_n), can be written in terms of the divided differences as follows:

\[
P_f(x;t_0,\dots,t_n) = \sum_{k=0}^{n} \Delta(f;t_0,\dots,t_k)\,(x-t_0)\cdots(x-t_{k-1}).
\]

Proof. Given f : ℝ → ℝ, consider the following interpolation polynomials for each k = 0,...,n - 1: Q_k(x) = P_f(x;t₀,...,t_k), of degree at most k, and Q_{k+1}(x) = P_f(x;t₀,...,t_{k+1}), of degree at most k + 1. Notice that g = Q_{k+1} - Q_k has degree at most k + 1 and vanishes at the knots t₀,...,t_k, and by definition its leading coefficient is the divided difference Δ(f;t₀,...,t_{k+1}); hence,

\[
P_f(x;t_0,\dots,t_{k+1}) = P_f(x;t_0,\dots,t_k) + \Delta(f;t_0,\dots,t_{k+1})\,(x-t_0)\cdots(x-t_k). \qquad [\#]
\]
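The recursive definition and Newton's form translate directly into a short computation. The sketch below assumes distinct knots and uses exact rational arithmetic; the function names `divided_differences` and `newton_eval` are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction


def divided_differences(f, knots):
    """Return [D(f;t0), D(f;t0,t1), ..., D(f;t0,...,tn)] for distinct knots."""
    t = [Fraction(x) for x in knots]
    column = [Fraction(f(x)) for x in t]
    coeffs = [column[0]]
    for order in range(1, len(t)):
        # column[i] becomes the divided difference on t[i], ..., t[i + order]
        column = [(column[i + 1] - column[i]) / (t[i + order] - t[i])
                  for i in range(len(column) - 1)]
        coeffs.append(column[0])
    return coeffs


def newton_eval(coeffs, knots, x):
    """Evaluate sum_k coeffs[k] * (x - t0)...(x - t_{k-1}) by nested products."""
    result = Fraction(0)
    for c, t in zip(reversed(coeffs), reversed(knots)):
        result = result * (x - t) + c
    return result


knots = [0, 1, 2, 4]
cube = lambda x: x ** 3
coeffs = divided_differences(cube, knots)
print(coeffs)                         # the last entry is the leading coefficient
print(newton_eval(coeffs, knots, 3))  # the cubic is reproduced exactly
```

Since x³ has degree 3 and four knots are used, the interpolation polynomial coincides with the function, and the top divided difference is its leading coefficient.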

Lemma 2.6 If f ∈ Cⁿ[a,b] and a ≤ t_k ≤ b for all k = 0,...,n, then there exists ξ ∈ (a,b) such that

\[
\Delta(f;t_0,\dots,t_n) = \frac{1}{n!}\,f^{(n)}(\xi).
\]

Proof. This is a direct consequence of Rolle’s Theorem. [#]

Lemma 2.7 (Leibniz Formula for Divided Differences) Given functions f,g and knots t₀, ..., t_n, the n-th divided difference of their product is given by the following formula:

\[
\Delta(fg;t_0,\dots,t_n) = \sum_{k=0}^{n} \Delta(f;t_0,\dots,t_k)\,\Delta(g;t_k,\dots,t_n) \qquad (2.1)
\]

Proof. Assume all the knots are different, and write the interpolation polynomials of f and g at those knots in Newton’s form, the first one built forward from t₀ and the second one backward from t_n. Their product is

\[
P_f(x;t_0,\dots,t_n)\,P_g(x;t_0,\dots,t_n)
= \Big[ \sum_{k=0}^{n} \Delta(f;t_0,\dots,t_k)\,(x-t_0)\cdots(x-t_{k-1}) \Big]
\times \Big[ \sum_{l=0}^{n} \Delta(g;t_l,\dots,t_n)\,(x-t_{l+1})\cdots(x-t_n) \Big] \qquad (2.2)
\]

This product agrees with h = fg at every knot. The terms with k > l vanish at all the knots, so the sum of the terms with k ≤ l is a polynomial of degree at most n that interpolates h; that is, it equals P_h(x;t₀,...,t_n). Notice now that the leading coefficient of P_h(x;t₀,...,t_n) is Δ(fg;t₀,...,t_n), while the terms of degree n in (2.2) are exactly those with k = l, whose coefficients add up to the right-hand side of (2.1). [#]
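Formula (2.1) is easy to test on a small example. The following sketch assumes distinct knots, exact rational arithmetic, and two arbitrarily chosen polynomials f and g; `dd` is a hypothetical helper that applies the recursive definition of divided differences literally.

```python
from fractions import Fraction


def dd(f, knots):
    """Divided difference of f on the given distinct knots (recursive definition)."""
    if len(knots) == 1:
        return Fraction(f(knots[0]))
    return (dd(f, knots[1:]) - dd(f, knots[:-1])) / (knots[-1] - knots[0])


f = lambda x: x ** 2 + 1
g = lambda x: 3 * x - 2
t = [Fraction(v) for v in (0, 1, 3, 4)]

lhs = dd(lambda x: f(x) * g(x), t)
# Right-hand side of (2.1): f uses the knots t_0..t_k, g uses t_k..t_n.
rhs = sum(dd(f, t[:k + 1]) * dd(g, t[k:]) for k in range(len(t)))
print(lhs, rhs)
```

Here fg is a cubic with leading coefficient 3, so both sides equal 3; only the term with k = 2 contributes to the sum.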

2.4 Univariate Splines

2.4.1 Definition and Basic Properties

Definition 10 (Schoenberg spaces: knots-multiplicity form) Given an interval A = [a,b] in ℝ, we define initially spaces of splines in the following way: Fix r > 0 and let {a < t₁ < t₂ < ... < t_n < b} be a partition of the interval, and associated to these knots, multiplicities 0 < m_k ≤ r. We denote t = (t₁,...,t_n), m = (m₁,...,m_n), and

\[
S_r(t,m;A) = \Big\{ f : A \to \mathbb{R} \;:\; f|_{(t_k,t_{k+1})} \in \Pi(r),\ f|_{\{t_k\}} \in C^{(r-m_k-1)} \text{ for all } k \Big\}
\]

The multiplicity m_k indicates the degrees of freedom (associated to polynomials of order r) at each knot t_k; hence, r-m_k-1 gives the smoothness of the spline function at those points (-1 meaning discontinuity).

Classically, these are called Schoenberg spaces on A.

Remarks. For instance, m = 1 gives one degree of freedom: the location of the image of f(t). In that case, the smoothness of f at t is r - 1, which is the maximum possible degree. In particular, this shows that Π(r) = S_r(t,1;A), where 1 = (1,...,1).
On the other hand, if m = r, then we have all possible degrees of freedom: we can choose location, and all derivatives (both sides); this leaves us piecewise polynomials with possible discontinuities at each knot.
With a slight abuse of notation, we can write S_r(t,r;A) = ⊕_{k=1}^{n} Π(r)|_{(t_k,t_{k+1})}, where r = (r,...,r).

Proposition 4 The space S_r(t,m;[a,b]) has the basis

\[
\begin{aligned}
S_{0,j}(x) &= \frac{1}{j!}\,(x-a)^j && \text{for } j = 0,\dots,r-1 \\
S_{k,j}(x) &= \frac{1}{j!}\,(x-t_k)_+^j && \text{for } j = r-m_k,\dots,r-1,\ k = 1,\dots,n,
\end{aligned}
\]

where x₊ = x χ_{x>0} denotes truncated powers. The associated dual functionals are as follows:

\[
\alpha_{0,j}(S) = S^{(j)}(a), \qquad
\alpha_{k,j}(S) = S^{(j)}(t_k^+) - S^{(j)}(t_k^-).
\]

In particular, dim S_r(t,m;[a,b]) = r + |m| = r + Σ_{k=1}^{n} m_k.
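The basis of Proposition 4 can be assembled mechanically, which also makes the dimension count r + |m| visible. This is a minimal sketch: the function name `truncated_power_basis` is hypothetical, and the convention that (x - t_k)₊⁰ equals 1 for x > t_k and 0 otherwise is an assumption of the example.

```python
from math import factorial


def truncated_power_basis(a, knots, mult, r):
    """List of callables: (x-a)^j/j! for j < r, then (x-t_k)_+^j/j! for r-m_k <= j < r."""
    basis = [lambda x, j=j: (x - a) ** j / factorial(j) for j in range(r)]
    for tk, mk in zip(knots, mult):
        for j in range(r - mk, r):
            basis.append(
                lambda x, tk=tk, j=j:
                    ((x - tk) ** j if x > tk else 0.0) / factorial(j)
            )
    return basis


# Quadratic splines (order r = 3) on [0, 1] with one simple knot at 1/2.
basis = truncated_power_basis(0.0, [0.5], [1], 3)
print(len(basis))       # r + |m| = 3 + 1
print(basis[3](1.0))    # (1 - 1/2)^2 / 2!
```

The default arguments `j=j` and `tk=tk` freeze the loop variables inside each lambda, so every basis function keeps its own exponent and knot.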

Definition 11 (Schoenberg spaces: basic knots form) Given an interval A = [a,b] in the real line, r > 0 and an increasing sequence of knots t = {a < t₁ ≤ ... ≤ t_n < b}, where t_k < t_{k+r} for all k, we define the Schoenberg space S_r(t;A) to be the space of splines of order r with knots given by the partition generated by t, and multiplicities given by the number of repetitions of each knot in the sequence.

Example. Consider A = [0,1] and the basic knot sequence t = {0,0,0,1/2,1/2,1/2,1,1,1}. In this case, we have

\[
S_r(t;[0,1]) = S_r(\{0,1/2,1\},\{3,3,3\};[0,1]).
\]

Definition 12 (puB-splines) If t_k ≤ ... ≤ t_{k+r} is a sequence of r + 1 knots with t_k < t_{k+r}, we define the puB-spline N_{k,r} as follows:

\[
N_{k,r}(x) = N(x\,|\,t_k,\dots,t_{k+r}) = (t_{k+r}-t_k)\,\Delta\big((\cdot-x)_+^{r-1};t_k,\dots,t_{k+r}\big)
\]

Lemma 2.8 (Properties of puB-splines) (i) N is a spline function. (ii) supp(N_{k,r}) = [t_k,t_{k+r}]. (iii) Recurrence formula:

\[
N(x\,|\,t_k,\dots,t_{k+r}) = \frac{x-t_k}{t_{k+r-1}-t_k}\,N(x\,|\,t_k,\dots,t_{k+r-1}) + \frac{t_{k+r}-x}{t_{k+r}-t_{k+1}}\,N(x\,|\,t_{k+1},\dots,t_{k+r})
\]

Proof. Notice that N is by definition a linear combination of truncated powers (t_j - x)₊^{r-k_j}, where k_j is the number of repetitions t_i = t_j for i < j; therefore, it is a spline function. Furthermore, since any r-th order divided difference of a polynomial of degree r - 1 is zero, N_{k,r} vanishes identically when x < t_k and x > t_{k+r} (think leading coefficients).

The recurrence formula is a direct consequence of the recurrence formula for divided differences and the Leibniz formula for the divided difference of a product of two functions:

\[
\begin{aligned}
N_{k,r}(x)
&= (t_{k+r}-t_k)\,\Delta\big((\cdot-x)_+^{r-1};t_k,\dots,t_{k+r}\big) \\
&= \Delta\big((\cdot-x)_+^{r-1};t_{k+1},\dots,t_{k+r}\big) - \Delta\big((\cdot-x)_+^{r-1};t_k,\dots,t_{k+r-1}\big) \\
&= \Delta\big((\cdot-x)_+^{r-2}(\cdot-x);t_{k+1},\dots,t_{k+r}\big) - \Delta\big((\cdot-x)(\cdot-x)_+^{r-2};t_k,\dots,t_{k+r-1}\big) \\
&= \Delta\big((\cdot-x)_+^{r-2};t_{k+1},\dots,t_{k+r-1}\big) + (t_{k+r}-x)\,\Delta\big((\cdot-x)_+^{r-2};t_{k+1},\dots,t_{k+r}\big) \\
&\quad - (t_k-x)\,\Delta\big((\cdot-x)_+^{r-2};t_k,\dots,t_{k+r-1}\big) - \Delta\big((\cdot-x)_+^{r-2};t_{k+1},\dots,t_{k+r-1}\big);
\end{aligned}
\]

the first and last terms cancel, and the two remaining terms are the ones in the recurrence formula, therefore proving our last statement. [#]

Remarks. In his class notes [deBo], Carl de Boor expresses the previous recurrence formula in the following way:

\[
N_{k,1} = \chi_{[t_k,t_{k+1})}, \qquad
N_{k,r} = h_{k,r}\,N_{k,r-1} + (1-h_{k+1,r})\,N_{k+1,r-1},
\]

where

\[
h_{k,r}(x) = \frac{x-t_k}{t_{k+r-1}-t_k}.
\]

This allows computation of puB-splines following a Horner’s scheme-like method, and actually it can be used as the starting point of the development of the theory of B-splines, rather than using divided differences.
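The recurrence in the remark above can be turned into a few lines of code. The sketch below is an assumption-laden illustration rather than a reference implementation: knots are stored in a 0-based Python list (so the index k here is shifted with respect to the text), terms whose denominator vanishes are dropped, and `bspline` is a hypothetical name.

```python
def bspline(k, r, t, x):
    """Value at x of the order-r B-spline with knots t[k], ..., t[k + r]."""
    if r == 1:
        return 1.0 if t[k] <= x < t[k + 1] else 0.0
    left = right = 0.0
    if t[k + r - 1] > t[k]:        # h_{k,r}(x) * N_{k,r-1}(x)
        left = (x - t[k]) / (t[k + r - 1] - t[k]) * bspline(k, r - 1, t, x)
    if t[k + r] > t[k + 1]:        # (1 - h_{k+1,r}(x)) * N_{k+1,r-1}(x)
        right = (t[k + r] - x) / (t[k + r] - t[k + 1]) * bspline(k + 1, r - 1, t, x)
    return left + right


t = [float(i) for i in range(8)]   # uniform knots 0, 1, ..., 7
r = 3                              # quadratic B-splines
peak = bspline(0, r, t, 1.5)       # maximum of the B-spline supported on [0, 3]
total = sum(bspline(k, r, t, 2.5) for k in range(len(t) - r))
print(peak, total)
```

For x between t[r-1] and t[len(t)-r] every B-spline whose support contains x is included in the sum, so the values add up to 1.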

Theorem 10 (Curry, Schoenberg (1966), de Boor, Fix (1973)) Given a < b, r > 0, basic knots t = {a < t₁ ≤ ... ≤ t_n < b} and 2r auxiliary knots {t_{1-r} ≤ ... ≤ t₀ ≤ a}, {b ≤ t_{n+1} ≤ ... ≤ t_{n+r}}, the puB-splines N_{k,r}(x) = N(x|t_k,...,t_{k+r}) for k = 1 - r,...,n form a basis of S_r(t;[a,b]).

Proof. Although the result was first proved by Curry and Schoenberg [CuSc] in 1966, we will offer here a different proof by de Boor and Fix [dBFi], based on the Marsden identities:

Step 1. The restriction to the interval [a,b] of the previously defined puB-splines N_k,r(x) gives a partition of unity:

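\[
\sum_{k=1-r}^{n} N_{k,r}(x)\big|_{[a,b]} = \chi_{[a,b]}
\]

This is a direct consequence of the recurrence formula for puB-splines.

\[
\begin{aligned}
\sum_{k=1-r}^{n} N_{k,r}\big|_{[a,b]}
&= \sum_{k=1-r}^{n} \Big[ h_{k,r}\,N_{k,r-1}\big|_{[a,b]} + (1-h_{k+1,r})\,N_{k+1,r-1}\big|_{[a,b]} \Big] \\
&= \underbrace{h_{1-r,r}\,N_{1-r,r-1}\big|_{[a,b]}}_{\equiv 0 \text{ in } [a,b]}
 + \underbrace{(1-h_{n+1,r})\,N_{n+1,r-1}\big|_{[a,b]}}_{\equiv 0 \text{ in } [a,b]}
 + \sum_{k=2-r}^{n} N_{k,r-1}\big|_{[a,b]} \\
&= \sum_{k=2-r}^{n} N_{k,r-1}\big|_{[a,b]}
\end{aligned}
\]

We can therefore use induction on r, since for r = 1 we have trivially:

\[
\sum_{k=0}^{n} N_{k,1}\big|_{[a,b]} = \sum_{k=0}^{n} \chi_{[t_k,t_{k+1}) \cap [a,b]} = \chi_{[a,b]}
\]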

Step 2. Marsden identities: For any θ ∈ ℝ,

\[
\frac{1}{(r-1)!}\,(\theta-x)^{r-1}\Big|_{[a,b]} = \sum_{k=1-r}^{n} \gamma_{k,r}(\theta)\,N_{k,r}(x)\big|_{[a,b]},
\]

where for all k = 1 - r,...,n,

\[
\gamma_{k,1}(x) = 1, \qquad
\gamma_{k,r}(x) = \frac{1}{(r-1)!}\,(x-t_{k+1})\cdots(x-t_{k+r-1}) \quad \text{for } r > 1.
\]

We will prove this statement by induction on r: It has been proved true for r = 1 above. Assume the property holds up to r - 1; then,

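\[
\begin{aligned}
\sum_{k=1-r}^{n} \gamma_{k,r}(\theta)\,N_{k,r}\big|_{[a,b]}
&= \sum_{k=1-r}^{n} \gamma_{k,r}(\theta)\Big[ h_{k,r}\,N_{k,r-1}\big|_{[a,b]} + (1-h_{k+1,r})\,N_{k+1,r-1}\big|_{[a,b]} \Big] \\
&= \sum_{k=2-r}^{n} \Big[ \gamma_{k,r}(\theta)\,h_{k,r} + \gamma_{k-1,r}(\theta)\,(1-h_{k,r}) \Big]\,N_{k,r-1}\big|_{[a,b]}
\end{aligned}
\]

Let us rewrite the coefficients of each puB-spline in the previous expression:

\[
\begin{aligned}
\gamma_{k-1,r}(\theta) + \big[\gamma_{k,r}(\theta)-\gamma_{k-1,r}(\theta)\big]\,h_{k,r}(x)
&= \gamma_{k-1,r}(\theta) + \gamma_{k,r-1}(\theta)\,\frac{t_k-t_{k+r-1}}{r-1}\,h_{k,r}(x) \\
&= \gamma_{k-1,r}(\theta) + \frac{t_k-x}{r-1}\,\gamma_{k,r-1}(\theta) \\
&= \frac{\theta-t_k}{r-1}\,\gamma_{k,r-1}(\theta) + \frac{t_k-x}{r-1}\,\gamma_{k,r-1}(\theta) \\
&= \frac{\theta-x}{r-1}\,\gamma_{k,r-1}(\theta);
\end{aligned}
\]

therefore,

\[
\sum_{k=1-r}^{n} \gamma_{k,r}(\theta)\,N_{k,r}(x)\big|_{[a,b]}
= \frac{\theta-x}{r-1} \sum_{k=2-r}^{n} \gamma_{k,r-1}(\theta)\,N_{k,r-1}(x)\big|_{[a,b]}
= \frac{(\theta-x)^{r-1}}{(r-1)!}
\]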
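The Marsden identity of Step 2 can be verified numerically for concrete knots. The following is a sketch under explicit assumptions: knots live in a 0-based list `t`, the B-splines are evaluated with the recurrence formula, the point x is taken where all B-splines covering it are available (between t[r-1] and t[len(t)-r]), and the names `bspline`, `gamma` and `marsden_sides` are hypothetical.

```python
from math import factorial


def bspline(k, r, t, x):
    """Order-r B-spline with knots t[k], ..., t[k + r], via the recurrence formula."""
    if r == 1:
        return 1.0 if t[k] <= x < t[k + 1] else 0.0
    left = right = 0.0
    if t[k + r - 1] > t[k]:
        left = (x - t[k]) / (t[k + r - 1] - t[k]) * bspline(k, r - 1, t, x)
    if t[k + r] > t[k + 1]:
        right = (t[k + r] - x) / (t[k + r] - t[k + 1]) * bspline(k + 1, r - 1, t, x)
    return left + right


def gamma(k, r, t, theta):
    """gamma_{k,r}(theta) = (theta - t[k+1]) ... (theta - t[k+r-1]) / (r-1)!."""
    product = 1.0
    for i in range(1, r):
        product *= theta - t[k + i]
    return product / factorial(r - 1)


def marsden_sides(r, t, theta, x):
    """Both sides of the Marsden identity at the point x."""
    lhs = (theta - x) ** (r - 1) / factorial(r - 1)
    rhs = sum(gamma(k, r, t, theta) * bspline(k, r, t, x)
              for k in range(len(t) - r))
    return lhs, rhs


uniform = [float(i) for i in range(8)]
lhs, rhs = marsden_sides(3, uniform, theta=1.3, x=3.4)
print(lhs, rhs)
```

The same check works for non-uniform knots and other orders, since the induction in the proof never uses the spacing of the knots.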

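Step 3. puB-spline series of polynomials in Π(r): For any θ ∈ ℝ, and any polynomial P ∈ Π(r), we can write

\[
P = \sum_{k=1-r}^{n} \gamma_{k,r}(P)\,N_{k,r}\big|_{[a,b]}, \qquad \text{where } \gamma_{k,r}(P) = \sum_{\nu=0}^{r-1} (-1)^{\nu}\,\gamma_{k,r}^{(r-\nu-1)}(\theta)\,P^{(\nu)}(\theta).
\]

The functionals γ_{k,r} are called de Boor-Fix functionals.

Notice first that the previous Marsden identities can be completed with the expression of any power (θ - x)^ν for ν < r - 1 by differentiation with respect to θ:

\[
\begin{aligned}
\frac{(\theta-x)^{\nu}}{\nu!}
&= D^{r-1-\nu}\left( \frac{(\theta-x)^{r-1}}{(r-1)!} \right) \\
&= D^{r-1-\nu}\left( \sum_{k=1-r}^{n} \gamma_{k,r}(\theta)\,N_{k,r}(x)\big|_{[a,b]} \right) \\
&= \sum_{k=1-r}^{n} \gamma_{k,r}^{(r-\nu-1)}(\theta)\,N_{k,r}(x)\big|_{[a,b]}.
\end{aligned}
\]

Given now any polynomial P ∈ Π(r), consider any value θ ∈ ℝ and the Taylor expansion of P around θ:

\[
P(x) = \sum_{\nu=0}^{r-1} P^{(\nu)}(\theta)\,\frac{(x-\theta)^{\nu}}{\nu!}
= \sum_{\nu=0}^{r-1} (-1)^{\nu}\,P^{(\nu)}(\theta) \sum_{k=1-r}^{n} \gamma_{k,r}^{(r-\nu-1)}(\theta)\,N_{k,r}(x)\big|_{[a,b]},
\]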

which, after rearrangement, gives the desired coefficients.

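It only remains to prove that a different choice of θ leads to the same coefficients, and therefore the functionals γ_{k,r} : Π(r) → ℝ do not depend on this choice; let θ' ≠ θ, and write each derivative P^{(ν)}(θ) in Taylor expansion around θ':

\[
P^{(\nu)}(\theta) = \sum_{j=\nu}^{r-1} P^{(j)}(\theta')\,\frac{(\theta-\theta')^{j-\nu}}{(j-\nu)!}.
\]

Notice that then,

\[
\begin{aligned}
\gamma_{k,r}(P)
&= \sum_{\nu=0}^{r-1} (-1)^{\nu}\,\gamma_{k,r}^{(r-\nu-1)}(\theta)\,P^{(\nu)}(\theta) \\
&= \sum_{\nu=0}^{r-1} (-1)^{\nu}\,\gamma_{k,r}^{(r-\nu-1)}(\theta) \sum_{j=\nu}^{r-1} P^{(j)}(\theta')\,\frac{(\theta-\theta')^{j-\nu}}{(j-\nu)!} \\
&= \sum_{j=0}^{r-1} (-1)^{j}\,P^{(j)}(\theta') \sum_{\nu=0}^{j} \gamma_{k,r}^{(r-\nu-1)}(\theta)\,\frac{(\theta'-\theta)^{j-\nu}}{(j-\nu)!} \\
&= \sum_{j=0}^{r-1} (-1)^{j}\,\gamma_{k,r}^{(r-j-1)}(\theta')\,P^{(j)}(\theta').
\end{aligned}
\]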

Step 4. puB-spline series of truncated powers: For any basic knot t_j and a power ν for 0 ≤ ν < r - m_j (m_j being the multiplicity of the knot t_j),

\[
\frac{(x-t_j)_+^{\nu}}{\nu!} = (-1)^{\nu} \sum_{k=j}^{n} \gamma_{k,r}^{(r-\nu-1)}(t_j)\,N_{k,r}(x)\big|_{[a,b]}.
\]

This is a direct consequence of the previous step; for any basic knot t_j, it must be 1 ≤ j ≤ n, and we can write:

\[
\begin{aligned}
\frac{(x-t_j)_+^{\nu}}{\nu!}
&= (-1)^{\nu}\,\frac{(t_j-x)^{\nu}}{\nu!}\,\chi_{(t_j,b]} \\
&= (-1)^{\nu} \sum_{k=1-r}^{n} \gamma_{k,r}^{(r-\nu-1)}(t_j)\,N_{k,r}(x)\big|_{[a,b] \cap (t_j,b]} \\
&= (-1)^{\nu} \sum_{k=j}^{n} \gamma_{k,r}^{(r-\nu-1)}(t_j)\,N_{k,r}(x)\big|_{[a,b]},
\end{aligned}
\]

as we wanted to prove.

Conclusion: We have expressions of every basic element of S_r(t;[a,b]) in terms of the constructed puB-splines. This means that they span the Schoenberg space, and because of their cardinality, they must form a basis of the space. [#]

Remark. Consider the dual functionals associated to the basis of S_r(t;[a,b]) given by the puB-splines; let us denote them α_{k,r}. There are different ways of expressing these functionals; de Boor and Fix offer the most useful for our purposes: For each k = 1 - r,...,n, choose θ_{k,r} ∈ supp(N_{k,r}) = (t_k,t_{k+r}) ∩ [a,b], and write

\[
\alpha_{k,r}(S) = \sum_{\nu=0}^{r-1} (-1)^{\nu}\,\gamma_{k,r}^{(r-\nu-1)}(\theta_{k,r})\,S^{(\nu)}(\theta_{k,r}).
\]

The convention is that, if θ_{k,r} is one of the knots, then some of the terms in the sum are zero, those where γ_{k,r} vanishes.

Consider the operator Q_t : S_r(t;[a,b]) → S_r(t;[a,b]) given by

\[
Q_t(S) = \sum_{k=1-r}^{n} \alpha_{k,r}(S)\,N_{k,r}.
\]

One may try to extend this operator to the whole space X; for this task it sometimes suffices to use the Hahn-Banach Lemma (Theorem 28 in page 101), although in most cases the hypotheses of this Theorem are not satisfied, and one must look for other related constructions and partial extensions into proper subspaces of X. We will get back to this idea in §2.5.

2.4.2 Quasi-Interpolant Operators

It is not hard to show that the projector Q_t is a bounded operator on the Schoenberg spaces; given S ∈ S_r(t;[a,b]), we have

\[
\begin{aligned}
\|Q_t(S)\|_{L_p[a,b]}
&= \Big\| \sum_{k=1-r}^{n} \alpha_{k,r}(S)\,N_{k,r} \Big\|_{L_p[a,b]} \\
&\le \max_{1-r \le k \le n} |\alpha_{k,r}(S)|\; \Big\| \sum_{k=1-r}^{n} N_{k,r} \Big\|_{L_p[a,b]} \\
&\qquad \text{(each } \alpha_{k,r} \text{ is a linear functional over a finite dimensional space of dimension } r+n\text{; therefore, they are bounded)} \\
&\le C_{r,n}\,\|S\|_{L_p[a,b]}\,\|1\|_{L_p[a,b]} \\
&\le C_{r,n}\,|b-a|^{1/p}\,\|S\|_{L_p[a,b]}
\end{aligned}
\]

Notice that the bound depends on the number of knots, n, and this is not useful in the sense that, when the number of knots gets bigger, the constant may grow without bound. But there is good news: one can use the scaling properties of our puB-splines to find a bound independent of the number of knots. In order to achieve this surprising result, we must consider a special choice of subintervals associated to each puB-spline function: For each puB-spline N_{k,r}, we will denote J_{k,r} the largest subinterval (t_j,t_{j+1}) on its support (in case of several subintervals with the same property, choose the one with smallest index j). Notice that it must be |J_{k,r}| ≥ (t_{k+r} - t_k)/r. Also, for each subinterval (t_j,t_{j+1}), we will consider the slightly larger subinterval (t_{j-r+1},t_{j+r}) ∩ [a,b], which we denote I_{j,r}. Notice that J_{k,r} ⊂ I_{j,r} whenever the interval (t_j,t_{j+1}) is contained in the support of the puB-spline N_{k,r}, and that the number of intervals J_{k,r} contained in each I_{j,r} is precisely r (this will be used several times to achieve dependence on r of several constants).

Lemma 2.9 There exists C > 0 (depending at most on r), such that for all 0 < p ≤ ∞, k = 1 - r,...,n and S ∈ S_r(t;[a,b]),

\[
|\alpha_{k,r}(S)| \le C\,|J_{k,r}|^{-1/p}\,\|S\|_{L_p(J_{k,r})}
\]

Proof. We will use Lemma 2.1 (page 30), Markov’s Theorem (page 33), and the fact that the functions γ_{k,r} and their derivatives are polynomials, hence bounded on any compact set. Consider any point θ_{k,r} ∈ J_{k,r}:

\[
\begin{aligned}
|\alpha_{k,r}(S)|
&= \Big| \sum_{\nu=0}^{r-1} (-1)^{\nu}\,\gamma_{k,r}^{(r-\nu-1)}(\theta_{k,r})\,S^{(\nu)}(\theta_{k,r}) \Big| \\
&\le \sum_{\nu=0}^{r-1} \big|\gamma_{k,r}^{(r-\nu-1)}(\theta_{k,r})\big|\;\|S^{(\nu)}\|_{L_\infty(J_{k,r})} \\
&\qquad \text{(apply Markov's Theorem and find a common bound of the functions } \gamma_{k,r}\text{)} \\
&\le C_r \sum_{\nu=0}^{r-1} \|S\|_{L_\infty(J_{k,r})} \\
&\qquad \text{(apply Lemma 2.1 [page 30])} \\
&\le C_r\,|J_{k,r}|^{-1/p}\,\|S\|_{L_p(J_{k,r})} \qquad [\#]
\end{aligned}
\]

Remark. For p ≥ 1, using the Hahn-Banach Theorem (Theorem 29 in page 102), we realize that we can extend these functionals to L_p[a,b]: For each k there exists a linear functional (abusing notation, we denote them equally) such that |α_{k,r}(f)| ≤ C|J_{k,r}|^{-1/p}||f||_{L_p(J_{k,r})} for all f ∈ L_p(J_{k,r}). Furthermore, we can also extend the linear operator Q_t to L_p[a,b] for p ≥ 1. This is what we call the quasi-interpolant of order r corresponding to the knots t in [a,b].

Proposition 5 For all 1 ≤ p ≤ ∞ there exists a constant C > 0 that depends at most on p and the order r, such that the following local and global estimates hold:

\[
\begin{aligned}
\|Q_t(f)\|_{L_p[t_j,t_{j+1}]} &\le C\,\|f\|_{L_p(I_{j,r})} && (2.3) \\
\|Q_t(f)\|_{L_p[a,b]} &\le C\,\|f\|_{L_p[a,b]} && (2.4)
\end{aligned}
\]

Proof. Estimate (2.3) is a direct consequence of the remark after Lemma 2.9, the “partition of unity” property of the puB-splines, and the fact that |J_{k,r}| ≥ (t_{k+r} - t_k)/r and J_{k,r} ⊂ I_{j,r} for all suitable indices k (those for which the support of N_{k,r} contains [t_j,t_{j+1}]):

\[
\begin{aligned}
\|Q_t(f)\|_{L_p[t_j,t_{j+1}]}
&= \Big\| \sum_{k=1-r}^{n} \alpha_{k,r}(f)\,N_{k,r} \Big\|_{L_p[t_j,t_{j+1}]} \\
&\le \max_{j-r+1 \le k \le j} |\alpha_{k,r}(f)|\; \Big\| \sum_{k=1-r}^{n} N_{k,r} \Big\|_{L_p[t_j,t_{j+1}]} \\
&\le C_r \max_{j-r+1 \le k \le j} |J_{k,r}|^{-1/p}\,\|f\|_{L_p(J_{k,r})}\;\|\chi_{[t_j,t_{j+1}]}\|_{L_p[a,b]} \\
&\le C_r\,(t_{k+r}-t_k)^{-1/p}\,\|f\|_{L_p(I_{j,r})}\,(t_{k+r}-t_k)^{1/p}
\end{aligned}
\]

And the first estimate follows. Estimate (2.4) is a direct consequence of the previous one:

\[
\begin{aligned}
\|Q_t(f)\|_{L_p[a,b]}^p
&= \sum_{k=1-r}^{n} \|Q_t(f)\|_{L_p[t_k,t_{k+1}]}^p \\
&\le C_r \sum_{k=1-r}^{n} \|f\|_{L_p(I_{k,r})}^p \\
&\qquad \text{(on each interval } I_{j,r} \text{ there are precisely } r \text{ subintervals } J_{k,r}\text{; therefore, we reduce the previous sum} \\
&\qquad \text{to a sum over mutually disjoint intervals that add up to } [a,b]\text{)} \\
&\le r\,C_r\,\|f\|_{L_p[a,b]}^p = C_r\,\|f\|_{L_p[a,b]}^p \qquad [\#]
\end{aligned}
\]

Remark. Unfortunately, one cannot use Hahn-Banach to find the same kind of results for 0 < p < 1, although similar approximation results will remain valid. In §2.5 we will show how, in a more general setup.

2.5 Tensor Product Splines: Description of the problem

Definition 13 A tensor product puB-spline N : ℝ^d → ℝ of coordinate order r (coordinate degree < r) is a product of univariate puB-splines of order r, each of them with a different variable: N(x₁,...,x_d) = N₁(x₁)···N_d(x_d).

We have all the necessary ingredients to pose the problem of approximation on cubes of ℝ^d by dyadic splines. Let Ω = [0,1]^d be the unit cube in ℝ^d, and let X = L_p(Ω) with the corresponding (quasi)norm for all 0 < p ≤ ∞. The family of approximants will be constructed as spaces spanned by tensor product puB-splines. The construction of those spaces starts in the real line:

Consider for each n ∈ ℕ the basic knots in [0,1] given by t_n = {k2^{-n} : 0 < k < 2ⁿ}, and t₀ = Ø by definition. Associated to these basic knots we will use the following Schoenberg spaces:

n = 0 :: Consider auxiliary knots 1 - r,...,0 and 1,...,r, and the corresponding puB-splines N_{k,r} = N(· | k,...,k + r) for all k = 1 - r,...,0. Notice that all of those functions may be obtained as (restrictions on [0,1] of) horizontal shifts of N_{0,r}. We can then write N_{k,r}(x) = N_{0,r}(x - k).
n ≥ 1 :: Consider auxiliary knots k2^{-n} for k = 1 - r,...,0 and k = 2ⁿ,...,2ⁿ + r - 1, and the corresponding puB-splines. Notice that, as in the previous case, we can obtain each of them as (restrictions on [0,1] of) horizontal shifts of dyadic dilations of N_{0,r}. We need to update our notation in order to handle the new situation:
For each n ≥ 0, denote N_{0,r}^{[n]}(x) = N_{0,r}(2ⁿx) (the dyadic dilation of order 2^{-n}), and then for each k = 1 - r,...,2ⁿ - 1, N_{k,r}^{[n]}(x) = N_{0,r}(2ⁿx - k) (horizontal shifts of length k2^{-n}).

Notice that {N_{k,r}^{[n]}}_{k=1-r}^{2ⁿ-1} is a basis for S_r(t_n;[0,1]); let us denote α_{k,r}^{[n]} the corresponding dual functionals.

Let us move into d dimensions, where we will write x = (x₁,...,x_d) ∈ ℝ^d. Consider for each multi-index k = (k₁,...,k_d), the tensor product puB-spline N_{k,r}^{[n]}(x) = N_{k₁,r}^{[n]}(x₁)···N_{k_d,r}^{[n]}(x_d), and the functionals α_{k,r}^{[n]} = α_{k₁,r}^{[n]} ∘ ··· ∘ α_{k_d,r}^{[n]}.
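The shift-and-dilation structure described above is simple to reproduce in code. The sketch below rests on a few assumptions: N_{0,r} is taken to be the B-spline of order r on the integer knots 0,...,r, evaluated with the uniform-knot version of the recurrence formula, and the names `cardinal_bspline`, `dyadic_bspline` and `tensor_bspline` are hypothetical.

```python
def cardinal_bspline(r, x):
    """N_{0,r}(x): B-spline of order r on the integer knots 0, 1, ..., r."""
    if r == 1:
        return 1.0 if 0.0 <= x < 1.0 else 0.0
    return (x * cardinal_bspline(r - 1, x)
            + (r - x) * cardinal_bspline(r - 1, x - 1.0)) / (r - 1)


def dyadic_bspline(n, k, r, x):
    """N^{[n]}_{k,r}(x) = N_{0,r}(2^n x - k): dilation by 2^n, then shift by k."""
    return cardinal_bspline(r, 2 ** n * x - k)


def tensor_bspline(n, ks, r, xs):
    """Tensor product of univariate dyadic B-splines, one per coordinate."""
    value = 1.0
    for k, x in zip(ks, xs):
        value *= dyadic_bspline(n, k, r, x)
    return value


r, n = 3, 2
indices = range(1 - r, 2 ** n)            # k = 1 - r, ..., 2^n - 1
univariate_sum = sum(dyadic_bspline(n, k, r, 0.3) for k in indices)
bivariate_sum = sum(tensor_bspline(n, (k1, k2), r, (0.3, 0.85))
                    for k1 in indices for k2 in indices)
print(univariate_sum, bivariate_sum)
```

Both sums equal 1 on the unit interval and the unit square respectively, which is the partition of unity property restricted to Ω.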

Lemma 2.10 For each S ∈ X_n, any tensor product puB-spline N_{k,r}^{[n]} = N_{k₁,r}···N_{k_d,r}, and any point θ_{k,r} = (θ_{k₁,r},...,θ_{k_d,r}) ∈ supp(N_{k,r}^{[n]}) ∩ Ω (θ_{k_j,r} ∈ supp(N_{k_j,r}) ∩ proj_j(Ω)),

\[
\alpha_{k,r}^{[n]}(S) = \sum_{\nu=0}^{r-1} a_{\nu,k,r}(\theta_{k,r})\,D^{\nu}S(\theta_{k,r}), \qquad
\text{where } a_{\nu,k,r}(\theta_{k,r}) = (-1)^{|\nu|}\,\gamma_{k_1,r}^{(r-\nu_1-1)}(\theta_{k_1,r}) \cdots \gamma_{k_d,r}^{(r-\nu_d-1)}(\theta_{k_d,r}).
\]

Proof. Given a multi-index k = (k₁,...,k_d), and a tensor product spline S ∈ X_n, write

\[
S = \sum_{j_d \ne k_d} c_j\,N_{j,r}^{[n]} + N_{k_d,r} \sum_{j' \in \mathbb{Z}^{d-1}} c_{j'}\,N_{j',r}^{[n]}.
\]

Make α_{k_d,r} act on the previous expression to obtain

\[
\alpha_{k_d,r}(S) = \sum_{\nu_1=0}^{r-1} (-1)^{\nu_1}\,\gamma_{k_d,r}^{(r-\nu_1-1)}(\theta_{k_d,r}) \Big( \sum_{j' \in \mathbb{Z}^{d-1}} c_{j'}\,N_{j',r}^{[n]} \Big) D^{\nu_1}N_{k_d,r}(\theta_{k_d,r}).
\]

Make now the rest of the univariate functionals α_{k_j,r} act on the previous expression, one at a time in decreasing order, to obtain the desired result. [#]

It follows that these are the dual functionals of the constructed tensor product puB-splines, and therefore the functions N_{k,r}^{[n]} are linearly independent in L_p(Ω). Let us denote X_n = span{N_{k,r}^{[n]}}, where k has all its indices between 1 - r and 2ⁿ - 1. This is the space of all piecewise polynomials with coordinate order r and maximal smoothness on dyadic subcubes of size 2^{-n} in the cube Ω (since D^ν N_{k,r}^{[n]} ∈ L_∞(Ω) for all multi-indices 0 ≤ ν ≤ r - 1, and D^ν N_{k,r}^{[n]} is continuous for multi-indices 0 ≤ ν ≤ r - 2). As in the univariate case, each tensor product puB-spline N_{k,r}^{[n]} can be obtained from N_{0,r}^{[0]} by shifts and dilations:

\[
N_{k,r}^{[n]}(x) = N_{0,r}^{[0]}(2^n x - k)
\]

The family {X_n}_n is our family of approximants:

- Each X_n is a linear space: λX_n = X_n, and X_n + X_n = X_n trivially.
- By Theorem 3 (page 14), there exist elements of best approximation, although so far we have no means of computing them for a given f ∈ L_p(Ω). By using the construction below and the results in sections 2.1 and 2.4.2, we will nevertheless be able to construct suitable elements of near-best approximation.
- Notice that X_n ⊂ X_{n+1} for all n. It can be proved that ∪_n X_n is dense in X, and therefore lim_n E(·,X_n) = 0, monotonically decreasing. We can therefore use Lemma 1.1 (page 5) if needed.

We would like now to have a projector, but this is not an easy task. The quasi-interpolant of §2.4.2 does not work for 0 < p < 1, but is still useful. The trick is to find first intermediate spaces X_n ⊂ Y_n ⊂ X for each n, where the quasi-interpolants can be easily extended, and such that we can effectively compute (near)best approximations from Y_n to elements of X. In the case we are studying here, the obvious choice works just fine:

For each multi-index j = (j₁,...,j_d) with 1 ≤ j_i ≤ 2ⁿ, consider the dyadic cubes of Ω, □_{j,n} = [(j₁ - 1)2^{-n},j₁2^{-n}] × ··· × [(j_d - 1)2^{-n},j_d2^{-n}], and D_n = {□_{j,n} : 1 ≤ j ≤ 2ⁿ} the family of those cubes. Let us denote Y_n = ⊕_{j=1}^{2ⁿ} Π(r;□_{j,n}) the space of piecewise polynomials of coordinate order r associated to the dyadic partition D_n.

As we did before for univariate puB-splines, we need to consider for each tensor product puB-spline N_{k,r}^{[n]}, two special cubes:

J_{k,r}:: J_{k,r} = J_{k₁,r} × ··· × J_{k_d,r} ∈ D_n (read §2.4.2 for the definition of the intervals J_j).
I_{j,r}:: For each 1 ≤ j ≤ 2ⁿ, denote I_{j,r} the smallest cube that contains each of the J_{k,r} for which supp(N_{k,r}^{[n]}) ∩ □_{j,n} ≠ Ø. Notice that I_{j,r} ∈ D_m for some m < n; therefore, |I_{j,r}| / |□_{j,n}| = 2^{(n-m)d}. Also, and as before, the number of subcubes J_{k,r} contained in each I_{j,r} depends solely on r and d.

In order to obtain the de Boor-Fix expression of the dual functionals of the tensor product puB-splines N_{k,r}^{[n]}, we will choose canonically θ_{k,r} to be the center of the cubes J_{k,r}. In that case, one realizes that these functionals can also be applied to any function f ∈ L_p(Ω) which is differentiable enough at each of the points θ_{k,r}; in particular, any piecewise polynomial P ∈ Y_n:

Lemma 2.11 For any 0 < p ≤ ∞ there exists a constant C > 0 which depends at most on the order r, such that

\[
\big|\alpha_{k,r}^{[n]}(P)\big| \le 2^{nd/p}\,C\,\|P\|_{L_p(J_k)}. \qquad [\#]
\]

The proof of this lemma follows the same steps as the one for Lemma 2.9 (page 47) and its posterior remark. We can also construct quasi-interpolant operators Q_n : Y_n → X_n, which act naturally as projectors.

Proposition 6 For 0 < p ≤ ∞ there exists a constant c > 0 depending at most on r, d and p, such that the following estimates hold:

\[
\begin{aligned}
\|Q_n(P)\|_{L_p(\square_{j,n})} &\le c\,\|P\|_{L_p(I_{j,r})} && (2.5) \\
\|Q_n(P)\|_{L_p(\Omega)} &\le c\,\|P\|_{L_p(\Omega)} && (2.6) \\
\|P - Q_n(P)\|_{L_p(\square_{j,n})} &\le c\,E(P;\Pi(r;I_{j,r}))_p && (2.7)
\end{aligned}
\]

The proof uses the previous lemma, and follows the same steps as the proof of Proposition 5 and its posterior remark (page 48).

We are ready to construct the method of approximation: Given τ > 0, consider any operator of τ near-best L_p approximation by elements of Y_n, say τ_n : X → Y_n such that ||f - τ_n(f)||_{L_p(Ω)} ≤ τE(f,Y_n)_p. Notice that such operators may be constructed by collecting the restriction on cubes of the near-best L_p approximations to f by polynomials in Π(r) on each cube □_{j,n} ∈ D_n, and patching them together. Let then

\[
T_n : X \ni f \longmapsto (Q_n \circ \tau_n)(f) \in X_n
\]

It follows easily that these are linear operators. The most important properties of these operators are presented in the following two propositions:

Proposition 7 Given τ > 0, there exists a constant C > 0 which depends at most on p, r, d and τ such that ||T_n(f)||_{L_p(Ω)} ≤ C||f||_{L_p(Ω)} for all f ∈ L_p(Ω).

Proof. Notice first that, for every subcube □_{j,n} ∈ D_n, we have

\[
\begin{aligned}
\|\tau_n(f)\|_{L_p(\square_{j,n})}
&\le C_p\Big\{ \|\tau_n(f)-f\|_{L_p(\square_{j,n})} + \|f\|_{L_p(\square_{j,n})} \Big\} \\
&\le C_{p,\tau}\Big\{ E(f,\Pi(r);\square_{j,n})_p + \|f\|_{L_p(\square_{j,n})} \Big\} \\
&\le C_{p,\tau}\,\|f\|_{L_p(\square_{j,n})};
\end{aligned}
\]

therefore,

\[
\begin{aligned}
\|T_n(f)\|_{L_p(\Omega)}^p
&= \|Q_n(\tau_n(f))\|_{L_p(\Omega)}^p = \sum_{j=1}^{2^n} \|Q_n(\tau_n(f))\|_{L_p(\square_{j,n})}^p \\
&\qquad \text{(apply estimate (2.5) above)} \\
&\le C_{r,d,p} \sum_{j=1}^{2^n} \|\tau_n(f)\|_{L_p(I_{j,r})}^p \le C_{r,d,p,\tau} \sum_{j=1}^{2^n} \|f\|_{L_p(I_{j,r})}^p \\
&\qquad \text{(independence of } n \text{ is guaranteed, see description of } I_{j,r}\text{)} \\
&\le C_{r,d,p,\tau}\,\|f\|_{L_p(\Omega)}^p \qquad [\#]
\end{aligned}
\]

Proposition 8 T_n : L_p(Ω) → X_n is an operator of near-best L_p approximation from elements of X_n.

Proof. Given f ∈ L_p(Ω), let S_n ∈ X_n be its best L_p approximation from X_n; then, since Q_n(S_n) = S_n and T_n(f) = Q_n(τ_n(f)), we have the estimate:

\[
\begin{aligned}
\|f - T_n(f)\|_{L_p(\Omega)}
&\le C_r\Big\{ \|f - S_n\|_{L_p(\Omega)} + \|S_n - T_n(f)\|_{L_p(\Omega)} \Big\} \\
&= C_{r,d}\Big\{ E(f,X_n)_p + \|Q_n(S_n - \tau_n(f))\|_{L_p(\Omega)} \Big\} \\
&\le C_{r,d,p}\Big\{ E(f,X_n)_p + \|S_n - \tau_n(f)\|_{L_p(\Omega)} \Big\} \\
&\le C_{r,d,p}\Big\{ E(f,X_n)_p + \|S_n - f\|_{L_p(\Omega)} + \|f - \tau_n(f)\|_{L_p(\Omega)} \Big\} \\
&\le C_{r,d,p}\Big\{ E(f,X_n)_p + \tau E(f,Y_n)_p \Big\} \\
&\le C_{r,d,p,\tau}\Big\{ E(f,X_n)_p + \|f - S_n\|_{L_p(\Omega)} \Big\};
\end{aligned}
\]

which proves the statement. [#]

Remark. Notice that now we may use indistinctly for each f ∈ L_p(Ω) either E(f,X_n)_p or ||f - T_n(f)||_{L_p(Ω)}, since they are equivalent. Moreover, when searching for the approximation spaces, we may use the (quasi)seminorm functions

\[
|f|_{A^{\alpha}_q(L_p(\Omega),X_n)}
\asymp \Big\{ \sum_{n=1}^{\infty} \frac{1}{n}\Big( n^{\alpha}\,\|f - T_n(f)\|_{L_p(\Omega)} \Big)^q \Big\}^{1/q}
\asymp \Big( \sum_{k=0}^{\infty} 2^{kq\alpha}\,\|f - T_{2^k}(f)\|_{L_p(\Omega)}^q \Big)^{1/q}
\asymp \Big( \sum_{n=0}^{\infty} 2^{nq\alpha}\,\|f - T_n(f)\|_{L_p(\Omega)}^q \Big)^{1/q}
\]

Our next goal is to try to identify the approximation spaces A^α_q(L_p(Ω),X_n) in terms of classical spaces. For this task, the first step to take is to find known (quasi)seminorms with the same properties as our objects ||f - T_n(f)||_{L_p(Ω)}. Among the properties we are interested in, the most obvious is that the space of polynomials of coordinate order r is properly contained in each of the kernels. Good candidates are therefore the Sobolev seminorms (but these are only defined for values p ≥ 1 and even in those cases, not for all functions in L_p(Ω)), and the r-th moduli of smoothness (and these do exist for all functions f ∈ L_p(Ω), 0 < p ≤ ∞). We will explore both functionals and the spaces related to them in the next sections.

2.6 Sobolev Spaces

2.6.1 Mollifiers and Infinitely Differentiable Partitions of Unity

In this section we introduce two important tools in the Theory of Sobolev Spaces: mollifiers and infinitely differentiable partitions of unity. We also illustrate how to use the former to construct plateau functions and prove the density of C₀^∞(G) in L_p(G) for 1 ≤ p < ∞ on domains G ⊂ ℝ^d.

Definition 14 We call a mollifying kernel any nonnegative, real-valued function Φ ∈ C₀^∞(ℝ^d) such that Φ(x) = 0 for |x| ≥ 1 and ∫_{ℝ^d} Φ(x)dx = 1; we call a mollifier any function Φ_ε(x) = ε^{-d}Φ(x/ε) for any ε > 0.
Given a function f ∈ M(ℝ^d,|·|) for which the integral ∫_{ℝ^d} Φ_ε(x - y)f(y)dy makes sense, we call the convolution (Φ_ε * f)(x) a mollification or regularization of f.

Example. An example of mollifying kernels are (up to a normalizing constant) the “bump” functions Φ_d : ℝ^d → ℝ given by

\[
\Phi_d(x) = \exp\left( \frac{1}{|x|^2-1} \right) \chi_{B(0,1)}(x)
\]

Figure 2.1: Φ₂(x) = exp(1/(|x|² - 1)) χ_{B(0,1)}(x)
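A one-dimensional mollification can be computed with a few lines of quadrature. The sketch below makes several assumptions that are not in the text: dimension d = 1, a midpoint rule with a fixed number of nodes, numerical normalization of the bump so that the discrete kernel has unit mass, and a step function as the function to regularize; the names `bump`, `mollify` and `step` are hypothetical.

```python
import math


def bump(u):
    """Unnormalized bump exp(1 / (u^2 - 1)) on (-1, 1), zero elsewhere."""
    return math.exp(1.0 / (u * u - 1.0)) if abs(u) < 1.0 else 0.0


NODES = 2000
H = 2.0 / NODES
U = [-1.0 + (i + 0.5) * H for i in range(NODES)]   # midpoint nodes in (-1, 1)
MASS = sum(bump(u) for u in U) * H                 # numerical integral of the bump


def mollify(f, eps, x):
    """Approximate the mollification of f at x with a kernel of radius eps.

    Uses the change of variables y = x - eps * u, so that the integral of
    Phi_eps(x - y) f(y) dy becomes the integral of Phi(u) f(x - eps * u) du.
    """
    return sum(bump(u) * f(x - eps * u) for u in U) * H / MASS


step = lambda y: 1.0 if y > 0.0 else 0.0

far_right = mollify(step, 0.1, 0.5)    # farther than eps from the jump
far_left = mollify(step, 0.1, -0.5)
at_jump = mollify(step, 0.1, 0.0)      # the kernel averages both sides
print(far_left, at_jump, far_right)
```

Away from the jump the regularized function coincides with the original one, while within distance ε of the jump it interpolates smoothly between 0 and 1.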

Theorem 11 Given a domain G ⊂ ℝ^d, C₀(G) is dense in L_p(G) for all 1 ≤ p < ∞.

Proof. Assume f ∈ M(ℝ^d,|·|). Let (s_n)_n be a monotonically increasing sequence of nonnegative simple functions converging pointwise to f. As p ≥ 1, we have 0 ≤ s_n(x)^p ≤ f(x)^p a.e., and therefore, it must be s_n ∈ L_p(G), and furthermore, by the Dominated Convergence Theorem, lim_n ||f - s_n||_{L_p(G)} = 0 (since |f(x) - s_n(x)|^p ≤ f(x)^p for all x). Given ε > 0, find s_n such that ||f - s_n||_{L_p(G)} < ε/2. Use now Lusin’s Theorem to find a continuous function g ∈ C₀(G) such that |g(x)| ≤ ||s_n||_{L_∞(G)} and more importantly,

\[
\big| \{ x \in G : s_n(x) \ne g(x) \} \big| < \Big( \varepsilon\,|\mathrm{supp}(s_n)|^{-1}\,\|s_n\|_{L_\infty(G)}^{-1} \Big)^p .
\]

We have then

\[
\|f-g\|_{L_p(G)} \le \|f-s_n\|_{L_p(G)} + \|s_n-g\|_{L_p(G)} < \varepsilon/2 + \int_{\mathrm{supp}(s_n)} |s_n(x)-g(x)|\,dx < \varepsilon,
\]

and the result follows. [#]

Lemma 2.12 Given a domain G ⊂ ℝ^d, a mollification kernel Φ and a function f ∈ M(ℝ^d,|·|) such that f(x) = 0 for x ∉ G, the following holds:
(i) If f ∈ L₁^loc(G), then Φ_ε * f ∈ C^∞(ℝ^d) for all ε > 0.
(ii) If also supp(f) is compact, then Φ_ε * f ∈ C₀^∞(G) for all 0 < ε < dist(supp(f),∂G).
(iii) If f ∈ L_p(G) for any 1 ≤ p < ∞, then Φ_ε * f ∈ L_p(G); moreover,

\[
\lim_{\varepsilon \to 0^+} \|\Phi_\varepsilon * f - f\|_{L_p(G)} = 0
\]

(iv) If f ∈ C(G) and K ⊂ G is compact, then lim_{ε→0⁺} Φ_ε * f = f uniformly on K.
(v) If f ∈ C(\overline{G}), then lim_{ε→0⁺} Φ_ε * f = f uniformly on G.

Theorem 12 Given a domain G ⊂ ℝ^d, a compact subset K ⊂ G and 0 < ε < dist(K,∂G), there is a plateau function ψ ∈ C₀^∞(G) such that 0 ≤ ψ(x) ≤ 1 for all x ∈ G, and ψ(x) = 1 for all x ∈ K_ε = ∪_{y∈K} B(y,ε).

Proof. Let : ^d --> be a mollifying kernel, and consider = *_{_{K_3/2}}, the mollification of _{_{K_3/2}} with . This is the function we are looking for. [#]

Theorem 13 Given a domain G ⊂ ℝ^d and 1 ≤ p < ∞, C₀^∞(G) is dense in L_p(G).

Proof. This is a direct consequence of Theorem 11 and parts (ii) and (v) of the previous Lemma. [#]

Theorem 14 Given an arbitrary subset A ⊂ ℝ^d and an open cover O of this set, there exists a collection Ψ of functions ψ ∈ C₀^∞(ℝ^d) with the following properties:
(i) 0 ≤ ψ(x) ≤ 1 for all ψ ∈ Ψ and all x ∈ ℝ^d.
(ii) Given a compact subset K ⊂ A, all but possibly finitely many ψ ∈ Ψ vanish identically on K.
(iii) Given ψ ∈ Ψ, there exists U ∈ O such that supp(ψ) ⊂ U.
(iv) Σ_{ψ∈Ψ} ψ(x) = 1 for all x ∈ A.

2.6.2 Distributions and Weak Derivatives

Definition 15 Given a domain G ⊂ ℝ^d, consider the space D(G) consisting of the functions g ∈ C₀^∞(G), with the following notion of convergence: a sequence (g_n)_n in C₀^∞(G) converges to g if there exists a compact set K ⊂ G so that supp(g - g_n) ⊂ K for all n, and lim_n D^k g_n(x) = D^k g(x) uniformly on K for each multi-index k.
The dual space D'(G) is called the space of (Schwartz) distributions if it is given the weak-star topology as dual of D(G): lim_n T_n = T in D'(G) if and only if lim_n T_n(g) = T(g) in ℝ for every g ∈ D(G).

Remark. The space L₁^loc(G) can be identified with a subspace of D'(G) as follows: given f ∈ L₁^loc(G), let T_f : C₀^∞(G) ∋ g ↦ ∫_G f(x)g(x)dx ∈ ℝ. These functionals are trivially linear. Notice that they are also continuous: Given a sequence (g_n)_n in C₀^∞(G) such that there exists a compact K ⊂ G so that supp(g - g_n) ⊂ K for all n, and lim_n g_n(x) = g(x) uniformly on K, we have

\[
|T_f(g_n) - T_f(g)| \le \sup_{x \in K} |g(x) - g_n(x)| \int_K |f(x)|\,dx
\]

and the continuity holds in virtue of the uniform convergence of (g_n)_n. The identification is possible via Theorem 13: If T_{f₁} = T_{f₂}, then ∫_G (f₁ - f₂)g dx = 0 for all g ∈ C₀^∞(G); the density gives ∫_G (f₁ - f₂)g dx = 0 for all g ∈ L_p(G), and therefore, f₁ - f₂ = 0 a.e. This means that the map L₁^loc(G) → D'(G) that gives the identification is an injection.

Definition 16 Given G ⊂ ℝ^d, a multi-index k ∈ ℕ^d and a distribution T ∈ D'(G), we define its distributional k-th derivative D^kT by (D^kT)(g) = (-1)^{|k|}T(D^kg) for all g ∈ C₀^∞(G).
Similarly, for f ∈ L₁^loc(G) and k ∈ ℕ^d, we say φ ∈ L₁^loc(G) is a weak k-th derivative of f if T_φ is the distributional k-th derivative of T_f. This weak derivative might not exist, but in case it does, it must be unique a.e.; we denote it D_w^k f.
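The defining identity of a weak derivative can be tested numerically on the standard example f(x) = |x|, whose weak first derivative is the sign function. This is only a sketch: the test function g (a shifted bump supported in (-0.7, 1.3)), the midpoint quadrature, and the names `g`, `dg`, `integrate` and `weak_df` are all assumptions made for the illustration.

```python
import math


def g(x):
    """Test function: a smooth bump supported in (-0.7, 1.3)."""
    u = x - 0.3
    return math.exp(1.0 / (u * u - 1.0)) if abs(u) < 1.0 else 0.0


def dg(x):
    """Classical derivative of g (chain rule on the exponential)."""
    u = x - 0.3
    if abs(u) >= 1.0:
        return 0.0
    return g(x) * (-2.0 * u) / (u * u - 1.0) ** 2


def integrate(func, a=-1.0, b=2.0, cells=30000):
    """Midpoint rule on [a, b]; the interval contains the support of g."""
    h = (b - a) / cells
    return sum(func(a + (i + 0.5) * h) for i in range(cells)) * h


f = abs
weak_df = lambda x: 1.0 if x > 0 else -1.0     # candidate weak derivative of |x|

# Definition with |k| = 1: T_f(g') must equal -T_{weak_df}(g).
lhs = integrate(lambda x: f(x) * dg(x))
rhs = -integrate(lambda x: weak_df(x) * g(x))
print(lhs, rhs)
```

The test function is deliberately not symmetric about the origin; with a symmetric one both integrals would vanish and the check would be uninformative.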

2.6.3 Sobolev Spaces W_p^r(G) and H_p^r(G)

Definition 17 Given a domain G ^d, and r {0}, we define the following functionals:

{ sum } 1/p |f |W rp(G) = ||Dkf ||pLp(G) (for 1 < p < oo ) (2.8) |k|=r |f| r = max ||Dkf || (2.9) W oo (G) |k|=r L oo (G) 1/p { sum p } ||f||W rp(G) = | f |W jp(G) (for 1 < p < oo ) (2.10) 0<j<r ||f||Wr oo (G) = max ||Dkf ||L oo (G) (2.11) 0<|k|<r

Functionals (2.8) and (2.9) are trivially seminorms, and (2.10), (2.11) are norms. Associated to these functionals, we define the following spaces:

$H_p^r(G)$, the completion of $\{f \in C^r(G) : \|f\|_{W_p^r(G)} < \infty\}$ with respect to the norm $\|\cdot\|_{W_p^r(G)}$.
$W_p^r(G) = \{f \in L_p(G) : D_w^k f \in L_p(G) \text{ for all } 0 \le |k| \le r\}$.
$\mathring{W}_p^r(G)$, the closure of $C_0^\infty(G)$ in the space $W_p^r(G)$.

Remark. We have trivially $W_p^0(G) = L_p(G)$ for $1 \le p \le \infty$, and $\mathring{W}_p^0(G) = L_p(G)$ for $1 \le p < \infty$ (by Theorem 13). Notice also the chain of (continuous) embeddings for all $r$:

$$\mathring{W}_p^r(G) \hookrightarrow W_p^r(G) \hookrightarrow L_p(G).$$

We will prove that $H_p^r(G) = W_p^r(G)$ for $1 \le p < \infty$, and $H_\infty^r(G) \subsetneq W_\infty^r(G)$.

Theorem 15 $W_p^r(G)$ is a Banach space for all $G \subset \mathbb{R}^d$, $1 \le p \le \infty$ and $r \in \mathbb{N} \cup \{0\}$.

Proof. Let $(f_n)_n$ be a Cauchy sequence in $W_p^r(G) \subset L_p(G)$; then trivially $(D^k f_n)_n$ are Cauchy sequences in $L_p(G)$ for every multi-index $k$ with $0 \le |k| \le r$. Let $f, f^{(k)} \in L_p(G)$ be such that $\lim_n f_n = f$ and $\lim_n D^k f_n = f^{(k)}$, both in $L_p(G)$. As $L_p(G) \subset L_1^{loc}(G)$, each of those functions determines distributions $T_f, T_{f^{(k)}} \in D'(G)$. For any $g \in D(G)$, we have then (let $q$ be the conjugate exponent of $p$):

$$|T_{f_n}(g) - T_f(g)| \le \int_G |f_n(x) - f(x)|\,|g(x)|\,dx \le \|g\|_{L_q(G)} \|f_n - f\|_{L_p(G)};$$

therefore, $\lim_n T_{f_n}(g) = T_f(g)$ and similarly, $\lim_n T_{D^k f_n}(g) = T_{f^{(k)}}(g)$ for all $g \in D(G)$. It follows that

$$T_{f^{(k)}}(g) = \lim_n T_{D^k f_n}(g) = \lim_n (-1)^{|k|}\, T_{f_n}(D^k g) = (-1)^{|k|}\, T_f(D^k g),$$

and hence $f^{(k)} = D^k f$ in the distributional sense. The statement follows. [#]

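Lemma 2.13 Given domains $G' \subset G \subset \mathbb{R}^d$ such that the closure of $G'$ in $G$ is compact, $1 \le p < \infty$, $r \in \mathbb{N}$, a mollifying kernel $\varphi \in C_0^\infty(\mathbb{R}^d)$, and $f \in W_p^r(G)$, we have $\lim_{\varepsilon \to 0^+} \varphi_\varepsilon * f = f$ in $W_p^r(G')$.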

Theorem 16 (Meyers, Serrin) Given a domain $G \subset \mathbb{R}^d$, $1 \le p < \infty$, and $r \in \mathbb{N} \cup \{0\}$, we have $H_p^r(G) = W_p^r(G)$.

2.6.4 Properties of Sobolev Spaces in one dimension

For functions of one variable, both ordinary and generalized derivatives produce the same space. This is proved in the following two results:

Lemma 2.14 Let $A$ be an open interval, and $r \in \mathbb{N} \cup \{0\}$. If $f \in L_1^{loc}(A)$ verifies $\int_A f\,g^{(r)}\,dx = 0$ for all $g \in C_0^\infty(A)$, then $f$ is a.e. a polynomial of order $r$.

Proposition 9 If $f \in L_1^{loc}(A)$ has a weak $r$-th derivative $g \in L_1^{loc}(A)$, then it can be redefined on a set of measure zero so that $f^{(r-1)}$ is absolutely continuous, and $f^{(r)} = g$ a.e. on $A$.

Remark. In §2.7.2 we will make use of the K-functional of compatible couples $(L_p(G), W_p^r(G))$ for $G \subset \mathbb{R}^d$. We can use the previous results to illustrate how to compute it in one dimension. We will base the proof on the availability of the Taylor polynomial for functions in Sobolev spaces: for each $f \in W_p^r(A)$, consider the Taylor polynomial centered at $c \in A$:

$$T_{f;c,r}(x) = \sum_{k=0}^{r-1} \frac{f^{(k)}(c)}{k!}\,(x - c)^k$$

with error

$$f(x) - T_{f;c,r}(x) = \int_c^x f^{(r)}(t)\,\frac{(x-t)^{r-1}}{(r-1)!}\,dt.$$

Lemma 2.15 Given an open interval $A \subset \mathbb{R}$, $1 \le p, q \le \infty$, we have the estimate

$$\|f - T_{f;c,r}\|_{L_q(A)} \le \frac{|A|^{r - 1/p + 1/q}}{(r-1)!}\,\|f^{(r)}\|_{L_p(A)}.$$

Proof. Let $p'$ be the conjugate exponent of $p$; let us apply Hölder's Inequality:

$$|f(x) - T_{f;c,r}(x)| \le \left|\int_c^x |f^{(r)}(t)|\,\frac{|x-t|^{r-1}}{(r-1)!}\,dt\right| \le \frac{|A|^{r-1}}{(r-1)!}\int_A |f^{(r)}(t)|\,dt \le \frac{|A|^{r-1}}{(r-1)!}\,\|f^{(r)}\|_{L_p(A)}\,\|\chi_A\|_{L_{p'}(A)} = \frac{|A|^{r-1/p}}{(r-1)!}\,\|f^{(r)}\|_{L_p(A)};$$

therefore, for each $1 \le q < \infty$,

$$\|f - T_{f;c,r}\|_{L_q(A)}^q = \int_A |f(x) - T_{f;c,r}(x)|^q\,dx \le \int_A \left(\frac{|A|^{r-1/p}}{(r-1)!}\,\|f^{(r)}\|_{L_p(A)}\right)^q dx = \left(\frac{|A|^{r-1/p}}{(r-1)!}\,\|f^{(r)}\|_{L_p(A)}\right)^q |A|;$$

hence proving the desired statement (the case $q = \infty$ is the pointwise bound itself). [#]

Theorem 17 For $r \ge 2$, $1 \le p \le \infty$, there is a constant $C > 0$ depending at most on $r$, such that for all $f \in W_p^r(A)$, $0 < t \le |A|$ and $0 \le k \le r$,

$$t^k \|f^{(k)}\|_{L_p(A)} \le C\left(\|f\|_{L_p(A)} + t^r \|f^{(r)}\|_{L_p(A)}\right).$$

Proof. Given $0 < t < |A|$, we have for all $x \in A_t = \{x \in A \mid x + t \in A\}$,

$$f(x+t) = \sum_{k=0}^{r-1} \frac{f^{(k)}(x)}{k!}\,t^k + \underbrace{\int_x^{x+t} f^{(r)}(s)\,\frac{(x+t-s)^{r-1}}{(r-1)!}\,ds}_{R_r(x,t)}.$$

Notice that

$$|R_r(x,t)| \le \frac{t^{r-1}}{(r-1)!}\int_A |f^{(r)}(s)|\,\chi_{(x,x+t)}(s)\,ds \le \frac{t^{r-1}}{(r-1)!}\,\|f^{(r)}\|_{L_p(A)}\,t^{1-1/p} = \frac{t^{r-1/p}}{(r-1)!}\,\|f^{(r)}\|_{L_p(A)};$$

and therefore,

$$\|R_r(\cdot,t)\|_{L_p(A_t)} \le \frac{t^{r-1/p}\,|A_t|^{1/p}}{(r-1)!}\,\|f^{(r)}\|_{L_p(A)} \le \frac{t^r}{(r-1)!}\,\|f^{(r)}\|_{L_p(A)} \tag{2.12}$$

since $|A_t| \le t$ trivially.
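Consider now, for our choice of $0 < t < |A|$, $x \in A$ and $\xi > 0$ such that $x + \xi t \in A$. In this case, we have

$$f(x + \xi t) = \sum_{k=0}^{r-1} \frac{f^{(k)}(x)}{k!}\,\xi^k t^k + R_r(x, \xi t).$$

Choose $r - 1$ different values $1 < \xi_1 < \dots < \xi_{r-1}$, all of them bounded above by a constant slightly larger than $1$. Construct the following system of linear equations:

$$\begin{pmatrix} \xi_1 & \cdots & \xi_1^{r-1} \\ \vdots & \ddots & \vdots \\ \xi_{r-1} & \cdots & \xi_{r-1}^{r-1} \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \vdots \\ \alpha_{r-1} \end{pmatrix} = \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_{r-1} \end{pmatrix};$$

where $\alpha_k = t^k f^{(k)}(x)/k!$, and $\beta_k = f(x + \xi_k t) - f(x) - R_r(x, \xi_k t)$. The first matrix is a Vandermonde matrix; hence this system has a solution:

$$\begin{pmatrix} \alpha_1 \\ \vdots \\ \alpha_{r-1} \end{pmatrix} = \begin{pmatrix} \xi_1 & \cdots & \xi_1^{r-1} \\ \vdots & \ddots & \vdots \\ \xi_{r-1} & \cdots & \xi_{r-1}^{r-1} \end{pmatrix}^{-1} \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_{r-1} \end{pmatrix} = \begin{pmatrix} \gamma_{1,1} & \cdots & \gamma_{1,r-1} \\ \vdots & \ddots & \vdots \\ \gamma_{r-1,1} & \cdots & \gamma_{r-1,r-1} \end{pmatrix} \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_{r-1} \end{pmatrix},$$

where the values $\gamma_{i,j}$ are controlled by the values $\xi_k$. We have then $\alpha_k = \sum_{j=1}^{r-1} \gamma_{kj}\,\beta_j$ for all $k = 1, \dots, r-1$.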
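This leads to

$$\frac{t^k}{k!}\,\|f^{(k)}\|_{L_p(A)} = \Big\| \sum_{j=1}^{r-1} \gamma_{kj}\big(f(\cdot + \xi_j t) - f(\cdot) - R_r(\cdot, \xi_j t)\big) \Big\|_{L_p(A)} \le \sum_{j=1}^{r-1} |\gamma_{kj}|\,\big(2\|f\|_{L_p(A)} + \|R_r\|_{L_p(A)}\big)$$

and, using estimate (2.12) above,

$$\frac{t^k}{k!}\,\|f^{(k)}\|_{L_p(A)} \le 2\sum_{j=1}^{r-1} |\gamma_{kj}|\,\|f\|_{L_p(A)} + \sum_{j=1}^{r-1} |\gamma_{kj}|\,\frac{t^r}{(r-1)!}\,\|f^{(r)}\|_{L_p(A)},$$

and the statement follows.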
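
The Vandermonde step of the previous proof can be checked on a small example. The following is only a sketch under simplifying assumptions: $f$ is taken to be a polynomial of degree less than $r$, so that the remainder $R_r$ vanishes, and the nodes `xis`, the point `x` and the step `t` are arbitrary illustrative choices. The helper names (`solve_linear`, `poly_eval`, `poly_derivative`) are hypothetical and not part of the text. Exact rational arithmetic is used so that the recovered values $\alpha_k$ can be compared with $t^k f^{(k)}(x)/k!$ without rounding.

```python
from fractions import Fraction
from math import factorial


def solve_linear(matrix, rhs):
    """Solve a square linear system exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a nonzero pivot and move it into place.
        pivot = next(i for i in range(col, n) if aug[i][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        piv = aug[col][col]
        aug[col] = [v / piv for v in aug[col]]
        for i in range(n):
            if i != col and aug[i][col] != 0:
                factor = aug[i][col]
                aug[i] = [vi - factor * vc for vi, vc in zip(aug[i], aug[col])]
    return [aug[i][n] for i in range(n)]


def poly_eval(coeffs, x):
    """Evaluate sum_k coeffs[k] * x**k."""
    return sum(c * x**k for k, c in enumerate(coeffs))


def poly_derivative(coeffs, k):
    """Coefficients of the k-th derivative of the polynomial."""
    out = list(coeffs)
    for _ in range(k):
        out = [j * c for j, c in enumerate(out)][1:]
    return out


r = 4
coeffs = [Fraction(c) for c in (2, -1, 3, 5)]  # f(x) = 2 - x + 3x^2 + 5x^3, degree < r
x, t = Fraction(1, 3), Fraction(1, 7)          # illustrative point and step
xis = [Fraction(11, 10), Fraction(12, 10), Fraction(13, 10)]  # r - 1 distinct nodes > 1

# Rows (xi, xi^2, ..., xi^(r-1)) of the matrix in the proof.
vandermonde = [[xi**k for k in range(1, r)] for xi in xis]
# beta_k = f(x + xi_k t) - f(x); the remainder R_r is zero for this f.
betas = [poly_eval(coeffs, x + xi * t) - poly_eval(coeffs, x) for xi in xis]
alphas = solve_linear(vandermonde, betas)
# alpha_k should equal t^k f^(k)(x) / k! for k = 1, ..., r - 1.
expected = [t**k * poly_eval(poly_derivative(coeffs, k), x) / factorial(k)
            for k in range(1, r)]
```

For a general $f \in W_p^r(A)$ the same linear system is solved, but the right-hand side also carries the remainders $R_r(x, \xi_k t)$, which is where estimate (2.12) enters.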

2.7 Modulus of smoothness and Besov Spaces

2.7.1 Definitions and Properties

Consider the difference operators: for each $h \in \mathbb{R}^d$ and measurable function $f \in M(\Omega, |\cdot|)$ for any subset $\Omega \subset \mathbb{R}^d$, let $\Delta_h(f,\cdot) = f(\cdot + h) - f(\cdot)$, and $\Delta_h^r = \Delta_h(\Delta_h^{r-1})$ for $r > 1$. It follows from the binomial theorem that

$$\Delta_h^r(f,x) = \sum_{k=0}^{r} (-1)^{r-k} \binom{r}{k} f(x + kh) \tag{2.13}$$

for all $x \in \Omega(rh) = \{x \in \Omega \mid x + kh \in \Omega \text{ for all } 1 \le k \le r\}$.
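
As a quick sanity check of (2.13), the sketch below compares the recursive definition of the $r$-th difference with the binomial formula in one dimension. The function names and the sample function `quintic` are illustrative assumptions; the convention $\Delta_h^0(f,x) = f(x)$ is used to start the recursion. Rational arithmetic keeps the comparison exact.

```python
from fractions import Fraction
from math import comb, factorial


def recursive_difference(f, h, r, x):
    """r-th difference from the recursion Delta_h^r = Delta_h(Delta_h^(r-1))."""
    if r == 0:
        return f(x)
    return (recursive_difference(f, h, r - 1, x + h)
            - recursive_difference(f, h, r - 1, x))


def binomial_difference(f, h, r, x):
    """r-th difference from the closed formula (2.13)."""
    return sum((-1) ** (r - k) * comb(r, k) * f(x + k * h) for k in range(r + 1))


def quintic(x):
    """Sample polynomial of degree 5 (an arbitrary illustrative choice)."""
    return x**5 - 2 * x


x0, step = Fraction(1, 4), Fraction(1, 10)
agree = all(
    recursive_difference(quintic, step, r, x0) == binomial_difference(quintic, step, r, x0)
    for r in range(1, 7)
)
```

Two familiar consequences can be read off the same functions: the sixth difference of a polynomial of degree five vanishes, and the $r$-th difference of $x^r$ equals $r!\,h^r$.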

Definition 18 Given a rearrangement-invariant space $(X, \|\cdot\|_X)$ over the space $(\Omega, |\cdot|)$, we define the $r$-th modulus of smoothness of $f \in X$ by

$$\omega_r(f,t)_X = \sup_{0 < |h| \le t} \big\| \Delta_h^r(f,\cdot)\,\chi_{\Omega(rh)} \big\|_X \quad \text{for all } t > 0.$$

The general setup is fairly complicated, and many different properties are to be taken into account in order to produce any general result on these functionals. We will focus on the spaces we are going to use in this survey: $L_p(\Omega)$ for $0 < p < \infty$, and $C(\Omega)$ for $p = \infty$, where $\Omega \subset \mathbb{R}^d$ is a compact cube.

Lemma 2.16 For any $t > 0$, the modulus of smoothness is a seminorm for $1 \le p \le \infty$ and a quasi-seminorm for $0 < p < 1$.

Proof. Notice that $\omega_r(\lambda f,t)_p = |\lambda|\,\omega_r(f,t)_p$, and $\omega_r(0,t)_p = 0$ trivially for all $0 < p \le \infty$. As for the (quasi)triangular inequality, we have also trivially $\omega_r(f + g,t)_p \le C\big(\omega_r(f,t)_p + \omega_r(g,t)_p\big)$, with the same constant from the (quasi)triangular inequality in $L_p(\Omega(rh))$. The kernel of the $r$-th modulus of smoothness is precisely the set of polynomials $\Pi^{(r)}$ of coordinate order $r$.

Also, from the fact that $\|f + g\|_{L_p}^p \le \|f\|_{L_p}^p + \|g\|_{L_p}^p$ for all $0 < p < 1$, we obtain similarly $\omega_r(f + g,t)_p^p \le \omega_r(f,t)_p^p + \omega_r(g,t)_p^p$. [#]

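Lemma 2.17 (Properties of the modulus of smoothness in $L_p$) Given $f \in L_p(\Omega)$, $t > 0$, the following estimates hold:

$$\omega_r(f,t)_p^{\min(1,p)} \le 2^{r-k}\,\omega_k(f,t)_p^{\min(1,p)} \quad \text{for all } 0 \le k \le r \tag{2.14}$$
$$\omega_r(f,nt)_p^{\min(1,p)} \le n^r\,\omega_r(f,t)_p^{\min(1,p)} \quad \text{for all } n \in \mathbb{N} \tag{2.15}$$
$$\omega_r(f,\xi t)_p^{\min(1,p)} \le (\xi + 1)^r\,\omega_r(f,t)_p^{\min(1,p)} \quad \text{for all } \xi > 0 \tag{2.16}$$

Proof. The first estimate can be proved directly from the identity $\Delta_h^r(f,x) = \Delta_h^{r-1}(f,x+h) - \Delta_h^{r-1}(f,x)$, which is proved easily from (2.13). That gives $\omega_r(f,t)_p^{\min(1,p)} \le 2\,\omega_{r-1}(f,t)_p^{\min(1,p)}$, and from this the statement follows.

As for the second estimate, notice first that

$$\omega_r(f,nt)_p = \sup_{0 < |h| \le nt} \|\Delta_h^r(f,\cdot)\|_{L_p(\Omega(rh))} = \sup_{0 < |h| \le t} \|\Delta_{nh}^r(f,\cdot)\|_{L_p(\Omega(rnh))}.$$

We use next the following expansion for $\Delta_{nh}^r$:

$$\Delta_{nh}^r(f,x) = \sum_{k_1=0}^{n-1} \cdots \sum_{k_r=0}^{n-1} \Delta_h^r(f, x + k_1 h + \dots + k_r h).$$

Since the right-hand side has $n^r$ terms, the second estimate follows. And the third estimate is a direct consequence of the second, applied with $n$ the smallest integer not below $\xi$. [#]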
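
The expansion of $\Delta_{nh}^r$ used for the second estimate can be verified numerically. The following sketch works in one dimension with exact rational arithmetic; the function names and the sample function are illustrative assumptions and not part of the text.

```python
from fractions import Fraction
from itertools import product
from math import comb


def difference(f, h, r, x):
    """r-th difference with step h, computed from the binomial formula (2.13)."""
    return sum((-1) ** (r - k) * comb(r, k) * f(x + k * h) for k in range(r + 1))


def expanded_difference(f, h, r, n, x):
    """Sum of Delta_h^r(f, x + (k_1 + ... + k_r) h) over 0 <= k_i <= n - 1."""
    return sum(
        difference(f, h, r, x + sum(ks) * h)
        for ks in product(range(n), repeat=r)
    )


def sample(x):
    """A smooth non-polynomial sample function (arbitrary choice)."""
    return 1 / (1 + x * x)


x0, step = Fraction(1, 5), Fraction(1, 16)
checks = [
    difference(sample, n * step, r, x0) == expanded_difference(sample, step, r, n, x0)
    for r in (1, 2, 3)
    for n in (1, 2, 3, 4)
]
```

The sum has exactly $n^r$ terms, which is the source of the factor $n^r$ in (2.15).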

Weak inverses to the first estimate in the previous lemma are offered by Marchaud and Timan. A proof of the following result, that is known as Marchaud’s Inequalities, can be read in chapter 2 of [DeLo].

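Theorem 18 (Marchaud (1927), Timan (1958)) Given $r \ge 2$ and $f \in L_p(\Omega)$, we have the following estimates for all $1 \le k < r$, $t > 0$ and $0 < p \le \infty$:

$$\omega_k(f,t)_p^{\min(1,p)} \le C\,t^{k\min(1,p)} \int_t^\infty \frac{\omega_r(f,s)_p^{\min(1,p)}}{s^{k\min(1,p)+1}}\,ds \tag{2.17}$$

if $\Omega$ is not bounded, and

$$\omega_k(f,t)_p^{\min(1,p)} \le C\,t^{k\min(1,p)} \left\{ \int_t^{|\Omega|} \frac{\omega_r(f,s)_p^{\min(1,p)}}{s^{k\min(1,p)+1}}\,ds + \left(\frac{\|f\|_{L_p(\Omega)}}{|\Omega|^k}\right)^{\min(1,p)} \right\} \tag{2.18}$$

if $\Omega$ is compact. And for $1 < p < \infty$, with $\mu = \min(p,2)$:

$$\omega_k(f,t)_p \le C\,t^k \left( \int_t^\infty \frac{\omega_r(f,s)_p^\mu}{s^{k\mu+1}}\,ds \right)^{1/\mu} \tag{2.19}$$

if $\Omega$ is not bounded, and

$$\omega_k(f,t)_p \le C\,t^k \left( \int_t^{|\Omega|} \frac{\omega_r(f,s)_p^\mu}{s^{k\mu+1}}\,ds + \frac{\|f\|_{L_p(\Omega)}^\mu}{|\Omega|^{k\mu}} \right)^{1/\mu} \tag{2.20}$$

if $\Omega$ is compact.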

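Definition 19 Given $\Omega \subset \mathbb{R}^d$ and parameters $\alpha, q > 0$, we define the Besov functionals in $L_p(\Omega)$ by

$$|\cdot|_{B_q^\alpha(L_p(\Omega))} = \left\{ \int_0^\infty \big(t^{-\alpha}\,\omega_r(\cdot,t)_p\big)^q\,\frac{dt}{t} \right\}^{1/q} \quad \text{for } 0 < q < \infty,$$
$$|\cdot|_{B_\infty^\alpha(L_p(\Omega))} = \sup_{t > 0} \big(t^{-\alpha}\,\omega_r(\cdot,t)_p\big) \quad \text{for } q = \infty,$$

where $r$ is the smallest integer greater than $\alpha$. Associated to these functionals, we define the Besov spaces as

$$B_q^\alpha(L_p(\Omega)) = \{f \in L_p(\Omega) : |f|_{B_q^\alpha(L_p(\Omega))} < \infty\}.$$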

Lemma 2.18 If $\Omega$ is compact, then the previous (quasi)seminorms are equivalent to their discrete counterparts:

$$|\cdot|_{B_q^\alpha(L_p(\Omega))} \asymp \left( \sum_{k=0}^\infty \big[2^{k\alpha}\,\omega_r(\cdot, 2^{-k})_p\big]^q \right)^{1/q}, \qquad |\cdot|_{B_\infty^\alpha(L_p(\Omega))} \asymp \sup_{k \ge 1} 2^{k\alpha}\,\omega_r(\cdot, 2^{-k})_p.$$

Sketch of the Proof. The proof is similar to the proof of part (ii) in Lemma 1.9 (page 21): we start showing that the seminorm above is equivalent to the one obtained replacing the integral (or the supremum) over $(0,\infty)$ by one over $(0,1)$, using the fact that, as $\Omega$ is compact, $\omega_r(f,t)_p \le \omega_r(f,|\Omega|)_p$ for all $t > |\Omega|$. After that, discretization of the latter integral with the partition $\{2^{-n} \mid n \in \mathbb{N}\}$ is applied, using the fact that $\omega_r(f,\cdot)_p$ is a nondecreasing function.

Remark. Unfortunately, the moduli of smoothness are not always suitable for applications because it is not easy to add up several such estimates over different intervals. New related (and equivalent) moduli of smoothness can be constructed by averaging:

Definition 20 We define the $r$-th averaged modulus of smoothness on the subcube $I \subset \Omega$, for $f \in L_p(\Omega)$ and $t > 0$ by

$$w_r(f,t;I)_p = \left( \frac{1}{t^d} \int_{[-t,t]^d} \int_{I(rh)} |\Delta_h^r(f,x)|^p\,dx\,dh \right)^{1/p}.$$

Remark. Notice that, for $I, J \subset \Omega$ with disjoint interiors, we have $(I \cup J)(h) \supset I(h) \cup J(h)$ for all suitable $h \in \mathbb{R}^d$; therefore, $w_r(f,t;I \cup J)_p^p \ge w_r(f,t;I)_p^p + w_r(f,t;J)_p^p$. We will prove now the equivalence with the moduli of smoothness.

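Lemma 2.19 For all $f \in L_p(\Omega)$ and suitable $s \in \mathbb{R}^d$, the following holds for all $x \in \mathbb{R}^d$ for which the differences involved are defined:

$$\Delta_h^r(f,x) = \sum_{k=1}^{r} (-1)^k \binom{r}{k} \left[ \Delta_{ks}^r(f, x + kh) - \Delta_{h+ks}^r(f,x) \right].$$

Proof. Notice that

$$\begin{aligned}
\sum_{k=0}^{r} (-1)^{r+k} \binom{r}{k} \Delta_{h+ks}^r(f,x)
&= \sum_{k=0}^{r} (-1)^{r+k} \binom{r}{k} \sum_{j=0}^{r} (-1)^{r+j} \binom{r}{j} f(x + j(h + ks)) \\
&= \sum_{j=0}^{r} (-1)^{r+j} \binom{r}{j} \sum_{k=0}^{r} (-1)^{r+k} \binom{r}{k} f(x + jh + kjs) \\
&= \sum_{k=0}^{r} (-1)^{k} \binom{r}{k} f(x) + \sum_{j=1}^{r} (-1)^{r+j} \binom{r}{j} \Delta_{js}^r(f, x + jh) \\
&= \sum_{j=1}^{r} (-1)^{r+j} \binom{r}{j} \Delta_{js}^r(f, x + jh);
\end{aligned}$$

therefore, isolating the term $k = 0$ on the left-hand side,

$$(-1)^r \Delta_h^r(f,x) = \sum_{j=1}^{r} (-1)^{r+j} \binom{r}{j} \Delta_{js}^r(f, x + jh) - \sum_{k=1}^{r} (-1)^{r+k} \binom{r}{k} \Delta_{h+ks}^r(f,x),$$

and the statement follows. [#]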
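
The identity of Lemma 2.19, in the signed form obtained at the end of its proof, can be tested on a concrete function. This is a one-dimensional sketch with exact rational arithmetic; the names `difference`, `lemma_rhs` and the sample function are illustrative assumptions.

```python
from fractions import Fraction
from math import comb


def difference(f, h, r, x):
    """r-th difference with step h, computed from the binomial formula (2.13)."""
    return sum((-1) ** (r - k) * comb(r, k) * f(x + k * h) for k in range(r + 1))


def lemma_rhs(f, h, s, r, x):
    """Right-hand side: sum_k (-1)^k C(r,k) [D_{ks}^r(f, x+kh) - D_{h+ks}^r(f, x)]."""
    return sum(
        (-1) ** k * comb(r, k)
        * (difference(f, k * s, r, x + k * h) - difference(f, h + k * s, r, x))
        for k in range(1, r + 1)
    )


def sample(x):
    """A smooth non-polynomial sample function (arbitrary choice)."""
    return 1 / (1 + x * x)


x0, h, s = Fraction(1, 7), Fraction(1, 9), Fraction(2, 11)
identity_holds = [
    difference(sample, h, r, x0) == lemma_rhs(sample, h, s, r, x0)
    for r in (1, 2, 3, 4)
]
```

For $r = 1$ the right-hand side reduces to $-[\Delta_s(f,x+h) - \Delta_{h+s}(f,x)] = f(x+h) - f(x)$, which is a convenient case to verify by hand.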

Proposition 10 For all $f \in L_p(\Omega)$, a subcube $I \subset \Omega$, $r > 0$ and any $0 < t < |I|(4r)^{-1}$, there exists a constant $C > 0$ which depends at most on $r$ such that

$$w_r(f,t;I)_p \le \omega_r(f,t)_p \le C\,w_r(f,t;I)_p.$$

Proof. The left inequality is trivial:

$$w_r(f,t;I)_p^p = t^{-d} \int_{[-t,t]^d} \|\Delta_h^r(f,\cdot)\|_{L_p(I(rh))}^p\,dh \le \omega_r(f,t)_p^p.$$

The proof of the right inequality is not included here.

Remark. We will prove that the Besov spaces are precisely the ones we are looking for. The key is Whitney’s theorem.

2.7.2 Whitney’s Theorem

Theorem 19 (Johnen (1972)) $K(f,t^r;L_p(\Omega),W_p^r(\Omega)) \asymp \omega_r(f,t)_p$ for all $f \in L_p(\Omega)$, $1 \le p \le \infty$, $t > 0$ and $r \in \mathbb{N}$.

Proof. Omitted.

Theorem 20 (Whitney (1957)) $E(f,\Pi^{(r)})_p \asymp \omega_r(f,\lambda)_p$ for all $f \in L_p(\Omega)$, $0 < p \le \infty$ and $r \in \mathbb{N}$, where $\lambda$ is the largest of the sides of $\Omega$.

Proof. For $p \ge 1$, let $g \in W_p^r(\Omega)$ be arbitrary, and $P \in \Pi^{(r)}$ be the Taylor polynomial of $g$ associated to one of the points in the boundary of $\Omega$. By the Taylor estimate of §2.6 (Lemma 2.15 with $q = p$ in the one-dimensional case), we have

$$\|g - P\|_{L_p(\Omega)} \le \frac{1}{(r-1)!}\,|\Omega|^r\,\|D^r g\|_{L_p(\Omega)};$$

in particular,

$$\|f - P\|_{L_p(\Omega)} \le C_{p,r} \left\{ \|f - g\|_{L_p(\Omega)} + |\Omega|^r\,\|D^r g\|_{L_p(\Omega)} \right\}.$$

This gives $E(f,\Pi^{(r)})_p \le C_{p,r}\,K(f,|\Omega|^r;L_p(\Omega),W_p^r(\Omega))$, and application of Theorem 19 offers the left inequality in this case.

The right inequality is trivial for all $p$, since $\omega_r(f,t)_p = \omega_r(f - P,t)_p \le 2^r\,\omega_0(f - P,t)_p = 2^r\,\|f - P\|_{L_p(\Omega)}$ for all $P \in \Pi^{(r)}$. [#]

2.8 Other Seminorms for Besov Spaces

Proposition 11 For any $0 < p \le \infty$ and $\tau > 0$, given $f \in L_p(\Omega)$, there exists $C > 0$ that depends at most on $p$, $r$ and $\tau$ such that the following estimate holds for all $n$:

$$\|f - T_n f\|_{L_p(\Omega)} \le C\,\omega_r(f, 2^{-n})_p.$$

Proof. For each dyadic subcube $\square_{j,n}$, denote by $\tau_{j,n} = \tau_n(f)|_{\square_{j,n}}$ the restriction of the piecewise polynomial of near-best approximation we get from the operator $\tau_n$. In that case,

$$\|f - T_n f\|_{L_p(\square_{j,n})} \le C_p \left\{ \|f - \tau_{j,n}\|_{L_p(\square_{j,n})} + \|\tau_{j,n} - Q_n(\tau_{j,n})\|_{L_p(\square_{j,n})} \right\} \le C_{p,\tau} \left\{ E(f,\Pi^{(r)};\square_{j,n})_p + E(\tau_{j,n},\Pi^{(r)};I_{j,r})_p \right\},$$

where the second step uses estimate 2.7 from Proposition 6 in page 53. Now, for each of the subcubes $\square_{i,n} \subset I_{j,r}$, we have the estimate

$$\begin{aligned}
E(\tau_{j,n},\Pi^{(r)};\square_{i,n})_p &\le \|\tau_{j,n} - \tau_{i,n}\|_{L_p(\square_{i,n})} \le C_p \left\{ \|\tau_{j,n} - f\|_{L_p(\square_{i,n})} + \|f - \tau_{i,n}\|_{L_p(\square_{i,n})} \right\} \\
&\le C_{p,|I_{j,n}|,r} \left\{ \|\tau_{j,n} - f\|_{L_p(I_{j,r})} + \|f - \tau_{i,n}\|_{L_p(I_{j,r})} \right\} \le C_{p,|I_{j,n}|,n,\tau}\,E(f,\Pi^{(r)};I_{j,r})_p
\end{aligned}$$

(the third step uses Lemma 2.4 in page 32); therefore,

$$\|f - T_n f\|_{L_p(\square_{j,n})} \le C_{p,|I_{j,n}|,n,\tau}\,E(f,\Pi^{(r)};I_{j,r})_p,$$

and from Whitney's Theorem and Proposition 10 in page 72, setting $\lambda = \max\{\lambda_{I_{j,r}} \mid 1 - r \le j < 2^n\}$ and $c = 2^n \lambda$, we have

$$\|f - T_n f\|_{L_p(\Omega)}^p = \sum_{j=1-r}^{2^n-1} \|f - T_n f\|_{L_p(\square_{j,n})}^p \le C_{p,c,n,\tau} \sum_{j=1-r}^{2^n-1} w_r(f,\lambda_{I_{j,r}};I_{j,r})_p^p \le C_{p,c,n,\tau} \sum_{j=1-r}^{2^n-1} w_r(f,\lambda;I_{j,r})_p^p.$$

Notice that there is a finite number of cubes $I_{j,r}$, so the previous sum of averaged moduli of smoothness can be estimated by the averaged moduli of smoothness on the union:

$$\|f - T_n f\|_{L_p(\Omega)}^p \le C_{p,c,n,\tau}\,w_r(f,\lambda;\Omega)_p^p \le C_{p,c,n,\tau}\,w_r(f,2^{-n};\Omega)_p^p,$$

and the statement follows. [#]

Corollary 11.1 For any $0 < p \le \infty$ and $\tau > 0$, given $f \in L_p(\Omega)$, there exists $C > 0$ that depends at most on $p$, $r$ and $\tau$ such that the following estimate holds for all $n$:

$$E(f,X_n)_p \le C\,\omega_r(f,2^{-n})_p.$$

Lemma 2.20 For any $0 < p < \infty$ there exist $C_1, C_2 > 0$ which depend at most on $d$ and $p$, such that for all $S_n \in X_n$,

$$C_1 \|S_n\|_{L_p(\Omega)} \le \left( \sum_{k=1-r}^{2^n-1} 2^{-nd}\,\big|\alpha_{k,r}^{[n]}(S_n)\big|^p \right)^{1/p} \le C_2 \|S_n\|_{L_p(\Omega)}.$$

Proof. The left inequality is immediate; given $x \in \Omega$, let $\Lambda_n(x) = \{k \in \mathbb{Z}^d \mid x \in \operatorname{supp}(N_{k,r}^{[n]})\}$ (notice that the cardinality of this set does not depend on $n$, but on $r$ and $d$):

$$|S_n(x)|^p \le (2^{p-1})^{|\Lambda_n(x)|-1} \sum_{k \in \Lambda_n(x)} \big|\alpha_{k,r}^{[n]}(S_n)\big|^p\,\big|N_{k,r}^{[n]}(x)\big|^p \le C_{p,d,r} \sum_{k \in \Lambda_n(x)} \big|\alpha_{k,r}^{[n]}(S_n)\big|^p\,\chi_{\operatorname{supp}(N_{k,r}^{[n]})}(x);$$

hence,

$$\int_\Omega |S_n(x)|^p\,dx \le C_{p,d,r} \sum_{k=1-r}^{2^n-1} \big|\alpha_{k,r}^{[n]}(S_n)\big|^p\,\big|\operatorname{supp}(N_{k,r}^{[n]})\big| \le C_{p,d,r}\,2^{-nd} \sum_{k=1-r}^{2^n-1} \big|\alpha_{k,r}^{[n]}(S_n)\big|^p.$$

As for the right inequality, we need to make use of Lemma 2.11 in page 53:

$$2^{-nd} \sum_{k=1-r}^{2^n-1} \big|\alpha_{k,r}^{[n]}(S_n)\big|^p \le 2^{-nd} \sum_{k=1-r}^{2^n-1} 2^{nd}\,\|S_n\|_{L_p(J_{k,r})}^p,$$

and the statement follows. [#]

Proposition 12 For any $0 < p < \infty$ and $r \in \mathbb{N}$, given $f \in L_p(\Omega)$, there exists $C > 0$ that depends at most on $p$, $d$ and $r$ such that the following estimate holds for all $n$:

$$\omega_r(f,2^{-n})_p \le C\,2^{-nr} \left( \|f\|_{L_p(\Omega)}^\mu + \sum_{k=1}^{n} \big[2^{kr} E(f,X_k)_p\big]^\mu \right)^{1/\mu},$$

where $\mu = \min(1,p)$.

Proof. Let $S_n$ be an element of best $L_p(\Omega)$ approximation to $f$ from $X_n$ for all $n$, and $s_1 = S_1$, $s_n = S_n - S_{n-1} \in X_n$ for $n \ge 2$. We can then write $f = f - S_n + \sum_{k=1}^{n} s_k$, and use this inside the difference operator; for suitable $h \in \mathbb{R}^d$,

$$\begin{aligned}
\|\Delta_h^r(f,\cdot)\|_{L_p(\Omega(rh))} &= \Big\| \Delta_h^r(f - S_n,\cdot) + \sum_{k=1}^{n} \Delta_h^r(s_k,\cdot) \Big\|_{L_p(\Omega(rh))} \\
&\le \Big\{ \|\Delta_h^r(f - S_n,\cdot)\|_{L_p(\Omega(rh))}^\mu + \sum_{k=1}^{n} \|\Delta_h^r(s_k,\cdot)\|_{L_p(\Omega(rh))}^\mu \Big\}^{1/\mu} \\
&\le \Big\{ 2^{r\mu}\,\|f - S_n\|_{L_p(\Omega(rh))}^\mu + \sum_{k=1}^{n} \|\Delta_h^r(s_k,\cdot)\|_{L_p(\Omega(rh))}^\mu \Big\}^{1/\mu}
\end{aligned} \tag{2.21}$$

We need to estimate the terms in the sum on the right, but trying to introduce coefficients $2^k$ on each of the estimates, so that later application of Lemma 1.3 in page 10 is possible; let $x \in \Omega(rh)$:

$$|\Delta_h^r(s_k,x)|^p = \Big| \Delta_h^r\Big( \sum_{j=1-r}^{2^k-1} \alpha_{j,r}^{[k]}(s_k)\,N_{j,r}^{[k]}, x \Big) \Big|^p = \Big| \sum_{j=1-r}^{2^k-1} \alpha_{j,r}^{[k]}(s_k)\,\Delta_h^r(N_{j,r}^{[k]},x) \Big|^p \le \underbrace{(2^{p-1})^{|\Lambda_k(x)|-1}}_{C_{p,d,r}} \sum_{j \in \Lambda_k(x)} \big|\alpha_{j,r}^{[k]}(s_k)\big|^p\,\big|\Delta_h^r(N_{j,r}^{[k]},x)\big|^p \tag{2.22}$$

Now it all relies on estimates of the differences of the basic tensor product B-splines; these depend heavily on the location of the point $x$ and the size of $|h|$; let $\Omega(rh) = \Gamma \cup \Gamma'$, where $x \in \Gamma$ if $x$ and $x + rh$ both belong to the same subcube $\square_{i,k}$, and $\Gamma' = \Omega(rh) \setminus \Gamma$.

If $x$ and $x + rh$ are in the same cube $\square_{i,k} \subset \operatorname{supp}(N_{j,r}^{[k]})$, as $N_{j,r}^{[k]}$ is a polynomial of coordinate order $r$ there, we have:
$$\Delta_h^r(N_{j,r}^{[k]},x) = |h|^r\,[x, x+h, \dots, x+rh]\,N_{j,r}^{[k]} = \frac{1}{r!}\,|h|^r\,D_h^r N_{j,r}^{[k]}(\psi_{x,r,h}), \tag{2.23}$$
where $D_h^r$ denotes the $r$-th derivative in the direction given by $h \in \mathbb{R}^d$, and $\psi_{x,r,h}$ is a point in the segment with endpoints $x$ and $x + rh$. Notice that, although $N_{j,r}^{[k]}|_{\square_{i,k}}$ is a polynomial of coordinate order $r$, its total degree can be as large as $(r-1)d$, and the previous directional derivative is not null in general: a simple computation gives that for $d = 1$ the derivative vanishes, but for $d > 1$ it vanishes if and only if $r < d/(d-1) \le 2$.
In order to estimate $\|D_h^r N_{j,r}^{[k]}\|_{L_\infty(\operatorname{segment}[x,x+rh])}$, we need to use some basic multivariate calculus: consider the univariate polynomial $N_{j,r,x,h}^{[k]} : [0,1] \to \mathbb{R}$ defined by the composition $N_{j,r,x,h}^{[k]} = N_{0,r}^{[0]} \circ \varphi_{k,j} \circ \varphi_{x,h}$, where $\varphi_{k,j} : \mathbb{R}^d \ni x \mapsto 2^k x - j \in \mathbb{R}^d$, and $\varphi_{x,h} : [0,1] \ni t \mapsto x + rht \in \operatorname{segment}[x,x+rh] \subset \mathbb{R}^d$.
$$\begin{aligned}
D N_{j,r,x,h}^{[k]}(t) &= D\big(N_{0,r}^{[0]} \circ \varphi_{k,j} \circ \varphi_{x,h}\big)(t) = DN_{0,r}^{[0]}\Big|_{\varphi_{k,j}(\varphi_{x,h}(t))} \cdot D\varphi_{k,j}\Big|_{\varphi_{x,h}(t)} \cdot D\varphi_{x,h}(t) \\
&= \Big( \frac{\partial N_{0,r}^{[0]}}{\partial x_1}, \dots, \frac{\partial N_{0,r}^{[0]}}{\partial x_d} \Big)\Big|_{\varphi_{k,j}(\varphi_{x,h}(t))} \cdot 2^k\,\mathrm{Id}_d \cdot rh = 2^k r \sum_{i=1}^{d} h_i\,\frac{\partial N_{0,r}^{[0]}}{\partial x_i}\Big|_{\varphi_{k,j}(\varphi_{x,h}(t))} = 2^k r\,D_h N_{0,r}^{[0]}\big(\varphi_{k,j}(\varphi_{x,h}(t))\big)
\end{aligned}$$
Use now Markov's Theorem (page 33) and the fact that $\|N_{0,r}^{[0]}\|_{L_\infty(\Omega)} \le 1$ to obtain $\|D_h N_{0,r}^{[0]}\|_{L_\infty(\Omega)} \le C_r$; and therefore,
$$\|D_h N_{j,r}^{[k]}\|_{L_\infty(\Omega)} \le 2^k r\,\|D_h N_{0,r}^{[0]}\|_{L_\infty(\Omega)} \le 2^k C_r, \qquad \|D_h^r N_{j,r}^{[k]}\|_{L_\infty(\Omega)} \le (2^k r)^{r-1}\,2^k r\,\|D_h N_{0,r}^{[0]}\|_{L_\infty(\Omega)} \le 2^{kr} C_r.$$
Use this estimate in equation (2.23) to get $\big|\Delta_h^r(N_{j,r}^{[k]},x)\big| \le C_r\,(2^k |h|)^r$.
Otherwise, if $x$ and $x + rh$ belong to different subcubes of $D_k$, as far as all the cubes supporting any point $x + ih$ are contained in the support of $N_{j,r}^{[k]}$, we simply have $N_{j,r}^{[k]} \notin W_\infty^r(\Omega)$; we can nevertheless apply a similar estimate as before:
$$\Delta_h^r(N_{j,r}^{[k]},x) = \Delta_h\big(\Delta_h^{r-1}(N_{j,r}^{[k]},x)\big) = \Delta_h\Big( \frac{1}{(r-1)!}\,|h|^{r-1}\,D_h^{r-1} N_{j,r}^{[k]}(\psi_{x,r,h}) \Big);$$
therefore, $\big|\Delta_h^r(N_{j,r}^{[k]},x)\big| \le \big|\Delta_h\big(C_r\,2^{k(r-1)}\,|h|^{r-1}\big)\big| \le C_r\,(2^k |h|)^{r-1}$.

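We have then,

$$\begin{aligned}
\big\|\Delta_h^r(N_{j,r}^{[k]},\cdot)\big\|_{L_p(\Omega(rh))}^p &= \int_{\Omega(rh) \cap \operatorname{supp}(N_{j,r}^{[k]})} \big|\Delta_h^r(N_{j,r}^{[k]},x)\big|^p\,dx = \int_{(\Gamma \cup \Gamma') \cap \operatorname{supp}(N_{j,r}^{[k]})} \big|\Delta_h^r(N_{j,r}^{[k]},x)\big|^p\,dx \\
&\le C_r \left\{ (2^k |h|)^{rp}\,\big|\Gamma \cap \operatorname{supp}(N_{j,r}^{[k]})\big| + (2^k |h|)^{(r-1)p}\,\big|\Gamma' \cap \operatorname{supp}(N_{j,r}^{[k]})\big| \right\}
\end{aligned}$$

But $|\Gamma \cap \operatorname{supp}(N_{j,r}^{[k]})| \le c\,2^{-kd}$, and $|\Gamma' \cap \operatorname{supp}(N_{j,r}^{[k]})| \le c\,|h|\,2^{-k(d-1)}$; therefore,

$$\begin{aligned}
\big\|\Delta_h^r(N_{j,r}^{[k]},\cdot)\big\|_{L_p(\Omega(rh))}^p &\le C_r \left\{ 2^{krp}\,2^{-kd}\,|h|^{rp} + 2^{kp(r-1)}\,2^{-k(d-1)}\,|h|^{(r-1)p}\,|h| \right\} \\
&\le C_r \left\{ 2^{krp}\,2^{-kd}\,|h|^{rp} + 2^{kpr}\,2^{-kd}\,|h|^{p(r-1+1/p)} \right\} \le C_r\,2^{kp\xi}\,2^{-kd}\,|h|^{p\xi},
\end{aligned}$$

where the second step uses $2^{kp(r-1)}\,2^{-k(d-1)} = 2^{kpr}\,2^{-kp}\,2^{-kd}\,2^{k} \le 2^{kpr}\,2^{-kd}$, and where $\xi = \min(r, r - 1 + 1/p)$. Using this last estimate on (2.22), we obtain

$$\|\Delta_h^r(s_k,\cdot)\|_{L_p(\Omega(rh))} \le C_{p,d,r}\,|h|^\xi\,2^{k\xi} \left( \sum_{j=1-r}^{2^k-1} 2^{-kd}\,\big|\alpha_{j,r}^{[k]}(s_k)\big|^p \right)^{1/p} \le C_{p,d,r}\,|h|^\xi\,2^{k\xi}\,\|s_k\|_{L_p(\Omega(rh))}$$

(the last step uses Lemma 2.20 above). This leads to two different estimates; for $k \ge 2$:

$$\|\Delta_h^r(s_k,\cdot)\|_{L_p(\Omega(rh))} \le C_{p,d,r}\,|h|^\xi\,2^{k\xi} \big( \|S_k - f\|_{L_p(\Omega(rh))} + \|f - S_{k-1}\|_{L_p(\Omega(rh))} \big) \le C_{p,d,r}\,|h|^\xi\,2^{k\xi} \big( E(f,X_k)_p + E(f,X_{k-1})_p \big)$$

And for $k = 1$,

$$\|\Delta_h^r(s_1,\cdot)\|_{L_p(\Omega(rh))} \le C_{p,d,r}\,(2|h|)^\xi\,\|S_1\|_{L_p(\Omega(rh))} \le C_{p,d,r}\,(2|h|)^\xi \big( \|S_1 - f\|_{L_p(\Omega(rh))} + \|f\|_{L_p(\Omega(rh))} \big) \le C_{p,d,r}\,(2|h|)^\xi \big( E(f,X_1)_p + \|f\|_{L_p(\Omega(rh))} \big)$$

We can now find an upper bound on $\omega_r(f,2^{-n})_p$ using the previous estimates on (2.21):

$$\begin{aligned}
\|\Delta_h^r(f,\cdot)\|_{L_p(\Omega(rh))}^\mu &\le 2^{r\mu} E(f,X_n)_p^\mu + C_{p,d,r}\,(2|h|)^{\xi\mu} \big( E(f,X_1)_p^\mu + \|f\|_{L_p(\Omega(rh))}^\mu \big) + C_{p,d,r} \sum_{k=2}^{n} |h|^{\xi\mu}\,2^{k\xi\mu} \big( E(f,X_k)_p^\mu + E(f,X_{k-1})_p^\mu \big) \\
&\le C_{p,d,r} \Big\{ (2|h|)^{\xi\mu}\,\|f\|_{L_p(\Omega(rh))}^\mu + |h|^{\xi\mu} \sum_{k=1}^{n-1} 2^{k\xi\mu} E(f,X_k)_p^\mu + \big( |h|^{\xi\mu}\,2^{\xi n\mu} + 2^{r\mu} \big) E(f,X_n)_p^\mu \Big\};
\end{aligned}$$

therefore,

$$\omega_r(f,2^{-n})_p \le C_{p,d,r} \Big\{ 2^{(1-n)\xi\mu}\,\|f\|_{L_p(\Omega)}^\mu + 2^{-n\xi\mu} \sum_{k=1}^{n-1} 2^{k\xi\mu} E(f,X_k)_p^\mu + (1 + 2^{r\mu})\,E(f,X_n)_p^\mu \Big\}^{1/\mu}$$

As both $1 + 2^{r\mu}$ and $2^{\xi\mu}$ are greater than one for all choices of $p$ and $r$, we can bound the previous expression above by multiplying each term in the sum by these coefficients, and include them in the constant.

$$\begin{aligned}
\omega_r(f,2^{-n})_p &\le C_{p,d,r}\,2^{-n\xi} \Big\{ 2^{\xi\mu}\,\|f\|_{L_p(\Omega)}^\mu + \sum_{k=1}^{n-1} 2^{k\xi\mu} E(f,X_k)_p^\mu + 2^{n\xi\mu} E(f,X_n)_p^\mu \Big\}^{1/\mu} \\
&\le C_{p,d,r}\,2^{-n\xi} \Big\{ \|f\|_{L_p(\Omega)}^\mu + \sum_{k=1}^{n} \big( 2^{k\xi} E(f,X_k)_p \big)^\mu \Big\}^{1/\mu}.
\end{aligned}$$

As we wanted to prove. [#]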

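Remark. Abusing notation, we can denote $X_0 = \{0\}$, and hence, $E(f,X_0)_p = \|f\|_{L_p(\Omega)}$, and we may simplify the previous estimate to read

$$\omega_r(f,2^{-n})_p \le C_{p,d,r}\,2^{-n\xi} \left( \sum_{k=0}^{n} \big[ 2^{k\xi} E(f,X_k)_p \big]^\mu \right)^{1/\mu},$$

with $\xi = \min(r, r - 1 + 1/p)$ and $\mu = \min(1,p)$ as in the proof of Proposition 12.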

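Theorem 21 Given $r \in \mathbb{N}$, and $0 < p, q \le \infty$; then for all $0 < \alpha < \min(r, r - 1 + 1/p)$, the following quasinorms are equivalent to the Besov quasinorms $\|\cdot\|_{B_q^\alpha(L_p(\Omega))} = |\cdot|_{B_q^\alpha(L_p(\Omega))} + \|\cdot\|_{L_p(\Omega)}$:

$$N_1(f) = \left( \sum_{n=0}^\infty \big[ 2^{n\alpha} E(f,X_n)_p \big]^q \right)^{1/q}, \quad N_2(f) = \left( \sum_{n=0}^\infty \big[ 2^{n\alpha} \|f - T_n f\|_{L_p(\Omega)} \big]^q \right)^{1/q}, \quad N_3(f) = \left( \sum_{n=1}^\infty \big[ 2^{n\alpha} \|t_n(f)\|_{L_p(\Omega)} \big]^q \right)^{1/q}$$

where $t_n(f) = T_n(f) - T_{n-1}(f)$ for $n \ge 1$ (and of course, $T_0 \equiv 0$).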

Proof. The equivalence of $N_1$, $N_2$ and $\|\cdot\|_{B_q^\alpha(L_p(\Omega))}$ follows directly from Proposition 11, Corollary 11.1, Proposition 12 and the Discrete Hardy's Inequalities (Lemma 1.3 in page 10). As for the third quasinorm, notice that on one side,

$$\|t_n(f)\|_{L_p(\Omega)} = \|T_n(f) - T_{n-1}(f)\|_{L_p(\Omega)} \le C_p \left\{ \|f - T_n(f)\|_{L_p(\Omega)} + \|f - T_{n-1}(f)\|_{L_p(\Omega)} \right\} \le C_p \left\{ E(f,X_n)_p + E(f,X_{n-1})_p \right\} \le C_p\,E(f,X_{n-1})_p$$

(the last step because $E(f,X_n)_p \le E(f,X_{n-1})_p$ for all $n$); therefore, $N_3(f) \le C_p N_1(f)$ for all $f \in L_p(\Omega)$. On the other hand,

$$\sum_{k=n+1}^\infty t_k(f) = \sum_{k=n+1}^\infty \big( T_k(f) - T_{k-1}(f) \big) = -T_n(f) + \lim_k T_k(f) = f - T_n(f);$$

therefore,

$$\|f - T_n(f)\|_{L_p(\Omega)} = \Big\| \sum_{k=n}^\infty t_{k+1}(f) \Big\|_{L_p(\Omega)} \le \left( \sum_{k=n}^\infty \|t_{k+1}(f)\|_{L_p(\Omega)}^\mu \right)^{1/\mu},$$

where $\mu = \min(1,p)$. We can use again Lemma 1.3 to obtain

$$\sum_{n=0}^\infty \big( 2^{n\alpha} \|f - T_n(f)\|_{L_p(\Omega)} \big)^q \le 2^{-r} C_q \sum_{n=1}^\infty \big( 2^{n\alpha} \|t_n(f)\|_{L_p(\Omega)} \big)^q,$$

which gives $N_2(f) \le C_{q,r} N_3(f)$; hence proving the statement. [#]

Remark. We have just proved the goal of this chapter; we have precisely determined the approximation spaces in $L_p(\Omega)$ associated to the family of approximants $X_n$ for all $q > 0$, and $0 < \alpha < \min(r, r - 1 + 1/p)$:

Corollary 21.1 Given $r \in \mathbb{N}$, the following spaces are identical (with equivalent (quasi)norms) for all $0 < \alpha < \min(r, r - 1 + 1/p)$:

$$A_q^\alpha(L_p(\Omega),X_n) \cong B_q^\alpha(L_p(\Omega)).$$

Remark. Theorem 21 offers also the possibility of representing functions in Besov spaces by means of a sequence of nonnegative real functions satisfying certain properties. We will use this representation to find in the next section an equivalent expression for the K-functional of couples of Besov spaces; and with it, the computation of interpolation spaces for such couples.

Notice that $\sum_{n=1}^\infty t_n(f) = \lim_n T_n(f) = f$ a.e.; and as $t_n(f) \in X_n$ for each $n \ge 1$, we may write $t_n(f) = \sum_{j=1-r}^{2^n-1} \alpha_{j,r}^{[n]}(t_n(f))\,N_{j,r}^{[n]}$, and furthermore

$$f = \sum_{n=1}^\infty t_n(f) = \sum_{n=1}^\infty \sum_{j=1-r}^{2^n-1} \alpha_{j,r}^{[n]}(t_n(f))\,N_{j,r}^{[n]} \tag{2.24}$$

This atomic decomposition of functions in $B_q^\alpha(L_p(\Omega))$ leads to yet another equivalent (quasi)norm:

Corollary 21.2 Given $p, q, r, \alpha$ as before, $f \in L_p(\Omega)$ is in $B_q^\alpha(L_p(\Omega))$ if and only if $f$ can be represented as in (2.24), with

$$N_4(f) = \left( \sum_{n=1}^\infty 2^{\alpha n q} \Big\{ \sum_{j=1-r}^{2^n-1} \big|\alpha_{j,r}^{[n]}(t_n(f))\big|^p\,2^{-nd} \Big\}^{q/p} \right)^{1/q} < \infty.$$

2.9 Further results: K-functional of compatible couples of Besov Spaces

Given a sequence of functions $a = (f_n)_n$ in a (quasisemi)normed space $(X, \|\cdot\|_X)$, consider for parameters $\alpha, q > 0$ the (quasisemi)norms

$$|a|_{\ell_q^\alpha(X)} = \left( \sum_{n=0}^\infty \big[ 2^{n\alpha} \|f_n\|_X \big]^q \right)^{1/q},$$

and let $\ell_q^\alpha(X) = \{a = (f_n)_n : |a|_{\ell_q^\alpha(X)} < \infty\}$. Consider also the following operator in $L_p(\Omega)$:

$$T : L_p(\Omega) \ni f \mapsto (t_n(f))_n \in \bigoplus_{n=1}^\infty X_n.$$

Following Theorem 21, we have that $f \in B_q^\alpha(L_p(\Omega))$ if and only if $Tf \in \ell_q^\alpha(L_p(\Omega))$, and moreover, $\|f\|_{B_q^\alpha(L_p(\Omega))} \asymp \|Tf\|_{\ell_q^\alpha(L_p(\Omega))}$. We can use this fact to find the K-functionals of compatible couples of Besov spaces:

Theorem 22 Given $r \in \mathbb{N}$, $0 < p_1, q_1, p_2, q_2 \le \infty$, $0 < \alpha_1, \alpha_2 < r$, denote $B_i = B_{q_i}^{\alpha_i}(L_{p_i}(\Omega))$, and $\ell_i = \ell_{q_i}^{\alpha_i}(L_{p_i}(\Omega))$; then, for all $f \in B_1 + B_2$, there exist constants $C_1, C_2 > 0$ which depend at most on $r$, $d$, $\alpha_1$ and $\alpha_2$ such that for all $t > 0$,

$$C_1 K(f,t;B_1,B_2) \le K(Tf,t;\ell_1,\ell_2) \le C_2 K(f,t;B_1,B_2).$$

Proof. Let us prove the left inequality: given $f \in B_1 + B_2$, let $a_1 = (a_n^{[1]})_n \in \ell_1$ be such that $a_2 = (a_n^{[2]})_n = Tf - a_1 \in \ell_2$; we have $K(Tf,t;\ell_1,\ell_2) \le \|a_1\|_{\ell_1} + t\|a_2\|_{\ell_2}$. From these sequences $a_i$, we will construct functions $f_i \in B_i$ such that $f = f_1 + f_2$, and $\|f_i\|_{B_i} \le C\|a_i\|_{\ell_i}$.

We will be using the projectors T_n : L( _O_ ) --> X_n for both functions f_i B_i; thus, we need to work in a space L( _O_ ) L_p₁( _O_ ) $/~\$ L_p₂( _O_ ). As | _O_ | = 1, use Jensen’s Inequality to realize that for each 0 < < min(p₁,p₂), and any function L_{p_i}, we have integral || = p
(| f |i) ^/p_i < ( integral p )
_O_ |f |i ^/p_i = ||||_{L_{p_i}()}.

For each n, let g_n = T_n(a_n^[1]) X_n. By the equivalence of quasinorms in finite dimensional spaces, and Proposition 7 in page 54, we know that it must be ||g_n||_{L_p₁()} < C||g_n||_L < C_,r,||a_n^[1]||_L() < C_,r,||a_n^[1]||_{L_p₁()}. Consider now g = sum _n=1g_n, which converges trivially in L_p₁( _O_ ); notice that for each n, with = min(p₁,1).

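$$\begin{aligned}
E(g,X_n)_{p_1} &= E\Big( \sum_{k=1}^{n} g_k + \sum_{k=n+1}^\infty g_k, X_n \Big)_{p_1} = E\Big( \sum_{k=n+1}^\infty g_k, X_n \Big)_{p_1} \le \Big\| \sum_{k=n+1}^\infty g_k \Big\|_{L_{p_1}(\Omega)} \\
&\le \Big( \sum_{k=n+1}^\infty \|g_k\|_{L_{p_1}(\Omega)}^\mu \Big)^{1/\mu} \le C_{\rho,r,\tau} \Big( \sum_{k=n+1}^\infty \|a_k^{[1]}\|_{L_{p_1}(\Omega)}^\mu \Big)^{1/\mu}.
\end{aligned}$$

Application of Lemma 1.3 in page 10 gives that it must be

$$\sum_{n=1}^\infty \big[ 2^{n\alpha} E(g,X_n)_{p_1} \big]^{p_1} \le C_{\rho,r,\tau} \sum_{n=1}^\infty \big( 2^{n\alpha} \|a_n^{[1]}\|_{L_{p_1}(\Omega)} \big)^{p_1};$$

and therefore, $\|g\|_{B_1} \le C_{\rho,r,\tau} \|a_1\|_{\ell_1}$. Similarly, $f - g \in L_{p_2}(\Omega)$, and with $\mu = \min(p_2,1)$ this time,

$$\begin{aligned}
E(f - g,X_n)_{p_2} &= E\Big( \sum_{k=1}^\infty t_k(f) - \sum_{k=1}^\infty g_k, X_n \Big)_{p_2} = E\Big( \sum_{k=1}^{n} [t_k(f) - g_k] + \sum_{k=n+1}^\infty [t_k(f) - g_k], X_n \Big)_{p_2} \\
&= E\Big( \sum_{k=n+1}^\infty [t_k(f) - g_k], X_n \Big)_{p_2} \le \Big\| \sum_{k=n+1}^\infty [t_k(f) - g_k] \Big\|_{L_{p_2}(\Omega)} \le \Big( \sum_{k=n+1}^\infty \|t_k(f) - g_k\|_{L_{p_2}(\Omega)}^\mu \Big)^{1/\mu} \\
&= \Big( \sum_{k=n+1}^\infty \big\| Q_k\big( \tau_k(t_k(f)) - \tau_k(a_k^{[1]}) \big) \big\|_{L_{p_2}(\Omega)}^\mu \Big)^{1/\mu} = \Big( \sum_{k=n+1}^\infty \big\| (Q_k \circ \tau_k)\big( t_k(f) - a_k^{[1]} \big) \big\|_{L_{p_2}(\Omega)}^\mu \Big)^{1/\mu} \\
&\le C_{\rho,r,\tau} \Big( \sum_{k=n+1}^\infty \big\| t_k(f) - a_k^{[1]} \big\|_{L_{p_2}(\Omega)}^\mu \Big)^{1/\mu};
\end{aligned}$$

here we used that $t_k(f) = T_k(t_k(f)) = Q_k(\tau_k(t_k(f)))$ and $g_k = T_k(a_k^{[1]}) = Q_k(\tau_k(a_k^{[1]}))$. As before, we infer $\|f - g\|_{B_2} \le C_{\rho,r,\tau} \|a_2\|_{\ell_2}$, hence proving the left inequality of the statement.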

Let us prove the right inequality: Let g B₁ such that f - g B₂. Given > 0 and 0 < < min(p₁,p₂), we construct near-best elements of L( _O_ ) approximation to f from Y _n via the operator _n : L( _O_ ) --> Y _n, and using Lemma 1.5 in page 14, we can obtain as well elements of L( _O_ ) approximation to g from Y _n, say h_n(g) Y _n, such that _n(f) - h_n(g) are near-best elements of L( _O_ ) approximation to f - g from Y _n. By an argument similar to the proof of Lemma 8 in page 55, we realize that U_n = Q_n(h_n(g)) is a near-best L_p₁( _O_ ) approximation to g from X_n, and R_n = T_n(f) - Q_n(h_n(f)) is a near-best L_p₂( _O_ ) approximation to f - g from X_n.

Let $u_n = U_n - U_{n-1}$ and $r_n = R_n - R_{n-1}$ for $n \ge 1$ (being $U_0 = R_0 = 0$ trivially), and consider the sequences $u = (u_n)_n$, $r = (r_n)_n \in \bigoplus_{n=1}^\infty X_n$. Notice that

$$\|u_n\|_{L_{p_1}(\Omega)} \le C_{p_1} \left\{ \|U_n - g\|_{L_{p_1}(\Omega)} + \|g - U_{n-1}\|_{L_{p_1}(\Omega)} \right\} \le C_{r,d,\rho,p_1,\tau} \left\{ E(g,X_n)_{p_1} + E(g,X_{n-1})_{p_1} \right\} \le C_{r,d,\rho,p_1,\tau}\,E(g,X_{n-1})_{p_1}$$

and similarly, $\|r_n\|_{L_{p_2}(\Omega)} \le C_{r,d,\rho,p_2,\tau}\,E(f - g,X_{n-1})_{p_2}$; hence, by Theorem 21 in page 84, we have $u \in \ell_1$, $r \in \ell_2$, and moreover, for any $t > 0$, $\|u\|_{\ell_1} + t\|r\|_{\ell_2} \le C \left\{ \|g\|_{B_1} + t\|f - g\|_{B_2} \right\}$, and the statement follows. [#]

Corollary 22.1 Under the same conditions as in the Theorem above, and given $0 < \theta < 1$ and $0 < q \le \infty$, we have $f \in (B_1,B_2)_{\theta,q}$ if and only if $Tf \in (\ell_1,\ell_2)_{\theta,q}$.

Notice that this result allows us to compute the interpolation spaces for compatible couples of Besov Spaces. It all depends on the computation of the interpolation spaces $\big( \ell_{q_1}^{\alpha_1}(L_{p_1}(\Omega)), \ell_{q_2}^{\alpha_2}(L_{p_2}(\Omega)) \big)_{\theta,q}$; these are easily defined in terms of the Lorentz spaces $L_{p,q}(\Omega)$, so we will introduce them in the next section.

2.10 Lorentz spaces

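Given a totally $\sigma$-finite measure space $(\Omega,\mu)$, and values $0 < p, q \le \infty$, consider the Lorentz functionals

$$\rho_{p,q} : M_0(\Omega,\mu) \ni f \mapsto \left( \int_0^\infty \big[ t^{1/p} f^*(t) \big]^q\,\frac{dt}{t} \right)^{1/q} \in \mathbb{R}^+ \tag{2.25}$$
$$\|\cdot\|_{L_{p,q}(\Omega,\mu)} : M_0(\Omega,\mu) \ni f \mapsto \left( \int_0^\infty \big[ t^{1/p} f^{**}(t) \big]^q\,\frac{dt}{t} \right)^{1/q} \in \mathbb{R}^+ \tag{2.26}$$

and for $q = \infty$

$$\rho_{p,\infty} : M_0(\Omega,\mu) \ni f \mapsto \sup_{t > 0} t^{1/p} f^*(t) \tag{2.27}$$
$$\|\cdot\|_{L_{p,\infty}(\Omega,\mu)} : M_0(\Omega,\mu) \ni f \mapsto \sup_{t > 0} t^{1/p} f^{**}(t) \tag{2.28}$$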

Lemma 2.21 We have the equivalence $\rho_{p,q}(\cdot) \asymp \|\cdot\|_{L_{p,q}(\Omega,\mu)}$ among Lorentz functionals for all $p > 1$ and $0 < q \le \infty$.

Proof. For all $0 < q \le \infty$ we have trivially $\rho_{p,q}(f) \le \|f\|_{L_{p,q}(\Omega,\mu)}$, since $f^*(t) \le f^{**}(t)$ for all $t > 0$. On the other hand, for $0 < q < \infty$,

$$\begin{aligned}
\|f\|_{L_{p,q}(\Omega,\mu)}^q &= \int_0^\infty \big[ t^{1/p} f^{**}(t) \big]^q\,\frac{dt}{t} = \int_0^\infty \Big[ t^{1/p-1} \int_0^t f^*(s)\,ds \Big]^q \frac{dt}{t} = \int_{t=0}^\infty t^{-q(1-1/p)} \Big[ \int_{s=0}^t s f^*(s)\,\frac{ds}{s} \Big]^q \frac{dt}{t} \\
&\le C_{p,q} \int_0^\infty t^{-q(1-1/p)} \big[ t f^*(t) \big]^q\,\frac{dt}{t} = C_{p,q} \int_0^\infty \big[ t^{1/p} f^*(t) \big]^q\,\frac{dt}{t},
\end{aligned}$$

where the inequality uses Hardy's Theorem: estimate (1.5) in page 8 for $1 \le q < \infty$, and estimate (1.9) in page 12 for $0 < q < 1$. A similar proof can be applied for the case $q = \infty$.

Lemma 2.22 (Properties of Lorentz functionals) (i) Both $\rho_{p,q}(f) \le \rho_{p,q}(g)$ and $\|f\|_{L_{p,q}(\Omega,\mu)} \le \|g\|_{L_{p,q}(\Omega,\mu)}$ for $f, g \in M_0(\Omega,\mu)$ such that $|f| \le |g|$. (ii) The functionals (2.26) and (2.28) are both (quasi)norms for all $1 \le p \le \infty$.

Proof. Part (i) is trivial. Using this, and the subadditivity property of the maximal functions $f^{**}$, we infer that the functionals (2.26) and (2.28) are both (quasi)norms for all $0 < q \le \infty$. [#]

Remark. Notice that the lack of subadditivity of the decreasing rearrangements gives us that the functionals (2.25) and (2.27) cannot have any (quasi)triangular property; hence, they do not have (quasi)norm structure.

Definition 21 The Lorentz spaces $L_{p,q}(\Omega,\mu)$ are the Riesz spaces associated to the Lorentz (quasi)norms $\|\cdot\|_{L_{p,q}(\Omega,\mu)}$:

$$L_{p,q}(\Omega) = \left\{ f \in M_0(\Omega,\mu) : \|f\|_{L_{p,q}(\Omega)} < \infty \right\}$$

2.11 Further results: Interpolation of Besov Spaces

Theorem 23 For all $f \in L_1(\Omega) + L_\infty(\Omega)$ and $t > 0$,

$$K(f,t;L_1(\Omega),L_\infty(\Omega)) = \int_0^t f^*(s)\,ds.$$

Proof. We prove first that the integral on the right-hand side is bounded above by the K-functional on the left-hand side. We will use for this the sub-additivity of the maximal functions ($(f + g)^{**}(t) \le f^{**}(t) + g^{**}(t)$ for all $t > 0$), and the fact that the spaces $L_p(\Omega)$ are rearrangement-invariant.
Given $f \in L_1(\Omega) + L_\infty(\Omega)$, and any decomposition $f = f_1 + f_\infty$ with $f_q \in L_q(\Omega)$, we have

$$\int_0^t f^*(s)\,ds \le \int_0^t f_1^*(s)\,ds + \int_0^t f_\infty^*(s)\,ds \le \|f_1^*\|_{L_1} + t\,\|f_\infty^*\|_{L_\infty} = \|f_1\|_{L_1(\Omega)} + t\,\|f_\infty\|_{L_\infty(\Omega)};$$

therefore proving the stated inequality.
In order to prove the other inequality, it suffices to find for each $t > 0$ a decomposition $f = f_{(1,t)} + f_{(\infty,t)}$ with $f_{(q,t)} \in L_q(\Omega)$ such that $\|f_{(1,t)}\|_{L_1(\Omega)} + t\,\|f_{(\infty,t)}\|_{L_\infty(\Omega)} = t f^{**}(t)$. For this task, fix $t > 0$, consider $E_t = \{x \in \Omega : |f(x)| > f^*(t)\}$, and let $t_0 = |E_t|$. Notice that $t_0 \le t$ trivially, and also $f \in L_1(E_t)$ (since $E_t$ has finite measure and $f \in L_1(\Omega) + L_\infty(\Omega)$). Set $g(x) = \max\{|f(x)| - f^*(t), 0\}\,\operatorname{sign} f(x)$, and $h(x) = \min\{|f(x)|, f^*(t)\}\,\operatorname{sign} f(x)$.
Note first that $h \in L_\infty(\Omega)$, with $\|h\|_{L_\infty(\Omega)} = f^*(t)$. Also, $g \in L_1(\Omega)$, since

$$\int_\Omega |g(x)|\,dx = \int_{E_t} \big( |f(x)| - f^*(t) \big)\,dx = \int_{E_t} |f(x)|\,dx - t_0 f^*(t) = \int_0^{t_0} f^*(s)\,ds - t_0 f^*(t);$$

therefore, $\|g\|_{L_1(\Omega)} + t_0 \|h\|_{L_\infty(\Omega)} = \int_0^{t_0} f^*(s)\,ds$, and furthermore,

$$\|g\|_{L_1(\Omega)} + t\,\|h\|_{L_\infty(\Omega)} = \int_0^{t_0} f^*(s)\,ds + (t - t_0) f^*(t) = \int_0^t f^*(s)\,ds,$$

since $f^*(s) = f^*(t)$ for all $t_0 \le s \le t$. [#]
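
The optimal decomposition built in the previous proof can be checked on a step function. The sketch below assumes $\Omega = [0,1]$ split into equal cells, $f$ constant on each cell, and $t = m/n$ a multiple of the cell size with $0 < m < n$; the function name and the sample data are illustrative. Under these assumptions $f^*$ is the decreasing sort of $|f|$ and $f^*(t)$ is the entry of index $m$ (right-continuous convention).

```python
from fractions import Fraction


def check_decomposition(values, m):
    """Step function on [0, 1] with len(values) equal cells; t = m / len(values).

    Returns (g, h, ||g||_1 + t * ||h||_inf, integral of f* over (0, t)).
    """
    n = len(values)
    cell = Fraction(1, n)
    t = m * cell
    rearranged = sorted((abs(v) for v in values), reverse=True)  # values of f*
    level = rearranged[m]                                        # f*(t), 0 < m < n

    def sign(v):
        return (v > 0) - (v < 0)

    # g keeps the part of f above the level f*(t); h is f truncated at that level.
    g = [max(abs(v) - level, 0) * sign(v) for v in values]
    h = [min(abs(v), level) * sign(v) for v in values]
    l1_norm_g = sum(abs(v) for v in g) * cell
    sup_norm_h = max(abs(v) for v in h)
    integral = sum(rearranged[:m]) * cell
    return g, h, l1_norm_g + t * sup_norm_h, integral


values = [Fraction(v) for v in (3, -1, 4, 1, -5, 9, 2, -6)]
g, h, functional, integral = check_decomposition(values, 3)
```

With these sample values the level is $f^*(3/8) = 4$, and both the functional $\|g\|_{L_1} + t\|h\|_{L_\infty}$ and the integral $\int_0^t f^*(s)\,ds$ equal $5/2$.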

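Corollary 23.1 Given $0 < \theta < 1$ and $0 < q \le \infty$, we have $(L_1(\Omega),L_\infty(\Omega))_{\theta,q} = L_{p,q}(\Omega)$, where $1/p = 1 - \theta$.

Corollary 23.2 Given $1 \le p_1, p_2 \le \infty$, $0 < \theta < 1$ and $0 < q \le \infty$, we have $(L_{p_1}(\Omega),L_{p_2}(\Omega))_{\theta,q} = L_{p,q}(\Omega)$, where $1/p = (1 - \theta)/p_1 + \theta/p_2$.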

Proof. Use the previous Corollary and the Reiteration Theorem 5 (page 23). [#]

Theorem 24 Let (X,||^.||_X), (X₁,||^.||_X₁) and (X₂,||^.||_X₂) be complete (quasi)normed spaces, and let 0 < ₁ < ₂, 0 < < 1 and 0 < q₁,q₂ <. Denote _k(X) = _{q_k}^_k(X); then the following properties hold: (i) (l1(X),l2(X)) _,q = _q(X) for all 0 < q <, where = (1 - )₁ + ₂. (ii) (l1(X1),l2(X2)) _,q = _q ((X1,X2)h,q) , where = (1 - )₁ + ₂ and 1/q = (1 - )/q₁ + /q₂.

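Corollary 24.1 Under the same hypotheses as in the previous Theorem, if $X_k = L_{p_k}(\Omega)$ for $1 < p_1, p_2 < \infty$, then we have

$\bigl(\ell_{q_1}^{\alpha_1}(L_{p_1}(\Omega)), \ell_{q_2}^{\alpha_2}(L_{p_2}(\Omega))\bigr)_{\eta,q} = \ell_q^{\alpha}(L_{p,q}(\Omega)),$

where $\alpha = (1 - \eta)\alpha_1 + \eta\alpha_2$, $1/q = (1 - \eta)/q_1 + \eta/q_2$, and $1/p = (1 - \eta)/p_1 + \eta/p_2$.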

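Corollary 24.2 Under the same hypotheses as in the previous Corollary, we have

$\bigl(B_{q_1}^{\alpha_1}(L_{p_1}), B_{q_2}^{\alpha_2}(L_{p_2})\bigr)_{\eta,q} = B_q^{\alpha}(L_p),$

where $\alpha = (1 - \eta)\alpha_1 + \eta\alpha_2$, $1/q = (1 - \eta)/q_1 + \eta/q_2$, $1/p = (1 - \eta)/p_1 + \eta/p_2$, and $p = q$.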
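To see how the three parameter relations in Corollary 24.2 interact, the following sketch evaluates them in exact rational arithmetic. The function names are hypothetical, and we assume finite integer values of $p_k$ and $q_k$ purely for illustration.

```python
from fractions import Fraction

def interpolate(a1, a2, eta):
    """Convex combination (1 - eta) * a1 + eta * a2, in exact arithmetic."""
    eta = Fraction(eta)
    return (1 - eta) * Fraction(a1) + eta * Fraction(a2)

def besov_parameters(alpha, p, q, eta):
    """Parameters (alpha, p, q) of the interpolation space in Corollary 24.2.

    alpha, p and q are pairs (value_1, value_2); p and q hold finite integers.
    """
    new_alpha = interpolate(alpha[0], alpha[1], eta)
    new_p = 1 / interpolate(Fraction(1, p[0]), Fraction(1, p[1]), eta)
    new_q = 1 / interpolate(Fraction(1, q[0]), Fraction(1, q[1]), eta)
    return new_alpha, new_p, new_q

# Endpoint spaces with q_k = p_k, interpolated halfway between them.
alpha, p, q = besov_parameters(alpha=(1, 3), p=(2, 4), q=(2, 4),
                               eta=Fraction(1, 2))
```

With these endpoint choices the outcome is $\alpha = 2$ and $p = q = 8/3$; taking $q_k = p_k$ at both endpoints is one simple way of meeting the requirement $p = q$.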

2.12 Further Results: Embedding Theorems for Besov Spaces

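Lemma 2.23 Given $p, \alpha > 0$, $r \in \mathbb{N}$ and $\Omega \subset \mathbb{R}^d$, consider $\tau > 0$ defined by $1/\tau = \alpha/d + 1/p$. Then for all $n \in \mathbb{N}$ there exists $C > 0$ which depends at most on ****what?****, such that $\|S\|_{L_p(\Omega)} \le 2^{np}C\|S\|_{L_\tau(\Omega)}$ for all $S \in Y_n$.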

Theorem 25 Given $p, \alpha > 0$ as in the previous lemma, $B_p^{\alpha}(L_\tau(\Omega))$ is continuously embedded in $L_p(\Omega)$.

3 Approximation by Ridge Functions on Dyadic maximal-smoothness splines

3.1 General Theory of Ridge Functions

Throughout this section, $\Omega_d$ denotes the d-dimensional unit ball in $\mathbb{R}^d$ with respect to the Euclidean norm; its d-dimensional volume is denoted $\gamma_d$. The unit sphere $\mathbb{S}_{d-1}$ is the boundary of the previous set, and $\mathbb{P}_{d-1} \subset \mathbb{S}_{d-1}$ is the set of directions in $\mathbb{R}^d$. We assume the latter to be a connected set for integration purposes.

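Definition 22 Given $d \ge 2$, a univariate function $f$ and a direction $\eta \in \mathbb{P}_{d-1}$, we define the d-dimensional ridge function on $\Omega_d$ generated by $f$ with direction $\eta$ by

$\Lambda[f,\eta](\cdot) = \Lambda(\cdot \mid f, \eta) : \mathbb{R}^d \ni x \mapsto f(\eta \cdot x)\,\chi_{\Omega_d}(x) \in \mathbb{R}.$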

Lemma 3.1 Ridge functions have the following properties: (i) Given a direction $\eta \in \mathbb{P}_{d-1}$, and any univariate function $f$, the ridge function $\Lambda[f,\eta]$ is constant over the intersection of $\Omega_d$ with each affine hyperplane which is orthogonal to $\eta$. (ii) $\Lambda[\lambda f + \mu g, \eta] = \lambda\Lambda[f,\eta] + \mu\Lambda[g,\eta]$ for all $\eta \in \mathbb{P}_{d-1}$, $\lambda, \mu \in \mathbb{R}$ and univariate functions $f, g$. (iii) If $f \in L_p(\Omega_1)$, then for all $\eta \in \mathbb{P}_{d-1}$, $\Lambda[f,\eta] \in L_p(\Omega_d)$, with $\|\Lambda[f,\eta]\|_{L_p(\Omega_d)} \le C_{p,d}\|f\|_{L_p(\Omega_1)}$ for some constant $C_{p,d} > 0$ that depends at most on $d$ and $p$. (iv) Given $f : \mathbb{R}^d \to \mathbb{R}$ and a univariate function $g$, there exists $h = h_{f,g}$ such that $f * \Lambda[g,\eta] = \Lambda[h,\eta]$. Moreover, if $1 \le q \le \infty$, $f \in L_q(\Omega_d)$, and $g \in L_1(\Omega_1)$, then we have $\|\Lambda[h,\eta]\|_{L_q(\Omega_d)} \le \gamma_{d-1}\|f\|_{L_q(\Omega_d)}\|g\|_{L_1(\Omega_1)}$.

Proof.

: (i) Given $\eta \in \mathbb{P}_{d-1}$, let us denote by $\eta^\perp$ the hyperplane which is orthogonal to $\eta$ and goes through the origin; we will also denote by $\mathrm{span}\{\eta\}$ the line through the origin with direction $\eta$. As $\mathbb{R}^d = \eta^\perp \oplus \mathrm{span}\{\eta\}$, we can then express each $x \in \mathbb{R}^d$ uniquely as $x = u_x + c\eta$, where $u_x$ is the orthogonal projection of $x$ over $\eta^\perp$, and $c\eta$ is the orthogonal projection of $x$ over $\mathrm{span}\{\eta\}$. In that case, we have: $\Lambda[f,\eta](x) = f(\eta \cdot x) = f(\eta \cdot c\eta) = f(c).$
: (ii) is trivial.
: (iii) In order to estimate norms of ridge functions for a given direction $\eta \in \mathbb{P}_{d-1}$, we will make use of the tangent cylinders $\mathrm{Cyl}_d(\eta)$ to the unit spheres and with bases parallel to the hyperplanes $\eta^\perp$ (with this definition, each d-dimensional cylinder is unique). Notice that those bases are (d-1)-dimensional balls in $\mathbb{R}^d$. In that case, we may estimate $\int_{\Omega_d} |\Lambda[f,\eta](x)|^p\,dx = \int_{\Omega_d} |f(\eta \cdot x)|^p\,dx \le \int_{\mathrm{Cyl}_d(\eta)} |f(\eta \cdot x)|^p\,dx = \gamma_{d-1}\int_{\Omega_1} |f(t)|^p\,dt,$ which proves our statement.
: (iv) Given $x = u_x + c\eta \in \mathbb{R}^d$ as before, we observe that
$\{f * \Lambda[g,\eta]\}(x) = \int_{\mathbb{R}^d} f(y)\,g(\eta \cdot [x - y])\,dy = \int_{\mathbb{R}^d} f(y)\,g(\eta \cdot [c\eta - y])\,dy = \{f * \Lambda[g,\eta]\}(c\eta);$
and therefore, this convolution is also a ridge function with the same direction. Denote by $h = h_{f,g}$ a univariate function such that $f * \Lambda[g,\eta] = \Lambda[h,\eta]$.
If $f \in L_q(\Omega_d)$ and $g \in L_1(\Omega_1)$, we may estimate,
$\|\Lambda[h,\eta]\|_{L_q(\Omega_d)} = \left\|\int_{\mathbb{R}^d} f(\cdot - y)\,\Lambda[g,\eta](y)\,dy\right\|_{L_q(\Omega_d)} \le \int_{\mathbb{R}^d} |\Lambda[g,\eta](y)|\,\|f(\cdot - y)\|_{L_q(\Omega_d)}\,dy = \|f\|_{L_q(\Omega_d)}\int_{\Omega_d} |g(\eta \cdot y)|\,dy \le \|f\|_{L_q(\Omega_d)}\int_{\mathrm{Cyl}_d(\eta)} |g(\eta \cdot y)|\,dy = \gamma_{d-1}\|f\|_{L_q(\Omega_d)}\int_{\Omega_1} |g(t)|\,dt,$
which is what we wanted to prove.
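Properties (i) and (iii) can be illustrated numerically. The sketch below is ours: the helper `ridge`, the chosen directions, and the test functions are assumptions made for illustration, and the integral in (iii) is only approximated by a midpoint rule in the case $d = 2$, $p = 2$, $f(t) = t$, where $\gamma_1 = 2$ is the length of $\Omega_1 = [-1,1]$.

```python
import numpy as np

def ridge(f, eta):
    """Return the map x -> f(eta . x) * 1_{|x| <= 1} for a unit vector eta."""
    eta = np.asarray(eta, dtype=float)
    def ridge_function(x):
        x = np.asarray(x, dtype=float)
        inside = np.linalg.norm(x, axis=-1) <= 1.0
        return np.where(inside, f(x @ eta), 0.0)
    return ridge_function

# Property (i): points c*eta + u with u orthogonal to eta share the value f(c).
eta3 = np.array([1.0, 2.0, 2.0]) / 3.0
c = 0.3
orthogonal = np.array([[0.4, -0.2, 0.0], [0.0, 0.3, -0.3], [0.2, -0.2, 0.1]])
values = ridge(np.cos, eta3)(c * eta3 + orthogonal)

# Property (iii) for d = 2, p = 2, f(t) = t, by a midpoint rule on [-1, 1]^2.
N = 400
grid = (np.arange(N) + 0.5) / N * 2.0 - 1.0
points = np.stack(np.meshgrid(grid, grid), axis=-1)
eta2 = np.array([np.cos(0.7), np.sin(0.7)])
lhs = float((ridge(lambda s: s, eta2)(points) ** 2).sum() * (2.0 / N) ** 2)
rhs = 2.0 * (2.0 / 3.0)        # gamma_1 * integral_{-1}^{1} t^2 dt, gamma_1 = 2
```

All entries of `values` agree with $\cos(0.3)$, and `lhs` approximates $\pi/4$, the exact integral over the disc, which stays below `rhs` $= 4/3$ as the cylinder estimate predicts.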

Definition 23 Given $0 < p \le \infty$, and a homogeneous subspace of functions $F \subset L_p(\Omega_1)$, consider the following spaces of ridge functions on $\Omega_d$: Discrete non-linear ridgelets:

$Y_m(F) = \left\{ \sum_{k=1}^m \Lambda[f_k,\eta_k] : f_k \in F,\ \eta_k \in \mathbb{P}_{d-1} \right\}.$

Discrete linear ridgelets:

$Y(F) = \bigcup_{m=1}^\infty Y_m(F).$

Radon Ridgelets: Let $C_d = \mathbb{P}_{d-1} \times \Omega_1$, and consider the Inverse Radon linear functional $R^* : M(C_d) \to M(\Omega_d)$ defined by $R^*g(x) = \int_{\mathbb{P}_{d-1}} g(\eta \cdot x)\,d\eta$.

{ * } Xd(F) = integral R g(*x) : gh =dgt(h,.) (- F (if d is odd) RR g(.,.- t) t : gh (- F (if d is even)

4 Appendix

4.1 Elements of Functional Analysis

4.1.1 Linear Transformations

Theorem 26 Let $(X,\|\cdot\|_X)$ and $(Y,\|\cdot\|_Y)$ be (quasi)normed linear spaces and $F : X \to Y$ a linear map. Then the following conditions are equivalent: (i) $F$ is bounded on some closed ball about 0 of positive radius. (ii) $F$ is continuous at 0. (iii) $F$ is uniformly continuous on $X$. (iv) There exists $\lambda > 0$ such that $\|F(x)\|_Y \le \lambda\|x\|_X$ for all $x \in X$. (v) In particular, if $Y = \mathbb{R}$, with absolute value for a norm, then each of the above conditions is equivalent to the following: If $F \not\equiv 0$, then the hyperplane $Z(F) = F^{-1}(\{0\})$ is closed in $X$.

Proof.

(i) $\Rightarrow$ (ii): By hypothesis there exist $\lambda, \rho > 0$ such that $\|F(x)\|_Y \le \lambda\|x\|_X$ for all $x \in X$ with $\|x\|_X \le \rho$. Given $\varepsilon > 0$, we can choose $\delta = \min(\rho, \varepsilon/\lambda) > 0$ and we have continuity at 0.
(ii) $\Rightarrow$ (iii): Given $\varepsilon > 0$, choose $\delta > 0$ to satisfy continuity at 0, and given any $x \in X$, consider $y \in X$ with $\|x - y\|_X < \delta$. We have then $\|F(x) - F(y)\|_Y = \|F(x - y)\|_Y < \varepsilon$, and $F$ is continuous at $x$. Notice that the choice of $\delta$ does not depend on the value of $x$.
(iii) $\Rightarrow$ (iv): Given $\varepsilon = 1$, choose the corresponding $\delta > 0$ as in the definition of uniform continuity. Given $x \in X$ with $x \ne 0$, consider $x' = \frac{\delta}{2\|x\|_X}x$; notice that $\|x'\|_X = \delta/2$. Then it must be $\|F(x')\|_Y < 1$; therefore proving the statement: $\|F(x)\|_Y = \left\|F\left(\frac{2\|x\|_X}{\delta}x'\right)\right\|_Y = \frac{2}{\delta}\|x\|_X\|F(x')\|_Y < \frac{2}{\delta}\|x\|_X.$
(iv) $\Rightarrow$ (i): This is trivial.
(iii) $\Leftrightarrow$ (v): If $F$ is continuous, then since $\{0\}$ is closed in $\mathbb{R}$, it must be $Z(F) = F^{-1}(\{0\})$ closed in $X$. Conversely, assume that $Z(F)$ is closed in $X$ and that $F$ is not continuous at 0; then there exists a sequence $(x_n)_n$ converging to zero in $X$ and a value $\varepsilon > 0$ such that $|F(x_n)| > \varepsilon$ for all $n$. Consider an element $x_0 \in X$ with $F(x_0) = 1$ (which exists because $F \not\equiv 0$), and the sequence $(y_n)_n$ given by $y_n = x_0 - x_n/F(x_n)$. Notice that $F(y_n) = 0$ for all $n$ (and so $y_n \in Z(F)$), but $\lim_n y_n = x_0 \notin Z(F)$, a contradiction.

Remark. One should not be very happy about the previous result when dealing with quasinorms; still existence of continuous linear functionals has to be proved in the space of your choice. For instance, in L_p[0,1] for 0 < p < 1, the only continuous linear functional is the zero functional! A proof of this result (M. M. Day’s Theorem) can be read in [Torc].

Corollary 26.1 All linear functionals on a finite dimensional (quasi)normed linear space are continuous.

Proof. This is a direct consequence of (v) in Theorem 26 above. [#]

Theorem 27 Any two (quasi)norms are equivalent in a finite dimensional linear space.

Proof. Given two different (quasi)norms in a finite dimensional linear space $X_d$, $\|\cdot\|_1$ and $\|\cdot\|_2$, it will be enough to prove that the linear function $F : (X_d,\|\cdot\|_1) \ni x \mapsto x \in (X_d,\|\cdot\|_2)$ is continuous. For that purpose, choose any basis of $X_d$, say $\{f_k\}_{k=1}^d$, and decompose $F$ in terms of the projections over the coordinate subspaces $\mathrm{span}(f_k)$: $F(\cdot) = \sum_{k=1}^d \mathrm{proj}_k(\cdot)f_k$. We have written $F$ as a finite sum of continuous functionals (by the previous Corollary); hence, $F$ must be continuous. Apply now part (iv) of Theorem 26 to get the desired result. [#]
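As an illustration of Theorem 27 in $\mathbb{R}^d$, the sketch below samples the ratio between the $\ell_p$ quasinorm with $p = 1/2$ and the Euclidean norm. The helper names are ours, and the constants used for comparison come from the standard inequalities $\|x\|_q \le \|x\|_p \le d^{1/p - 1/q}\|x\|_q$ for $0 < p < q$, which are not stated in the text.

```python
import numpy as np

def lp(x, p):
    """l_p (quasi)norm of the rows of x; only a quasinorm when 0 < p < 1."""
    return (np.abs(x) ** p).sum(axis=-1) ** (1.0 / p)

def equivalence_ratios(p, q, d, samples=10_000, seed=0):
    """Sampled min and max of ||x||_p / ||x||_q over random x in R^d."""
    x = np.random.default_rng(seed).normal(size=(samples, d))
    ratios = lp(x, p) / lp(x, q)
    return float(ratios.min()), float(ratios.max())

d = 5
lo, hi = equivalence_ratios(p=0.5, q=2.0, d=d)
upper = d ** (1 / 0.5 - 1 / 2.0)     # the constant d^(1/p - 1/q)
```

Every sampled ratio lies between 1 and `upper`, so the two (quasi)norms bound each other with constants that depend only on the dimension.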

Corollary 27.1 Every closed bounded set of a finite dimensional (quasi)normed linear space is compact.

4.1.2 The Hahn-Banach Theorem

Theorem 28 (Hahn-Banach Lemma) Let $F : X \to \mathbb{R}$ be a sublinear function on a vector space $X$ over a field $\mathbb{K}$, let $Y$ be a subspace of $X$ and let $\varphi : Y \to \mathbb{K}$ be a linear functional such that $|\varphi(x)| \le F(x)$ for all $x \in Y$. Then there exists a linear map $\Phi : X \to \mathbb{K}$ which extends $\varphi$ and which is dominated by $F$ on all of $X$.

Theorem 29 (Hahn-Banach) Let $(X,\|\cdot\|_X)$ be a normed linear space and let $Y$ be a linear subspace. Then to every bounded linear functional $\varphi : Y \to \mathbb{K}$ there corresponds another linear map $\Phi : X \to \mathbb{K}$ such that $\Phi|_Y = \varphi$, and $\|\varphi\|_Y = \|\Phi\|_X$.

4.2 Rearrangement-Invariant spaces

4.2.1 Riesz Spaces

Let $(\Omega,\mu)$ be a measure space; consider the following sets:

: $M(\Omega,\mu)$, the set of measurable functions on $\Omega$.
: $M^+(\Omega,\mu)$, the set of nonnegative measurable functions.
: $M_0(\Omega,\mu)$, the set of measurable functions $f$ such that $\mu\{|f| = \infty\} = 0$.

A mapping $\rho : M^+ \to [0,\infty]$ is called a quasinorm function, if for $f,g \in M^+(\Omega,\mu)$, the following properties hold:

$\rho$ is a quasinorm.
If $g \le f$ ($\mu$-a.e.), then $\rho(g) \le \rho(f)$.
If $(f_n)_n \subset M^+(\Omega,\mu)$ verifies $f_{n+1} \ge f_n$ ($\mu$-a.e.) for all $n$, and $\lim_n f_n = f$ ($\mu$-a.e.), then also $\lim_n \rho(f_n) = \rho(f)$.
For any measurable subset $E \subset \Omega$ with $\mu(E) < \infty$, $\rho(\chi_E) < \infty$.
For $E$ as before, there exists a constant $C_E > 0$ such that $\int_E f\,d\mu \le C_E\,\rho(f)$ for all $f$.

Given such a quasinorm function $\rho$ on $(\Omega,\mu)$, the collection $X(\rho) = \{f \in M(\Omega,\mu) \mid \rho(|f|) < \infty\}$ is called a Riesz space associated to $\rho$. Such spaces inherit special properties from their quasinorm:

Lattice property: if $|g| \le |f|$, then $\rho(g) \le \rho(f)$.
Fatou's property: Let $f, (f_n)_n \in M^+(\Omega,\mu)$ such that $\lim_n f_n = f$. If $f \in X(\rho)$, then $\lim_n \rho(f_n) = \rho(f)$; otherwise, $\lim_n \rho(f_n) = \infty$.
Fatou's Lemma: Given $(f_n)_n \subset M(\Omega,\mu)$, $\rho(\liminf_n f_n) \le \liminf_n \rho(f_n)$.
$\chi_E \in X(\rho)$ for every measurable subset $E \subset \Omega$ with $\mu(E) < \infty$; besides, there exists a constant $C_E > 0$ such that $\int_E f\,d\mu \le C_E\,\rho(f)$ for all $f \in X(\rho)$.
Every convergent sequence in $X(\rho)$ has a $\mu$-a.e. convergent subsequence.
Riesz-Fischer property: If $\sum_{n=1}^\infty \rho(f_n) < \infty$, then $\sum_{n=1}^\infty f_n$ converges in $X(\rho)$ to a function $f \in X(\rho)$, and $\rho(f) \le \sum_{n=1}^\infty \rho(f_n)$.

4.2.2 Resonant measure spaces

Definition 24 Given a measure space $(\Omega,\mu)$, and $f \in M(\Omega,\mu)$, consider the associated functions:

: Distribution. $\mu_f : (0,\infty) \ni y \mapsto \mu\{x \in \Omega : |f(x)| > y\} \in [0,\infty]$.
: Decreasing Rearrangement. $f^* : [0,\infty) \ni t \mapsto \inf\{y > 0 : \mu_f(y) \le t\} \in [0,\infty]$.
: Maximal function. $f^{**} : (0,\infty) \ni t \mapsto \frac{1}{t}\int_0^t f^*(s)\,ds \in [0,\infty]$.

We say two measurable functions $f,g$ are equimeasurable (and we write $f \sim g$), if $\mu_f = \mu_g$.
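A direct way to get a feeling for these three functions is to compute them on a finite measure space. The sketch below assumes, for illustration only, that $\Omega$ is a finite set whose points carry the masses `w`; this setting and all function names are ours.

```python
import numpy as np

def _sorted_profile(f, w):
    """|f| sorted decreasingly, with the cumulative mass of the sorted points."""
    order = np.argsort(-np.abs(f))
    return np.abs(f)[order], np.cumsum(w[order])

def distribution(f, w, y):
    """mu_f(y) = mu{x : |f(x)| > y}."""
    return float(w[np.abs(f) > y].sum())

def rearrangement(f, w, t):
    """f^*(t) = inf{y >= 0 : mu_f(y) <= t}."""
    values, mass = _sorted_profile(f, w)
    k = np.searchsorted(mass, t, side="right")   # number of cumulative masses <= t
    return float(values[k]) if k < len(values) else 0.0

def maximal(f, w, t):
    """f^**(t) = (1/t) * integral_0^t f^*(s) ds, for t > 0."""
    values, mass = _sorted_profile(f, w)
    left = np.concatenate(([0.0], mass[:-1]))
    lengths = np.clip(np.minimum(mass, t) - left, 0.0, None)
    return float((values * lengths).sum() / t)

f = np.array([1.0, -3.0, 2.0, 0.0])
w = np.full(4, 0.25)                 # four points of mass 1/4 each
```

For this `f`, the values are `distribution(f, w, 1.5) = 0.5`, `rearrangement(f, w, 0.3) = 2.0` and `maximal(f, w, 0.5) = 2.5`. Any permutation of the entries of `f` gives the same three functions, which is equimeasurability in this setting.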

Lemma 4.1 (Properties of the distribution function) Given $f,g \in M_0(\Omega,\mu)$:

$\mu_f$ is a nonnegative, right-continuous and monotone decreasing function.
If $|g| \le |f|$ ($\mu$-a.e.), then $\mu_g \le \mu_f$.
Given a sequence $(f_n)_n \subset M(\Omega,\mu)$ such that $|f_{n+1}| \ge |f_n|$ for all $n$, and $\lim_n |f_n| = |f|$, then we also have $\lim_n \mu_{f_n} = \mu_f$.

Lemma 4.2 (Properties of the decreasing rearrangement) Given $f,g \in M_0(\Omega,\mu)$:

$f^*$ is a nonnegative, right-continuous decreasing function.
If $|g| \le |f|$ ($\mu$-a.e.), then $g^* \le f^*$.
$(af)^* = |a|f^*$ for all $a \in \mathbb{R}$.
Fatou's property: If $|f| \le \liminf_n |f_n|$ ($\mu$-a.e.), then also $f^* \le \liminf_n f_n^*$; in particular, if $(f_n)_n$ verifies $|f_{n+1}| \ge |f_n|$ for all $n$, and $\lim_n |f_n| = |f|$ ($\mu$-a.e.), then $\lim_n f_n^* = f^*$.
$f \sim f^*$.
$(|f|^p)^* = (f^*)^p$ for all $0 < p < \infty$.
For all $f \in M_0(\Omega,\mu)$ and $0 < p < \infty$, $\int_\Omega |f|^p\,d\mu = p\int_0^\infty \lambda^{p-1}\mu_f(\lambda)\,d\lambda = \int_0^\infty f^*(t)^p\,dt.$
$\operatorname{ess\,sup}|f| = \inf\{\lambda \mid \mu_f(\lambda) = 0\} = f^*(0)$.
Weak subadditivity: $(f + g)^*(t + s) \le f^*(t) + g^*(s)$ for all $t,s > 0$.

Lemma 4.3 (Properties of the maximal function) Given $f,g \in M_0(\Omega,\mu)$:

$f^{**}$ is a nonnegative, nonincreasing continuous function.
$f^{**} \equiv 0$ if and only if $f = 0$ ($\mu$-a.e.).
$f^* \le f^{**}$.
If $|g| \le |f|$ ($\mu$-a.e.), then $g^{**} \le f^{**}$.
$(af)^{**} = |a|f^{**}$ for all $a \in \mathbb{R}$.
If $(f_n)_n \subset M(\Omega,\mu)$ verifies $|f_{n+1}| \ge |f_n|$ for all $n$, and $\lim_n |f_n| = |f|$, then $\lim_n f_n^{**} = f^{**}$.
Subadditivity: $(f + g)^{**}(t) \le f^{**}(t) + g^{**}(t)$ for all $t > 0$.
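The difference between the weak subadditivity of $f^*$ and the genuine subadditivity of $f^{**}$ shows up already on a two-point space. The following sketch assumes the uniform probability measure on finitely many points; the function names `star` and `double_star` are ours.

```python
import numpy as np

def star(f, t):
    """f^*(t) for the uniform probability measure on len(f) points."""
    a = np.sort(np.abs(f))[::-1]
    k = int(np.floor(t * len(a)))
    return float(a[k]) if k < len(a) else 0.0

def double_star(f, t):
    """f^**(t) = (1/t) * integral_0^t f^*(s) ds, for the same measure."""
    a = np.sort(np.abs(f))[::-1]
    edges = np.arange(len(a) + 1) / len(a)
    lengths = np.clip(np.minimum(edges[1:], t) - edges[:-1], 0.0, None)
    return float((a * lengths).sum() / t)

f, g = np.array([1.0, 0.0]), np.array([0.0, 1.0])
t = 0.75

plain_fails = star(f + g, t) > star(f, t) + star(g, t)
weak_holds = star(f + g, t) <= star(f, t / 2) + star(g, t / 2)
maximal_holds = double_star(f + g, t) <= double_star(f, t) + double_star(g, t)
```

Here $(f+g)^*(0.75) = 1$ while $f^*(0.75) + g^*(0.75) = 0$, so $f^*$ itself is not subadditive at a common argument; splitting the argument as $t/2 + t/2$ restores the inequality, and $f^{**}$ satisfies it without any splitting.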

Remark. These three new functions associated to f may be used to perform integral operations on f, but in a simpler setup. The three subsequent results show us how:

Lemma 4.4 Given a simple nonnegative function $g \in M(\Omega,\mu)$, the following estimate holds:

$\int_\Omega g\,d\mu \le \int_0^{\mu(\Omega)} g^*(s)\,ds.$

Proposition 13 (Hardy-Littlewood) Given $f,g \in M_0(\Omega,\mu)$, the following estimate holds:

$\int_\Omega |fg|\,d\mu \le \int_0^\infty f^*(s)g^*(s)\,ds.$

Corollary 13.1 Given $f,g \in M_0(\Omega,\mu)$,

$\int_\Omega |f\tilde{g}|\,d\mu \le \int_0^\infty f^*(s)g^*(s)\,ds,$

for all $\tilde{g} \sim g$.

Example. Consider $\Omega = \{1,\dots,n\}$ with measure $\mu : \Omega \ni k \mapsto 1/n$ for all $k$. Notice that, given any measurable function $g : \Omega \ni k \mapsto g_k \in \mathbb{R}$, then any $\tilde{g} \sim g$ may be obtained by mere permutation of the elements (there exists a permutation $\sigma \in S_n$ such that $\tilde{g}_k = g_{\sigma(k)}$). In this case, equality is attained in Corollary 13.1: Given $f = (f_1,\dots,f_n)$, consider a permutation $\sigma$ such that $|f_{\sigma(k)}| \ge |f_{\sigma(k+1)}|$ for all $k$; then we have $\mu_f = \chi_{[0,|f_{\sigma(n)}|)} + \sum_{k=1}^{n-1} \frac{k}{n}\chi_{[|f_{\sigma(k+1)}|,|f_{\sigma(k)}|)}$, and $f^* = \sum_{k=1}^n |f_{\sigma(k)}|\chi_{[(k-1)/n,k/n)}$; therefore, for any given $g = (g_1,\dots,g_n)$, it suffices to find two permutations: first $\sigma'$ permutes the indices so that $|g_{\sigma'(k)}| \ge |g_{\sigma'(k+1)}|$, and then $\pi$ matches $\sigma'(k)$ with $\sigma(k)$. This gives us, for $\tilde{g}_k = g_{\pi(k)}$, that

$\int_\Omega |f\tilde{g}|\,d\mu = \sum_{k=1}^n \frac{1}{n}|f_k\tilde{g}_k| = \sum_{k=1}^n \frac{1}{n}|f_{\sigma(k)}g_{\sigma'(k)}| = \int_0^\infty f^*(s)g^*(s)\,ds.$
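The example can be checked by brute force for a small $n$. The sketch below is ours (function names and data are hypothetical): it enumerates every permutation of $g$, confirms that none exceeds the bound of Corollary 13.1, and builds the matching rearrangement described above.

```python
import itertools
import numpy as np

def rearranged_integral(f, g):
    """integral_0^1 f^*(s) g^*(s) ds for the uniform measure mu({k}) = 1/n."""
    a = np.sort(np.abs(f))[::-1]
    b = np.sort(np.abs(g))[::-1]
    return float(a @ b / len(f))

def matching_rearrangement(f, g):
    """A permutation of g whose ranking by absolute value lines up with f's."""
    sigma = np.argsort(-np.abs(f))         # |f[sigma[0]]| >= |f[sigma[1]]| >= ...
    sigma_prime = np.argsort(-np.abs(g))   # same ordering for g
    g_tilde = np.empty_like(g)
    g_tilde[sigma] = g[sigma_prime]        # k-th largest |g| meets k-th largest |f|
    return g_tilde

f = np.array([2.0, -5.0, 1.0, 3.0])
g = np.array([4.0, 0.5, -2.0, 1.0])
n = len(f)

bound = rearranged_integral(f, g)
all_values = [float(np.abs(f * np.array(p)).sum() / n)
              for p in itertools.permutations(g)]
best = matching_rearrangement(f, g)
attained = float(np.abs(f * best).sum() / n)
```

For these data `bound` equals 7.125, the largest entry of `all_values` is the same number, and it is attained by `best`, which is the finite-dimensional form of strong resonance.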

Definition 25 We say a measure space $(\Omega,\mu)$ is resonant if

$\sup_{\tilde{g} \sim g} \int_\Omega |f\tilde{g}|\,d\mu = \int_0^\infty f^*(s)g^*(s)\,ds$

for all $f,g \in M_0(\Omega,\mu)$. We say the space is strongly resonant if the supremum is attained.

We will prove that any compact cube $\Omega \subset \mathbb{R}^d$ with the Lebesgue measure is a strongly resonant space, and therefore we may use the previous results to simplify the computation of integral operations on it:

Lemma 4.5 Let $\Omega \subset \mathbb{R}^d$ be a compact cube, and let $\mu = |\cdot|$ denote the Lebesgue measure. Given $f \in M_0(\Omega,\mu)$, and $t \in [0,|\Omega|]$, there exists a measurable subset $\Omega_t \subset \Omega$ with $|\Omega_t| = t$, and such that $\int_{\Omega_t} |f(x)|\,dx = \int_0^t f^*(s)\,ds$. Moreover, these sets can be chosen so that $s < t$ implies $\Omega_s \subset \Omega_t$.

Proposition 14 Any cube in $\mathbb{R}^d$ with the Lebesgue measure is a strongly resonant space.

Definition 26 Given a totally $\sigma$-finite measure space $(\Omega,\mu)$, a quasinorm function $\rho : M^+(\Omega,\mu) \to \mathbb{R}^+$ such that $\rho(f) = \rho(g)$ for each $f \sim g \in M_0^+(\Omega,\mu)$ is called rearrangement-invariant. In that case, the space $X(\rho)$ is said to be rearrangement-invariant as well.

Notice that the spaces $L_p$ for any $0 < p \le \infty$ are all rearrangement-invariant.

Definition 27 Let $(X,\|\cdot\|_X)$ be a rearrangement-invariant function space over a resonant measure space $(\Omega,\mu)$. Consider the function $\varphi_X : [0,\mu(\Omega)] \ni t \mapsto \|\chi_E\|_X \in \mathbb{R}$, where $E \subset \Omega$ is any measurable subset with $\mu(E) = t$ (notice that, if $F \subset \Omega$, $F \ne E$ and $\mu(F) = \mu(E)$, then $\chi_F \sim \chi_E$ and they have the same norm).

4.3 To Do

We need the proof of the Reiteration Theorem (page 5) for $(\eta,q)$ interpolation spaces via the real method of Peetre and Lions, and the previous result. It will be used in section 2.11.
Perhaps a more exhaustive reading on the paper by Brown and Lucier. Mainly the characterization of best-approximations in L₁ and the main theorem.
Finish the proof of the Lemma on mollifiers in section 2.6.1.
Maybe include the construction of several K-functionals; this will give you the opportunity of presenting the Rearrangement-Invariant spaces and its applications. If you want to prove Whitney’s Theorem, then you must show the K-functional for the pair (L_p,W_p^r).
Prove the results in section 4.1.2, and include a few corollaries. Which ones? I am not sure; among the ones I have in Friedman and Prof. Philips’ notes, it is possible that there are some that have interest for Approximation Theory. Sit on it for a while.
Finish the proofs of the results in section 2.7.1 on Besov Spaces.
Extend the results in section 1.3 using chapter 3 of [DeLo]. Theorem 1.3 and its implications to best approximation in $L_p$ for $1 < p < \infty$ set an idea of what one can or cannot require of (near)best operators; namely, linearity, continuity, boundedness, and how to use those properties to construct (near)best approximations to any given function. This is a good spot to include item 2, although it might be even better in section 2.1.
Maybe prove all the claims in section 4.2.2; for completion, mainly. Or the reader can be directed to the proofs in [BeSh] if interested.
Finish the missing proofs in section 2.6.3 related to the equality of spaces H_p^r(G) = W_p^r(G).

References

[Adam] R.A. Adams, “Sobolev Spaces”, Academic Press, New York, 1975.

[deBo] C. de Boor, “Class notes for Math/CS 887, Spring’03”, http://www.cs.wisc.edu/~deboor.

[dBFi] C. de Boor and G.F. Fix, “Spline approximation by quasi-interpolants”, J. Approx. Theory 8 (1973), 19-45.

[BeSh] C. Bennett and R. Sharpley, “Interpolation of Operators”, Academic Press (1988), New York.

[BrLu] L. Brown and B. Lucier, “Best approximations in L₁ are near best in L_p, p < 1”, Proc. Amer. Math. Soc. 120 (1994), 97-100.

[Bure] V.I. Burenkov, “Sobolev Spaces in Domains”, http://www.cf.ac.uk/maths/people/Sobol.pdf

[Cal1] A.P. Calderón, “Intermediate spaces and interpolation: the complex method”, Studia Math. 24 (1964), 113-190.

[Cal2] A.P. Calderón, “Spaces between L₁ and L_∞ and the Theorem of Marcinkiewicz: the complex method”, Studia Math. 26 (1964), 273-279.

[CDeH] A. Cohen, R. DeVore and R. Hochmuth, “Restricted Approximation”, Constr. Approx. 16 (2000), no. 1, 85-113.

[CuSc] H. B. Curry, I. J. Schoenberg, “On Pólya frequency functions. IV. The fundamental spline functions and their limits”, J. Analyse Math 17 (1966), 71-107.

[DeVo] R. DeVore, “Nonlinear Approximation”, Acta Numerica 7 (1998), 51-150.

[DeLo] R. DeVore and G. Lorentz, “Constructive Approximation”, Springer Grundlehren, Heidelberg, 1993.

[DeP1] R. DeVore and V. Popov, “Interpolation of Besov Spaces”, Trans. Amer. Math. Soc. 305 (1988), 397-414.

[DeP2] R. DeVore and V. Popov, “Interpolation spaces and nonlinear approximation”, Function Spaces and Applications (M. Cwikel et al., eds), Vol. 1302 of Lecture Notes in Mathematics, Springer, Berlin, 191-205.

[DeSh] R. DeVore and R. Sharpley, “Maximal Functions Measuring Smoothness”, Memoirs Vol. 293 (1984), American Mathematical Society, Providence, RI.

[Frie] A. Friedman, “Foundations of Modern Analysis”, Dover, New York, 1982.

[Peet] J. Peetre, “A Theory of Interpolation of Normed Spaces”, Course notes, University of Brasilia (1963).

[Petr] P. Petrushev, “Approximation by Ridge Functions and Neural Networks”, SIAM J. on Math Analysis, 30 (1998) 115-189.

[Torc] A. Torchinsky, “Real Variables”, Addison-Wesley, 1988.