Elements of Approximation Theory:
Constructive Approximation and Examples

Francisco Blanco-Silva *


*Department of Mathematics, Purdue University
Abstract

In this survey we introduce the general Theory of Approximation to functions in (quasisemi)normed spaces. The exposition starts with an explanation of the main problem: we fix a certain family of subspaces as our approximants, and we need to describe the subspace(s) approximated by this family with a given approximation order. We also introduce some of the background and basic tools most often used to solve this kind of problem. Approximation Theory improves considerably when effort is put into the effective construction of the approximants in each given example, rather than simply stating their existence; this is what we call “Constructive Approximation”. The fact that we can handle actual functions allows us to obtain yet more properties of the approximants. It is implicit throughout the exposition how Approximation Theory benefits from other branches of Mathematics, but also how Constructive Approximation can be used to prove results in those other subjects. Finally, we include extensive examples that help us better understand how all this can be achieved.

Contents

1 Approximation Theory
 1.1 Introduction
 1.2 Hardy’s Inequalities
 1.3 Best and Near-Best Approximation
 1.4 Interpolation Spaces: The Real Method
 1.5 Jackson and Bernstein Inequalities
2 Dyadic maximal-smoothness Spline Approximation
 2.1 Best and Near-Best Lp Polynomial Approximation in cubes of Rd
 2.2 Markov’s Theorem
 2.3 Divided Differences
 2.4 Univariate Splines
  2.4.1 Definition and Basic Properties
  2.4.2 Quasi-Interpolant Operators
 2.5 Tensor Product Splines: Description of the problem
 2.6 Sobolev Spaces
  2.6.1 Mollifiers and Infinitely Differentiable Partitions of Unity
  2.6.2 Distributions and Weak Derivatives
  2.6.3 Sobolev Spaces Wpr(G) and Hpr(G)
  2.6.4 Properties of Sobolev Spaces in one dimension
 2.7 Modulus of smoothness and Besov Spaces
  2.7.1 Definitions and Properties
  2.7.2 Whitney’s Theorem
 2.8 Other Seminorms for Besov Spaces
 2.9 Further results: K-functional of compatible couples of Besov Spaces
 2.10 Lorentz spaces
 2.11 Further results: Interpolation of Besov Spaces
 2.12 Further Results: Embedding Theorems for Besov Spaces
3 Approximation by Ridge Functions on Dyadic maximal-smoothness splines
 3.1 General Theory of Ridge Functions
4 Appendix
 4.1 Elements of Functional Analysis
  4.1.1 Linear Transformations
  4.1.2 The Hahn-Banach Theorem
 4.2 Rearrangement-Invariant spaces
  4.2.1 Riesz Spaces
  4.2.2 Resonant measure spaces
 4.3 To Do

1 Approximation Theory

1.1 Introduction

Let $(X,\|\cdot\|_X)$ be a (quasisemi)normed linear space. Consider a countable family of spaces in $X$, $\{X_n\}_n$, with associated error functionals $E(\cdot,X_n)_X = \inf_{g \in X_n} \|\cdot - g\|_X$, satisfying the following properties:

  1. Homogeneity: $cX_n = X_n$ for all $n$ and all $c \in \mathbb{R}$, $c \neq 0$.
  2. C-linearity: There exists $C > 1$ (independent of $n$) such that $X_n + X_n = X_{Cn}$ for all $n$.
  3. Local (near)best approximation: Given $f \in X$, there exists an element of (near)best approximation to $f$ from $X_n$ for all $n$.

    We say $g \in Y$ is an element of best approximation to $f \in X$ if $\|f - g\|_X = E(f,Y)_X$.

    A near-best approximation element to $f$ from $Y$ is by definition any function $g \in Y$ such that $\|f - g\|_X \le t\,E(f,Y)_X$ for some value $t > 0$. In that case, we will often refer to such a function $g$ as a $t$-near-best approximation element.

  4. Global best approximation: $\lim_{n} E(f,X_n)_X = 0$ for all $f \in X$.

We will call $\{X_n\}$ a family of approximants. For any such family, and given parameter values $\alpha, q > 0$, consider the following (quasi)seminorms and associated subspaces:

$$|\cdot|_{A^\alpha_q(X,X_n)} = \Bigl\{ \sum_{n=1}^{\infty} \frac{1}{n} \bigl( n^\alpha E(\cdot,X_n)_X \bigr)^q \Bigr\}^{1/q} \quad \text{for } 0 < q < \infty \tag{1.1}$$
$$|\cdot|_{A^\alpha_\infty(X,X_n)} = \sup_{n \ge 1} \bigl\{ n^\alpha E(\cdot,X_n)_X \bigr\} \tag{1.2}$$
$$A^\alpha_q(X,X_n) = \bigl\{ f \in X : |f|_{A^\alpha_q(X,X_n)} < \infty \bigr\}$$

We call these the Approximation Spaces associated to the family of approximants $\{X_n\}$. They consist of those functions $f \in X$ that are approximated in $X$ by elements of $X_n$ with error of order $O(n^{-\alpha})$ (i.e., there exists $C > 0$ such that $E(f,X_n)_X \le C n^{-\alpha}$ for all $n$) and “smoothness” $q$.
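
As a rough numerical illustration of (1.1) and (1.2), the following Python sketch evaluates truncated versions of the two (quasi)seminorms from a finite list of errors. The error sequence $E(f,X_n)_X = n^{-\beta}$, the truncation length, and the function name `approx_seminorm` are assumptions made only for this example; they do not come from any particular family of approximants.

```python
import math

def approx_seminorm(errors, alpha, q):
    """Truncated version of (1.1)/(1.2); errors[n-1] plays the role of E(f, X_n)_X."""
    weighted = [(n ** alpha) * e for n, e in enumerate(errors, start=1)]
    if math.isinf(q):
        return max(weighted)                      # sup over n of n^alpha * E_n
    total = sum(w ** q / n for n, w in enumerate(weighted, start=1))
    return total ** (1.0 / q)                     # (sum (1/n) (n^alpha E_n)^q)^(1/q)

beta = 1.0  # hypothetical decay rate of the errors, for illustration only
errors = [n ** (-beta) for n in range(1, 10001)]

print(approx_seminorm(errors, alpha=0.5, q=2.0))
print(approx_seminorm(errors, alpha=0.5, q=math.inf))
```

For this model sequence the partial sums stay bounded as the truncation grows when $\alpha < \beta$, and they grow without bound when $\alpha \ge \beta$ and $q < \infty$, which matches the reading of $\alpha$ as an approximation order.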

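Lemma 1.1 If the sequence $(E(f,X_n)_X)_n$ is monotone decreasing, then the (quasi)seminorms $|\cdot|_{A^\alpha_q(X,X_n)}$ are equivalent to the following:

$$|f|_{A^\alpha_q(X,X_n)} \asymp \Bigl\{ \sum_{k=0}^{\infty} \bigl( 2^{k\alpha} E(f,X_{2^k})_X \bigr)^q \Bigr\}^{1/q} \quad \text{for } 0 < q < \infty \tag{1.3}$$
$$|f|_{A^\alpha_\infty(X,X_n)} \asymp \sup_{k \ge 0} 2^{k\alpha} E(f,X_{2^k})_X \tag{1.4}$$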

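Proof. For any $k \ge 1$ and any $2^{k-1} \le n < 2^k$, the monotonicity of the errors gives the estimates

$$2^{(k-1)\alpha} E(f,X_{2^k})_X \le n^\alpha E(f,X_n)_X \le 2^{k\alpha} E(f,X_{2^{k-1}})_X;$$

therefore,

$$\sum_{n=2^{k-1}}^{2^k-1} \frac{1}{n} \bigl( n^\alpha E(f,X_n)_X \bigr)^q \le \sum_{n=2^{k-1}}^{2^k-1} \frac{1}{2^{k-1}} \bigl( 2^{k\alpha} E(f,X_{2^{k-1}})_X \bigr)^q = 2^{\alpha q} \bigl( 2^{(k-1)\alpha} E(f,X_{2^{k-1}})_X \bigr)^q,$$

and similarly, since $1/n > 2^{-k}$ on this range,

$$\sum_{n=2^{k-1}}^{2^k-1} \frac{1}{n} \bigl( n^\alpha E(f,X_n)_X \bigr)^q \ge \frac{1}{2} \bigl( 2^{(k-1)\alpha} E(f,X_{2^k})_X \bigr)^q = 2^{-\alpha q - 1} \bigl( 2^{k\alpha} E(f,X_{2^k})_X \bigr)^q.$$

Adding all the terms of the series that defines the seminorm associated to the spaces $A^\alpha_q(X,X_n)$, and applying the estimates above, we get the desired result. □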
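
The equivalence between the full series and its dyadic version can be checked numerically. The sketch below is only an illustration under stated assumptions: the monotone error model $E(f,X_n)_X = n^{-\beta}$ and the helper names `full_sum` and `dyadic_sum` are hypothetical. For a monotone decreasing sequence, the ratio of the two truncated sums should lie between $2^{-\alpha q-2}$ and $2^{\alpha q}$.

```python
def full_sum(E, alpha, q, K):
    """sum over 1 <= n < 2^K of (1/n) * (n^alpha * E(n))^q."""
    return sum((n ** alpha * E(n)) ** q / n for n in range(1, 2 ** K))

def dyadic_sum(E, alpha, q, K):
    """sum over 0 <= k <= K of (2^(k*alpha) * E(2^k))^q."""
    return sum((2 ** (k * alpha) * E(2 ** k)) ** q for k in range(K + 1))

def E(n, beta=1.0):
    # Hypothetical monotone decreasing error sequence, for illustration only.
    return float(n) ** (-beta)

alpha, q, K = 0.5, 2.0, 12
ratio = full_sum(E, alpha, q, K) / dyadic_sum(E, alpha, q, K)
print(ratio, 2 ** (-alpha * q - 2), 2 ** (alpha * q))
```

The printed ratio sits between the two constants, which depend only on $\alpha$ and $q$ and not on the truncation level $K$.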

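Lemma 1.2 Under the same hypothesis as before, and for any value $\alpha > 0$, the following inclusion holds for all $0 < q < p \le \infty$:

$$A^\alpha_q(X,X_n) \subset A^\alpha_p(X,X_n).$$

Proof. This is just an application of the well-known inclusions $\ell_q \subset \ell_p$ for $0 < q < p \le \infty$. Consider the measure space $(\mathbb{N},\mu)$, where $\mu(\{n\}) = 1$ for all $n$, and consider for each function $f \in X$ the (measurable in $(\mathbb{N},\mu)$) function $f_\alpha : \mathbb{N} \ni k \mapsto 2^{k\alpha} E(f;X_{2^k})_X \in \mathbb{R}_+$; then,

$$|f|_{A^\alpha_q(X,X_n)} \asymp \Bigl\{ \sum_{k=0}^{\infty} \bigl( 2^{k\alpha} E(f;X_{2^k})_X \bigr)^q \Bigr\}^{1/q} = \Bigl( \int_{\mathbb{N}} f_\alpha^q \, d\mu \Bigr)^{1/q}$$

If $f \in A^\alpha_q$, then certainly $f_\alpha \in \ell_q \cap \ell_\infty$, and therefore, for $q < p < \infty$,

$$|f|_{A^\alpha_p(X,X_n)} \asymp \Bigl( \int_{\mathbb{N}} f_\alpha^p \, d\mu \Bigr)^{1/p} \le \|f_\alpha\|_{\ell_\infty}^{1-q/p} \, \|f_\alpha\|_{\ell_q}^{q/p} \asymp |f|_{A^\alpha_\infty(X,X_n)}^{1-q/p} \, |f|_{A^\alpha_q(X,X_n)}^{q/p}. \qquad \square$$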

Remark. In the following sections we will learn to find descriptions of these spaces in terms of classical spaces. The main tools used to this end are presented in the sections that follow.

1.2 Hardy’s Inequalities

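Theorem 1 (Hardy) Given $\alpha > 0$ and $1 \le q < \infty$, the following inequalities hold for each nonnegative measurable function $f$:

$$\int_0^\infty \Bigl( t^{-\alpha} \int_0^t f(s) \, \frac{ds}{s} \Bigr)^q \frac{dt}{t} \le \frac{1}{\alpha^q} \int_0^\infty \bigl[ t^{-\alpha} f(t) \bigr]^q \frac{dt}{t} \tag{1.5}$$
$$\int_0^\infty \Bigl( t^{\alpha} \int_t^\infty f(s) \, \frac{ds}{s} \Bigr)^q \frac{dt}{t} \le \frac{1}{\alpha^q} \int_0^\infty \bigl[ t^{\alpha} f(t) \bigr]^q \frac{dt}{t} \tag{1.6}$$

For $q = \infty$, the integrals are replaced by the $L_\infty$ norms:

$$\sup_{t>0} \Bigl\{ t^{-\alpha} \int_0^t f(s) \, \frac{ds}{s} \Bigr\} \le \frac{1}{\alpha} \, \| t^{-\alpha} f(t) \|_{L_\infty(0,\infty)} \tag{1.7}$$
$$\sup_{t>0} \Bigl\{ t^{\alpha} \int_t^\infty f(s) \, \frac{ds}{s} \Bigr\} \le \frac{1}{\alpha} \, \| t^{\alpha} f(t) \|_{L_\infty(0,\infty)} \tag{1.8}$$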

Proof. Let us prove the estimate (1.5). For any value $c > 0$ we first estimate the interior integral using Hölder’s Inequality; let $p$ be the conjugate exponent of $q$:

$$\int_0^t f(s) \, \frac{ds}{s} = \int_0^t s^{-c} f(s) \, s^{c-1} \, ds \le \Bigl( \int_0^t \bigl[ s^{-c} f(s) \bigr]^q ds \Bigr)^{1/q} \Bigl( \int_0^t s^{p(c-1)} \, ds \Bigr)^{1/p}$$

The second integral can be computed provided $c < 1$; in that case we obtain

$$\int_0^t s^{-p(1-c)} \, ds = C_{q,c} \, s^{1-p(1-c)} \Big]_{s=0}^{t}$$

and if $c > 1/q$, we have $\int_0^t s^{-p(1-c)} \, ds = C_{q,c} \, t^{1-p(1-c)}$. We can then estimate the left-hand side of (1.5) using this result and a change in the order of integration:

$$\begin{aligned}
\int_0^\infty t^{-\alpha q} \Bigl( \int_0^t f(s) \, \frac{ds}{s} \Bigr)^q \frac{dt}{t}
&\le C_{q,c} \int_0^\infty t^{-\alpha q - 1} \, t^{q/p - q + cq} \int_0^t \bigl[ s^{-c} f(s) \bigr]^q ds \, dt \\
&= C_{q,c} \int_0^\infty t^{q(c-\alpha) - 2} \int_0^t \bigl[ s^{-c} f(s) \bigr]^q ds \, dt \\
&= C_{q,c} \int_0^\infty \bigl[ s^{-c} f(s) \bigr]^q \int_s^\infty t^{q(c-\alpha) - 2} \, dt \, ds
\end{aligned}$$

The latter integral can be computed provided $c < 1/q + \alpha$; we get in that case

$$\int_0^\infty \Bigl[ t^{-\alpha} \int_0^t f(s) \, \frac{ds}{s} \Bigr]^q \frac{dt}{t} \le C_{q,\alpha,c} \int_0^\infty \bigl[ s^{-\alpha} f(s) \bigr]^q \frac{ds}{s}$$

In order to remove the dependence of the constant on $c$, we may choose this parameter so that it depends solely on $\alpha$ and $q$, while still satisfying the constraints we have imposed. The obvious choice is $c = 1/q + \alpha/p$, and in that case we get $C_{q,\alpha} = \alpha^{-q}$.

The remaining estimates can be obtained from this one by changes of variable or by taking limits, so we skip their proofs. □
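
The value of the constant for the choice $c = 1/q + \alpha/p$ can be verified directly. The short computation below is a sketch of that check, assuming $1 < q < \infty$ and $1/p + 1/q = 1$; it uses only the two integrals evaluated in the proof.

```latex
% Check of the constant for c = 1/q + \alpha/p, with 1/p + 1/q = 1 and 1 < q < \infty.
% Note that 1/q < c < 1/q + \alpha, so both constraints imposed in the proof hold.
\begin{align*}
1 - p(1-c) &= 1 - p + \frac{p}{q} + \alpha = \alpha
  && \text{since } \frac{p}{q} = p - 1, \\
1 - q(c-\alpha) &= \alpha q \Bigl( 1 - \frac{1}{p} \Bigr) = \alpha
  && \text{since } q \Bigl( 1 - \frac{1}{p} \Bigr) = 1, \\
C_{q,\alpha} &= \Bigl( \frac{1}{1 - p(1-c)} \Bigr)^{q/p} \cdot \frac{1}{1 - q(c-\alpha)}
  = \alpha^{-q/p} \, \alpha^{-1} = \alpha^{-q}
  && \text{since } \frac{q}{p} + 1 = q.
\end{align*}
```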

Remark. It is possible to discretize integrals of the kind $\int_0^\infty [t^{-\alpha} f(t)]^q \, \frac{dt}{t}$ when the functions $f(t)$ are nonnegative and monotone, using the same technique we used in the proof of Lemma 1.1:

$$\int_0^\infty \bigl[ t^{-\alpha} f(t) \bigr]^q \frac{dt}{t} = \sum_{k \in \mathbb{Z}} \int_{2^k}^{2^{k+1}} \bigl[ t^{-\alpha} f(t) \bigr]^q \frac{dt}{t} \asymp \sum_{k \in \mathbb{Z}} 2^{\alpha q k} f(2^{-k})^q.$$

Associated to this discrete functional, we have the following result, which will allow us to extend the previous theorem to $0 < q < 1$ for nonnegative monotone functions.

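Lemma 1.3 (Discrete Hardy’s Inequalities) Let $a = (a_n)_n$, $b = (b_n)_n$ be two nonnegative sequences such that there exist $C_0, \mu, c > 0$ so that for all $n$, either

$$b_n \le C_0 \Bigl( \sum_{k=n}^{\infty} a_k^\mu \Bigr)^{1/\mu} \qquad \text{or} \qquad b_n \le C_0 \, 2^{-nc} \Bigl\{ \sum_{k=-\infty}^{n} \bigl( 2^{kc} a_k \bigr)^\mu \Bigr\}^{1/\mu};$$

then, for all $q > 0$, and $\alpha > 0$ (in the first case) or $0 < \alpha < c$ (in the second),

$$\sum_{n \in \mathbb{Z}} \bigl( 2^{n\alpha} b_n \bigr)^q \le C_0^q \, C_{q,\alpha} \sum_{n \in \mathbb{Z}} \bigl( 2^{n\alpha} a_n \bigr)^q.$$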

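Proof. Let us assume that the first condition is satisfied. From the inclusions $\ell_\nu \subset \ell_\mu$ for $0 < \nu < \mu \le \infty$, we infer that it must also be $b_n^\nu \le C_0^\nu \sum_{k=n}^{\infty} a_k^\nu$ for all $\nu < \mu$. We can therefore assume that $\mu < q$ (if it is not, then we can certainly pick $\nu < q \le \mu$). We are now able to use Hölder’s Inequality; let $0 < \beta < \alpha$, and let $r > 0$ so that $\mu/q + \mu/r = 1$:

$$\sum_{k=n}^{\infty} a_k^\mu = \sum_{k=n}^{\infty} \bigl( 2^{\beta k} a_k \, 2^{-\beta k} \bigr)^\mu = \sum_{k=n}^{\infty} \bigl( 2^{\beta k} a_k \bigr)^\mu 2^{-\beta k \mu} \le \Bigl\{ \sum_{k=n}^{\infty} \bigl( 2^{\beta k} a_k \bigr)^{\mu q/\mu} \Bigr\}^{\mu/q} \Bigl\{ \sum_{k=n}^{\infty} 2^{-\beta k \mu r/\mu} \Bigr\}^{\mu/r};$$

therefore,

$$b_n \le C_0 \Bigl\{ \sum_{k=n}^{\infty} \bigl( 2^{\beta k} a_k \bigr)^q \Bigr\}^{1/q} \Bigl\{ \sum_{k=n}^{\infty} 2^{-\beta k r} \Bigr\}^{1/r} = C_0 \Bigl( \frac{2^{-\beta n r}}{1 - 2^{-\beta r}} \Bigr)^{1/r} \Bigl\{ \sum_{k=n}^{\infty} \bigl( 2^{\beta k} a_k \bigr)^q \Bigr\}^{1/q}.$$

We have then

$$\begin{aligned}
\sum_{n \in \mathbb{Z}} \bigl( 2^{n\alpha} b_n \bigr)^q
&\le C_0^q \, C_{\beta,q,\mu} \sum_{n \in \mathbb{Z}} 2^{\alpha n q} \, 2^{-\beta n q} \sum_{k=n}^{\infty} \bigl( 2^{\beta k} a_k \bigr)^q \\
&= C_0^q \, C_{\beta,q,\mu} \sum_{n \in \mathbb{Z}} \sum_{k=n}^{\infty} 2^{n q(\alpha-\beta)} \, 2^{\beta k q} a_k^q \\
&= C_0^q \, C_{\beta,q,\mu} \sum_{k \in \mathbb{Z}} \sum_{n=-\infty}^{k} 2^{n q(\alpha-\beta)} \bigl( 2^{\beta k} a_k \bigr)^q \\
&= C_0^q \, C_{\beta,q,\mu} \sum_{k \in \mathbb{Z}} \frac{2^{q(\alpha-\beta)k}}{1 - 2^{-q(\alpha-\beta)}} \bigl( 2^{\beta k} a_k \bigr)^q \\
&= C_0^q \, C_{\alpha,\beta,q,\mu} \sum_{k \in \mathbb{Z}} \bigl( 2^{\alpha k} a_k \bigr)^q
\end{aligned}$$

As in the proof of the previous Theorem, we may choose $\beta$ depending only on $\alpha$, and $\mu$ depending solely on $q$, hence narrowing the dependence of the constants on parameters.

A similar proof shows that the second condition also gives the same estimate, but this time only for values $0 < \alpha < c$. □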
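
To make the discrete inequality concrete, the following Python sketch tests it in the simplest setting: the first hypothesis with exponent 1 and $C_0 = 1$, that is, $b_n = \sum_{k \ge n} a_k$, and $0 < q \le 1$. In that range the elementary bound $(\sum_k a_k)^q \le \sum_k a_k^q$ and a change in the order of summation give the explicit constant $1/(1 - 2^{-\alpha q})$. The finitely supported sequence and the name `discrete_hardy_sides` are assumptions made for this illustration only.

```python
def discrete_hardy_sides(a, alpha, q, n_min):
    """Both sides of the discrete Hardy inequality for b_n = sum_{k >= n} a_k.

    a[k] is given for k = 0..N-1 and taken to be zero elsewhere; the left-hand
    sum is truncated to n_min <= n < N (terms with n >= N vanish).
    """
    N = len(a)
    tails = [0.0] * (N + 1)
    for k in range(N - 1, -1, -1):
        tails[k] = tails[k + 1] + a[k]          # tails[k] = a[k] + ... + a[N-1]

    def b(n):
        return tails[max(n, 0)]                  # for n < 0 the tail is the full sum

    lhs = sum((2 ** (n * alpha) * b(n)) ** q for n in range(n_min, N))
    rhs = sum((2 ** (n * alpha) * a[n]) ** q for n in range(N))
    return lhs, rhs

a = [1.0 / (k + 1) for k in range(30)]           # hypothetical nonnegative sequence
alpha, q = 0.7, 0.5
lhs, rhs = discrete_hardy_sides(a, alpha, q, n_min=-60)
print(lhs, rhs / (1 - 2 ** (-alpha * q)))
```

The first printed number does not exceed the second. For $q > 1$ the inequality still holds according to the lemma, but with a different constant that this sketch does not compute.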

Theorem 2 Given $\alpha > 0$ and $0 < q < 1$, there exists $C > 0$ that depends at most on $\alpha$ and $q$ such that the following inequalities hold for each nonnegative monotone function $f$:

$$\int_0^\infty \Bigl( t^{-\alpha} \int_0^t f(s) \, \frac{ds}{s} \Bigr)^q \frac{dt}{t} \le C \int_0^\infty \bigl[ t^{-\alpha} f(t) \bigr]^q \frac{dt}{t} \tag{1.9}$$
$$\int_0^\infty \Bigl( t^{\alpha} \int_t^\infty f(s) \, \frac{ds}{s} \Bigr)^q \frac{dt}{t} \le C \int_0^\infty \bigl[ t^{\alpha} f(t) \bigr]^q \frac{dt}{t} \tag{1.10}$$

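Proof. Given $t > 0$, there exists $n \in \mathbb{Z}$ such that $2^{-(n+1)} \le t < 2^{-n}$; therefore (taking $f$ nondecreasing in this step),

$$\int_0^t f(s) \, \frac{ds}{s} \le \int_0^{2^{-n}} f(s) \, \frac{ds}{s} = \sum_{k=n}^{\infty} \int_{2^{-(k+1)}}^{2^{-k}} f(s) \, \frac{ds}{s} \le \sum_{k=n}^{\infty} f(2^{-k}) \, \log(s) \Big]_{2^{-(k+1)}}^{2^{-k}} = (\log 2) \sum_{k=n}^{\infty} f(2^{-k}).$$

Denote $b_n = \int_0^{2^{-n}} f(s) \, \frac{ds}{s}$, $a_n = f(2^{-n})$, and use the previous lemma to estimate the integral in the left-hand side of (1.9):

$$\begin{aligned}
\int_0^\infty \Bigl( t^{-\alpha} \int_0^t f(s) \, \frac{ds}{s} \Bigr)^q \frac{dt}{t}
&= \sum_{n \in \mathbb{Z}} \int_{2^{-(n+1)}}^{2^{-n}} \Bigl( t^{-\alpha} \int_0^t f(s) \, \frac{ds}{s} \Bigr)^q \frac{dt}{t} \\
&\le \sum_{n \in \mathbb{Z}} \bigl( 2^{\alpha(n+1)} b_n \bigr)^q \int_{2^{-(n+1)}}^{2^{-n}} \frac{dt}{t} \\
&= (\log 2) \, 2^{\alpha q} \sum_{n \in \mathbb{Z}} \bigl( 2^{\alpha n} b_n \bigr)^q \\
&\le C_{q,\alpha} \sum_{n \in \mathbb{Z}} \bigl( 2^{\alpha n} a_n \bigr)^q \qquad \text{(by the previous lemma)}
\end{aligned}$$

With the same technique, we find a lower estimate for the right-hand side of (1.9) in the same terms:

$$\int_0^\infty \bigl( t^{-\alpha} f(t) \bigr)^q \frac{dt}{t} = \sum_{n \in \mathbb{Z}} \int_{2^{-n}}^{2^{-(n-1)}} \bigl( t^{-\alpha} f(t) \bigr)^q \frac{dt}{t} \ge \frac{\log 2}{2^{\alpha q}} \sum_{n \in \mathbb{Z}} \bigl( 2^{\alpha n} a_n \bigr)^q.$$

This gives us the first estimate. For the second, a change of variables in this one suffices. □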

1.3 Best and Near-Best Approximation

Definition 1 Given a (quasi)normed space $(X,\|\cdot\|_X)$ and a subspace $Y \subset X$, we say an operator $\pi : X \to Y$ is of best approximation if $\|f - \pi(f)\|_X = E(f,Y)_X$ for all $f \in X$. Similarly, for a given $t > 0$, we say $\pi$ is an operator of $t$ near-best approximation if $\|f - \pi(f)\|_X \le t\,E(f,Y)_X$ for all $f \in X$.

Lemma 1.4 Let $(X,d)$ be a metric space and let $Y$ be a compact subset of $X$; then for each $f \in X$ there exists an element of best approximation to $f$ from $Y$.

Proof. Consider for each $n \in \mathbb{N}$ an element $g_n \in Y$ such that $d(f,g_n) \le \inf_{g \in Y} d(f,g) + 1/n$. Any sequence in a compact set has at least one limit point $g_0 \in Y$. Passing to a subsequence converging to $g_0$, we get $d(f,g_0) \le d(f,g_n) + d(g_n,g_0) \to \inf_{g \in Y} d(f,g)$, so $g_0$ is a best approximation to $f$ from $Y$. □

Theorem 3 Let $X$ be a (quasi)normed space. For each finite dimensional subspace $X_d$ of $X$ and each $f \in X$, there is a best approximation to $f$ from $X_d$.

Proof. If $X_d = \{0\}$ then there is nothing to prove. Otherwise, consider for each $f \in X$ the set $Y_f = \{ y \in X_d : \|f - y\|_X \le \|f\|_X \}$. $Y_f$ is a closed set:

$$Y_f = F^{-1}\bigl( -\infty, \|f\|_X \bigr], \quad \text{where } F : X_d \ni y \mapsto \|f - y\|_X \in \mathbb{R}$$

$Y_f$ is also bounded, since for any $y \in Y_f$,

$$\|y\|_X = \|y - f + f\|_X \le C_X \bigl( \|y - f\|_X + \|f\|_X \bigr) \le 2 C_X \|f\|_X$$

By Corollary 27.1 (page 101), $Y_f$ must be compact, and therefore a best approximation to $f$ from $Y_f$ exists. Since every $y \in X_d \setminus Y_f$ satisfies $\|f - y\|_X > \|f\|_X \ge E(f,X_d)_X$, this element is also a best approximation to $f$ from $X_d$. □
Lemma 1.5 Let $(X,\|\cdot\|_X)$ be a (quasi)normed space, and $Z \subset X$ a linear subspace of $X$ such that there exists an element of best approximation from it to every element in $X$. If $x \in X$ and $z \in Z$ is a near-best approximation to $x$ from $Z$ with constant $t > 1$, then for each $y \in X$ there is an element $z' \in Z$ of near-best approximation to $y$ from $Z$ such that $z - z'$ is also a near-best approximation to $x - y$ from $Z$, with constant $t'$ depending at most on $X$ and $t$.

Proof. Given $x, y \in X$, assume $E(x - y, Z)_X \le E(y,Z)_X$ (this is no loss of generality, since one can switch elements). Let $z \in Z$ be a $t$ near-best approximation to $x$ from $Z$, and $z'', z''' \in Z$ elements of best approximation from $Z$ to $y - x$ and $y$ respectively. We have then for $z' = z'' + z$,

$$\begin{aligned}
\|y - z'\|_X
&= \|y - x - z'' + x - z\|_X \\
&\le C_X \bigl\{ \|y - x - z''\|_X + \|x - z\|_X \bigr\} \\
&\le C_X \bigl\{ E(y - x, Z)_X + t\,E(x,Z)_X \bigr\} \\
&\le C_{X,t} \bigl\{ E(y - x, Z)_X + \|x - (z''' - z'')\|_X \bigr\} \\
&= C_{X,t} \bigl\{ E(y - x, Z)_X + \|x - y + y + z'' - z'''\|_X \bigr\} \\
&\le C_{X,t} \bigl\{ E(y - x, Z)_X + C_X \bigl[ \|x - y + z''\|_X + \|y - z'''\|_X \bigr] \bigr\} \\
&\le C_{X,t} \bigl\{ E(x - y, Z)_X + E(y,Z)_X \bigr\} \\
&\le C_{X,t} \, E(y,Z)_X
\end{aligned}$$

Moreover, $z - z' = -z''$ is a best approximation to $x - y$ from $Z$, because $z''$ is a best approximation to $y - x$. □
Definition 2 A subset $Y$ of a (quasi)normed space $X$ is said to be a unicity space if for each $f \in X$ there exists a unique element of best approximation from $Y$.

Example. The spaces $L_p$ are unicity spaces for $1 < p < \infty$, but not for $0 < p \le 1$ nor $p = \infty$. A characterization of spaces (at least normed) with the unicity property can be made through the use of “strict convexity”:

Definition 3 A normed space is said to be strictly convex if the following property holds:

Given $f \neq g$ with $\|f\|_X = \|g\|_X = 1$, and $0 < c < 1$, then $\|cf + (1-c)g\|_X < 1$.

Lemma 1.6 If $X$ is strictly convex, then the best approximation is unique.

Proof. Assume the result is not true, and there exist a function $f \in X$ and two different elements $g_1, g_2 \in Y$ such that $\|f - g_1\|_X = \|f - g_2\|_X = E(f,Y)_X > 0$. In that case,

$$1 = \frac{E(f,Y)_X}{E(f,Y)_X} \le \frac{1}{E(f,Y)_X} \Bigl\| f - \tfrac{1}{2}(g_1 + g_2) \Bigr\|_X = \Bigl\| \frac{1}{2} \Bigl( \frac{f - g_1}{E(f,Y)_X} \Bigr) + \frac{1}{2} \Bigl( \frac{f - g_2}{E(f,Y)_X} \Bigr) \Bigr\|_X < 1,$$

a contradiction. □
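
A two-dimensional toy case shows the role of strict convexity. The sketch below is an illustration under stated assumptions, not an example from this survey: it takes $X = \mathbb{R}^2$ with either the Euclidean norm (strictly convex) or the maximum norm (not strictly convex), the subspace $Y = \mathrm{span}\{(1,0)\}$, the point $f = (0,1)$, and searches a finite grid of candidates $s \cdot (1,0)$. The names `dist` and `minimizers` are hypothetical.

```python
INF = float("inf")

def dist(f, s, p):
    """Distance in (R^2, l_p) from f to the point s*(1, 0) of Y = span{(1, 0)}."""
    d0, d1 = abs(f[0] - s), abs(f[1])
    return max(d0, d1) if p == INF else (d0 ** p + d1 ** p) ** (1.0 / p)

def minimizers(f, p, grid):
    """Grid points s at which the distance to f is (numerically) minimal."""
    best = min(dist(f, s, p) for s in grid)
    return [s for s in grid if abs(dist(f, s, p) - best) < 1e-12]

f = (0.0, 1.0)
grid = [i / 100.0 for i in range(-200, 201)]

for p in (2.0, INF):
    print(p, len(minimizers(f, p, grid)))
```

With the Euclidean norm the search finds a single minimizer, $s = 0$. With the maximum norm every grid point with $|s| \le 1$ attains the same distance $1$, so the best approximation is far from unique.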
Proposition 1 Given a finite dimensional unicity subspace $X_d \subset X$, the operator of best approximation is continuous.

Proof. Let $\varphi : X \to X_d$ be the operator of best approximation, and let $C_X \ge 1$ be the constant in the (quasi)triangular inequality of the (quasi)norm $\|\cdot\|_X$. Given $\varepsilon > 0$, let $\delta = \varepsilon (3 C_X^2)^{-1}$. Notice that, if $f, g \in X$ verify $\|f - g\|_X < \delta$, then we have

$$\begin{aligned}
\|\varphi(f) - \varphi(g)\|_X
&= \|\varphi(f) - f + f - g + g - \varphi(g)\|_X \\
&\le C_X^2 \bigl\{ \|\varphi(f) - f\|_X + \|f - g\|_X + \|g - \varphi(g)\|_X \bigr\} \\
&< 3 C_X^2 \|f - g\|_X < \varepsilon \qquad \square
\end{aligned}$$

1.4 Interpolation Spaces: The Real Method

Definition 4 A compatible couple is a pair of (quasisemi)normed linear spaces, $(X,\|\cdot\|_X)$ and $(Y,\|\cdot\|_Y)$, continuously embedded in a Hausdorff topological linear space $H$.
We define the sum and intersection of such a couple as

$$(X + Y, \|\cdot\|_{X+Y}) \quad \text{with (quasisemi)norm} \quad \|f\|_{X+Y} = \inf_{\substack{f = f_X + f_Y \\ f_X \in X,\ f_Y \in Y}} \bigl\{ \|f_X\|_X + \|f_Y\|_Y \bigr\}$$
$$(X \cap Y, \|\cdot\|_{X \cap Y}) \quad \text{with (quasisemi)norm} \quad \|f\|_{X \cap Y} = \max \bigl\{ \|f\|_X, \|f\|_Y \bigr\}$$

Definition 5 Given a compatible couple $(X,Y)$, we say a linear operator $T : X + Y \to X + Y$ is admissible for the couple if
(i) $T(X) \subset X$, and $T|_X$ is bounded.
(ii) $T(Y) \subset Y$, and $T|_Y$ is bounded.

Definition 6 A (quasisemi)normed linear space $(Z,\|\cdot\|_Z)$ is intermediate between $X$ and $Y$ (or for the couple given by those spaces) if
(i) $X \cap Y \subseteq Z \subseteq X + Y$.
(ii) $X \cap Y$ is continuously embedded in $Z$: there exists $C_{X \cap Y} > 0$ such that $\|f\|_Z \le C_{X \cap Y} \|f\|_{X \cap Y}$ for all $f \in X \cap Y$.
(iii) $Z$ is continuously embedded in $X + Y$: there exists $C_{X+Y} > 0$ such that $\|f\|_{X+Y} \le C_{X+Y} \|f\|_Z$ for all $f \in Z$.

An intermediate space $Z$ of a compatible couple $(X,Y)$ is an interpolation space for the couple if $T(Z) \subset Z$ for every admissible operator $T$.

Remark. In the 1970s there were primarily two methods for constructing interpolation spaces of a compatible couple: the complex method of Calderón [Cal1], and the real method of Lions and Peetre [Peet]. We are mainly interested in the latter, since its building blocks are quasi-seminorms similar to the ones in the description of the Approximation Spaces above.

Definition 7 We define the K-functional of a compatible couple $(X,Y)$ as follows:

$$K(f,t;X,Y) = \inf_{\substack{f = f_X + f_Y \\ f_X \in X,\ f_Y \in Y}} \bigl\{ \|f_X\|_X + t \|f_Y\|_Y \bigr\}$$

for each $f \in X + Y$ and $t > 0$.

Lemma 1.7 (Properties of the K-functional) Let $(X,Y)$ be a compatible couple:
(i) $K(f,\cdot;X,Y)$ is a continuous, subadditive, concave (convex-down), monotone nondecreasing function of $t$ that verifies $K(f,t;X,Y) = t\,K(f,1/t;Y,X)$, and $K(f,nt;X,Y) \le n\,K(f,t;X,Y)$ for all $n \in \mathbb{N}$ and $t > 0$.
(ii) For each $t > 0$, $K(\cdot,t;X,Y)$ is a (quasi)seminorm equivalent to $\|\cdot\|_{X+Y}$.
(iii) Let $T$ be an admissible operator; then for each $f \in X + Y$ and $t > 0$,

$$K(Tf,t;X,Y) \le M \cdot K(f,t;X,Y),$$

where $M = \max \bigl\{ \|T|_X\|_{B(X,X)}, \|T|_Y\|_{B(Y,Y)} \bigr\}$.

Proof. Notice that, given $f \in X + Y$, the function $K(f,\cdot;X,Y)$ is trivially nonnegative and monotone nondecreasing. Its concavity is proved in the following way: Given $t_1, t_2 > 0$ and $0 < c < 1$, and any decomposition $f = f_X + f_Y$, we have, for $t = c t_1 + (1-c) t_2$,

$$\begin{aligned}
\|f_X\|_X + t \|f_Y\|_Y
&= [c + (1-c)] \|f_X\|_X + [c t_1 + (1-c) t_2] \|f_Y\|_Y \\
&= c \bigl\{ \|f_X\|_X + t_1 \|f_Y\|_Y \bigr\} + (1-c) \bigl\{ \|f_X\|_X + t_2 \|f_Y\|_Y \bigr\} \\
&\ge c\,K(f,t_1;X,Y) + (1-c)\,K(f,t_2;X,Y),
\end{aligned}$$

and taking the infimum over all decompositions gives the concavity. (A more intuitive way to see that $K(f,\cdot;X,Y)$ is indeed a convex-down function is to realize that we construct it as the infimum of lines with slopes $\|f_Y\|_Y$ and y-intercepts $\|f_X\|_X$, where these values come from the decompositions of $f$ in $X + Y$.)

The identity $K(f,t;X,Y) = t\,K(f,1/t;Y,X)$ is also trivial:

$$K(f,t;X,Y) = \inf_{\substack{f = f_X + f_Y \\ f_X \in X,\ f_Y \in Y}} \bigl\{ \|f_X\|_X + t \|f_Y\|_Y \bigr\} = t \Bigl( \inf_{\substack{f = f_X + f_Y \\ f_X \in X,\ f_Y \in Y}} \Bigl\{ \frac{1}{t} \|f_X\|_X + \|f_Y\|_Y \Bigr\} \Bigr).$$

The subadditivity follows from the previous identity; let $g(s) = K(f,1/s;Y,X) = K(f,s;X,Y)/s$. As $K(f,\cdot;Y,X)$ is nondecreasing, $g$ must be nonincreasing, and therefore, if $t_1, t_2 > 0$,

$$\frac{K(f,t_k;X,Y)}{t_k} \ge \frac{K(f,t_1 + t_2;X,Y)}{t_1 + t_2} \quad \text{for } k = 1,2,$$
$$K(f,t_1;X,Y) + K(f,t_2;X,Y) \ge t_1 \frac{K(f,t_1 + t_2;X,Y)}{t_1 + t_2} + t_2 \frac{K(f,t_1 + t_2;X,Y)}{t_1 + t_2} = K(f,t_1 + t_2;X,Y).$$

From the last estimate, we find $K(f,nt;X,Y) \le n\,K(f,t;X,Y)$ trivially. The continuity also follows from the subadditivity, since given $h > 0$,

$$K(f,t+h;X,Y) - K(f,t;X,Y) \le K(f,h;X,Y).$$

To prove property (ii), we realize first that both nonnegativity and homogeneity of $K(\cdot,t;X,Y)$ are trivial. Let $C = \max\{C_X, C_Y\} > 0$ (the constants from the definition of the (quasisemi)norms of $X$ and $Y$); then, for each $f, g \in X + Y$ and any decompositions $f = f_X + f_Y$, $g = g_X + g_Y$:

$$K(f + g,t;X,Y) \le \|f_X + g_X\|_X + t \|f_Y + g_Y\|_Y \le C \bigl\{ (\|f_X\|_X + t \|f_Y\|_Y) + (\|g_X\|_X + t \|g_Y\|_Y) \bigr\}$$

The equivalence of $K(\cdot,t)$ with $\|\cdot\|_{X+Y}$ comes from:

$$\inf_{\substack{f = f_X + f_Y \\ f_X \in X,\ f_Y \in Y}} \bigl\{ \|f_X\|_X + t \|f_Y\|_Y \bigr\} \le \inf_{\substack{f = f_X + f_Y \\ f_X \in X,\ f_Y \in Y}} \max\{1,t\} \bigl\{ \|f_X\|_X + \|f_Y\|_Y \bigr\} = \max\{1,t\} \|f\|_{X+Y}$$

The other estimate is similar.

Let us prove now (iii): Given $f \in X + Y$, decompose $f = f_X + f_Y$ and notice that

$$\|T(f_X)\|_X + t \|T(f_Y)\|_Y \le M \bigl( \|f_X\|_X + t \|f_Y\|_Y \bigr);$$

therefore, $K(Tf,t;X,Y) \le M(\|f_X\|_X + t \|f_Y\|_Y)$ for every decomposition, which proves the desired result. □
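
The properties in part (i) can be observed on a couple for which the K-functional has a closed form. The sketch below is an illustration under an assumption that is not taken from this survey: $X = \ell_1$ and $Y = \ell_1(w)$ (a weighted $\ell_1$ norm) on finite sequences. For this couple the infimum can be taken coordinate by coordinate, which gives $K(f,t;X,Y) = \sum_i \min(1, t w_i) |f_i|$. The function names are hypothetical.

```python
def K(f, t, w):
    """K(f, t; l_1, l_1(w)) = sum_i min(1, t*w_i) * |f_i| (coordinatewise optimal splitting)."""
    return sum(min(1.0, t * wi) * abs(fi) for fi, wi in zip(f, w))

def K_swapped(f, t, w):
    """K(f, t; l_1(w), l_1): the same couple with the roles of the spaces exchanged."""
    return sum(min(wi, t) * abs(fi) for fi, wi in zip(f, w))

f = [1.0, -2.0, 0.5]      # hypothetical element of the sum space
w = [0.5, 2.0, 8.0]       # hypothetical positive weights
t1, t2 = 0.3, 1.7

# Identity K(f, t; X, Y) = t * K(f, 1/t; Y, X)
print(K(f, t1, w), t1 * K_swapped(f, 1.0 / t1, w))
# Subadditivity and concavity in t
print(K(f, t1 + t2, w) <= K(f, t1, w) + K(f, t2, w))
print(K(f, 0.5 * (t1 + t2), w) >= 0.5 * (K(f, t1, w) + K(f, t2, w)))
```

Each coordinate contributes the concave function $t \mapsto \min(1, t w_i)|f_i|$, so the sum is concave, nondecreasing and subadditive, in agreement with the lemma.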
Lemma 1.8 Given a compatible couple $(X,Y)$, and parameters $0 < \theta < 1$, $0 < q \le \infty$, the following functionals are (quasi)seminorms in $X + Y$:

$$|f|_{(X,Y)_{\theta,q}} = \Bigl\{ \int_0^\infty \bigl( t^{-\theta} K(f,t;X,Y) \bigr)^q \frac{dt}{t} \Bigr\}^{1/q} \quad \text{for } 0 < q < \infty$$
$$|f|_{(X,Y)_{\theta,\infty}} = \sup_{t>0} \bigl\{ t^{-\theta} K(f,t;X,Y) \bigr\}$$

Proof. $|\cdot|_{(X,Y)_{\theta,q}}$ is trivially homogeneous and nonnegative. The (quasi)triangular inequality is directly inferred from part (ii) in Lemma 1.7. □

Proposition 2 Given a compatible couple $(X,\|\cdot\|_X)$, $(Y,\|\cdot\|_Y)$, and parameters $0 < \theta < 1$, $0 < q \le \infty$, the $(\theta,q)$ spaces

$$(X,Y)_{\theta,q} = \bigl\{ f \in X + Y : |f|_{(X,Y)_{\theta,q}} < \infty \bigr\}$$

are interpolation spaces.

Proof. This is immediate from (iii) in Lemma 1.7. □

Lemma 1.9 Given a (quasisemi)normed space $(X,\|\cdot\|_X)$, and a continuously embedded subspace $Y \subset X$ with $\|f\|_X \le C_Y \|f\|_Y$ for some $C_Y > 0$ and all $f \in Y$:
(i) The K-functional of the compatible couple $(X,Y)$ can also be written as

$$K(f,t;X,Y) = \inf_{g \in Y} \bigl\{ \|f - g\|_X + t \|g\|_Y \bigr\}.$$

(ii) The (quasi)seminorms $|\cdot|_{(X,Y)_{\theta,q}}$ are equivalent, for each $r > 0$, to the following discretizations:

$$|f|_{(X,Y)_{\theta,q}} \asymp \Bigl\{ \sum_{k=0}^{\infty} \bigl( 2^{k\theta r} K(f,2^{-kr};X,Y) \bigr)^q \Bigr\}^{1/q} \quad \text{for } 0 < q < \infty$$
$$|f|_{(X,Y)_{\theta,\infty}} \asymp \sup_{k \ge 0} \bigl\{ 2^{k\theta r} K(f,2^{-kr};X,Y) \bigr\}$$

Proof. Part (i) is trivial. As for part (ii), we start by noticing that

$$|f|_{(X,Y)_{\theta,q}} \asymp \Bigl\{ \int_0^1 \bigl( t^{-\theta} K(f,t;X,Y) \bigr)^q \frac{dt}{t} \Bigr\}^{1/q},$$

since for each $g \in Y$, $K(f,t;X,Y) \le \|f\|_X \le C \{ \|f - g\|_X + \|g\|_X \} \le C \{ \|f - g\|_X + C_Y \|g\|_Y \}$, and therefore $K(f,t;X,Y) \le C\,K(f,C_Y;X,Y)$ for all $t > 0$. So, we can estimate (for $0 < q < \infty$)

$$\int_0^1 \bigl( t^{-\theta} K(f,t;X,Y) \bigr)^q \frac{dt}{t} \le |f|^q_{(X,Y)_{\theta,q}} \le \int_0^1 \bigl( t^{-\theta} K(f,t;X,Y) \bigr)^q \frac{dt}{t} + C^q K(f,C_Y;X,Y)^q \int_1^\infty t^{-\theta q - 1} \, dt$$

Now, due to the monotonicity of the K-functional, we can discretize the previous integral, with an argument similar to the one employed in the proof of Lemma 1.1 (page 5). □

Remark. Notice how similar these (quasi)seminorms are to the ones in Lemma 1.1. One of the tricks in Approximation Theory is, given $(X,\|\cdot\|_X)$ and a family of approximants $\{X_n\}_n$, to find a continuously embedded (quasisemi)normed subspace $(Y,\|\cdot\|_Y) \subset X$ so that the values $E(f,X_n)_X$ can be estimated in terms of the K-functionals $K(f,2^m;X,Y)$ and vice versa. In §1.5 we outline some results that help in this sense.
Another important result related to K-functionals and the $(\theta,q)$ interpolation spaces is the Reiteration Theorem, which states that no advantage is gained when applying successive interpolation to a given compatible couple.

Theorem 4 (Holmstedt (1970)) Let $(X_1,\|\cdot\|_{X_1})$, $(X_2,\|\cdot\|_{X_2})$ be a compatible couple of (quasi)normed spaces, let $0 < \alpha_1 < \alpha_2 < 1$ and $0 < q_1, q_2 \le \infty$, and consider the interpolation spaces $Y_1 = (X_1,X_2)_{\alpha_1,q_1}$, $Y_2 = (X_1,X_2)_{\alpha_2,q_2}$; then, for $f \in Y_1 + Y_2$ and $\delta = \alpha_2 - \alpha_1$, we have the following equivalence:

$$K(f,t^\delta;Y_1,Y_2) \asymp \Bigl( \int_0^t \bigl[ s^{-\alpha_1} K(f,s;X_1,X_2) \bigr]^{q_1} \frac{ds}{s} \Bigr)^{1/q_1} + t^\delta \Bigl( \int_t^\infty \bigl[ s^{-\alpha_2} K(f,s;X_1,X_2) \bigr]^{q_2} \frac{ds}{s} \Bigr)^{1/q_2},$$

with constants of equivalency independent of $f$ and $t$. The usual change from an integral to a supremum applies when either $q_1$ or $q_2$ is infinite.

Theorem 5 (Peetre’s Reiteration Theorem (1963)) Under the same hypothesis as the previous Theorem, and for any $0 < \theta < 1$, $0 < q \le \infty$, we have $(Y_1,Y_2)_{\theta,q} = (X_1,X_2)_{\theta',q}$, where $\theta' = (1-\theta)\alpha_1 + \theta\alpha_2$.
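
As a small worked instance of the reiteration formula, the computation below uses parameter values chosen only for illustration ($\alpha_1 = 1/4$, $\alpha_2 = 3/4$, $\theta = 1/3$); they do not come from the text.

```latex
% Illustrative parameter values: \alpha_1 = 1/4, \alpha_2 = 3/4, \theta = 1/3.
\begin{align*}
\theta' &= (1-\theta)\alpha_1 + \theta\alpha_2
         = \frac{2}{3}\cdot\frac{1}{4} + \frac{1}{3}\cdot\frac{3}{4}
         = \frac{1}{6} + \frac{1}{4} = \frac{5}{12}, \\
\bigl( (X_1,X_2)_{1/4,q_1},\ (X_1,X_2)_{3/4,q_2} \bigr)_{1/3,q}
        &= (X_1,X_2)_{5/12,q}.
\end{align*}
```

The resulting index $\theta'$ always lies between $\alpha_1$ and $\alpha_2$, and the secondary indices $q_1$, $q_2$ of the intermediate spaces do not appear in the final space.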

1.5 Jackson and Bernstein Inequalities

Definition 8 Let $(X,\|\cdot\|_X)$ be a (quasisemi)normed space, and $(Y,|\cdot|_Y) \subset X$ a continuously embedded (quasisemi)normed subspace.

Jackson Inequality: We say the family of approximants $\{X_n\}_n$ verifies a Jackson inequality with respect to $Y$ if there exist $r, C > 0$ such that

$$E(f,X_n)_X \le C n^{-r} |f|_Y \quad \text{for all } f \in Y$$

Bernstein Inequality: We say the family of approximants $\{X_n\}_n$ verifies a Bernstein inequality with respect to $Y$ if there exist $r, C > 0$ such that

$$|g_n|_Y \le C n^{r} \|g_n\|_X \quad \text{for all } g_n \in X_n$$

Remark. In the literature of Approximation Theory, results that state Jackson inequalities are referred to as “direct theorems”, whereas Bernstein inequalities are also identified as “inverse theorems”.
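
A classical instance of a Bernstein inequality, not taken from this survey, is the estimate $\|T'\|_\infty \le n \|T\|_\infty$ for real trigonometric polynomials $T$ of degree at most $n$ (here $r = 1$, $C = 1$, and $|g|_Y = \|g'\|_\infty$). The Python sketch below checks it numerically on a grid for a randomly generated polynomial; the helper names `trig_poly` and `sup_on_grid`, the grid size, and the random coefficients are assumptions made for the illustration, and grid maxima only approximate the true sup norms.

```python
import math
import random

def trig_poly(a, b):
    """Return T and T' for T(x) = sum_{k=1}^{n} a[k-1]*cos(kx) + b[k-1]*sin(kx)."""
    def T(x):
        return sum(ak * math.cos(k * x) + bk * math.sin(k * x)
                   for k, (ak, bk) in enumerate(zip(a, b), start=1))
    def dT(x):
        return sum(k * (-ak * math.sin(k * x) + bk * math.cos(k * x))
                   for k, (ak, bk) in enumerate(zip(a, b), start=1))
    return T, dT

def sup_on_grid(g, m=4000):
    """Approximate the sup norm of a 2*pi-periodic function by sampling m points."""
    return max(abs(g(2 * math.pi * j / m)) for j in range(m))

random.seed(0)
n = 6
a = [random.uniform(-1, 1) for _ in range(n)]
b = [random.uniform(-1, 1) for _ in range(n)]
T, dT = trig_poly(a, b)

print(sup_on_grid(dT), n * sup_on_grid(T))
```

The first printed value stays below the second. Equality in the exact inequality is attained by $T(x) = \sin(nx)$, which is why a small tolerance is advisable when comparing sampled norms.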

Proposition 3 Given a (quasisemi)normed space $(X,\|\cdot\|_X)$ and a family of approximants $\{X_n\}_n$ satisfying both Jackson and Bernstein inequalities with respect to a (quasisemi)normed continuously embedded linear subspace $(Y,|\cdot|_Y) \subset X$, the following estimates hold:

$$E(f,X_n)_X \le C\,K(f,n^{-r};X,Y) \tag{1.11}$$
$$K(f,2^{-nr};X,Y) \le 2^{-nr} C \sum_{k=1}^{n+1} 2^{kr} E(f,X_{2^{k-1}})_X \tag{1.12}$$

Proof. To prove (1.11), given $f \in X$, consider any $g \in Y$ and a best approximation $g_n$ to $g$ in $X$ from each $X_n$; then,

$$\begin{aligned}
E(f,X_n)_X
&\le \|f - g_n\|_X \\
&\le C \bigl\{ \|f - g\|_X + \|g - g_n\|_X \bigr\} \\
&= C \bigl\{ \|f - g\|_X + E(g,X_n)_X \bigr\} \\
&\le C \bigl\{ \|f - g\|_X + C n^{-r} |g|_Y \bigr\};
\end{aligned}$$

taking the infimum over $g \in Y$ proves property (1.11) (use part (i) of Lemma 1.9). To prove (1.12), denote by $g_k$ the best approximation to $f$ from $X_{2^k}$, and $y_k = g_k - g_{k-1}$. We know that there exists a constant $C > 0$ such that $X_n + X_n = X_{Cn}$ for all $n$; this means in particular that $y_k \in X_{2^k C}$ for all $k$, and by the Bernstein property, $|y_k|_Y \le 2^{kr} C \|y_k\|_X$. But now,

$$\|y_k\|_X = \|g_k - g_{k-1}\|_X \le C \bigl( \|g_k - f\|_X + \|g_{k-1} - f\|_X \bigr) \le C\,E(f,X_{2^{k-1}})_X;$$

we have then the estimate $|y_k|_Y \le 2^{kr} C\,E(f,X_{2^{k-1}})_X$, which we can use to estimate the K-functional:

$$\begin{aligned}
K(f,2^{-nr};X,Y)
&\le \|f - g_n\|_X + 2^{-nr} |g_n|_Y \\
&= E(f,X_{2^n})_X + 2^{-nr} \Bigl| \sum_{k=0}^{n} y_k \Bigr|_Y \\
&\le E(f,X_{2^n})_X + 2^{-nr} C \sum_{k=1}^{n} |y_k|_Y \qquad \text{(since } y_0 = g_0 = 0\text{)} \\
&\le E(f,X_{2^n})_X + 2^{-nr} C \sum_{k=1}^{n} 2^{kr} E(f,X_{2^{k-1}})_X \\
&\le 2^{-nr} C \sum_{k=1}^{n+1} 2^{kr} E(f,X_{2^{k-1}})_X \qquad \square
\end{aligned}$$
Remark. This last result provides the link we were looking for between Interpolation and Approximation Theories. The following three results give a very good example: Corollary 3.1 tells us that the problem of characterizing approximation spaces by means of interpolation spaces is solved if we know two ingredients:
  1. An appropriate (quasisemi)normed linear subspace $(Y,|\cdot|_Y) \subset X$ for which the family of approximants $\{X_n\}_n$ verifies the Jackson and Bernstein inequalities.
  2. A characterization of the interpolation spaces $(X,Y)_{\theta,q}$.

The second step is often provided by classical results in the Theory of Interpolation. The first step is the difficult one from the viewpoint of approximation. Theorem 6 provides a good start. Finally, Theorem 7 (proof not offered here; see [CDeH]) provides, in a sense, a converse to Corollary 3.1.

Corollary 3.1 If the family of approximants $\{X_n\}_n$ satisfies the Jackson and Bernstein inequalities with respect to $Y$ and exponent $r > 0$, and the sequence of errors $E(f,X_n)_X$ is monotone decreasing, then for each $0 < \gamma < r$ and $0 < q \le \infty$, $A^\gamma_q(X,X_n) = (X,Y)_{\gamma/r,q}$ with equivalent norms.

Proof. Estimate (1.11) gives us $(X,Y)_{\gamma/r,q} \subset A^\gamma_q(X,X_n)$ trivially; for example, for $0 < q < \infty$, and $r$ satisfying both the Jackson and Bernstein inequalities, if $f \in (X,Y)_{\gamma/r,q}$, then

$$\sum_{n=0}^{\infty} \bigl( 2^{\gamma n} E(f,X_{2^n})_X \bigr)^q \le C^q \sum_{n=0}^{\infty} \bigl( 2^{nr(\gamma/r)} K(f,2^{-nr};X,Y) \bigr)^q < \infty$$

On the other hand, we may use estimate (1.12) together with Lemma 1.3 (page 10) to obtain the other inclusion: Let $b_n = K(f,2^{-nr};X,Y)$, $a_n = E(f,X_{2^n})_X$, and $0 < \gamma < r$; we have then

$$\sum_{n=0}^{\infty} \bigl( 2^{nr(\gamma/r)} K(f,2^{-nr};X,Y) \bigr)^q = \sum_{n=0}^{\infty} \bigl( 2^{\gamma n} b_n \bigr)^q \le C \sum_{n=0}^{\infty} \bigl( 2^{\gamma n} a_n \bigr)^q = C \sum_{n=0}^{\infty} \bigl( 2^{\gamma n} E(f,X_{2^n})_X \bigr)^q \qquad \square$$

Theorem 6 (DeVore, Popov) For any (quasisemi)normed space $(X,\|\cdot\|_X)$ and family of approximants $\{X_n\}_n$ such that $X_n \subset X_{n+1}$ for all $n$, as well as for any $r > 0$ and $0 < p \le \infty$, the spaces $X_n$ verify the Jackson and Bernstein inequalities for the exponent $r > 0$, with respect to $Y = A^r_p(X,X_n)$. Therefore, for any $0 < \alpha < r$ and $0 < q \le \infty$, we have

$$A^\alpha_q(X,X_n) = \bigl( X, A^r_p(X,X_n) \bigr)_{\alpha/r,q}.$$

Proof. It is enough to show that $\{X_n\}_n$ verifies both the Jackson and Bernstein inequalities for the exponent $r > 0$ with respect to $A^r_p(X,X_n)$:

The rest of the statement follows immediately. □

Theorem 7 (Cohen, DeVore, Hochmuth) Let $X$, $Y$, $\{X_n\}_n$ be as before, and suppose that $\{X_n\}_n$ satisfies the Jackson and Bernstein inequalities for $r > 0$. Suppose further that the sequence of operators $\{T_n\}_n$ verifies:
(i) $T_n : X \to X_n$ (not necessarily linear).
(ii) There exists $C > 0$ such that $\|f - T_n f\|_X \le C\,E(f,X_n)_X$ for all $f \in X$.
(iii) $|T_n f|_Y \le C |f|_Y$ for all $n$ and $f \in Y$.

Then, $\{T_n\}_n$ realizes the K-functional; that is,

$$\|f - T_n f\|_X + n^{-r} |T_n f|_Y \le C\,K(f,n^{-r};X,Y) \quad \text{for all } f \in X.$$

2 Dyadic maximal-smoothness Spline Approximation

In this section we want to exemplify how to obtain the approximation spaces in the following case: Given the unit cube $\Omega \subset \mathbb{R}^d$, $X = L_p(\Omega)$ for any choice of $0 < p \le \infty$, and $X_n$ is a linear space of box-splines with coordinate order $r$, maximal smoothness, and associated to the $n$-th dyadic partition of the cube $\Omega$.
In the search for the spaces $A^\alpha_q(X,X_n)$, we will go through different levels of abstraction: from the low-level construction of best and near-best polynomial approximation to functions in $L_p(\Omega)$ on cubes, to the high-level description of the K-functionals that will lead us to further results involving interpolation spaces for compatible couples of Besov spaces. The logical step-by-step exposition is summarized in the following outline:

  1. Basic Pre-requisites:

  2. Construction of the approximants and projectors:

  3. Search for good candidates for approximation spaces:

  4. Solution of the problem of approximation and related results:

2.1 Best and Near-Best Lp Polynomial Approximation in cubes of Rd

Lemma 2.1 Given $r > 0$, a cube $\Omega \subset \mathbb{R}^d$ and $0 < q < p < \infty$, there is a constant $C > 0$ depending at most on $p$, $q$ and $d$ such that

\[
\left(\frac{1}{|\Omega|}\int_\Omega |g|^q\right)^{1/q} \le \left(\frac{1}{|\Omega|}\int_\Omega |g|^p\right)^{1/p} \le C \left(\frac{1}{|\Omega|}\int_\Omega |g|^q\right)^{1/q}
\]
for all $g \in \Pi(r)$.

Proof. Consider for all $p > 0$ the (quasi)norms $|||\cdot|||_{L_p(\Omega)} = |\Omega|^{-1/p}\,\|\cdot\|_{L_p(\Omega)}$ in $\Pi(r)$, and apply Theorem 27 (page 101). [#]

Lemma 2.2 Given $r > 0$, cubes $I \subset J \subset \mathbb{R}^d$ such that $|J| \le c\,|I|$ for some $c > 0$, and $0 < q < \infty$, there are constants $C_1, C_2 > 0$ depending at most on $q$ and $d$ such that

\[
C_1\left(\frac{1}{|I|}\int_I |g|^q\right)^{1/q} \le \left(\frac{1}{|J|}\int_J |g|^q\right)^{1/q} \le C_2\left(\frac{1}{|I|}\int_I |g|^q\right)^{1/q}
\]
for all $g \in \Pi(r)$. In particular,
\[
\|g\|_{L_q(J)} \le c^{1/q}\,C_2\,\|g\|_{L_q(I)}.
\]

Proof. Consider the (quasi)norms $\|\cdot\|_{I,L_q(\Omega)} = |I|^{-1/q}\,\|\cdot\|_{L_q(\Omega)}$ in $\Pi(r)$ and apply Theorem 27 again. [#]

Lemma 2.3 Let $\Omega$ be a cube in $\mathbb{R}^d$, and $f \in L_p(\Omega)$. If $g \in \Pi(r)$ is a $\tau$ near-best $L_q(\Omega)$ approximation to $f$ for any $0 < q < p$, then it is also a $C$ near-best $L_p(\Omega)$ approximation to $f$, for some $C > 0$ that depends on $d$, $p$, $q$, $r$ and $\tau$, but does not depend on the size of $\Omega$.

Proof. Let $P$ be the best $L_p(\Omega)$ approximation element to $f$ from $\Pi(r)$; then we have:

\begin{align*}
\|f-g\|_{L_p(\Omega)}
 &\le C_{d,p}\bigl(\|f-P\|_{L_p(\Omega)} + \|P-g\|_{L_p(\Omega)}\bigr) \\
 &\qquad\text{(apply Lemma 2.1)} \\
 &\le C_{d,p}\bigl(E(f,\Pi(r);\Omega)_p + C_{d,q,r}\,|\Omega|^{1/p-1/q}\,\|P-g\|_{L_q(\Omega)}\bigr) \\
 &\le C_{d,p,q,r}\bigl(E(f,\Pi(r);\Omega)_p + |\Omega|^{1/p-1/q}\bigl[\|P-f\|_{L_q(\Omega)} + \|f-g\|_{L_q(\Omega)}\bigr]\bigr) \\
 &\le C_{d,p,q,r}\bigl(E(f,\Pi(r);\Omega)_p + |\Omega|^{1/p-1/q}\bigl[\|f-P\|_{L_q(\Omega)} + \tau\,E(f,\Pi(r);\Omega)_q\bigr]\bigr) \\
 &\le C_{d,p,q,r}\bigl(E(f,\Pi(r);\Omega)_p + 2\max(1,\tau)\,|\Omega|^{1/p-1/q}\,\|f-P\|_{L_q(\Omega)}\bigr) \\
 &\qquad\text{(apply H\"older's Inequality or Lemma 2.1 again)} \\
 &\le C_{d,q,p,r,\tau}\bigl(E(f,\Pi(r);\Omega)_p + \|f-P\|_{L_p(\Omega)}\bigr) \\
 &= C_{d,q,p,r,\tau}\,E(f,\Pi(r);\Omega)_p.
\end{align*}
[#]

Lemma 2.4 Let $I \subset J$ be cubes in $\mathbb{R}^d$ such that $|J| \le c\,|I|$ for some $c > 0$. Let $f \in L_p(J)$, and $g \in \Pi(r)$ a $\tau$ near-best $L_q(I)$ approximation to $f$ for any $0 < q < p$. Then $g$ is also a $C$ near-best $L_p(J)$ approximation to $f$, where $C > 0$ depends on $\tau$, $c$, $d$, $p$ and $q$.

Proof. Let $P$ be the best $L_p(J)$ approximation element to $f$ from $\Pi(r)$. First, notice that on the cube $I \subset J$,

\begin{align*}
\|P-g\|_{L_p(I)}
 &\le C_{p,d}\bigl(\|f-P\|_{L_p(I)} + \|f-g\|_{L_p(I)}\bigr) \\
 &\qquad\text{(apply Lemma 2.1)} \\
 &\le C_{p,d}\bigl(\|f-P\|_{L_p(I)} + C_{d,q,r}\,|I|^{1/p-1/q}\,\|f-g\|_{L_q(I)}\bigr) \\
 &\le C_{d,p,q}\bigl(\|f-P\|_{L_p(I)} + |I|^{1/p-1/q}\,\tau\,E(f,\Pi(r);I)_q\bigr) \\
 &\le C_{d,p,q}\bigl(\|f-P\|_{L_p(I)} + |I|^{1/p-1/q}\,\tau\,\|f-P\|_{L_q(I)}\bigr) \\
 &\qquad\text{(apply Lemma 2.1 again)} \\
 &\le C_{d,p,q}\bigl(\|f-P\|_{L_p(I)} + \tau\,\|f-P\|_{L_p(I)}\bigr) \\
 &= C_{d,p,q,\tau}\,\|f-P\|_{L_p(I)} \\
 &\le C_{d,p,q,\tau}\,\|f-P\|_{L_p(J)} \\
 &= C_{d,p,q,\tau}\,E(f,\Pi(r);J)_p;
\end{align*}
This estimate is all we need to finish the proof:
\begin{align*}
\|f-g\|_{L_p(J)}
 &\le C_{p,d}\bigl(\|f-P\|_{L_p(J)} + \|P-g\|_{L_p(J)}\bigr) \\
 &\qquad\text{(apply Lemma 2.2)} \\
 &\le C_{p,d}\bigl(E(f,\Pi(r);J)_p + c^{1/p}\,C_{p,d}\,\|P-g\|_{L_p(I)}\bigr) \\
 &\le C_{d,p,q,\tau,c}\,E(f,\Pi(r);J)_p.
\end{align*}
[#]

2.2 Markov’s Theorem

Theorem 8 (Szegö (1928)) For each trigonometric polynomial $T_r$ of order $r$,

\[
T_r'(x)^2 + r^2\,T_r(x)^2 \le r^2\,\|T_r\|^2_{L_\infty(\mathbb{T})} \quad\text{for all } x \in \mathbb{T}.
\]
Corollary 8.1 (Bernstein’s Inequalities) $\|T_r'\|_{L_\infty(\mathbb{T})} \le r\,\|T_r\|_{L_\infty(\mathbb{T})}$ and, more generally, $\|T_r^{(k)}\|_{L_\infty(\mathbb{T})} \le r^k\,\|T_r\|_{L_\infty(\mathbb{T})}$.
Corollary 8.2 For an algebraic polynomial $P_r \in \Pi(r)$, and all $x \in (-1,1)$,
\[
|P_r'(x)| \le \frac{r}{\sqrt{1-x^2}}\,\|P_r\|_{L_\infty[-1,1]}.
\]
Corollary 8.3 For an algebraic polynomial $P_r \in \Pi_{\mathbb{C}}(r)$ of order $r$ with complex coefficients on the disk $D = \{z \in \mathbb{C} : |z| < 1\}$,
\[
\|P_r'\|_{L_\infty(D)} \le r\,\|P_r\|_{L_\infty(D)}.
\]
Theorem 9 (Markov) For an algebraic polynomial $P_r \in \Pi(r)$,
\[
\|P_r'\|_{L_\infty[-1,1]} \le r^2\,\|P_r\|_{L_\infty[-1,1]}.
\]
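
As a numerical sanity check of Markov's inequality, the following sketch evaluates a Chebyshev polynomial and its derivative on a grid of $[-1,1]$. It assumes the classical form of the bound for a polynomial of degree $n$ (constant $n^2$), which the Chebyshev polynomial $T_n$ attains at $x = \pm 1$. The function names and the sampling grid are illustrative choices, not part of the original exposition, and the grid maximum only approximates the sup norm.

```python
def chebyshev_and_derivative(n, x):
    """Return (T_n(x), T_n'(x)) using the three-term recurrence."""
    t_prev, t_curr = 1.0, x      # T_0, T_1
    d_prev, d_curr = 0.0, 1.0    # T_0', T_1'
    if n == 0:
        return t_prev, d_prev
    for _ in range(n - 1):
        t_next = 2.0 * x * t_curr - t_prev
        # Differentiate T_{k+1} = 2x T_k - T_{k-1} term by term.
        d_next = 2.0 * t_curr + 2.0 * x * d_curr - d_prev
        t_prev, t_curr = t_curr, t_next
        d_prev, d_curr = d_curr, d_next
    return t_curr, d_curr


def sup_norms(n, samples=2001):
    """Approximate sup norms of T_n and T_n' on [-1, 1] by sampling."""
    grid = [-1.0 + 2.0 * i / (samples - 1) for i in range(samples)]
    values = [chebyshev_and_derivative(n, x) for x in grid]
    return max(abs(v) for v, _ in values), max(abs(d) for _, d in values)


degree = 5
norm_T, norm_dT = sup_norms(degree)
# Markov: ||P'|| <= degree^2 * ||P|| on [-1, 1].
markov_holds = norm_dT <= degree**2 * norm_T + 1e-9
```

The grid contains the endpoints, so the extremal value $T_n'(1) = n^2$ is sampled exactly.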

2.3 Divided Differences

Definition 9 Given a function $f : \mathbb{R} \to \mathbb{R}$ and a finite collection of real numbers $\{t_0,t_1,\dots,t_n\}$, we denote by $\Delta(f;t_0,t_1,\dots,t_n)$ the leading coefficient of the polynomial of degree $n$ that interpolates $f$ at $t_0,\dots,t_n$. We call it the $n$-th divided difference of $f$. Divided differences are computed recursively as follows:

\begin{align*}
\Delta(f;t_0) &= f(t_0), \\
\Delta(f;t_0,t_1) &= \frac{f(t_1)-f(t_0)}{t_1-t_0} \quad\text{if } t_0 \ne t_1, \\
\Delta(f;\underbrace{t_0,\dots,t_0}_{k}) &= \frac{f^{(k-1)}(t_0)}{(k-1)!} \quad\text{if this derivative exists}, \\
\Delta(f;t_0,\dots,t_{n-1},t_n) &= \frac{\Delta(f;t_1,\dots,t_n)-\Delta(f;t_0,\dots,t_{n-1})}{t_n-t_0}.
\end{align*}
Lemma 2.5 (Newton’s Interpolation polynomial) Given a function $f : \mathbb{R} \to \mathbb{R}$, and knots $t_0 < \dots < t_n$, the interpolation polynomial of $f$ at those knots, $P_f(x;t_0,\dots,t_n)$, can be written in terms of the divided differences as follows (the empty product for $k = 0$ is understood to be $1$):
\[
P_f(x;t_0,\dots,t_n) = \sum_{k=0}^{n} \Delta(f;t_0,\dots,t_k)\,(x-t_0)\cdots(x-t_{k-1}).
\]

Proof. Given $f : \mathbb{R} \to \mathbb{R}$, consider the following interpolation polynomials for each $k = 0,\dots,n-1$: $Q_k(x) = P_f(x;t_0,\dots,t_k) \in \Pi(k)$, $Q_{k+1}(x) = P_f(x;t_0,\dots,t_{k+1}) \in \Pi(k+1)$. Notice that $g = Q_{k+1} - Q_k \in \Pi(k+1)$ vanishes at the knots $t_0,\dots,t_k$, and by definition its leading coefficient is the divided difference $\Delta(f;t_0,\dots,t_{k+1})$; hence,

\[
P_f(x;t_0,\dots,t_{k+1}) = P_f(x;t_0,\dots,t_k) + \Delta(f;t_0,\dots,t_{k+1})\,(x-t_0)\cdots(x-t_k).
\]
[#]
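
The recursion for divided differences and the Newton form of the interpolation polynomial can be sketched in a few lines. This is a minimal illustration for pairwise distinct knots only (the confluent case with derivatives is not handled); exact rational arithmetic is used to avoid rounding, and the helper names `divided_differences` and `newton_eval` are ours. The $k$-th Newton term is taken as the coefficient times $(x-t_0)\cdots(x-t_{k-1})$, with an empty product for $k=0$.

```python
from fractions import Fraction


def divided_differences(f, knots):
    """Return [D(f;t0), D(f;t0,t1), ..., D(f;t0,...,tn)] for distinct knots."""
    table = [Fraction(f(t)) for t in knots]
    coeffs = [table[0]]
    for order in range(1, len(knots)):
        # table[i] becomes the divided difference on knots[i .. i+order].
        table = [
            (table[i + 1] - table[i]) / (knots[i + order] - knots[i])
            for i in range(len(knots) - order)
        ]
        coeffs.append(table[0])
    return coeffs


def newton_eval(coeffs, knots, x):
    """Evaluate sum_k coeffs[k] * (x-t0)...(x-t_{k-1}) by nested multiplication."""
    result = coeffs[-1]
    for k in range(len(coeffs) - 2, -1, -1):
        result = result * (x - knots[k]) + coeffs[k]
    return result


knots = [Fraction(0), Fraction(1), Fraction(3), Fraction(4)]
f = lambda t: t**3 - 2 * t + 1
coeffs = divided_differences(f, knots)
```

Since `f` is a cubic and four knots are used, the last coefficient is the leading coefficient of `f`, and the Newton form reproduces `f` at every point, not only at the knots.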
Lemma 2.6 If $f \in C^n[a,b]$ and $a \le t_k \le b$ for all $k = 0,\dots,n$, then there exists $\theta \in (a,b)$ such that $\Delta(f;t_0,\dots,t_n) = \frac{1}{n!}\,f^{(n)}(\theta)$.

Proof. This is a direct consequence of Rolle’s Theorem. [#]

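Lemma 2.7 (Leibniz Formula for Divided Differences) Given functions $f$, $g$ and knots $t_0,\dots,t_n$, the $n$-th divided difference of their product is given by the following formula:

\[
\Delta(fg;t_0,\dots,t_n) = \sum_{k=0}^{n} \Delta(f;t_0,\dots,t_k)\,\Delta(g;t_k,\dots,t_n) \tag{2.1}
\]

Proof. Assume all the knots are different, and write the interpolation polynomials of $f$ and $g$ at those knots using Newton’s expression with the divided differences as coefficients, the first one built from $t_0$ upwards and the second one from $t_n$ downwards. Their product agrees with $h = fg$ at every knot:

\[
P_f(x;t_0,\dots,t_n)\,P_g(x;t_0,\dots,t_n)
 = \Bigl[\sum_{k=0}^{n} \Delta(f;t_0,\dots,t_k)\,(x-t_0)\cdots(x-t_{k-1})\Bigr]
 \times \Bigl[\sum_{l=0}^{n} \Delta(g;t_l,\dots,t_n)\,(x-t_{l+1})\cdots(x-t_n)\Bigr] \tag{2.2}
\]
The terms of this product with $k > l$ vanish at all the knots, and the remaining terms have degree at most $n$; hence the sum of the terms with $k \le l$ is $P_h(x;t_0,\dots,t_n)$. Notice now that the leading coefficient of $P_h(x;t_0,\dots,t_n)$ is $\Delta(fg;t_0,\dots,t_n)$, and that it comes from the terms with $k = l$ in (2.2), which give the right-hand side of (2.1). [#]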
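
The Leibniz formula is easy to test on concrete data. The sketch below, under the assumption of pairwise distinct knots, compares both sides of (2.1) in exact rational arithmetic; the functions `divided_difference` and `leibniz_rhs` and the sample data are illustrative choices.

```python
from fractions import Fraction


def divided_difference(f, knots):
    """Divided difference of f on distinct knots, by the two-term recursion."""
    if len(knots) == 1:
        return Fraction(f(knots[0]))
    upper = divided_difference(f, knots[1:])
    lower = divided_difference(f, knots[:-1])
    return (upper - lower) / (knots[-1] - knots[0])


def leibniz_rhs(f, g, knots):
    """Right-hand side of (2.1): sum_k D(f; t0..tk) * D(g; tk..tn)."""
    n = len(knots) - 1
    return sum(
        divided_difference(f, knots[: k + 1]) * divided_difference(g, knots[k:])
        for k in range(n + 1)
    )


knots = [Fraction(k) for k in (0, 1, 3, 4, 6)]
f = lambda t: t**2 + 1
g = lambda t: t**3 - t
lhs = divided_difference(lambda t: f(t) * g(t), knots)
rhs = leibniz_rhs(f, g, knots)
```

The recursive implementation recomputes shared sub-differences and is only meant for small knot sets.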

2.4 Univariate Splines

2.4.1 Definition and Basic Properties

Definition 10 (Schoenberg spaces: knots-multiplicity form) Given an interval $A = [a,b]$ in $\mathbb{R}$, we initially define spaces of splines in the following way: Fix $r > 0$ and let $\{a < t_1 < t_2 < \dots < t_n < b\}$ be a partition of the interval, with multiplicities $0 < m_k \le r$ associated to these knots. We denote $\mathbf{t} = (t_1,\dots,t_n)$, $\mathbf{m} = (m_1,\dots,m_n)$, and

\[
S_r(\mathbf{t},\mathbf{m};A) = \bigl\{ f : A \to \mathbb{R} \;:\; f|_{(t_k,t_{k+1})} \in \Pi(r),\ f|_{\{t_k\}} \in C^{(r-m_k-1)} \text{ for all } k \bigr\}.
\]
The multiplicity $m_k$ indicates the degrees of freedom (associated to polynomials of order $r$) at each knot $t_k$; hence, $r-m_k-1$ gives the smoothness of the spline function at those points ($-1$ meaning discontinuity).

Classically, these are called Schoenberg spaces on A.

Remarks. For instance, $m = 1$ gives one degree of freedom: the location of the image of $f(t)$. In that case, the smoothness of $f$ at $t$ is $r - 1$, which is the maximum possible degree. In particular, this shows that $\Pi(r) = S_r(\mathbf{t},\mathbf{1};A)$, where $\mathbf{1} = (1,\dots,1)$.
On the other hand, if $m = r$, then we have all possible degrees of freedom: we can choose location and all derivatives (on both sides); this leaves us with piecewise polynomials with possible discontinuities at each knot.
With a slight abuse of notation, we can write $S_r(\mathbf{t},\mathbf{r};A) = \bigoplus_{k=1}^{n} \Pi(r)|_{(t_k,t_{k+1})}$, where $\mathbf{r} = (r,\dots,r)$.

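Proposition 4 The space $S_r(\mathbf{t},\mathbf{m};[a,b])$ has the basis

\begin{align*}
S_{0,j}(x) &= \frac{1}{j!}\,(x-a)^j && \text{for } j = 0,\dots,r-1, \\
S_{k,j}(x) &= \frac{1}{j!}\,(x-t_k)^j_+ && \text{for } j = r-m_k,\dots,r-1,\ k = 1,\dots,n,
\end{align*}
where $x_+^n = x^n\,\chi_{\{x>0\}}$ denotes truncated powers. The associated dual functionals are as follows:
\begin{align*}
\alpha_{0,j}(S) &= S^{(j)}(a), \\
\alpha_{k,j}(S) &= S^{(j)}(t_k^+) - S^{(j)}(t_k^-).
\end{align*}
In particular, $\dim S_r(\mathbf{t},\mathbf{m};[a,b]) = r + |\mathbf{m}| = r + \sum_{k=1}^{n} m_k$.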
Definition 11 (Schoenberg spaces: basic knots form) Given an interval $A = [a,b]$ in the real line, $r > 0$ and a nondecreasing sequence of knots $\mathbf{t} = \{a \le t_1 \le \dots \le t_n \le b\}$, where $t_k < t_{k+r}$ for all $k$, we define the Schoenberg space $S_r(\mathbf{t};A)$ to be the space of splines of order $r$ with knots given by the partition generated by $\mathbf{t}$, and multiplicities given by the number of repetitions of each knot in the sequence.

Example. Consider $A = [0,1]$ and the basic knot sequence $\mathbf{t} = \{0,0,0,1/2,1/2,1/2,1,1,1\}$. In this case, we have

\[
S_r(\mathbf{t};[0,1]) = S_r(\{0,1/2,1\},\{3,3,3\};[0,1]).
\]
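
Passing from the basic knots form to the knots-multiplicity form amounts to counting repetitions. A minimal sketch of that bookkeeping, with the hypothetical helper name `knots_to_multiplicities`:

```python
from collections import Counter
from fractions import Fraction


def knots_to_multiplicities(knots):
    """Split a knot sequence into distinct breakpoints and their multiplicities."""
    counts = Counter(knots)
    breakpoints = sorted(counts)
    return breakpoints, [counts[t] for t in breakpoints]


half = Fraction(1, 2)
basic_knots = [0, 0, 0, half, half, half, 1, 1, 1]
breakpoints, multiplicities = knots_to_multiplicities(basic_knots)
```

For the knot sequence of the example this returns the breakpoints $0, 1/2, 1$, each with multiplicity $3$.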

Definition 12 (puB-splines) If $t_k \le \dots \le t_{k+r}$ is a sequence of $r + 1$ knots with $t_k \ne t_{k+r}$, we define the puB-spline $N_{k,r}$ as follows:

\[
N_{k,r}(x) = N(x\,|\,t_k,\dots,t_{k+r}) = (t_{k+r}-t_k)\,\Delta\bigl((\cdot-x)^{r-1}_+;t_k,\dots,t_{k+r}\bigr).
\]
Lemma 2.8 (Properties of puB-splines) (i) $N$ is a spline function. (ii) $\operatorname{supp}(N_{k,r}) = [t_k,t_{k+r}]$. (iii) Recurrence formula:
\[
N(x\,|\,t_k,\dots,t_{k+r}) = \frac{x-t_k}{t_{k+r-1}-t_k}\,N(x\,|\,t_k,\dots,t_{k+r-1}) + \frac{t_{k+r}-x}{t_{k+r}-t_{k+1}}\,N(x\,|\,t_{k+1},\dots,t_{k+r}).
\]

Proof. Notice that $N$ is by definition a linear combination of truncated powers $(t_j - x)_+^{r-k_j}$, where $k_j$ is the number of repetitions $t_i = t_j$ for $i < j$; therefore, it is a spline function. Furthermore, since any $r$-th order divided difference of a polynomial of degree $r - 1$ is zero, $N_{k,r}$ vanishes identically when $x < t_k$ and $x > t_{k+r}$ (think leading coefficients).

The recurrence formula is a direct consequence of the recurrence formula for divided differences and the Leibniz formula for the divided difference of a product of two functions:

\begin{align*}
N_{k,r}(x)
 &= (t_{k+r}-t_k)\,\Delta\bigl((\cdot-x)^{r-1}_+;t_k,\dots,t_{k+r}\bigr) \\
 &= \Delta\bigl((\cdot-x)^{r-1}_+;t_{k+1},\dots,t_{k+r}\bigr) - \Delta\bigl((\cdot-x)^{r-1}_+;t_k,\dots,t_{k+r-1}\bigr) \\
 &= \Delta\bigl((\cdot-x)^{r-2}_+(\cdot-x);t_{k+1},\dots,t_{k+r}\bigr) - \Delta\bigl((\cdot-x)(\cdot-x)^{r-2}_+;t_k,\dots,t_{k+r-1}\bigr) \\
 &= \Delta\bigl((\cdot-x)^{r-2}_+;t_{k+1},\dots,t_{k+r-1}\bigr) + (t_{k+r}-x)\,\Delta\bigl((\cdot-x)^{r-2}_+;t_{k+1},\dots,t_{k+r}\bigr) \\
 &\quad - (t_k-x)\,\Delta\bigl((\cdot-x)^{r-2}_+;t_k,\dots,t_{k+r-1}\bigr) - \Delta\bigl((\cdot-x)^{r-2}_+;t_{k+1},\dots,t_{k+r-1}\bigr);
\end{align*}
the first and last terms cancel, and what remains is the recurrence formula, which proves our last statement. [#]

Remarks. In his class notes [deBo], Carl de Boor expresses the previous recurrence formula in the following way:

\begin{align*}
N_{k,1} &= \chi_{[t_k,t_{k+1})}, \\
N_{k,r} &= \eta_{k,r}\,N_{k,r-1} + (1-\eta_{k+1,r})\,N_{k+1,r-1},
\end{align*}
where
\[
\eta_{k,r}(x) = \frac{x-t_k}{t_{k+r-1}-t_k}.
\]
This allows computation of puB-splines following a Horner’s scheme-like method, and it can actually be used as the starting point of the development of the theory of B-splines, rather than using divided differences.
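
The recurrence translates directly into a short recursive routine. The following is a sketch, not an efficient evaluator: knots are stored in a zero-based Python list (so the list index plays the role of $k$ after a shift), terms with a zero denominator are dropped by convention, and the function name `bspline` is ours.

```python
def bspline(k, r, knots, x):
    """Value at x of the B-spline of order r on knots[k], ..., knots[k + r]."""
    if r == 1:
        # Characteristic function of the half-open interval [t_k, t_{k+1}).
        return 1.0 if knots[k] <= x < knots[k + 1] else 0.0
    value = 0.0
    left = knots[k + r - 1] - knots[k]
    if left > 0:
        value += (x - knots[k]) / left * bspline(k, r - 1, knots, x)
    right = knots[k + r] - knots[k + 1]
    if right > 0:
        value += (knots[k + r] - x) / right * bspline(k + 1, r - 1, knots, x)
    return value


knots = [0.0, 0.5, 1.5, 2.0, 3.0, 3.5, 5.0]
order = 3
# All B-splines of this order that fit on the knot list.
indices = range(len(knots) - order)
# On [knots[order - 1], knots[len(knots) - order]) every contributing spline is present.
totals = [sum(bspline(k, order, knots, x) for k in indices) for x in (1.7, 2.0, 2.9)]
```

On the interval where all contributing splines are available, the values in `totals` should equal 1 up to rounding, in line with the partition of unity property proved in Step 1 of Theorem 10.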

Theorem 10 (Curry, Schoenberg (1966), de Boor, Fix (1973)) Given $a < b$, $r > 0$, basic knots $\mathbf{t} = \{a < t_1 \le \dots \le t_n < b\}$ and $2r$ auxiliary knots $\{t_{1-r} \le \dots \le t_0 \le a\}$, $\{b \le t_{n+1} \le \dots \le t_{n+r}\}$, the puB-splines $N_{k,r}(x) = N(x\,|\,t_k,\dots,t_{k+r})$ for $k = 1-r,\dots,n$ form a basis of $S_r(\mathbf{t};[a,b])$.

Proof. Although the result was first proved by Curry and Schoenberg [CuSc] in 1966, we offer here a different proof by de Boor and Fix [dBFi], based on the Marsden identities:

Step 1. The restriction to the interval $[a,b]$ of the previously defined puB-splines $N_{k,r}(x)$ gives a partition of unity:
\[
\sum_{k=1-r}^{n} N_{k,r}(x)|_{[a,b]} = \chi_{[a,b]}.
\]
This is a direct consequence of the recurrence formula for puB-splines:
\begin{align*}
\sum_{k=1-r}^{n} N_{k,r}|_{[a,b]}
 &= \sum_{k=1-r}^{n}\bigl[\eta_{k,r}\,N_{k,r-1}|_{[a,b]} + (1-\eta_{k+1,r})\,N_{k+1,r-1}|_{[a,b]}\bigr] \\
 &= \underbrace{\eta_{1-r,r}\,N_{1-r,r-1}|_{[a,b]}}_{\equiv 0 \text{ in } [a,b]} + \underbrace{(1-\eta_{n+1,r})\,N_{n+1,r-1}|_{[a,b]}}_{\equiv 0 \text{ in } [a,b]} + \sum_{k=2-r}^{n} N_{k,r-1}|_{[a,b]} \\
 &= \sum_{k=2-r}^{n} N_{k,r-1}|_{[a,b]}.
\end{align*}
We can therefore use induction on $r$, since for $r = 1$ we have trivially:
\[
\sum_{k=0}^{n} N_{k,1}|_{[a,b]} = \sum_{k=0}^{n} \chi_{[t_k,t_{k+1}) \cap [a,b]} = \chi_{[a,b]}.
\]
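Step 2. Marsden identities: For any $\theta \in \mathbb{R}$,
\[
\frac{1}{(r-1)!}\,(\theta-x)^{r-1}\big|_{[a,b]} = \sum_{k=1-r}^{n} \gamma_{k,r}(\theta)\,N_{k,r}(x)|_{[a,b]},
\]
where for all $k = 1-r,\dots,n$,
\begin{align*}
\gamma_{k,1}(x) &= 1, \\
\gamma_{k,r}(x) &= \frac{1}{(r-1)!}\,(x-t_{k+1})\cdots(x-t_{k+r-1}) \quad\text{for } r > 1.
\end{align*}
We will prove this statement by induction on $r$: It has been proved true for $r = 1$ above. Assume the property holds up to $r - 1$; then,
\begin{align*}
\sum_{k=1-r}^{n} \gamma_{k,r}(\theta)\,N_{k,r}|_{[a,b]}
 &= \sum_{k=1-r}^{n} \gamma_{k,r}(\theta)\bigl[\eta_{k,r}\,N_{k,r-1}|_{[a,b]} + (1-\eta_{k+1,r})\,N_{k+1,r-1}|_{[a,b]}\bigr] \\
 &= \sum_{k=2-r}^{n} \bigl[\gamma_{k,r}(\theta)\,\eta_{k,r} + \gamma_{k-1,r}(\theta)\,(1-\eta_{k,r})\bigr]\,N_{k,r-1}|_{[a,b]}.
\end{align*}
Let us rewrite the coefficients of each puB-spline in the previous expression:
\begin{align*}
\gamma_{k-1,r}(\theta) + \bigl[\gamma_{k,r}(\theta)-\gamma_{k-1,r}(\theta)\bigr]\,\eta_{k,r}(x)
 &= \gamma_{k-1,r}(\theta) + \gamma_{k,r-1}(\theta)\,\frac{t_k-t_{k+r-1}}{r-1}\,\eta_{k,r}(x) \\
 &= \gamma_{k-1,r}(\theta) + \frac{t_k-x}{r-1}\,\gamma_{k,r-1}(\theta) \\
 &= \frac{\theta-t_k}{r-1}\,\gamma_{k,r-1}(\theta) + \frac{t_k-x}{r-1}\,\gamma_{k,r-1}(\theta) \\
 &= \frac{\theta-x}{r-1}\,\gamma_{k,r-1}(\theta);
\end{align*}
therefore,
\[
\sum_{k=1-r}^{n} \gamma_{k,r}(\theta)\,N_{k,r}(x)|_{[a,b]} = \frac{\theta-x}{r-1}\sum_{k=2-r}^{n} \gamma_{k,r-1}(\theta)\,N_{k,r-1}(x)|_{[a,b]} = \frac{(\theta-x)^{r-1}}{(r-1)!}.
\]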
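Step 3. puB-spline series of polynomials in $\Pi(r)$: For any $\theta \in \mathbb{R}$, and any polynomial $P \in \Pi(r)$, we can write
\[
P = \sum_{k=1-r}^{n} \gamma_{k,r}(P)\,N_{k,r}|_{[a,b]}, \quad\text{where}\quad \gamma_{k,r}(P) = \sum_{\nu=0}^{r-1} (-1)^{\nu}\,\gamma_{k,r}^{(r-\nu-1)}(\theta)\,P^{(\nu)}(\theta).
\]
The functionals $\gamma_{k,r}$ are called de Boor-Fix functionals.

Notice first that the previous Marsden identities can be completed with the expression of any power $(\theta-x)^{\nu}$ for $\nu < r - 1$ by differentiation with respect to $\theta$:

\begin{align*}
\frac{(\theta-x)^{\nu}}{\nu!}
 &= D^{r-1-\nu}\left(\frac{(\theta-x)^{r-1}}{(r-1)!}\right) \\
 &= D^{r-1-\nu}\left(\sum_{k=1-r}^{n} \gamma_{k,r}(\theta)\,N_{k,r}(x)|_{[a,b]}\right) \\
 &= \sum_{k=1-r}^{n} \gamma_{k,r}^{(r-\nu-1)}(\theta)\,N_{k,r}(x)|_{[a,b]}.
\end{align*}
Given now any polynomial $P \in \Pi(r)$, consider any value $\theta \in \mathbb{R}$ and the Taylor expansion of $P$ around $\theta$:
\[
P(x) = \sum_{\nu=0}^{r-1} P^{(\nu)}(\theta)\,\frac{(x-\theta)^{\nu}}{\nu!} = \sum_{\nu=0}^{r-1} (-1)^{\nu}\,P^{(\nu)}(\theta) \sum_{k=1-r}^{n} \gamma_{k,r}^{(r-\nu-1)}(\theta)\,N_{k,r}(x)|_{[a,b]},
\]
which, after rearrangement, gives the desired coefficients.

It only remains to prove that a different choice of $\theta \in \mathbb{R}$ leads to the same coefficients, and therefore the functionals $\gamma_{k,r} : \Pi(r) \to \mathbb{R}$ do not depend on this choice; let $\theta' \ne \theta$, and write each derivative $P^{(\nu)}(\theta)$ in Taylor expansion around $\theta'$:

\[
P^{(\nu)}(\theta) = \sum_{j=\nu}^{r-1} P^{(j)}(\theta')\,\frac{(\theta-\theta')^{j-\nu}}{(j-\nu)!}.
\]
Notice that then, exchanging the order of summation and recognizing the Taylor expansion of $\gamma_{k,r}^{(r-j-1)}$ around $\theta$ evaluated at $\theta'$,
\begin{align*}
\gamma_{k,r}(P)
 &= \sum_{\nu=0}^{r-1} (-1)^{\nu}\,\gamma_{k,r}^{(r-\nu-1)}(\theta)\,P^{(\nu)}(\theta) \\
 &= \sum_{\nu=0}^{r-1} (-1)^{\nu}\,\gamma_{k,r}^{(r-\nu-1)}(\theta) \sum_{j=\nu}^{r-1} P^{(j)}(\theta')\,\frac{(\theta-\theta')^{j-\nu}}{(j-\nu)!} \\
 &= \sum_{j=0}^{r-1} (-1)^{j}\,P^{(j)}(\theta') \sum_{\nu=0}^{j} \gamma_{k,r}^{(r-\nu-1)}(\theta)\,\frac{(\theta'-\theta)^{j-\nu}}{(j-\nu)!} \\
 &= \sum_{j=0}^{r-1} (-1)^{j}\,\gamma_{k,r}^{(r-j-1)}(\theta')\,P^{(j)}(\theta').
\end{align*}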
Step 4. puB-spline series of truncated powers: For any basic knot $t_j$ and a power $\nu$ with $0 \le \nu < r - m_j$ (where $m_j$ is the multiplicity of the knot $t_j$),
\[
\frac{(x-t_j)^{\nu}_+}{\nu!} = (-1)^{\nu} \sum_{k=j}^{n} \gamma_{k,r}^{(r-\nu-1)}(t_j)\,N_{k,r}(x)|_{[a,b]}.
\]
This is a direct consequence of the previous step; for any basic knot $t_j$, it must be $1 \le j \le n$, and we can write:
\begin{align*}
\frac{(x-t_j)^{\nu}_+}{\nu!}
 &= (-1)^{\nu}\,\frac{(t_j-x)^{\nu}}{\nu!}\,\chi_{(t_j,b]} \\
 &= (-1)^{\nu} \sum_{k=1-r}^{n} \gamma_{k,r}^{(r-\nu-1)}(t_j)\,N_{k,r}(x)|_{[a,b]\cap(t_j,b]} \\
 &= (-1)^{\nu} \sum_{k=j}^{n} \gamma_{k,r}^{(r-\nu-1)}(t_j)\,N_{k,r}(x)|_{[a,b]},
\end{align*}
as we wanted to prove.

Conclusion: We have expressions of every basic element of Sr(t;[a,b]) in terms of the constructed puB-splines. This means that they span the Schoenberg space, and because of their cardinality, they must form a basis of the space. [#]
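
The Marsden identity used in the proof can be checked numerically. The sketch below evaluates both sides at one pair of points for a non-uniform knot list; it assumes the normalization in which the coefficient of the $k$-th B-spline is $(\theta-t_{k+1})\cdots(\theta-t_{k+r-1})/(r-1)!$, uses a zero-based knot list, and restricts $x$ to an interval on which all contributing B-splines are available. The names `bspline` and `marsden_coefficient` are illustrative.

```python
import math


def bspline(k, r, knots, x):
    """Value at x of the B-spline of order r on knots[k], ..., knots[k + r]."""
    if r == 1:
        return 1.0 if knots[k] <= x < knots[k + 1] else 0.0
    value = 0.0
    left = knots[k + r - 1] - knots[k]
    if left > 0:
        value += (x - knots[k]) / left * bspline(k, r - 1, knots, x)
    right = knots[k + r] - knots[k + 1]
    if right > 0:
        value += (knots[k + r] - x) / right * bspline(k + 1, r - 1, knots, x)
    return value


def marsden_coefficient(k, r, knots, theta):
    """(theta - t_{k+1}) ... (theta - t_{k+r-1}) / (r - 1)!"""
    product = 1.0
    for i in range(1, r):
        product *= theta - knots[k + i]
    return product / math.factorial(r - 1)


knots = [0.0, 0.5, 1.5, 2.0, 3.0, 3.5, 5.0, 6.0]
r, theta, x = 4, 1.7, 2.4   # x lies in [knots[3], knots[4])
lhs = (theta - x) ** (r - 1) / math.factorial(r - 1)
rhs = sum(
    marsden_coefficient(k, r, knots, theta) * bspline(k, r, knots, x)
    for k in range(len(knots) - r)
)
```

Both sides should agree up to rounding for any real `theta`, which is the sense in which the B-splines reproduce every polynomial of order $r$.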

Remark. Consider the dual functionals associated to the basis of $S_r(\mathbf{t};[a,b])$ given by the puB-splines; let us denote them $\alpha_{k,r}$. There are different ways of expressing these functionals; de Boor and Fix offer the most useful for our purposes: For each $k = 1-r,\dots,n$, choose $\theta_{k,r} \in \operatorname{supp}(N_{k,r}) = (t_k,t_{k+r}) \cap [a,b]$, and write

\[
\alpha_{k,r}(S) = \sum_{\nu=0}^{r-1} (-1)^{\nu}\,\gamma_{k,r}^{(r-\nu-1)}(\theta_{k,r})\,S^{(\nu)}(\theta_{k,r}).
\]
The convention is that, if $\theta_{k,r}$ is one of the knots, then some of the terms in the sum are zero, namely those where $\gamma_{k,r}$ vanishes.

Consider the operator $Q_{\mathbf{t}} : S_r(\mathbf{t};[a,b]) \to S_r(\mathbf{t};[a,b])$ given by

\[
Q_{\mathbf{t}}(S) = \sum_{k=1-r}^{n} \alpha_{k,r}(S)\,N_{k,r}.
\]
One may try to extend this operator to the whole space $X$; for this task it sometimes suffices to use the Hahn Banach Lemma (Theorem 28 in page 101), although in most cases the hypotheses of this Theorem are not satisfied, and one must look for other related constructions and partial extensions into proper subspaces of $X$. We will get back to this idea in §2.5.
2.4.2 Quasi-Interpolant Operators

It is not hard to show that the projector $Q_{\mathbf{t}}$ is a bounded operator on the Schoenberg spaces; given $S \in S_r(\mathbf{t};[a,b])$, we have

\begin{align*}
\|Q_{\mathbf{t}}(S)\|_{L_p[a,b]}
 &= \Bigl\|\sum_{k=1-r}^{n} \alpha_{k,r}(S)\,N_{k,r}\Bigr\|_{L_p[a,b]} \\
 &\le \max_{1-r\le k\le n} |\alpha_{k,r}(S)|\;\Bigl\|\sum_{k=1-r}^{n} N_{k,r}\Bigr\|_{L_p[a,b]} \\
 &\qquad\text{(each $\alpha_{k,r}$ is a linear functional on a space of finite dimension $r+n$; therefore, it is bounded)} \\
 &\le C_{r,n}\,\|S\|_{L_p[a,b]}\,\|1\|_{L_p[a,b]} \\
 &\le C_{r,n}\,|b-a|^{1/p}\,\|S\|_{L_p[a,b]}.
\end{align*}
Notice that the bound depends on the number of knots, $n$, and this is not useful: as the number of knots grows, the constant may grow without bound. But there is good news; one can use the scaling properties of our puB-splines to find a bound independent of the number of knots. In order to achieve this surprising result, we must consider a special choice of subintervals associated to each puB-spline function: For each puB-spline $N_{k,r}$, we denote by $J_{k,r}$ the largest subinterval $(t_j,t_{j+1})$ in its support (in case of several subintervals with the same property, choose the one with smallest index $j$). Notice that it must be $|J_{k,r}| \ge (t_{k+r}-t_k)/r$. Also, for each subinterval $(t_j,t_{j+1})$, we consider the slightly larger subinterval $(t_{j-r+1},t_{j+r}) \cap [a,b]$, which we denote $I_{j,r}$. Notice that $J_{k,r} \subset I_{j,r}$ whenever the interval $(t_j,t_{j+1})$ is contained in the support of the puB-spline $N_{k,r}$, and that the number of intervals $J_{k,r}$ contained in each $I_{j,r}$ is precisely $r$ (this will be used several times to make constants depend only on $r$).

Lemma 2.9 There exists $C > 0$ (depending at most on $r$), such that for all $0 < p < \infty$, $k = 1-r,\dots,n$ and $S \in S_r(\mathbf{t};[a,b])$,

\[
|\alpha_{k,r}(S)| \le C\,|J_{k,r}|^{-1/p}\,\|S\|_{L_p(J_{k,r})}.
\]

Proof. We will use Lemma 2.1 (page 30), Markov’s Theorem (page 33), and the fact that the functions $\gamma_{k,r}$ and their derivatives are polynomials, hence bounded on any compact set. Consider any point $\theta_{k,r} \in J_{k,r}$:

\begin{align*}
|\alpha_{k,r}(S)|
 &= \Bigl|\sum_{\nu=0}^{r-1} (-1)^{\nu}\,\gamma_{k,r}^{(r-\nu-1)}(\theta_{k,r})\,S^{(\nu)}(\theta_{k,r})\Bigr| \\
 &\le \sum_{\nu=0}^{r-1} |\gamma_{k,r}^{(r-\nu-1)}(\theta_{k,r})|\;\|S^{(\nu)}\|_{L_\infty(J_{k,r})} \\
 &\qquad\text{(apply Markov's Theorem and find a common bound of the functions $\gamma_{k,r}$)} \\
 &\le C_r \sum_{\nu=0}^{r-1} \|S\|_{L_\infty(J_{k,r})} \\
 &\qquad\text{(apply Lemma 2.1 [page 30])} \\
 &\le C_r\,|J_{k,r}|^{-1/p}\,\|S\|_{L_p(J_{k,r})}.
\end{align*}
[#]
Remark. For $p \ge 1$, using the Hahn-Banach Theorem (Theorem 29 in page 102), we realize that we can extend these functionals to $L_p[a,b]$: For each $k$ there exists a linear functional (abusing notation, we denote it by the same symbol) such that $|\alpha_{k,r}(f)| \le C\,|J_{k,r}|^{-1/p}\,\|f\|_{L_p(J_{k,r})}$ for all $f \in L_p(J_{k,r})$. Furthermore, we can also extend the linear operator $Q_{\mathbf{t}}$ to $L_p[a,b]$ for $p \ge 1$. This is what we call the quasi-interpolant of order $r$ corresponding to the knots $\mathbf{t}$ in $[a,b]$.

Proposition 5 For all $1 \le p < \infty$, and all $f \in L_p[a,b]$, there exists a constant $C > 0$ that depends at most on $p$ and the order $r$, such that the following local and global estimates hold:

\begin{align}
\|Q_{\mathbf{t}}(f)\|_{L_p[t_j,t_{j+1}]} &\le C\,\|f\|_{L_p(I_{j,r})} \tag{2.3} \\
\|Q_{\mathbf{t}}(f)\|_{L_p[a,b]} &\le C\,\|f\|_{L_p[a,b]} \tag{2.4}
\end{align}

Proof. Estimate (2.3) is a direct consequence of the remark after Lemma 2.9, the “partition of unity” property of the puB-splines, and the fact that $|J_{k,r}| \ge (t_{k+r}-t_k)/r$ and $J_{k,r} \subset I_{j,r}$ for all suitable indices $j$. Only the puB-splines with $j-r+1 \le k \le j$ are nonzero on $[t_j,t_{j+1}]$, so

\begin{align*}
\|Q_{\mathbf{t}}(f)\|_{L_p[t_j,t_{j+1}]}
 &= \Bigl\|\sum_{k=1-r}^{n} \alpha_{k,r}(f)\,N_{k,r}\Bigr\|_{L_p[t_j,t_{j+1}]} \\
 &\le \max_{j-r+1\le k\le j} |\alpha_{k,r}(f)|\;\Bigl\|\sum_{k=1-r}^{n} N_{k,r}\Bigr\|_{L_p[t_j,t_{j+1}]} \\
 &\le C_r \max_{j-r+1\le k\le j} |J_{k,r}|^{-1/p}\,\|f\|_{L_p(J_{k,r})}\;\|\chi_{[t_j,t_{j+1}]}\|_{L_p[a,b]} \\
 &\le C_r\,(t_{k+r}-t_k)^{-1/p}\,\|f\|_{L_p(I_{j,r})}\,(t_{k+r}-t_k)^{1/p}.
\end{align*}
And the first estimate follows. Estimate (2.4) is a direct consequence of the previous one:
\begin{align*}
\|Q_{\mathbf{t}}(f)\|^p_{L_p[a,b]}
 &= \sum_{k=1-r}^{n} \|Q_{\mathbf{t}}(f)\|^p_{L_p[t_k,t_{k+1}]} \\
 &\le C_r \sum_{k=1-r}^{n} \|f\|^p_{L_p(I_{k,r})}.
\end{align*}
On each interval $I_{j,r}$ there are precisely $r$ subintervals $J_{k,r}$; therefore, we can reduce the previous sum to a sum over mutually disjoint intervals that add up to $[a,b]$:
\begin{align*}
\|Q_{\mathbf{t}}(f)\|^p_{L_p[a,b]}
 &\le r\,C_r \sum_{j} \|f\|^p_{L_p(t_j,t_{j+1})} \\
 &= C_r\,\|f\|^p_{L_p[a,b]}.
\end{align*}
[#]

Remark. Unfortunately, one cannot use Hahn-Banach to find the same kind of results for $0 < p < 1$, although similar approximation results will remain valid. In §2.5 we will show how this is done in a more general setup.

2.5 Tensor Product Splines: Description of the problem

Definition 13 A tensor product puB-spline $N : \mathbb{R}^d \to \mathbb{R}$ of coordinate order $r$ (coordinate degree $< r$) is a product of univariate puB-splines of order $r$, each of them in a different variable: $N(x_1,\dots,x_d) = N_1(x_1)\cdots N_d(x_d)$.

We have all the necessary ingredients to pose the problem of approximation on cubes of $\mathbb{R}^d$ by dyadic splines. Let $\Omega = [0,1]^d$ be the unit cube in $\mathbb{R}^d$, and let $X = L_p(\Omega)$ with the corresponding (quasi)norm for all $0 < p < \infty$. The family of approximants will be constructed as spaces spanned by tensor product puB-splines. The construction of those spaces starts on the real line:

Consider for each $n \in \mathbb{N}$ the basic knots in $[0,1] \subset \mathbb{R}$ given by $\mathbf{t}_n = \{k2^{-n} : 0 < k < 2^n\}$, and $\mathbf{t}_0 = \emptyset$ by definition. Associated to these basic knots we will use the following Schoenberg spaces:

$n = 0$:
Consider auxiliary knots $1-r,\dots,0$ and $1,\dots,r$, and the corresponding puB-splines $N_{k,r} = N(\cdot\,|\,k,\dots,k+r)$ for all $k = 1-r,\dots,0$. Notice that all of those functions may be obtained as (restrictions to $[0,1]$ of) horizontal shifts of $N_{0,r}$. We can then write $N_{k,r}(x) = N_{0,r}(x-k)$.
$n \ge 1$:
Consider auxiliary knots $k2^{-n}$ for $k = 1-r,\dots,0$ and $k = 2^n,\dots,2^n+r-1$, and the corresponding puB-splines. Notice that, as in the previous case, we can obtain each of them as (restrictions to $[0,1]$ of) horizontal shifts of dyadic dilations of $N_{0,r}$. We need to update our notation in order to handle the new situation:

For each $n \ge 0$, denote $N^{[n]}_{0,r}(x) = N_{0,r}(2^n x)$ (the dyadic dilation of order $2^{-n}$), and then for each $k = 1-r,\dots,2^n-1$, $N^{[n]}_{k,r}(x) = N_{0,r}(2^n x - k)$ (horizontal shifts of length $k2^{-n}$).

Notice that $\{N^{[n]}_{k,r}\}_{k=1-r}^{2^n-1}$ is a basis for $S_r(\mathbf{t}_n;[0,1])$; let us denote by $\alpha^{[n]}_{k,r}$ the corresponding dual functionals.
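
The dyadic family is generated from a single function by dilation and integer shifts, which is simple to express in code. This sketch assumes that the generator is the B-spline of order $r$ on the integer knots $0,1,\dots,r$ and uses the uniform-knot form of the recurrence; the function names are ours.

```python
def cardinal_bspline(r, x):
    """B-spline of order r (degree r - 1) on the integer knots 0, 1, ..., r."""
    if r == 1:
        return 1.0 if 0.0 <= x < 1.0 else 0.0
    # Uniform-knot case of the recurrence formula for B-splines.
    return (x * cardinal_bspline(r - 1, x)
            + (r - x) * cardinal_bspline(r - 1, x - 1.0)) / (r - 1)


def dyadic_bspline(k, r, n, x):
    """Dilated and shifted generator: x -> N_{0,r}(2^n x - k)."""
    return cardinal_bspline(r, 2.0**n * x - k)


def level_indices(r, n):
    """Indices k = 1 - r, ..., 2^n - 1 of the splines at dyadic level n."""
    return range(1 - r, 2**n)


r, n = 3, 3
samples = [0.0, 0.3, 0.77]
totals = [sum(dyadic_bspline(k, r, n, x) for k in level_indices(r, n))
          for x in samples]
```

At level $n$ there are $2^n + r - 1$ functions, which matches the dimension $r + |\mathbf{m}|$ of the Schoenberg space with $2^n - 1$ simple interior knots, and their sum on $[0,1)$ is 1 up to rounding.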

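Let us move into $d$ dimensions, where we will write $x = (x_1,\dots,x_d) \in \mathbb{R}^d$. Consider for each multi-index $\mathbf{k} = (k_1,\dots,k_d)$ the tensor product puB-spline $N^{[n]}_{\mathbf{k},r}(x) = N^{[n]}_{k_1,r}(x_1)\cdots N^{[n]}_{k_d,r}(x_d)$, and the functionals $\alpha^{[n]}_{\mathbf{k},r} = \alpha^{[n]}_{k_1,r} \circ \dots \circ \alpha^{[n]}_{k_d,r}$.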

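Lemma 2.10 For each $S \in X_n$, any tensor product puB-spline $N^{[n]}_{\mathbf{k},r} = N_{k_1,r}\cdots N_{k_d,r}$, and any point $\theta_{\mathbf{k},r} = (\theta_{k_1,r},\dots,\theta_{k_d,r}) \in \operatorname{supp}(N^{[n]}_{\mathbf{k},r}) \cap \Omega$ (that is, $\theta_{k_j,r} \in \operatorname{supp}(N_{k_j,r}) \cap \operatorname{proj}_j(\Omega)$),

\[
\alpha^{[n]}_{\mathbf{k},r}(S) = \sum_{\nu=0}^{r-1} a_{\nu,\mathbf{k},r}(\theta_{\mathbf{k},r})\,D^{\nu}S(\theta_{\mathbf{k},r}), \quad\text{where}\quad a_{\nu,\mathbf{k},r}(\theta_{\mathbf{k},r}) = (-1)^{|\nu|}\,\gamma^{(r-\nu_1-1)}_{k_1,r}(\theta_{k_1,r})\cdots\gamma^{(r-\nu_d-1)}_{k_d,r}(\theta_{k_d,r}).
\]

Proof. Given a multi-index $\mathbf{k} = (k_1,\dots,k_d)$, and a tensor product spline $S \in X_n$, write

\[
S = \sum_{j_d \ne k_d} c_{\mathbf{j}}\,N^{[n]}_{\mathbf{j},r} + N_{k_d,r} \sum_{\mathbf{j}' \in \mathbb{Z}^{d-1}} c_{\mathbf{j}'}\,N^{[n]}_{\mathbf{j}',r}.
\]
Make $\alpha_{k_d,r}$ act on the previous expression to obtain
\[
\alpha_{k_d,r}(S) = \sum_{\nu_1=0}^{r-1} (-1)^{\nu_1}\,\gamma^{(r-\nu_1-1)}_{k_d,r}(\theta_{k_d,r}) \Bigl(\sum_{\mathbf{j}' \in \mathbb{Z}^{d-1}} c_{\mathbf{j}'}\,N^{[n]}_{\mathbf{j}',r}\Bigr) D^{\nu_1}N_{k_d,r}(\theta_{k_d,r}).
\]
Make now the rest of the univariate functionals $\alpha_{k_j,r}$ act on the previous expression, one at a time in decreasing order, to obtain the desired result. [#]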

It follows that these are the dual functionals of the constructed tensor product puB-splines, and therefore the functions $N^{[n]}_{\mathbf{k},r}$ are linearly independent in $L_p(\Omega)$. Let us denote $X_n = \operatorname{span}\{N^{[n]}_{\mathbf{k},r}\}$, where $\mathbf{k}$ has all its indices between $1-r$ and $2^n-1$. This is the space of all piecewise polynomials with coordinate order $r$ and maximal smoothness on dyadic subcubes of size $2^{-n}$ in the cube $\Omega$ (since $D^{\nu}N^{[n]}_{\mathbf{k},r} \in L_\infty(\Omega)$ for all multi-indices $0 \le \nu \le r-1$, and $D^{\nu}N^{[n]}_{\mathbf{k},r}$ is continuous for multi-indices $0 \le \nu \le r-2$). As in the univariate case, each tensor product puB-spline $N^{[n]}_{\mathbf{k},r}$ can be obtained from $N^{[0]}_{0,r}$ by shifts and dilations:

\[
N^{[n]}_{\mathbf{k},r}(x) = N^{[0]}_{0,r}(2^n x - \mathbf{k}).
\]

The family {Xn}n is our family of approximants:

  1. Each $X_n$ is a linear space: $cX_n = X_n$, and $X_n + X_n = X_n$ trivially.
  2. By Theorem 3 (page 14), there exist elements of best approximation, although so far we have no means of computing them for a given $f \in L_p(\Omega)$. By using the construction below and the results in sections 2.1 and 2.4.2, we will nevertheless be able to construct suitable elements of near-best approximation.
  3. Notice that $X_n \subset X_{n+1}$ for all $n$. It can be proved that $\bigcup_n X_n$ is dense in $X$, and therefore $\lim_n E(\cdot,X_n) = 0$, monotonically decreasing. We can therefore use Lemma 1.1 (page 5) if needed.

We would now like to have a projector, but this is not an easy task. The quasi-interpolant of §2.4.2 does not work for $0 < p < 1$, but is still useful. The trick is to find first intermediate spaces $X_n \subset Y_n \subset X$ for each $n$, where the quasi-interpolants can be easily extended, and such that we can effectively compute (near-)best approximations from $Y_n$ to elements of $X$. In the case we are studying here, the obvious choice works just fine:

For each multi-index $\mathbf{j} = (j_1,\dots,j_d)$ with $1 \le j_i \le 2^n$, consider the dyadic cubes of $\Omega$, $\square_{\mathbf{j},n} = [(j_1-1)2^{-n},j_1 2^{-n}] \times \dots \times [(j_d-1)2^{-n},j_d 2^{-n}]$, and $D_n = \{\square_{\mathbf{j},n} : 1 \le \mathbf{j} \le 2^n\}$ the family of those cubes. Let us denote by $Y_n = \bigoplus_{\mathbf{j}=1}^{2^n} \Pi(r;\square_{\mathbf{j},n})$ the space of piecewise polynomials of coordinate order $r$ associated to the dyadic partition $D_n$.

As we did before for univariate puB-splines, we need to consider, for each tensor product puB-spline $N^{[n]}_{\mathbf{k},r}$, two special cubes:

$J_{\mathbf{k},r}$:
$J_{\mathbf{k},r} = J_{k_1,r} \times \dots \times J_{k_d,r} \in D_n$ (read §2.4.2 for the definition of the intervals $J_j$).
$I_{\mathbf{j},r}$:
For each $1 \le \mathbf{j} \le 2^n$, denote by $I_{\mathbf{j},r}$ the smallest cube that contains each of the $J_{\mathbf{k},r}$ for which $\operatorname{supp}(N^{[n]}_{\mathbf{k},r}) \cap \square_{\mathbf{j},n} \ne \emptyset$. Notice that $I_{\mathbf{j},r} \in D_m$ for some $m \le n$; therefore, $|I_{\mathbf{j},r}| = 2^{(n-m)d}\,|\square_{\mathbf{j},n}|$. Also, and as before, the number of subcubes $J_{\mathbf{k},r}$ contained in each $I_{\mathbf{j},r}$ depends solely on $r$ and $d$.

In order to obtain the de Boor-Fix expression of the dual functionals of the tensor product puB-splines $N^{[n]}_{\mathbf{k},r}$, we will choose canonically $\theta_{\mathbf{k},r}$ to be the center of the cube $J_{\mathbf{k},r}$. In that case, one realizes that these functionals can also be applied to any function $f \in L_p(\Omega)$ which is differentiable enough at each of the points $\theta_{\mathbf{k},r}$; in particular, to any piecewise polynomial $P \in Y_n$:

Lemma 2.11 For any $0 < p < \infty$ and any piecewise polynomial $P \in Y_n$, there exists a constant $C > 0$ which depends at most on the order $r$, such that

\[
\bigl|\alpha^{[n]}_{\mathbf{k},r}(P)\bigr| \le 2^{nd/p}\,C\,\|P\|_{L_p(J_{\mathbf{k}})}.
\]
[]

The proof of this lemma follows the same steps as the one for Lemma 2.9 (page 47) and the remark after it. We can also construct quasi-interpolant operators $Q_n : Y_n \to X_n$, which act naturally as projectors.

Proposition 6 For $0 < p < \infty$ and any piecewise polynomial $P \in Y_n$, there is a constant $c > 0$ depending at most on $r$, $d$ and $p$, such that the following estimates hold:

\begin{align}
\|Q_n(P)\|_{L_p(\square_{\mathbf{j},n})} &\le c\,\|P\|_{L_p(I_{\mathbf{j},r})} \tag{2.5} \\
\|Q_n(P)\|_{L_p(\Omega)} &\le c\,\|P\|_{L_p(\Omega)} \tag{2.6} \\
\|P - Q_n(P)\|_{L_p(\square_{\mathbf{j},n})} &\le c\,E(P;\Pi(r;I_{\mathbf{j},r}))_p \tag{2.7}
\end{align}
[]

The proof is trivial; it uses the previous lemma, and follows the same steps as the proof of Proposition 5 and the remark after it (page 48).

We are ready to construct the method of approximation: Given $\tau > 0$, consider any operator of $\tau$ near-best $L_p$ approximation by elements of $Y_n$, say $\tau_n : X \to Y_n$ such that $\|f - \tau_n(f)\|_{L_p(\Omega)} \le \tau\,E(f,Y_n)_p$. Notice that such operators may be constructed by collecting the restrictions to cubes of the $\tau$ near-best $L_p$ approximations to $f$ by polynomials in $\Pi(r)$ on each cube $\square_{\mathbf{j},n} \in D_n$, and patching them together. Let then

\[
T_n : X \ni f \mapsto (Q_n \circ \tau_n)(f) \in X_n.
\]
It follows easily that these are linear operators. The most important properties of these operators are presented in the following two propositions:

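Proposition 7 Given $\tau > 0$, there exists a constant $C > 0$ which depends at most on $p$, $r$, $d$ and $\tau$ such that $\|T_n(f)\|_{L_p(\Omega)} \le C\,\|f\|_{L_p(\Omega)}$ for all $f \in L_p(\Omega)$.

Proof. Notice first that, for every subcube $\square_{\mathbf{j},n} \in D_n$, we have

\begin{align*}
\|\tau_n(f)\|_{L_p(\square_{\mathbf{j},n})}
 &\le C_p\bigl\{\|\tau_n(f)-f\|_{L_p(\square_{\mathbf{j},n})} + \|f\|_{L_p(\square_{\mathbf{j},n})}\bigr\} \\
 &\le C_{p,\tau}\bigl\{E(f,\Pi(r);\square_{\mathbf{j},n})_p + \|f\|_{L_p(\square_{\mathbf{j},n})}\bigr\} \\
 &\le C_{p,\tau}\,\|f\|_{L_p(\square_{\mathbf{j},n})};
\end{align*}
therefore,
\begin{align*}
\|T_n(f)\|^p_{L_p(\Omega)} = \|Q_n(\tau_n(f))\|^p_{L_p(\Omega)}
 &= \sum_{\mathbf{j}=1}^{2^n} \|Q_n(\tau_n(f))\|^p_{L_p(\square_{\mathbf{j},n})} \\
 &\qquad\text{(apply estimate (2.5) above)} \\
 &\le C_{r,d,p} \sum_{\mathbf{j}=1}^{2^n} \|\tau_n(f)\|^p_{L_p(I_{\mathbf{j},r})}
  \le C_{r,d,p,\tau} \sum_{\mathbf{j}=1}^{2^n} \|f\|^p_{L_p(I_{\mathbf{j},r})} \\
 &\qquad\text{(independence of $n$ is guaranteed, see description of $I_{\mathbf{j},r}$)} \\
 &\le C_{r,d,p,\tau}\,\|f\|^p_{L_p(\Omega)}.
\end{align*}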

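Proposition 8 $T_n : L_p(\Omega) \to X_n$ is an operator of near-best $L_p$ approximation from elements of $X_n$.

Proof. Given $f \in L_p(\Omega)$, let $S_n \in X_n$ be its best $L_p$ approximation from $X_n$; since $Q_n(S_n) = S_n$ and $T_n(f) = Q_n(\tau_n(f))$, we have the estimate:

\begin{align*}
\|f - T_n(f)\|_{L_p(\Omega)}
 &\le C_r\bigl\{\|f-S_n\|_{L_p(\Omega)} + \|S_n - T_n(f)\|_{L_p(\Omega)}\bigr\} \\
 &= C_{r,d}\bigl\{E(f,X_n)_p + \|Q_n(S_n - \tau_n(f))\|_{L_p(\Omega)}\bigr\} \\
 &\le C_{r,d,p}\bigl\{E(f,X_n)_p + \|S_n - \tau_n(f)\|_{L_p(\Omega)}\bigr\} \\
 &\le C_{r,d,p}\bigl\{E(f,X_n)_p + \|S_n - f\|_{L_p(\Omega)} + \|f - \tau_n(f)\|_{L_p(\Omega)}\bigr\} \\
 &\le C_{r,d,p}\bigl\{E(f,X_n)_p + \tau\,E(f,Y_n)_p\bigr\} \\
 &\le C_{r,d,p,\tau}\bigl\{E(f,X_n)_p + \|f-S_n\|_{L_p(\Omega)}\bigr\};
\end{align*}
which proves the statement. [#]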

Remark. Notice that, for each $f \in L_p(\Omega)$, we may now use $E(f,X_n)_p$ and $\|f - T_n(f)\|_{L_p(\Omega)}$ interchangeably, since they are equivalent. Moreover, when searching for the approximation spaces, we may use the (quasi)seminorms

\[
|f|_{A^\alpha_q(L_p(\Omega),X_n)}
 \asymp \left( \sum_{n=1}^{\infty} \frac{1}{n} \left[ n^{\alpha} \|f - T_n(f)\|_{L_p(\Omega)} \right]^q \right)^{1/q}
 \asymp \left( \sum_{k=0}^{\infty} 2^{kq\alpha} \|f - T_{2^k}(f)\|^q_{L_p(\Omega)} \right)^{1/q}
 \asymp \left( \sum_{n=0}^{\infty} 2^{nq\alpha} \|f - T_n(f)\|^q_{L_p(\Omega)} \right)^{1/q}.
\]

Our next goal is to identify the approximation spaces $A^\alpha_q(L_p(\Omega),X_n)$ in terms of classical spaces. The first step is to find known (quasi)seminorms with the same properties as our objects $\|f - T_n(f)\|_{L_p(\Omega)}$. Among the properties we are interested in, the most obvious is that the space of polynomials of coordinate order $r$ is properly contained in each of the kernels. Good candidates are therefore the Sobolev seminorms (but these are only defined for $p \ge 1$ and, even in those cases, not for all functions in $L_p(\Omega)$), and the $r$-th moduli of smoothness (which do exist for all functions $f \in L_p(\Omega)$, $0 < p \le \infty$). We will explore both functionals and the spaces related to them in the next sections.

2.6 Sobolev Spaces

2.6.1 Mollifiers and Infinitely Differentiable Partitions of Unity

In this section we introduce two important tools in the Theory of Sobolev Spaces: mollifiers and infinitely differentiable partitions of unity. We also illustrate how to use the former to construct plateau functions and to prove the density of $C_0^\infty(G)$ in $L_p(G)$ for $1 \le p < \infty$ on domains $G \subset \mathbb{R}^d$.

Definition 14 A mollifying kernel is any nonnegative, real-valued function $P \in C_0^\infty(\mathbb{R}^d)$ such that $P(x) = 0$ for $|x| \ge 1$ and $\int_{\mathbb{R}^d} P(x)\,dx = 1$; a mollifier is any function $P_\varepsilon(x) = \varepsilon^{-d} P(x/\varepsilon)$ with $\varepsilon > 0$.
Given a function $f \in M(\mathbb{R}^d,|\cdot|)$ for which the integral $\int_{\mathbb{R}^d} P_\varepsilon(x - y) f(y)\,dy$ makes sense, we call the convolution $(P_\varepsilon * f)(x)$ a mollification or regularization of $f$.

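Example. Examples of mollifying kernels are obtained, after dividing by their integral over $\mathbb{R}^d$, from the “bump” functions $P_d : \mathbb{R}^d \to \mathbb{R}$ given by

\[ P_d(x) = \exp\left( \frac{1}{|x|^2 - 1} \right) \chi_{B(0,1)}(x). \]

[Figure 2.1: $P_2(x) = \exp\left( \frac{1}{|x|^2 - 1} \right) \chi_{B(0,1)}(x)$]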
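
The definition and the example can be tried numerically. The following is a minimal sketch in one dimension ($d = 1$), not part of the original exposition: the helper names `bump`, `integrate`, `kernel`, `mollifier` and `mollify` are hypothetical, and the integrals are approximated with a plain midpoint rule.

```python
import math

def bump(x):
    """Unnormalized 1-D bump exp(1/(x^2 - 1)) on (-1, 1), zero elsewhere."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(1.0 / (x * x - 1.0))

def integrate(f, a, b, n=20000):
    """Composite midpoint rule on [a, b] with n cells."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Normalizing constant, so that the kernel integrates to 1 (approximately).
BUMP_MASS = integrate(bump, -1.0, 1.0)

def kernel(x):
    """Normalized bump: a mollifying kernel in the sense of Definition 14."""
    return bump(x) / BUMP_MASS

def mollifier(x, eps):
    """P_eps(x) = eps^{-d} P(x / eps) with d = 1."""
    return kernel(x / eps) / eps

def mollify(f, x, eps, n=2000):
    """(P_eps * f)(x), integrating only over the support |x - y| < eps."""
    return integrate(lambda y: mollifier(x - y, eps) * f(y), x - eps, x + eps, n)
```

With these helpers, `mollify(abs, 0.0, 0.1)` returns a small positive number: the mollification smooths the kink of $|x|$ at the origin while staying within $\varepsilon$ of the original value.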


Theorem 11 Given a domain $G \subset \mathbb{R}^d$, $C_0(G)$ is dense in $L_p(G)$ for all $1 \le p < \infty$.

Proof. Assume $f \in M(\mathbb{R}^d,|\cdot|)$. Let $(s_n)_n$ be a monotonically increasing sequence of nonnegative simple functions converging pointwise to $f$. As $p \ge 1$, we have $0 \le s_n(x)^p \le f(x)^p$ a.e.; therefore $s_n \in L_p(G)$ and, by the Dominated Convergence Theorem, $\lim_n \|f - s_n\|_{L_p(G)} = 0$ (since $|f(x) - s_n(x)|^p \le f(x)^p$ for all $x$). Given $\varepsilon > 0$, find $s_n$ such that $\|f - s_n\|_{L_p(G)} < \varepsilon/2$. Use now Lusin’s Theorem to find a continuous function $g \in C(G)$ such that $|g(x)| \le \|s_n\|_{L_\infty(G)}$ and, more importantly,

\[ \left| \{ x \in G : s_n(x) \ne g(x) \} \right| < \left( \varepsilon\, |\mathrm{supp}(s_n)|^{-1}\, \|s_n\|^{-1}_{L_\infty(G)} \right)^p. \]

We have then

\[ \|f - g\|_{L_p(G)} \le \|f - s_n\|_{L_p(G)} + \|s_n - g\|_{L_p(G)} < \varepsilon/2 + \int_{\mathrm{supp}(s_n)} |s_n(x) - g(x)|\,dx < \varepsilon, \]

and the result follows. [#]

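Lemma 2.12 Given a domain $G \subset \mathbb{R}^d$, a mollifying kernel $P$ and a function $f \in M(\mathbb{R}^d,|\cdot|)$ such that $f(x) = 0$ for $x \notin G$, the following holds:
(i) If $f \in L_1^{\mathrm{loc}}(G)$, then $P_\varepsilon * f \in C^\infty(\mathbb{R}^d)$ for all $\varepsilon > 0$.
(ii) If also $\mathrm{supp}(f)$ is compact, then $P_\varepsilon * f \in C_0^\infty(G)$ for all $0 < \varepsilon < \mathrm{dist}(\mathrm{supp}(f),\partial G)$.
(iii) If $f \in L_p(G)$ for some $1 \le p < \infty$, then $P_\varepsilon * f \in L_p(G)$; moreover,

\[ \lim_{\varepsilon \to 0^+} \|P_\varepsilon * f - f\|_{L_p(G)} = 0. \]

(iv) If $f \in C(G)$ and $K \subset G$ is compact, then $\lim_{\varepsilon \to 0^+} P_\varepsilon * f = f$ uniformly on $K$.
(v) If $f \in C(G)$, then $\lim_{\varepsilon \to 0^+} P_\varepsilon * f = f$ uniformly on $G$.

Theorem 12 Given a domain $G \subset \mathbb{R}^d$, a compact subset $K \subset G$ and $0 < \delta < \mathrm{dist}(K,\partial G)$, there is a plateau function $f_\delta \in C_0^\infty(G)$ such that $0 \le f_\delta(x) \le 1$ for all $x \in G$, and $f_\delta(x) = 1$ for all $x \in K_\delta = \bigcup_{y \in K} B(y,\delta)$.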

Proof. Let $P : \mathbb{R}^d \to \mathbb{R}$ be a mollifying kernel, and consider $f_\delta = P_\delta * \chi_{K_{3\delta/2}}$, the mollification of $\chi_{K_{3\delta/2}}$ with $P$. This is the function we are looking for. [#]

Theorem 13 Given a domain $G \subset \mathbb{R}^d$ and $1 \le p < \infty$, $C_0^\infty(G)$ is dense in $L_p(G)$.

Proof. This is a direct consequence of Theorem 11 and parts (ii) and (v) of the previous Lemma. [#]

Theorem 14 Given an arbitrary subset $A \subset \mathbb{R}^d$ and an open cover $O$ of this set, there exists a collection of functions $\Psi$ in $C_0^\infty(\mathbb{R}^d)$ with the following properties:
(i) $0 \le \psi(x) \le 1$ for all $\psi \in \Psi$ and all $x \in \mathbb{R}^d$.
(ii) Given a compact subset $K \subset A$, all but possibly finitely many $\psi \in \Psi$ vanish identically on $K$.
(iii) Given $\psi \in \Psi$, there exists $U \in O$ such that $\mathrm{supp}(\psi) \subset U$.
(iv) $\sum_{\psi \in \Psi} \psi(x) = 1$ for all $x \in A$.

2.6.2 Distributions and Weak Derivatives

Definition 15 Given a domain $G \subset \mathbb{R}^d$, consider the space $D(G)$ consisting of the functions in $C_0^\infty(G)$, with the following notion of convergence: a sequence $(g_n)_n$ in $C_0^\infty(G)$ converges to $g \in C_0^\infty(G)$ if there exists a compact set $K \subset G$ such that $\mathrm{supp}(g - g_n) \subset K$ for all $n$, and $\lim_n D^k g_n(x) = D^k g(x)$ uniformly on $K$ for each multi-index $k$.
The dual space $D'(G)$ is called the space of (Schwartz) distributions when it is given the weak-star topology as dual of $D(G)$: $\lim_n T_n = T$ in $D'(G)$ if and only if $\lim_n T_n(g) = T(g)$ in $\mathbb{R}$ for every $g \in D(G)$.

Remark. The space $L_1^{\mathrm{loc}}(G)$ can be identified with a subspace of $D'(G)$ as follows: given $f \in L_1^{\mathrm{loc}}(G)$, let $T_f : C_0^\infty(G) \ni g \mapsto \int_G f(x) g(x)\,dx \in \mathbb{R}$. These functionals are trivially linear. They are also continuous: given a sequence $(g_n)_n$ in $C_0^\infty(G)$ for which there exists a compact $K \subset G$ with $\mathrm{supp}(g - g_n) \subset K$ for all $n$, and $\lim_n g_n(x) = g(x)$ uniformly on $K$, we have

\[ |T_f(g_n) - T_f(g)| \le \sup_{x \in K} |g(x) - g_n(x)| \int_K |f(x)|\,dx, \]

and the continuity holds by virtue of the uniform convergence of $(g_n)_n$. The identification is possible via Theorem 13: if $T_{f_1} = T_{f_2}$, then $\int_G (f_1 - f_2)\, g\,dx = 0$ for all $g \in C_0^\infty(G)$; the density gives $\int_G (f_1 - f_2)\, g\,dx = 0$ for all $g \in L_p(G)$, and therefore $f_1 - f_2 = 0$ a.e. This means that the map $L_1^{\mathrm{loc}}(G) \to D'(G)$ that gives the identification is an injection.

Definition 16 Given $G \subset \mathbb{R}^d$, a multi-index $k \in \mathbb{N}^d$ and a distribution $T \in D'(G)$, we define its distributional $k$-th derivative $D^k T$ by

\[ (D^k T)(g) = (-1)^{|k|}\, T(D^k g) \quad \text{for all } g \in C_0^\infty(G). \]

Similarly, for $f \in L_1^{\mathrm{loc}}(G)$ and $k \in \mathbb{N}^d$, we say that $\tilde f \in L_1^{\mathrm{loc}}(G)$ is a weak $k$-th derivative of $f$ if $T_{\tilde f}$ is the distributional $k$-th derivative of $T_f$. This weak derivative might not exist but, in case it does, it is unique a.e.; we denote it $D^k_w f$.
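
As an illustration of the definition, the following sketch checks numerically that $\mathrm{sign}(x)$ is a weak first derivative of $|x|$ on the real line, that is, $\int |x|\,g'(x)\,dx = -\int \mathrm{sign}(x)\,g(x)\,dx$ for one particular test function $g$. The test function `phi` (a bump centred at the hypothetical value `SHIFT = 0.3`), the midpoint-rule helper `integrate` and the name `pairing_check` are assumptions made only for this example.

```python
import math

SHIFT = 0.3  # centre of the test function; chosen so that it is not even

def phi(x):
    """Smooth test function supported in (SHIFT - 1, SHIFT + 1)."""
    u = x - SHIFT
    if abs(u) >= 1.0:
        return 0.0
    return math.exp(1.0 / (u * u - 1.0))

def dphi(x):
    """Classical derivative of phi."""
    u = x - SHIFT
    if abs(u) >= 1.0:
        return 0.0
    return phi(x) * (-2.0 * u) / (u * u - 1.0) ** 2

def sign(x):
    return (x > 0) - (x < 0)

def integrate(f, a, b, n=20000):
    """Composite midpoint rule on [a, b] with n cells."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def pairing_check(a=SHIFT - 1.0, b=SHIFT + 1.0):
    """Return (T_f(g'), -T_{f~}(g)) for f = |x|, f~ = sign, g = phi."""
    # Split at the kink x = 0 so that no quadrature cell straddles it.
    lhs = (integrate(lambda x: abs(x) * dphi(x), a, 0.0)
           + integrate(lambda x: abs(x) * dphi(x), 0.0, b))
    rhs = -(integrate(lambda x: sign(x) * phi(x), a, 0.0)
            + integrate(lambda x: sign(x) * phi(x), 0.0, b))
    return lhs, rhs
```

Both returned numbers agree up to quadrature error, which is what the identity $T_{\tilde f}(g) = -T_f(g')$ requires for $|k| = 1$.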

2.6.3 Sobolev Spaces Wpr(G) and Hpr(G)

Definition 17 Given a domain $G \subset \mathbb{R}^d$, and $r \in \mathbb{N} \cup \{0\}$, we define the following functionals:

\[ |f|_{W^r_p(G)} = \Bigl\{ \sum_{|k|=r} \|D^k f\|^p_{L_p(G)} \Bigr\}^{1/p} \quad (\text{for } 1 \le p < \infty) \tag{2.8} \]
\[ |f|_{W^r_\infty(G)} = \max_{|k|=r} \|D^k f\|_{L_\infty(G)} \tag{2.9} \]
\[ \|f\|_{W^r_p(G)} = \Bigl\{ \sum_{0 \le j \le r} |f|^p_{W^j_p(G)} \Bigr\}^{1/p} \quad (\text{for } 1 \le p < \infty) \tag{2.10} \]
\[ \|f\|_{W^r_\infty(G)} = \max_{0 \le |k| \le r} \|D^k f\|_{L_\infty(G)} \tag{2.11} \]

Functionals (2.8) and (2.9) are trivially seminorms, and (2.10), (2.11) are norms. Associated to these functionals, we define the following spaces:

Remark. We have trivially $W^0_p(G) = L_p(G)$ for $1 \le p \le \infty$, and $H^0_p(G) = L_p(G)$ for $1 \le p < \infty$ (by Theorem 13). Notice also the chain of (continuous) embeddings for all $r \in \mathbb{N}$:

\[ H^r_p(G) \subset W^r_p(G) \subset L_p(G). \]

We will prove that $H^r_p(G) = W^r_p(G)$ for $1 \le p < \infty$, and $H^r_\infty(G) \subsetneq W^r_\infty(G)$.

Theorem 15 $W^r_p(G)$ is a Banach space for all $G \subset \mathbb{R}^d$, $1 \le p \le \infty$ and $r \in \mathbb{N} \cup \{0\}$.

Proof. Let $(f_n)_n$ be a Cauchy sequence in $W^r_p(G) \subset L_p(G)$; then trivially $(D^k f_n)_n$ are Cauchy sequences in $L_p(G)$ for every multi-index $k$ with $0 \le |k| \le r$. Let $f, f^{(k)} \in L_p(G)$ be such that $\lim_n f_n = f$ and $\lim_n D^k f_n = f^{(k)}$, both in $L_p(G)$. As $L_p(G) \subset L_1^{\mathrm{loc}}(G)$, each of those functions determines a distribution $T_f, T_{f^{(k)}} \in D'(G)$. For any $g \in D(G)$, we have then (let $q$ be the conjugate exponent of $p$):

\[ |T_{f_n}(g) - T_f(g)| \le \int_G |f_n(x) - f(x)|\,|g(x)|\,dx \le \|g\|_{L_q(G)} \|f_n - f\|_{L_p(G)}; \]

therefore, $\lim_n T_{f_n}(g) = T_f(g)$ and similarly, $\lim_n T_{D^k f_n}(g) = T_{f^{(k)}}(g)$ for all $g \in D(G)$. It follows that

\[ T_{f^{(k)}}(g) = \lim_n T_{D^k f_n}(g) = \lim_n (-1)^{|k|}\, T_{f_n}(D^k g) = (-1)^{|k|}\, T_f(D^k g), \]

and hence $f^{(k)} = D^k f$ in the distributional sense. The statement follows. [#]

Lemma 2.13 Given domains $G' \subset G \subset \mathbb{R}^d$ such that the closure of $G'$ in $G$ is compact, $1 \le p < \infty$, $r \in \mathbb{N}$, a mollifying kernel $P \in C_0^\infty(\mathbb{R}^d)$, and $f \in W^r_p(G)$, we have $\lim_{\varepsilon \to 0^+} P_\varepsilon * f = f$ in $W^r_p(G')$.

Theorem 16 (Meyers, Serrin) Given a domain $G \subset \mathbb{R}^d$, $1 \le p < \infty$, and $r \in \mathbb{N} \cup \{0\}$, we have $H^r_p(G) = W^r_p(G)$.

2.6.4 Properties of Sobolev Spaces in one dimension

For functions of one variable, ordinary and generalized derivatives produce the same space. This is proved in the following two results:

Lemma 2.14 Let $A \subset \mathbb{R}$ be an open interval, and $r \in \mathbb{N} \cup \{0\}$. If $f \in L_1^{\mathrm{loc}}(A)$ satisfies $\int_A f g^{(r)}\,dx = 0$ for all $g \in C_0^\infty(\mathbb{R})$, then $f$ is a.e. a polynomial of order $r$.

Proposition 9 If $f \in L_1^{\mathrm{loc}}(A)$ has a weak $r$-th derivative $g \in L_1^{\mathrm{loc}}(A)$, then $f$ can be redefined on a set of measure zero so that $f^{(r-1)}$ is absolutely continuous, and $f^{(r)} = g$ a.e. on $A$.

Remark. In §2.7.2 we will make use of the K-functional of the compatible couples $(L_p(G),W^r_p(G))$ for $G \subset \mathbb{R}^d$. We can use the previous results to illustrate how to compute it in one dimension. The proof is based on the availability of the Taylor polynomial for functions in Sobolev Spaces: for each $f \in W^r_p(A)$, consider the Taylor polynomial centered at $c \in A$,

\[ T_{f;c,r}(x) = \sum_{k=0}^{r-1} \frac{f^{(k)}(c)}{k!} (x - c)^k, \]

with error

\[ f(x) - T_{f;c,r}(x) = \int_c^x f^{(r)}(t)\, \frac{(x - t)^{r-1}}{(r-1)!}\,dt. \]
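
The error formula can be checked numerically for a smooth function. The sketch below is only an illustration under stated assumptions: the helper names `taylor_poly` and `integral_remainder` are hypothetical, the integral is approximated by a midpoint rule, and the sample function in the usage note is $f = \exp$, whose derivatives of every order are again $\exp$.

```python
import math

def taylor_poly(derivs_at_c, c, x):
    """T_{f;c,r}(x) = sum_{k<r} f^{(k)}(c) (x - c)^k / k!.

    derivs_at_c lists f(c), f'(c), ..., f^{(r-1)}(c).
    """
    return sum(d * (x - c) ** k / math.factorial(k)
               for k, d in enumerate(derivs_at_c))

def integral_remainder(f_r, c, x, r, n=20000):
    """Midpoint rule for int_c^x f^{(r)}(t) (x - t)^{r-1} / (r-1)! dt.

    The step h is negative when x < c, which gives the oriented integral.
    """
    h = (x - c) / n
    total = 0.0
    for i in range(n):
        t = c + (i + 0.5) * h
        total += f_r(t) * (x - t) ** (r - 1)
    return h * total / math.factorial(r - 1)
```

For instance, with $r = 4$, $c = 0.2$ and $x = 1.1$, the difference `math.exp(1.1) - taylor_poly([math.exp(0.2)] * 4, 0.2, 1.1)` agrees with `integral_remainder(math.exp, 0.2, 1.1, 4)` up to quadrature error.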

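Lemma 2.15 Given an open interval $A \subset \mathbb{R}$ and $1 \le p,q \le \infty$, we have the estimate

\[ \|f - T_{f;c,r}\|_{L_q(A)} \le \frac{|A|^{r - 1/p + 1/q}}{(r-1)!}\, \|f^{(r)}\|_{L_p(A)}. \]

Proof. Let $p'$ be the conjugate exponent of $p$; let’s apply Hölder’s Inequality:

\[
\begin{aligned}
|f(x) - T_{f;c,r}(x)|
 &\le \left| \int_c^x |f^{(r)}(t)|\, \frac{|x - t|^{r-1}}{(r-1)!}\,dt \right| \\
 &\le \frac{|A|^{r-1}}{(r-1)!} \int_A |f^{(r)}(t)|\,dt \\
 &\le \frac{|A|^{r-1}}{(r-1)!}\, \|f^{(r)}\|_{L_p(A)}\, \|\chi_A\|_{L_{p'}(A)} \\
 &= \frac{|A|^{r-1/p}}{(r-1)!}\, \|f^{(r)}\|_{L_p(A)};
\end{aligned}
\]

therefore, for each $1 \le q < \infty$,

\[
\begin{aligned}
\|f - T_{f;c,r}\|^q_{L_q(A)}
 &= \int_A |f(x) - T_{f;c,r}(x)|^q\,dx \\
 &\le \int_A \left( \frac{|A|^{r-1/p}}{(r-1)!}\, \|f^{(r)}\|_{L_p(A)} \right)^q dx \\
 &= \left( \frac{|A|^{r-1/p}}{(r-1)!}\, \|f^{(r)}\|_{L_p(A)} \right)^q |A|;
\end{aligned}
\]

hence proving the desired statement. [#]
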
Theorem 17 For $r \ge 2$, $1 \le p \le \infty$, there is a constant $C > 0$ depending at most on $r$, such that for all $f \in W^r_p(A)$, $0 < t \le |A|$ and $0 \le k \le r$,

\[ t^k \|f^{(k)}\|_{L_p(A)} \le C \left( \|f\|_{L_p(A)} + t^r \|f^{(r)}\|_{L_p(A)} \right). \]

Proof. Given $0 < t < |A|$, we have for all $x \in A_t = \{x \in A \mid x + t \in A\}$,

\[ f(x + t) = \sum_{k=0}^{r-1} \frac{f^{(k)}(x)}{k!}\, t^k + \underbrace{\int_x^{x+t} f^{(r)}(s)\, \frac{(x + t - s)^{r-1}}{(r-1)!}\,ds}_{R_r(x,t)}. \]

Notice that

\[
\begin{aligned}
|R_r(x,t)|
 &\le \frac{t^{r-1}}{(r-1)!} \int_A |f^{(r)}(s)|\, \chi_{(x,x+t)}(s)\,ds \\
 &\le \frac{t^{r-1}}{(r-1)!}\, \|f^{(r)}\|_{L_p(A)}\, t^{1-1/p} \\
 &= \frac{t^{r-1/p}}{(r-1)!}\, \|f^{(r)}\|_{L_p(A)};
\end{aligned}
\]

and therefore,

\[ \|R_r(\cdot,t)\|_{L_p(A_t)} \le \frac{t^{r-1/p}\, |A_t|^{1/p}}{(r-1)!}\, \|f^{(r)}\|_{L_p(A)} \le \frac{t^r}{(r-1)!}\, \|f^{(r)}\|_{L_p(A)} \tag{2.12} \]

since $|A_t| \le t$ trivially.
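Consider now, for our choice of $0 < t < |A|$, points $x \in A$ and values $c > 0$ such that $x + ct \in A$. In this case, we have

\[ f(x + ct) = \sum_{k=0}^{r-1} \frac{f^{(k)}(x)}{k!}\, c^k t^k + R_r(x,ct). \]

Choose $q > 1$ small, and $r - 1$ different values $1 < c_1 < \dots < c_{r-1} < q$. Construct the following system of linear equations:

\[
\begin{pmatrix} c_1 & \cdots & c_1^{r-1} \\ \vdots & \ddots & \vdots \\ c_{r-1} & \cdots & c_{r-1}^{r-1} \end{pmatrix}
\begin{pmatrix} a_1 \\ \vdots \\ a_{r-1} \end{pmatrix}
=
\begin{pmatrix} b_1 \\ \vdots \\ b_{r-1} \end{pmatrix},
\]

where $a_k = t^k f^{(k)}(x)/k!$, and $b_k = f(x + c_k t) - f(x) - R_r(x,c_k t)$. The first matrix is a Vandermonde matrix; hence this system has a solution:

\[
\begin{pmatrix} a_1 \\ \vdots \\ a_{r-1} \end{pmatrix}
=
\begin{pmatrix} c_1 & \cdots & c_1^{r-1} \\ \vdots & \ddots & \vdots \\ c_{r-1} & \cdots & c_{r-1}^{r-1} \end{pmatrix}^{-1}
\begin{pmatrix} b_1 \\ \vdots \\ b_{r-1} \end{pmatrix}
=
\begin{pmatrix} g_{1,1} & \cdots & g_{1,r-1} \\ \vdots & \ddots & \vdots \\ g_{r-1,1} & \cdots & g_{r-1,r-1} \end{pmatrix}
\begin{pmatrix} b_1 \\ \vdots \\ b_{r-1} \end{pmatrix},
\]

where the values $g_{i,j}$ are controlled by the values $c_k$. We have then $a_k = \sum_{j=1}^{r-1} g_{k,j}\, b_j$ for all $k = 1,\dots,r-1$.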
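This leads to

\[
\begin{aligned}
\frac{t^k}{k!}\, \|f^{(k)}\|_{L_p(A)}
 &= \Bigl\| \sum_{j=1}^{r-1} g_{k,j} \bigl( f(x + c_j t) - f(x) - R_r(x,c_j t) \bigr) \Bigr\|_{L_p(A)} \\
 &\le \sum_{j=1}^{r-1} |g_{k,j}| \bigl( 2\|f\|_{L_p(A)} + \|R_r\|_{L_p(A)} \bigr) \\
 &\le 2 \sum_{j=1}^{r-1} |g_{k,j}|\, \|f\|_{L_p(A)} + \sum_{j=1}^{r-1} |g_{k,j}|\, \frac{t^r}{(r-1)!}\, \|f^{(r)}\|_{L_p(A)}
   \quad \text{(use estimate (2.12) above)},
\end{aligned}
\]

and the statement follows.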
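
The linear-algebra step of the proof can be tried on a case where the remainder vanishes: for a polynomial $f$ of degree at most $r - 1$ we have $R_r \equiv 0$, so solving the system recovers the scaled derivatives $a_k = t^k f^{(k)}(x)/k!$ exactly. The sketch below uses exact rational arithmetic; the helper names `solve_linear` and `scaled_derivatives` are hypothetical and are not taken from the text.

```python
from fractions import Fraction

def solve_linear(matrix, rhs):
    """Gauss-Jordan elimination over the rationals (matrix assumed invertible)."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a nonzero pivot and move it into place.
        pivot = next(i for i in range(col, n) if aug[i][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        piv = aug[col][col]
        aug[col] = [v / piv for v in aug[col]]
        # Eliminate the column from every other row.
        for i in range(n):
            if i != col and aug[i][col] != 0:
                factor = aug[i][col]
                aug[i] = [vi - factor * vc for vi, vc in zip(aug[i], aug[col])]
    return [aug[i][n] for i in range(n)]

def scaled_derivatives(f, x, t, cs):
    """Solve sum_k c_j^k a_k = f(x + c_j t) - f(x), k = 1..r-1, for the a_k.

    Assumes the remainder R_r vanishes, e.g. f is a polynomial of degree < r.
    """
    matrix = [[c ** k for k in range(1, len(cs) + 1)] for c in cs]
    rhs = [f(x + c * t) - f(x) for c in cs]
    return solve_linear(matrix, rhs)
```

For the cubic $f(y) = 2y^3 - y^2 + 5y - 7$ (so $r = 4$) and three distinct rational values $c_1 < c_2 < c_3$, the returned list equals $[\,t f'(x),\ t^2 f''(x)/2,\ t^3 f'''(x)/6\,]$.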

2.7 Modulus of smoothness and Besov Spaces

2.7.1 Definitions and Properties

Consider the difference operators: for each $h \in \mathbb{R}^d$ and each measurable function $f \in M(\Omega,|\cdot|)$ on a subset $\Omega \subset \mathbb{R}^d$, let $\Delta_h(f,\cdot) = f(\cdot + h) - f(\cdot)$, and $\Delta_h^r = \Delta_h(\Delta_h^{r-1})$ for $r > 1$. It follows from the binomial theorem that

\[ \Delta_h^r(f,x) = \sum_{k=0}^{r} (-1)^{r-k} \binom{r}{k} f(x + kh) \tag{2.13} \]

for all $x \in \Omega(rh) = \{x \in \Omega \mid x + kh \in \Omega \text{ for all } 1 \le k \le r\}$.
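
Formula (2.13) can be compared with the recursive definition on a small example. This is a minimal sketch in one variable with hypothetical helper names `forward_difference` and `binomial_difference`; integer inputs keep the arithmetic exact.

```python
from math import comb

def forward_difference(f, x, h, r):
    """Delta_h^r(f, x) by the recursion Delta_h^r = Delta_h(Delta_h^{r-1})."""
    if r == 0:
        return f(x)
    return (forward_difference(f, x + h, h, r - 1)
            - forward_difference(f, x, h, r - 1))

def binomial_difference(f, x, h, r):
    """Delta_h^r(f, x) via the binomial formula (2.13)."""
    return sum((-1) ** (r - k) * comb(r, k) * f(x + k * h)
               for k in range(r + 1))
```

For $f(y) = y^3$ both functions give $\Delta_h^3(f,x) = 3!\,h^3$ independently of $x$, and $\Delta_h^4(f,x) = 0$, in agreement with the fact that $r$-th differences annihilate polynomials of degree less than $r$.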

Definition 18 Given a rearrangement-invariant space $(X,\|\cdot\|_X)$ over the space $(\Omega,|\cdot|)$, we define the $r$-th modulus of smoothness of $f \in X$ by

\[ \omega_r(f,t)_X = \sup_{0 < |h| \le t} \left\| \Delta_h^r(f,\cdot)\, \chi_{\Omega(rh)} \right\|_X \quad \text{for all } t > 0. \]
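
To get a feeling for this functional, the following sketch approximates $\omega_r(f,t)_\infty$ on the interval $[a,b] = [0,1]$ by sampling steps $h$ and points $x$ on a grid. It is an illustration only: the names `diff` and `modulus_sup` and the grid sizes are hypothetical, only positive steps are sampled (in one variable $\Delta_{-h}^r(f,x) = (-1)^r \Delta_h^r(f,x - rh)$, so negative steps give the same supremum), and a grid maximum is in general only a lower bound for the true supremum.

```python
from math import comb

def diff(f, x, h, r):
    """Delta_h^r(f, x) via the binomial formula (2.13)."""
    return sum((-1) ** (r - k) * comb(r, k) * f(x + k * h)
               for k in range(r + 1))

def modulus_sup(f, t, r, a=0.0, b=1.0, nh=50, nx=200):
    """Grid approximation of omega_r(f, t) in the sup norm on [a, b]."""
    best = 0.0
    for i in range(1, nh + 1):
        h = t * i / nh
        if r * h > b - a:
            break  # Omega(rh) is empty for larger steps
        for j in range(nx + 1):
            # x ranges over Omega(rh) = [a, b - r h]
            x = a + (b - a - r * h) * j / nx
            best = max(best, abs(diff(f, x, h, r)))
    return best
```

For $f(x) = x^2$ this gives $\omega_2(f,t)_\infty = 2t^2$ and $\omega_3(f,t)_\infty = 0$ up to rounding, while for the kink $f(x) = |x - 1/2|$ it gives $\omega_2(f,t)_\infty = 2t$: the modulus detects the lower smoothness through a slower decay in $t$.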

The general setup is fairly complicated, and many different properties are to be taken into account in order to produce any general result on these functionals. We will focus on the spaces we are going to use in this survey: $L_p(\Omega)$ for $0 < p < \infty$, and $C(\Omega)$ for $p = \infty$, where $\Omega \subset \mathbb{R}^d$ is a compact cube.

Lemma 2.16 For any $t > 0$, the modulus of smoothness is a seminorm for $1 \le p \le \infty$ and a quasi-seminorm for $0 < p < 1$.

Proof. Notice that $\omega_r(cf,t)_p = |c|\,\omega_r(f,t)_p$, and $\omega_r(0,t)_p = 0$ trivially for all $0 < p \le \infty$. As for the (quasi)triangular inequality, we have also trivially

\[ \omega_r(f + g,t)_p \le C \left( \omega_r(f,t)_p + \omega_r(g,t)_p \right), \]

with the same constant as in the (quasi)triangular inequality in $L_p(\Omega(rh))$. The kernel of the $r$-th modulus of smoothness is precisely the set of polynomials $\Pi(r)$ of coordinate order $r$.

Also, from the fact that $\|f + g\|^p_{L_p} \le \|f\|^p_{L_p} + \|g\|^p_{L_p}$ for all $0 < p < 1$, we obtain similarly $\omega_r(f + g,t)_p^p \le \omega_r(f,t)_p^p + \omega_r(g,t)_p^p$. [#]

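Lemma 2.17 (Properties of the modulus of smoothness in $L_p$) Given $f \in L_p(\Omega)$, $t > 0$, the following estimates hold:

\[ \omega_r(f,t)_p^{\min(1,p)} \le 2^{r-k}\, \omega_k(f,t)_p^{\min(1,p)} \quad \text{for all } 0 \le k \le r \tag{2.14} \]
\[ \omega_r(f,nt)_p^{\min(1,p)} \le n^r\, \omega_r(f,t)_p^{\min(1,p)} \quad \text{for all } n \in \mathbb{N} \tag{2.15} \]
\[ \omega_r(f,ct)_p^{\min(1,p)} \le (c + 1)^r\, \omega_r(f,t)_p^{\min(1,p)} \quad \text{for all } c > 0 \tag{2.16} \]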

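Proof. The first estimate can be proved directly from the identity $\Delta_h^r(f,x) = \Delta_h^{r-1}(f,x + h) - \Delta_h^{r-1}(f,x)$, which follows easily from (2.13). That gives $\omega_r(f,t)_p^{\min(1,p)} \le 2\,\omega_{r-1}(f,t)_p^{\min(1,p)}$, and from this the statement follows.

As for the second estimate, notice first that

\[ \omega_r(f,nt)_p = \sup_{0 < |h| \le nt} \|\Delta_h^r(f,\cdot)\|_{L_p(\Omega(rh))} = \sup_{0 < |h| \le t} \|\Delta_{nh}^r(f,\cdot)\|_{L_p(\Omega(rnh))}. \]

We use next the following expansion for $\Delta_{nh}^r$:

\[ \Delta_{nh}^r(f,x) = \sum_{k_1=0}^{n-1} \cdots \sum_{k_r=0}^{n-1} \Delta_h^r(f,\, x + k_1 h + \dots + k_r h). \]

The sum has $n^r$ terms, each bounded in the $L_p$ (quasi)norm by $\omega_r(f,t)_p$; raising to the power $\min(1,p)$ and applying the triangle inequality for $\|\cdot\|^{\min(1,p)}_{L_p}$ gives (2.15). The third estimate is a direct consequence of the second. [#]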
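
The expansion of $\Delta_{nh}^r$ used in the proof is a purely algebraic identity, so it can be verified exactly on integer data. The sketch below works in one variable; the names `diff` and `expanded_diff` are hypothetical.

```python
from itertools import product
from math import comb

def diff(f, x, h, r):
    """Delta_h^r(f, x) via the binomial formula (2.13)."""
    return sum((-1) ** (r - k) * comb(r, k) * f(x + k * h)
               for k in range(r + 1))

def expanded_diff(f, x, h, r, n):
    """Sum over k_1..k_r in {0,...,n-1} of Delta_h^r(f, x + (k_1+...+k_r) h)."""
    return sum(diff(f, x + sum(ks) * h, h, r)
               for ks in product(range(n), repeat=r))
```

For any function `f` on the integers, `diff(f, x, n * h, r)` and `expanded_diff(f, x, h, r, n)` coincide; the right-hand side has exactly $n^r$ summands, which is where the factor $n^r$ in (2.15) comes from.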

Weak inverses to the first estimate in the previous lemma are offered by Marchaud and Timan. A proof of the following result, that is known as Marchaud’s Inequalities, can be read in chapter 2 of [DeLo].

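Theorem 18 (Marchaud (1927), Timan (1958)) Given $r \ge 2$ and $f \in L_p(\Omega)$, we have the following estimates for all $1 \le k < r$, $t > 0$ and $0 < p \le \infty$. If $\Omega$ is not bounded,

\[ \omega_k(f,t)_p^{\min(1,p)} \le C\, t^{k\min(1,p)} \int_t^{\infty} \frac{\omega_r(f,s)_p^{\min(1,p)}}{s^{k\min(1,p)+1}}\,ds; \tag{2.17} \]

if $\Omega$ is compact,

\[ \omega_k(f,t)_p^{\min(1,p)} \le C\, t^{k\min(1,p)} \left\{ \int_t^{|\Omega|} \frac{\omega_r(f,s)_p^{\min(1,p)}}{s^{k\min(1,p)+1}}\,ds + \left( \frac{\|f\|_{L_p(\Omega)}}{|\Omega|^k} \right)^{\min(1,p)} \right\}. \tag{2.18} \]

And for $1 < p < \infty$, $m = \min(p,2)$: if $\Omega$ is not bounded,

\[ \omega_k(f,t)_p \le C\, t^k \left( \int_t^{\infty} \frac{\omega_r(f,s)_p^m}{s^{km+1}}\,ds \right)^{1/m}; \tag{2.19} \]

if $\Omega$ is compact,

\[ \omega_k(f,t)_p \le C\, t^k \left( \int_t^{|\Omega|} \frac{\omega_r(f,s)_p^m}{s^{km+1}}\,ds + \frac{\|f\|^m_{L_p(\Omega)}}{|\Omega|^{km}} \right)^{1/m}. \tag{2.20} \]
[]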
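
Definition 19 Given $\Omega \subset \mathbb{R}^d$ and parameters $\alpha, q > 0$, we define the Besov functionals in $L_p(\Omega)$ by

\[ |\cdot|_{B^\alpha_q(L_p(\Omega))} = \left\{ \int_0^{\infty} \left( t^{-\alpha} \omega_r(\cdot,t)_p \right)^q \frac{dt}{t} \right\}^{1/q} \quad \text{for } 0 < q < \infty, \]
\[ |\cdot|_{B^\alpha_\infty(L_p(\Omega))} = \sup_{t > 0}\, t^{-\alpha} \omega_r(\cdot,t)_p \quad \text{for } q = \infty, \]

where $r$ is the smallest integer greater than $\alpha$. Associated to these functionals, we define the Besov spaces as

\[ B^\alpha_q(L_p(\Omega)) = \{ f \in L_p(\Omega) : |f|_{B^\alpha_q(L_p(\Omega))} < \infty \}. \]

Lemma 2.18 If $\Omega$ is compact, then the previous (quasi)seminorms are equivalent to their discrete counterparts:

\[ |\cdot|_{B^\alpha_q(L_p(\Omega))} \asymp \left( \sum_{k=0}^{\infty} \left[ 2^{k\alpha} \omega_r(\cdot,2^{-k})_p \right]^q \right)^{1/q}, \qquad |\cdot|_{B^\alpha_\infty(L_p(\Omega))} \asymp \sup_{k \ge 1}\, 2^{k\alpha} \omega_r(\cdot,2^{-k})_p. \]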

Sketch of the Proof. The proof is similar to the proof of part (ii) in Lemma 1.9 (page 21): we start by showing that the seminorm above is equivalent to the one obtained by replacing the integral (or the supremum) over $(0,\infty)$ with one over $(0,1)$, using the fact that, as $\Omega$ is compact, $\omega_r(f,t)_p \le \omega_r(f,|\Omega|)_p$ for all $t \ge |\Omega|$. After that, the latter integral is discretized with the partition $\{2^{-n} \mid n \in \mathbb{N}\}$, using the fact that $\omega_r(f,\cdot)_p$ is a nondecreasing function. []
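
The discretization step rests on an elementary sandwich: if $\omega$ is nondecreasing, then on each dyadic block $[2^{-(k+1)},2^{-k}]$ we have $2^{k\alpha} \le t^{-\alpha} \le 2^{(k+1)\alpha}$ and $\omega(2^{-(k+1)}) \le \omega(t) \le \omega(2^{-k})$, while $\int dt/t = \ln 2$ over the block. The sketch below checks the resulting bounds numerically; the helper names `block_integral` and `block_bounds` and the sample profile $\omega(t) = t^{3/2}$ mentioned in the usage note are assumptions made only for illustration.

```python
import math

def block_integral(omega, alpha, q, k, n=4000):
    """Midpoint rule for int_{2^-(k+1)}^{2^-k} (t^-alpha omega(t))^q dt/t."""
    a, b = 2.0 ** -(k + 1), 2.0 ** -k
    h = (b - a) / n
    return h * sum((t ** -alpha * omega(t)) ** q / t
                   for t in (a + (i + 0.5) * h for i in range(n)))

def block_bounds(omega, alpha, q, k):
    """Lower and upper bounds for the block integral, for nondecreasing omega."""
    lower = math.log(2) * (2.0 ** (k * alpha) * omega(2.0 ** -(k + 1))) ** q
    upper = math.log(2) * (2.0 ** ((k + 1) * alpha) * omega(2.0 ** -k)) ** q
    return lower, upper
```

With `omega = lambda t: t ** 1.5`, $\alpha = 0.7$ and $q = 2$, every block integral lies between its two bounds; summing over $k$ compares the integral over $(0,1)$ with dyadic sums of the terms $[2^{k\alpha}\omega(2^{-k})]^q$, up to constants depending on $\alpha$ and $q$.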

Remark. Unfortunately, the moduli of smoothness are not always suitable for applications because it is not easy to add up several such estimates over different intervals. New related (and equivalent) moduli of smoothness can be constructed by averaging:

Definition 20 We define the $r$-th averaged modulus of smoothness on the subcube $I \subset \Omega$, for $f \in L_p(\Omega)$ and $t > 0$, by

\[ w_r(f,t;I)_p = \left( \frac{1}{t^d} \int_{[-t,t]^d} \int_{I(rh)} |\Delta_h^r(f,x)|^p\,dx\,dh \right)^{1/p}. \]

Remark. Notice that, for $I,J \subset \Omega$, we have $(I \cup J)(h) \supseteq I(h) \cup J(h)$ for all suitable $h \in \mathbb{R}^d$; therefore, $w_r(f,t;I \cup J)_p^p \ge w_r(f,t;I)_p^p + w_r(f,t;J)_p^p$. We will prove now the equivalence with the moduli of smoothness.

Lemma 2.19 For all $f \in L_p(\Omega)$ and suitable $s \in \mathbb{R}^d$, the following holds for all $x \in \mathbb{R}^d$:

\[ \Delta_h^r(f,x) = \sum_{k=1}^{r} (-1)^k \binom{r}{k} \left[ \Delta_{ks}^r(f,x + kh) - \Delta_{h+ks}^r(f,x) \right]. \]

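Proof. Notice that

\[
\begin{aligned}
\sum_{k=0}^{r} (-1)^{r+k} \binom{r}{k} \Delta_{h+ks}^r(f,x)
 &= \sum_{k=0}^{r} (-1)^{r+k} \binom{r}{k} \sum_{j=0}^{r} (-1)^{r+j} \binom{r}{j} f(x + j(h + ks)) \\
 &= \sum_{j=0}^{r} (-1)^{r+j} \binom{r}{j} \sum_{k=0}^{r} (-1)^{r+k} \binom{r}{k} f(x + jh + ksj) \\
 &= \sum_{k=0}^{r} (-1)^{k} \binom{r}{k} f(x) + \sum_{j=1}^{r} (-1)^{r+j} \binom{r}{j} \Delta_{js}^r(f,x + jh) \\
 &= \sum_{j=1}^{r} (-1)^{r+j} \binom{r}{j} \Delta_{js}^r(f,x + jh);
\end{aligned}
\]

therefore,

\[ (-1)^r \Delta_h^r(f,x) = \sum_{j=1}^{r} (-1)^{r+j} \binom{r}{j} \Delta_{js}^r(f,x + jh) - \sum_{k=1}^{r} (-1)^{r+k} \binom{r}{k} \Delta_{h+ks}^r(f,x), \]

and the statement follows. [#]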
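
Multiplying the last display of the proof by $(-1)^r$ gives $\Delta_h^r(f,x) = \sum_{k=1}^{r} (-1)^k \binom{r}{k} [\Delta_{ks}^r(f,x+kh) - \Delta_{h+ks}^r(f,x)]$. Since this is an algebraic identity between point values of $f$, it can be checked exactly with integer data. The sketch below does so in one variable; the names `diff` and `lemma_rhs` are hypothetical.

```python
from math import comb

def diff(f, x, h, r):
    """Delta_h^r(f, x) via the binomial formula (2.13)."""
    return sum((-1) ** (r - k) * comb(r, k) * f(x + k * h)
               for k in range(r + 1))

def lemma_rhs(f, x, h, s, r):
    """sum_{k=1}^r (-1)^k C(r,k) [Delta_{ks}^r(f, x+kh) - Delta_{h+ks}^r(f, x)]."""
    return sum((-1) ** k * comb(r, k)
               * (diff(f, x + k * h, k * s, r) - diff(f, x, h + k * s, r))
               for k in range(1, r + 1))
```

For example, with $f(y) = y^2$, $x = 0$, $h = s = 1$ and $r = 2$, both `diff(f, 0, 1, 2)` and `lemma_rhs(f, 0, 1, 1, 2)` evaluate to 2. The point of the identity is that a difference with step $h$ is rewritten through differences with steps $ks$ and $h + ks$, which is what allows averaging over $s$.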
Proposition 10 For all $f \in L_p(\Omega)$, a subcube $I \subset \Omega$, $r > 0$ and any $0 < t < |I|(4r)^{-1}$, there exists a constant $C > 0$ which depends at most on $r$ such that

\[ w_r(f,t;I)_p \le \omega_r(f,t)_p \le C\, w_r(f,t;I)_p. \]

Proof. The left inequality is trivial:

\[ w_r(f,t;I)_p^p = t^{-d} \int_{[-t,t]^d} \|\Delta_h^r(f,\cdot)\|^p_{L_p(I(rh))}\,dh \le \omega_r(f,t)_p^p. \]
The proof of the right inequality is not included here. []

Remark. We will prove that the Besov spaces are precisely the ones we are looking for. The key is Whitney’s theorem.

2.7.2 Whitney’s Theorem

Theorem 19 (Johnen (1972)) $K(f,t^r;L_p(\Omega),W^r_p(\Omega)) \asymp \omega_r(f,t)_p$ for all $f \in L_p(\Omega)$ ($1 \le p \le \infty$), $t > 0$ and $r \in \mathbb{N}$.

Proof. The proof is involved and is not included here. []

Theorem 20 (Whitney (1957)) $E(f,\Pi(r))_p \asymp \omega_r(f,\ell_\Omega)_p$ for all $f \in L_p(\Omega)$ ($0 < p \le \infty$) and $r \in \mathbb{N}$, where $\ell_\Omega$ is the largest of the sides of $\Omega$.

Proof. For $p \ge 1$, let $g \in W^r_p(\Omega)$ be arbitrary, and $P \in \Pi(r)$ be the Taylor polynomial of $g$ associated to one of the points in the boundary of $\Omega$. By the Taylor estimates of §2.6.4 (compare Lemma 2.15 with $q = p$), we have

\[ \|g - P\|_{L_p(\Omega)} \le \frac{1}{(r-1)!}\, |\Omega|^r\, \|D^r g\|_{L_p(\Omega)}; \]

in particular,

\[ \|f - P\|_{L_p(\Omega)} \le C_{p,r} \left\{ \|f - g\|_{L_p(\Omega)} + |\Omega|^r \|D^r g\|_{L_p(\Omega)} \right\}. \]

This gives $E(f,\Pi(r))_p \le C_{p,r}\, K(f,|\Omega|^r;L_p(\Omega),W^r_p(\Omega))$, and application of Theorem 19 offers the left inequality in this case.

The right inequality is trivial for all $p$, since $\omega_r(f,t)_p = \omega_r(f - P,t)_p \le 2^r \omega_0(f - P,t)_p = 2^r \|f - P\|_{L_p(\Omega)}$ for all $P \in \Pi(r)$. [#]

2.8 Other Seminorms for Besov Spaces

Proposition 11 For any $0 < p \le \infty$, $r \in \mathbb{N}$ and $t > 0$, given $f \in L_p(\Omega)$, there exists $C > 0$ that depends at most on $p$, $r$ and $t$ such that the following estimate holds for all $n$:

\[ \|f - T_n f\|_{L_p(\Omega)} \le C\, \omega_r(f,2^{-n})_p. \]

Proof. For each dyadic subcube $\square_{j,n}$, denote by $t_{j,n} = t_n(f)|_{\square_{j,n}}$ the restriction of the piecewise polynomial of $t$ near-best approximation we get from the operator $t_n$. In that case,

\[
\begin{aligned}
\|f - T_n f\|_{L_p(\square_{j,n})}
 &\le C_p \left\{ \|f - t_{j,n}\|_{L_p(\square_{j,n})} + \|t_{j,n} - Q_n(t_{j,n})\|_{L_p(\square_{j,n})} \right\} \\
 &\le C_{p,t} \left\{ E(f,\Pi(r);\square_{j,n})_p + E(t_{j,n},\Pi(r);I_{j,r})_p \right\}
   \quad \text{(use estimate 2.7 from Proposition 6 in page 53)}.
\end{aligned}
\]

Now, for each of the subcubes $\square_{i,n} \subset I_{j,r}$, we have the estimate

\[
\begin{aligned}
E(t_{j,n},\Pi(r);\square_{i,n})_p
 &\le \|t_{j,n} - t_{i,n}\|_{L_p(\square_{i,n})} \\
 &\le C_p \left\{ \|t_{j,n} - f\|_{L_p(\square_{i,n})} + \|f - t_{i,n}\|_{L_p(\square_{i,n})} \right\} \\
 &\le C_{p,|I_{j,n}|,r} \left\{ \|t_{j,n} - f\|_{L_p(I_{j,r})} + \|f - t_{i,n}\|_{L_p(I_{j,r})} \right\}
   \quad \text{(use Lemma 2.4 in page 32)} \\
 &\le C_{p,|I_{j,n}|,n,t}\, E(f,\Pi(r);I_{j,r})_p;
\end{aligned}
\]

therefore,

\[ \|f - T_n f\|_{L_p(\square_{j,n})} \le C_{p,|I_{j,n}|,n,t}\, E(f,\Pi(r);I_{j,r})_p, \]

and from Whitney’s Theorem and Proposition 10 in page 72, setting $\ell = \max\{\ell_{I_{j,r}} \mid 1 - r \le j < 2^n\}$ and $c = 2^n \ell$, we have

\[
\begin{aligned}
\|f - T_n f\|^p_{L_p(\Omega)}
 &= \sum_{j=1-r}^{2^n - 1} \|f - T_n f\|^p_{L_p(\square_{j,n})} \\
 &\le C_{p,c,n,t} \sum_{j=1-r}^{2^n - 1} w_r(f,\ell_{I_{j,r}};I_{j,r})_p^p \\
 &\le C_{p,c,n,t} \sum_{j=1-r}^{2^n - 1} w_r(f,\ell;I_{j,r})_p^p.
\end{aligned}
\]

Notice that there is a finite number of cubes $I_{j,r}$, so the previous sum of averaged moduli of smoothness can be estimated by the averaged moduli of smoothness on the union:

\[
\|f - T_n f\|^p_{L_p(\Omega)} \le C_{p,c,n,t}\, w_r(f,\ell,\Omega)_p^p \le C_{p,c,n,t}\, w_r(f,2^{-n},\Omega)_p^p,
\]

and the statement follows. [#]

Corollary 11.1 For any $0 < p \le \infty$, $r \in \mathbb{N}$ and $t > 0$, given $f \in L_p(\Omega)$, there exists $C > 0$ that depends at most on $p$, $r$ and $t$ such that the following estimate holds for all $n$:

\[ E(f,X_n)_p \le C\, \omega_r(f,2^{-n})_p. \]
[]

Lemma 2.20 For any $0 < p \le \infty$ there exist $C_1,C_2 > 0$ which depend at most on $d$ and $p$, such that for all $S_n \in X_n$,

\[ C_1 \|S_n\|_{L_p(\Omega)} \le \left( \sum_{k=1-r}^{2^n - 1} 2^{-nd}\, \bigl| a^{[n]}_{k,r}(S_n) \bigr|^p \right)^{1/p} \le C_2 \|S_n\|_{L_p(\Omega)}. \]

Proof. The left hand side is immediate; given $x \in \Omega$, let $\Lambda_n(x) = \{k \in \mathbb{Z}^d \mid x \in \mathrm{supp}(N^{[n]}_{k,r})\}$ (notice that the cardinality of this set does not depend on $n$, but on $r$ and $d$):

\[
|S_n(x)|^p \le (2^{p-1})^{|\Lambda_n(x)| - 1} \sum_{k \in \Lambda_n(x)} \bigl| a^{[n]}_{k,r}(S_n) \bigr|^p\, \bigl| N^{[n]}_{k,r}(x) \bigr|^p
 \le C_{p,d,r} \sum_{k \in \Lambda_n(x)} \bigl| a^{[n]}_{k,r}(S_n) \bigr|^p\, \chi_{\mathrm{supp}(N^{[n]}_{k,r})}(x);
\]

hence,

\[
\int_\Omega |S_n(x)|^p\,dx \le C_{p,d,r} \sum_{k} \bigl| a^{[n]}_{k,r}(S_n) \bigr|^p\, \bigl| \mathrm{supp}(N^{[n]}_{k,r}) \bigr|
 \le C_{p,d,r}\, 2^{-nd} \sum_{k=1-r}^{2^n - 1} \bigl| a^{[n]}_{k,r}(S_n) \bigr|^p.
\]

As for the right inequality, we need to make use of Lemma 2.11 in page 53:

\[
2^{-nd} \sum_{k=1-r}^{2^n - 1} \bigl| a^{[n]}_{k,r}(S_n) \bigr|^p \le C\, 2^{-nd} \sum_{k=1-r}^{2^n - 1} 2^{nd}\, \|S_n\|^p_{L_p(J_{k,r})},
\]

and the statement follows. [#]

Proposition 12 For any $0 < p \le \infty$ and $r \in \mathbb{N}$, given $f \in L_p(\Omega)$, there exists $C > 0$ that depends at most on $p$, $d$ and $r$ such that the following estimate holds for all $n$:

\[ \omega_r(f,2^{-n})_p \le C\, 2^{-nr} \left( \|f\|^m_{L_p(\Omega)} + \sum_{k=1}^{n} \left[ 2^{kr} E(f,X_k)_p \right]^m \right)^{1/m}, \]

where $m = \min(1,p)$.

Proof. Let $S_n$ be an element of best $L_p(\Omega)$ approximation to $f$ from $X_n$ for all $n$, and $s_1 = S_1$, $s_n = S_n - S_{n-1} \in X_n$ for $n \ge 2$. We can then write $f = f - S_n + \sum_{k=1}^{n} s_k$, and use this inside the difference operator; for suitable $h \in \mathbb{R}^d$,

\[
\begin{aligned}
\|\Delta_h^r(f,\cdot)\|_{L_p(\Omega(rh))}
 &= \Bigl\| \Delta_h^r(f - S_n,\cdot) + \sum_{k=1}^{n} \Delta_h^r(s_k,\cdot) \Bigr\|_{L_p(\Omega(rh))} \\
 &\le \Bigl\{ \|\Delta_h^r(f - S_n,\cdot)\|^m_{L_p(\Omega(rh))} + \sum_{k=1}^{n} \|\Delta_h^r(s_k,\cdot)\|^m_{L_p(\Omega(rh))} \Bigr\}^{1/m} \\
 &\le \Bigl\{ 2^{rm} \|f - S_n\|^m_{L_p(\Omega(rh))} + \sum_{k=1}^{n} \|\Delta_h^r(s_k,\cdot)\|^m_{L_p(\Omega(rh))} \Bigr\}^{1/m}.
\end{aligned}
\tag{2.21}
\]

We need to estimate the terms in the sum on the right while introducing coefficients $2^k$ in each of the estimates, so that later application of Lemma 1.3 in page 10 is possible; let $x \in \Omega(rh)$:

\[
\begin{aligned}
|\Delta_h^r(s_k,x)|^p
 &= \Bigl| \Delta_h^r \Bigl( \sum_{j=1-r}^{2^k - 1} a^{[k]}_{j,r}(s_k)\, N^{[k]}_{j,r},\, x \Bigr) \Bigr|^p \\
 &= \Bigl| \sum_{j=1-r}^{2^k - 1} a^{[k]}_{j,r}(s_k)\, \Delta_h^r(N^{[k]}_{j,r},x) \Bigr|^p \\
 &\le \underbrace{(2^{p-1})^{|\Lambda_k(x)| - 1}}_{C_{p,d,r}} \sum_{j \in \Lambda_k(x)} \bigl| a^{[k]}_{j,r}(s_k) \bigr|^p\, \bigl| \Delta_h^r(N^{[k]}_{j,r},x) \bigr|^p.
\end{aligned}
\tag{2.22}
\]
Now it all relies on estimates of the differences of the basic tensor product B-splines; these depend heavily on the location of the point $x$ and the size of $|h|$. Let $\Omega(rh) = G \cup G'$, where $x \in G$ if $x$ and $x + rh$ both belong to the same subcube $\square_{i,k}$, and $G' = \Omega(rh) \setminus G$:
  1. If $x$ and $x + rh$ are in the same cube $\square_{i,k} \subset \mathrm{supp}(N^{[k]}_{j,r})$, as $N^{[k]}_{j,r}$ is a polynomial of coordinate order $r$ there, we have:

    \[
    \Delta_h^r(N^{[k]}_{j,r},x) = |h|^r\, \Delta\bigl( N^{[k]}_{j,r}(\cdot);\, x, x + h, \dots, x + rh \bigr) = \frac{1}{r!}\, |h|^r\, D_h^r N^{[k]}_{j,r}(\xi_{x,r,h}),
    \tag{2.23}
    \]

    where $D_h^r$ denotes the derivative in the direction given by $h \in \mathbb{R}^d$, and $\xi_{x,r,h}$ is a point in the segment with endpoints $x$ and $x + rh$. Notice that, although $N^{[k]}_{j,r}|_{\square_{i,k}}$ is a polynomial of coordinate order $r$, its total degree can be as large as $(r - 1)d$, and the previous directional derivative is not null in general: a simple computation gives that for $d = 1$ the derivative vanishes, but for $d > 1$ it vanishes if and only if $r < d/(d - 1) \le 2$.

    In order to estimate $\|D_h^r N^{[k]}_{j,r}\|_{L_\infty(\mathrm{segment}[x,x+rh])}$, we need to use some basic multivariate calculus: consider the univariate polynomial $N^{[k]}_{j,r,x,h} : [0,1] \to \mathbb{R}$ defined by the composition $N^{[k]}_{j,r} \circ \varphi_{x,h} = N^{[0]}_{0,r} \circ \varphi_{k,j} \circ \varphi_{x,h}$, where $\varphi_{k,j} : \mathbb{R}^d \ni x \mapsto 2^k x - j \in \mathbb{R}^d$, and $\varphi_{x,h} : [0,1] \ni t \mapsto x + rht \in \mathrm{segment}[x,x + rh] \subset \mathbb{R}^d$.

    \[
    \begin{aligned}
    D N^{[k]}_{j,r,x,h}(t)
     &= D\bigl( N^{[0]}_{0,r} \circ \varphi_{k,j} \circ \varphi_{x,h} \bigr)(t) \\
     &= D N^{[0]}_{0,r}\Big|_{\varphi_{k,j}(\varphi_{x,h}(t))} \cdot D\varphi_{k,j}\Big|_{\varphi_{x,h}(t)} \cdot D\varphi_{x,h}(t) \\
     &= \Bigl( \frac{\partial N^{[0]}_{0,r}}{\partial x_1}, \dots, \frac{\partial N^{[0]}_{0,r}}{\partial x_d} \Bigr)\Big|_{\varphi_{k,j}(\varphi_{x,h}(t))} \cdot 2^k\, \mathrm{Id}_d \cdot rh \\
     &= 2^k r \sum_{i=1}^{d} h_i\, \frac{\partial N^{[0]}_{0,r}}{\partial x_i}\Big|_{\varphi_{k,j}(\varphi_{x,h}(t))} \\
     &= 2^k r\, D_h N^{[0]}_{0,r}\bigl( \varphi_{k,j}(\varphi_{x,h}(t)) \bigr).
    \end{aligned}
    \]

    Use now Markov’s Theorem (page 33) and the fact that $\|N^{[0]}_{0,r}\|_{L_\infty(\Omega)} \le 1$ to obtain $\|D_h N^{[0]}_{0,r}\|_{L_\infty(\Omega)} \le C_r$; and therefore,

    \[
    \begin{aligned}
    \|D_h N^{[k]}_{j,r}\|_{L_\infty(\Omega)} &\le 2^k r\, \|D_h N^{[0]}_{0,r}\|_{L_\infty(\Omega)} \le 2^k C_r, \\
    \|D_h^r N^{[k]}_{j,r}\|_{L_\infty(\Omega)} &\le (2^k r)^{r-1}\, \|D_h N^{[0]}_{0,r}\|_{L_\infty(\Omega)} \le 2^{kr} C_r.
    \end{aligned}
    \]

    Use this estimate in equation (2.23) to get

    \[ \bigl| \Delta_h^r(N^{[k]}_{j,r},x) \bigr| \le C_r\, (2^k |h|)^r. \]

  2. Otherwise, if $x$ and $x + rh$ belong to different subcubes of $D_k$, as far as all the cubes supporting any point $x + ih$ are contained in the support of $N^{[k]}_{j,r}$, we simply have $N^{[k]}_{j,r} \in W^r_\infty(\Omega)$; we can nevertheless apply a similar estimate as before:

    \[
    \Delta_h^r(N^{[k]}_{j,r},x) = \Delta_h\bigl( \Delta_h^{r-1}(N^{[k]}_{j,r},x) \bigr) = \Delta_h\Bigl( \frac{1}{(r-1)!}\, |h|^{r-1}\, D_h^{r-1} N^{[k]}_{j,r}(\xi_{x,r,h}) \Bigr);
    \]

    therefore,

    \[ \bigl| \Delta_h^r(N^{[k]}_{j,r},x) \bigr| \le \bigl| \Delta_h\bigl( C_r\, 2^{k(r-1)} |h|^{r-1} \bigr) \bigr| \le C_r\, (2^k |h|)^{r-1}. \]

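We have then,

\[
\begin{aligned}
\bigl\| \Delta_h^r(N^{[k]}_{j,r},\cdot) \bigr\|^p_{L_p(\Omega(rh))}
 &= \int_{\Omega(rh) \cap \mathrm{supp}(N^{[k]}_{j,r})} \bigl| \Delta_h^r(N^{[k]}_{j,r},x) \bigr|^p\,dx \\
 &= \int_{(G \cup G') \cap \mathrm{supp}(N^{[k]}_{j,r})} \bigl| \Delta_h^r(N^{[k]}_{j,r},x) \bigr|^p\,dx \\
 &\le C_r \Bigl\{ (2^k |h|)^{rp}\, \bigl| G \cap \mathrm{supp}(N^{[k]}_{j,r}) \bigr| + (2^k |h|)^{(r-1)p}\, \bigl| G' \cap \mathrm{supp}(N^{[k]}_{j,r}) \bigr| \Bigr\}.
\end{aligned}
\]

But $|\mathrm{supp}(N^{[k]}_{j,r})| = c_0\, 2^{-kd}$ and $|G'| \le c_0\, |h|\, 2^{-k(d-1)}$ for a constant $c_0$; therefore,

\[
\begin{aligned}
\bigl\| \Delta_h^r(N^{[k]}_{j,r},\cdot) \bigr\|^p_{L_p(\Omega(rh))}
 &\le C_r \Bigl\{ 2^{krp}\, 2^{-kd} |h|^{rp} + 2^{kp(r-1)}\, 2^{-k(d-1)} |h|^{(r-1)p} |h| \Bigr\} \\
 &\le C_r \Bigl\{ 2^{krp}\, 2^{-kd} |h|^{rp} + 2^{kpr}\, 2^{-kd} |h|^{p(r-1+1/p)} \Bigr\} \\
 &\le C_r\, 2^{kpc}\, 2^{-kd} |h|^{pc},
\end{aligned}
\]

where the second step uses $2^{kp(r-1)}\, 2^{-k(d-1)} = 2^{kpr}\, 2^{-kp}\, 2^{-kd}\, 2^{k} \le 2^{kpr}\, 2^{-kp}\, 2^{-kd}\, 2^{kp} = 2^{kpr}\, 2^{-kd}$, and where $c = \min(r,r - 1 + 1/p)$. Using this last estimate on (2.22), we obtain

\[
\begin{aligned}
\|\Delta_h^r(s_k,\cdot)\|_{L_p(\Omega(rh))}
 &\le C_{p,d,r}\, |h|^c\, 2^{kc} \Bigl( \sum_{j=1-r}^{2^k - 1} 2^{-kd}\, \bigl| a^{[k]}_{j,r}(s_k) \bigr|^p \Bigr)^{1/p} \\
 &\le C_{p,d,r}\, |h|^c\, 2^{kc}\, \|s_k\|_{L_p(\Omega(rh))}
   \quad \text{(use Lemma 2.20 above)}.
\end{aligned}
\]

This leads to two different estimates; for $k \ge 2$:

\[
\begin{aligned}
\|\Delta_h^r(s_k,\cdot)\|_{L_p(\Omega(rh))}
 &\le C_{p,d,r}\, |h|^c\, 2^{kc} \bigl( \|S_k - f\|_{L_p(\Omega(rh))} + \|f - S_{k-1}\|_{L_p(\Omega(rh))} \bigr) \\
 &\le C_{p,d,r}\, |h|^c\, 2^{kc} \bigl( E(f,X_k)_p + E(f,X_{k-1})_p \bigr).
\end{aligned}
\]

And for $k = 1$,

\[
\begin{aligned}
\|\Delta_h^r(s_1,\cdot)\|_{L_p(\Omega(rh))}
 &\le C_{p,d,r}\, (2|h|)^c\, \|S_1\|_{L_p(\Omega(rh))} \\
 &\le C_{p,d,r}\, (2|h|)^c \bigl( \|S_1 - f\|_{L_p(\Omega(rh))} + \|f\|_{L_p(\Omega(rh))} \bigr) \\
 &\le C_{p,d,r}\, (2|h|)^c \bigl( E(f,X_1)_p + \|f\|_{L_p(\Omega(rh))} \bigr).
\end{aligned}
\]
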
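We can now find an upper bound on $\omega_r(f,2^{-n})_p$ using the previous estimates on (2.21):

\[
\begin{aligned}
\|\Delta_h^r(f,\cdot)\|^m_{L_p(\Omega(rh))}
 &\le 2^{rm} E(f,X_n)_p^m + C_{p,d,r}\, (2|h|)^{cm} \bigl( E(f,X_1)_p^m + \|f\|^m_{L_p(\Omega(rh))} \bigr) \\
 &\quad + C_{p,d,r} \sum_{k=2}^{n} |h|^{cm}\, 2^{kcm} \bigl( E(f,X_k)_p^m + E(f,X_{k-1})_p^m \bigr) \\
 &\le C_{p,d,r} \Bigl\{ (2|h|)^{cm} \|f\|^m_{L_p(\Omega(rh))} + |h|^{cm} \sum_{k=1}^{n-1} 2^{kcm} E(f,X_k)_p^m \\
 &\quad + \bigl( |h|^{cm}\, 2^{cnm} + 2^{rm} \bigr) E(f,X_n)_p^m \Bigr\};
\end{aligned}
\]

therefore,

\[
\omega_r(f,2^{-n})_p \le C_{p,d,r} \Bigl\{ 2^{(1-n)cm} \|f\|^m_{L_p(\Omega)} + 2^{-ncm} \sum_{k=1}^{n-1} 2^{kcm} E(f,X_k)_p^m + (1 + 2^{rm})\, E(f,X_n)_p^m \Bigr\}^{1/m}.
\]

As both $1 + 2^{rm}$ and $2^{cm}$ are greater than one for all choices of $p$ and $r$, we can bound the previous expression above by multiplying each term in the sum by these coefficients, and include them in the constant:

\[
\begin{aligned}
\omega_r(f,2^{-n})_p
 &\le C_{p,d,r}\, 2^{-nc} \Bigl\{ 2^{cm} \|f\|^m_{L_p(\Omega)} + \sum_{k=1}^{n-1} 2^{kcm} E(f,X_k)_p^m + 2^{ncm} E(f,X_n)_p^m \Bigr\}^{1/m} \\
 &\le C_{p,d,r}\, 2^{-nc} \Bigl\{ \|f\|^m_{L_p(\Omega)} + \sum_{k=1}^{n} \bigl( 2^{kc} E(f,X_k)_p \bigr)^m \Bigr\}^{1/m},
\end{aligned}
\]

as we wanted to prove. [#]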

Remark. Abusing notation, we can denote $X_0 = \{0\}$, and hence $E(f,X_0)_p = \|f\|_{L_p(\Omega)}$, and we may simplify the previous estimate to read
\[
\omega_r(f,2^{-n})_p \le C_{p,d,r}\,2^{-nc}\Big(\sum_{k=0}^{n}\big[2^{kc}E(f,X_k)_p\big]^m\Big)^{1/m}.
\]

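Theorem 21 Given $r \in \mathbb{N}$ and $0 < p,q < \infty$, for all $0 < \alpha < c$ the following quasinorms are equivalent to the Besov quasinorm $\|\cdot\|_{B_q^\alpha(L_p(\Omega))} = |\cdot|_{B_q^\alpha(L_p(\Omega))} + \|\cdot\|_{L_p(\Omega)}$:
\[
\begin{aligned}
N_1(f) &= \Big(\sum_{n=0}^{\infty}\big[2^{n\alpha}E(f,X_n)_p\big]^q\Big)^{1/q}, \\
N_2(f) &= \Big(\sum_{n=0}^{\infty}\big[2^{n\alpha}\|f - T_nf\|_{L_p(\Omega)}\big]^q\Big)^{1/q}, \\
N_3(f) &= \Big(\sum_{n=1}^{\infty}\big[2^{n\alpha}\|t_n(f)\|_{L_p(\Omega)}\big]^q\Big)^{1/q},
\end{aligned}
\]
where $t_n(f) = T_n(f) - T_{n-1}(f)$ for $n \ge 1$ (and of course, $T_0 \equiv 0$).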

Proof. The equivalence of $N_1$, $N_2$ and $\|\cdot\|_{B_q^\alpha(L_p(\Omega))}$ follows directly from Proposition 11, Corollary 11.1, Proposition 12 and the Discrete Hardy's Inequalities (Lemma 1.3 on page 10). As for the third quasinorm, notice that on one side,
\[
\begin{aligned}
\|t_n(f)\|_{L_p(\Omega)}
&= \|T_n(f) - T_{n-1}(f)\|_{L_p(\Omega)} \\
&\le C_p\big\{\|f - T_n(f)\|_{L_p(\Omega)} + \|f - T_{n-1}(f)\|_{L_p(\Omega)}\big\} \\
&\le C_p\big\{E(f,X_n)_p + E(f,X_{n-1})_p\big\} \\
&\quad (\text{but } E(f,X_n)_p \le E(f,X_{n-1})_p \text{ for all } n) \\
&\le C_p\,E(f,X_{n-1})_p;
\end{aligned}
\]
therefore, $N_3(f) \le C_pN_1(f)$ for all $f \in L_p(\Omega)$. On the other hand,
\[
\sum_{k=n+1}^{\infty} t_k(f) = \sum_{k=n+1}^{\infty}\big(T_k(f) - T_{k-1}(f)\big) = -T_n(f) + \lim_k T_k(f) = f - T_n(f);
\]
therefore,
\[
\|f - T_n(f)\|_{L_p(\Omega)} = \Big\|\sum_{k=n}^{\infty} t_{k+1}(f)\Big\|_{L_p(\Omega)} \le \Big(\sum_{k=n}^{\infty}\|t_{k+1}(f)\|^m_{L_p(\Omega)}\Big)^{1/m},
\]
where $m = \min(1,p)$. We can use again Lemma 1.3 to obtain
\[
\sum_{n=0}^{\infty}\big(2^{n\alpha}\|f - T_n(f)\|_{L_p(\Omega)}\big)^q \le 2^{-r}C_q\sum_{n=1}^{\infty}\big(2^{n\alpha}\|t_n(f)\|_{L_p(\Omega)}\big)^q,
\]
which gives $N_2(f) \le C_{q,r}N_3(f)$; hence proving the statement. [#]

Remark. We have just proved the goal of this chapter; we have precisely determined the approximation spaces in Lp(_O_) associated to the family of approximants Xn for all q > 0, and 0 < a < c:

Corollary 21.1 Given $r \in \mathbb{N}$, the following spaces are identical (with equivalent (quasi)norms) for all $0 < \alpha < c$:

\[
A^\alpha_q(L_p(\Omega), X_n) \cong B^\alpha_q(L_p(\Omega)).
\]
[]

Remark. Theorem 21 offers also the possibility of representing functions in Besov spaces by means of a sequence of nonnegative real functions satisfying certain properties. We will use this representation to find in the next section an equivalent expression for the K-functional of couples of Besov spaces; and with it, the computation of interpolation spaces for such couples.

Notice that $\sum_{n=1}^{\infty} t_n(f) = \lim_n T_n(f) = f$ a.e.; and as $t_n(f) \in X_n$ for each $n \ge 1$, we may write $t_n(f) = \sum_{j=1-r}^{2^n-1} a^{[n]}_{j,r}(t_n(f))\,N^{[n]}_{j,r}$, and furthermore

\[
f = \sum_{n=1}^{\infty} t_n(f) = \sum_{n=1}^{\infty}\sum_{j=1-r}^{2^n-1} a^{[n]}_{j,r}(t_n(f))\,N^{[n]}_{j,r}. \qquad (2.24)
\]

This atomic decomposition of functions in Bqa(Lp(_O_)) leads to yet another equivalent (quasi)norm:

Corollary 21.2 Given $p,q,r,\alpha$ as before, $f \in L_p(\Omega)$ is in $B_q^\alpha(L_p(\Omega))$ if and only if $f$ can be represented as in (2.24), with

\[
N_4(f) = \Big\{\sum_{n=1}^{\infty} 2^{\alpha nq}\Big(\sum_{j=1-r}^{2^n-1}\big|a^{[n]}_{j,r}(t_n(f))\big|^p\,2^{-nd}\Big)^{q/p}\Big\}^{1/q} < \infty.
\]

2.9 Further results: K-functional of compatible couples of Besov Spaces

Given a sequence of functions $a = (f_n)_n$ in a (quasisemi)normed space $(X,\|\cdot\|_X)$, consider for parameters $\alpha, q > 0$ the (quasisemi)norms

\[
|a|_{l^\alpha_q(X)} = \Big(\sum_{n=0}^{\infty}\big[2^{n\alpha}\|f_n\|_X\big]^q\Big)^{1/q},
\]
and let $l_q^\alpha(X) = \{a = (f_n)_n : |a|_{l^\alpha_q(X)} < \infty\}$.
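
The weighted sequence quasinorm above is easy to evaluate numerically once the values $\|f_n\|_X$ are known. The following Python sketch does this for a finite (truncated) sequence; the function name `weighted_sequence_quasinorm` and the truncation to finitely many terms are assumptions made only for this illustration.

```python
import math

def weighted_sequence_quasinorm(block_norms, alpha, q):
    """Evaluate (sum_n [2^(n*alpha) * ||f_n||_X]^q)^(1/q) for a finite sequence.

    block_norms[n] is assumed to hold ||f_n||_X for n = 0, 1, ..., N.
    Passing q = math.inf returns the supremum of the weighted terms.
    """
    weighted = [2.0 ** (n * alpha) * x for n, x in enumerate(block_norms)]
    if q == math.inf:
        return max(weighted)
    return sum(w ** q for w in weighted) ** (1.0 / q)

# Norms decaying like 2^(-n) are exactly balanced by the weight 2^(n*alpha) with alpha = 1.
norms = [1.0, 0.5, 0.25]
print(weighted_sequence_quasinorm(norms, alpha=1.0, q=1.0))
print(weighted_sequence_quasinorm(norms, alpha=1.0, q=math.inf))
```

A sequence belongs to the space exactly when these partial values stay bounded as more terms are included, which is why the decay rate of $\|f_n\|_X$ relative to $2^{-n\alpha}$ is what matters.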

Consider also the following operator in $L_p(\Omega)$:

\[
T : L_p(\Omega) \ni f \mapsto (t_n(f))_n \in \bigoplus_{n=1}^{\infty} X_n.
\]
Following Theorem 21, we have that $f \in B_q^\alpha(L_p(\Omega))$ if and only if $Tf \in l_q^\alpha(L_p(\Omega))$, and moreover, $\|f\|_{B_q^\alpha(L_p(\Omega))} \asymp \|Tf\|_{l_q^\alpha(L_p(\Omega))}$. We can use this fact to find the K-functionals of compatible couples of Besov spaces:

Theorem 22 Given $r \in \mathbb{N}$, $0 < p_1,q_1,p_2,q_2 < \infty$ and $0 < \alpha_1,\alpha_2 < r$, denote $B_i = B_{q_i}^{\alpha_i}(L_{p_i}(\Omega))$ and $l_i = l_{q_i}^{\alpha_i}(L_{p_i}(\Omega))$; then there exist constants $C_1, C_2 > 0$, which depend at most on $r$, $d$, $\alpha_1$ and $\alpha_2$, such that for all $f \in B_1 + B_2$ and all $t > 0$,

\[
C_1K(f,t;B_1,B_2) \le K(Tf,t;l_1,l_2) \le C_2K(f,t;B_1,B_2).
\]

Proof. Let us prove the left inequality: Given $f \in B_1 + B_2$, let $a_1 = (a_n^{[1]})_n \in l_1$ be such that $a_2 = (a_n^{[2]})_n = Tf - a_1 \in l_2$; we have $K(Tf,t;l_1,l_2) \le \|a_1\|_{l_1} + t\|a_2\|_{l_2}$. From these sequences $a_i$, we will construct functions $f_i \in B_i$ such that $f = f_1 + f_2$, and $\|f_i\|_{B_i} \le C\|a_i\|_{l_i}$.

We will be using the projectors $T_n : L_\rho(\Omega) \to X_n$ for both functions $f_i \in B_i$; thus, we need to work in a space $L_\rho(\Omega)$, with an auxiliary exponent $\rho$, that contains both $L_{p_1}(\Omega)$ and $L_{p_2}(\Omega)$. As $|\Omega| = 1$, Jensen's Inequality shows that for each $0 < \rho < \min(p_1,p_2)$ and any function $f \in L_{p_i}(\Omega)$,
\[
\int_\Omega |f|^\rho = \int_\Omega \big(|f|^{p_i}\big)^{\rho/p_i} \le \Big(\int_\Omega |f|^{p_i}\Big)^{\rho/p_i} = \|f\|^\rho_{L_{p_i}(\Omega)}.
\]
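
The embedding obtained from Jensen's Inequality can be checked numerically on a discrete model of a probability space. The sketch below assumes a space of $n$ atoms of mass $1/n$ (an assumption made only for this illustration, not the cube used in the text) and verifies that the $L_p$ (quasi)norm is nondecreasing in the exponent; the helper name `lp_quasinorm` is hypothetical.

```python
def lp_quasinorm(values, p):
    """(Quasi)norm (integral |f|^p)^(1/p) on n atoms of mass 1/n each."""
    n = len(values)
    return (sum(abs(v) ** p for v in values) / n) ** (1.0 / p)

# On a space of total measure one, smaller exponents give smaller (quasi)norms.
samples = [0.5, 2.0, -3.0, 1.0]
exponents = [0.5, 1.0, 2.0, 4.0]
norms = [lp_quasinorm(samples, p) for p in exponents]
for p, value in zip(exponents, norms):
    print(f"p = {p}: {value:.6f}")
```

The monotonicity fails on spaces of measure different from one unless a factor depending on the total measure is included, which is why the normalization $|\Omega| = 1$ is used.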

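For each $n$, let $g_n = T_n(a_n^{[1]}) \in X_n$. By the equivalence of quasinorms in finite dimensional spaces, and Proposition 7 on page 54, we know that $\|g_n\|_{L_{p_1}(\Omega)} \le C_r\|g_n\|_{L_\rho(\Omega)} \le C_{r,\rho,\tau}\|a_n^{[1]}\|_{L_\rho(\Omega)} \le C_{r,\rho,\tau}\|a_n^{[1]}\|_{L_{p_1}(\Omega)}$, where $\rho$ is the auxiliary exponent fixed above. Consider now $g = \sum_{n=1}^{\infty} g_n$, which converges in $L_{p_1}(\Omega)$; notice that for each $n$, with $m = \min(p_1,1)$,

\[
\begin{aligned}
E(g,X_n)_{p_1}
&= E\Big(\sum_{k=1}^{n} g_k + \sum_{k=n+1}^{\infty} g_k,\ X_n\Big)_{p_1} \\
&= E\Big(\sum_{k=n+1}^{\infty} g_k,\ X_n\Big)_{p_1} \\
&\le \Big\|\sum_{k=n+1}^{\infty} g_k\Big\|_{L_{p_1}(\Omega)} \\
&\le \Big(\sum_{k=n+1}^{\infty}\|g_k\|^m_{L_{p_1}(\Omega)}\Big)^{1/m} \\
&\le C_{r,\rho,\tau}\Big(\sum_{k=n+1}^{\infty}\|a_k^{[1]}\|^m_{L_{p_1}(\Omega)}\Big)^{1/m}.
\end{aligned}
\]
Application of Lemma 1.3 on page 10 gives
\[
\sum_{n=1}^{\infty}\big[2^{n\alpha_1}E(g,X_n)_{p_1}\big]^{q_1} \le C_{r,\rho,\tau}\sum_{n=1}^{\infty}\big(2^{n\alpha_1}\|a_n^{[1]}\|_{L_{p_1}(\Omega)}\big)^{q_1};
\]
and therefore, $\|g\|_{B_1} \le C_{r,\rho,\tau}\|a_1\|_{l_1}$. Similarly, $f - g \in L_{p_2}(\Omega)$, and with $m = \min(p_2,1)$ this time,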
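\[
\begin{aligned}
E(f-g,X_n)_{p_2}
&= E\Big(\sum_{k=1}^{\infty} t_k(f) - \sum_{k=1}^{\infty} g_k,\ X_n\Big)_{p_2} \\
&= E\Big(\sum_{k=1}^{n}[t_k(f) - g_k] + \sum_{k=n+1}^{\infty}[t_k(f) - g_k],\ X_n\Big)_{p_2} \\
&= E\Big(\sum_{k=n+1}^{\infty}[t_k(f) - g_k],\ X_n\Big)_{p_2} \\
&\le \Big\|\sum_{k=n+1}^{\infty}[t_k(f) - g_k]\Big\|_{L_{p_2}(\Omega)} \\
&\le \Big(\sum_{k=n+1}^{\infty}\|t_k(f) - g_k\|^m_{L_{p_2}(\Omega)}\Big)^{1/m} \\
&\quad \big(\text{notice that } t_k(f) = T_k(t_k(f)) = Q_k(\tau_k(t_k(f))) \text{ and } g_k = T_k(a_k^{[1]}) = Q_k(\tau_k(a_k^{[1]}))\big) \\
&= \Big(\sum_{k=n+1}^{\infty}\big\|Q_k\big(\tau_k(t_k(f)) - \tau_k(a_k^{[1]})\big)\big\|^m_{L_{p_2}(\Omega)}\Big)^{1/m} \\
&= \Big(\sum_{k=n+1}^{\infty}\big\|(Q_k\circ\tau_k)\big(t_k(f) - a_k^{[1]}\big)\big\|^m_{L_{p_2}(\Omega)}\Big)^{1/m} \\
&\le C_{r,\rho,\tau}\Big(\sum_{k=n+1}^{\infty}\big\|t_k(f) - a_k^{[1]}\big\|^m_{L_{p_2}(\Omega)}\Big)^{1/m};
\end{aligned}
\]
and as before, since $t_k(f) - a_k^{[1]} = a_k^{[2]}$, we infer $\|f - g\|_{B_2} \le C_{r,\rho,\tau}\|a_2\|_{l_2}$, hence proving the left inequality of the statement.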

Let us prove the right inequality: Let $g \in B_1$ be such that $f - g \in B_2$. Given $\tau > 0$ and $0 < \rho < \min(p_1,p_2)$, we construct $\tau$ near-best elements of $L_\rho(\Omega)$ approximation to $f$ from $Y_n$ via the operator $\tau_n : L_\rho(\Omega) \to Y_n$, and using Lemma 1.5 on page 14, we can obtain as well elements of $L_\rho(\Omega)$ approximation to $g$ from $Y_n$, say $h_n(g) \in Y_n$, such that $\tau_n(f) - h_n(g)$ are near-best elements of $L_\rho(\Omega)$ approximation to $f - g$ from $Y_n$. By an argument similar to the proof of Lemma 8 on page 55, we realize that $U_n = Q_n(h_n(g))$ is a near-best $L_{p_1}(\Omega)$ approximation to $g$ from $X_n$, and $R_n = T_n(f) - Q_n(h_n(g))$ is a near-best $L_{p_2}(\Omega)$ approximation to $f - g$ from $X_n$.

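Let $u_n = U_n - U_{n-1}$ and $r_n = R_n - R_{n-1}$ for $n \ge 1$ (with $U_0 = R_0 = 0$ trivially), and consider the sequences $u = (u_n)_n$, $r = (r_n)_n \in \bigoplus_{n=1}^{\infty} X_n$. Notice that $u_n + r_n = t_n(f)$ by construction, so $u + r = Tf$, and that

\[
\begin{aligned}
\|u_n\|_{L_{p_1}(\Omega)}
&\le C_{p_1}\big\{\|U_n - g\|_{L_{p_1}(\Omega)} + \|g - U_{n-1}\|_{L_{p_1}(\Omega)}\big\} \\
&\le C_{r,d,\rho,p_1,\tau}\big\{E(g,X_n)_{p_1} + E(g,X_{n-1})_{p_1}\big\} \\
&\le C_{r,d,\rho,p_1,\tau}\,E(g,X_{n-1})_{p_1},
\end{aligned}
\]
and similarly, $\|r_n\|_{L_{p_2}(\Omega)} \le C_{r,d,\rho,p_2,\tau}E(f-g,X_{n-1})_{p_2}$; hence, by Theorem 21 on page 84, we have $u \in l_1$, $r \in l_2$, and moreover, for any $t > 0$, $\|u\|_{l_1} + t\|r\|_{l_2} \le C\{\|g\|_{B_1} + t\|f-g\|_{B_2}\}$, and the statement follows. [#]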

Corollary 22.1 Under the same conditions as in the Theorem above, and given $0 < \theta < 1$ and $0 < q < \infty$, we have $f \in (B_1,B_2)_{\theta,q}$ if and only if $Tf \in (l_1,l_2)_{\theta,q}$. []

Notice that this result allows us to compute the interpolation spaces for compatible couples of Besov Spaces. It all depends on the computation of the interpolation spaces $\big(l_{q_1}^{\alpha_1}(L_{p_1}(\Omega)), l_{q_2}^{\alpha_2}(L_{p_2}(\Omega))\big)_{\theta,q}$; these are easily described in terms of the Lorentz spaces $L_{p,q}(\Omega)$, so we will introduce them in the next section.

2.10 Lorentz spaces

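Given a totally $\sigma$-finite measure space $(\Omega,m)$, and values $0 < p,q < \infty$, consider the Lorentz functionals

\[
r_{p,q} : M_0(\Omega,m) \ni f \mapsto \Big(\int_0^\infty \big[t^{1/p}f^*(t)\big]^q\,\frac{dt}{t}\Big)^{1/q} \in \mathbb{R}^+ \qquad (2.25)
\]
\[
\|\cdot\|_{L_{p,q}(\Omega,m)} : M_0(\Omega,m) \ni f \mapsto \Big(\int_0^\infty \big[t^{1/p}f^{**}(t)\big]^q\,\frac{dt}{t}\Big)^{1/q} \in \mathbb{R}^+ \qquad (2.26)
\]
and for $q = \infty$,
\[
r_{p,\infty} : M_0(\Omega,m) \ni f \mapsto \sup_{t>0} t^{1/p}f^*(t) \qquad (2.27)
\]
\[
\|\cdot\|_{L_{p,\infty}(\Omega,m)} : M_0(\Omega,m) \ni f \mapsto \sup_{t>0} t^{1/p}f^{**}(t) \qquad (2.28)
\]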
Lemma 2.21 We have the equivalence $r_{p,q}(\cdot) \asymp \|\cdot\|_{L_{p,q}(\Omega,m)}$ among Lorentz functionals for all $p > 1$ and $0 < q \le \infty$.

Proof. For all $0 < q \le \infty$ we have trivially $r_{p,q}(f) \le \|f\|_{L_{p,q}(\Omega,m)}$, since $f^*(t) \le f^{**}(t)$ for all $t > 0$. On the other hand, for $0 < q < \infty$,

\[
\begin{aligned}
\|f\|^q_{L_{p,q}(\Omega,m)}
&= \int_0^\infty \big[t^{1/p}f^{**}(t)\big]^q\,\frac{dt}{t} \\
&= \int_0^\infty \Big[t^{1/p-1}\int_0^t f^*(s)\,ds\Big]^q\,\frac{dt}{t} \\
&= \int_{t=0}^\infty t^{-q(1-1/p)}\Big[\int_{s=0}^t s f^*(s)\,\frac{ds}{s}\Big]^q\,\frac{dt}{t} \\
&\quad (\text{use Hardy's Theorem: estimate (1.5) on page 8 for } 1 \le q < \infty, \text{ and estimate (1.9) on page 12 for } 0 < q < 1) \\
&\le C_{p,q}\int_0^\infty t^{-q(1-1/p)}\big[t f^*(t)\big]^q\,\frac{dt}{t} \\
&= C_{p,q}\int_0^\infty \big[t^{1/p}f^*(t)\big]^q\,\frac{dt}{t}.
\end{aligned}
\]
A similar proof can be applied for the case $q = \infty$. [#]

Lemma 2.22 (Properties of Lorentz functionals) (i) Both rp,q(f) < rp,q(g) and ||f||Lp,q(_O_,m) <||g||Lp,q(_O_,m) for f,g  (- M0(_O_,m) such that |f|<|g|. (ii) The functionals (2.26) and (2.28) are both (quasi)norms for all 1 < p <  oo .

Proof. Part (i) is trivial. Using this, and the subadditivity property of the maximal functions $f^{**}$, we infer that the functionals (2.26) and (2.28) are both (quasi)norms for all $0 < q \le \infty$. [#]

Remark. Notice that the lack of subadditivity of the decreasing rearrangements gives us that the functionals (2.25) and (2.27) cannot have any (quasi)triangular property; hence, they do not have (quasi)norm structure.

Definition 21 The Lorentz spaces $L_{p,q}(\Omega,m)$ are the Riesz spaces associated to the Lorentz (quasi)norms $\|\cdot\|_{L_{p,q}(\Omega,m)}$:

\[
L_{p,q}(\Omega) = \big\{ f \in M_0(\Omega,m) : \|f\|_{L_{p,q}(\Omega)} < \infty \big\}.
\]

2.11 Further results: Interpolation of Besov Spaces

Theorem 23 For all $f \in L_1(\Omega) + L_\infty(\Omega)$ and $t > 0$,

\[
K(f,t;L_1(\Omega),L_\infty(\Omega)) = \int_0^t f^*(s)\,ds.
\]

Proof. We prove first that the integral on the right-hand side is bounded above by the K-functional on the left-hand side. We will use for this the subadditivity of the maximal functions ($(f+g)^{**}(t) \le f^{**}(t) + g^{**}(t)$ for all $t > 0$), and the fact that the spaces $L_p(\Omega)$ are rearrangement-invariant.
Given $f \in L_1(\Omega) + L_\infty(\Omega)$, and any decomposition $f = f_1 + f_\infty$ with $f_q \in L_q(\Omega)$, we have

\[
\begin{aligned}
\int_0^t f^*(s)\,ds
&\le \int_0^t f_1^*(s)\,ds + \int_0^t f_\infty^*(s)\,ds \\
&\le \|f_1^*\|_{L_1(0,\infty)} + t\,\|f_\infty^*\|_{L_\infty(0,\infty)} \\
&= \|f_1\|_{L_1(\Omega)} + t\,\|f_\infty\|_{L_\infty(\Omega)};
\end{aligned}
\]
taking the infimum over all such decompositions proves the stated inequality.
In order to prove the other inequality, it suffices to find for each $t > 0$ a decomposition $f = f^{(1,t)} + f^{(\infty,t)}$ with $f^{(q,t)} \in L_q(\Omega)$ such that $\|f^{(1,t)}\|_{L_1(\Omega)} + t\|f^{(\infty,t)}\|_{L_\infty(\Omega)} = t f^{**}(t)$. For this task, fix $t > 0$, consider $E_t = \{x \in \Omega : |f(x)| > f^*(t)\}$, and let $t_0 = |E_t|$. Notice that $t_0 \le t$ trivially, and also $f \in L_1(E_t)$ (since $f \in L_1(\Omega) + L_\infty(\Omega)$ and $E_t$ has finite measure). Set $g(x) = \max\{|f(x)| - f^*(t), 0\}\operatorname{sign}f(x)$, and $h(x) = \min\{|f(x)|, f^*(t)\}\operatorname{sign}f(x)$, so that $f = g + h$.
Note first that $h \in L_\infty(\Omega)$, with $\|h\|_{L_\infty(\Omega)} = f^*(t)$. Also, $g \in L_1(\Omega)$:
\[
\int_\Omega |g(x)|\,dx = \int_{E_t}\big(|f(x)| - f^*(t)\big)\,dx = \int_{E_t}|f(x)|\,dx - t_0f^*(t) = \int_0^{t_0} f^*(s)\,ds - t_0f^*(t);
\]
therefore, $\|g\|_{L_1(\Omega)} + t_0\|h\|_{L_\infty(\Omega)} = \int_0^{t_0} f^*(s)\,ds$, and furthermore,
\[
\|g\|_{L_1(\Omega)} + t\,\|h\|_{L_\infty(\Omega)} = \int_0^{t_0} f^*(s)\,ds + (t - t_0)f^*(t) = \int_0^t f^*(s)\,ds,
\]
since $f^*(s) = f^*(t)$ for all $t_0 \le s \le t$. [#]
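
The decomposition $f = g + h$ used in the proof can be tested numerically. The following Python sketch works on a model space of $n$ atoms of mass $1/n$ (an assumption for illustration; the text works on $\Omega$ with Lebesgue measure) and compares $\|g\|_{L_1} + t\|h\|_{L_\infty}$ with $\int_0^t f^*(s)\,ds$. The function names are hypothetical.

```python
import math

def decreasing_rearrangement(values):
    """Values of f* on the steps [k/n, (k+1)/n), k = 0, ..., n-1."""
    return sorted((abs(v) for v in values), reverse=True)

def integral_of_rearrangement(values, t):
    """Integral of f* over (0, t) for a function on n atoms of mass 1/n."""
    n = len(values)
    total = 0.0
    for k, v in enumerate(decreasing_rearrangement(values)):
        left, right = k / n, (k + 1) / n
        if t <= left:
            break
        total += v * (min(t, right) - left)
    return total

def truncation_split(values, t):
    """Split f = g + h at the level f*(t), as in the proof, for 0 < t <= 1.

    Returns (g, h, ||g||_1 + t * ||h||_inf).
    """
    n = len(values)
    fstar = decreasing_rearrangement(values)
    level = fstar[min(int(t * n), n - 1)]  # f*(t) on the step containing t
    g = [math.copysign(max(abs(v) - level, 0.0), v) for v in values]
    h = [math.copysign(min(abs(v), level), v) for v in values]
    norm_g = sum(abs(v) for v in g) / n
    norm_h = max(abs(v) for v in h)
    return g, h, norm_g + t * norm_h

f = [3.0, -1.0, 4.0, 1.0, -5.0]
g, h, cost = truncation_split(f, 0.5)
print(cost, integral_of_rearrangement(f, 0.5))
```

Both printed numbers agree, which is the equality established in the second half of the proof for this particular splitting.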

Corollary 23.1 Given $0 < \theta < 1$ and $0 < q \le \infty$, we have $(L_1(\Omega),L_\infty(\Omega))_{\theta,q} = L_{p,q}(\Omega)$, where $1/p = 1 - \theta$. []

Corollary 23.2 Given $1 < p_1,p_2 < \infty$, $0 < \theta < 1$ and $0 < q \le \infty$, we have $(L_{p_1}(\Omega),L_{p_2}(\Omega))_{\theta,q} = L_{p,q}(\Omega)$, where $1/p = (1-\theta)/p_1 + \theta/p_2$.

Proof. Use the previous Corollary and the Reiteration Theorem 5 (page 23). [#]
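
The exponent arithmetic in the two corollaries above is simple but easy to get wrong by hand. A minimal Python sketch, with a hypothetical helper `interpolated_exponent` and with `math.inf` standing for the endpoint $L_\infty$, could read:

```python
import math

def interpolated_exponent(theta, p1, p2):
    """Return p with 1/p = (1 - theta)/p1 + theta/p2; p2 may be math.inf."""
    inverse = (1.0 - theta) / p1 + theta / p2  # theta / inf evaluates to 0.0
    return math.inf if inverse == 0.0 else 1.0 / inverse

# Halfway between L_1 and L_infinity lies the exponent p = 2 (Corollary 23.1).
print(interpolated_exponent(0.5, 1.0, math.inf))
# Between L_2 and L_4 with theta = 1/2 (Corollary 23.2).
print(interpolated_exponent(0.5, 2.0, 4.0))
```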

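Theorem 24 Let $(X,\|\cdot\|_X)$, $(X_1,\|\cdot\|_{X_1})$ and $(X_2,\|\cdot\|_{X_2})$ be complete (quasi)normed spaces, and let $0 < \alpha_1 < \alpha_2$, $0 < \theta < 1$ and $0 < q_1,q_2 \le \infty$. Denote $l_k(X) = l_{q_k}^{\alpha_k}(X)$; then the following properties hold:
(i) $(l_1(X),l_2(X))_{\theta,q} = l_q^\alpha(X)$ for all $0 < q \le \infty$, where $\alpha = (1-\theta)\alpha_1 + \theta\alpha_2$.
(ii) $(l_1(X_1),l_2(X_2))_{\theta,q} = l_q^\alpha((X_1,X_2)_{\theta,q})$, where $\alpha = (1-\theta)\alpha_1 + \theta\alpha_2$ and $1/q = (1-\theta)/q_1 + \theta/q_2$.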

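Corollary 24.1 Under the same hypotheses as in the previous Theorem, if $X_k = L_{p_k}(\Omega)$ for $1 < p_1,p_2 < \infty$, then we have
\[
\big(l_{q_1}^{\alpha_1}(L_{p_1}(\Omega)), l_{q_2}^{\alpha_2}(L_{p_2}(\Omega))\big)_{\theta,q} = l_q^\alpha(L_{p,q}(\Omega)),
\]
where $\alpha = (1-\theta)\alpha_1 + \theta\alpha_2$, $1/q = (1-\theta)/q_1 + \theta/q_2$, and $1/p = (1-\theta)/p_1 + \theta/p_2$. []
Corollary 24.2 Under the same hypotheses as in the previous Corollary, we have
\[
\big(B_{q_1}^{\alpha_1}(L_{p_1}), B_{q_2}^{\alpha_2}(L_{p_2})\big)_{\theta,q} = B_q^\alpha(L_p),
\]
where $\alpha = (1-\theta)\alpha_1 + \theta\alpha_2$, $1/q = (1-\theta)/q_1 + \theta/q_2$, $1/p = (1-\theta)/p_1 + \theta/p_2$, and $p = q$ (so that $L_{p,q} = L_p$). []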

2.12 Further Results: Embedding Theorems for Besov Spaces

Lemma 2.23 Given p,a > 0, r  (- N and _O_ < Rd, consider s > 0 defined by 1/s = a/d + 1/p. Then for all n there exists C > 0 which depends at most on ****what?****, such that ||S||Lp(_O_) < 2napC||S||Ls(_O_) for all S  (- Y n.

Theorem 25 Given p,a,s > 0 as in the previous lemma, Bpa(Ls(_O_)) is continuously embedded in Lp(_O_).

3 Approximation by Ridge Functions on Dyadic maximal-smoothness splines

3.1 General Theory of Ridge Functions

Throughout this section, $\Omega_d$ denotes the $d$-dimensional unit ball in $\mathbb{R}^d$ with respect to the Euclidean norm; its $d$-dimensional volume is denoted $\gamma_d$. $S^{d-1}$, the unit sphere, is the boundary of the previous set; and $P^{d-1} \subset S^{d-1}$ is the set of directions in $\mathbb{R}^d$. We assume the latter to be a connected set for integration purposes.

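Definition 22 Given $d \ge 2$, a univariate function $f$ and a direction $\theta \in P^{d-1}$, we define the $d$-dimensional ridge function on $\Omega_d$ generated by $f$ with direction $\theta$ by

\[
\Lambda[f,\theta](\cdot) = \Lambda(\cdot\,|\,f,\theta) : \mathbb{R}^d \ni x \mapsto f(\theta\cdot x)\,\chi_{\Omega_d}(x) \in \mathbb{R}.
\]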
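
To make the definition concrete, the following Python sketch builds a ridge function on the unit ball from a univariate profile and a direction. The name `ridge_function`, the use of plain tuples for points, and the normalization of the direction inside the helper are assumptions made for this illustration.

```python
import math

def ridge_function(profile, direction):
    """Return x -> profile(theta . x) restricted to the closed unit ball,
    where theta is `direction` normalized to unit Euclidean length."""
    length = math.sqrt(sum(c * c for c in direction))
    theta = [c / length for c in direction]

    def ridge(x):
        if sum(c * c for c in x) > 1.0:  # outside the unit ball: indicator is 0
            return 0.0
        return profile(sum(t * c for t, c in zip(theta, x)))

    return ridge

# A quadratic profile along the first coordinate axis in dimension 2.
ridge = ridge_function(lambda s: s * s, (1.0, 0.0))
print(ridge((0.3, 0.2)), ridge((0.3, -0.5)), ridge((2.0, 0.0)))
```

The first two points lie on the same line orthogonal to the direction, so the ridge function takes the same value at both; the third point lies outside the ball, where the function vanishes.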
Lemma 3.1 Ridge functions have the following properties:
(i) Given a direction $\theta \in P^{d-1}$, and any univariate function $f$, the ridge function $\Lambda[f,\theta]$ is constant over the intersection of $\Omega_d$ with each affine hyperplane which is orthogonal to $\theta$.
(ii) $\Lambda[cf + g,\theta] = c\Lambda[f,\theta] + \Lambda[g,\theta]$ for all $\theta \in P^{d-1}$, $c \in \mathbb{R}$ and univariate functions $f,g$.
(iii) If $f \in L_p(\Omega_1)$, then for all $\theta \in P^{d-1}$, $\Lambda[f,\theta] \in L_p(\Omega_d)$, with $\|\Lambda[f,\theta]\|_{L_p(\Omega_d)} \le C_{p,d}\|f\|_{L_p(\Omega_1)}$ for some constant $C_{p,d} > 0$ that depends at most on $d$ and $p$.
(iv) Given $f : \mathbb{R}^d \to \mathbb{R}$ and a univariate function $g$, there exists a univariate function $h = h_{f,g}$ such that $f * \Lambda[g,\theta] = \Lambda[h,\theta]$. Moreover, if $1 \le q \le \infty$, $f \in L_q(\Omega_d)$, and $g \in L_1(\Omega_1)$, then we have $\|\Lambda[h,\theta]\|_{L_q(\mathbb{R}^d)} \le \gamma_{d-1}\|f\|_{L_q(\Omega_d)}\|g\|_{L_1(\Omega_1)}$.

Proof.

(i) Given $\theta \in P^{d-1}$, let us denote by $\theta^\perp$ the hyperplane which is orthogonal to $\theta$ and goes through the origin; we will also denote by $\operatorname{span}\{\theta\}$ the line through the origin with direction $\theta$. As $\mathbb{R}^d = \theta^\perp \oplus \operatorname{span}\{\theta\}$, we can then express each $x \in \mathbb{R}^d$ uniquely as $x = u_x + c\theta$, where $u_x \in \theta^\perp$ is the orthogonal projection of $x$ over $\theta^\perp$, and $c\theta$ is the orthogonal projection of $x$ over $\operatorname{span}\{\theta\}$. In that case, for $x \in \Omega_d$ we have:
\[
\Lambda[f,\theta](x) = f(\theta\cdot x) = f(\theta\cdot c\theta) = f(c).
\]
(ii) is trivial.
(iii) In order to estimate norms of ridge functions for a given direction $\theta \in P^{d-1}$, we will make use of the cylinders $\operatorname{Cyl}_d(\theta)$ tangent to the unit sphere and with bases parallel to the hyperplane $\theta^\perp$ (with this definition, each $d$-dimensional cylinder is unique). Notice that those bases are $(d-1)$-dimensional balls in $\mathbb{R}^d$. In that case, we may estimate
\[
\int_{\Omega_d}|\Lambda[f,\theta](x)|^p\,dx = \int_{\Omega_d}|f(\theta\cdot x)|^p\,dx \le \int_{\operatorname{Cyl}_d(\theta)}|f(\theta\cdot x)|^p\,dx = \gamma_{d-1}\int_{\Omega_1}|f(t)|^p\,dt,
\]
which proves our statement.
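(iv) Given $x = u_x + c\theta \in \mathbb{R}^d$ as before, we observe that
\[
\begin{aligned}
\{f * \Lambda[g,\theta]\}(x)
&= \int_{\mathbb{R}^d} f(y)\,g(\theta\cdot[x - y])\,dy \\
&= \int_{\mathbb{R}^d} f(y)\,g(\theta\cdot[c\theta - y])\,dy \\
&= \{f * \Lambda[g,\theta]\}(c\theta);
\end{aligned}
\]
and therefore, this convolution is also a ridge function with the same direction. Denote by $h = h_{f,g}$ a univariate function such that $f * \Lambda[g,\theta] = \Lambda[h,\theta]$.

If $f \in L_q(\Omega_d)$ and $g \in L_1(\Omega_1)$, we may estimate,

\[
\begin{aligned}
\|\Lambda[h,\theta](x)\|_{L_q(\Omega_d)}
&= \Big\|\int_{\mathbb{R}^d} f(x - y)\,\Lambda[g,\theta](y)\,dy\Big\|_{L_q(\Omega_d)} \\
&\le \int_{\mathbb{R}^d}|\Lambda[g,\theta](y)|\,\|f(\cdot - y)\|_{L_q(\Omega_d)}\,dy \\
&= \|f\|_{L_q(\Omega_d)}\int_{\Omega_d}|g(\theta\cdot y)|\,dy \\
&\le \|f\|_{L_q(\Omega_d)}\int_{\operatorname{Cyl}_d(\theta)}|g(\theta\cdot y)|\,dy \\
&= \gamma_{d-1}\|f\|_{L_q(\Omega_d)}\int_{\Omega_1}|g(t)|\,dt,
\end{aligned}
\]
which is what we wanted to prove. [#]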
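Definition 23 Given $0 < p \le \infty$, and a homogeneous subspace of functions $F \subset L_p(\Omega_1)$, consider the following spaces of ridge functions on $\Omega_d$:
Discrete non-linear ridgelets:
\[
Y_m(F) = \Big\{\sum_{k=1}^{m}\Lambda[f_k,\theta_k] : f_k \in F,\ \theta_k \in P^{d-1}\Big\}.
\]
Discrete linear ridgelets:
\[
Y(F) = \bigcup_{m=1}^{\infty} Y_m(F).
\]
Radon Ridgelets: Let $C_d = P^{d-1}\times\Omega_1$, and consider the Inverse Radon linear functional $R^* : M(C_d) \to M(\Omega_d)$ defined by $R^*g(x) = \int_{P^{d-1}} g(\theta,\theta\cdot x)\,d\theta$:
\[
X_d(F) = \begin{cases}
\big\{ R^*g(x) : g_\theta = g(\theta,\cdot) \in F \big\} & \text{(if $d$ is odd)} \\[4pt]
\Big\{ \displaystyle\int_{\mathbb{R}} R^*g(\cdot,\cdot - t)\,\frac{dt}{t} : g_\theta \in F \Big\} & \text{(if $d$ is even)}
\end{cases}
\]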

4 Appendix

4.1 Elements of Functional Analysis

4.1.1 Linear Transformations

Theorem 26 Let $(X,\|\cdot\|_X)$ and $(Y,\|\cdot\|_Y)$ be (quasi)normed linear spaces and $F : X \to Y$ a linear map. Then the following conditions are equivalent:
(i) $F$ is bounded on some closed ball about 0 of positive radius.
(ii) $F$ is continuous at 0.
(iii) $F$ is uniformly continuous on $X$.
(iv) There exists $c > 0$ such that $\|F(x)\|_Y \le c\|x\|_X$ for all $x \in X$.
(v) In particular, if $Y = \mathbb{R}$, with the absolute value as norm, then each of the above conditions is equivalent to the following: if $F \ne 0$, then the hyperplane $Z(F) = F^{-1}(\{0\})$ is closed in $X$.

Proof.

(i)==>(ii)
By hypothesis there exist $c,b > 0$ such that $\|F(x)\|_Y \le c\|x\|_X$ for all $x \in X$ with $\|x\|_X \le b$. Given $\epsilon > 0$, we can choose $\delta = \min(b,\epsilon/c) > 0$ and we have continuity at 0.
(ii)==>(iii)
Given $\epsilon > 0$, choose $\delta > 0$ to satisfy continuity at 0, and given any $x \in X$, consider $y \in X$ with $\|x - y\|_X < \delta$. We have then $\|F(x) - F(y)\|_Y = \|F(x - y)\|_Y < \epsilon$, and $F$ is continuous at $x$. Notice that the choice of $\delta$ does not depend on the value of $x$.
(iii)==>(iv)
Given $\epsilon = 1$, choose the corresponding $\delta > 0$ as in the definition of uniform continuity. Given $x \in X$, $x \ne 0$, consider $x' = \frac{\delta}{2\|x\|_X}x$; notice that $\|x'\|_X = \delta/2$. Then it must be $\|F(x')\|_Y < 1$; therefore proving the statement:
\[
\|F(x)\|_Y = \Big\|F\Big(\frac{2\|x\|_X}{\delta}x'\Big)\Big\|_Y = \frac{2}{\delta}\|x\|_X\|F(x')\|_Y \le \frac{2}{\delta}\|x\|_X.
\]
(iv)==>(i)
This is trivial.
(iii)<==>(v)
If $F$ is continuous, then since $\{0\}$ is closed in $\mathbb{R}$, it must be $Z(F) = F^{-1}(\{0\})$ closed in $X$. Conversely, assume that $Z(F)$ is closed in $X$ and that $F$ is not continuous at 0; then there exists a sequence $(x_n)_n$ converging to zero in $X$ and a value $\delta > 0$ such that $|F(x_n)| > \delta$ for all $n$. Consider an element $x_0 \in X$ with $F(x_0) = c \ne 0$ (it exists because $F \ne 0$), and the sequence $(y_n)_n$ given by $y_n = x_0 - cx_n/F(x_n)$. Notice that $F(y_n) = 0$ for all $n$ (and so $y_n \in Z(F)$), but $\lim_n y_n = x_0 \notin Z(F)$, since $\|cx_n/F(x_n)\|_X \le (|c|/\delta)\|x_n\|_X \to 0$; a contradiction. [#]

Remark. One should not be very happy about the previous result when dealing with quasinorms; still existence of continuous linear functionals has to be proved in the space of your choice. For instance, in Lp[0,1] for 0 < p < 1, the only continuous linear functional is the zero functional! A proof of this result (M. M. Day’s Theorem) can be read in [Torc].

Corollary 26.1 All linear functionals of a finite dimensional (quasi)normed linear space are continuous.

Proof. This is a direct consequence of (v) in Theorem 26 above. [#]

Theorem 27 Any two (quasi)norms are equivalent in a finite dimensional linear space.

Proof. Given two different (quasi)norms in a finite dimensional linear space Xd, ||.||1 and ||.||2, it will be enough to prove that the linear function F : (Xd,||.||1)  -) x'-->x  (- (Xd,||.||2) is continuous. For that purpose, choose any basis of Xd, say {fk}k=1d, and decompose F in terms of the projections over the coordinate subspaces span(fk): F(.) =  sum k=1dprojk(.)fk. We have written F as a finite sum of continuous functionals (by the previous Corollary); hence, F must be continuous. Apply now part (iv) of Theorem 26 to get the desired result. [#]
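
Theorem 27 can be illustrated numerically in $\mathbb{R}^d$ with two concrete (quasi)norms. The sketch below compares the $\ell_p$ (quasi)norm, for any $0 < p < \infty$, with the supremum norm, using the elementary bounds $\|x\|_\infty \le \|x\|_p \le d^{1/p}\|x\|_\infty$; the choice of these two (quasi)norms and the helper names are assumptions made for this example.

```python
import random

def p_quasinorm(x, p):
    """(sum |x_i|^p)^(1/p); a norm for p >= 1 and a quasinorm for 0 < p < 1."""
    return sum(abs(c) ** p for c in x) ** (1.0 / p)

def sup_norm(x):
    return max(abs(c) for c in x)

def equivalence_constants(d, p):
    """Constants (A, B) with A * sup_norm(x) <= p_quasinorm(x, p) <= B * sup_norm(x)."""
    return 1.0, d ** (1.0 / p)

rng = random.Random(0)
d = 5
for p in (0.5, 1.0, 2.0, 3.0):
    lower, upper = equivalence_constants(d, p)
    ratios = []
    for _ in range(1000):
        x = [rng.uniform(-1.0, 1.0) for _ in range(d)]
        ratios.append(p_quasinorm(x, p) / sup_norm(x))
    print(p, lower, min(ratios), max(ratios), upper)
```

The observed ratios stay between the two constants for every sampled vector; the constants depend on the dimension $d$, which is why no such equivalence survives in infinite dimensional spaces.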

Corollary 27.1 Every closed bounded set of a finite dimensional (quasi)normed linear space is compact.

4.1.2 The Hahn-Banach Theorem

Theorem 28 (Hahn-Banach Lemma) Let F : X --> R be a sublinear function on a vector space X over a field K, let Y be a subspace of X and let f : Y --> K be a linear functional such that |f(x)|< F(x) for all x  (- Y . Then there exists a linear map y : X --> K which extends f and which is dominated by F on all of X.

Theorem 29 (Hahn-Banach) Let $(X,\|\cdot\|_X)$ be a normed linear space and let $Y$ be a linear subspace. Then to every continuous linear functional $f : Y \to K$ there corresponds another linear map $y : X \to K$ such that $y|_Y = f$, and $\|f\|_{Y\to K} = \|y\|_{X\to K}$.

4.2 Rearrangement-Invariant spaces

4.2.1 Riesz Spaces

Let (_O_,m) be a measure space; consider the following sets:

M(_O_,m), the set of measurable functions on _O_.
M+(_O_,m), the set of nonnegative measurable functions.
M0(_O_,m), the set of measurable functions f such that m{|f| =  oo } = 0.

A mapping $r : M^+ \to [0,\infty]$ is called a quasinorm function, if for $f,g \in M^+(\Omega,m)$, the following properties hold:

  1. $r$ is a quasinorm.
  2. If $g \le f$ ($m$-a.e.), then $r(g) \le r(f)$.
  3. If $(f_n)_n \subset M^+(\Omega,m)$ verifies $f_{n+1} \ge f_n$ ($m$-a.e.) for all $n$, and $\lim_n f_n = f$ ($m$-a.e.), then also $\lim_n r(f_n) = r(f)$.
  4. For any measurable subset $E \subset \Omega$ with $m(E) < \infty$, $r(\chi_E) < \infty$.
  5. For $E \subset \Omega$ as before, there exists a constant $C_E > 0$ such that $\int_E f\,dm \le C_E\,r(f)$ for all $f$.

Given such a quasinorm function on $(\Omega,m)$, the collection $X(r) = \{f \in M(\Omega,m) \mid r(|f|) < \infty\}$ is called a Riesz space associated to $r$. Such spaces inherit special properties from their quasinorm:

  1. Lattice property: if $|g| \le |f|$, then $r(g) \le r(f)$.
  2. Fatou's property: Let $f,(f_n)_n \subset M^+(\Omega,m)$ be such that $f_n$ increases to $f$. If $f \in X(r)$, then $\lim_n r(f_n) = r(f)$; otherwise, $\lim_n r(f_n) = \infty$.
  3. Fatou's Lemma: Given $(f_n)_n \subset M(\Omega,m)$, $r(\liminf_n f_n) \le \liminf_n r(f_n)$.
  4. $\chi_E \in X(r)$ for every measurable subset $E \subset \Omega$ with $m(E) < \infty$; besides, there exists a constant $C_E > 0$ such that $\int_E f\,dm \le C_E\,r(f)$ for all $f \in X(r)$.
  5. Every convergent sequence in $X(r)$ has an $m$-a.e. convergent subsequence.
  6. Riesz-Fischer property: If $\sum_{n=1}^{\infty} r(f_n) < \infty$, then $\sum_{n=1}^{\infty} f_n$ converges in $X(r)$ to a function $f \in X(r)$, and $r(f) \le \sum_{n=1}^{\infty} r(f_n)$.

4.2.2 Resonant measure spaces

Definition 24 Given a measure space $(\Omega,m)$, and $f \in M(\Omega,m)$, consider the associated functions:

Distribution. $m_f : (0,\infty) \ni y \mapsto m\{x \in \Omega : |f(x)| > y\} \in [0,\infty]$.
Decreasing Rearrangement. $f^* : [0,\infty) \ni t \mapsto \inf\{y > 0 : m_f(y) \le t\} \in [0,\infty]$.
Maximal function. $f^{**} : (0,\infty) \ni t \mapsto \frac{1}{t}\int_0^t f^*(s)\,ds \in [0,\infty]$.

We say two measurable functions f,g are equimeasurable (and we write f ~ g), if mf = mg.
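
These three functions are easy to compute on a finite model. The Python sketch below assumes a space of $n$ atoms of mass $1/n$ each (the same model used in the Example later in this section); the function names are hypothetical, and the evaluation points in the usage lines are chosen away from the jump points $k/n$.

```python
def distribution(values, y):
    """m_f(y): measure of the set where |f| > y, on n atoms of mass 1/n."""
    n = len(values)
    return sum(1 for v in values if abs(v) > y) / n

def rearrangement(values, t):
    """f*(t) = inf{y > 0 : m_f(y) <= t}; a step function with steps of length 1/n."""
    n = len(values)
    ordered = sorted((abs(v) for v in values), reverse=True)
    k = int(t * n)  # index of the step [k/n, (k+1)/n) containing t
    return ordered[k] if k < n else 0.0

def maximal(values, t):
    """f**(t) = (1/t) * integral of f* over (0, t), for t > 0."""
    n = len(values)
    ordered = sorted((abs(v) for v in values), reverse=True)
    total = 0.0
    for k, v in enumerate(ordered):
        left, right = k / n, (k + 1) / n
        if t <= left:
            break
        total += v * (min(t, right) - left)
    return total / t

f = [3.0, -1.0, 2.0]
print(distribution(f, 1.5), rearrangement(f, 0.5), maximal(f, 0.5))
```

The printed values show the general pattern $f^*(t) \le f^{**}(t)$, since $f^{**}(t)$ averages the larger values that $f^*$ takes on $(0,t)$.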

Lemma 4.1 (Properties of the distribution function) Given f,g  (- M(_O_,m),
  1. $m_f$ is a nonnegative, right-continuous and monotone decreasing function.
  2. If $|g| \le |f|$ ($m$-a.e.), then $m_g \le m_f$.
  3. Given a sequence $(f_n)_n \subset M(\Omega,m)$ such that $|f_{n+1}| \ge |f_n|$ for all $n$, and $\lim_n |f_n| = |f|$, then we also have $\lim_n m_{f_n} = m_f$.

Lemma 4.2 (Properties of the decreasing rearrangement) Given f,g  (- M(_O_,m),
  1. $f^*$ is a nonnegative, right-continuous decreasing function.
  2. If $|g| \le |f|$ ($m$-a.e.), then $g^* \le f^*$.
  3. $(af)^* = |a|f^*$ for all $a \in \mathbb{R}$.
  4. Weak subadditivity: $(f + g)^*(t + s) \le f^*(t) + g^*(s)$ for all $t,s > 0$.
  5. Fatou's property: If $|f| \le \liminf_n |f_n|$ ($m$-a.e.), then also $f^* \le \liminf_n f_n^*$; in particular, if $(f_n)_n$ verifies $|f_{n+1}| \ge |f_n|$ for all $n$, and $\lim_n |f_n| = |f|$ ($m$-a.e.), then $\lim_n f_n^* = f^*$.
  6. $f \sim f^*$.
  7. $(|f|^p)^* = (f^*)^p$ for all $0 < p < \infty$.
  8. For all $f \in M_0(\Omega,m)$ and $0 < p < \infty$,
\[
\int_\Omega |f|^p\,dm = p\int_0^\infty c^{p-1}m_f(c)\,dc = \int_0^\infty f^*(t)^p\,dt.
\]
  9. $\operatorname{ess\,sup}_\Omega |f| = \inf\{c \mid m_f(c) = 0\} = f^*(0)$.

Lemma 4.3 (Properties of the maximal function) Given f,g  (- M0(_O_,m),
  1. $f^{**}$ is a nonnegative, nonincreasing continuous function.
  2. $f^{**} \equiv 0$ if and only if $f \equiv 0$.
  3. $f^* \le f^{**}$.
  4. If $|g| \le |f|$ ($m$-a.e.), then $g^{**} \le f^{**}$.
  5. $(af)^{**} = |a|f^{**}$ for all $a \in \mathbb{R}$.
  6. If $(f_n)_n \subset M(\Omega,m)$ verifies $|f_{n+1}| \ge |f_n|$ for all $n$, and $\lim_n |f_n| = |f|$, then $\lim_n f_n^{**} = f^{**}$.
  7. Subadditivity: $(f + g)^{**}(t) \le f^{**}(t) + g^{**}(t)$ for all $t > 0$.

Remark. These three new functions associated to f may be used to perform integral operations on f, but in a simpler setup. The three subsequent results show us how:

Lemma 4.4 Given a simple nonnegative function $g \in M(\Omega,m)$, the following estimate holds:

\[
\int_\Omega g\,dm \le \int_0^{m(\Omega)} g^*(s)\,ds.
\]
Proposition 13 (Hardy-Littlewood) Given $f,g \in M_0(\Omega,m)$, the following estimate holds:
\[
\int_\Omega |fg|\,dm \le \int_0^\infty f^*(s)g^*(s)\,ds.
\]
Corollary 13.1 Given $f,g \in M_0(\Omega,m)$,
\[
\int_\Omega |f\tilde g|\,dm \le \int_0^\infty f^*(s)g^*(s)\,ds,
\]
for all $\tilde g \sim g$.

Example. Consider $\Omega = \{1,\dots,n\}$ with measure $m : \Omega \ni k \mapsto 1/n$ for all $k$. Notice that, given any measurable function $g : \Omega \ni k \mapsto g_k \in \mathbb{R}$, any $\tilde g \sim g$ may be obtained by mere permutation of the elements (there exists a permutation $\sigma \in S_n$ such that $|\tilde g_k| = |g_{\sigma(k)}|$). In this case, equality is attained in Corollary 13.1: Given $f = (f_1,\dots,f_n)$, consider a permutation $\sigma$ such that $|f_{\sigma(k)}| \ge |f_{\sigma(k+1)}|$ for all $k$; then we have $m_f = \chi_{[0,|f_{\sigma(n)}|)} + \sum_{k=1}^{n-1}\frac{k}{n}\chi_{[|f_{\sigma(k+1)}|,|f_{\sigma(k)}|)}$, and $f^* = \sum_{k=1}^{n}|f_{\sigma(k)}|\chi_{[(k-1)/n,k/n)}$; therefore, for any given $g = (g_1,\dots,g_n)$, it suffices to find two permutations: first $\sigma'$ permutes the indices so that $|g_{\sigma'(k)}| \ge |g_{\sigma'(k+1)}|$, and then $\tau$ matches $\sigma'(k)$ with $\sigma(k)$. This gives us, for $\tilde g_k = g_{\tau(k)}$, that

\[
\int_\Omega |f\tilde g|\,dm = \sum_{k=1}^{n}\frac{1}{n}|f_k\tilde g_k| = \sum_{k=1}^{n}\frac{1}{n}|f_{\sigma(k)}g_{\sigma'(k)}| = \int_0^\infty f^*(s)g^*(s)\,ds.
\]
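
The construction in this Example can be checked by brute force for small $n$. The following Python sketch enumerates all permutations of $g$ and compares the largest value of $\int_\Omega |f\tilde g|\,dm$ with $\int_0^\infty f^*g^*$; the function names are hypothetical and the exhaustive search is only practical for a handful of points.

```python
import itertools

def integral_of_product(f, g):
    """Integral of |f g| on n atoms of mass 1/n each."""
    return sum(abs(a * b) for a, b in zip(f, g)) / len(f)

def rearranged_integral(f, g):
    """Integral of f* g*: pair the sorted absolute values."""
    fs = sorted((abs(v) for v in f), reverse=True)
    gs = sorted((abs(v) for v in g), reverse=True)
    return sum(a * b for a, b in zip(fs, gs)) / len(f)

def best_rearrangement(f, g):
    """Maximum of the integral of |f g~| over all permutations g~ of g."""
    return max(integral_of_product(f, perm) for perm in itertools.permutations(g))

f = [1.0, -4.0, 2.0, 3.0]
g = [0.5, 2.0, -1.0, 3.0]
print(best_rearrangement(f, g), rearranged_integral(f, g))
```

The two printed numbers coincide, and no permutation exceeds the rearranged integral, in agreement with Corollary 13.1.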

Definition 25 We say a measure space $(\Omega,m)$ is resonant if

\[
\sup_{\tilde g \sim g}\int_\Omega |f\tilde g|\,dm = \int_0^\infty f^*(s)g^*(s)\,ds
\]
for all $f,g \in M_0(\Omega,m)$. We say the space is strongly resonant if the supremum is attained.

We will prove that any compact cube _O_ < Rd with the Lebesgue measure is a strongly resonant space, and therefore we may use the previous results to simplify the computation of integral operations on it:

Lemma 4.5 Let _O_ < Rd be a compact cube, and let m = |.| denote the Lebesgue measure. Given f  (- M0(_O_,m), and t  (- [0,|_O_|], there exists a measurable subset _O_t < _O_ with |_O_t| = t, and such that  integral _O_t|f(x)|dx =  integral 0tf*(s)ds. Moreover, these sets can be chosen so that s < t implies _O_s < _O_t.

Proposition 14 Any cube in Rd with the Lebesgue measure is a strongly resonant space.
Definition 26 Given a totally $\sigma$-finite measure space $(\Omega,m)$, a quasinorm function $r : M^+(\Omega,m) \to \mathbb{R}^+$ such that $r(f) = r(g)$ for each $f \sim g \in M_0^+(\Omega,m)$ is called rearrangement-invariant. In that case, the space $X(r)$ is said to be rearrangement-invariant as well.

Notice that the spaces Lp for any 0 < p < oo are all rearrangement-invariant.

Definition 27 Let (X,||.||X) be a rearrangement-invariant function space over a resonant measure space (_O_,m). Consider the function fX : [0,m(_O_)]  -) t'-->||xE||X  (- R, where E < _O_ is any measurable subset with m(E) = t (notice that, if F < _O_, F/=E and m(F) = m(E), then xF ~ xE and they have the same norm).

4.3 To Do

  1. We need the proof of the Reiteration Theorem (page 5) for h,q interpolation spaces via the real method of Peetre and Lions, and the previous result. It will be used in section 2.11.
  2. Perhaps a more exhaustive reading on the paper by Brown and Lucier. Mainly the characterization of best-approximations in L1 and the main theorem.
  3. Finish the proof of the Lemma on mollifiers in section 2.6.1.
  4. Maybe include the construction of several K-functionals; this will give you the opportunity of presenting the Rearrangement-Invariant spaces and its applications. If you want to prove Whitney’s Theorem, then you must show the K-functional for the pair (Lp,Wpr).
  5. Prove the results in section 4.1.2, and include a few corollaries. Which ones? I am not sure; among the ones I have in Friedman and Prof. Philips’ notes, it is possible that there are some that have interest for Approximation Theory. Sit on it for a while.
  6. Finish the proofs of the results in section 2.7.1 on Besov Spaces.
  7. Extend the results in section 1.3 using chapter 3 of [DeLo]. Theorem 1.3 and its implications for best approximation in $L_p$ for $1 < p < \infty$ give an idea of what one can or cannot require of (near)best operators (namely linearity, continuity and boundedness), and of how to use those properties to construct (near)best approximations to any given function. This is a good spot to include item 2, although it might be even better in section 2.1.
  8. Maybe prove all the claims in section 4.2.2, mainly for completeness. Otherwise the interested reader can be directed to the proofs in [BeSh].
  9. Finish the missing proofs in section 2.6.3 related to the equality of spaces $H_p^r(G) = W_p^r(G)$.

References

[Adam]   R.A. Adams, “Sobolev Spaces”, Academic Press, New York, 1975.

[deBo]   C. de Boor, “Class notes for Math/CS 887, Spring’03”, http://www.cs.wisc.edu/~deboor.

[dBFi]   C. de Boor and G.J. Fix, “Spline approximation by quasi-interpolants”, J. Approx. Theory 8 (1973), 19-45.

[BeSh]   C. Bennett and R. Sharpley, “Interpolation of Operators”, Academic Press, New York, 1988.

[BrLu]   L. Brown and B. Lucier, “Best approximations in $L_1$ are near best in $L_p$, $p < 1$”, Proc. Amer. Math. Soc. 120 (1994), 97-100.

[Bure]   V.I. Burenkov, “Sobolev Spaces in Domains”, http://www.cf.ac.uk/maths/people/Sobol.pdf

[Cal1]   A.P. Calderón, “Intermediate spaces and interpolation: the complex method”, Studia Math. 24 (1964), 113-190.

[Cal2]   A.P. Calderón, “Spaces between $L_1$ and $L_\infty$ and the Theorem of Marcinkiewicz: the complex method”, Studia Math. 26 (1964), 273-279.

[CDeH]   A. Cohen, R. DeVore and R. Hochmuth, “Restricted Approximation”, Constr. Approx. 16 (2000), no. 1, 85-113.

[CuSc]   H.B. Curry and I.J. Schoenberg, “On Pólya frequency functions. IV. The fundamental spline functions and their limits”, J. Analyse Math. 17 (1966), 71-107.

[DeVo]   R. DeVore, “Nonlinear Approximation”, Acta Numerica 7 (1998), 51-150.

[DeLo]   R. DeVore and G. Lorentz, “Constructive Approximation”, Springer Grundlehren, Heidelberg, 1993.

[DeP1]   R. DeVore and V. Popov, “Interpolation of Besov Spaces”, Trans. Amer. Math. Soc. 305 (1988), 397-414.

[DeP2]   R. DeVore and V. Popov, “Interpolation spaces and nonlinear approximation”, Function Spaces and Applications (M. Cwikel et al., eds), Vol. 1302 of Lecture Notes in Mathematics, Springer, Berlin, 191-205.

[DeSh]   R. DeVore and R. Sharpley, “Maximal Functions Measuring Smoothness”, Memoirs Vol. 293 (1984), American Mathematical Society, Providence, RI.

[Frie]   A. Friedman, “Foundations of Modern Analysis”, Dover, New York, 1982.

[Peet]   J. Peetre, “A Theory of Interpolation of Normed Spaces”, Course notes, University of Brasilia (1963).

[Petr]   P. Petrushev, “Approximation by Ridge Functions and Neural Networks”, SIAM J. Math. Anal. 30 (1998), 115-189.

[Torc]   A. Torchinsky, “Real Variables”, Addison-Wesley, 1988.