1.19 Lesson 19

# Lesson 19.  Richardson Extrapolation and Romberg Integration
# ------------------------------------------------------------
# 
# Last time we defined the Newton-Cotes rules for numerical 
# integration of int(f(x), x=a..b).
# 
> restart: 
> h:= n ->(b-a)/n:
> X:= (i,n) -> a + i*(b-a)/n:
> a:= 0: b:= 1: 
> J:= f -> int(f(x), x=a..b): 
> rule:= (n,k) -> 
>    h(n) * sum(sum(d[j]*f(X(k*i+j,n)), j=0..k), i=0..n/k-1): 
> eqns:= k -> eval({ seq(subs(f=unapply(x^ii,x), rule(k,k)=J(f)), ii=0..k) }): 
Warning, `ii` is implicitly declared local

> for kk from 2 to 12 do 
>  ncrule[kk]:= unapply(subs(solve(eqns(kk)), rule(n,kk)), n) od:
# We saw that the Newton-Cotes rule ncrule[k](n) gives the 
# correct answer when f is a polynomial of degree q = k (if k is 
# odd) or k+1 (if k is even).  The error is a constant times 
# (D@@(q+1))(f)(t)/n^(q+1). 
# 
# If we compare a Newton-Cotes rule with 1/n^p in the error to 
# one with 1/n^q in the error, where q > p, the second one is 
# more accurate when n is large.  However, as we saw last time, 
# this is not necessarily true for any given n, especially when 
# the higher-order derivatives of f are large.
> f:= x -> 1/(x^2 + 1/20):
> map(k -> evalf(sqrt(20)*arctan(sqrt(20))-ncrule[k](36)), [2,3,4,6,9,12]);

            -7         -6           -5          -5          -5          -6
     [.62*10  , .223*10  , -.1006*10  , .3563*10  , .1121*10  , -.503*10  ]
# Here's an even more striking example.  This time we take n=k.  
# For each n we find what would appear to be the best possible 
# approximation to the integral, given the values at the points 
# x[0],...,x[n] - an approximation that is exact for polynomials 
# of the greatest possible degree.  But in this case it turns 
# out that the errors actually get worse as n increases.  
> f:= x -> 1/((8*x-4)^2+1):
> map(k -> evalf(J(f)-ncrule[k](k)), [2,4,6,8,10,12]);

[-.3548200938, .0467485336, -.0846453499, .0888176280, -.1179906340, .1646286874]
# Richardson extrapolation is a very widely applicable idea, 
# which can often be used to improve the results of a numerical 
# method.
# Suppose you want to approximate a quantity J, and you have 
# available approximations Q(n).  Typically, you know that this 
# approximation is of a certain order, say J = Q(n) + O(1/n^p). 
# But often, more can be said: 
> eq1:= J = Q(n) + A/n^p + O(1/n^(p+1)):
# 
# The idea of Richardson extrapolation is to take two different 
# values of n, and eliminate the A term.  Typically, we take n 
# and n/2.  Now
> eq2:= J = Q(n/2) + 2^p*A/n^p + O(1/n^(p+1)):
# We can take a combination of this equation and the last one
# and eliminate the A/n^p term which is the main contribution
# to the error.
> solve({eq1,eq2},{J,A});

                       p                    1             1      p
                 Q(n) 2  - Q(1/2 n) - O(--------) + O(--------) 2
                                         (p + 1)       (p + 1)
                                        n             n
            {J = -------------------------------------------------,
                                             p
                                      - 1 + 2

                       p
                      n  (- Q(n) + Q(1/2 n))
                A = - ----------------------}
                                    p
                             - 1 + 2
# Thus (ignoring the higher-order terms)
> R:=subs(O(1/n^(p+1))=0, subs(",J));

                                        p
                                  Q(n) 2  - Q(1/2 n)
                             R := ------------------
                                              p
                                       - 1 + 2
# should be a better approximation to the true value.  Probably 
# more important is the fact that 
> subs("", A/n^p);

                                 - Q(n) + Q(1/2 n)
                               - -----------------
                                             p
                                      - 1 + 2
# is an approximation for the error in Q(n).  The error in our
# improved approximation R is not known (though it should be
# O(1/n^(p+1))), but to be conservative we can take the value
# above as an indication of its error as well.
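# As a quick sanity check (not part of the original worksheet), we 
# can substitute a hypothetical error expansion Q(n) = Jval - A/n^p 
# - B/n^(p+1) into the extrapolated value R, taking the concrete 
# order p = 2 for simplicity.  The names Qex, Jval and B are just 
# placeholders for this check.  The A/n^2 term should cancel, 
# leaving only a multiple of B/n^3:
> Qex:= (nn,pp) -> Jval - A/nn^pp - B/nn^(pp+1):
> normal(Jval - (2^2*Qex(n,2) - Qex(n/2,2))/(2^2 - 1));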
# 
# Suppose we apply this idea to the Trapezoid Rule.
> traprule:= n ->\
     h(n) * sum((f(X(i,n))+f(X(i+1,n)))/2, i=0..n-1): 
# It turns out that the error in traprule(n) for smooth functions
# f is of the form
# sum(A[2*k]/n^(2*k), k=1..N) + O(1/n^(2N+1))
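# (A small illustration, not in the original worksheet: for a 
# concrete smooth f, say x^4, the error can be computed in closed 
# form, and only even powers of 1/n should appear in it.)
> f:= x -> x^4:
> normal(J(f) - traprule(n));
> f:= 'f':   # undo the temporary assignment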
# The leading term in the error is A[2]/n^2.  Richardson 
# extrapolation removes this:
> T[1]:= n -> (4*traprule(n)-traprule(n/2))/3;

               T[1] := n -> 4/3 traprule(n) - 1/3 traprule(1/2 n)
> f:= 'f': T[1](2);

                        1/6 f(0) + 2/3 f(1/2) + 1/6 f(1)
# Look familiar? This is Simpson's Rule for n=2.  In fact, 
# T[1] is just the same as Simpson's Rule.  The error in T[1] should
# be O(1/n^4), which is true for Simpson's Rule.  What is new
# is the error estimate
> Er[1]:= n -> (traprule(n)-traprule(n/2))/3;

               Er[1] := n -> 1/3 traprule(n) - 1/3 traprule(1/2 n)
# But there's no reason to stop here.  We can apply Richardson
# Extrapolation to Simpson's Rule as well.
> T[2]:= n -> (16*T[1](n)-T[1](n/2))/15;

                  T[2] := n -> 16/15 T[1](n) - 1/15 T[1](1/2 n)
> T[2](4);

                      16                          16
         7/90 f(0) + ---- f(1/4) + 2/15 f(1/2) + ---- f(3/4) + 7/90 f(1)
                      45                          45
# This turns out to be the same as the Newton-Cotes rule 
# ncrule[4].  But after this the Richardson Extrapolation results
# are not Newton-Cotes rules.
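# (A quick check, not in the original worksheet: the difference of 
# the two rules should expand to 0.)
> expand(T[2](4) - ncrule[4](4));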
> T[3]:= n -> (64*T[2](n)-T[2](n/2))/63;

                  T[3] := n -> 64/63 T[2](n) - 1/63 T[2](1/2 n)
> T[3](8);

  31         512           176           512           218           512
 --- f(0) + ---- f(1/8) + ---- f(1/4) + ---- f(3/8) + ---- f(1/2) + ---- f(5/8)
 810        2835          2835          2835          2835          2835

         176           512           31
      + ---- f(3/4) + ---- f(7/8) + --- f(1)
        2835          2835          810
> ncrule[8](8);

       989          2944           464            5248           454
      ----- f(0) + ----- f(1/8) - ----- f(1/4) + ----- f(3/8) - ---- f(1/2)
      28350        14175          14175          14175          2835

              5248           464            2944           989
           + ----- f(5/8) - ----- f(3/4) + ----- f(7/8) + ----- f(1)
             14175          14175          14175          28350
# T[3] has error O(1/n^8) while ncrule[8] has error O(1/n^10), so
# ncrule[8] should be better for large n.  Let's try the last 
# example.
> f:= x -> 1/((8*x-4)^2+1):\
[seq(evalf(J(f)-T[3](8*jj)), jj= 1 .. 10)];

     [.0085039025, -.0002856241, -.0002048834, -.0000297271, -.0000143341,

                 -6            -5          -6           -6          -6
         .6357*10  , -.15203*10  , .4373*10  , -.2525*10  , .1042*10  ]
> [seq(evalf(J(f)-ncrule[8](8*jj)), jj= 1 .. 10)];

                                                   -5                       -6
 [.0888176280, -.0006237649, .0005724945, .93610*10  , .0000112229, .9989*10  ,

             -6         -8         -7          -7
     .5013*10  , -.15*10  , .452*10  , -.110*10  ]
# We can carry the Richardson extrapolation idea as far as we 
# want.  It's usually done in a table; this tabular scheme, 
# applied to the trapezoid rule, is known as Romberg integration.  
# Note that only at the first level (the trapezoid rule) do you 
# actually evaluate f (and the values from traprule(n/2) can be 
# reused in traprule(n), as the sketch below shows), so the 
# following will only require the same number of function 
# evaluations as traprule(32).  I'm going to use a different 
# example, which works out better than the last one.
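# (A sketch, not in the original worksheet, of the reuse just 
# mentioned: the trapezoid value on 2*m subintervals can be 
# assembled from the value on m subintervals plus the f-values at 
# the m new midpoints only.  The name trapnext is a placeholder.)
> trapnext:= (Tm, m) -> Tm/2 + h(2*m)*sum(f(X(2*i+1, 2*m)), i=0..m-1):
# For instance, with the current f,
> evalf(trapnext(traprule(16), 16) - traprule(32));   # should be 0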
> f:= x -> x/(x^2 + 1/10):
> for jj from 1 to 5 do R[jj,1]:= evalf(traprule(2^jj)) od:
> for kk from 2 to 5 do \
  for jj from kk to 5 do \
R[jj,kk]:= (4^(kk-1)*R[jj,kk-1] - R[jj-1,kk-1])/(4^(kk-1)-1)\
od od;
> with(linalg, matrix):\
for kk from 2 to 5 do\
  for jj from 1 to kk-1 do\
R[jj,kk]:= `                `\
od od;
> R[1,1];

                                   .9415584416
> matrix([seq([seq(R[jj,kk], jj=2..5)], kk=1..5)]);

      [    1.138413473       1.184736526       1.195437378    1.198072507 ]
      [                                                                   ]
      [    1.204031817       1.200177544       1.199004329    1.198950883 ]
      [                                                                   ]
      [                      1.199920592       1.198926115    1.198947320 ]
      [                                                                   ]
      [                                        1.198910329    1.198947656 ]
      [                                                                   ]
      [                                                       1.198947802 ]
# The true value, for comparison, is
> evalf(J(f));

                                   1.198947637
# And here are the error estimates.
> for kk from 1 to 4 do for jj from kk+1 to 5 do \
Ee[jj,kk]:= (R[jj,kk]-R[jj-1,kk])/(4^kk-1)\
od od;
> for kk from 2 to 4 do\
  for jj from 1 to kk do\
Ee[jj,kk]:= `                `\
od od;
> matrix([seq([seq(Ee[jj,kk], jj=2..5)], kk=1..4)]);

   [   .06561834379      .01544101767      .003566950666     .0008783763332  ]
   [                                                                         ]
   [                                                                      -5 ]
   [                    -.0002569515333  -.00007821433334  -.3563066667*10   ]
   [                                                                         ]
   [                                                                      -6 ]
   [                                     -.00001578534920   .3365873015*10   ]
   [                                                                         ]
   [                                                                      -6 ]
   [                                                        .1463803921*10   ]
# For comparison, here are the true errors.
> for kk from 1 to 5 do for jj from kk to 5 do \
Et[jj,kk]:= evalf(J(f))-R[jj,kk]\
od od;
> for kk from 2 to 5 do\
  for jj from 1 to kk-1 do\
Et[jj,kk]:= `                `\
od od;
> matrix([seq([seq(Et[jj,kk], jj=2..5)], kk=1..5)]);

      [    .060534164        .014211111        .003510259      .000875130 ]
      [                                                                   ]
      [                                                                -5 ]
      [    -.005084180       -.001229907       -.000056692    -.3246*10   ]
      [                                                                   ]
      [                                                               -6  ]
      [                      -.000972955       .000021522      .317*10    ]
      [                                                                   ]
      [                                                               -7  ]
      [                                        .000037308      -.19*10    ]
      [                                                                   ]
      [                                                                -6 ]
      [                                                        -.165*10   ]
# One indication of whether everything is going according to 
# plan is whether the error estimates behave the right way as 
# you go across the row.  In the k=1 row, where R[j,1] = 
# traprule(2^j), which should have an error proportional to 1/n^2 
# = 1/4^j, each error should be about 1/4 of the previous one.
> seq(Ee[jj,1]/Ee[jj+1,1], jj = 2 .. 4);

                      4.249612635, 4.328912597, 4.060845598
# In the k=2 row, the error should be proportional to 1/n^4, so 
# each error should be about 1/16 of the previous one.
> seq(Ee[jj,2]/Ee[jj+1,2], jj = 3 .. 4);

                            3.285223083, 21.95140890
# The last one's not too bad, though the first one's way off.
# In the k=3 row, we should have about 1/64.
> Ee[4,3]/Ee[5,3];

                                  -46.89823154
# Not only is it not 64, it's negative.  It appears that the 
# error estimates in the third row are not trustworthy (and 
# indeed Ee[4,3] is quite bad).
#