
# Lesson 2: Floating point computations
# =====================================
# 
# As we saw in Lesson 1, Maple writes decimals or floating-point numbers 
# with a certain number of significant digits, specified by the variable "Digits".  
# Here are some floating-point numbers, with the default Digits = 10.
> evalf(exp(20)), evalf(exp(-7)), evalf(exp(-20));  

                              9                                -8
                .4851651954*10 , .0009118819656, .2061153622*10
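# As an aside, Python's "decimal" module also computes in base ten with a 
# settable precision, so it can mimic this behavior (a sketch; decimal's exp 
# method is documented to be correctly rounded):

```python
from decimal import Decimal, getcontext

getcontext().prec = 10          # plays the role of Maple's Digits := 10

print(Decimal(20).exp())        # 485165195.4   (Maple: .4851651954*10^9)
print(Decimal(-7).exp())        # 0.0009118819656
print(Decimal(-20).exp())       # 2.061153622E-9
```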
# In principle a floating-point number is written as
# (possibly -) .digits * 10^exponent, but Maple omits the exponent when it is 
# small.  The exponent can be anywhere from -9989 to 10009 for a floating-point 
# number to be written this way.
> bignumber:=evalf(.10*10^10009);

                                                   10009
                        bignumber := .1000000000*10
> smallnumber:= evalf(.10*10^(-9989));  

                                                    -9989
                       smallnumber := .1000000000*10
# That should be a big enough range for most purposes, but there is another 
# notation for exponents outside that range, which can be used for some 
# REALLY big or small numbers.
> bignumber*10, bignumber^1000, smallnumber^10000;

                    10010                10008001                -99899999
      .1000000000*10     , .1000000000*10        , .1000000000*10
# Many implementations of floating point store and calculate numbers 
# in base 2 (converting to base 10 for output), but Maple actually uses base 10.  
# So this calculation has no roundoff error, because 1/5 can be represented 
# exactly in base 10:
> evalf(1/5)*5;

                                   1.000000000
# This one, on the other hand, does have roundoff error, because 1/3 can't be 
# represented exactly.
> evalf(1/3)*3;

                                   .9999999999
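# The same contrast can be seen from Python, where the built-in float is 
# binary but the "decimal" module is base ten (a sketch):

```python
from decimal import Decimal, getcontext

getcontext().prec = 10                 # like Digits := 10

print(Decimal(1) / Decimal(5) * 5)     # 1.0 -- 1/5 is exact in base 10
print(Decimal(1) / Decimal(3) * 3)     # 0.9999999999 -- 1/3 is not

# Binary floats have the opposite blind spot: 0.1 is not exact in base 2.
print(0.1 + 0.2 == 0.3)                # False
```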
# You can think of a floating point number as representing a whole interval of 
# numbers, all of which have the same floating point representation.  Thus 
# everything from 1.9999999995 to 2.0000000005 is represented as 2.000000000.  
# When you add a small floating point number to a larger number, some 
# significant digits of the small number are lost.
> .1234567891*10^(-5)+1;

                                   1.000001235
# In the extreme case, when two numbers differ by at least "Digits" orders 
# of magnitude, the smaller number is lost altogether.
> 1.+ 10^(-10);

                                   1.000000000
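# Both effects can be reproduced with base-ten decimal arithmetic in Python 
# (a sketch):

```python
from decimal import Decimal, getcontext

getcontext().prec = 10

# Some digits of the small addend fall below the last place kept.
print(Decimal('1.234567891E-6') + 1)   # 1.000001235

# Ten orders of magnitude apart: the small number is lost altogether.
print(Decimal(1) + Decimal('1E-10'))   # 1.000000000
```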
# This can have serious consequences if you are trying to add up many numbers 
# that are each very small (e.g. in numerical integration).  Suppose we start with 
# 0 and add 10000 numbers, each equal to 4.444444444 * 10^(-4).  The correct 
# answer would be 4.444444444.
# I'll use a "do loop": the command total:= total + 4.444444444 * 10^(-4) will be 
# executed 10000 times, once for each integer value of j from 1 to 10000.  Any 
# sequence of commands could go between the "do" and the "od" (which is "do" 
# backwards).  Note the ":" instead of ";", because we don't want to see 10000 
# outputs.
> total:= 0:
> for j from 1 to 10000 do 
>    total:= total + 4.444444444 * 10^(-4) od:
> total;

                                   4.444440910
# The way to fix this would be to increase Digits: 14 should be enough.
> Digits:= 14:
> total:= 0:
> for j from 1 to 10000 do 
>    total:= total + 4.444444444 * 10^(-4) od:
> total;

                                 4.4444444440000
> Digits:= 10:
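# The same experiment in Python's decimal module shows the loss at 10 digits 
# and the fix at 14 (a sketch; the lossy total may differ slightly from Maple's 
# because of rounding-mode details):

```python
from decimal import Decimal, getcontext

def add_up(prec):
    # Add 4.444444444E-4 to a running total 10000 times at the given precision.
    getcontext().prec = prec
    term = Decimal('4.444444444E-4')
    total = Decimal(0)
    for _ in range(10000):
        total = total + term
    return total

lossy = add_up(10)    # roundoff accumulates; not the correct 4.444444444
exact = add_up(14)    # 14 digits are enough: exactly 4.444444444
```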
# Another case where roundoff error can be severe is when we subtract two 
# numbers that are almost equal.  This is called "catastrophic cancellation".
> 1.000000001 - 1.000000000;

                                          -8
                                     .1*10
# We subtracted two numbers with 10 significant digits and the result has only 
# one significant digit.  
# One good way to think about this is in terms of absolute and relative errors.  In 
# approximating a "true value" T by an approximate value A, the absolute error 
# is |T-A|.  The relative error is |T-A|/|T|.  In representing a number in floating 
# point the relative error should be at most 5 * 10^(-Digits).
# When two numbers are multiplied or divided, the relative error of the result is 
# at most (approximately) the sum of the two relative errors.  This usually 
# doesn't cause any problems.
# When two numbers are added or subtracted, the absolute error of the result is 
# at most the sum of the two absolute errors.  If the result is close to 0, this 
# means the relative error can be much larger than the relative errors of the 
# original numbers.
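# Here is a small Python illustration of that last point, using the decimal 
# module; the "true" values below are hypothetical, chosen only to make the 
# arithmetic visible:

```python
from decimal import Decimal, getcontext

getcontext().prec = 10

# Hypothetical true values (not from Maple; chosen for illustration).
a_true = Decimal('1.0000000007')
b_true = Decimal('1.0000000003')

a = +a_true                  # unary + rounds to 10 digits: 1.000000001
b = +b_true                  # 1.000000000
diff = a - b                 # 1E-9, but the true difference is 4E-10

rel_in = abs(a - a_true) / a_true                             # ~3E-10: tiny
rel_out = abs(diff - (a_true - b_true)) / (a_true - b_true)   # 1.5, i.e. 150%
```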
# Here is an example of catastrophic cancellation affecting the solution of a 
# quadratic equation.  Here's the solution formula from high school.  
> r:= solve(a*x^2+b*x+c=0, x);

                             2         1/2              2         1/2
                     - b + (b  - 4 a c)         - b - (b  - 4 a c)
            r := 1/2 ---------------------, 1/2 ---------------------
                               a                          a
# Let's try it when b is large compared to a and c.
> s:=solve(x^2 + 9751*x + 1 = 0, x);

                                       1/2                         1/2
           s := - 9751/2 + 1/2 95081997   , - 9751/2 - 1/2 95081997
# Let's use "evalf" to get floating point answers.
# There's no accuracy problem with the second one.  We get 10 significant digits.
> evalf(s[2]);

                                  -9750.999898
# The first, however, involves subtracting two numbers that are very close 
# together.  The result has about the same absolute error as the second 
# solution, but only 3 significant digits.
> evalf(s[1]);

                                    -.000102
# Using a higher setting of "Digits" gives us more accuracy.
> evalf(s[1], 20); evalf(s[2], 20);

                               -.0001025535853263

                             -9750.9998974464146737
# But we can actually get better relative error at the same setting of "Digits" by 
# using a different formula that won't involve this catastrophic cancellation. 
# Recall that the product of the roots is c/a. 
> expand(r[1]*r[2]);   

                                       c/a
# So the first root should be
> r[1] = c/(a*r[2]);

                           2         1/2
                   - b + (b  - 4 a c)                  c
               1/2 --------------------- = 2 ---------------------
                             a                        2         1/2
                                              - b - (b  - 4 a c)
# which won't produce a catastrophic cancellation (when b is positive).  In our 
# case a = c = 1, so the first root is 1/s[2]:
> evalf(1/s[2]);

                                 -.0001025535853
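# The whole comparison can be reproduced with Python's decimal module (a 
# sketch; sqrt is correctly rounded there, and the reference value below is 
# the 20-digit root computed above):

```python
from decimal import Decimal, getcontext

getcontext().prec = 10

b = Decimal(9751)                     # x^2 + 9751 x + 1 = 0, so a = c = 1
d = (b * b - 4).sqrt()                # sqrt(95081997) to 10 digits
r1 = (-b + d) / 2                     # catastrophic cancellation here
r2 = (-b - d) / 2                     # no cancellation
stable = 1 / r2                       # first root via the product of the roots

ref = Decimal('-0.0001025535853263')  # the 20-digit value computed above
```

The naive formula keeps only a few correct digits, while the rewritten one is accurate to nearly the full working precision.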
# In addition to "solve" which finds exact solutions (when possible), Maple has 
# "fsolve" which finds floating-point approximations to solutions of equations, 
# even those that "solve" can't handle.  The syntax is similar to that of "solve".  
# Let's see how it handles our quadratic:
> fsolve(x^2 + 9751*x + 1 = 0);

                          -9750.999897, -.0001025535853
# It won't work, of course, on equations that have symbolic parameters.
> fsolve(a*x^2 + b*x + c=0, x);
Error, (in fsolve) should use exactly all the indeterminates

# How about our fifth-degree polynomial from last time?
> fsolve(x^5 - x^3 + 2*x + 1);

                                  -.5603499931
# For a polynomial, "fsolve" finds all the real roots.  This polynomial has only 
# one.  There should be four other complex roots.  To tell "fsolve" to find both 
# real and complex roots, you give it a third input, the word "complex".
> s:=fsolve(x^5 - x^3 + 2*x + 1, x, complex);

   s :=  - .7961643382 - .6551644381 I,  - .7961643382 + .6551644381 I,

       -.5603499931, 1.076339335 - .7212070670 I, 1.076339335 + .7212070670 I
# Should we trust "fsolve" here?  One way to check a solution is to substitute the 
# answer into the original problem.  The "subs" command does this.
> subs(x = s[1], x^5 - x^3 + 2*x + 1);

                                    -9        -9
                               .6*10   + .4*10   I
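# You can do the same check in ordinary Python complex arithmetic (a sketch; 
# the root digits are copied from the fsolve output above):

```python
# First complex root reported by fsolve, to 10 digits.
root = complex(-0.7961643382, -0.6551644381)

# Substitute it back into the polynomial; the residual should be tiny.
residual = root**5 - root**3 + 2*root + 1
print(abs(residual))    # tiny; limited by the 10-digit root
```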
# Not bad.  You can't expect more accuracy than that at Digits=10.  To get more 
# accuracy, of course, you can use a higher "Digits" setting. Now let's try a 
# non-polynomial equation that stumped "solve".
> fsolve(sin(x)=1-x, x);

                                   .5109734294
# This one has only one real solution (since x + sin(x) is an increasing function).  
# But even if there were more, "fsolve" would only give us one solution for a 
# non-polynomial equation.
> fsolve((x^2-1)/(x+2), x);

                                  -1.000000000
# For the polynomial x^2-1 itself, "fsolve" would give us both solutions.
> fsolve(x^2-1, x);

                                -1.000000000, 1.
# Of course the solutions of (x^2-1)/(x+2)=0 are the two solutions of x^2-1=0, 
# but since the expression was not a polynomial "fsolve" only gave us one 
# solution. 
# Let's try something more complicated. 
> f:= x - 10*sin(x);

                               f := x - 10 sin(x)
# Before trying "fsolve", let's get a picture of what happens here by plotting a 
# graph.
# The "plot" command, in its simplest form, takes two inputs, an expression f(v) 
# depending
# on a variable v, and a range for the v axis in the form v = a .. b.  This plots the 
# curve
# y = f(v) for v from a to b.  The scale for the y axis is chosen automatically.  If 
# you prefer,
# you can set it yourself by including another input y = c .. d. 
> plot(f, x=-15 .. 15);
# The plot comes up in a separate window, which can be copied (in the Edit 
# menu of the plot window) and pasted into the worksheet (in the Edit menu of 
# the worksheet).
# 

   ** Maple V Graphics **

# There appear to be 7 real solutions of f=0, at about x=-8.5, -7, -3, 0, 3, 7 and 
# 8.5.  In general you'd have to worry about solutions outside the interval we 
# plotted, but here it's not hard to show there are no solutions outside the 
# interval [-10, 10], because -1 <= sin(x) <= 1 means x-10 <= f <= x+10.
# By the way, there's a very useful feature of these plot windows: if you click the 
# left mouse button with the mouse pointer in the window, Maple shows you the 
# X and Y coordinates of the tip of the arrow.  This lets you locate any point on 
# the plot with better precision than you would have guessing "by eye", especially 
# for points not on the axes.
> fsolve(f);

                                   2.852341894
# This is the solution near 3.  To get other solutions, you can use an option in 
# "fsolve" to find roots in a given interval.
> fsolve(f, x=6 .. 8);

                                   7.068174358
> fsolve(f, x=8 .. 9);

                                   8.423203932
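# Outside Maple, the same bracketed roots can be found with plain bisection; 
# a sketch in Python (the function and intervals are taken from this worksheet):

```python
import math

def f(x):
    return x - 10 * math.sin(x)

def bisect(f, a, b, steps=60):
    # Assumes f(a) and f(b) have opposite signs; halves the bracket each step.
    for _ in range(steps):
        mid = (a + b) / 2
        if f(a) * f(mid) <= 0:
            b = mid
        else:
            a = mid
    return (a + b) / 2

print(bisect(f, 6, 8))   # about 7.068174358, matching fsolve
print(bisect(f, 8, 9))   # about 8.423203932
```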
# The other roots can be obtained by symmetry (f is an odd function).