1.12 Lesson 12

# Lesson 12.  Newton's method in several variables
# =======================================
# 
> restart: with(linalg):
Warning: new definition for   norm
Warning: new definition for   trace

# We have used Newton's method to solve equations in one
# variable.  Suppose now we wish to solve a system of m equations in m 
# variables:
# f[1](x[1], ..., x[m])=0, ...  f[m](x[1],...,x[m]) = 0
# 
# The basic idea is the same: 
# - Start with an initial guess.
# - Consider linear approximations to the functions  near the initial 
# guess.
# - Find a point where the linear approximations are all 0.
# - Use that as the next guess.  Repeat this process until it appears to 
# converge.
# 
> f1:= (x,y,z) -> x^2 + x*y*z + y^2*z^3 - 3; 

                                       2            2  3
                     f1 := (x,y,z) -> x  + x y z + y  z  - 3
> f2:= (x,y,z) -> x*y^2 - y*z^2 - 2*x^2;

                                           2      2      2
                       f2 := (x,y,z) -> x y  - y z  - 2 x
> f3:= (x,y,z) -> x^2*y + y^2*z+ z^4-4;

                                        2      2      4
                      f3 := (x,y,z) -> x  y + y  z + z  - 4
# We'd like to consider these as making up one mapping
# from R^3 to R^3. We could use Maple's "vector" structures
# to represent the points of R^3.  
> V:= vector([a,b,c]);

                                V := [ a, b, c ]
# Note the absence of commas between the components.
# However,  I find it more convenient to use lists for this purpose.  If V 
# = [x,y,z] is a list (representing a vector), op(V) is the sequence x,y,z, 
# which can be used
# as the inputs to f1, f2 or f3.
> F:= V -> [f1(op(V)), f2(op(V)), f3(op(V))];

                   F := V -> [f1(op(V)), f2(op(V)), f3(op(V))]
> F([s,t,u]);

         2            2  3         2      2      2   2      2      4
       [s  + s t u + t  u  - 3, s t  - t u  - 2 s , s  t + t  u + u  - 4]
# We'll be using the list [x,y,z] a lot, so assign it to a name.
> xyz:= [x,y,z];

                                xyz := [x, y, z]
# Now the linear approximation to F(V) for V near V0 is
# F(V0) + J(V0) (V-V0) where J(V0) is the Jacobian matrix
# of F.  The "linalg" package has the "jacobian" command
# to produce it.  This takes two inputs, a list of expressions in the 
# variables
# and a list of the variables.
> jacobian(F(xyz), xyz);

                   [                       3           2  2 ]
                   [ 2 x + y z  x z + 2 y z   x y + 3 y  z  ]
                   [                                        ]
                   [   2                  2                 ]
                   [  y  - 4 x   2 x y - z       - 2 y z    ]
                   [                                        ]
                   [              2              2      3   ]
                   [   2 x y     x  + 2 y z     y  + 4 z    ]
# We'd like to make this a function of [x,y,z].  Unfortunately Maple's 
# matrices don't seem to work well with functions.  Most of the obvious 
# approaches don't work.  
> j:=unapply(",xyz);
Error, (in unapply) 
variables must be unique and of type name

> j:= unapply(",(x,y,z));

              j := (x,y,z) -> [tables and arrays not printed here]
> j(x,y,z);

                   [                       3           2  2 ]
                   [ 2 x + y z  x z + 2 y z   x y + 3 y  z  ]
                   [                                        ]
                   [   2                  2                 ]
                   [  y  - 4 x   2 x y - z       - 2 y z    ]
                   [                                        ]
                   [              2              2      3   ]
                   [   2 x y     x  + 2 y z     y  + 4 z    ]
> j(1,2,3);

                   [                       3           2  2 ]
                   [ 2 x + y z  x z + 2 y z   x y + 3 y  z  ]
                   [                                        ]
                   [   2                  2                 ]
                   [  y  - 4 x   2 x y - z       - 2 y z    ]
                   [                                        ]
                   [              2              2      3   ]
                   [   2 x y     x  + 2 y z     y  + 4 z    ]
# One approach that does work is to construct the function we want 
# using lists of lists instead of matrices, and then use that with 
# "matrix".  We take the expression for the Jacobian and convert it into 
# a list of lists.
> convert(jacobian(F(xyz),xyz), listlist);

                          3           2  2     2                 2
  [[2 x + y z, x z + 2 y z , x y + 3 y  z ], [y  - 4 x, 2 x y - z , - 2 y z],

               2           2      3
      [2 x y, x  + 2 y z, y  + 4 z ]]
# Now we make this list of lists into a function of x,y and z using 
# "unapply".
> jlist:= unapply(",(x,y,z));

                                                  3           2  2
      jlist := (x,y,z) -> [[2 x + y z, x z + 2 y z , x y + 3 y  z ],

            2                 2                     2           2      3
          [y  - 4 x, 2 x y - z , - 2 y z], [2 x y, x  + 2 y z, y  + 4 z ]]
# Now we can define a function that will take a list (representing a 
# vector [x,y,z]) and return the Jacobian at that point as a matrix.  
> j:= V -> matrix(jlist(op(V)));

                         j := V -> matrix(jlist(op(V)))
> j(xyz);

                   [                       3           2  2 ]
                   [ 2 x + y z  x z + 2 y z   x y + 3 y  z  ]
                   [                                        ]
                   [   2                  2                 ]
                   [  y  - 4 x   2 x y - z       - 2 y z    ]
                   [                                        ]
                   [              2              2      3   ]
                   [   2 x y     x  + 2 y z     y  + 4 z    ]
> j([1,2,3]);

                                 [ 8  111  110 ]
                                 [             ]
                                 [ 0   -5  -12 ]
                                 [             ]
                                 [ 4   13  112 ]
# Now the linear approximation formula involves a multiplication 
# (matrix)(vector).
# The usual "*" for multiplication of numbers (or expressions that 
# represent numbers) can't be used for matrices and vectors, mainly 
# because this is a noncommutative multiplication: A B and B A are 
# different in general if A and B are matrices.  But Maple would assume 
# they are the same.
> A*B - B*A;

                                        0
# Instead, the multiplication symbol for use with matrices and vectors is 
# "&*"
> A &* B - B &* A;

                               (A &* B) - (B &* A)
> matrix([[a,b],[c,d]]) &* [e,f];

                               [ a  b ]
                               [      ] &* [e, f]
                               [ c  d ]
# To actually perform linear algebra operations such as matrix 
# multiplication (also addition, subtraction, and powers), you use the 
# "evalm" command.
> evalm(");

                            [ a e + b f, c e + d f ]
# This produces a vector or matrix.  To make a vector into a list, we can 
# use "convert".
> convert(",list);

                             [a e + b f, c e + d f]
# When I'm doing linear algebra, I find myself using the combination of 
# evalm and convert(",list) so much that I define a single command to 
# combine them:
> evl:= V -> convert(evalm(V),list);

                       evl := V -> convert(evalm(V), list)
# Here is the linear approximation to F(V) near V=[x0,y0,z0].
> evl(F([x0,y0,z0]) + j([x0,y0,z0]) &* (xyz-[x0,y0,z0]));

     2                    2   3                                           3
[- x0  - 2 x0 y0 z0 - 4 y0  z0  - 3 + 2 x0 x + y0 z0 x + x0 z0 y + 2 y0 z0  y

                      2   2
      + x0 y0 z + 3 y0  z0  z,

         2          2       2     2                            2
- 2 x0 y0  + 2 y0 z0  + 2 x0  + y0  x - 4 x0 x + 2 x0 y0 y - z0  y - 2 y0 z0 z,

      2          2          4                     2                   2
- 2 x0  y0 - 2 y0  z0 - 3 z0  - 4 + 2 x0 y0 x + x0  y + 2 y0 z0 y + y0  z

           3
     + 4 z0  z                                                           ]
> collect(",{x,y,z});

                                    3                   2   2        2
[(2 x0 + y0 z0) x + (x0 z0 + 2 y0 z0 ) y + (x0 y0 + 3 y0  z0 ) z - x0

                         2   3
      - 2 x0 y0 z0 - 4 y0  z0  - 3,

     2                          2             2          2       2
  (y0  - 4 x0) x + (2 x0 y0 - z0 ) y - 2 x0 y0  + 2 y0 z0  + 2 x0  - 2 y0 z0 z,

                 2                   2       3          2          2          4
  2 x0 y0 x + (x0  + 2 y0 z0) y + (y0  + 4 z0 ) z - 2 x0  y0 - 2 y0  z0 - 3 z0

       - 4                                                                     ]
# We'll make this a function of x0, y0 and z0.
> linapp:= unapply(",(x0,y0,z0));

linapp := (x0,y0,z0) -> [

                                     3                   2   2        2
  (2 x0 + y0 z0) x + (x0 z0 + 2 y0 z0 ) y + (x0 y0 + 3 y0  z0 ) z - x0
                          2   3
       - 2 x0 y0 z0 - 4 y0  z0  - 3,

     2                          2             2          2       2
  (y0  - 4 x0) x + (2 x0 y0 - z0 ) y - 2 x0 y0  + 2 y0 z0  + 2 x0  - 2 y0 z0 z,

                 2                   2       3          2          2          4
  2 x0 y0 x + (x0  + 2 y0 z0) y + (y0  + 4 z0 ) z - 2 x0  y0 - 2 y0  z0 - 3 z0
       - 4                                                                     ]
> linapp(0,1,1);

               [x + 2 y + 3 z - 7, x - y + 2 - 2 z, 2 y + 5 z - 9]
# Now we solve the equations from each component of this list to get 
# the next Newton approximation.  For "solve" we need a set of 
# expressions, not a list.
> solve({op(")},{x,y,z});

                            {z = 9/5, x = 8/5, y = 0}
# We'd like a command that will take the list [x0,y0,z0] as input and 
# returns its result as a list [x,y,z].  We first use "op" on [x0,y0,z0] to 
# get x0, y0, z0 which can be used as the input to "linapp", and we 
# substitute the result of "solve" into [x,y,z].  One more thing: we want 
# the result in floating point, not as an exact fraction (which after a few
# iterations could have very large numerator and denominator), so we 
# use "evalf".
> newt:= V0 -> evalf(subs(solve({op(linapp(op(V0)))},{x,y,z}),[x,y,z]));

  newt := V0 -> evalf(subs(solve({op(linapp(op(V0)))}, {x, y, z}), [x, y, z]))
> newt([0,1,1]);

                          [1.600000000, 0, 1.800000000]
# An alternative would be to use the "linsolve" command.
# linsolve(A,b) solves the vector equation A x = b, where A is a matrix 
# and b is a vector.  In our case the equation to solve is F(V0) + j(V0) X 
# = 0, i.e. j(V0) X = - F(V0), and we add this to V0 to get our next 
# guess.  We can take the "-" outside the "linsolve".
> [0,1,1] - linsolve(j([0,1,1]),  F([0,1,1]));

                          [0, 1, 1] - [ -8/5, 1, -4/5 ]
# We want the vector addition evaluated, and the result returned as a 
# list.
> newt2:= V0 -> evalf(evl(V0 - linsolve(j(V0),  F(V0))));\


             newt2 := V0 -> evalf(evl(V0 - linsolve(j(V0), F(V0))))
> newt2([0,1,1]);

                          [1.600000000, 0, 1.800000000]
# According to my experiments, "newt" is considerably faster than 
# "newt2".  One iteration of "newt2" more than 4 times as long as one 
# iteration of "newt".  I'm not sure exactly why.
> V:= [0,1,1]: timer:= time():\
for jj from 1 to 20 do V:= newt(V) od: \
time()-timer;

                                      .417
> V:= [0,1,1]: timer:= time():\
for jj from 1 to 20 do V:= newt2(V) od:\
time()-timer;

                                      1.766
> V:= [0,1,1];

                                 V := [0, 1, 1]
> for jj from 1 to 10 do V:= newt(V) od;

                       V := [1.600000000, 0, 1.800000000]

                  V := [-.4053571430, 2.380952381, 1.260183553]

                  V := [.1238168236, 1.962724782, 1.080486675]

                  V := [.3893816263, 2.074499230, .8377064543]

                   V := [.426964196, 1.979477834, .818450908]

                  V := [.4302423924, 1.978996584, .8151083308]

                   V := [.430244248, 1.978999685, .8150943555]

                   V := [.430244241, 1.978999694, .815094354]

                   V := [.430244244, 1.978999690, .8150943547]

                   V := [.430244243, 1.978999694, .8150943547]
# It's settled down nicely to what appears to be a solution.  The last 
# digit or two changes at each iteration because of roundoff error.


> F(V);

                               -8         -8        -7
                         [.7*10  , -.12*10  , .10*10  ]
# Try a different starting point.
> V:= [1,0,0];

                                 V := [1, 0, 0]
> V:= newt(V);

                                 V := [x, y, z]
# The first step didn't seem to work.  What was the linear 
# approximation?
> linapp(1,0,0);

                           [2 x - 4, - 4 x + 2, y - 4]
# Well, this clearly can't be solved for x, y and z.  We were unlucky to 
# choose an initial V where the Jacobian is singular.  The "solve" 
# command then produced nothing.  Curiously, if "subs" is given only 
# one input it returns that input unchanged.
> subs([x,y,z]);

                                    [x, y, z]
> subs(solve({op(linapp(1,0,0))},{x,y,z}),[x,y,z]);

                                    [x, y, z]
# It would be better if "newt" detected this fact and produced an error 
# message instead of returning [x,y,z].  We can do this by replacing 
# "subs".  The replacement doesn't do anything differently, it just 
# produces an error if there is only one input instead of two.
> newsubs:= proc(x,y)   subs(x,y) end;

newsubs := proc(x,y) subs(x,y) end

> newt:= subs(subs=newsubs, eval(newt));

 newt := V0 -> evalf(newsubs(solve({op(linapp(op(V0)))}, {x, y, z}), [x, y, z]))
> newt([1,0,0]);
Error, (in newsubs) newsubs
 uses a 2nd argument, y, which is missing

# Another starting point.
> V:= [0.2,1.3,1]; 

                                V := [.2, 1.3, 1]
> for jj from 1 to 10 do V:= newt(V) od;

                  V := [-1.071128893, 5.499445912, -.611168750]

                 V := [-1.049215594, 2.534205466, -.6192708733]

                 V := [-1.204204108, .5177684731, -.5997565973]

                 V := [-1.039618373, -7.863509003, -18.14022514]

                 V := [-14.91802573, -6.837306518, -13.63820635]

                 V := [-10.43664265, -5.919103138, -10.29934921]

                 V := [-7.234861711, -5.100300579, -7.801733768]

                 V := [-5.029357694, -4.362483323, -5.938344249]

                 V := [-3.537389653, -3.688691206, -4.553543125]

                 V := [-2.538975612, -3.064095347, -3.529896885]
# There's no sign yet of this one settling down.  Eventually it does, but 
# to a different solution than the last one.
> for jj from 1 to 20 do V:= newt(V) od:
> V, newt(V);

[1.174658303, 1.674928536, .5655524760], [1.174658306, 1.674928535, .5655524741]
# How can we find out how many solutions there are?
# 
# Our equations in this case involve polynomials, so "solve" might be 
# useful.  Unfortunately, it isn't.
> # s:= solve({f1(x,y,z), f2(x,y,z), f3(x,y,z)}, {x,y,z});
# I put "#" before this one in the worksheet because I don't want
# you to try it: it seems to take forever.  Presumably the solution is
# extremely complicated.  But we can be a bit less ambitious and
# get some results.
# The intersections of f1(x,y,z)=0 and f2(x,y,z)=0 will correspond to
# the following curve in the yz plane.
> r1:= resultant(f1(x,y,z), f2(x,y,z), x);

          2  4      3  5         2      4  6       2  3         4  3      5  4
   r1 := y  z  - 4 y  z  + 12 y z  + 4 y  z  - 24 y  z  + 36 + y  z  + 2 y  z

             3      6  3      4      3  4
        - 6 y  z + y  z  - 3 y  + 2 y  z
# And similarly for f2 and f3.
> r2:=resultant(f1(x,y,z), f3(x,y,z),x);

        5  3    3  6      3  2      3  3           2  5        4    4  2    6  6
 r2 := y  z  + y  z  - 4 y  z  + 8 y  z  + 16 + 2 y  z  + 6 y z  + y  z  + y  z

           7  3    8      4      3        5  4      4  3      2             2
      - 2 z  y  + z  - 8 z  + 6 y  z - 2 y  z  - 6 y  z  + 9 y  - 24 y - 8 y  z
# We'll see the results next time.
>