# Lesson 12. Newton's method in several variables
# =======================================
#
> restart: with(linalg):
Warning: new definition for norm
Warning: new definition for trace
# We have used Newton's method to solve equations in one variable.
# Suppose now we wish to solve a system of m equations in m variables:
#      f[1](x[1], ..., x[m]) = 0, ..., f[m](x[1], ..., x[m]) = 0
#
# The basic idea is the same:
# - Start with an initial guess.
# - Consider linear approximations to the functions near the initial
#   guess.
# - Find a point where the linear approximations are all 0.
# - Use that as the next guess. Repeat this process until it appears to
#   converge.
> f1:= (x,y,z) -> x^2 + x*y*z + y^2*z^3 - 3;
                f1 := (x,y,z) -> x^2 + x*y*z + y^2*z^3 - 3
> f2:= (x,y,z) -> x*y^2 - y*z^2 - 2*x^2;
                f2 := (x,y,z) -> x*y^2 - y*z^2 - 2*x^2
> f3:= (x,y,z) -> x^2*y + y^2*z + z^4 - 4;
                f3 := (x,y,z) -> x^2*y + y^2*z + z^4 - 4
# We'd like to consider these as making up one mapping from R^3 to
# R^3. We could use Maple's "vector" structures to represent the points
# of R^3.
> V:= vector([a,b,c]);
                V := [ a  b  c ]
# Note the absence of commas between the components.
# However, I find it more convenient to use lists for this purpose. If V
# = [x,y,z] is a list (representing a vector), op(V) is the sequence
# x, y, z, which can be used as the inputs to f1, f2 or f3.
> F:= V -> [f1(op(V)), f2(op(V)), f3(op(V))];
                F := V -> [f1(op(V)), f2(op(V)), f3(op(V))]
> F([s,t,u]);
  [s^2 + s*t*u + t^2*u^3 - 3, s*t^2 - t*u^2 - 2*s^2, s^2*t + t^2*u + u^4 - 4]
# We'll be using the list [x,y,z] a lot, so assign it to a name.
> xyz:= [x,y,z];
                xyz := [x, y, z]
# Now the linear approximation to F(V) for V near V0 is
#      F(V0) + J(V0) (V - V0)
# where J(V0) is the Jacobian matrix of F. The "linalg" package has the
# "jacobian" command to produce it. This takes two inputs: a list of
# expressions in the variables, and a list of the variables.
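As a cross-language aside (not part of the Maple worksheet): the map F above is easy to transcribe into Python, which gives a quick sanity check on the formulas. The names f1, f2, f3 and F mirror the Maple definitions.

```python
# The system from this lesson, transcribed from the Maple definitions above.
def f1(x, y, z):
    return x**2 + x*y*z + y**2*z**3 - 3

def f2(x, y, z):
    return x*y**2 - y*z**2 - 2*x**2

def f3(x, y, z):
    return x**2*y + y**2*z + z**4 - 4

# F takes a list (a point of R^3) and returns the list of residuals,
# just like F := V -> [f1(op(V)), f2(op(V)), f3(op(V))] in Maple.
def F(V):
    return [f1(*V), f2(*V), f3(*V)]

print(F([0, 1, 1]))   # -> [-2, -1, -2]
```

Here [0,1,1] is the starting point used for the Newton iteration later in the lesson.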
> jacobian(F(xyz), xyz);
     [ 2*x + y*z     x*z + 2*y*z^3    x*y + 3*y^2*z^2 ]
     [ y^2 - 4*x     2*x*y - z^2      -2*y*z          ]
     [ 2*x*y         x^2 + 2*y*z      y^2 + 4*z^3     ]
# We'd like to make this a function of [x,y,z]. Unfortunately Maple's
# matrices don't seem to work well with functions. Most of the obvious
# approaches don't work.
> j:=unapply(",xyz);
Error, (in unapply) variables must be unique and of type name
> j:= unapply(",(x,y,z));
                j := (x,y,z) -> [tables and arrays not printed here]
> j(x,y,z);
     [ 2*x + y*z     x*z + 2*y*z^3    x*y + 3*y^2*z^2 ]
     [ y^2 - 4*x     2*x*y - z^2      -2*y*z          ]
     [ 2*x*y         x^2 + 2*y*z      y^2 + 4*z^3     ]
> j(1,2,3);
     [ 2*x + y*z     x*z + 2*y*z^3    x*y + 3*y^2*z^2 ]
     [ y^2 - 4*x     2*x*y - z^2      -2*y*z          ]
     [ 2*x*y         x^2 + 2*y*z      y^2 + 4*z^3     ]
# Notice that j(1,2,3) returned the matrix with x, y and z unevaluated:
# the values were never substituted into the entries, so this attempt
# fails too.
# One approach that does work is to construct the function we want
# using lists of lists instead of matrices, and then use that with
# "matrix". We take the expression for the Jacobian and convert it into
# a list of lists.
> convert(jacobian(F(xyz),xyz), listlist);
  [[2*x + y*z, x*z + 2*y*z^3, x*y + 3*y^2*z^2],
   [y^2 - 4*x, 2*x*y - z^2, -2*y*z], [2*x*y, x^2 + 2*y*z, y^2 + 4*z^3]]
# Now we make this list of lists into a function of x, y and z using
# "unapply".
> jlist:= unapply(",(x,y,z));
  jlist := (x,y,z) -> [[2*x + y*z, x*z + 2*y*z^3, x*y + 3*y^2*z^2],
   [y^2 - 4*x, 2*x*y - z^2, -2*y*z], [2*x*y, x^2 + 2*y*z, y^2 + 4*z^3]]
# Now we can define a function that will take a list (representing a
# vector [x,y,z]) and return the Jacobian at that point as a matrix.
> j:= V -> matrix(jlist(op(V)));
                j := V -> matrix(jlist(op(V)))
> j(xyz);
     [ 2*x + y*z     x*z + 2*y*z^3    x*y + 3*y^2*z^2 ]
     [ y^2 - 4*x     2*x*y - z^2      -2*y*z          ]
     [ 2*x*y         x^2 + 2*y*z      y^2 + 4*z^3     ]
> j([1,2,3]);
     [ 8     111    110 ]
     [ 0     -5     -12 ]
     [ 4     13     112 ]
# Now the linear approximation formula involves a multiplication
# (matrix)(vector).
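As a quick cross-check outside Maple, the entries printed by jacobian(F(xyz), xyz) can be transcribed into Python and evaluated at (1,2,3); the result should agree with j([1,2,3]) above. (An illustrative sketch only, not part of the worksheet.)

```python
# Jacobian of F = (f1, f2, f3) with respect to (x, y, z),
# transcribed from the jacobian(F(xyz), xyz) output above.
def J(x, y, z):
    return [[2*x + y*z,  x*z + 2*y*z**3,  x*y + 3*y**2*z**2],
            [y**2 - 4*x, 2*x*y - z**2,    -2*y*z],
            [2*x*y,      x**2 + 2*y*z,    y**2 + 4*z**3]]

print(J(1, 2, 3))   # -> [[8, 111, 110], [0, -5, -12], [4, 13, 112]]
```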
# The usual "*" for multiplication of numbers (or expressions that
# represent numbers) can't be used for matrices and vectors, mainly
# because this is a noncommutative multiplication: A B and B A are
# different in general if A and B are matrices. But Maple would assume
# they are the same.
> A*B - B*A;
                0
# Instead, the multiplication symbol for use with matrices and vectors
# is "&*".
> A &* B - B &* A;
                (A &* B) - (B &* A)
> matrix([[a,b],[c,d]]) &* [e,f];
     [ a    b ]
     [        ] &* [e, f]
     [ c    d ]
# To actually perform linear algebra operations such as matrix
# multiplication (also addition, subtraction, and powers), you use the
# "evalm" command.
> evalm(");
                [ a*e + b*f, c*e + d*f ]
# This produces a vector or matrix. To make a vector into a list, we can
# use "convert".
> convert(",list);
                [a*e + b*f, c*e + d*f]
# When I'm doing linear algebra, I find myself using the combination of
# evalm and convert(",list) so much that I define a single command to
# combine them:
> evl:= V -> convert(evalm(V),list);
                evl := V -> convert(evalm(V), list)
# Here is the linear approximation to F(V) near V = [x0,y0,z0].
> evl(F([x0,y0,z0]) + j([x0,y0,z0]) &* (xyz-[x0,y0,z0]));
  [-x0^2 - 2*x0*y0*z0 - 4*y0^2*z0^3 - 3 + 2*x0*x + y0*z0*x + x0*z0*y
    + 2*y0*z0^3*y + x0*y0*z + 3*y0^2*z0^2*z,
   -2*x0*y0^2 + 2*y0*z0^2 + 2*x0^2 + y0^2*x - 4*x0*x + 2*x0*y0*y - z0^2*y
    - 2*y0*z0*z,
   -2*x0^2*y0 - 2*y0^2*z0 - 3*z0^4 - 4 + 2*x0*y0*x + x0^2*y + 2*y0*z0*y
    + y0^2*z + 4*z0^3*z]
> collect(",{x,y,z});
  [(2*x0 + y0*z0)*x + (x0*z0 + 2*y0*z0^3)*y + (x0*y0 + 3*y0^2*z0^2)*z
    - x0^2 - 2*x0*y0*z0 - 4*y0^2*z0^3 - 3,
   (y0^2 - 4*x0)*x + (2*x0*y0 - z0^2)*y - 2*x0*y0^2 + 2*y0*z0^2 + 2*x0^2
    - 2*y0*z0*z,
   2*x0*y0*x + (x0^2 + 2*y0*z0)*y + (y0^2 + 4*z0^3)*z - 2*x0^2*y0
    - 2*y0^2*z0 - 3*z0^4 - 4]
# We'll make this a function of x0, y0 and z0.
> linapp:= unapply(",(x0,y0,z0));
  linapp := (x0,y0,z0) ->
   [(2*x0 + y0*z0)*x + (x0*z0 + 2*y0*z0^3)*y + (x0*y0 + 3*y0^2*z0^2)*z
     - x0^2 - 2*x0*y0*z0 - 4*y0^2*z0^3 - 3,
    (y0^2 - 4*x0)*x + (2*x0*y0 - z0^2)*y - 2*x0*y0^2 + 2*y0*z0^2 + 2*x0^2
     - 2*y0*z0*z,
    2*x0*y0*x + (x0^2 + 2*y0*z0)*y + (y0^2 + 4*z0^3)*z - 2*x0^2*y0
     - 2*y0^2*z0 - 3*z0^4 - 4]
> linapp(0,1,1);
                [x + 2*y + 3*z - 7, x - y + 2 - 2*z, 2*y + 5*z - 9]
# Now we solve the equations from each component of this list to get
# the next Newton approximation. For "solve" we need a set of
# expressions, not a list.
> solve({op(")},{x,y,z});
                {z = 9/5, x = 8/5, y = 0}
# We'd like a command that will take the list [x0,y0,z0] as input and
# return its result as a list [x,y,z]. We first use "op" on [x0,y0,z0] to
# get x0, y0, z0, which can be used as the input to "linapp", and we
# substitute the result of "solve" into [x,y,z]. One more thing: we want
# the result in floating point, not as an exact fraction (which after a few
# iterations could have a very large numerator and denominator), so we
# use "evalf".
> newt:= V0 -> evalf(subs(solve({op(linapp(op(V0)))},{x,y,z}),[x,y,z]));
  newt := V0 -> evalf(subs(solve({op(linapp(op(V0)))}, {x, y, z}), [x, y, z]))
> newt([0,1,1]);
                [1.600000000, 0, 1.800000000]
# An alternative would be to use the "linsolve" command.
# linsolve(A,b) solves the vector equation A x = b, where A is a matrix
# and b is a vector. In our case the equation to solve is
# F(V0) + j(V0) X = 0, i.e. j(V0) X = -F(V0); we solve for X and add it
# to V0 to get our next guess. We can take the "-" outside the
# "linsolve".
> [0,1,1] - linsolve(j([0,1,1]), F([0,1,1]));
                [0, 1, 1] - [ -8/5, 1, -4/5 ]
# We want the vector addition evaluated, and the result returned as a
# list.
> newt2:= V0 -> evalf(evl(V0 - linsolve(j(V0), F(V0))));
                newt2 := V0 -> evalf(evl(V0 - linsolve(j(V0), F(V0))))
> newt2([0,1,1]);
                [1.600000000, 0, 1.800000000]
# According to my experiments, "newt" is considerably faster than
# "newt2".
# One iteration of "newt2" takes more than 4 times as long as one
# iteration of "newt". I'm not sure exactly why.
> V:= [0,1,1]: timer:= time():
> for jj from 1 to 20 do V:= newt(V) od:
> time()-timer;
                .417
> V:= [0,1,1]: timer:= time():
> for jj from 1 to 20 do V:= newt2(V) od:
> time()-timer;
                1.766
> V:= [0,1,1];
                V := [0, 1, 1]
> for jj from 1 to 10 do V:= newt(V) od;
             V := [1.600000000, 0, 1.800000000]
             V := [-.4053571430, 2.380952381, 1.260183553]
             V := [.1238168236, 1.962724782, 1.080486675]
             V := [.3893816263, 2.074499230, .8377064543]
             V := [.426964196, 1.979477834, .818450908]
             V := [.4302423924, 1.978996584, .8151083308]
             V := [.430244248, 1.978999685, .8150943555]
             V := [.430244241, 1.978999694, .815094354]
             V := [.430244244, 1.978999690, .8150943547]
             V := [.430244243, 1.978999694, .8150943547]
# It's settled down nicely to what appears to be a solution. The last
# digit or two changes at each iteration because of roundoff error.
> F(V);
                [.7*10^(-8), -.12*10^(-8), .10*10^(-7)]
# Try a different starting point.
> V:= [1,0,0];
                V := [1, 0, 0]
> V:= newt(V);
                V := [x, y, z]
# The first step didn't seem to work. What was the linear
# approximation?
> linapp(1,0,0);
                [2*x - 4, -4*x + 2, y - 4]
# Well, this clearly can't be solved for x, y and z. We were unlucky to
# choose an initial V where the Jacobian is singular. The "solve"
# command then produced nothing. Curiously, if "subs" is given only
# one input it returns that input unchanged.
> subs([x,y,z]);
                [x, y, z]
> subs(solve({op(linapp(1,0,0))},{x,y,z}),[x,y,z]);
                [x, y, z]
# It would be better if "newt" detected this fact and produced an error
# message instead of returning [x,y,z]. We can do this by replacing
# "subs". The replacement doesn't do anything differently; it just
# produces an error if there is only one input instead of two.
> newsubs:= proc(x,y) subs(x,y) end;
                newsubs := proc(x,y) subs(x,y) end
> newt:= subs(subs=newsubs, eval(newt));
  newt := V0 -> evalf(newsubs(solve({op(linapp(op(V0)))}, {x, y, z}), [x, y, z]))
> newt([1,0,0]);
Error, (in newsubs) newsubs uses a 2nd argument, y, which is missing
# Another starting point.
> V:= [0.2,1.3,1];
                V := [.2, 1.3, 1]
> for jj from 1 to 10 do V:= newt(V) od;
             V := [-1.071128893, 5.499445912, -.611168750]
             V := [-1.049215594, 2.534205466, -.6192708733]
             V := [-1.204204108, .5177684731, -.5997565973]
             V := [-1.039618373, -7.863509003, -18.14022514]
             V := [-14.91802573, -6.837306518, -13.63820635]
             V := [-10.43664265, -5.919103138, -10.29934921]
             V := [-7.234861711, -5.100300579, -7.801733768]
             V := [-5.029357694, -4.362483323, -5.938344249]
             V := [-3.537389653, -3.688691206, -4.553543125]
             V := [-2.538975612, -3.064095347, -3.529896885]
# There's no sign yet of this one settling down. Eventually it does, but
# to a different solution than the last one.
> for jj from 1 to 20 do V:= newt(V) od:
> V, newt(V);
      [1.174658303, 1.674928536, .5655524760],
      [1.174658306, 1.674928535, .5655524741]
# How can we find out how many solutions there are?
#
# Our equations in this case involve polynomials, so "solve" might be
# useful. Unfortunately, it isn't.
> # s:= solve({f1(x,y,z), f2(x,y,z), f3(x,y,z)}, {x,y,z});
# I put "#" before this one in the worksheet because I don't want
# you to try it: it seems to take forever. Presumably the solution is
# extremely complicated. But we can be a bit less ambitious and
# get some results.
# The intersections of f1(x,y,z)=0 and f2(x,y,z)=0 will correspond to
# the following curve in the yz plane.
> r1:= resultant(f1(x,y,z), f2(x,y,z), x);
  r1 := y^2*z^4 - 4*y^3*z^5 + 12*y*z^2 + 4*y^4*z^6 - 24*y^2*z^3 + 36
   + y^4*z^3 + 2*y^5*z^4 - 6*y^3*z + y^6*z^3 - 3*y^4 + 2*y^3*z^4
# And similarly for f1 and f3.
> r2:= resultant(f1(x,y,z), f3(x,y,z), x);
  r2 := y^5*z^3 + y^3*z^6 - 4*y^3*z^2 + 8*y^3*z^3 + 16 + 2*y^2*z^5
   + 6*y*z^4 + y^4*z^2 + y^6*z^6 - 2*z^7*y^3 + z^8 - 8*z^4 + 6*y^3*z
   - 2*y^5*z^4 - 6*y^4*z^3 + 9*y^2 - 24*y - 8*y^2*z
# We'll see the results next time.
>
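To close, here is a self-contained Python sketch of the whole lesson's computation: the system, its Jacobian, a small Gaussian-elimination solver playing the role of Maple's linsolve, and the Newton step. The functions r1 and r2 transcribe the two resultants computed above; they must vanish at the (y,z) coordinates of any common zero of the system. This is an illustration under the worksheet's definitions, not part of the Maple session.

```python
# Newton's method for the system f1 = f2 = f3 = 0 from this lesson.

def F(v):
    x, y, z = v
    return [x**2 + x*y*z + y**2*z**3 - 3,
            x*y**2 - y*z**2 - 2*x**2,
            x**2*y + y**2*z + z**4 - 4]

def J(v):
    # Jacobian matrix, transcribed from the jacobian output earlier.
    x, y, z = v
    return [[2*x + y*z, x*z + 2*y*z**3, x*y + 3*y**2*z**2],
            [y**2 - 4*x, 2*x*y - z**2, -2*y*z],
            [2*x*y, x**2 + 2*y*z, y**2 + 4*z**3]]

def solve3(A, b):
    # Gaussian elimination with partial pivoting on a 3x3 system A x = b.
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    n = 3
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        if abs(M[p][k]) < 1e-14:
            # the analogue of the worksheet's modified "newt": complain
            # instead of silently returning nonsense
            raise ValueError("singular Jacobian")
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            m = M[i][k] / M[k][k]
            for c in range(k, n + 1):
                M[i][c] -= m * M[k][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))) / M[i][i]
    return x

def newt(v):
    # One Newton step: solve J(v) d = -F(v) and return v + d.
    d = solve3(J(v), [-t for t in F(v)])
    return [vi + di for vi, di in zip(v, d)]

def r1(y, z):
    # resultant(f1, f2, x) from the worksheet
    return (y**2*z**4 - 4*y**3*z**5 + 12*y*z**2 + 4*y**4*z**6
            - 24*y**2*z**3 + 36 + y**4*z**3 + 2*y**5*z**4 - 6*y**3*z
            + y**6*z**3 - 3*y**4 + 2*y**3*z**4)

def r2(y, z):
    # resultant(f1, f3, x) from the worksheet
    return (y**5*z**3 + y**3*z**6 - 4*y**3*z**2 + 8*y**3*z**3 + 16
            + 2*y**2*z**5 + 6*y*z**4 + y**4*z**2 + y**6*z**6 - 2*z**7*y**3
            + z**8 - 8*z**4 + 6*y**3*z - 2*y**5*z**4 - 6*y**4*z**3
            + 9*y**2 - 24*y - 8*y**2*z)

v = [0.0, 1.0, 1.0]
for _ in range(25):
    v = newt(v)
print(v)   # close to the worksheet's [.430244243, 1.978999694, .8150943547]
```

Starting from [0,1,1] the iteration settles to the same solution the worksheet found, r1 and r2 vanish (to roundoff) at its (y,z) coordinates, and newt raises an error at [1,0,0], where the Jacobian is singular.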