Lab 11b: Rootfinding and function pointers.
(Tues Nov 22, Thurs Nov 24, Fri Nov 25, 2005)

The material in this part of the lab (11b) will not be on the final exam. So if you have not yet completed all the of previous labs, you should probably go back and complete them now. However, if you have finished the previous labs, this lab gives you a glimpse into the field of "numerical methods" (ie, solving mathematical equations with computer programs).

Engineers frequently write computer programs to solve mathematical equations derived from engineering applications. In this practical session, you will explore two algorithms for computing the roots of nonlinear algebraic equations. Recall that the roots of a function f(x) are those values of x for which f(x)=0.

1. The bisection method

You know that if f(x) is a continuous function on some interval [a,b], and if K is some number between f(a) and f(b), then there exists a number c in the interval [a,b] such that f(c) = K. This result is called the Intermediate Value Theorem.

How can this result be used to derive an algorithm that finds roots? Well, if f(x) is a continuous function on [a,b], with f(a) and f(b) having opposite signs, then either f(a) <= 0 <= f(b) or f(a) >= 0 >= f(b). Hence, by the Intermediate Value Theorem, there must be a value r, with a <= r <= b, that is such that f(r) = 0. That is, there must be a root in [a,b].

The theorem doesn't tell us how many roots there are in [a,b], only that there is at least one. Also, if there is a root in [a,b], it need not be the case that f(a) and f(b) have opposite signs. All we know is that if f(a)*f(b) <= 0 (which will be true when f(a) and f(b) have opposite signs), then there is a root in [a,b].

The bisection method follows from repeated application of the Intermediate Value Theorem. Suppose you have values a and b such that f(a)*f(b) <= 0. Compute the midpoint of the interval, m = a + (b-a)/2. Then it must be the case that at least one of f(a)*f(m) > 0 or f(a)*f(m) <= 0. If f(a)*f(m) > 0, then we know that there is a root in the interval [m,b]. (Because we know that there is a root in [a,b] and also that it is not necessarily the case that there is a root in [a,m].) We can then replace our ``root-bounding-interval'' [a,b] by [m,b]. If, on the other hand, f(a)*f(m) <= 0, we know that there is a root in the interval [a,m] . We can then replace our ``root-bounding-interval'' [a,b] by [a,m]. Our new interval will be half the length of our old interval.

If we repeat these calculations on the new interval [a,m] or [m,b], we can find a new ``root-bounding'' interval that is one-quarter the length of the original interval. If we keep repeating these sorts of calculations, we will end up with a very small interval that is known to contain a root of f(x).

The approach described above is called the Bisection Method.

Write a C program that implements the Bisection Method, and use it to find a root of the equation f(x) = x - 0.2*sin(x) - 0.5. Start with [a,b] = [ 0, 1]. Stop the iteration when the interval width is less than 1.0 x 10^-6. You should find a root at approximately x = 0.6154685.

Use double variables to represent ``real'' values. Print out each new ``root-bounding-interval'' so that you can observe the method converging.

2. Newton's method

In part (1), you should have observed that the Bisection Method found a root of f(x), but that it took many iterations to meet the given error tolerance of 1.0 x 10^-6.

The Bisection Method only used 1 bit of information about each value of f(a) and f(b). It only used their sign. (It turns out that one bit is used to store the sign of a floating point number.) If we make better use of the information that we know about f(a) or f(b), we can derive a method that usually computes roots more quickly.

Suppose that you have a guess x₁ at a root of f(x). Now consider a straight line tangent to the function at point x₁. This line goes through the point (x₁, f(x₁)) and has the same slope as f(x) at x₁. It's equation is

 
  y - f(x₁) = (df/dx)(x₁)*(x - x₁),

or equivalently

  y = f(x₁) - (df/dx)(x₁)*(x - x₁).

Now since this straight line will approximate f(x) near x=x₁ (all functions are essentially linear if you zoom in on them enough!), it is reasonable to suspect that the root of the straight line may give a good approximation to the root of f(x). Setting y to 0 and solving for x gives

  x = x₁ - (f(x₁)/(df/dx)(x₁)) .

Lets call this approximation to the root of f(x), x₂.

We can repeat this ``straight-line'' construction to get a value:

  x₃ = x₂ - ( f(x₂)/(df/dx)(x₂) ).

Repeating again gives the iteration formula:

  x_i+1 = x_i - ( f(x_i)/(df/dx)(x_i) ), i = 1,2,3,....

This iteration is called Newton's method for computing a root of f(x).

We call the values x₁, x₂, x₃, etc Newton iterates. They are approximations to the roots of f(x). The quantity ( f(x_i)/(df/dx)(x_i) ) is called the Newton update. It is used to ``update'' the previous approximation to the root. We usually stop the iteration once the update is small in size, that is abs( f(x_i)/(df/dx)(x_i) ) is small.

Write a C program that implements Newton's method and use it to find a root of the equation f(x) = x - 0.2*sin(x) - 0.5. Start with x₁=0. Stop the iteration when the update is less than 1.0 x 10^-6 in size. Print out the values of x_i and the updates.

Hint: For this exercise, go ahead and "hard-code" the function f(x) into your program. Then differentiate the equation, using what you have learned in calculus, and hard-code that equation into your program as well.

You should find a root at approximately x = 0.6154682 after 4 applications of the formula. Note that the size of the update is roughly squared each time. When the update is small (ie, much less than 1), squaring it makes it shrink very quickly!

Revise your programs and try to compute a root of the function f(x) = x - tan(x). Start the bisection method with [a,b] = [3.5, 4.5], and start Newton's method with x₁ = 4.0. Note that (df/dx)(x) = -tan²(x). Stop when you have an approximation to the root that is within 5.0 x 10^-6 of the true root.

You should observe that the bisection method converges to approximately x = 4.49341. You should also observe that Newton's method does not converge to the root. Try Newton's method again for x₁ = 4.2, 4.4, 4.6, 4.8. You should observe that it converges sometimes, but not every time.

This function f(x) = x - tan(x) has a root at x = 0. Try Newton's method with x₁ = 0.1. You should observe that the method converges very slowly to x = 0.

Now you know why Newton's method is not always the method of choice for computing roots of equations. Sometimes it diverges from the root. Sometimes it converges slowly to the root. But sometimes it converges very quickly to the root. Many good root-finding algorithms are hybrid combinations of the Bisection Method and Newton's Method. The slow but steady Bisection Method is applied a few times until the interval length is fairly small. And then Newton's method is used. If Newton's method appears to be diverging, a switch is made back to the Bisection Method, until the interval known to contain a root is even smaller. If Newton's method appears to be converging, it will find an approximation to the root very quickly.

4. Function pointers

If you were to regularly use these root finding algorithms, you would want to implement each of them in a C function. Suppose you implement Newton's method in a C function named Newton. How would you tell the Newton function that you wanted to compute the root of f(x) = x - 0.2*sin(x) - 0.5 or f(x) = x - tan(x) ? You wouldn't want to ``hard-code'' the mathematical function into your Newton function, because then you would have to change your Newton function for each new math function whose roots were to be determined.

A good approach would be to write C functions like:

        #include <math.h>
        double f1( double x )
        {
            return x  - 0.2*sin(x) - 0.5;
        }
        double f1prime( double x )
        {
            return 1.0 - 0.2*cos(x);
        }

and

        #include <math.h>
        double f2( double x )
        {
            return x - tan(x);
        }
        double f2prime( double x )
        {
            double tanx;
            tanx = tan(x);
            return tanx*tanx;
        }

and then pass either f1, f1prime or f2, f2prime to to your Newton function. This may be accomplished in C by using function pointers.

We would like to be able to write a C statement like:

        root = Newton( xGuess, f1, f1prime, tolerance, &successIndicator );

The value returned by Newton would be the approximate root and the value of the variable successIndicator could indicate whether or not Newton's method coverged. The parameters f1 and f1prime would refer to the particular C functions that implement the math function whose root is to be determined. And the parameter tolerance would specify the requested error tolerance for the root.

In this case, the Newton function would have to have the prototype:

        double Newton( double x_1, 
                       double (*f)(double), 
                       double (*fprime)(double), 
                       double tolerance,
                       int *successIndicator );

Note that the parameter declarations for the Newton function contain the prototypes for the two functions f and fprime.

To call the functions f or fprime within the Newton function, you would have to write, for example,:

    double x = 42.0, fofx, fprimeofx;
    fofx = (*f)(x);
    fprimeofx = (*fprime)(x);

See section 17.7 of your text for more details.

Now modify your programs from parts (1) and (2) so that you have functions named Bisection and Newton that may be used to find roots of any function.

Lab 11b: Rootfinding and function pointers. (Tues Nov 22, Thurs Nov 24, Fri Nov 25, 2005)

Lab 11b: Rootfinding and function pointers.
(Tues Nov 22, Thurs Nov 24, Fri Nov 25, 2005)