CMP EMBEDDED.COM

Login | Register     Welcome Guest  
HOME DESIGN PRODUCTS COLUMNS E-LEARNING CONFERENCES CODE FORUMS/BLOGS NEWSLETTERS CONTACT FEATURES RSS RSS


From MATLAB to Minimization

Jack Crenshaw
After a brief MATLAB update, Jack continues his discussion of function minimization, showing how to improve the efficiency of the calculation.

Before getting into this month’s topic, I’d like to bring you up to date on the continuing saga of math analysis tools. Regular readers will surely recall that I recently voiced some not-so-flattering complaints about the stability of Mathcad v. 8.0-8.03. I speculated that it was time for me to start looking around for a new general-purpose tool. Well, I’ve found one. Boy, have I found one!

The tool is the one I mentioned a couple of months ago: MATLAB, a product of The MathWorks. MATLAB is primarily a tool for doing vector and matrix algebra, but it will also do the other kind as well. MATLAB is popular in the electrical engineering departments of universities worldwide. In that arena, it’s about as ubiquitous as Unix was in universities in the ’80s. Electrical engineers cut their teeth on MATLAB these days, and not surprisingly, MATLAB has a correspondingly huge collection of utilities that help in designing stable control systems. Look at Amazon.com’s Web site and you can find whole “shelves” full of EE books which include in their title, “with exercises in MATLAB.” I bought a bunch. With a total expenditure running something like $1,500 in the last three months, I seem to have adopted Amazon.com as a dependent.

I won’t pretend to give a definitive review of MATLAB here, because I haven’t had nearly enough riding time on it. In fact, all my work with it has been done at my day job. I’ve been so busy here at home, I blushingly admit I haven’t even gotten around to installing it.

Nevertheless, my first impressions are as I expected: I’m in love.

Let me hasten to add: MATLAB is not a GUI-based product like Mathcad. In fact, Mathcad is almost unique in that respect. Only in Mathcad can you write equations that look like equations, rather than something typed in character mode on a glass TTY. That sexy look of Mathcad is hard to beat; if only it would work properly.

In the defense of Mathsoft, the vendor of Mathcad, I should add that I do believe they’re trying hard to restore it to its former glory as a useful tool. I also must hasten to add that they really need to try; even with three service packs, Mathcad is still flaky. I used it for most of the last couple of days, and, on average, it crashed hourly. Twice I had to reboot the system to get Windows working properly again. I’ve found at least two serious bugs in the Mathcad equation symbolic engine, which is surprising, since that engine is based on the highly respected Maple.

How serious, you ask? Well, Mathcad v. 8.03 generated the following “simplification”:

You have to admit, that’s a pretty fundamental booboo. If you can’t trust a symbolic math program to get simple algebra right, what good is it?

By contrast, MATLAB is solid as a rock. I’ve been using it now, quite heavily, for at least two months. It hasn’t crashed once. Not once. In fact, though this seems hard to explain, I certainly get the impression that MATLAB, running under Windows 98, is noticeably more stable than Windows 98 itself. I’ve discussed this with the folks at The MathWorks who say they get this kind of report a lot.

But manipulating matrices and finding eigenvalues is by no means what MATLAB is about anymore. For one thing, MATLAB includes a huge collection of “toolboxes,” which support some truly serious computations, mostly for problems in control theory. Browsing through these toolboxes, I must admit to coming away with a serious inferiority complex. I saw tools for things I didn’t even know existed. My first impression was, “Wow, I need to go back to school.” Which might explain, partially at least, why I’ve adopted Amazon.com.

But that’s not the half of it. MATLAB also now includes Simulink v. 2.0, a GUI-based simulation program. I can’t say enough good things about Simulink. I’m in love, totally sold on, committed to—use whatever superlatives you like. With Simulink, I’m like a kid in a candy store. There seems to be no problem I can’t solve.

Simulink works to help create simulations and/or analyses from control system block diagrams. You begin with a blank worksheet, and a menu of black boxes. You drag the boxes onto the worksheet and connect them up using lines as though they were wires. (If you’ve ever used the excellent electronics simulation, Electronics Workbench, you know the drill.)

The black boxes include all the things controls engineers like to use, like integrators (1/s boxes), adders, multipliers, transfer functions, trig and other math functions, and so on. You can work in the analog, continuous-time domain, which is the usual situation, or you can work in the discrete-time domain, using z -transforms. You can also mix the two, using sample-and-holds to connect the analog and digital worlds. You can run different tasks at different frequencies. You can combine continuous equations of motion with discrete events and discontinuities.

You wanna see (simulated) real-time output? Simply drag over a box called “scope” and connect it up. The next time you press “simulate,” you’ll see the output waveform as it’s generated. A single scope can display as many waveforms as your screen has room for.

Are you worried about jitter caused by the finite resolution in the integer arithmetic you’ll be using? No problem; Simulink will let you do things in integer arithmetic, to any level of resolution. It will also accurately reproduce overflow situations, saturations, and so on.

The number of black boxes in that menu is staggering, and some of them are decidedly nontrivial. For example, in one library is the little box named “Kalman filter.”

If your purpose is to design a feedback control system to stabilize a complicated, multivariate, nonlinear system, you can do so, and feel free to play with the coefficients until you’re satisfied with the system’s performance. Two input blocks simulate a step function and an impulse function, so it’s easy to see what the response is going to be.

However, after you’ve had your fun fooling around with constants, the real fun begins. Press a few buttons, and Simulink will linearize your model around its equilibrium state. Then it will analyze the system and generate the coefficients that produce optimal control of your system, using one of a huge variety of methods, most of which I can’t even pronounce. Careful tweaking of feedback coefficients, as we did it in the good old days, is apparently a thing of the past. I’d suggest that you let Simulink do it for you while you go for coffee, but that would be misleading. Simulink will be done before you ever get out of the door.

Out with the old, in with the new

I’ve been pondering the ramifications of such tools as MATLAB and Simulink, and I think I’m finally seeing the picture with sufficient clarity to understand how a tool like this fits into the general scheme of things. Bear with me as I tell the story and see if you agree.

Dynamic simulation has always been a key application for computers, both analog and digital. In the former case, it was almost the application. At NASA’s Marshall Space Flight Center, where I used to do a lot of work, the analog computing facility filled a whole building. The Babbage Analytical Engine was sold to the 19th century British version of the Department of Defense (DoD) on the basis of its promise for simulating artillery rounds. We’re still doing that, to this day.

Just so we’re on the same page, a dynamic simulation is one that solves for the time history of a dynamic system; that is, a system described by a set of ordinary differential equations:

Where x is some set of system parameters that change with time. This set is called the state vector for the system, though it may be a true vector only in the sense of a one-dimensional C array. The solution of the problem involves numerically integrating Equation 2, given an initial condition:

All systems of ordinary differential equations can be cast into the form of Equation 2. In truth, most dynamic systems are really described by second-order differential equations, since that’s the form taken by Newton’s second law. Even so, the system of second-order equations can be cast into the form of Equation 2 by a change of variables.

The first computer program I ever submitted input to was a dynamic simulation, for simulating lunar trajectories. The first Fortran code I ever wrote was for another simulation, this time for artificial satellites. Eventually, I became something of a simulation guru, and had developed a standard architecture for simulations.

In those days, if you wanted a simulation, you had one decision to make: should I write it myself or ask a colleague to do it? There was no such thing as a canned library of functions, much less a canned solution. Anything we did, we did ourselves. The first thing you were going to have to do was design a numerical integration routine, capable of solving Equation 2 to sufficient accuracy. This usually meant a trip to the library to find out the formulae for numerical integrations. Most people stopped when they got to fixed-step-size, fourth-order Runge-Kutta (RK4), which was stable, accurate, and easy to write, but tricky to use (because of the fixed step size). There was almost no hope of finding a canned RK4 solution. Usually, the implementation for a given simulation was so deeply embedded in the simulation that you’d kill the “organism” trying to remove it.

Also in those days, managers tended to blanch or faint whenever the word “simulation” was mentioned. And for good reason; say, “I think we need to simulate this system,” and the managers had visions of ten people writing code for a year or two—and then, quite likely, never producing something usable. For that reason, we often found ourselves using badly written, badly documented legacy code (still do, for that matter), or making do without a simulation at all.

I mentioned that the first Fortran code I ever wrote was a subroutine for a simulation. Significantly, it added two vectors. Vector arithmetic is the backbone of dynamic simulation (see Equation 2 again). In short order, I had a complete package for doing vector arithmetic, then another for doing matrix arithmetic, yet another for crunching rotation matrices, and so on. Eventually I developed a rather complete toolbox of subroutines, including a nice numerical integrator, which I carried around from job to job, like any good journeyman, in two Army surplus boxes for IBM cards.

Along the way, I developed a reputation (underserved, I believe) for having a strong NIH (Not Invented Here) bias. Given someone else’s simulation to work with, my tendency was to “jack up the hubcap and drive a new car under it.” That is, I’d rewrite the sim using my own subroutine library. This really wasn’t a matter of NIH (at least, I don’t think so). It was a matter of saving time and assuring correctness. I had learned the hard way that other people’s implementations were all too prone to error. I had learned the hard way that far more wrong ways exist to implement RK4 integrators than right ones. Faulty implementation turned out to be the most common source of error, by far. Ditto for canned subroutines. To see just how much detailed scrutiny was required, consider the simple task of finding the length of a three-vector:

The natural, but incorrect, way to write this formula in Fortran was using its “exponent” operator:

R=SQRT(X**2 + Y**2 + Z**2)

This was incorrect because exponentiation made library calls to both the log and exp functions. Do six such calls per instance of Equation 4, and do it four times per integration step, and do that for, say, 100,000 integration steps, and you’re going to waste a lot of computer time. The proper way to write Equation 4 was, and still is:

R=SQRT(X*X + Y*Y + Z*Z)

Well, imagine being given someone else’s simulation of a very complex dynamic system and having to scrutinize it to the level of checking for errors like this and you can see why I wanted to use my own stuff. I found that, most times, I could rewrite the whole simulation from scratch, using routines I trusted, in the time it took to try to figure out what the other fellow had done, and make sure he’d done it correctly.

Between the mid ’70s and mid ’80s, several companies came out with subroutine libraries and other tools. I tried, really I did, to use such libraries, but I usually found myself sliding back into the tried and true routines in my own toolbox. Either the canned library routines had unintelligible names and awkward calling lists, or they were implemented badly, or the vendors wouldn’t tell you how they were implemented at all. They were truly black boxes, and I do mean black .

So here I was, with my toolbox of math functions, my numerical integrator, and the rest. I carried them around the way a pool shark carries his cue in a leather case. I used them, and only them, for a very good reason: I trusted them. I had researched such topics as numerical integration to the point where I was quite satisfied I had near-optimal solutions, and certainly correct ones. I didn’t trust someone else’s tools, especially those that they wouldn’t give me visibility into. I’d been burned too many times by rotten implementations, or even rotten methods.

Now stop for a moment and ask yourself how many others there were, and still are, like me, who also carried around their toolboxes. How many times must we write a routine to add two vectors? Imagine the incredible duplications of effort. Finally, imagine the chances of getting any one of us to use the other guys’ tools.

The biggest change over the last 10 years or so is that we now have access to tools we can trust. This is certainly true of the MATLAB crowd. After all, their whole raison d’etre is to perform vector and matrix math. Do you think I’m going to write, say, a matrix inversion routine that’s better and faster than theirs? No way; the best I can hope for is a tie. If I have any doubts, I need only read the list of the people who wrote their routines, which reads like a who’s who of numerical computing. I’ve done my share of research into numerical methods, but these are the guys who wrote the books.

Somewhere I read that as recently as 10 years ago, if you wanted to simulate, study, and optimize a control system, you had no choice but to write the simulation yourself. And you’d better figure several weeks, at least, to write it (well, that’s a big improvement; it used to take several months to years). Today, you can do the same thing in a tool like Simulink, in an hour or two. I can’t begin to express to you how utterly liberating this ability is. I can solve problems today that I would never before have dreamed of attempting. That’s got to be a step forward.

To show you just how very easy it is to build a simulation nowadays, consider Figure 1 , which is a handy-dandy, universal solution, implemented in Simulink. I can’t stress enough: this is the complete simulation. It solves all problems described by Equation 2, except that, of course, I haven’t described what goes into the box marked f ( x , t ). We probably would never really implement a simulation this way, mainly because it complicates the form of what goes into the box. Nevertheless, it’s a complete solution.

Having met Simulink, I’m joining the canned, black box revolution, happily waving my banner in the parade. Hey, I never wanted to reinvent the wheel anyway. I only did so because I couldn’t find a trustworthy alternative. Now I have. I still have my subroutine library, which is now in its n th incarnation. I have copies in Fortran, Pascal, C, and C++. I still do occasional research into the best methods of doing things; the topic of function minimization, that we’ve been discussing, is a case in point. But I don’t need my library nearly as much as I used to.

Back to work

Speaking of minimization, let’s get back to it. You’ll recall from last month that I defined a test function:

We showed that the function has a known minimum of –1, at the value

Last month, I gave you a teaser for the solution, starting with the dumbest algorithm I could think of: trial and error. That program is shown in Listing 1 . In it, we simply try a bunch of values of x , and take the one that yields the smallest f ( x ).

Hey, don’t laugh! It may be simple-minded, but if you only have to do this once or twice, such as to compute the best coefficients for a function fit, there’s no law that says you have to be efficient. I first used this method on my old TRS-80, programmed in BASIC. I used it to find an optimal approximation to the arctangent function, which means I had to have loops within loops. I left it to run overnight, and the next morning I had my answers. No sweat. Sometimes people tend to forget that, when it comes to PCs, the P stands for personal . That means the computer is mine, all mine; its job is to sit there all day, every day, waiting for me to tell it to do something. Then it does it. That’s what it’s paid to do. Sometimes I don’t care if it’s used efficiently or not.

A little modularity

Before we go any further with more sophisticated algorithms, let’s try to modularize Listing 1 a bit. With any luck, we’re going to end up with a very efficient algorithm to replace the dumb one, but the change should be relatively transparent to the user. We should encapsulate the algorithm in a user-callable function, so that future improvements won’t change the way the function is called.

Right off the bat, we’re going to find ourselves faced with a decision: does the algorithm have the right to assume that we know that there’s a minimum somewhere in the region of interest, or should we force it to go find that minimum? Old-time readers might remember the super-duper root finder that I presented a few years ago (“Roots of a Function,” August 1994, p. 13). With that program, we wanted to find the place at which some function had a root:

Of all the blunders you can make, the one thing you don’t want to do, either with root-finders or minimum finders, is to lose the solution. In the case of a root finder, it’s easy to make sure you don’t. If you enter the function with two points giving different signs for f ( x ), you can be 100% sure that, somewhere in between, there’s a place where f ( x ) = 0 is satisfied (that’s the central limit theorem). Once the root-finder takes control, it will act to make sure that it always keeps the root bracketed between two points. To make sure it’s going to work right, we need only give it a good starting point, which is to say, we should give it end points that bracket the root, going in.

The same sort of concept works with the minimum-finder. We want to be able to assure the routine that there’s a real minimum in the region of interest. How we assure that is another matter, and one we’ll discuss in a moment. For now, let’s assume that I know about the function at hand to be able to define a region, however broad, that brackets a minimum. In that case, we’ll pass the two bracketing points to the function.

Listing 2 shows the result. Note that I’m passing the name of the function to minimize as a parameter. When the function returns, it should give me back the best guess for the value of x that minimizes f ( x ). You’ll note also that I’ve changed the initial value of ymin, which assures that we find the minimum, even if it’s at x 0 or x 1.

Because we’ve moved all the processing into the minimizer, the main program is a shell that simply calls that function:

void main(void){
long n;
double xmin;
cin >> n;
xmin = minimize(f, 0, 1, n);
cout 
<
<
xmin 
<
<
' ' 
<
<
f(xmin) 
<
<
endl;
}

Starting to improve

Although the brute-force algorithm we’re using is dumb in the extreme, it really doesn’t work all that badly. A 400MHz Pentium II, with its built-in floating-point processor, makes up for a lot of stupidity in algorithms. However, consider the number of steps we must take to get the result. We get essentially perfect results in 10,000 steps, but that’s only because the solution happens to be very close to a grid point (see Equation 6). Even so, notice that the last 20% of our steps are wasted. The minimizer finds the minimum at x = 0.7937, but it steams ahead to 1.0 regardless. That’s pointless, or almost so; once the minimum has been found, we might as well give it up and return.

The minimizer of Listing 2 is what’s called a global minimizer. It will find the value of x that gives the lowest f ( x ) throughout the entire input range, even if more than one minimum in that range exists. That’s a nice feature, and one that we might want to tuck away for a rainy day. You see, then, that the code of Listing 2 isn’t quite as dumb as it looks; it can handle problems that virtually no other minimizer can, which is something nice to know.

In most cases, though, we can’t afford the wasted time for global minimization. When we call a procedure to find the minimum of a function, it’s usually because we have additional information that gives us reason to believe there’s only one minimum in the range of interest. In that case, we don’t want to spend the time looking for others. Instead, we agree, in the interest of efficiency, to take the first minimum we find, even if it’s at the very first point, x 0.

If you look at the code, you can see that we’re doing nothing fancy to find the minimum; we’re simply stepping along in x , sliding x 0 along until we find the minimum. How can we know that we’ve found one? Simple: the function value is higher on the current step than on the last step. As soon as this happens, we know we’ve passed the minimum; it’s time to get out.

Listing 3 shows the function, modified to kick out on the first step with increasing y .

Decimating

The biggest problem with our algorithm is, of course, that we have to take too many steps to find the minimum: 7,838 for our test case. Can we reduce this number? Of course we can. One obvious way to do so is not to start with such a small step size. Suppose, for example, we started with n = 10. We’d evaluate the function for 0, 0.1, 0.2, and so on, passing the minimum at x = 0.8. See Table 1 for the values we have at this point.

As it turns out, we’ve already passed the mininum when we get to x = 0.8. That minimum is between 0.7 and 0.8. Our algorithm doesn’t know that, though; it only sees that we’ve passed it at the next point, which is x = 0.9.

At this juncture, what do we know? We clearly still don’t know where the minimum is. It could be between 0.7 and 0.8 (and it is), or between 0.8 and 0.9. What we do know for sure is that there’s a minimum somewhere between 0.7 and 0.9. We know that because we can see at least one point, at 0.8, where the function value is less than both end points.

At this point we’ve evaluated the function 10 times, at x = 0 through x = 0.9. Now we can run the same function again over the range 0.7 to 0.9, still using 10 divisions. We’ll find that we end up evaluating the function for values of x of 0.70, 0.72, and so on, through 0.82. At the end of that run, we will have evaluated the function seven more times, for a total of 17. Repeating the process to an accuracy of one part in 20,000, we find that we get the answer with a total of 46 function evaluations. If we can agree that 46 evaluations is better than 7,838, then we can agree that successive reduction of the range is better than a single pass.

This is actually the way I did it with my old TRS-80 program. When you’re running a 2MHz processor in interpreted BASIC, brute force is not an option. So what I actually did was reduce the resolution gradually, by factors of 10, until I got no significant improvement.

We’d like to do better than that, however. We’d like to automate the process of range reduction. Except for some bother at the end points, the change is straightforward enough. We always know that the solution lies somewhere between the last three points, so we take the last point returned and spread out one step on either side. The only thing about the end points is we can’t extend beyond them, so in those cases our spread must be one-sided. Listing 4 gives the resulting code. Try it; I think you’ll find it’s already getting pretty darned effective.

We still have a long way to go, however. So far, we’ve seen that we can hone in on a solution without using a lot of function evaluations, by gradually reducing the range. The question we should ask next is, what’s the best number of intervals to use at each step? Listing 4 uses 10 steps per iteration, but is this the best choice of intervals? I’ll bet you can guess that it is not, and next month we’ll discover what is.

There are some other inefficiencies and other problems in what we’ve done so far. Chief among the inefficiencies is that we find ourselves recalculating a few of the points more than once. Surely we can manage to avoid doing this. Finally, even though we know we’ve captured the solution when we see three points, with the middle one having the smallest value of f ( x ), we don’t really use that middle point in Listing 4 . We’ll correct that oversight next month. We’re closer than you think to a near-optimal implementation, using the minimum number of recalculated points. I’ll be showing you that algorithm, which is perfectly suitable for general-purpose work, next month.

See you then.

Jack W. Crenshaw a senior principal design engineer at Alliant Tech Systems Inc. in Clearwater, FL. He did much early work in the space program and has developed numerous analysis and real-time programs. He holds a PhD in physics from Auburn University. Crenshaw enjoys contact and can be reached via e-mail at jcrens@earthlink.net .

Resources
Figure 1 : Handy-dandy universal simulator
Table 1 : Values near 0.7937
Listing 1 : A dumb minimizer
Listing 2 : A modular minimizer
Listing 3 : An improved version
Listing 4 : Automating range reduction

Return to Table of Contents

Embedded.com Career Center
Looking for a new job?
SEARCH JOBS

Browse all jobs

SPONSOR
RECENT JOB POSTINGS





 :