optimization: a technical overview

(This is the fourth in the PuneTech series of articles on optimization by Dr. Narayan Venkatasubramanyan, an Optimization Guru and one of the original pioneers in applying Optimization to Supply Chain Management. The first one was an ‘overview’ case study of optimization. The second was architecture of a decision support system. The third was optimization and organizational readiness for change.

For Dr. Narayan Venkatasubramanyan’s detailed bio, please click here. For the full series of articles, click here.)

this is a follow-up to optimization: a case study. frequent references in this article to details in that article would make this one difficult to read for someone who hasn’t at least skimmed through that.

the problem of choice

the wikipedia article on optimization provides a great overview of the field. it does a thorough job by providing a brief history of the field of mathematical optimization, breaking down the field into its various sub-fields, and even making a passing reference to commercially available packages that help in the rapid development of optimization-based solutions. the rich set of links in this page lead to detailed discussions of each of the topics touched on in the overview.

i’m tempted to stop here and say that my job is done but there is one slight problem: there is a complete absence of any reference to helicopter scheduling in an offshore oil-field. not a trace!

this brings me to the biggest problem facing a young practitioner in the field: what to do when faced with a practical problem?

of course, the first instinct is to run with the technique one is most familiar with. being among the few in our mba program that had chosen the elective titled “selected topics in operations research” (a title that i’m now convinced was designed to bore and/or scare off prospective students who weren’t self-selected card-carrying nerds), we came to the problem of helicopter scheduling armed with a wealth of text-book knowledge.

an overview of linear programming

A series of linear constraints on two variable...
the lines represent the constraints. the blue region is the set of all “permissible values”. the objective function is used to choose one (“the most optimal”) out of the blue points. image via wikipedia

having recently studied linear and integer programming, we first tried to write down a mathematical formulation of the problem. we knew we could describe each sortie in terms of variables (known as decision variables). we then had to write down constraints that ensured the following:

  • any set of values of those decision variables that satisfied all the constrains would correspond to a sortie
  • any sortie could be described by a set of permissible set of values of those decision variables

this approach is one of the cornerstones of mathematical programming: given a practical situation to optimize, first write down a set of equations whose solutions have a one-to-one correspondence to the set of possible decisions. typically, these equations have many solutions.

click here for an animated presentation that shows how the solutions to a system of inequalities can be viewed graphically.

the other cornerstone is what is called an objective function, i.e., a mathematical function in those same variables that were used to describe the set of all feasible solutions. the solver is directed to pick the “best” solution, i.e., one that maximizes (or minimizes) the objective function.

the set of constraints and the objective function together constitute a mathematical programming problem. the solution that maximizes (or minimizes) the objective function is called an optimal solution.

linear programming – an example

googling for “linear programming examples” leads to millions of hits, so let me borrow an example at random from here: “A farmer has 10 acres to plant in wheat and rye. He has to plant at least 7 acres. However, he has only $1200 to spend and each acre of wheat costs $200 to plant and each acre of rye costs $100 to plant. Moreover, the farmer has to get the planting done in 12 hours and it takes an hour to plant an acre of wheat and 2 hours to plant an acre of rye. If the profit is $500 per acre of wheat and $300 per acre of rye how many acres of each should be planted to maximize profits?”

the decisions the farmer needs to make are: how many acres of wheat to plant? how many acres of rye to plant? let us call these x and y respectively.

so what values can x and y take?

  • since we know that he has only 10 acres, it is clear that x+y must be less than 10.
  • the problem says that he has to plant at least 7 acres. we have two choices: we can be good students and write down the constraint “x+y >= 7” or we can be good practitioners and demand to know more about the origins of this constraint (i’m sure every OR professional of long standing has scars to show from the times when they failed to ask that question.)
  • the budget constraint implies that 200x + 100y <= 1200. again, should we not be asking why this farmer cannot borrow money if doing so will increase his returns?
  • finally, the time constraint translates into x + 2y <= 12. can he not employ farm-hands to increase his options?
  • the non-negativity constraints (x, y >= 0) are often forgotten. in the absence of these constraints, the farmer could plant a negative amount of rye because doing so would seem to get him more land, more money, and more time. clearly, this is practically impossible.

as you will see if you were to scroll down that page, these inequalities define a triangular region in the x,y plane. all points on that triangle and its interior represents feasible solutions: i.e., if you were to pick a point, say (5,2), it means that the the farmer plants 5 acres each of wheat and 2 acres of rye. it is easy to confirm that this represents no more than 10 acres, no less than 7 acres, no more than $1200 and no more than 12 hours. but is this the best solution? or is there another point within that triangle?

this is where the objective function helps. the objective is to maximize the profit earner, i.e., maximize 500x + 300y. from among all the points (x,y) in that triangle, which one has the highest value for 500x + 300y?

this is the essence of linear programming. LPs are a subset of problems that are called mathematical programs.

real life isn’t always lp

in practice, not all mathematical programs are equally hard. as we saw above, if all the constraints and the objective function are linear in the decision variables and if the decision variables can take on any real value, we have a linear program. this is the easiest class of mathematical programs. linear programming models can be used to describe, sometimes approximately,a large number of commercially interesting problems like supply chain planning. commercial packages like OPL, GAMS, AMPL, etc can be used to model such problems without having to know much programming. packages like CPLEX can solve problems with millions of decision variables and constraints and produce an optimal solution in reasonable time. lately, there have been many open source solvers (e.g., GLPK) that have been growing in their capability and competing with commercial packages.

Illustrates a cutting plane algorithm to solve...
integer programming problems constrain the solution to specific discrete values. while the blue lines represent the “feasible region”, the solution is only allowed to take on values represented by the red dots. this makes the problem significantly more difficult. image via wikipedia

in many interesting commercial problems, the decision variables is required to take on discrete values. for example, a sortie that carries 1/3 of a passenger from point a to point b and transports the other 2/3 on a second flight from point a to point b would not work in practice. a helicopter that lands 0.3 in point c and 0.7 in point d is equally impractical. these variables have to be restricted to integer values. such problems are called integer programming problems. (there is a special class of problems in which the decision variables are required to be 0 or 1; such problems are called 0-1 programming problems.) integer programming problems are surprisingly hard to solve. such problems occur routinely in scheduling problems as well as in any problem that involves discrete decisions. commercial packages like CPLEX include a variety of sophisticated techniques to find good (although not always optimal) solutions to such problems. what makes these problems hard is the reality that the solution time for such problems grows exponentially with the growth in the size of the problem.

another class of interesting commercial problems involves non-linear constraints and/or objective functions. such problems occur routinely in situations such refinery planning where the dynamics of the process cannot be described (even approximately) with linear functions. some non-linear problems are relatively easy because they are guaranteed to have unique minima (or maxima). such well-behaved problems are easy to solve because one can always move along an improving path and find the optimal solution. when the functions involved are non-convex, you could have local minima (or maxima) that are worse than the global minima (or maxima). such problems are relatively hard because short-sighted algorithms could find a local minimum and get stuck in it.

fortunately for us, the helicopter scheduling problem had no non-linear effects (at least none that we accounted for in our model). unfortunately for us, the discrete constraints were themselves extremely hard to deal with. as we wrote down the formulation on paper, it became quickly apparent that the sheer size and complexity of the problem was beyond the capabilities of the IBM PC-XT that we had at our disposal. after kicking this idea around for a bit, we abandoned this approach.

resorting to heuristics

we decided to resort to a heuristic approach, i.e., an approach that used a set of rules to find good solutions to the problem. the approach we took involved the enumeration of all possible paths on a search tree and then an evaluation of those paths to find the most efficient one. for example, if the sortie was required to start at point A and drop off m1 men at point B and m2 men at point C, the helicopter could

  • leave point A with the m1 men and proceed to point B, or
  • leave point A with the m2 men and proceed to point C, or
  • leave point A with the m1 men and some of the m2 men and proceed to point B, or
  • leave point A with the m1 men and some of the m2 men and proceed to point C, or
  • . . .

if we were to select the first possibility, it would drop off the m1 men and then consider all the options available to it (return to A for the m2 men? fly to point D to refuel?)

we would then traverse this tree enumerating all the paths and evaluating them for their total cost. finally, we would pick the “best” path and publish it to the radio operator.

at first, this may seem ridiculous. the explosion of possibilities meant that this tree was daunting.

there were several ways around this problem. firstly, we never really explicitly enumerated all possible paths. we built out the possibilities as we went, keeping the best solution until we found one that was better. although the number of possible paths that a helicopter could fly in the course of a sortie was huge, there were simple rules that directed the search in promising directions so that the algorithm could quickly find a “good” sortie. once a complete sortie had been found, the algorithm could then use it to prune searches down branches that seemed to hold no promise for a better solution. the trick was to tune the search direction and prune the tree without eliminating any feasible possibilities. of course, aggressive pruning would speed up the search but could end up eliminating good solutions. similarly, good rules to direct the search could help find good solutions quickly but could defer searches in non-obvious directions. since we were limited in time, so the search tree was never completely searched, so if the rules were poor, good solutions could be pushed out so late in the search that they were never found, at least not in time to be implemented.

one of the nice benefits of this approach was that it allowed the radio operator to lock down the first few steps in the sortie and leave the computer to continue to search for a good solution for the remainder of the sortie. this allowed the optimizer to continue to run even after the sortie had begun. this bought the algorithm precious time. allowing the radio operator the ability to override also had the added benefit of putting the user in control in case what the system recommended was infeasible or undesirable.

notice that this approach is quite far from mathematical programming. there is no guarantee of an optimal solution (unless one can guarantee that pruning was never too aggressive and that we exhaustively searched the tree, neither of which could be guaranteed in practical cases). nevertheless, this turned out to be quite an effective strategy because it found a good solution quickly and then tried to improve on the solution within the time it was allowed.

traditional operations research vs. artificial intelligence

this may be a good juncture for an aside: the field of optimization has traditionally been the domain of operations researchers (i.e., applied mathematicians and industrial engineers). even though the field of artificial intelligence in computer science has been the source of many techniques that effectively solve many of the same problems as operations research techniques do, OR-traditionalists have always tended to look askance at their lowly competitors due to the perceived lack of rigour in the AI techniques. this attitude is apparent in the wikipedia article too: after listing all the approaches that are born from mathematical optimization, it introduces “non-traditional” methods with a somewhat off-handed “Here are a few other popular methods:” i find this both amusing and a little disappointing. there have been a few honest attempts at bringing these two fields together but a lot more can be done (i believe). it would be interesting to see how someone steeped in the AI tradition would have approached this problem. perhaps many of the techniques for directing the search and pruning the tree are specific instances of general approaches studied in that discipline.

if there is a moral to this angle of our off-shore adventures, it is this: when approaching an optimization problem, it is tempting to shoot for the stars by going down a rigorous path. often, reality intrudes. even when making technical choices, we need to account for the context in which the software will be used, how much time there is to solve the problem, what are the computing resources available, and how it will fit into the normal routine of work.

other articles in this series

this article is the fourth in the series of short explorations related to the application of optimization. i’d like to share what i’ve learned over a career spent largely in the business of applying optimization to real-world problems. interestingly, there is a lot more to practical optimization than models and algorithms. each of the the links leads to a piece that dwells on one particular aspect.

optimization: a case study
architecture of a decision-support system
optimization and organizational readiness for change
optimization: a technical overview (this article)

About the author – Dr. Narayan Venkatasubramanyan

Dr. Narayan Venkatasubramanyan has spent over two decades applying a rare combination of quantitative skills, business knowledge, and the ability to think from first principles to real world business problems. He currently consults in several areas including supply chain and health care management. As a Fellow at i2 Technologies, he tackled supply chains problems in areas as diverse as computer assembly, semiconductor manufacturer, consumer goods, steel, and automotive. Prior to that, he worked with several airlines on their aircraft and crew scheduling problems. He topped off his days at IIT-Bombay and IIM-Ahmedabad with a Ph.D. in Operations Research from the University of Wisconsin-Madison.

He is presently based in Dallas, USA and travels extensively all over the world during the course of his consulting assignments. You can also find Narayan on Linkedin at: http://www.linkedin.com/in/narayan3rdeye

Reblog this post [with Zemanta]