Meta-Heuristics

The following is a heuristic look at how to develop heuristic problem solvers. It is based on the personal experience of the author in developing heuristic algorithms for various scheduling problems.

Introduction

The word heuristic has many meanings. Following the Dictionary of Algorithms and Data Structures, I will use it to describe a technique that usually, but not always, works or that gives nearly the right answer. Although heuristic is often seen as the opposite of an algorithm (in the sense of mathematical algorithm), for want of a better term I'll use heuristic algorithm to describe software which solves a problem heuristically.

Why would one ever want to use a method that doesn't find a solution in all cases?

The answer normally is: using a heuristic is the only way to find a solution at all, given the constraints that apply.

Examples of such constraints are

limited knowledge
limited response time
limited resources
the problem is really hard, e.g., NP-complete

In short, a heuristic algorithm gives you a tentative solution now instead of a perfect solution at some unspecified time in the (possibly far) future.

Application Domains

Heuristic methods are useful in a broad number of cases. Three typical application domains are

classification
scheduling
search

Personally, I've encountered a number of different scheduling problems amenable to heuristics. I tackled each of them by developing and fine-tuning a tailor-made heuristic. Only in retrospect did I notice the conceptual and structural similarities between these widely differing solutions.

An Analogy

A heuristic search for a solution to a complex problem can be interpreted as finding a path through the state space (also called phase space) of the problem domain. Each degree of freedom of the problem domain is represented by one axis of the (normally highly) multidimensional state space. Each point of the state space corresponds to one specific configuration of the problem domain. Unfortunately, a mind bogglingly huge majority of the points don't represent feasible solutions to the problem.

Lets simplify this analogy a bit to make it more palpable. Imagine being lost in a desert. Although you have a rough knowledge of the desert's shape, you neither know your exact location nor the closest point of exit. Making things worse, you have a serious shortage of drinking water. To escape, you must rely on locally available knowledge (sorry, no GPS) in trying to find a route out.

Add several thousand (or million) degrees of freedom and a few dozen additional constraints and there is the task faced by a typical heuristic solver.

How-To

Immersion

Get familiar with the problem:

You need to be really familiar with it
Best way is total immersion

Think about the problem in the shower, while driving the car, while walking the dog, before you fall asleep, i.e., during most of your waking hours (but not during dinner with your family, most significant other, or new date ;-)
Identify and understand the cost drivers
- factors determining quality of solution
- find relative priority
- be prepared for changes of cost factors
  - relative priority
  - set of factors
  - maybe dependent on job
- examples:
  - machine setup time
  - amount of manual intervention
  - number of preemptions
  - number of slots/task-chains/pallets
  - amount of raw material wasted
hard constraints vs. soft constraints:
- hard constraints must be satisfied to get a viable solution
- violations of soft constraints still yield a viable solution but these violations reduce the quality of the solution
- in most cases, it is impossible to satisfy all soft constraints

Conceptual model

Find a good conceptual model for the problem:

Having the right model makes all the difference
- Trying to save one hour here might cost months of development time later
Ignore common wisdom
Take your time here

Guidelines

Define your priorities
Be ready to compromise
Be prepared for surprise
Beware of YAGNI

Frugal heuristics

Frugal heuristics use as little information as possible to make a decision
KISS
Fast and less accurate is better than slow and slightly more accurate
The time won by a fast, frugal step might be used for other (fast, frugal) steps to improve the result (when necessary)
Frugal heuristics are less susceptible to noise
Consider greedy algorithms (which always take the best immediate, or local, solution while finding an answer)

Specialize

Take the low hanging fruit first:

Reduce the problem to be solved so that solving is easier
- Reduces the state space
Solution to contracted problem might open new ways towards solving the original problem
Often means adding or tightening constraints
Compare to Contract to expand

Generalize

Try to solve a more general problem
- Enlarges the state space
- Opens additional pathes to solution
Often means removing or relaxing constraints
Compare to Expand to contract

Early feedback

Get (and use) feedback as early as possible, as often as possible:

Pester collegues for reviews
Test the behavior of the solution with realistic data
Test huge input sets, too
Weed out non-working ideas ruthlessly
- Throw away any idea if tests show it doesn't work no matter how beautiful/clever/effective it seemed
Factor and refactor

Step-by-step

Divide heuristic solver into mostly independent steps:

Get rudimentary solver running as quickly as possible to get early-feedback.
Implementation and test of each step can be done separately
Sequence of steps might change later
Some (sequence of) steps might be executed more than once
Try making steps as orthogonal as possible
Don't think of steps as forming a pipeline
Don't force steps to one interface

Most difficult first

If solving for one item has consequences for the solution of other items:

Do the most difficult items first
- Metric of difficulty depends on problem domain
- There might be multiple dimensions of difficulty: need to set priorities right
Might need to revisit already solved items do be able to solve "less difficult" items.
- Cleanup steps
- Try to avoid backtracking at all costs

Design for change

Tuning of heuristics requires changes
Changing user requirements
Assume that heuristic solver is going to be used for more than ten years

The following table shows the timeline of requirements changes for one specific heuristic solver originally implemented in 1998:

Year	Small	Medium	Large	Major	Total	Expected
1998	1	1			2	1
1999	1	1			2
2000		1			1
2001			1		1	1
2002	1	2	2	1	6	2
2003	1	1		1	3	1
2004	2	1	1		4
2005	3	1			4
2006	4	1	3	3	11
2007			1	3	4	1
2008	1	2	1		4
Total	14	11	9	8	42	6

Decision timing

Consider the timing of decisions during solving carefully:

premature decisions foreclose paths to possible solutions
overdue decisions leave too many possibilities for the solver to consider
some decisions need to be done early, others need to be delayed
postpone decisions as long as possible

Independence

Keep heuristic solver independent of the application's essential object model:

transform essential input model into solver model
execute solver
transform solver model output into essential output model

Robustness

Produce an incorrect solution instead of no solution:

An incorrect solution still gives information to the user
The user might be able to correct the errors and derive a correct solution
You might be able to see where the heuristic is too weak and improve it