Lazy Greedy

The lazy/accelerated greedy algorithm for optimization.

The lazy (or accelerated) greedy algorithm is a fast alternate to the naive greedy algorithm that results in the same items being selected. This algorithm exploits the diminishing returns property of submodular functions in order to avoid re-evaluating examples that are known to provide little gain. If an example has a small gain relative to other examples, it is unlikely to be the next selected example because that gain can only go down as more items are selected. Thus, the example should only be re-evaluated once the gains of other examples have gotten smaller.

The key idea of the lazy greedy algorithm is to maintain a priority queue where the examples are the elements in the queue and the priorities are the gains the last time they were evaluated. The algorithm has two steps. The first step is to calculate the gain that each example would have if selected first (the modular upper bound) and populate the priority queue using these values. The second step is to recalculate the gain of the first example in the priority queue and then add the example back into the queue. If the example remains at the front of the queue it is selected because no other example could have a larger gain once re-evaluated (due to the diminishing returns property).

While the worst case time complexity of this algorithm is the same as the naive greedy algorithm, in practice it can be orders of magnitude faster. Empirically, it appears to accelerate graph-based functions much more than it does feature-based ones. Functions also seem to be more accelerated the more curved they are.

param self.function:
 A submodular function that implements the _calculate_gains and _select_next methods. This is the function that will be optimized.
type self.function:
 base.BaseSelection
param self.verbose:
 Whether to display a progress bar during the optimization process.
type self.verbose:
 bool
self.function

A submodular function that implements the _calculate_gains and _select_next methods. This is the function that will be optimized.

Type:base.BaseSelection
self.verbose

Whether to display a progress bar during the optimization process.

Type:bool
self.pq

The priority queue used to order examples for evaluation.

Type:utils.PriorityQueue
self.gains_

The gain that each example would give the last time that it was evaluated.

Type:numpy.ndarray or None