Two-Stage Greedy

The two-stage greedy optimizer is a general purpose framework for combining two optimizers by making the first \(n\) selections out of \(k\) total selections using one optimizer, and then making the remainder using the other. When the first optimizer is random selection and the second approach is naive/lazy greedy, this becomes partial enumeration. By default, the first algorithm is the naive greedy optimizer and the second algorithm is the lazy greedy. This combination results in the same selection as either optimizer individually but replaces the computationally intensive first few steps for the priority queue, where the algorithm may require scanning through almost the entire queue, with the parallelizable naive greedy algorithm. While, in theory, the lazy greedy algorithm will never perform more function calls than the naive greedy algorithm, there are costs associated both with maintaining a priority queue and with evaluating a single example instead of a batch of examples.

This optimizer, with the naive greedy optimizer first and the lazy greedy optimizer second, is the default optimizer for apricot selectors.

API Reference

An approach that uses both naive and lazy greedy algorithms.

This optimization approach starts off by using the naive greedy algorithm to select the first few examples and then switches to use the lazy greedy algorithm. This two stage procedure is designed to overcome the limitations of the lazy greedy algorithm—specifically, the inability to parallelize the implementation. Additionally, early iterations frequently require iterating over most of the examples, which is particularly costly to do on a priority queue. Thus, one can frequently see large speed gains simply by doing the first few iterations using the naive greedy algorithm and then switching to the lazy greedy algorithm.

Parameters

self.function : base.SubmodularSelection
A submodular function that implements the _calculate_gains and _select_next methods. This is the function that will be optimized.
self.n_first_selections : int
The number of selections to perform using the naive greedy algorithm before populating the priority queue and using the lazy greedy algorithm.
self.verbose : bool
Whether to display a progress bar during the optimization process.

Attributes

self.function : base.SubmodularSelection
A submodular function that implements the _calculate_gains and _select_next methods. This is the function that will be optimized.
self.verbose : bool
Whether to display a progress bar during the optimization process.