Approximate Lazy Greedy

The approximate lazy greedy algorithm is a simple extension of the lazy greedy algorithm that, rather than requiring that an element remains at the top of the priority queue after being re-evaluated, only requires that the gain is within a certain user-defined percentage of the best gain to be selected. The key point in this approach is that finding the very best element while maintaining the priority queue may be expensive, but finding elements that are good enough is simple. While the best percentage to use is data set specific, even values near 1 can lead to large savings in computation.

API Reference

The approximate lazy/accelerated greedy algorithm for optimization.

This optimization approach is the lazy/accelerated greedy algorithm. It will return the same subset as the naive greedy algorithm, but it uses a priority queue to store the examples to prevent the repeated evaluation of examples that are unlikely to be the next selected one. The benefit of using this approach are that using a priority queue can significantly improve the speed of optimization, but the downsides are that maintaining a priority queue can be costly and that it’s difficult to parallelize the approach or put it on a GPU.

Parameters

self.function : base.SubmodularSelection
A submodular function that implements the _calculate_gains and _select_next methods. This is the function that will be optimized.
self.verbose : bool
Whether to display a progress bar during the optimization process.

Attributes

self.function : base.SubmodularSelection
A submodular function that implements the _calculate_gains and _select_next methods. This is the function that will be optimized.
self.verbose : bool
Whether to display a progress bar during the optimization process.
self.pq : utils.PriorityQueue
The priority queue used to order examples for evaluation.
self.gains_ : numpy.ndarray or None
The gain that each example would give the last time that it was evaluated.