Model Compression
Cristian Bucila, Rich Caruana, Alexandru Niculescu-Mizil
Often the best performing supervised learning models are ensembles of hundreds or thousands of base-level classifiers. Unfortunately, the space required to store this many classifiers, and the time required to execute them at run-time, prohibits their use in applications where test sets are large (e.g. Google), where storage space is at a premium (e.g. PDAs), and where computational power is limited (e.g. hearing aids). We present a method for "compressing" large, complex ensembles into smaller, faster models, usually without significant loss in performance.
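The core recipe is simple enough to sketch. Below is a minimal, hedged illustration using scikit-learn stand-ins: the paper's experiments use bagged/boosted ensembles, a neural-net mimic, and the MUNGE pseudo-data generator, none of which appear literally here.

```python
# Minimal sketch of the compression recipe; the Gaussian-jitter pseudo-data
# below is a crude stand-in for the paper's MUNGE generator.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# 1. Train the large, slow ensemble on the labeled data.
ensemble = RandomForestClassifier(n_estimators=500, random_state=0).fit(X, y)

# 2. Generate unlabeled pseudo-data and label it with the ensemble
#    (hard labels, as in the paper).
rng = np.random.default_rng(0)
X_pseudo = X[rng.integers(0, len(X), 20000)] + rng.normal(0, 0.1, (20000, 20))
y_pseudo = ensemble.predict(X_pseudo)

# 3. Fit one small, fast model to mimic the ensemble on the pseudo-data.
mimic = DecisionTreeClassifier(max_depth=10).fit(X_pseudo, y_pseudo)
```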
- Found it remarkable that the improved pseudo-data generation algorithm (MUNGE) was critical to making the compression technique work.
- Related this to the previously discussed Deep Compression paper. This paper achieved much higher compression ratios (size reduction by a factor of 1000, as opposed to the maximum of 49 in the Deep Compression paper), but this project was compressing ensembles, whereas the other paper was compressing a single network. Potentially both compression techniques could be applied to a given solution.
- Reflected on the distinction between training on the soft outputs vs. training on the hard classifications. Rich Caruana gave a talk at the University of Alberta and mentioned training the smaller network on the soft outputs of the ensemble. This paper describes expanding the training set by labeling generated inputs with the ensemble, but not "soft-labeling" them. Speculated that the use of soft outputs for training was a later development (a sketch of that variant follows these notes).
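A sketch of the soft-output variant speculated about above, reusing `ensemble` and `X_pseudo` from the earlier block. This extrapolates beyond the paper (which uses hard labels), so treat it as an assumption rather than the authors' method.

```python
# Soft-label variant (distillation-style); my extrapolation, not what the
# 2006 paper does. Reuses ensemble and X_pseudo from the sketch above.
from sklearn.neural_network import MLPRegressor

# Target the ensemble's class-1 probability rather than its 0/1 decision.
p_pseudo = ensemble.predict_proba(X_pseudo)[:, 1]
mimic_soft = MLPRegressor(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
mimic_soft.fit(X_pseudo, p_pseudo)

# Classify by thresholding the mimic's predicted probability at 0.5.
y_hat = (mimic_soft.predict(X_pseudo) > 0.5).astype(int)
```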
Deep Photo Style Transfer
Fujun Luan, Sylvain Paris, Eli Shechtman, Kavita Bala
This paper introduces a deep-learning approach to photographic style transfer that handles a large variety of image content while faithfully transferring the reference style. Our approach builds upon the recent work on painterly transfer that separates style from the content of an image by considering different layers of a neural network. However, as is, this approach is not suitable for photorealistic style transfer. Even when both the input and reference images are photographs, the output still exhibits distortions reminiscent of a painting. Our contribution is to constrain the transformation from the input to the output to be locally affine in colorspace, and to express this constraint as a custom fully differentiable energy term. We show that this approach successfully suppresses distortion and yields satisfying photorealistic style transfers in a broad variety of scenarios, including transfer of the time of day, weather, season, and artistic edits.
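To build intuition for the locally affine constraint, the probe below fits the best per-patch affine recoloring and measures the residual. This is only a plain-NumPy illustration; the paper expresses the constraint through the Matting Laplacian of Levin et al. as a differentiable energy term, not through this least-squares fit.

```python
# Intuition probe for the locally-affine-in-colorspace constraint.
import numpy as np

def affine_residual(inp_patch, out_patch):
    """How far an output patch is from an affine function of the input.

    inp_patch, out_patch: (h, w, 3) RGB patches at the same location.
    Returns the mean squared residual of the best fit out = A @ in + b.
    """
    X = inp_patch.reshape(-1, 3)
    Y = out_patch.reshape(-1, 3)
    X1 = np.hstack([X, np.ones((len(X), 1))])        # append bias term
    coeffs, *_ = np.linalg.lstsq(X1, Y, rcond=None)  # fit A (3x3) and b
    return np.mean((X1 @ coeffs - Y) ** 2)

# Near-zero residual means the local transform is a photorealistic
# recoloring; a large residual signals painting-like spatial distortion.
```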
- The example images did a good job of demonstrating the effectiveness of the approach in comparison to previous approaches.
- Felt more background reading would be necessary to understand the details. In particular, the loss function was difficult to follow, and it would also help to have in mind how training works for style transfer problems in general (see the loss sketch after these notes).
- The work combined an impressively wide range of prior techniques.
- Noted how such combinations of techniques in machine learning (in this case, image segmentation in conjunction with style transfer) can produce significant advances in capability; this looks like an example of emergent features in AI, and perhaps an indication of a trend that will continue as the set of approaches widens and more complex combinations develop.
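For reference while re-reading, here is the objective as I understand it from the paper (notation follows Luan et al.; treat the exact normalizations as my reconstruction rather than a verbatim quote):

```latex
% Sketch of the full objective: alpha_l and beta_l weight the VGG layers,
% Gamma balances style against content, lambda weights the photorealism term.
\begin{align*}
\mathcal{L}_{\text{total}} &= \sum_{\ell=1}^{L} \alpha_\ell \,\mathcal{L}_c^\ell
  \;+\; \Gamma \sum_{\ell=1}^{L} \beta_\ell \,\mathcal{L}_{s+}^\ell
  \;+\; \lambda \,\mathcal{L}_m \\
\mathcal{L}_c^\ell &= \frac{1}{2 N_\ell D_\ell}
  \sum_{i,j} \bigl(F_\ell[O] - F_\ell[I]\bigr)_{ij}^2
  && \text{(content: match VGG features of input $I$)} \\
\mathcal{L}_{s+}^\ell &= \sum_{c=1}^{C} \frac{1}{2 N_{\ell,c}^2}
  \sum_{i,j} \bigl(G_{\ell,c}[O] - G_{\ell,c}[S]\bigr)_{ij}^2
  && \text{(style: Gram matrices masked per segmentation class $c$)} \\
\mathcal{L}_m &= \sum_{c=1}^{3} V_c[O]^{\top} \mathcal{M}_I \, V_c[O]
  && \text{(photorealism: Matting Laplacian $\mathcal{M}_I$ of $I$)}
\end{align*}
```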