Abstracts of Invited Talks
-
Sedi Bartz, UBC Okanagan
Optimal c-convex c-antiderivatives and optimal pricing for optimal transport
Given some partial data regarding an optimal transport plan, we consider the family of
solutions to the dual Kantorovich problem that comply with the given initial data. We
construct the envelope of this family which then consists of the optimal solutions. To this
end, we present a recent construction of optimal c-convex c-antiderivatives. We initiate our
talk with a short review of some of the basics of c-convexity theory. If time permits, we will
consider a concrete example of the above construction where the cost function c is a metric.
Then our optimizers are optimal constrained Lipschitz extensions.
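For orientation, the standard objects behind the title are the following (sign conventions
vary across the literature). Given a cost c on X x Y, the dual Kantorovich problem is
\[
\sup \Big\{ \int_X \varphi \, d\mu + \int_Y \psi \, d\nu \;:\; \varphi(x) + \psi(y) \le c(x,y) \ \forall x, y \Big\},
\]
and the c-transform \varphi^c(y) = \inf_{x \in X} \big( c(x,y) - \varphi(x) \big) gives the
largest \psi admissible for a given \varphi; functions arising as c-transforms are the
c-convex functions. When c is a metric d, the admissible \varphi are exactly the 1-Lipschitz
functions and one may take \psi = -\varphi, which is why the optimizers in the metric case
become constrained Lipschitz extensions.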
-
Jesús De Loera, U. C. Davis
Augmentation or pivoting-like algorithms for linear and integer linear optimization
In the study of the simplex method one is interested in the performance of pivot rules and
the number of pivots needed to reach the optimum. Separable convex integer programs can be
solved via similar "pivoting" or augmentation techniques: the steps for linear programs (LPs)
are circuits or edges, while for IPs we use Graver basis moves instead. In this talk we show
that, with the right pivot rule, one can prove strong performance guarantees. We show that
polynomially many augmentation steps suffice if best augmenting steps along Graver basis
directions are performed. Using instead augmentation along directions with best ratio of cost
improvement/unit length, we show that for linear objectives the number of augmentation steps is
bounded by the number of elements in the Graver basis of the problem matrix, giving strongly
polynomial-time algorithms for the solution of special ILPs and reproving some classical cases
of strongly polynomial LPs. This is joint work with Raymond Hemmecke, Technische Universität
München, and Jon Lee, University of Michigan.
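As a concrete illustration of the augmentation framework (not the talk's exact best-ratio
rule), here is a minimal Python sketch of greedy Graver-basis augmentation for min c.x
subject to Ax = b, l <= x <= u, x integer, assuming the Graver basis G and a feasible
starting point are given:

```python
# Greedy Graver-basis augmentation sketch (illustrative only): the Graver
# basis G of A and a feasible integer point x0 are assumed given, and the
# bounds l, u are assumed finite so every step length is finite.
import numpy as np

def max_step(x, g, l, u):
    """Largest integer t >= 0 with l <= x + t*g <= u."""
    steps = [(u[i] - x[i]) // g[i] if g[i] > 0 else (x[i] - l[i]) // -g[i]
             for i in range(len(g)) if g[i] != 0]
    return int(min(steps))

def augment(x0, c, G, l, u):
    x = np.array(x0)
    directions = [s * np.array(g) for g in G for s in (1, -1)]
    while True:
        best_dir, best_gain = None, 0
        for g in directions:
            t = max_step(x, g, l, u)
            gain = -t * int(c @ g)       # objective decrease from moving t*g
            if gain > best_gain:
                best_dir, best_gain = t * g, gain
        if best_dir is None:             # no improving Graver move: optimal
            return x
        x = x + best_dir
```

For linear objectives a feasible point that admits no improving Graver move is optimal (the
Graver basis is a test set), which is why the loop may terminate with a certificate of
optimality rather than a step-size limit.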
-
Hongbo Dong, Washington State University
Fast construction of convex valid constraints for MIQCPs in lower dimensional space
Many strong convex relaxations for mixed-integer quadratically constrained programs (MIQCPs)
are based on intersecting the SDP constraint with polyhedral relaxations in the lifted variable
space. Relaxations of this type are typically large-scale semidefinite programs, whose
efficient and reliable solution is still an active research area. We show, by exploiting a
specific SDP relaxation, how to construct valid convex (nonlinear) cutting surfaces for MIQCPs
in the almost-original variable space in an efficient manner. This framework can be used as a
key component in a cutting-surface algorithm to compute strong bounds for MIQCPs much faster
than solving semidefinite programs with interior point methods. An interesting feature of our
approach is that all computations are of low complexity, and we do not use eigenvalue
decompositions in the iterations. Computational results show that this approach is quite
promising. We will also show how to exploit general SDP relaxations with many linear
constraints to derive stronger convex cutting surfaces. This talk is based on a submitted paper
(link) as well as very recent work (manuscript in preparation).
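The lifting referred to above is, in its standard form (stated here for orientation, not as
the talk's specific relaxation): a quadratic constraint x^T Q x + q^T x <= b is linearized by
introducing a matrix variable X as a surrogate for x x^T,
\[
\langle Q, X \rangle + q^\top x \le b,
\qquad
\begin{pmatrix} 1 & x^\top \\ x & X \end{pmatrix} \succeq 0,
\]
and polyhedral cuts in the lifted (x, X)-space are intersected with the positive
semidefiniteness constraint. Projecting such relaxations back to x gives convex (generally
nonlinear) valid constraints, which is the sense in which cutting surfaces can live in
(almost) the original variable space.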
-
Lei Guo, University of Victoria
Mathematical programs with geometric constraints in Banach spaces: Enhanced optimality and sensitivity
We study the mathematical program with geometric constraints (MPGC), in which the image of a
mapping from a Banach space is required to lie in a nonempty closed subset of a
finite-dimensional space. We obtain nonsmooth enhanced Fritz John necessary optimality
conditions in terms of the approximate subdifferential. In the case where the Banach space is
a weakly compactly generated Asplund space, the optimality condition can be expressed in terms
of the limiting subdifferential, while in the general case it can be expressed in terms of the
Clarke subdifferential. The enhanced Fritz John condition allows us to obtain the enhanced
Karush-Kuhn-Tucker condition under quasinormality conditions, which are weaker than the
classical normality conditions. We then prove that quasinormality is a sufficient condition
for the existence of local error bounds for the constraint system. Finally, we obtain a
tighter upper estimate for the subdifferentials of the value function of the perturbed problem
in terms of the enhanced multipliers. In addition, we study the directional differentiability
of the value function for the parametric mathematical program with equilibrium constraints,
which is a special case of the MPGC, under the MPEC relaxed constant rank regularity condition
and restricted inf-compactness.
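In symbols (our notation, following the description above), the model problem is
\[
\min_{x \in X} \ f(x) \quad \text{subject to} \quad F(x) \in \Lambda,
\]
where X is a Banach space, F : X \to \mathbb{R}^m, and \Lambda \subseteq \mathbb{R}^m is
nonempty and closed; the classical nonlinear program with constraints g(x) \le 0, h(x) = 0
corresponds to \Lambda = \mathbb{R}^p_- \times \{0\}^{m-p}.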
-
Chayne Planiden, UBC Okanagan
Thresholds of prox-boundedness of PLQ functions
The Moreau envelope is a key tool in non-smooth analysis and optimization. An
important aspect in applying the Moreau envelope to nonconvex functions is determining if
the function is prox-bounded, that is, if there exists a point x and a parameter r such that
the Moreau envelope with respect to r is finite at x. The infimum of all such r is called the
threshold of prox-boundedness (prox-threshold) of the Moreau envelope of the function.
We seek to understand the prox-thresholds of piecewise linear-quadratic (PLQ) functions. The
main result provides a computational technique for determining the prox-threshold for a PLQ
function, and further analyzes the behavior of the Moreau envelope of the function at the
prox-threshold.
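In the convention of this abstract (scalings differ across the literature), the Moreau
envelope of f with parameter r > 0 is
\[
e_r f(x) = \inf_{w} \Big\{ f(w) + \tfrac{r}{2}\, \|w - x\|^2 \Big\},
\]
and the prox-threshold is \bar{r} = \inf \{ r \ge 0 : e_r f(x) > -\infty \text{ for some } x \}.
For instance, the concave quadratic f(w) = -\tfrac{q}{2}\|w\|^2 with q > 0 has prox-threshold
q: the infimum above is finite for r > q and identically -\infty for r < q.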
-
Mark Schmidt, UBC Vancouver
Minimizing finite sums with the stochastic average gradient
We propose the stochastic average gradient (SAG) method for optimizing the sum of a finite number of smooth convex functions. Like
stochastic gradient (SG) methods, the SAG method has an iteration cost that is independent of the number of terms in the sum. However, by
incorporating a memory of previous gradient values, the SAG method achieves a faster convergence rate than black-box SG methods.
Specifically, under standard assumptions the convergence rate is improved from O(1/k) to a linear convergence rate of the form
O(p^k) for some p < 1. Further, in many cases the convergence rate of the new method is also faster than black-box deterministic gradient
methods, in terms of the number of gradient evaluations. Beyond these theoretical results, the algorithm also has a variety of
appealing practical properties: it supports regularization and sparse datasets, it allows an adaptive step-size and has a termination
criterion, it allows mini-batches, and its performance can be further improved by non-uniform sampling. Numerical experiments indicate
that the new algorithm often dramatically outperforms existing SG and deterministic gradient methods, and that the performance may be
further improved through the use of non-uniform sampling strategies.
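A minimal sketch of the SAG iteration in Python (the step size and gradient oracle are
placeholders, not the tuned choices analyzed in the paper):

```python
# SAG sketch for min (1/n) * sum_i f_i(x): keep the last gradient seen
# for each term and step along the average of the stored gradients.
import numpy as np

def sag(grad_i, n, x0, alpha, iters, seed=0):
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    y = np.zeros((n, x.size))        # gradient memory, one slot per term
    d = np.zeros_like(x)             # running sum of stored gradients
    for _ in range(iters):
        i = rng.integers(n)          # one random term per iteration
        g = grad_i(i, x)             # fresh gradient of term i only
        d += g - y[i]                # swap term i's contribution in the sum
        y[i] = g
        x -= (alpha / n) * d         # average-gradient step
    return x
```

For least squares with rows a_i and targets b_i, for example, one would pass
grad_i(i, x) = a_i * (a_i @ x - b_i); only one such gradient is evaluated per iteration,
which is what makes the per-iteration cost independent of n.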
-
Paul Tupper, Simon Fraser University
L1 Embeddings, Steiner flows, and diversities
The embedding of finite metrics in L1 has become a
fundamental tool for both combinatorial optimization and large-scale
data analysis. One important application is to network flow problems
in which there is a close relation between max-flow min-cut theorems and
the minimal distortion embeddings of metrics into L1. Here we show
that this theory can be generalized considerably to encompass Steiner
tree packing problems in both graphs and hypergraphs. Instead of the
theory of L1 metrics and minimal distortion embeddings, the parallel
is the theory of diversities and the corresponding theory of L1
diversities and embeddings which we develop here.
Joint work with David Bryant.
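Following Bryant and Tupper's definition, a diversity is a pair (X, \delta) with \delta
defined on finite subsets of X such that
\[
\delta(A) \ge 0, \qquad \delta(A) = 0 \iff |A| \le 1, \qquad
\delta(A \cup C) \le \delta(A \cup B) + \delta(B \cup C) \ \text{ whenever } B \ne \emptyset.
\]
Restricting \delta to two-element sets recovers an ordinary metric, so diversities extend
metric spaces from pairs of points to finite sets.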
-
Lin Xiao, Microsoft Research
An accelerated proximal coordinate gradient method and its application in machine learning
We consider the problem of minimizing the sum of two convex functions: one is smooth and given
by a gradient oracle, and the other is separable over blocks of coordinates and has a simple
known structure over each block. We develop an accelerated proximal coordinate gradient (APCG)
method for minimizing such convex composite functions. For strongly convex functions, our
method achieves faster linear convergence rates than existing randomized proximal coordinate
gradient methods. Without strong convexity, our method enjoys accelerated sublinear convergence
rates. We show how to apply the APCG method to solve the regularized empirical risk
minimization (ERM) problem in machine learning, and devise efficient implementations that avoid
full-dimensional vector operations. This is joint work with Qihang Lin (University of Iowa) and
Zhaosong Lu (Simon Fraser University).
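For orientation, here is a sketch of the (non-accelerated) randomized proximal coordinate
gradient step that APCG builds on; the acceleration adds extrapolation sequences omitted
here, and the oracle names below are our own:

```python
# One-coordinate proximal gradient sketch for min f(x) + sum_j psi_j(x_j):
# grad_j gives the j-th partial derivative of the smooth part f, prox_j the
# proximal map of psi_j, and L[j] a coordinate-wise Lipschitz constant.
import numpy as np

def rpcg(grad_j, prox_j, L, x0, iters, seed=0):
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(iters):
        j = rng.integers(x.size)                 # random coordinate
        step = 1.0 / L[j]
        x[j] = prox_j(j, x[j] - step * grad_j(j, x), step)
    return x
```

For L1-regularized ERM (one instance of the problems above), prox_j soft-thresholds:
prox_j(j, v, t) = np.sign(v) * max(abs(v) - t * lam, 0) for a regularization weight lam.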
-
Yinyu Ye, Stanford University
Optimization for high-dimensional and massive data
We present several analytic models and computational algorithms dealing with high-dimensional
and massively distributed data. Specifically, we discuss
Least-squares with Non-convex Regularization, where we give sparsity and structure
characterizations for every KKT stationary solution of the problem; and the Alternating
Direction Method of Multipliers (ADMM) for large-scale data, where we give an example to show
that the direct extension of ADMM for three-block convex minimization problems is not
necessarily convergent and propose possible convergent variants.
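The direct extension in question applies the usual Gauss-Seidel ADMM sweep to
min f_1(x_1) + f_2(x_2) + f_3(x_3) subject to A_1 x_1 + A_2 x_2 + A_3 x_3 = b, with augmented
Lagrangian L_\beta(x_1,x_2,x_3,\lambda) = \sum_i f_i(x_i) - \lambda^\top(\sum_i A_i x_i - b)
+ \tfrac{\beta}{2}\|\sum_i A_i x_i - b\|^2:
\[
\begin{aligned}
x_1^{k+1} &= \arg\min_{x_1} L_\beta(x_1, x_2^k, x_3^k, \lambda^k),\\
x_2^{k+1} &= \arg\min_{x_2} L_\beta(x_1^{k+1}, x_2, x_3^k, \lambda^k),\\
x_3^{k+1} &= \arg\min_{x_3} L_\beta(x_1^{k+1}, x_2^{k+1}, x_3, \lambda^k),\\
\lambda^{k+1} &= \lambda^k - \beta\Big(\textstyle\sum_i A_i x_i^{k+1} - b\Big).
\end{aligned}
\]
It is this three-block scheme whose convergence can fail without further assumptions or
modifications.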
-
Julie Zhou, University of Victoria
D-optimal designs based on the second-order least squares estimator
We will briefly introduce the second-order least squares estimator (SLSE) for regression
models. The SLSE is more efficient than the ordinary least squares estimator when the error
distribution is asymmetric. Then we will discuss the new optimal design criteria based on the
SLSE. In particular, A-optimal and D-optimal designs minimize the trace and the determinant of
the asymptotic covariance of the SLSE, respectively. Using convex optimization techniques, we
develop two algorithms to compute the D-optimal designs for polynomial and trigonometric
regression models. The CVX package in MATLAB is effective for implementing these algorithms.
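As an illustration of the convex-optimization formulation, here is a classical approximate
D-optimal design sketch in Python with cvxpy (analogous to, not taken from, the talk's
MATLAB/CVX code; the SLSE criterion would replace the information matrix below with the SLSE
asymptotic covariance):

```python
# Approximate D-optimal design for cubic polynomial regression on [-1, 1]:
# maximize log det of the information matrix over design weights w.
import cvxpy as cp
import numpy as np

xs = np.linspace(-1.0, 1.0, 41)               # candidate design points
F = np.vander(xs, 4, increasing=True)         # regressors (1, x, x^2, x^3)

w = cp.Variable(len(xs), nonneg=True)         # design weights on the grid
M = sum(w[i] * np.outer(F[i], F[i]) for i in range(len(xs)))

prob = cp.Problem(cp.Maximize(cp.log_det(M)), [cp.sum(w) == 1])
prob.solve()
print(np.round(w.value, 3))   # weight concentrates on a few support points
```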