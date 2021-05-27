
Mathematics

New Representations of Epigraphs of Conjugate Mappings and Lagrange, Fenchel-Lagrange Duality for Vector Optimization Problems

By N. Dinh, D. H. Long
arxiv.org
 22 days ago

In this paper we are concerned with the vector problem of the form \begin{align*} ({\rm VP})\quad\qquad &\rm{WInf} \{F(x): x\in C,\; G(x)\in -S\}, \end{align*} where $X, Y, Z$ are locally convex Hausdorff topological vector spaces, $F\colon X\rightarrow Y\cup\{+\infty_{Y}\}$ and $G\colon X\rightarrow Z\cup\{+\infty_{Z}\}$ are proper mappings, $C$ is a nonempty convex subset of $X$, and $S$ is a nonempty closed convex cone in $Z$. Several new representations of epigraphs of composite conjugate mappings associated with (VP) are established under various qualification conditions. The significance of these representations is twofold: first, they play a key role in establishing new kinds of vector Farkas lemmas, which serve as tools in the study of vector optimization problems; second, they pave the way to defining the Lagrange dual problem and two new kinds of Fenchel-Lagrange dual problems for the vector problem (VP). Strong and stable strong duality results for these three dual problems of (VP) are established with the help of the new Farkas-type results obtained from the representations. It is shown that in the special case where $Y = \mathbb{R}$, the Lagrange and Fenchel-Lagrange dual problems for (VP) reduce to the Lagrange and Fenchel-Lagrange dual problems for scalar problems, and the resulting duality results cover, and in some settings extend, the corresponding ones for scalar problems in the literature.
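
For orientation, the scalar case $Y = \mathbb{R}$ mentioned in the abstract can be written out in its classical form. The block below is a standard textbook sketch, not taken from the paper: the lowercase symbols $f$, $g$ (scalar objective and constraint map), the dual cone $S^{+}$, and the labels (P), (D) are notation assumed here for illustration. \begin{align*} ({\rm P})\quad\qquad &\inf \{f(x): x\in C,\; g(x)\in -S\}, \\ ({\rm D})\quad\qquad &\sup_{\lambda\in S^{+}}\ \inf_{x\in C}\ \{f(x)+\langle\lambda, g(x)\rangle\}, \qquad S^{+}:=\{\lambda\in Z^{*}: \langle\lambda, s\rangle\ge 0\ \ \forall s\in S\}. \end{align*} Weak duality follows in one line: if $x\in C$ and $g(x)\in -S$, then $\langle\lambda, g(x)\rangle\le 0$ for every $\lambda\in S^{+}$, so $\inf_{u\in C}\{f(u)+\langle\lambda, g(u)\rangle\}\le f(x)$, and hence the value of (D) never exceeds the value of (P). Strong duality (equality, with the supremum attained) is what requires qualification conditions of the kind the abstract refers to.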

Related
Mathematics | arxiv.org

Discrete-to-Continuous Extensions, II: Lovász extension, optimizations and eigenvalue problems

In this paper, we use various versions of Lovász extension to systematically derive continuous formulations of problems from discrete mathematics. This will take place in the following context: (1) For combinatorial optimization problems, we systematically develop equivalent continuous versions, thereby making tools from convex optimization, fractional programming and more general...
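
As a rough illustration of what a Lovász extension does (this is the standard textbook definition, not code from the paper), the sketch below evaluates the extension of a set function on a small ground set. The function names `lovasz_extension` and `cut_single_edge` are hypothetical, and the set function is assumed to satisfy f(∅) = 0.

```python
from typing import Callable, FrozenSet, Sequence


def lovasz_extension(f: Callable[[FrozenSet[int]], float], x: Sequence[float]) -> float:
    """Evaluate the Lovász extension of a set function f (with f(empty) = 0) at x.

    Coordinates are visited in decreasing order of x; each one is weighted by
    the marginal gain of adding its index to the set of indices seen so far.
    """
    order = sorted(range(len(x)), key=lambda i: x[i], reverse=True)
    value = 0.0
    prefix = set()
    prev = f(frozenset())
    for i in order:
        prefix.add(i)
        cur = f(frozenset(prefix))
        value += x[i] * (cur - prev)
        prev = cur
    return value


def cut_single_edge(s: FrozenSet[int]) -> float:
    """Cut function of the graph on vertices {0, 1} with the single edge {0, 1}."""
    return 1.0 if (0 in s) != (1 in s) else 0.0


# On indicator vectors the extension agrees with the set function itself.
on_vertex = lovasz_extension(cut_single_edge, [1.0, 0.0])
# For this cut function the extension is the total variation |x0 - x1|.
interior = lovasz_extension(cut_single_edge, [0.7, 0.2])
```

For this one-edge cut function the continuous extension is |x0 - x1|, which is the simplest instance of a discrete objective turning into a convex continuous one.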
Computers | arxiv.org

Learning Hard Optimization Problems: A Data Generation Perspective

Optimization problems are ubiquitous in our societies and are present in almost every segment of the economy. Most of these optimization problems are NP-hard and computationally demanding, often requiring approximate solutions for large-scale instances. Machine learning frameworks that learn to approximate solutions to such hard optimization problems are a potentially promising avenue to address these difficulties, particularly when many closely related problem instances must be solved repeatedly. Supervised learning frameworks can train a model using the outputs of pre-solved instances. However, when the outputs are themselves approximations, when the optimization problem has symmetric solutions, and/or when the solver uses randomization, solutions to closely related instances may exhibit large differences and the learning task can become inherently more difficult. This paper demonstrates this critical challenge, connects the volatility of the training data to the ability of a model to approximate it, and proposes a method for producing (exact or approximate) solutions to optimization problems that are more amenable to supervised learning tasks. The effectiveness of the method is tested on hard non-linear nonconvex and discrete combinatorial problems.
Computers | arxiv.org

A gradient based resolution strategy for a PDE-constrained optimization approach for 3D-1D coupled problems

Coupled 3D-1D problems arise in many practical applications, in an attempt to reduce the computational burden in simulations where cylindrical inclusions with a small section are embedded in a much larger domain. Nonetheless, the resolution of such problems can be nontrivial, both from a mathematical and a geometrical standpoint. Indeed, 3D-1D coupling requires operating in nonstandard function spaces, and simulation geometries can be complex owing to the presence of multiple intersecting domains. Recently, a PDE-constrained optimization based formulation has been proposed for such problems, proving a well-posed mathematical formulation and allowing for the use of nonconforming meshes for the discrete problem. Here an unconstrained optimization formulation of the problem is derived and an efficient gradient-based solver is proposed for this formulation. Some numerical tests on quite complex configurations are discussed to show the viability of the method.
Computers | arxiv.org

Error Mitigation for Deep Quantum Optimization Circuits by Leveraging Problem Symmetries

High error rates and limited fidelity of quantum gates in near-term quantum devices are the central obstacles to successful execution of the Quantum Approximate Optimization Algorithm (QAOA). In this paper we introduce an application-specific approach for mitigating the errors in QAOA evolution by leveraging the symmetries present in the classical objective function to be optimized. Specifically, the QAOA state is projected into the symmetry-restricted subspace, with projection being performed either at the end of the circuit or throughout the evolution. Our approach improves the fidelity of the QAOA state, thereby increasing both the accuracy of the sample estimate of the QAOA objective and the probability of sampling the binary string corresponding to that objective value. We demonstrate the efficacy of the proposed methods on QAOA applied to the MaxCut problem, although our methods are general and apply to any objective function with symmetries, as well as to the generalization of QAOA with alternative mixers. We experimentally verify the proposed methods on an IBM Quantum processor, utilizing up to 5 qubits. When leveraging a global bit-flip symmetry, our approach leads to a 23% average improvement in quantum state fidelity.
Mathematics | arxiv.org

Vector Quantized Models for Planning

Recent developments in the field of model-based RL have proven successful in a range of environments, especially ones where planning is essential. However, such successes have been limited to deterministic fully-observed environments. We present a new approach that handles stochastic and partially-observable environments. Our key insight is to use discrete autoencoders to capture the multiple possible effects of an action in a stochastic environment. We use a stochastic variant of Monte Carlo tree search to plan over both the agent's actions and the discrete latent variables representing the environment's response. Our approach significantly outperforms an offline version of MuZero on a stochastic interpretation of chess where the opponent is considered part of the environment. We also show that our approach scales to DeepMind Lab, a first-person 3D environment with large visual observations and partial observability.
Mathematics | arxiv.org

Quantum Reduction of Finding Short Code Vectors to the Decoding Problem

We give a quantum reduction from finding short codewords in a random linear code to decoding for the Hamming metric. This is the first time such a reduction (classical or quantum) has been obtained. Our reduction adapts to linear codes the Stehlé-Steinfeld-Tanaka-Xagawa re-interpretation of Regev's quantum reduction from finding short lattice vectors to solving the Closest Vector Problem. The Hamming metric is a much coarser metric than the Euclidean metric, and this adaptation has needed several new ingredients to make it work. For instance, in order to have a meaningful reduction it is necessary in the Hamming metric to choose a very large decoding radius, and in many cases this needs to go beyond the radius where decoding is unique. Another crucial step for the analysis of the reduction is the choice of the errors that are fed to the decoding algorithm. For lattices, errors are usually sampled according to a Gaussian distribution. However, it turns out that the Bernoulli distribution (the analogue for codes of the Gaussian) is too spread out and cannot be used for the reduction with codes. Instead we choose here the uniform distribution over errors of a fixed weight, and bring in tools from orthogonal polynomials to perform the analysis and an additional amplitude amplification step to obtain the aforementioned result.
Computers | arxiv.org

Efficient Active Search for Combinatorial Optimization Problems

Recently numerous machine learning based methods for combinatorial optimization problems have been proposed that learn to construct solutions in a sequential decision process via reinforcement learning. While these methods can be easily combined with search strategies like sampling and beam search, it is not straightforward to integrate them into a high-level search procedure offering strong search guidance. Bello et al. (2016) propose active search, which adjusts the weights of a (trained) model with respect to a single instance at test time using reinforcement learning. While active search is simple to implement, it is not competitive with state-of-the-art methods because adjusting all model weights for each test instance is very time and memory intensive. Instead of updating all model weights, we propose and evaluate three efficient active search strategies that only update a subset of parameters during the search. The proposed methods offer a simple way to significantly improve the search performance of a given model and outperform state-of-the-art machine learning based methods on combinatorial problems, even surpassing the well-known heuristic solver LKH3 on the capacitated vehicle routing problem. Finally, we show that (efficient) active search enables learned models to effectively solve instances that are much larger than those seen during training.
Software | arxiv.org

New Insights into Metric Optimization for Ranking-based Recommendation

Direct optimization of IR metrics has often been adopted as an approach to devise and develop ranking-based recommender systems. Most methods following this approach aim at optimizing the same metric being used for evaluation, under the assumption that this will lead to the best performance. A number of studies of this practice, however, bring this assumption into question. In this paper, we dig deeper into this issue in order to learn more about the effects of the choice of the metric to optimize on the performance of a ranking-based recommender system. We present an extensive experimental study conducted on different datasets in both pairwise and listwise learning-to-rank scenarios, to compare the relative merit of four popular IR metrics, namely RR, AP, nDCG and RBP, when used for optimization and assessment of recommender systems in various combinations. For the first three, we follow the practice of loss function formulation available in the literature. For the fourth one, we propose novel loss functions inspired by RBP for both the pairwise and listwise scenarios. Our results confirm that the best performance is indeed not necessarily achieved when optimizing the same metric being used for evaluation. In fact, we find that RBP-inspired losses perform at least as well as other metrics in a consistent way, and offer clear benefits in several cases. Interestingly, RBP-inspired losses, while improving the recommendation performance for all users, may lead to an individual performance gain that is correlated with the activity level of a user in interacting with items. The more active the users, the more they benefit. Overall, our results challenge the assumption behind the current research practice of optimizing and evaluating the same metric, and point to RBP-based optimization instead as a promising alternative when learning to rank in the recommendation context.
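
For readers unfamiliar with the four metrics named above, the following sketch computes them as plain evaluation scores for one ranked list with binary relevance labels. It shows the usual textbook definitions only, not the paper's loss functions. The function names, the persistence parameter `p`, and the choice to normalize AP by the number of relevant items found in the list are assumptions made for this example.

```python
import math
from typing import Sequence


def reciprocal_rank(rels: Sequence[int]) -> float:
    """RR: inverse of the rank of the first relevant item (0 if none)."""
    for rank, r in enumerate(rels, start=1):
        if r:
            return 1.0 / rank
    return 0.0


def average_precision(rels: Sequence[int]) -> float:
    """AP: mean of precision@k taken at the ranks of the relevant items."""
    hits, total = 0, 0.0
    for rank, r in enumerate(rels, start=1):
        if r:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def ndcg(rels: Sequence[int]) -> float:
    """nDCG: log2-discounted gain divided by the gain of the ideal ordering."""
    def dcg(xs: Sequence[int]) -> float:
        return sum(r / math.log2(rank + 1) for rank, r in enumerate(xs, start=1))
    ideal = dcg(sorted(rels, reverse=True))
    return dcg(rels) / ideal if ideal else 0.0


def rbp(rels: Sequence[int], p: float = 0.8) -> float:
    """RBP: (1 - p) * sum of r_i * p^(i - 1), with user persistence p."""
    return (1 - p) * sum(r * p ** (rank - 1) for rank, r in enumerate(rels, start=1))


ranking = [0, 1, 0, 1]  # relevance of the items at ranks 1..4
scores = {
    "RR": reciprocal_rank(ranking),
    "AP": average_precision(ranking),
    "nDCG": ndcg(ranking),
    "RBP": rbp(ranking, p=0.5),
}
```

RR only looks at the first relevant item, whereas RBP credits every relevant item with a geometrically decaying weight. This is one informal reason the choice of metric to optimize can matter.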
Science | arxiv.org

Complete Realization of Energy Landscape and Non-equilibrium Trapping Dynamics in Spin Glass and Optimization Problem

Energy landscapes are high-dimensional surfaces representing the dependence of system energy on variable configurations, which crucially determine the system's emergent behavior but are difficult to analyze due to their high-dimensional nature. In this article, we introduce an approach to reveal the complete energy landscapes of small spin glasses and Boolean satisfiability problems, which also unravels their non-equilibrium dynamics at an arbitrary temperature for an arbitrarily long time. Contrary to common belief, our results show that it can be less likely to identify the ground states when temperature decreases, due to trapping in individual local minima, which ceases at different times, leading to multiple abrupt jumps with time in the ground-state probability. Simulations agree well with theoretical predictions on these remarkable phenomena. Finally, for large systems, we introduce a variant approach to partially extract the energy landscapes and observe similar phenomena both analytically and in simulations. This work introduces new methodology to unravel the non-equilibrium dynamics of glassy systems, and provides us with a clear, complete and new physical picture of their long-time behaviors inaccessible by modern numerics.
Mathematics | arxiv.org

Learning the optimal regularizer for inverse problems

In this work, we consider the linear inverse problem $y=Ax+\epsilon$, where $A\colon X\to Y$ is a known linear operator between the separable Hilbert spaces $X$ and $Y$, $x$ is a random variable in $X$ and $\epsilon$ is a zero-mean random process in $Y$. This setting covers several inverse problems in imaging including denoising, deblurring, and X-ray tomography. Within the classical framework of regularization, we focus on the case where the regularization functional is not given a priori but learned from data. Our first result is a characterization of the optimal generalized Tikhonov regularizer, with respect to the mean squared error. We find that it is completely independent of the forward operator $A$ and depends only on the mean and covariance of $x$. Then, we consider the problem of learning the regularizer from a finite training set in two different frameworks: one supervised, based on samples of both $x$ and $y$, and one unsupervised, based only on samples of $x$. In both cases, we prove generalization bounds, under some weak assumptions on the distribution of $x$ and $\epsilon$, including the case of sub-Gaussian variables. Our bounds hold in infinite-dimensional spaces, thereby showing that finer and finer discretizations do not make this learning problem harder. The results are validated through numerical simulations.
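
A finite-dimensional sketch may help make the statement about the mean and covariance of $x$ concrete. It is not the paper's construction: it assumes unit-variance white noise, uses NumPy, and picks the regularizer $\|B(x-h)\|^2$ with $h$ the mean of $x$ and $B^\top B$ the inverse covariance, a choice that involves no information about $A$. The helper name `generalized_tikhonov` and all dimensions are hypothetical. Under these assumptions the Tikhonov minimizer coincides with the classical linear minimum mean squared error estimate, which the code checks numerically.

```python
import numpy as np


def generalized_tikhonov(A, y, B, h):
    """Minimize ||A x - y||^2 + ||B (x - h)||^2 by solving the normal equations."""
    BtB = B.T @ B
    return np.linalg.solve(A.T @ A + BtB, A.T @ y + BtB @ h)


rng = np.random.default_rng(0)
n, m = 4, 6                                # unknowns, measurements (illustrative sizes)
A = rng.standard_normal((m, n))            # known forward operator
mu = rng.standard_normal(n)                # mean of x
L = rng.standard_normal((n, n))
Sigma = L @ L.T + n * np.eye(n)            # symmetric positive definite covariance of x

# Draw one sample x ~ N(mu, Sigma) and a measurement with unit white noise.
C = np.linalg.cholesky(Sigma)              # Sigma = C C^T
x = mu + C @ rng.standard_normal(n)
y = A @ x + rng.standard_normal(m)

# Regularizer built from the statistics of x only: B^T B = Sigma^{-1}, h = mu.
B = np.linalg.inv(C)
x_tik = generalized_tikhonov(A, y, B, mu)

# Classical linear MMSE estimate for comparison (noise covariance = identity).
x_lmmse = mu + Sigma @ A.T @ np.linalg.solve(A @ Sigma @ A.T + np.eye(m), y - A @ mu)
```

The two estimates agree up to floating-point error by the Woodbury matrix identity. This is consistent with the abstract's remark that the optimal regularizer does not depend on $A$, although the paper's infinite-dimensional setting and noise model are more general than this sketch.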
Coding & Programming | arxiv.org

Learning MDPs from Features: Predict-Then-Optimize for Sequential Decision Problems by Reinforcement Learning

In the predict-then-optimize framework, the objective is to train a predictive model, mapping from environment features to parameters of an optimization problem, which maximizes decision quality when the optimization is subsequently solved. Recent work on decision-focused learning shows that embedding the optimization problem in the training pipeline can improve decision quality and help generalize better to unseen tasks compared to relying on an intermediate loss function for evaluating prediction quality. We study the predict-then-optimize framework in the context of sequential decision problems (formulated as MDPs) that are solved via reinforcement learning. In particular, we are given environment features and a set of trajectories from training MDPs, which we use to train a predictive model that generalizes to unseen test MDPs without trajectories. Two significant computational challenges arise in applying decision-focused learning to MDPs: (i) large state and action spaces make it infeasible for existing techniques to differentiate through MDP problems, and (ii) the high-dimensional policy space, as parameterized by a neural network, makes differentiating through a policy expensive. We resolve the first challenge by sampling provably unbiased derivatives to approximate and differentiate through optimality conditions, and the second challenge by using a low-rank approximation to the high-dimensional sample-based derivatives. We implement both Bellman--based and policy gradient--based decision-focused learning on three different MDP problems with missing parameters, and show that decision-focused learning performs better in generalization to unseen tasks.
Coding & Programming | Intel iQ

Adding vectors using oneMKL

I want to add 2 vectors using the following command: oneapi::mkl::vm::add ( general parameters ) The function signature says that it adds vec_a and vec_b and stores the result in vec_y. However, in my case, I want to perform vec_a = vec_a + vec_b, multiple times. I want to avoid...
Science | arxiv.org

Three-loop color-kinematics duality: 24-dimensional solution space induced by new generalized gauge transformations

We obtain full-color three-loop three-point form factors of the stress-tensor supermultiplet and also of a length-3 half-BPS operator in N=4 SYM based on the color-kinematics duality and on-shell unitarity. The integrand results pass all planar and non-planar unitarity cuts, while satisfying the minimal power-counting of loop momenta and diagrammatic symmetries. Surprisingly, the three-loop solutions, while manifesting all dual Jacobi relations, contain a large number of free parameters; in particular, there are 24 free parameters for the form factor of the stress-tensor supermultiplet. Such degrees of freedom are due to a new type of generalized gauge transformation associated with the operator insertion for form factors. The form factors we obtain can be understood as the N=4 SYM counterparts of three-loop Higgs plus three-gluon amplitudes in QCD and are expected to provide the maximally transcendental parts of the latter.
Science | arxiv.org

Canonical Cortical Circuits and the Duality of Bayesian Inference and Optimal Control

The duality of sensory inference and optimal control has been known since the 1960s and has recently been recognized as the common computations required for posterior distributions in dynamic Bayesian inference and value functions in optimal control. Meanwhile, an intriguing question about the brain is why the entire neocortex shares a canonical six-layer architecture, while its posterior and anterior halves are engaged in sensory processing and motor control, respectively. Here we consider the hypothesis that the posterior and anterior cortical circuits evolved for dual computations for sensory inference and optimal control, or perceptual and value-based decision making, respectively. We explore how different types of cortical neurons may represent different variables, such as prior and posterior distributions and value functions, and what cortical dynamics may realize the required computations. We further discuss the experimental and computational approaches required for scrutinizing this dual cortical circuit hypothesis.
Science | arxiv.org

Large-scale optimal transport map estimation using projection pursuit

This paper studies the estimation of large-scale optimal transport maps (OTM), which is a well-known challenging problem owing to the curse of dimensionality. Existing literature approximates the large-scale OTM by a series of one-dimensional OTM problems through iterative random projection. Such methods, however, suffer from slow or no convergence in practice due to the nature of randomly selected projection directions. Instead, we propose an estimation method for large-scale OTM that combines the ideas of projection pursuit regression and sufficient dimension reduction. The proposed method, named projection pursuit Monge map (PPMM), adaptively selects the most "informative" projection direction in each iteration. We theoretically show that the proposed dimension reduction method can consistently estimate the most "informative" projection direction in each iteration. Furthermore, the PPMM algorithm weakly converges to the target large-scale OTM in a reasonable number of steps. Empirically, PPMM is computationally easy and converges fast. We assess its finite sample performance through the applications of Wasserstein distance estimation and generative models.
Computers | arxiv.org

Efficient solution method based on inverse dynamics for optimal control problems of rigid body systems

We propose an efficient way of solving optimal control problems for rigid-body systems on the basis of inverse dynamics and the multiple-shooting method. We treat all variables, including the state, acceleration, and control input torques, as optimization variables and treat the inverse dynamics as an equality constraint. We eliminate the update of the control input torques from the linear equation of Newton's method by applying condensing for inverse dynamics. The size of the resultant linear equation is the same as that of the multiple-shooting method based on forward dynamics except for the variables related to the passive joints and contacts. Compared with the conventional methods based on forward dynamics, the proposed method reduces the computational cost of the dynamics and their sensitivities by utilizing the recursive Newton-Euler algorithm (RNEA) and its partial derivatives. In addition, it increases the sparsity of the Hessian of the Karush-Kuhn-Tucker conditions, which reduces the computational cost, e.g., of Riccati recursion. Numerical experiments show that the proposed method outperforms state-of-the-art implementations of differential dynamic programming based on forward dynamics in terms of computational time and numerical robustness.
Science | arxiv.org

Numerical Solution of the $L^1$-Optimal Transport Problem on Surfaces

In this article we study the numerical solution of the $L^1$-Optimal Transport Problem on 2D surfaces embedded in $R^3$, via the DMK formulation introduced in [FaccaCardinPutti:2018]. We extend the DMK model from the Euclidean to the Riemannian setting and conjecture its equivalence with the solution of the Monge-Kantorovich equations, a PDE-based formulation of the $L^1$-Optimal Transport Problem.
Software | arxiv.org

Scaling optical computing in synthetic frequency dimension using integrated cavity acousto-optics

Optical computing with integrated photonics brings a pivotal paradigm shift to data-intensive computing technologies. However, the scaling of on-chip photonic architectures using spatially distributed schemes faces the challenge imposed by the fundamental limit of integration density. Synthetic dimensions of light offer the opportunity to extend the length of operand vectors within a single photonic component. Here, we show that large-scale, complex-valued matrix-vector multiplications on synthetic frequency lattices can be performed using an ultra-efficient, silicon-based nanophotonic cavity acousto-optic modulator. By harnessing the resonantly enhanced strong electro-optomechanical coupling, we achieve, in a single such modulator, the full-range phase-coherent frequency conversions across the entire synthetic lattice, which constitute a fully connected linear computing layer. Our demonstrations open up the route towards the experimental realizations of frequency-domain integrated optical computing systems simultaneously featuring very large-scale data processing and small device footprints.
Computers | arxiv.org

Cardinality-constrained optimization problems in general position and beyond

We study cardinality-constrained optimization problems (CCOP) in general position, i.e., those optimization-related properties that are fulfilled for a dense and open subset of their defining functions. We show that the well-known cardinality-constrained linear independence constraint qualification (CC-LICQ) is generic in this sense. For M-stationary points we define nondegeneracy and show that it is a generic property too. In particular, the sparsity constraint turns out to be active at all minimizers of a generic CCOP. Moreover, we describe the global structure of CCOP in the sense of Morse theory, emphasizing the strength of the generic approach. Here, we prove that multiple cells need to be attached, each of dimension coinciding with the proposed M-index of nondegenerate M-stationary points. Beyond this generic viewpoint, we study singularities of CCOP. For that, the relation between nondegeneracy and strong stability in the sense of Kojima (1980) is examined. We show that nondegeneracy implies the latter, while the reverse implication is not true in general. To fill the gap, we fully characterize the strong stability of M-stationary points under CC-LICQ by first- and second-order information of the CCOP defining functions. Finally, we compare nondegeneracy and strong stability of M-stationary points with second-order sufficient conditions recently introduced in the literature.