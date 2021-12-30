
Coding & Programming

Analyzing Approximate Value Iteration Algorithms

By Arunselvan Ramaswamy
informs.org
 5 days ago

In this paper, we consider the stochastic iterative counterpart of the value iteration scheme wherein only noisy and possibly biased approximations of the Bellman operator are available. We call this counterpart the approximate value iteration (AVI) scheme. Neural networks are often used as function approximators, in...

pubsonline.informs.org
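
The abstract above describes value iteration driven by noisy evaluations of the Bellman operator. A minimal sketch of that idea on a small tabular problem follows. It is not the paper's algorithm or analysis: the function name, the zero-mean Gaussian noise, the step-size schedule, and the two-state example are all assumptions chosen for illustration, and the bias and neural-network approximation mentioned in the abstract are left out.

```python
import numpy as np

def noisy_value_iteration(P, R, gamma=0.8, n_iters=20000, noise_std=0.1, seed=0):
    """Stochastic iterative value iteration with a noisy Bellman operator.

    P: transition probabilities, shape (n_actions, n_states, n_states)
    R: expected one-step rewards, shape (n_actions, n_states)
    """
    rng = np.random.default_rng(seed)
    n_states = P.shape[1]
    V = np.zeros(n_states)
    for n in range(1, n_iters + 1):
        # Exact Bellman optimality operator: max over actions of R + gamma * P V.
        TV = np.max(R + gamma * (P @ V), axis=0)
        # Assumption: only a noisy evaluation of TV is observable (zero-mean noise).
        noisy_TV = TV + rng.normal(0.0, noise_std, size=n_states)
        # Diminishing step size (hypothetical schedule) averages the noise out.
        step = n ** -0.6
        V = V + step * (noisy_TV - V)
    return V

# Hypothetical two-state, two-action problem.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.6, 0.4]]])
R = np.array([[1.0, 0.0],
              [0.5, 2.0]])

V_noisy = noisy_value_iteration(P, R)

# Noise-free value iteration for comparison.
V_exact = np.zeros(2)
for _ in range(500):
    V_exact = np.max(R + 0.8 * (P @ V_exact), axis=0)

print("noisy estimate:", V_noisy)
print("exact values:  ", V_exact)
```

With a diminishing step size the noisy iterates settle close to the noise-free fixed point in this example. A persistent bias in the operator estimates would shift the limit, which is the kind of effect the paper sets out to analyze.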


Related
Lifehacker

How to Tell Which Apps Can See Your Private iPhone Data

Every year, Apple releases new features that both improve data privacy on the iPhone and set a new benchmark for the industry as a whole. With iOS 15, it’s all about transparency. iOS 15.2 brings a new feature called App Privacy Report that provides a visual, easy-to-read report of all the ways an app is using or transmitting your private data.
CELL PHONES
HackerNoon

Where Visuals And Algorithms Collide: How Unrelated Algorithms Produce Intuitive Markings

A nautilus seashell with a perfect spiral is the product of specific DNA that coded for its existence. Visuals in your application behave similarly; their appearance is controlled by a particular set of seemingly distant algorithms that come together in just the right way. I will show how unrelated algorithms may be used in conjunction to produce intuitive markings on a chart.
TECHNOLOGY
Wired

A Move for 'Algorithmic Reparation' Calls for Racial Justice in AI

Now teams of sociologists and computer science researchers say the builders and deployers of AI models should consider race more explicitly, by leaning on concepts such as critical race theory and intersectionality. One approach presented before the American Sociological Association earlier this year coins the term algorithmic reparation. In a...
TECHNOLOGY
TheSpoon

CES 2022 Preview: Carbon Origins Wants to Merge Robot Delivery With the Metaverse

If you’re looking to get a fresh start on a new career in 2022, may I suggest a new occupation as a virtual reality robot delivery driver? Yes, that’s a job – or at least a new gig – being offered by a startup out of Minneapolis called Carbon Origins. The company, which is building a refrigerated sidewalk delivery robot by the name of Skippy, is looking to assemble a roster of remote robot pilots who will use virtual reality technology to pilot Skippy around to businesses and consumer homes.
ELECTRONICS
towardsdatascience.com

Can Algorithms be Racist?

A brief exploration of juristic and social dilemmas within language models. As artificial intelligence (A.I.) continues to rapidly integrate within everyday life, several ethical dilemmas have arisen alongside it, and their impact on use cases has become the subject of much debate (Kilbertus et al., 2017; Hardt et al., 2016; Pazzanese, 2020). One such predicament that this paper hinges on has to do with inclusivity and marginalization (Bender et al., 2021). How are notions of participation affected by training data that reinforce hegemonic power in the formation of algorithmic models? Accordingly, this article will seek to spotlight ethical challenges within A.I. via a grounded interpretivist viewpoint gained by qualitatively investigating the literature in order to discuss bias amplifications. As outlined by Bender et al. (2021), there are several juristic and social dilemmas regarding the growth and utilization of language models. These include the reinforcement of hegemonic power over marginalized populations as a result of training data characteristics that reify power imbalances.
TECHNOLOGY
informs.org

Diffusion Approximations for a Class of Sequential Experimentation Problems

A decision maker (DM) must choose an action in order to maximize a reward function that depends on the DM’s action as well as on an unknown parameter Θ. The DM can delay taking the action in order to experiment and gather additional information on Θ. We model the problem using a Bayesian sequential experimentation framework and use dynamic programming and diffusion-asymptotic analysis to solve it. For that, we consider environments in which the average number of experiments that is conducted per unit of time is large and the informativeness of each individual experiment is low. Under such regimes, we derive a diffusion approximation for the sequential experimentation problem, which provides a number of important insights about the nature of the problem and its solution. First, it reveals that the problems of (i) selecting the optimal sequence of experiments to use and (ii) deciding the optimal time when to stop experimenting decouple and can be solved independently. Second, it shows that an optimal experimentation policy is one that chooses the experiment that maximizes the instantaneous volatility of the belief process. Third, the diffusion approximation provides a more mathematically malleable formulation that we can solve in closed form and suggests efficient heuristics for the nonasymptotic regime. Our solution method also shows that the complexity of the problem grows only quadratically with the cardinality of the set of actions from which the decision maker can choose. We illustrate our methodology and results using a concrete application in the context of assortment selection and new product introduction. Specifically, we study the problem of a seller who wants to select an optimal assortment of products to launch into the marketplace and is uncertain about consumers’ preferences.
Motivated by emerging practices in e-commerce, we assume that the seller is able to use a crowd voting system to learn these preferences before a final assortment decision is made. In this context, we undertake an extensive numerical analysis to assess the value of learning and demonstrate the effectiveness and robustness of the heuristics derived from the diffusion approximation.
SCIENCE
informs.org

An Optimal Approximation for Submodular Maximization Under a Matroid Constraint in the Adaptive Complexity Model

In this paper, we study submodular maximization under a matroid constraint in the adaptive complexity model. This model was recently introduced in the context of submodular optimization to quantify the information theoretic complexity of black-box optimization in a parallel computation model. Despite the burst in work on submodular maximization in the adaptive complexity model, the fundamental problem of maximizing a monotone submodular function under a matroid constraint has remained elusive. In particular, all known techniques fail for this problem and there are no known constant factor approximation algorithms whose adaptivity is sublinear in the rank of the matroid k or in the worst case sublinear in the size of the ground set n. We present an algorithm that has an approximation guarantee arbitrarily close to the optimal.
COMPUTERS
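
For readers unfamiliar with the setting in the abstract above, here is a minimal sketch of the classic sequential greedy heuristic for a monotone submodular objective (set coverage) under a partition matroid constraint. It is not the paper's algorithm: each greedy step depends on the previous choice, so the number of sequential rounds grows with the number of selected elements, which is the cost the adaptive complexity model measures. The function name and the example data are hypothetical.

```python
def greedy_partition_matroid(sets, group_of, capacity):
    """Greedy coverage maximization under a partition matroid.

    sets:     dict mapping item name -> set of elements it covers
    group_of: dict mapping item name -> group label
    capacity: dict mapping group label -> max number of items from that group
    """
    chosen = []
    used = {group: 0 for group in capacity}
    covered = set()
    while True:
        best, best_gain = None, 0
        for name, items in sets.items():
            group = group_of[name]
            # Skip items already chosen or whose group is at capacity.
            if name in chosen or used[group] >= capacity[group]:
                continue
            gain = len(items - covered)  # marginal coverage gain
            if gain > best_gain:
                best, best_gain = name, gain
        if best is None:  # no feasible item adds new coverage
            break
        chosen.append(best)
        used[group_of[best]] += 1
        covered |= sets[best]
    return chosen, len(covered)

# Hypothetical instance: pick at most one item from each of two groups.
sets = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1, 7}}
group_of = {"a": "x", "b": "x", "c": "y", "d": "y"}
capacity = {"x": 1, "y": 1}

chosen, value = greedy_partition_matroid(sets, group_of, capacity)
print(chosen, value)
```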
informs.org

Bayesian Exploration: Incentivizing Exploration in Bayesian Games

We consider a ubiquitous scenario in the internet economy when individual decision makers (henceforth, agents) both produce and consume information as they make strategic choices in an uncertain environment. This creates a three-way trade-off between exploration (trying out insufficiently explored alternatives to help others in the future), exploitation (making optimal decisions given the information discovered by other agents), and incentives of the agents (who are myopically interested in exploitation while preferring the others to explore). We posit a principal who controls the flow of information from agents that came before to the ones that arrive later and strives to coordinate the agents toward a socially optimal balance between exploration and exploitation, not using any monetary transfers. The goal is to design a recommendation policy for the principal that respects agents’ incentives and minimizes a suitable notion of regret. We extend prior work in this direction to allow the agents to interact with one another in a shared environment: at each time step, multiple agents arrive to play a Bayesian game, receive recommendations, choose their actions, receive their payoffs, and then leave the game forever. The agents now face two sources of uncertainty: the actions of the other agents and the parameters of the uncertain game environment. Our main contribution is to show that the principal can achieve constant regret when the utilities are deterministic (the constant depends on the prior distribution but not on the time horizon) and logarithmic regret when the utilities are stochastic. As a key technical tool, we introduce the concept of explorable actions, the actions that some incentive-compatible policy can recommend with nonzero probability. We show how the principal can identify (and explore) all explorable actions and use the revealed information to perform optimally. 
In particular, our results significantly improve over the prior work on the special case of a single agent per round, which relies on assumptions to guarantee that all actions are explorable. Interestingly, we do not require the principal’s utility to be aligned with the cumulative utility of the agents; instead, the principal can optimize an arbitrary notion of per-round reward.
VIDEO GAMES
informs.org

Configuring the Enterprise Systems Portfolio: The Role of Information Risk

We investigate how public firms configure their enterprise systems (ES) portfolio when faced with information risk, which refers to the likelihood that corporate financial information is of poor quality. We focus on firms’ configuration of their ES portfolio by introducing a novel construct: ES portfolio balance, or the relative proportion of two categories of ES modules, operational and functional. We draw on the theory of information processing to hypothesize the impact of information risk on ES portfolio balance and how this impact is affected by internal controls. We construct a multisource panel data set of 697 firms and 1,993 firm-year observations from 2005 to 2008 and use econometric and multivariate procedures to test our hypotheses. We find that when faced with an increase in information risk, firms change their ES portfolio balance more toward operational modules. However, when such firms are also faced with materially weak internal controls, they change their ES portfolio balance more toward functional modules instead. These findings expand our understanding of how firms’ information processing needs drive the configuration of their ES portfolio and, more broadly, IT resources portfolio.
ECONOMY
informs.org

On a Reduction for a Class of Resource Allocation Problems

In the resource allocation problem (RAP), the goal is to divide a given amount of a resource over a set of activities while minimizing the cost of this allocation and possibly satisfying constraints on allocations to subsets of the activities. Most solution approaches for the RAP and its extensions allow each activity to have its own cost function. However, in many applications, often the structure of the objective function is the same for each activity, and the difference between the cost functions lies in different parameter choices, such as, for example, the multiplicative factors. In this article, we introduce a new class of objective functions that captures a significant number of the objectives occurring in studied applications. These objectives are characterized by a shared structure of the cost function depending on two input parameters. We show that, given the two input parameters, there exists a solution to the RAP that is optimal for any choice of the shared structure. As a consequence, this problem reduces to the quadratic RAP, making available the vast amount of solution approaches and algorithms for the latter problem. We show the impact of our reduction result on several applications, and in particular, we improve the best-known worst-case complexity bound of two problems in vessel routing and processor scheduling from.
ECONOMY
informs.org

A General Framework for Approximating Min Sum Ordering Problems

We consider a large family of problems in which an ordering (or, more precisely, a chain of subsets) of a finite set must be chosen to minimize some weighted sum of costs. This family includes variations of min sum set cover, several scheduling and search problems, and problems in Boolean function evaluation. We define a new problem, called the min sum ordering problem (MSOP), which generalizes all these problems using a cost and a weight function defined on subsets of a finite set. Assuming a polynomial time α-approximation algorithm for the problem of finding a subset whose ratio of weight to cost is maximal, we show that under very minimal assumptions, there is a polynomial time.
CODING & PROGRAMMING
informs.org

Actor-Critic–Like Stochastic Adaptive Search for Continuous Simulation Optimization

We propose a random search method for solving a class of simulation optimization problems with Lipschitz continuity properties. The algorithm samples candidate solutions from a parameterized probability distribution over the solution space and estimates the performance of the sampled points through an asynchronous learning procedure based on the so-called shrinking ball method. A distinctive feature of the algorithm is that it fully retains the previous simulation information and incorporates an approximation architecture to exploit knowledge of the objective function in searching for improved solutions. Each step of the algorithm involves simultaneous adaptation of a parameterized distribution and an approximator of the objective function, which is akin to the actor-critic structure used in reinforcement learning. We establish a finite-time probability bound on the algorithm’s performance and show its global convergence when only a single simulation observation is collected at each iteration. Empirical results indicate that the algorithm is promising and may outperform some of the existing procedures in terms of efficiency and reliability.
CODING & PROGRAMMING
informs.org

False Discovery in A/B Testing

We investigate what fraction of all significant results in website A/B testing is actually null effects (i.e., the false discovery rate (FDR)). Our data consist of 4,964 effects from 2,766 experiments conducted on a commercial A/B testing platform. Using three different methods, we find that the FDR ranges between 28% and 37% for tests conducted at 10% significance and between 18% and 25% for tests at 5% significance (two sided). These high FDRs stem mostly from the high fraction of true null effects, about 70%, rather than from low power. Using our estimates, we also assess the potential of various A/B test designs to reduce the FDR. The two main implications are that decision makers should expect one in five interventions achieving significance at 5% confidence to be ineffective when deployed in the field and that analysts should consider using two-stage designs with multiple variations rather than basic A/B tests.
TECHNOLOGY
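
The abstract above attributes the high FDR mainly to the large share of true null effects. The textbook relationship between the FDR, the null fraction, the significance level, and statistical power can be sketched as below. This is not one of the three estimation methods used in the paper. The 70% null fraction and the 5% level are taken from the abstract, while the power value is a hypothetical placeholder.

```python
def false_discovery_rate(null_fraction, alpha, power):
    """Expected share of significant results that are true nulls.

    null_fraction: fraction of tested effects that are truly null
    alpha:         significance level of each test
    power:         probability of detecting a true non-null effect
    """
    false_positives = null_fraction * alpha
    true_positives = (1.0 - null_fraction) * power
    return false_positives / (false_positives + true_positives)

# Null fraction and alpha from the abstract; power = 0.5 is an assumption.
fdr = false_discovery_rate(null_fraction=0.7, alpha=0.05, power=0.5)
print(f"FDR = {fdr:.3f}")
```

Holding alpha and power fixed, raising the null fraction raises the FDR, which is consistent with the abstract's point that the high share of null effects, rather than low power, drives the result.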
softpedia.com

Image Analyzer

Image Analyzer is a program that you can use to view and edit image files. The interface of the tool is plain and easy to navigate through. Image Analyzer definitely needs some improvements when it comes to its appearance, since it's a little outdated. Pictures can be opened via the file browser...
SOFTWARE
StreetInsider.com

Nvidia embraces the metaverse with new software, marketplace deals

(Reuters) - Nvidia Corp on Tuesday said that it would give away software for free to artists and other creators building virtual worlds for the metaverse and that it has made technology deals with several marketplaces where artists sell the three-dimensional content they create. The metaverse - a loosely defined...
SOFTWARE
towardsdatascience.com

Profiling and Analyzing Performance of Python Programs

The tools and techniques for finding all the bottlenecks in your Python programs and fixing them, fast. Profiling is integral to any code and performance optimization. Any experience and skill in performance optimization that you might already have will not be very useful if you don’t know where to apply it. Therefore, finding bottlenecks in your applications can help you solve performance issues quickly with very little overall effort.
CODING & PROGRAMMING
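
As a small illustration of the bottleneck hunting described in the teaser above, the sketch below profiles a single function call with the standard library's cProfile and pstats modules. The article's own tools are not shown in the teaser, so this choice is an assumption, and the helper and workload functions are hypothetical.

```python
import cProfile
import io
import pstats

def slow_sum_of_squares(n):
    """Deliberately naive loop used as the workload to profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total

def profile_call(func, *args, top=5):
    """Run func(*args) under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    profiler.enable()
    result = func(*args)
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the most expensive call paths come first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()

result, report = profile_call(slow_sum_of_squares, 100_000)
print(report)
```

The report lists call counts and per-function timings. Sorting by cumulative time is one reasonable default for finding where a program spends most of its time.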
