Estimating Instance-dependent Label-noise Transition Matrix using DNNs

By Shuo Yang, Erkun Yang, Bo Han, Yang Liu, Min Xu, Gang Niu, Tongliang Liu
arxiv.org
 22 days ago

In label-noise learning, estimating the transition matrix is an active research topic because the matrix plays an important role in building statistically consistent classifiers. Traditionally, the transition from the clean distribution to the noisy distribution (i.e., the clean label transition matrix) has been widely exploited to learn a clean label classifier from noisy data. Motivated by the fact that classifiers mostly output Bayes optimal labels for prediction, in this paper we directly model the transition from the Bayes optimal distribution to the noisy distribution (i.e., the Bayes label transition matrix) and learn a Bayes optimal label classifier. Note that, given only noisy data, estimating either the clean label transition matrix or the Bayes label transition matrix is ill-posed. Favorably, however, Bayes optimal labels carry less uncertainty than clean labels: the class posteriors of Bayes optimal labels are one-hot vectors, while those of clean labels are not. This gives two advantages for estimating the Bayes label transition matrix: (a) a set of Bayes optimal labels can theoretically be recovered under mild conditions; (b) the feasible solution space is much smaller. Exploiting these advantages, we estimate the Bayes label transition matrix with a deep neural network in a parameterized way, leading to better generalization and superior classification performance.
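To make the quantities in the abstract concrete, the following is a minimal NumPy sketch of how a one-hot Bayes optimal posterior and a transition matrix combine into a noisy-label posterior. It is an illustration under stated assumptions, not the authors' method: the function names, the toy posteriors, and the fixed 20% symmetric noise matrix are all hypothetical, and the paper estimates an instance-dependent matrix with a deep neural network rather than fixing one by hand.

```python
import numpy as np


def bayes_optimal_one_hot(clean_posterior):
    """Map each row of clean class posteriors to the one-hot vector of its argmax.

    Hypothetical helper: it shows why Bayes optimal labels are "less uncertain"
    than clean labels (their posterior is one-hot by construction).
    """
    labels = np.argmax(clean_posterior, axis=1)
    return np.eye(clean_posterior.shape[1])[labels]


def noisy_posterior(label_posterior, transition):
    """Push a label posterior through per-instance transition matrices.

    transition[n, i, j] is assumed to mean P(noisy label = j | label = i, x_n).
    """
    return np.einsum("ni,nij->nj", label_posterior, transition)


# Toy clean class posteriors for two instances and three classes (made up).
clean_posterior = np.array([[0.7, 0.2, 0.1],
                            [0.1, 0.5, 0.4]])

# Bayes optimal posteriors are one-hot, unlike the clean posteriors above.
bayes = bayes_optimal_one_hot(clean_posterior)

# Assumed transition matrix: 20% symmetric noise, identical for every instance.
T = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])
transition = np.broadcast_to(T, (clean_posterior.shape[0], 3, 3))

noisy = noisy_posterior(bayes, transition)
print(bayes)
print(noisy)
```

Because each Bayes optimal posterior is one-hot, the noisy posterior for an instance is simply one row of that instance's transition matrix, which is one way to see why the feasible solution space is smaller than in the clean label case.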
