A Quantitative Comparison of Epistemic Uncertainty Maps Applied to Multi-Class Segmentation

Robin Camarasa (1 and 2), Daniel Bos (2 and 3), Jeroen Hendrikse (4), Paul Nederkoorn (5), M. Eline Kooi (6), Aad van der Lugt (2), Marleen de Bruijne (1, 2 and 7), ((1) Biomedical Imaging Group Rotterdam, Department of Radiology and Nuclear Medicine, Erasmus MC, Rotterdam, The Netherlands, (2) Department of Radiology and Nuclear Medicine, Erasmus MC, Rotterdam, The Netherlands, (3) Department of Epidemiology, Erasmus MC, Rotterdam, The Netherlands, (4) Department of Radiology, University Medical Center Utrecht, Utrecht, The Netherlands, (5) Department of Neurology, Academic Medical Center University of Amsterdam, Amsterdam, The Netherlands, (6) Department of Radiology and Nuclear Medicine, CARIM School for Cardiovascular Diseases, Maastricht University Medical Center, Maastricht, The Netherlands, (7) Department of Computer Science, University of Copenhagen, Denmark)

A Quantitative Model for Optical Coherence Tomography

Optical coherence tomography (OCT) is a widely used imaging technique in the micrometer regime, which gained accelerating interest in medical imaging in the last twenty years. In up-to-date OCT literature [5,6] certain simplifying assumptions are made for the reconstructions, but for many applications a more realistic description of the OCT imaging process is of interest. In mathematical models, for example, the incident angle of light onto the sample is usually neglected or a plane wave description for the light-sample interaction in OCT is used, which ignores almost completely the occurring effects within an OCT measurement process. In this article, we make a first step to a quantitative model by considering the measured intensity as a combination of back-scattered Gaussian beams affected by the system. In contrast to the standard plane wave simplification, the presented model includes system relevant parameters such as the position of the focus and the spot size of the incident laser beam, which allow a precise prediction of the OCT data and therefore ultimately serves as a forward model. The accuracy of the proposed model - after calibration of all necessary system parameters - is illustrated by simulations and validated by a comparison with experimental data obtained from a 1300 nm swept-source OCT system.
General primitivity in the mapping class group

For $g\geq 2$, let $\text{Mod}(S_g)$ be the mapping class group of the closed orientable surface $S_g$ of genus $g$. In this paper, we obtain necessary and sufficient conditions under which a given pseudo-periodic mapping class can be a root of another up to conjugacy. Using this characterization, the canonical decomposition of (non-periodic) mapping classes, and some known algorithms, we give a theoretical algorithm for computing its roots up to conjugacy. Furthermore, we derive realizable bounds on the degrees of roots of pseudo-periodic mapping classes in $\text{Mod}(S_g)$, the Torelli group, the level-$m$ subgroup of $\text{Mod}(S_g)$, and the commutator subgroup of $\text{Mod}(S_2)$. In particular, we show that the highest possible (realizable) degree of a root of a pseudo-periodic mapping class $F$ is $3q(F)(g+1)(g+2)$, realized by the roots of $T_c^{q(F)}$, where $c$ is a separating curve in $S_g$ of genus $[g/2]$ and $q(F)$ is a unique positive integer associated with the conjugacy class of $F$. Finally, for $g\geq 3$ we show that any pseudo-periodic mapping class having a nontrivial periodic component that is not the hyperelliptic involution normally generates $\text{Mod}(S_g)$. Consequently, we establish that there always exist roots of bounding pair maps and powers of Dehn twists that normally generate $\text{Mod}(S_g)$.
Towards an extended taxonomy of information dynamics via Integrated Information Decomposition

Pedro A.M. Mediano, Fernando E. Rosas, Andrea I. Luppi, Robin L. Carhart-Harris, Daniel Bor, Anil K. Seth, Adam B. Barrett

Complex systems, from the human brain to the global economy, are made of multiple elements that interact in such ways that the behaviour of the `whole' often seems to be more than what is readily explainable in terms of the `sum of the parts.' Our ability to understand and control these systems remains limited, one reason being that we still don't know how best to describe -- and quantify -- the higher-order dynamical interactions that characterise their complexity. To address this limitation, we combine principles from the theories of Information Decomposition and Integrated Information into what we call Integrated Information Decomposition, or $\Phi$ID. $\Phi$ID provides a comprehensive framework to reason about, evaluate, and understand the information dynamics of complex multivariate systems. $\Phi$ID reveals the existence of previously unreported modes of collective information flow, providing tools to express well-known measures of information transfer and dynamical complexity as aggregates of these modes. Via computational and empirical examples, we demonstrate that $\Phi$ID extends our explanatory power beyond traditional causal discovery methods -- with profound implications for the study of complex systems across disciplines.
Distributionally Robust Multi-Output Regression Ranking

Despite their empirical success, most existing listwise learning-to-rank (LTR) models are not built to be robust to errors in labeling or annotation, distributional data shift, or adversarial data perturbations. To fill this gap, we introduce a new listwise LTR model called Distributionally Robust Multi-output Regression Ranking (DRMRR). Unlike existing methods, the scoring function of DRMRR is designed as a multivariate mapping from a feature vector to a vector of deviation scores, which captures local context information and cross-document interactions. DRMRR uses a Distributionally Robust Optimization (DRO) framework to minimize a multi-output loss function under the most adverse distributions in the neighborhood of the empirical data distribution defined by a Wasserstein ball. We show that this is equivalent to a regularized regression problem with a matrix norm regularizer. Our experiments were conducted on two real-world applications, medical document retrieval and drug response prediction, and show that DRMRR notably outperforms state-of-the-art LTR models. We also conducted a comprehensive analysis to assess the resilience of DRMRR against various types of noise: Gaussian noise, adversarial perturbations, and label poisoning. We show that DRMRR not only achieves significantly better performance than the baselines, but also maintains relatively stable performance as more noise is added to the data.
Refining the Semantics of Epistemic Specifications

Answer set programming (ASP) is an efficient problem-solving approach, which has been strongly supported both scientifically and technologically by several solvers, ongoing active research, and implementations in many different fields. However, although researchers acknowledged long ago the necessity of epistemic operators in the language of ASP for better introspective reasoning, this research avenue did not attract much attention until recently. Moreover, the existing epistemic extensions of ASP in the literature are not widely approved either, because some produce unintended results even for simple acyclic epistemic programs, new unexpected results may still be found, and, more importantly, researchers reason differently about some critical programs. To that end, Cabalar et al. have recently identified some structural properties of epistemic programs to formally support a possible semantics proposal for such programs and standardise their results. Nonetheless, the soundness of these properties is still under debate, and they are not widely accepted by the ASP community either. Thus, it seems that it will still take time to really understand the paradigm, arrive at a mature formalism, and determine the principles providing formal justification of their understandable models. In this paper, we mainly focus on the existing semantics approaches, the criteria that a satisfactory semantics is supposed to satisfy, and the ways to improve them. We also extend some well-known propositions of here-and-there logic (HT) into epistemic HT so as to reveal the real behaviour of programs. Finally, we propose a slightly novel semantics for epistemic ASP, which can be considered as a reflexive extension of Cabalar et al.'s recent formalism called autoepistemic ASP.
An Approximation Algorithm for a General Class of Multi-Parametric Optimization Problems

In a widely studied class of multi-parametric optimization problems, the objective value of each solution is an affine function of real-valued parameters. For many important multi-parametric optimization problems, an optimal solution set with minimum cardinality can contain super-polynomially many solutions. Consequently, any exact algorithm for such problems must output a super-polynomial number of solutions.
Few-shot Learning Based on Multi-stage Transfer and Class-Balanced Loss for Diabetic Retinopathy Grading

Diabetic retinopathy (DR) is one of the major blindness-causing diseases currently known. Automatic grading of DR using deep learning methods not only speeds up the diagnosis of the disease but also reduces the rate of misdiagnosis. However, problems such as insufficient samples and imbalanced class distribution in DR datasets have constrained the improvement of grading performance. In this paper, we introduce the idea of multi-stage transfer into the grading task of DR. The new transfer learning technique leverages multiple datasets with different scales to enable the model to learn more feature representation information. Meanwhile, to cope with imbalanced DR datasets, we present a class-balanced loss function that performs well in natural image classification tasks, and adopt a simple and easy-to-implement training method for it. The experimental results show that the application of multi-stage transfer and class-balanced loss function can effectively improve the grading performance metrics such as accuracy and quadratic weighted kappa. In fact, our method has outperformed two state-of-the-art methods and achieved the best result on the DR grading task of IDRiD Sub-Challenge 2.
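The quadratic weighted kappa mentioned above is a standard agreement metric for ordinal grades. The sketch below is a minimal pure-Python version of the textbook definition, not the authors' evaluation code; the function name and the five-grade example (grades 0 to 4) are assumptions made for illustration.

```python
def quadratic_weighted_kappa(y_true, y_pred, num_classes):
    """Cohen's kappa with quadratic weights for ordinal labels 0..num_classes-1.

    Undefined (division by zero) when the expected weighted disagreement is 0,
    e.g. when all true and predicted labels fall into a single class.
    """
    n = len(y_true)
    # observed[i][j]: number of samples with true grade i and predicted grade j
    observed = [[0.0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1.0
    true_hist = [sum(row) for row in observed]
    pred_hist = [sum(observed[i][j] for i in range(num_classes))
                 for j in range(num_classes)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(num_classes):
        for j in range(num_classes):
            # Quadratic penalty: grows with the squared distance between grades
            weight = (i - j) ** 2 / (num_classes - 1) ** 2
            expected = true_hist[i] * pred_hist[j] / n  # chance agreement
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical five-grade example (grades 0..4): perfect agreement gives 1.0
print(quadratic_weighted_kappa([0, 1, 2, 3, 4], [0, 1, 2, 3, 4], 5))
```

Because the weight grows with the squared grade distance, confusing grade 0 with grade 4 is penalized far more than confusing adjacent grades, which is why the metric suits ordinal grading tasks.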
A temporal logic of epistemic and normative justifications, with an application to the Protagoras paradox

We combine linear temporal logic (with both past and future modalities) with a deontic version of justification logic to provide a framework for reasoning about time and epistemic and normative reasons. In addition to temporal modalities, the resulting logic contains two kinds of justification assertions: epistemic justification assertions and deontic justification assertions. The former presents justification for the agent's knowledge and the latter gives reasons for why a proposition is obligatory. We present two kinds of semantics for the logic: one based on Fitting models and the other based on neighborhood models. The use of neighborhood semantics enables us to define the dual of deontic justification assertions properly, which corresponds to the notion of permission in deontic logic. We then establish the soundness and completeness of an axiom system of the logic with respect to these semantics. Further, we formalize the Protagoras versus Euathlus paradox in this logic and present a precise analysis of the paradox, and also briefly discuss Leibniz's solution.
A Comparison of Code Embeddings and Beyond

Program representation learning is a fundamental task in software engineering applications. With the availability of "big code" and the development of deep learning techniques, various program representation learning models have been proposed to understand the semantic properties of programs and applied to different software engineering tasks. However, no previous study has comprehensively assessed the generalizability of these deep models on different tasks, so the pros and cons of the models are unclear. In this experience paper, we try to bridge this gap by systematically evaluating the performance of eight program representation learning models on three common tasks, where six models are based on abstract syntax trees and two models are based on plain text of source code. We explain the criteria for selecting the models and tasks, as well as the method for enabling end-to-end learning in each task. The results of performance evaluation show that they perform diversely in each task and the performance of the AST-based models is generally unstable over different tasks. In order to further explain the results, we apply a prediction attribution technique to find what elements are captured by the models and responsible for the predictions in each task. Based on the findings, we discuss some general principles for better capturing the information in the source code, and hope to inspire researchers to improve program representation learning methods for software engineering tasks.
Accelerated Steady-State Electrostatic Particle-in-Cell Simulation of Langmuir Probes

First-principles particle-in-cell (PIC) simulation is a powerful tool for understanding plasma behavior, but this power often comes at great computational expense. Artificially reducing the ion/electron mass ratio is a time-honored practice to reduce simulation costs. Usually, this is a severe approximation. However, for steady-state collisionless, electrostatic (Vlasov-Poisson) systems, the solution with reduced mass ratio can be scaled to the solution for the real mass ratio, with no approximation. This 'scaled mass' method, which works with already-existing PIC codes, can reduce the computation time for a large class of electrostatic PIC simulations by the square root of the mass ratio. The particle distributions of the resulting steady state must be trivially rescaled to yield the true distributions, but the self-consistent electrostatic field is independent of the mass ratio. This method is equivalent to 'numerical timestepping,' an approach that evolves electron and ion populations with different timesteps. Numerical timestepping can be viewed as a special case of the speed-limited PIC (SLPIC) method, which is not restricted to steady-state phenomena. Although the scaled-mass approach is simplest, numerical timestepping and SLPIC more easily generalize to include other effects, such as magnetic forces and collisions. The equivalence of these new approaches is demonstrated by applying them to simulate a cylindrical Langmuir probe in electron-argon plasma, speeding up simulation by two orders of magnitude. Methods such as SLPIC can therefore play an invaluable role in interpreting probe measurements by including geometric effects, collisions, secondary emission, non-Maxwellian distributions, and magnetic fields.
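As a rough check of the "two orders of magnitude" figure, the square root of the argon-ion/electron mass ratio can be evaluated directly. This is a back-of-the-envelope sketch, not part of the paper: it assumes singly ionized argon with the standard atomic weight, and it treats the square root of the full mass ratio as the attainable speedup, as the abstract states for the scaled-mass method.

```python
import math

# Physical constants (CODATA values)
ATOMIC_MASS_UNIT_KG = 1.66053906660e-27
ELECTRON_MASS_KG = 9.1093837015e-31
ARGON_MASS_U = 39.948  # standard atomic weight of argon


def scaled_mass_speedup(ion_mass_kg, electron_mass_kg=ELECTRON_MASS_KG):
    """Speedup factor quoted in the abstract: sqrt(ion mass / electron mass)."""
    return math.sqrt(ion_mass_kg / electron_mass_kg)


# Assumption: singly ionized argon (neutral atom mass minus one electron)
argon_ion_mass_kg = ARGON_MASS_U * ATOMIC_MASS_UNIT_KG - ELECTRON_MASS_KG
speedup = scaled_mass_speedup(argon_ion_mass_kg)
print(f"mass ratio = {argon_ion_mass_kg / ELECTRON_MASS_KG:.0f}, speedup = {speedup:.0f}")
```

The mass ratio is about 7.3e4, so its square root is about 270, consistent with the reported speedup of two orders of magnitude.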
Systematic Nuclear Uncertainties in the Hypertriton System

The hypertriton bound state is relevant for inference of knowledge about the hyperon-nucleon (YN) interaction. In this work we compute the binding energy of the hypertriton using the ab initio hypernuclear no-core shell model (NCSM) with realistic interactions derived from chiral effective field theory. In particular, we employ a large family of nucleon-nucleon interactions with the aim to quantify the theoretical precision of predicted hypernuclear observables arising from nuclear-physics uncertainties. The three-body calculations are performed in a relative Jacobi-coordinate harmonic oscillator basis and we implement infrared correction formulas to extrapolate the NCSM results to infinite model space. We find that the spread of the predicted hypertriton binding energy, attributed to the nuclear-interaction model uncertainty, is about 100 keV. In conclusion, the sensitivity of the hypertriton binding energy to nuclear-physics uncertainties is of the same order of magnitude as experimental uncertainties such that this bound-state observable can be used in the calibration procedure to constrain the YN interactions.
Generalized rainbow patterns of oblate drops simulated by a ray model in three dimensions

The scattering patterns near the primary rainbow of oblate drops are simulated by extending the vectorial complex ray model (VCRM) [1] to three-dimensional (3D) calculations. With the wavefront curvature as an intrinsic property of a ray, this advanced ray model makes it possible, in principle, to predict the amplitudes and phases of all emergent rays with a rigorous algebraic formalism. This letter reports a breakthrough of VCRM for 3D scattering with a line-by-line triangulation interpolation algorithm that allows the total complex amplitude of the scattered field to be calculated. This makes it possible to simulate not only the skeleton (geometrical rainbow angles, hyperbolic-umbilic caustics), but also the coarse (Airy bows, lattice) and fine (ripple fringes) structures of the generalized rainbow patterns (GRPs) of oblate drops. The simulated results are found to be in good qualitative and quantitative agreement with experimental scattering patterns for drops of different aspect ratios. The physical interpretation of the GRPs is also given. This work opens up promising perspectives for simulating and understanding the 3D scattering of large particles of any shape with smooth surface by VCRM.
Modeling electronic response properties with an explicit-electron machine learning potential

Explicit-electron force fields introduce electrons or electron pairs as semi-classical particles in force fields or empirical potentials, which are suitable for molecular dynamics simulations. Even though semi-classical electrons are a drastic simplification compared to a quantum-mechanical electronic wavefunction, they still retain a relatively detailed electronic model compared to conventional polarizable and reactive force fields. The ability of explicit-electron models to describe chemical reactions and electronic response properties has already been demonstrated, yet the description of short-range interactions for a broad range of chemical systems remains challenging. In this work, we present the electron machine learning potential (eMLP), a new explicit-electron force field where the short-range interactions are modeled with machine learning. The electron pair particles are located at well-defined positions, derived from localized molecular orbitals or Wannier centers, naturally imposing the correct dielectric and piezoelectric behavior of the system. The eMLP is benchmarked on two newly constructed datasets: eQM7, an extension of the QM7 dataset for small molecules, and a dataset for crystalline $\beta$-glycine. It is shown that the eMLP can predict dipole moments, polarizabilities and IR spectra of unseen molecules with high precision. Furthermore, a variety of response properties, e.g. stiffness or piezoelectric constants, can be accurately reproduced.
Generalized Ising Model on a Scale-Free Network: An Interplay of Power Laws

We consider a recently introduced generalization of the Ising model in which individual spin strength can vary. The model is intended for analysis of ordering in systems comprising agents which, although matching in their binarity (i.e., maintaining the iconic Ising features of `+' or `$-$', `up' or `down', `yes' or `no'), differ in their strength. To investigate the interplay between variable properties of nodes and interactions between them, we study the model on a complex network where both the spin strength and degree distributions are governed by power laws. We show that in the annealed network approximation, thermodynamic functions of the model are self-averaging and we obtain an exact solution for the partition function. This allows us to derive the leading temperature and field dependencies of thermodynamic functions, their critical behavior, and logarithmic corrections at the interface of different phases. We find the delicate interplay of the two power laws leads to new universality classes.
Exactness of Parrilo's conic approximations for copositive matrices and associated low order bounds for the stability number of a graph

De Klerk and Pasechnik (2002) introduced the bounds $\vartheta^{(r)}(G)$ ($r\in \mathbb{N}$) for the stability number $\alpha(G)$ of a graph $G$ and conjectured exactness at order $\alpha(G)-1$: $\vartheta^{(\alpha(G)-1)}(G)=\alpha(G)$. These bounds rely on the conic approximations $\mathcal{K}_n^{(r)}$ by Parrilo (2000) for the copositive cone $\text{COP}_n$. A difficulty in the convergence analysis of $\vartheta^{(r)}$ is the bad behaviour of the cones $\mathcal{K}_n^{(r)}$ under adding a zero row/column: when applied to a matrix not in $\mathcal{K}^{(0)}_n$ this gives a matrix not in any ${\mathcal{K}}^{(r)}_{n+1}$, thereby showing strict inclusion $\bigcup_{r\ge 0}{\mathcal{K}}^{(r)}_n\subset \text{COP}_n$ for $n\ge 6$. We investigate the graphs with $\vartheta^{(r)}(G)=\alpha(G)$ for $r=0,1$: we algorithmically reduce testing exactness of $\vartheta^{(0)}$ to acritical graphs, we characterize critical graphs with $\vartheta^{(0)}$ exact, and we exhibit graphs for which exactness of $\vartheta^{(1)}$ is not preserved under adding an isolated node. This disproves a conjecture by Gvozdenović and Laurent (2007) which, if true, would have implied the above conjecture by de Klerk and Pasechnik.
Using Soft Labels to Model Uncertainty in Medical Image Segmentation

Medical image segmentation is inherently uncertain. For a given image, there may be multiple plausible segmentation hypotheses, and physicians will often disagree on lesion and organ boundaries. To be suited to real-world application, automatic segmentation systems must be able to capture this uncertainty and variability. Thus far, this has been addressed by building deep learning models that, through dropout, multiple heads, or variational inference, can produce a set - infinite, in some cases - of plausible segmentation hypotheses for any given image. However, in clinical practice, it may not be practical to browse all hypotheses. Furthermore, recent work shows that segmentation variability plateaus after a certain number of independent annotations, suggesting that a large enough group of physicians may be able to represent the whole space of possible segmentations. Inspired by this, we propose a simple method to obtain soft labels from the annotations of multiple physicians and train models that, for each image, produce a single well-calibrated output that can be thresholded at multiple confidence levels, according to each application's precision-recall requirements. We evaluated our method on the MICCAI 2021 QUBIQ challenge, showing that it performs well across multiple medical image segmentation tasks, produces well-calibrated predictions, and, on average, performs better at matching physicians' predictions than other physicians.
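The idea of turning several annotations into one soft label and thresholding a single output at different confidence levels can be sketched as follows. The abstract does not say how the soft labels are built, so per-pixel averaging of binary masks is an assumption here, and the function names and the tiny 2x2 masks are purely illustrative.

```python
def soft_label(annotations):
    """Average binary masks (lists of rows) from several annotators per pixel.

    Assumption: each pixel's soft label is the fraction of annotators
    who marked it as foreground.
    """
    count = len(annotations)
    height, width = len(annotations[0]), len(annotations[0][0])
    return [[sum(mask[r][c] for mask in annotations) / count
             for c in range(width)]
            for r in range(height)]


def threshold(prob_map, level):
    """Binarize a soft map: keep pixels whose value is at least `level`."""
    return [[1 if p >= level else 0 for p in row] for row in prob_map]


# Three hypothetical annotators who disagree on the object boundary
masks = [
    [[1, 1], [0, 0]],
    [[1, 0], [0, 0]],
    [[1, 1], [1, 0]],
]
soft = soft_label(masks)
print(threshold(soft, 0.5))  # majority-vote style mask (favors recall)
print(threshold(soft, 0.9))  # near-unanimous mask (favors precision)
```

A low threshold keeps every pixel that some annotators marked, while a high threshold keeps only pixels on which nearly all agree; this is the precision-recall trade-off the abstract refers to.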
Neural network tokamak equilibria with incompressible flows

We present several numerical results concerning the solution of a Generalized Grad-Shafranov Equation (GGSE), which governs axisymmetric plasma equilibria with incompressible flows of arbitrary direction, using fully connected, feed-forward deep neural networks, also known as multi-layer perceptrons. Solutions to the GGSE in a Tokamak-relevant D-Shaped domain are approximated by such artificial neural networks (ANNs) upon minimizing the GGSE mean squared residual in the plasma volume and the poloidal flux function on the plasma boundary. Solutions for the Solovev and the general linearizing ansatz for the free functions involved in the GGSE are obtained and benchmarked against known analytic solutions. We also construct a non-linear equilibrium incorporating characteristics relevant to the H-mode confinement. In our numerical experiments it was observed that changing the radial distribution of the training points had no appreciable effect on the accuracy of the trained solution. In particular it is shown that localizing the training points near the boundary results in ANN solutions that describe quite accurately the entire magnetic configuration thus demonstrating the interpolation capabilities of the ANNs.
Distributionally Robust Multiclass Classification and Applications in Deep CNN Image Classifiers

We develop a Distributionally Robust Optimization (DRO) formulation for Multiclass Logistic Regression (MLR), which could tolerate data contaminated by outliers. The DRO framework uses a probabilistic ambiguity set defined as a ball of distributions that are close to the empirical distribution of the training set in the sense of the Wasserstein metric. We relax the DRO formulation into a regularized learning problem whose regularizer is a norm of the coefficient matrix. We establish out-of-sample performance guarantees for the solutions to our model, offering insights on the role of the regularizer in controlling the prediction error. We apply the proposed method in rendering deep CNN-based image classifiers robust to random and adversarial attacks. Specifically, using the MNIST and CIFAR-10 datasets, we demonstrate reductions in test error rate by up to 78.8% and loss by up to 90.8%. We also show that with a limited number of perturbed images in the training set, our method can improve the error rate by up to 49.49% and the loss by up to 68.93% compared to Empirical Risk Minimization (ERM), converging faster to an ideal loss/error rate as the number of perturbed images increases.
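The relaxed problem described above has the general shape "empirical loss plus a norm penalty on the coefficient matrix". The sketch below illustrates that shape only: the abstract does not state which norm or constant is used, so the Frobenius norm and the weight `epsilon` (standing in for a quantity tied to the Wasserstein ball radius) are assumptions, as are all function names.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


def regularized_mlr_loss(coeffs, features, labels, epsilon):
    """Mean cross-entropy of multiclass logistic regression plus a norm penalty.

    coeffs: one row of coefficients per class (K rows, d columns).
    Assumption: the penalty is epsilon times the Frobenius norm of coeffs.
    """
    cross_entropy = 0.0
    for x, label in zip(features, labels):
        logits = [sum(b * xi for b, xi in zip(row, x)) for row in coeffs]
        cross_entropy -= math.log(softmax(logits)[label])
    frobenius = math.sqrt(sum(b * b for row in coeffs for b in row))
    return cross_entropy / len(features) + epsilon * frobenius


# With all-zero coefficients and two classes, the loss is log(2) and the penalty is 0
print(regularized_mlr_loss([[0.0, 0.0], [0.0, 0.0]], [[1.0, 0.0]], [0], 0.1))
```

Increasing `epsilon` corresponds to guarding against a larger set of perturbed distributions, at the cost of shrinking the coefficients more strongly.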
Learning from Small Samples: Transformation-Invariant SVMs with Composition and Locality at Multiple Scales

Motivated by the problem of learning when the number of training samples is small, this paper shows how to incorporate into support-vector machines (SVMs) those properties that have made convolutional neural networks (CNNs) successful. Particularly important is the ability to incorporate domain knowledge of invariances, e.g., translational invariance of images. Kernels based on the \textit{minimum} distance over a group of transformations, which corresponds to defining similarity as the \textit{best} over the possible transformations, are not generally positive definite. Perhaps it is for this reason that they have neither previously been experimentally tested for their performance nor studied theoretically. Instead, previous attempts have employed kernels based on the \textit{average} distance over a group of transformations, which are trivially positive definite, but which generally yield both poor margins and poor performance, as we show. We address this lacuna and show that positive definiteness indeed holds \textit{with high probability} for kernels based on the minimum distance in the small training sample set regime of interest, and that they do yield the best results in that regime. Another important property of CNNs is their ability to incorporate local features at multiple spatial scales, e.g., through max pooling. A third important property is their ability to provide the benefits of composition through the architecture of multiple layers. We show how these additional properties can also be embedded into SVMs. We verify through experiments on widely available image sets that the resulting SVMs do provide superior accuracy in comparison to well-established deep neural network (DNN) benchmarks for small sample sizes.
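A kernel built on the minimum distance over a group of transformations can be illustrated on a toy case. The sketch below assumes a Gaussian (RBF) form and uses cyclic shifts of a 1D signal as the transformation group; both choices, and all function names, are illustrative assumptions and not the paper's construction.

```python
import math


def cyclic_shifts(signal):
    """All cyclic translations of a 1D signal (the assumed transformation group)."""
    return [signal[s:] + signal[:s] for s in range(len(signal))]


def squared_distance(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))


def min_distance_kernel(x, y, gamma=1.0):
    """RBF kernel on the smallest squared distance between x and any shift of y."""
    best = min(squared_distance(x, shifted) for shifted in cyclic_shifts(y))
    return math.exp(-gamma * best)


def plain_rbf_kernel(x, y, gamma=1.0):
    """Ordinary RBF kernel, with no invariance, for comparison."""
    return math.exp(-gamma * squared_distance(x, y))


# The same pulse at two different positions
x = [1.0, 0.0, 0.0, 0.0]
y = [0.0, 0.0, 1.0, 0.0]
print(min_distance_kernel(x, y))  # shifts align the pulses exactly
print(plain_rbf_kernel(x, y))     # treats the shifted pulse as a different signal
```

As the abstract notes, a kernel of this kind is not guaranteed to be positive definite in general, so the Gram matrix should be checked before it is passed to a standard SVM solver.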
