or2yw: Modeling and Visualizing OpenRefine Histories as YesWorkflow Diagrams

By Nikolaus Nova Parulian, Lan Li, Bertram Ludaescher
arxiv.org
 4 days ago

OpenRefine is a popular open-source data cleaning tool. It allows users to export a previously executed data cleaning workflow in a JSON format for possible reuse on other datasets. We have developed or2yw, a novel tool that maps a JSON-formatted OpenRefine operation history to a YesWorkflow (YW) model, which then...

arxiv.org

Computing Voronoi Diagrams in the Polar-Coordinate Model of the Hyperbolic Plane

A Voronoi diagram is a basic geometric structure that partitions the space into regions associated with a given set of sites, such that all points in a region are closer to the corresponding site than to all other sites. While Voronoi diagrams have been thoroughly studied in Euclidean space, they are also of interest in hyperbolic space. In fact, there are several algorithms for computing hyperbolic Voronoi diagrams that work with the various models used to describe hyperbolic geometry. However, the polar-coordinate model has not been considered before, despite its increased popularity in the network science community. While Voronoi diagrams have the potential to advance this field, the model is geometrically not as approachable as other models, which impedes the development of geometric algorithms.
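The definition above can be made concrete with a minimal sketch. It assumes points are given as (r, theta) pairs in the polar-coordinate model with curvature -1 and uses the hyperbolic law of cosines for distances. The function names are hypothetical, and the brute-force lookup is only an illustration of the definition, not the algorithm from the paper.

```python
import math

def hyperbolic_distance(p, q):
    """Distance between two points given as (r, theta) polar coordinates."""
    (r1, t1), (r2, t2) = p, q
    # Hyperbolic law of cosines for curvature -1.
    cosh_d = (math.cosh(r1) * math.cosh(r2)
              - math.sinh(r1) * math.sinh(r2) * math.cos(t1 - t2))
    # cosh(d) >= 1 in exact arithmetic; clamp to guard against rounding error.
    return math.acosh(max(1.0, cosh_d))

def nearest_site(point, sites):
    """Index of the site whose Voronoi region contains `point` (brute force)."""
    return min(range(len(sites)),
               key=lambda i: hyperbolic_distance(point, sites[i]))
```

Calling nearest_site for every query point costs time linear in the number of sites per query, which is the naive baseline that a dedicated Voronoi algorithm is meant to improve on.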
MATHEMATICS
Lumia UK

Embedded Software Development in Visual Studio

We are happy to announce that we have added new embedded development capabilities to Visual Studio 2022 Preview. Used in conjunction with the new vcpkg artifact capabilities you can quickly bootstrap an embedded development machine and get started. In this post we will walk you through Visual Studio installation of...
SOFTWARE
martechseries.com

Mode Introduces Visual Explorer

Mode Analytics, the most comprehensive platform for collaborative Business Intelligence (BI) and interactive Data Science, today introduced Visual Explorer, a new flexible visualization system that helps analysts explore data faster and provide easy-to-interpret insights to business stakeholders. “Visual...
SOFTWARE
arxiv.org

Computing a link diagram from its exterior

A knot is a circle piecewise-linearly embedded into the 3-sphere. The topology of a knot is intimately related to that of its exterior, which is the complement of an open regular neighborhood of the knot. Knots are typically encoded by planar diagrams, whereas their exteriors, which are compact 3-manifolds with torus boundary, are encoded by triangulations. Here, we give the first practical algorithm for finding a diagram of a knot given a triangulation of its exterior. Our method applies to links as well as knots, and allows us to recover links with hundreds of crossings. We use it to find the first diagrams known for 19 principal congruence arithmetic link exteriors; the largest has over 1,000 crossings. Other applications include finding pairs of knots with the same 0-surgery, which relates to questions about slice knots and the smooth 4D Poincaré conjecture.
MATHEMATICS
arxiv.org

On visual self-supervision and its effect on model robustness

Recent self-supervision methods have found success in learning feature representations that could rival ones from full supervision, and have been shown to be beneficial to the model in several ways: for example, improving model robustness and out-of-distribution detection. In our paper, we conduct an empirical study to understand more precisely in what way self-supervised learning, as a pre-training technique or as part of adversarial training, affects model robustness to $l_2$ and $l_{\infty}$ adversarial perturbations and natural image corruptions. Self-supervision can indeed improve model robustness; however, it turns out the devil is in the details. If one simply adds self-supervision loss in tandem with adversarial training, then one sees improvement in accuracy of the model when evaluated with adversarial perturbations smaller or comparable to the value of $\epsilon_{train}$ that the robust model is trained with. However, if one observes the accuracy for $\epsilon_{test} \ge \epsilon_{train}$, the model accuracy drops. In fact, the larger the weight of the supervision loss, the larger the drop in performance, i.e. harming the robustness of the model. We identify primary ways in which self-supervision can be added to adversarial training, and observe that using a self-supervised loss to optimize both network parameters and find adversarial examples leads to the strongest improvement in model robustness, as this can be viewed as a form of ensemble adversarial training. Although self-supervised pre-training yields benefits in improving adversarial training as compared to random weight initialization, we observe no benefit in model robustness or accuracy if self-supervision is incorporated into adversarial training.
SCIENCE
arxiv.org

Prompting Visual-Language Models for Efficient Video Understanding

Visual-language pre-training has shown great success for learning joint visual-textual representations from large-scale web data, demonstrating remarkable ability for zero-shot generalisation. This paper presents a simple method to efficiently adapt one pre-trained visual-language model to novel tasks with minimal training, and here, we consider video understanding tasks. Specifically, we propose to optimise a few random vectors, termed as continuous prompt vectors, that convert the novel tasks into the same format as the pre-training objectives. In addition, to bridge the gap between static images and videos, temporal information is encoded with lightweight Transformers stacking on top of frame-wise visual features. Experimentally, we conduct extensive ablation studies to analyse the critical components and necessities. On 9 public benchmarks of action recognition, action localisation, and text-video retrieval, across closed-set, few-shot, open-set scenarios, we achieve competitive or state-of-the-art performance to existing methods, despite training significantly fewer parameters.
CODING & PROGRAMMING
arxiv.org

GAN-Supervised Dense Visual Alignment

We propose GAN-Supervised Learning, a framework for learning discriminative models and their GAN-generated training data jointly end-to-end. We apply our framework to the dense visual alignment problem. Inspired by the classic Congealing method, our GANgealing algorithm trains a Spatial Transformer to map random samples from a GAN trained on unaligned data to a common, jointly-learned target mode. We show results on eight datasets, all of which demonstrate our method successfully aligns complex data and discovers dense correspondences. GANgealing significantly outperforms past self-supervised correspondence algorithms and performs on-par with (and sometimes exceeds) state-of-the-art supervised correspondence algorithms on several datasets -- without making use of any correspondence supervision or data augmentation and despite being trained exclusively on GAN-generated data. For precise correspondence, we improve upon state-of-the-art supervised methods by as much as $3\times$. We show applications of our method for augmented reality, image editing and automated pre-processing of image datasets for downstream GAN training.
CODING & PROGRAMMING
esri.com

Table: The newest visualization in ArcGIS Dashboards

Getting the most out of your dashboard is as much about choosing the best visualization as it is about having the right data. Charts, gauges, and indicators are typically the go-to visualizations because they communicate concepts quickly and easily. Although ease of communication is important, there are times when you need more granularity. With the December 2021 release of ArcGIS Dashboards, you can now create tables to provide more depth and understanding to your dashboard’s data.
SOFTWARE
arxiv.org

Twistor Coverings and Feynman Diagrams

Recently, a worldsheet dual to free ${\cal N}=4$ Super Yang-Mills has been proposed in terms of twistor variables for ${\rm AdS}_5$, in parallel to that for the ${\rm AdS}_3$ dual to the free symmetric orbifold CFT. In the latter case, holomorphic covering maps play a central role in determining correlators and are associated to Feynman diagrams. After recasting these maps in terms of the worldsheet twistor variables for ${\rm AdS}_3$, we generalise to ${\rm AdS}_5$. We propose stringy incidence relations and appropriate reality conditions for the twistor covering maps. For some special kinematic configurations of correlators, we exhibit an explicit construction of the corresponding covering map. We find that the closed string worldsheet corresponding to this map is related to a gauge theory Feynman diagram by the Strebel construction, as for ${\rm AdS}_3/{\rm CFT}_2$. Rather strikingly, the regularised Strebel area of the worldsheet reproduces the Feynman propagator of the free field theory.
CHEMISTRY
towardsdatascience.com

Visualizing Decision Trees with Pybaobabdt

“Data visualization is the language of decision-making. Good charts effectively convey information. Great charts enable, inform, and improve decision making.” (Dante Vitagliano) Decision trees can be visualized in multiple ways. Take, for instance, the indentation nodes where every internal and leaf node is depicted as text, while the parent-child relationship is shown by indenting the child with respect to the parent.
CODING & PROGRAMMING
arxiv.org

Category-theoretic recipe for dualities in one-dimensional quantum lattice models

We present a systematic approach for generating duality transformations in quantum lattice models. Within our formalism, dualities are completely characterized by equivalent but distinct realizations of a given (possibly non-abelian and non-invertible) symmetry. These different realizations are encoded into fusion categories, and dualities are methodically generated by considering all Morita equivalent categories. The full set of symmetric operators can then be constructed from the categorical data. We construct explicit intertwiners, in the form of matrix product operators, that convert local symmetric operators of one realization into local symmetric operators of its dual. Concurrently, it maps local operators that transform non-trivially into non-local ones. This guarantees that the structure constants of the algebra of all symmetric operators are equal in both dual realizations. Families of dual Hamiltonians, possibly with long range interactions, are then designed by taking linear combinations of the corresponding symmetric operators. We illustrate this approach by establishing matrix product operator intertwiners for well-known dualities such as Kramers-Wannier and Jordan-Wigner, consider theories with two copies of the Ising category symmetry, and present an example with quantum group symmetries. Finally, we comment on generalizations to higher dimensions of this categorical approach to dualities.
MATHEMATICS
arxiv.org

Ensembling Off-the-shelf Models for GAN Training

The advent of large-scale training has produced a cornucopia of powerful visual recognition models. However, generative models, such as GANs, have traditionally been trained from scratch in an unsupervised manner. Can the collective "knowledge" from a large bank of pretrained vision models be leveraged to improve GAN training? If so, with so many models to choose from, which one(s) should be selected, and in what manner are they most effective? We find that pretrained computer vision models can significantly improve performance when used in an ensemble of discriminators. Notably, the particular subset of selected models greatly affects performance. We propose an effective selection mechanism, by probing the linear separability between real and fake samples in pretrained model embeddings, choosing the most accurate model, and progressively adding it to the discriminator ensemble. Interestingly, our method can improve GAN training in both limited data and large-scale settings. Given only 10k training samples, our FID on LSUN Cat matches the StyleGAN2 trained on 1.6M images. On the full dataset, our method improves FID by 1.5x to 2x on cat, church, and horse categories of LSUN.
CODING & PROGRAMMING
arxiv.org

Characterizing and addressing the issue of oversmoothing in neural autoregressive sequence modeling

Neural autoregressive sequence models smear the probability among many possible sequences including degenerate ones, such as empty or repetitive sequences. In this work, we tackle one specific case where the model assigns a high probability to unreasonably short sequences. We define the oversmoothing rate to quantify this issue. After confirming the high degree of oversmoothing in neural machine translation, we propose to explicitly minimize the oversmoothing rate during training. We conduct a set of experiments to study the effect of the proposed regularization on both model distribution and decoding performance. We use a neural machine translation task as the testbed and consider three different datasets of varying size. Our experiments reveal three major findings. First, we can control the oversmoothing rate of the model by tuning the strength of the regularization. Second, by enhancing the oversmoothing loss contribution, the probability and the rank of the <eos> token decrease heavily at positions where it is not supposed to be. Third, the proposed regularization impacts the outcome of beam search, especially when a large beam is used. The degradation of translation quality (measured in BLEU) with a large beam significantly lessens with a lower oversmoothing rate, but the degradation compared to smaller beam sizes persists. From these observations, we conclude that the high degree of oversmoothing is the main reason behind the degenerate case of overly probable short sequences in a neural autoregressive model.
CODING & PROGRAMMING
arxiv.org

Neural Style Transfer and Unpaired Image-to-Image Translation to deal with the Domain Shift Problem on Spheroid Segmentation

Background and objectives. Domain shift is a generalisation problem of machine learning models that occurs when the data distribution of the training set is different to the data distribution encountered by the model when it is deployed. This is common in the context of biomedical image segmentation due to the variance of experimental conditions, equipment, and capturing settings. In this work, we address this challenge by studying both neural style transfer algorithms and unpaired image-to-image translation methods in the context of the segmentation of tumour spheroids.
COMPUTERS
arxiv.org

BoGraph: Structured Bayesian Optimization From Logs for Systems with High-dimensional Parameter Space

Current auto-tuning frameworks struggle with tuning computer systems configurations due to their large parameter space, complex interdependencies, and high evaluation cost. Utilizing probabilistic models, Structured Bayesian Optimization (SBO) has recently overcome these difficulties. SBO decomposes the parameter space by utilizing contextual information provided by system experts, leading to fast convergence. However, the complexity of building probabilistic models has hindered its wider adoption. We propose BoGraph, an SBO framework that learns the system structure from its logs. BoGraph provides an API enabling experts to encode knowledge of the system as performance models or component dependencies. BoGraph takes in the learned structure and transforms it into a probabilistic graph model. Then it applies the expert-provided knowledge to the graph to further contextualize the system behavior. BoGraph's probabilistic graph allows the optimizer to find efficient configurations faster than other methods. We evaluate BoGraph via a hardware architecture search problem, achieving a $5$-$7\times$ improvement in energy-latency objectives over the default architecture. With its novel contextual structure learning pipeline, BoGraph makes using SBO accessible for a wide range of other computer systems such as databases and stream processors.
CODING & PROGRAMMING
arxiv.org

Visualizing Ensemble Predictions of Music Mood

Music mood classification has been a challenging problem in comparison with some other classification problems (e.g., genre, composer, or period). One solution for addressing this challenge is to use an ensemble of machine learning models. In this paper, we show that visualization techniques can effectively convey the popular prediction as well as uncertainty at different music sections along the temporal axis, while enabling the analysis of individual ML models in conjunction with their application to different musical data. In addition to the traditional visual designs, such as stacked line graph, ThemeRiver, and pixel-based visualization, we introduced a new variant of ThemeRiver, called "dual-flux ThemeRiver", which allows viewers to observe and measure the most popular prediction more easily than stacked line graph and ThemeRiver. Testing indicates that visualizing ensemble predictions is helpful both in model-development workflows and for annotating music using model predictions.
COMPUTERS
arxiv.org

A Static Analyzer for Detecting Tensor Shape Errors in Deep Neural Network Training Code

We present PyTea, an automatic static analyzer that detects tensor-shape errors in PyTorch code. Tensor-shape errors are critical in deep neural network code: much of the training cost and intermediate results are lost once a tensor shape mismatch occurs in the midst of the training phase. Given the input PyTorch source, PyTea statically traces every possible execution path, collects tensor shape constraints required by the tensor operation sequence of the path, and decides if the constraints are unsatisfiable (hence a shape error can occur). PyTea's scalability and precision hinge on the characteristics of real-world PyTorch applications: the number of execution paths after PyTea's conservative pruning rarely explodes and loops are simple enough to be circumscribed by our symbolic abstraction. We tested PyTea against the projects in the official PyTorch repository and some tensor-error code from questions on StackOverflow. PyTea successfully detects tensor shape errors in this code, each within a few seconds.
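As a rough illustration of checking shape constraints before any training runs, the sketch below walks one fixed path through a chain of fully connected layers and reports every place where the incoming feature count violates a layer's constraint. It is far simpler than what the abstract describes: it handles a single execution path with concrete dimensions, with no symbolic abstraction or path pruning. The function name and the (in_features, out_features) layer encoding are assumptions made for this example and are not part of PyTea.

```python
def check_linear_chain(input_shape, layers):
    """Check a (batch, features) input against a chain of linear layers.

    `layers` is a list of (in_features, out_features) pairs. Returns a list
    of error messages; an empty list means every constraint is satisfied.
    """
    errors = []
    _batch, features = input_shape
    for index, (in_features, out_features) in enumerate(layers):
        # Constraint for this operation: incoming features == in_features.
        if features != in_features:
            errors.append(
                f"layer {index}: expected {in_features} input features, "
                f"got {features}"
            )
        # Continue with the declared output size so later layers are checked too.
        features = out_features
    return errors
```

For example, check_linear_chain((32, 784), [(784, 128), (64, 10)]) reports a mismatch at layer 1, where 128 features arrive but 64 are expected.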
CODING & PROGRAMMING
arxiv.org

Adaptation and Attention for Neural Video Coding

By Nannan Zou, Honglei Zhang, Francesco Cricri, Ramin G. Youvalari, Hamed R. Tavakoli, Jani Lainema, Emre Aksu, Miska Hannuksela, Esa Rahtu

Neural image coding now represents the state-of-the-art image compression approach. However, a lot of work is still to be done in the video domain. In this work, we propose an end-to-end learned video codec that introduces several architectural novelties as well as training novelties, revolving around the concepts of adaptation and attention. Our codec is organized as an intra-frame codec paired with an inter-frame codec. As one architectural novelty, we propose to train the inter-frame codec model to adapt the motion estimation process based on the resolution of the input video. A second architectural novelty is a new neural block that combines concepts from split-attention based neural networks and from DenseNets. Finally, we propose to overfit a set of decoder-side multiplicative parameters at inference time. Through ablation studies and comparisons to prior art, we show the benefits of our proposed techniques in terms of coding gains. We compare our codec to VVC/H.266 and RLVC, which represent the state-of-the-art traditional and end-to-end learned codecs, respectively, and to the top performing end-to-end learned approach in the 2021 CLIC competition, E2E_T_OL. Our codec clearly outperforms E2E_T_OL, and compares favorably to VVC and RLVC in some settings.
CODING & PROGRAMMING
arxiv.org

Masked Feature Prediction for Self-Supervised Visual Pre-Training

We present Masked Feature Prediction (MaskFeat) for self-supervised pre-training of video models. Our approach first randomly masks out a portion of the input sequence and then predicts the feature of the masked regions. We study five different types of features and find Histograms of Oriented Gradients (HOG), a hand-crafted feature descriptor, works particularly well in terms of both performance and efficiency. We observe that the local contrast normalization in HOG is essential for good results, which is in line with earlier work using HOG for visual recognition. Our approach can learn abundant visual knowledge and drive large-scale Transformer-based models. Without using extra model weights or supervision, MaskFeat pre-trained on unlabeled videos achieves unprecedented results of 86.7% with MViT-L on Kinetics-400, 88.3% on Kinetics-600, 80.4% on Kinetics-700, 38.8 mAP on AVA, and 75.0% on SSv2. MaskFeat further generalizes to image input, which can be interpreted as a video with a single frame and obtains competitive results on ImageNet.
CODING & PROGRAMMING

