Business Insider

Facebook is working on AI tech that will monitor your every move

Facebook envisions a future where smartglasses "become as useful in everyday life as smartphones," the company said in a new blog post. In order to achieve that future, such devices will require powerful AI software that can read and respond to the world around the headset's user. And the only way to train AI to see and hear the world like humans do is for it to experience the world like we do: from a first-person perspective.
INTERNET
arxiv.org

Multimodal Sensory Learning for Real-time, Adaptive Manipulation

Adaptive control for real-time manipulation requires quick estimation and prediction of object properties. While robot learning in this area primarily focuses on using vision, many tasks cannot rely on vision due to object occlusion. Here, we formulate a learning framework that uses multimodal sensory fusion of tactile and audio data in order to quickly characterize and predict an object's properties. The predictions are used in a developed reactive controller to adapt the grip on the object to compensate for the predicted inertial forces experienced during motion. Drawing inspiration from how humans interact with objects, we propose an experimental setup from which we can understand how to best utilize different sensory signals and actively interact with and manipulate objects to quickly learn their object properties for safe manipulation.
ENGINEERING
arxiv.org

Real-time Drift Detection on Time-series Data

Nandini Ramanan, Rasool Tahmasbi, Marjorie Sayer, Deokwoo Jung, Shalini Hemachandran, Claudionor Nunes Coelho Jr. Practical machine learning applications involving time series data, such as firewall log analysis to proactively detect anomalous behavior, are concerned with real-time analysis of streaming data. Consequently, we need to update the ML models, as the statistical characteristics of such data may shift frequently with time. One alternative explored in the literature is to retrain models with updated data whenever the model's accuracy is observed to degrade. However, these methods rely on near-real-time availability of ground truth, which is rarely fulfilled. Further, in applications with seasonal data, temporal concept drift is confounded by seasonal variation. In this work, we propose an approach called Unsupervised Temporal Drift Detector or UTDD to flexibly account for seasonal variation, efficiently detect temporal concept drift in time series data in the absence of ground truth, and subsequently adapt our ML models to concept drift for better generalization.
COMPUTERS
arxiv.org

Modeling of Pan Evaporation Based on the Development of Machine Learning Methods

For effective planning and management of water resources and implementation of the related strategies, it is important to ensure proper estimation of evaporation losses, especially in regions that are prone to drought. Changes in climatic factors, such as changes in temperature, wind speed, sunshine hours, humidity, and solar radiation can have a significant impact on the evaporation process. As such, evaporation is a highly non-linear, non-stationary process, and can be difficult to model based on climatic factors, especially in different agro-climatic conditions. The aim of this study, therefore, is to investigate the feasibility of several machine learning (ML) models (conditional random forest regression, Multivariate Adaptive Regression Splines, Bagged Multivariate Adaptive Regression Splines, Model Tree M5, K-nearest neighbor, and the weighted K-nearest neighbor) for modeling the monthly pan evaporation estimation. This study proposes the development of newly explored ML models for modeling evaporation losses in three different locations over the Iraq region based on the available climatic data in such areas. The evaluation of the performance of the proposed model based on various evaluation criteria showed the capability of the proposed weighted K-nearest neighbor model in modeling the monthly evaporation losses in the studied areas with better accuracy when compared with the other existing models used as a benchmark in this study.
SCIENCE
arxiv.org

Revisiting Design Choices in Model-Based Offline Reinforcement Learning

Offline reinforcement learning enables agents to leverage large pre-collected datasets of environment transitions to learn control policies, circumventing the need for potentially expensive or unsafe online data collection. Significant progress has been made recently in offline model-based reinforcement learning, approaches which leverage a learned dynamics model. This typically involves constructing a probabilistic model, and using the model uncertainty to penalize rewards where there is insufficient data, solving for a pessimistic MDP that lower bounds the true MDP. Existing methods, however, exhibit a breakdown between theory and practice, whereby pessimistic return ought to be bounded by the total variation distance of the model from the true dynamics, but is instead implemented through a penalty based on estimated model uncertainty. This has spawned a variety of uncertainty heuristics, with little to no comparison between differing approaches. In this paper, we compare these heuristics, and design novel protocols to investigate their interaction with other hyperparameters, such as the number of models, or imaginary rollout horizon. Using these insights, we show that selecting these key hyperparameters using Bayesian Optimization produces superior configurations that are vastly different to those currently used in existing hand-tuned state-of-the-art methods, and result in drastically stronger performance.
COMPUTERS
arxiv.org

Local and Global Context-Based Pairwise Models for Sentence Ordering

Sentence Ordering refers to the task of rearranging a set of sentences into the appropriate coherent order. For this task, most previous approaches have explored global context-based end-to-end methods using Sequence Generation techniques. In this paper, we put forward a set of robust local and global context-based pairwise ordering strategies, leveraging which our prediction strategies outperform all previous works in this domain. Our proposed encoding method utilizes the paragraph's rich global contextual information to predict the pairwise order using novel transformer architectures. Analysis of the two proposed decoding strategies helps better explain error propagation in pairwise models. This approach is the most accurate pure pairwise model and our encoding strategy also significantly improves the performance of other recent approaches that use pairwise models, including the previous state-of-the-art, demonstrating the research novelty and generalizability of this work. Additionally, we show how the pre-training task for ALBERT helps it to significantly outperform BERT, despite having considerably fewer parameters. The extensive experimental results, architectural analysis and ablation studies demonstrate the effectiveness and superiority of the proposed models compared to the previous state-of-the-art, besides providing a much better understanding of the functioning of pairwise models.
SCIENCE
arxiv.org

Model-independent time-delay interferometry based on principal component analysis

With a laser interferometric gravitational-wave detector in separate free flying spacecraft, the only way to achieve detection is to mitigate the dominant noise arising from the frequency fluctuations of the lasers via postprocessing. The noise can be effectively filtered out on the ground through a specific technique called time-delay interferometry (TDI), which relies on the measurements of time-delays between spacecraft and careful modeling of how laser noise enters the interferometric data. Recently, this technique has been recast into a matrix-based formalism by several authors, offering a different perspective on TDI, particularly by relating it to principal component analysis (PCA). In this work, we demonstrate that we can cancel laser frequency noise by directly applying PCA to a set of shifted data samples, without any prior knowledge of the relationship between single-link measurements and noise, nor time-delays. We show that this fully data-driven algorithm achieves a gravitational-wave sensitivity similar to classic TDI.
AEROSPACE & DEFENSE
arxiv.org

Reward-Free Model-Based Reinforcement Learning with Linear Function Approximation

We study model-based reward-free reinforcement learning with linear function approximation for episodic Markov decision processes (MDPs). In this setting, the agent works in two phases. In the exploration phase, the agent interacts with the environment and collects samples without the reward. In the planning phase, the agent is given a specific reward function and uses samples collected from the exploration phase to learn a good policy. We propose a new provably efficient algorithm, called UCRL-RFE, under the Linear Mixture MDP assumption, where the transition probability kernel of the MDP can be parameterized by a linear function over certain feature mappings defined on the triplet of state, action, and next state. We show that to obtain an $\epsilon$-optimal policy for an arbitrary reward function, UCRL-RFE needs to sample at most $\tilde O(H^5d^2\epsilon^{-2})$ episodes during the exploration phase. Here, $H$ is the length of the episode and $d$ is the dimension of the feature mapping. We also propose a variant of UCRL-RFE using a Bernstein-type bonus and show that it needs to sample at most $\tilde O(H^4d(H + d)\epsilon^{-2})$ episodes to achieve an $\epsilon$-optimal policy. By constructing a special class of Linear Mixture MDPs, we also prove that any reward-free algorithm needs to sample at least $\tilde \Omega(H^2d\epsilon^{-2})$ episodes to obtain an $\epsilon$-optimal policy. Our upper bound matches the lower bound in terms of the dependence on $\epsilon$ and the dependence on $d$ if $H \ge d$.
CODING & PROGRAMMING
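The closing claim of the reward-free RL abstract above (the bounds match in $\epsilon$ and $d$ when $H \ge d$) can be checked with one line of arithmetic. The block below is a worked sketch using only the bounds quoted in the abstract; it assumes `amsmath` for `\text` and is not taken from the paper itself.

```latex
% Bernstein-variant upper bound versus the lower bound, assuming H >= d.
\[
H \ge d
\;\Longrightarrow\;
H + d \le 2H
\;\Longrightarrow\;
H^4 d\,(H + d)\,\epsilon^{-2} \;\le\; 2\,H^5 d\,\epsilon^{-2},
\]
\[
\text{upper: } \tilde O\!\left(H^5 d\,\epsilon^{-2}\right),
\qquad
\text{lower: } \tilde \Omega\!\left(H^2 d\,\epsilon^{-2}\right).
\]
```

Both expressions are linear in $d$ and scale as $\epsilon^{-2}$, so under $H \ge d$ the remaining gap is a factor of $H^3$ in the horizon only.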
arxiv.org

Real-time EEG-based Emotion Recognition using Discrete Wavelet Transforms on Full and Reduced Channel Signals

Real-time EEG-based Emotion Recognition (EEG-ER) with consumer-grade EEG devices involves classification of emotions using a reduced number of channels. These devices typically provide only four or five channels, unlike the high number of channels (32 or more) typically used in most current state-of-the-art research. In this work we propose to use Discrete Wavelet Transforms (DWT) to extract time-frequency domain features, and we use time-windows of a few seconds to perform EEG-ER classification. This technique can be used in real-time, as opposed to post-hoc on the full session data. We also apply baseline removal preprocessing, developed in prior research, to our proposed DWT Entropy and Energy features, which improves classification accuracy significantly. We consider two different classifier architectures, a 3D Convolutional Neural Network (3D CNN) and a Support Vector Machine (SVM). We evaluate both models on subject-independent and subject-dependent setups to classify the Valence and Arousal dimensions of an individual's emotional state. We test them on both the full 32-channel data provided by the DEAP dataset, and also a reduced 5-channel extract of the same dataset. The SVM model performs best on all the presented scenarios, achieving an accuracy of 95.32% on Valence and 95.68% on Arousal for the full 32-channel subject-dependent case, beating prior real-time EEG-ER subject-dependent benchmarks. On the subject-independent case an accuracy of 80.70% on Valence and 81.41% on Arousal was also obtained. Reducing the input data to 5 channels only degrades the accuracy by an average of 3.54% across all scenarios, making this model appropriate for use with more accessible low-end EEG devices.
TECHNOLOGY
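To make the "DWT Entropy and Energy features" in the EEG abstract above concrete, here is a minimal sketch of per-sub-band energy and entropy computed on one short window. The abstract does not state the wavelet, the number of levels, or the exact feature definitions, so the Haar wavelet, `levels=3`, the Shannon-entropy form, and every function name below are illustrative assumptions rather than the authors' method.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def wavedec(signal, levels):
    """Multi-level decomposition: [detail_1, ..., detail_L, approx_L]."""
    bands, current = [], list(signal)
    for _ in range(levels):
        current, detail = haar_dwt(current)
        bands.append(detail)
    bands.append(current)
    return bands


def band_energy(coeffs):
    """Energy of a sub-band: sum of squared coefficients."""
    return sum(c * c for c in coeffs)


def band_entropy(coeffs):
    """Shannon entropy of the normalized squared coefficients (assumed definition)."""
    total = band_energy(coeffs)
    if total == 0.0:
        return 0.0
    probs = [c * c / total for c in coeffs if c != 0.0]
    return -sum(p * math.log(p) for p in probs)


def dwt_features(window, levels=3):
    """Concatenate (energy, entropy) for every sub-band of one channel window."""
    feats = []
    for band in wavedec(window, levels):
        feats.extend([band_energy(band), band_entropy(band)])
    return feats


if __name__ == "__main__":
    window = [4, 2, 5, 5, 1, 3, 7, 9]  # toy stand-in for a few seconds of one channel
    print(dwt_features(window))
```

In a setup like the one described, such a feature vector would be computed per channel and per time window, then concatenated and passed to a classifier; the window length must be divisible by 2 raised to `levels` for this simple Haar version.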
arxiv.org

Efficient Estimation in NPIV Models: A Comparison of Various Neural Networks-Based Estimators

We investigate the computational performance of Artificial Neural Networks (ANNs) in semi-nonparametric instrumental variables (NPIV) models of high dimensional covariates that are relevant to empirical work in economics. We focus on efficient estimation of and inference on expectation functionals (such as weighted average derivatives) and use optimal criterion-based procedures (sieve minimum distance or SMD) and novel efficient score-based procedures (ES). Both these procedures use ANN to approximate the unknown function. Then, we provide a detailed practitioner's recipe for implementing these two classes of estimators. This involves the choice of tuning parameters both for the unknown functions (which include conditional expectations) and for the estimation of the optimal weights in SMD and the Riesz representers used with the ES estimators. Finally, we conduct a large set of Monte Carlo experiments that compares the finite-sample performance in complicated designs that involve a large set of regressors (up to 13 continuous), and various underlying nonlinearities and covariate correlations. Some of the takeaways from our results include: 1) tuning and optimization are delicate especially as the problem is nonconvex; 2) various architectures of the ANNs do not seem to matter for the designs we consider and given proper tuning, ANN methods perform well; 3) stable inferences are more difficult to achieve with ANN estimators; 4) optimal SMD based estimators perform adequately; 5) there seems to be a gap between implementation and approximation theory. Finally, we apply ANN NPIV to estimate average price elasticity and average derivatives in two demand examples.
SCIENCE
arxiv.org

Sign Language Recognition via Skeleton-Aware Multi-Model Ensemble

Sign language is commonly used by deaf or mute people to communicate but requires extensive effort to master. It is usually performed with the fast yet delicate movement of hand gestures, body posture, and even facial expressions. Current Sign Language Recognition (SLR) methods usually extract features via deep neural networks and suffer from overfitting due to limited and noisy data. Recently, skeleton-based action recognition has attracted increasing attention due to its subject-invariant and background-invariant nature, whereas skeleton-based SLR is still under exploration due to the lack of hand annotations. Some researchers have tried to use off-line hand pose trackers to obtain hand keypoints and aid in recognizing sign language via recurrent neural networks. Nevertheless, none of them outperforms RGB-based approaches yet. To this end, we propose a novel Skeleton Aware Multi-modal Framework with a Global Ensemble Model (GEM) for isolated SLR (SAM-SLR-v2) to learn and fuse multi-modal feature representations towards a higher recognition rate. Specifically, we propose a Sign Language Graph Convolution Network (SL-GCN) to model the embedded dynamics of skeleton keypoints and a Separable Spatial-Temporal Convolution Network (SSTCN) to exploit skeleton features. The skeleton-based predictions are fused with other RGB and depth based modalities by the proposed late-fusion GEM to provide global information and make a faithful SLR prediction. Experiments on three isolated SLR datasets demonstrate that our proposed SAM-SLR-v2 framework is exceedingly effective and achieves state-of-the-art performance with significant margins. Our code will be available at this https URL.
TECHNOLOGY
arxiv.org

LightSeq: Accelerated Training for Transformer-based Models on GPUs

Transformer-based models have proven to be powerful in many natural language, computer vision, and speech recognition applications. It is expensive to train these types of models due to unfixed input length, complex computation, and large numbers of parameters. Existing systems either only focus on efficient inference or optimize only BERT-like encoder models. In this paper, we present LightSeq, a system for efficient training of Transformer-based models on GPUs. We propose a series of GPU optimization techniques tailored to computation flow and memory access patterns of neural layers in Transformers. LightSeq supports a variety of network architectures, including BERT (encoder-only), GPT (decoder-only), and Transformer (encoder-decoder). Our experiments on GPUs with varying models and datasets show that LightSeq is 1.4-3.5x faster than previous systems. In particular, it gains 308% training speedup compared with existing systems on a large public machine translation benchmark (WMT14 English-German).
CODING & PROGRAMMING
technologynetworks.com

Real-Time Monitoring of HPLC Pump Performance

TESTA Analytical Solutions has published a technical report that evaluates an exciting new liquid flowmeter device designed to provide continuous real-time monitoring of the pump flow rate of any liquid chromatography system. High Performance Liquid Chromatography (HPLC) is nowadays one of the most widely used techniques in analytical chemistry. The...
TECHNOLOGY
arxiv.org

Dynamical Wasserstein Barycenters for Time-series Modeling

Many time series can be modeled as a sequence of segments representing high-level discrete states, such as running and walking in a human activity application. Flexible models should describe the system state and observations in stationary "pure-state" periods as well as transition periods between adjacent segments, such as a gradual slowdown between running and walking. However, most prior work assumes instantaneous transitions between pure discrete states. We propose a dynamical Wasserstein barycentric (DWB) model that estimates the system state over time as well as the data-generating distributions of pure states in an unsupervised manner. Our model assumes each pure state generates data from a multivariate normal distribution, and characterizes transitions between states via displacement-interpolation specified by the Wasserstein barycenter. The system state is represented by a barycentric weight vector which evolves over time via a random walk on the simplex. Parameter learning leverages the natural Riemannian geometry of Gaussian distributions under the Wasserstein distance, which leads to improved convergence speeds. Experiments on several human activity datasets show that our proposed DWB model accurately learns the generating distribution of pure states while improving state estimation for transition periods compared to the commonly used linear interpolation mixture models.
COMPUTERS
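The difference between displacement interpolation and a linear mixture, mentioned in the Wasserstein barycenter abstract above, is easiest to see in one dimension. The sketch below restricts itself to univariate Gaussians, where the 2-Wasserstein barycenter has a closed form (weighted average of means and of standard deviations); the abstract's model is multivariate, so this 1D case, the function names, and the toy "walking"/"running" numbers are simplifying assumptions for illustration only.

```python
import math


def w2_distance(m1, s1, m2, s2):
    """2-Wasserstein distance between the 1D Gaussians N(m1, s1^2) and N(m2, s2^2)."""
    return math.hypot(m1 - m2, s1 - s2)


def barycenter_1d(means, stds, weights):
    """W2 barycenter of 1D Gaussians: again Gaussian, returned as (mean, std)."""
    if min(weights) < 0 or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must lie on the probability simplex")
    mean = sum(w * m for w, m in zip(weights, means))
    std = sum(w * s for w, s in zip(weights, stds))
    return mean, std


def mixture_moments(means, stds, weights):
    """Mean and std of the linear mixture, which is generally not Gaussian."""
    mean = sum(w * m for w, m in zip(weights, means))
    second = sum(w * (s * s + m * m) for w, m, s in zip(weights, means, stds))
    return mean, math.sqrt(second - mean * mean)


if __name__ == "__main__":
    means, stds = [0.0, 10.0], [1.0, 3.0]  # toy "walking" and "running" states
    for t in (0.0, 0.25, 0.5, 1.0):
        w = [1.0 - t, t]
        print(t, barycenter_1d(means, stds, w), mixture_moments(means, stds, w))
```

Halfway through the transition, the barycenter is a single Gaussian centered between the two states, whereas the mixture keeps mass at both pure states and therefore has a much larger spread. That is the sense in which a barycentric weight vector can describe a gradual slowdown rather than a blend of two regimes.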
HPCwire

It’s Time for Real-World AI on the Edge

Artificial Intelligence has been a key talking point in the high-performance computing and computer science communities for decades now. First it was a thought experiment and theoretical discussion point, but in recent years, it has become a practical area of focus for many scientists, researchers, and engineers, as well as businesses, universities, and government agencies. As the shift from theory to practice accelerated, so too did the excitement and scope of the expected impact of this technology on our society.
SOFTWARE
towardsdatascience.com

Stacking Machine Learning Models for Multivariate Time Series

Time series analysis is all too often seen as an esoteric sub-field of data science. It is not. Other data science sub-fields have their idiosyncrasies (e.g. NLP, recommender systems, graph theory etc.), and it is the same with time series. Time series is idiosyncratic, not distinct. If your goal is...
COMPUTERS
arxiv.org

A Survey on Deep Learning for Skeleton-Based Human Animation

L. Mourot, L. Hoyet, F. Le Clerc, François Schnitzler (2), Pierre Hellier (2) ((1) Inria, Univ Rennes, CNRS, IRISA, (2) InterDigital, Inc) Human character animation is often critical in entertainment content production, including video games, virtual reality or fiction films. To this end, deep neural networks drive most recent advances through deep learning and deep reinforcement learning. In this article, we propose a comprehensive survey on the state-of-the-art approaches based on either deep learning or deep reinforcement learning in skeleton-based human character animation. First, we introduce motion data representations, most common human motion datasets and how basic deep models can be enhanced to foster learning of spatial and temporal patterns in motion data. Second, we cover state-of-the-art approaches divided into three large families of applications in human animation pipelines: motion synthesis, character control and motion editing. Finally, we discuss the limitations of the current state-of-the-art methods based on deep learning and/or deep reinforcement learning in skeletal human character animation and possible directions of future research to alleviate current limitations and meet animators' needs.
COMPUTERS
arxiv.org

An FPGA-Based Fully Pipelined Bilateral Grid for Real-Time Image Denoising

The bilateral filter (BF) is widely used in image processing because it can perform denoising while preserving edges. It has disadvantages in that it is nonlinear, and its computational complexity and hardware resources are directly proportional to its window size. Thus far, several approximation methods and hardware implementations have been proposed to solve these problems. However, processing large-scale and high-resolution images in real time under severe hardware resource constraints remains a challenge. This paper proposes a real-time image denoising system that uses an FPGA based on the bilateral grid (BG). In the BG, a 2D image consisting of x- and y-axes is projected onto a 3D space called a "grid," which consists of axes that correlate to the x-component, y-component, and intensity value of the input image. This grid is then blurred using the Gaussian filter, and the output image is generated by interpolating the grid. Although it is possible to change the window size in the BF, it is impossible to change it on the input image in the BG. This makes it difficult to associate the BG with the BF and to obtain the property of suppressing the increase in hardware resources when the window radius is enlarged. This study demonstrates that a BG with a variable-sized window can be realized by introducing the window radius parameter wherein the window radius on the grid is always 1. We then implement this BG on an FPGA in a fully pipelined manner. Further, we verify that our design suppresses the increase in hardware resources even when the window size is enlarged and outperforms the existing designs in terms of computation speed and hardware resources.
SOFTWARE
arxiv.org

On Language Model Integration for RNN Transducer based Speech Recognition

The mismatch between an external language model (LM) and the implicitly learned internal LM (ILM) of RNN-Transducer (RNN-T) can limit the performance of LM integration such as simple shallow fusion. A Bayesian interpretation suggests removing this sequence prior as ILM correction. In this work, we study various ILM correction-based LM integration methods formulated in a common RNN-T framework. We provide a decoding interpretation on two major reasons for performance improvement with ILM correction, which is further experimentally verified with detailed analysis. We also propose an exact-ILM training framework by extending the proof given in the hybrid autoregressive transducer, which enables a theoretical justification for other ILM approaches. Systematic comparison is conducted for both in-domain and cross-domain evaluation on the Librispeech and TED-LIUM Release 2 corpora, respectively. Our proposed exact-ILM training can further improve the best ILM method.
SCIENCE
commercialintegrator.com

NSCA, PSA Release Study On Service-Based Models

The NSCA and PSA have released a new study that the organizations hope will change the integration industry’s dialogue around service-based business models. If you’re ever tired of hearing about why integrators need to adopt the revenue models of Netflix, Hello Fresh, Microsoft 365 and other popular technologies delivered as a service, then this should be a breath of fresh air.
BUSINESS

