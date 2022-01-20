
Software

Use of Simulation Models for the Development of a Statistical Production Framework for Mobile Network Data with the simutils Package

By B. Oancea, D. Salgado, S. Barragan, M. Necula
arxiv.org
 4 days ago

We propose to use agent-based simulation models for the development of statistical methods in Official Statistics, especially in relation to the new digital data sources. We present a mobile network data simulator managed through the simutils R package, which provides geospatial representations of...

Related
arxiv.org

A Framework for Pedestrian Sub-classification and Arrival Time Prediction at Signalized Intersection Using Preprocessed Lidar Data

The mortality rate for pedestrians using wheelchairs was 36% higher than the overall population pedestrian mortality rate. However, there is no data to clarify the pedestrians' categories in both fatal and nonfatal accidents, since police reports often do not keep a record of whether a victim was using a wheelchair or has a disability. Currently, real-time detection of vulnerable road users using advanced traffic sensors installed at the infrastructure side has great potential to significantly improve traffic safety at the intersection. In this research, we develop a systematic framework with a combination of machine learning and deep learning models to distinguish pedestrians with disabilities from other pedestrians and predict the time needed to reach the other side of the intersection. The proposed framework shows high performance in both vulnerable user classification and arrival time prediction accuracy.
ARTIFICIAL INTELLIGENCE
techxplore.com

iDIRECT network framework could help scientists better understand biological systems

Despite the fundamental role networks play in how scientists understand the dynamics and properties of complex systems, reconstructing networks from large-scale experimental data is a challenge. In systems biology and microbial ecology—the study of microbes in the environment and their interactions with each other—the challenges of reconstructing these networks can...
SCIENCE
arxiv.org

SnapFuzz: An Efficient Fuzzing Framework for Network Applications

In recent years, fuzz testing has benefited from increased computational power and important algorithmic advances, leading to systems that have discovered many critical bugs and vulnerabilities in production software. Despite these successes, not all applications can be fuzzed efficiently. In particular, stateful applications such as network protocol implementations are constrained by their low fuzzing throughput and the need to develop fuzzing harnesses that reset their state and isolate their side effects. In this paper, we present SnapFuzz, a novel fuzzing framework for network applications. SnapFuzz offers a robust architecture that transforms slow asynchronous network communication into fast synchronous communication based on UNIX domain sockets, speeds up all file operations by redirecting them to an in-memory filesystem, and removes the need for many fragile modifications, such as configuring time delays or writing cleanup scripts, together with several other improvements. Using SnapFuzz, we fuzzed five popular networking applications: LightFTP, Dnsmasq, LIVE555, TinyDTLS and Dcmqrscp. We report impressive performance speedups of 72.4x, 49.7x, 24.8x, 23.9x, and 8.5x, respectively, with significantly simpler fuzzing harnesses in all cases. Through its performance advantage, SnapFuzz has also found 12 previously-unknown crashes in these applications.
SOFTWARE
arxiv.org

Development of a resource-efficient FPGA-based neural network regression model for the ATLAS muon trigger upgrades

In this paper, a resource-efficient FPGA-based neural network regression model is developed for potential applications in the future hardware muon trigger system of the ATLAS experiment at the Large Hadron Collider (LHC). Effective real-time selection of muon candidates is the cornerstone of the ATLAS physics programme. With the planned upgrades, an entirely new FPGA-based hardware muon trigger system will be installed in 2025-2026 that will process full muon detector data within a 10 $\mu$s latency window. The planned large FPGA devices should have sufficient spare resources to allow deployment of machine learning methods for improving identification of muon candidates and searching for new exotic particles. Our model promises to improve the rejection of the dominant source of background events in the central detector region, which are due to muon candidates with low transverse momenta. This neural network was implemented in a hardware description language using 65 digital signal processors and about 10,000 lookup tables. The simulated network latency and deadtime are 245 and 60 ns, respectively, when implemented in the FPGA device using a 400 MHz clock frequency. These results are well within the requirements of the future ATLAS muon trigger system, therefore opening a possibility for deploying machine learning methods for data taking by the ATLAS experiment at the High Luminosity LHC.
SCIENCE
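
The timing figures quoted in the ATLAS abstract above can be put in terms of clock cycles with a little arithmetic. The following is only an illustrative sketch: the function and variable names are hypothetical, and the only inputs are the numbers stated in the abstract (245 ns latency, 60 ns deadtime, 400 MHz clock, 10 µs latency window).

```python
def ns_to_cycles(duration_ns, clock_mhz):
    """Convert a duration in nanoseconds to clock cycles.

    One clock period is 1000 / clock_mhz nanoseconds, so the number of
    cycles is duration_ns * clock_mhz / 1000.
    """
    return duration_ns * clock_mhz / 1000.0


# Figures taken from the abstract; the names are illustrative only.
CLOCK_MHZ = 400
LATENCY_NS = 245
DEADTIME_NS = 60
WINDOW_NS = 10_000  # the 10 microsecond latency window

latency_cycles = ns_to_cycles(LATENCY_NS, CLOCK_MHZ)
deadtime_cycles = ns_to_cycles(DEADTIME_NS, CLOCK_MHZ)
window_fraction = LATENCY_NS / WINDOW_NS

print(f"latency:  {latency_cycles:.0f} cycles")
print(f"deadtime: {deadtime_cycles:.0f} cycles")
print(f"share of latency window used: {window_fraction:.2%}")
```

Under these figures the network latency corresponds to 98 clock cycles and uses about 2.45% of the 10 µs window, which is consistent with the abstract's statement that the results are well within the trigger requirements.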
arxiv.org

A Formal Category Theoretical Framework for Multi-model Data Transformations

Data integration and migration processes in polystores and multi-model database management systems highly benefit from data and schema transformations. Rigorous modeling of transformations is a complex problem. The data and schema transformation field is scattered with multiple different transformation frameworks, tools, and mappings. These are usually domain-specific and lack solid theoretical foundations. Our first goal is to define category theoretical foundations for relational, graph, and hierarchical data models and instances. Each data instance is represented as a category theoretical mapping called a functor. We formalize data and schema transformations as Kan lifts utilizing the functorial representation for the instances. A Kan lift is a category theoretical construction consisting of two mappings satisfying a certain universal property. In this work, the two mappings correspond to schema transformation and data transformation.
COMPUTERS
arxiv.org

A novel hierarchical multiresolution framework using CutFEM

In this paper, we propose a robust concurrent multiscale method for continuum-continuum coupling based on the cut finite element method. The computational domain is defined in a fully non-conforming fashion by approximate signed distance functions over a fixed background grid and decomposed into microscale and macroscale regions by a novel zooming technique. The zoom interface is represented by a signed distance function which is allowed to intersect the computational mesh arbitrarily. We refine the mesh inside the zooming region hierarchically for high-resolution computations. In the examples considered here, the microstructure can possess voids and hard inclusions, and the corresponding geometry is defined by a signed distance function interpolated over the refined mesh. In our zooming technique, the zooming interface is allowed to intersect the microstructure interface in an arbitrary way. Then, the coupling between the subdomains is applied using Nitsche's method across interfaces. This multiresolution framework proposes an efficient stabilized algorithm to ensure the stability of elements cut by the zooming and the microstructure interfaces. It is tested for several multiscale examples to demonstrate its robustness and efficiency for elasticity and plasticity problems.
CODING & PROGRAMMING
arxiv.org

Fast and accurate waveform modeling of long-haul multi-channel optical fiber transmission using a hybrid model-data driven scheme

The modeling of optical wave propagation in optical fiber is a task of solving the nonlinear Schrödinger equation (NLSE) quickly and accurately, and can enable research progress and system design in optical fiber communications, which are the infrastructure of modern communication systems. Traditional modeling of fiber channels using the split-step Fourier method (SSFM) has long been regarded as challenging in long-haul wavelength division multiplexing (WDM) optical fiber communication systems because it is extremely time-consuming. Here we propose a linear-nonlinear feature decoupling distributed (FDD) waveform modeling scheme to model the long-haul WDM fiber channel, where the channel linear effects are modeled by NLSE-derived model-driven methods and the nonlinear effects are modeled by data-driven deep learning methods. Meanwhile, the proposed scheme only focuses on one-span fiber distance fitting, and then recursively transmits the model to achieve the required transmission distance. The proposed modeling scheme is demonstrated to have high accuracy, high computing speeds, and robust generalization abilities for different optical launch powers, modulation formats, channel numbers and transmission distances. The total running time of the FDD waveform modeling scheme for 41-channel 1040-km fiber transmission is only 3 minutes versus more than 2 hours using SSFM for each input condition, which achieves a 98% reduction in computing time. Considering the multi-round optimization by adjusting system parameters, the complexity reduction is significant. The results represent a remarkable improvement in nonlinear fiber modeling and open up novel perspectives for the solution of NLSE-like partial differential equations and optical fiber physics problems.
COMPUTERS
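
As a quick sanity check on the running-time claim in the fiber-modeling abstract above, the relative reduction can be computed directly. This is a minimal sketch with a hypothetical helper name; it assumes exactly 120 minutes for SSFM, whereas the abstract says "more than 2 hours", so the result is a lower bound.

```python
def time_reduction(fast_minutes, slow_minutes):
    """Fraction of running time saved by the faster method."""
    return 1.0 - fast_minutes / slow_minutes


# 3 minutes (FDD scheme) versus an assumed 120 minutes (SSFM lower bound).
reduction = time_reduction(3, 120)
print(f"reduction in computing time: at least {reduction:.1%}")
```

With SSFM taking somewhat longer than two hours, the reduction rises slightly above this bound, in line with the roughly 98% quoted in the abstract.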
arxiv.org

GraphVAMPNet, using graph neural networks and variational approach to markov processes for dynamical modeling of biomolecules

Finding a low-dimensional representation of data from long-timescale trajectories of biomolecular processes such as protein folding or ligand-receptor binding is of fundamental importance, and kinetic models such as Markov modeling have proven useful in describing the kinetics of these systems. Recently, an unsupervised machine learning technique called VAMPNet was introduced to learn the low-dimensional representation and linear dynamical model in an end-to-end manner. VAMPNet is based on the variational approach to Markov processes (VAMP) and relies on neural networks to learn the coarse-grained dynamics. In this contribution, we combine VAMPNet and graph neural networks to generate an end-to-end framework to efficiently learn high-level dynamics and metastable states from long-timescale molecular dynamics trajectories. This method bears the advantages of graph representation learning and uses graph message passing operations to generate an embedding for each datapoint, which is used in the VAMPNet to generate a coarse-grained representation. This type of molecular representation results in a higher-resolution and more interpretable Markov model than the standard VAMPNet, enabling a more detailed kinetic study of the biomolecular processes. Our GraphVAMPNet approach is also enhanced with an attention mechanism to find the important residues for classification into different metastable states.
COMPUTERS
arxiv.org

Scalable In Situ Compression of Transient Simulation Data Using Time-Dependent Bases

Large-scale simulations of time-dependent problems generate a massive amount of data and with the explosive increase in computational resources the size of the data generated by these simulations has increased significantly. This has imposed severe limitations on the amount of data that can be stored and has elevated the issue of input/output (I/O) into one of the major bottlenecks of high-performance computing. In this work, we present an in situ compression technique to reduce the size of the data storage by orders of magnitude. This methodology is based on time-dependent subspaces and it extracts low-rank structures from multidimensional streaming data by decomposing the data into a set of time-dependent bases and a core tensor. We derive closed-form evolution equations for the core tensor as well as the time-dependent bases. The presented methodology does not require the data history and the computational cost of its extractions scales linearly with the size of data -- making it suitable for large-scale streaming datasets. To control the compression error, we present an adaptive strategy to add/remove modes to maintain the reconstruction error below a given threshold. We present four demonstration cases: (i) analytical example, (ii) incompressible unsteady reactive flow, (iii) stochastic turbulent reactive flow, and (iv) three-dimensional turbulent channel flow.
COMPUTERS
arxiv.org

Enhancement of Healthcare Data Performance Metrics using Neural Network Machine Learning Algorithms

Patients are often encouraged to make use of wearable devices for remote collection and monitoring of health data. This adoption of wearables results in a significant increase in the volume of data collected and transmitted. The battery life of the devices is then quickly diminished due to the high processing requirements of the devices. Given the importance attached to medical data, it is imperative that all transmitted data adhere to strict integrity and availability requirements. Reducing the volume of healthcare data for network transmission may improve sensor battery life without compromising accuracy. There is a trade-off between efficiency and accuracy which can be controlled by adjusting the sampling and transmission rates. This paper demonstrates that machine learning can be used to analyse complex health data metrics such as the accuracy and efficiency of data transmission to overcome the trade-off problem. The study uses time series nonlinear autoregressive neural network algorithms to enhance both data metrics by taking fewer samples to transmit. The algorithms were tested with a standard heart rate dataset to compare their accuracy and efficiency. The results showed that the Levenberg-Marquardt algorithm was the best performer, with an efficiency of 3.33 and an accuracy of 79.17%, which is similar to the other algorithms' accuracy but demonstrates improved efficiency. This shows that machine learning can improve one metric without sacrificing the other, compared with existing methods, while maintaining high efficiency.
HEALTH
arxiv.org

Functional Data-Driven Framework for Fast Forecasting of Electrode Slurry Rheology Simulated by Molecular Dynamics

Marc Duquesnoy, Teo Lombardo, Fernando Caro, Florent Haudiquez, Alain C. Ngandjong, Jiahui Xu, Hassan Oularbi, Alejandro A. Franco. Computational modeling of the manufacturing process of Lithium-Ion Battery (LIB) composite electrodes based on mechanistic approaches allows predicting the influence of manufacturing parameters on electrode properties. However, ensuring that the calculated properties match well with experimental data is typically time- and resource-consuming. In this work, we tackle this issue by proposing a functional data-driven framework combining Functional Principal Component Analysis and K-Nearest Neighbors algorithms. This aims first to recover the early numerical values of a mechanistic electrode manufacturing simulation to predict whether the observable being calculated is likely to match or not, i.e., a screening step. In a second step it recovers additional numerical values of the ongoing mechanistic simulation iterations to predict the mechanistic simulation result, i.e., a forecasting step. We demonstrate this approach in the context of LIB manufacturing through non-equilibrium molecular dynamics (NEMD) simulations, aiming to capture the rheological behavior of electrode slurries. We discuss our novel methodology in full detail and report that the expected mechanistic simulation results can be obtained 11 times faster than by running the complete mechanistic simulation, while being accurate enough from an experimental point of view, with an $F1$ score of 0.90 and an $R^2$ score of 0.96 for the learning validation. This paves the way towards a powerful tool to drastically reduce the utilization of computational resources while running mechanistic simulations of battery electrode manufacturing.
ENGINEERING
arxiv.org

Calipers: A Criticality-aware Framework for Modeling Processor Performance

The computer architecture design space is vast and complex. Tools are needed to explore new ideas and gain insights quickly, with low effort and at a desired accuracy. We propose Calipers, a criticality-based framework to model key abstractions of complex architectures and a program's execution using dynamic event-dependence graphs. By applying graph algorithms, Calipers can track instruction and event dependencies, compute critical paths, and analyze architecture bottlenecks. By manipulating the graph, Calipers enables architects to investigate a wide range of Instruction Set Architecture (ISA) and microarchitecture design choices/"what-if" scenarios during both early- and late-stage design space exploration without recompiling and rerunning the program. Calipers can model in-order and out-of-order microarchitectures, structural hazards, and different types of ISAs, and can evaluate multiple ideas in a single run. Modeling algorithms are described in detail.
CODING & PROGRAMMING
bestnewsmonitoring.com

Mobile Virtual Network Operator (MVNO) Market Statistics based on Analysis and facts in 2021

The Mobile Virtual Network Operator (MVNO) Market Report from MarketResearch.Biz is a unique collection of market size, industry growth, share, trends, constraints, and drivers of key businesses. Starting with an examination of the present state of the Mobile Virtual Network Operator (MVNO) market in 2021, the analysis clarifies the dynamics affecting every segment within it. The report introduces a distinctive analysis technology specifically designed for this market. It also includes details on sales channels, distributors, traders, and dealers, including research findings and conclusions, an appendix, and data sources. The research document goes into great detail about product launch events, business drivers, challenges, and opportunities.
MARKETS
arxiv.org

A Guideline for the Statistical Analysis of Compositional Data in Immunology

The study of immune cellular composition is of great scientific interest in immunology, and multiple large-scale datasets have been generated recently to support this investigation. From the statistical point of view, such immune cellular composition data correspond to compositional data that convey relative information. In compositional data, each element is positive and all the elements together sum to a constant, which can be set to one in general. Standard statistical methods are not directly applicable to the analysis of compositional data because they do not appropriately handle correlations among elements in the compositional data. As this type of data has become more widely available, investigation of optimal statistical strategies that account for compositional features in the data has become increasingly necessary. In this paper, we review statistical methods for compositional data analysis and illustrate them in the context of immunology. Specifically, we focus on regression analyses using log-ratio and Dirichlet approaches, discuss their theoretical foundations, and illustrate their applications with immune cellular fraction data generated from colorectal cancer patients.
SCIENCE
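
The sum-to-a-constant constraint described in the immunology abstract above is what log-ratio methods are designed to handle. The sketch below shows the centered log-ratio (CLR) transform, one common log-ratio transform; the abstract does not say which log-ratio variant the paper uses, so this choice, the function names, and the example fractions are all assumptions for illustration.

```python
import math


def closure(parts):
    """Rescale strictly positive parts so that they sum to one."""
    total = sum(parts)
    return [p / total for p in parts]


def clr(parts):
    """Centered log-ratio transform of a strictly positive composition.

    Each component becomes log(x_i) minus the mean of the logs, i.e. the
    log of x_i divided by the geometric mean of the composition.
    """
    logs = [math.log(p) for p in closure(parts)]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


# Hypothetical cell-type fractions for one sample (not data from the paper).
fractions = [0.5, 0.3, 0.2]
clr_values = clr(fractions)
print(clr_values)
```

By construction the CLR values sum to zero, and the transform requires every component to be strictly positive, so zero fractions would need separate handling before it can be applied.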
techraptor.net

Team17 Acquires Bus Simulator 21 Developer Astragon

Simulation game publisher Astragon Entertainment has announced that it has entered into an agreement to be acquired by Team17. Team17 is a game publisher with quite a few titles under its belt; most recently, it has purchased the Hell Let Loose IP and mobile publishing company The Label. However, it has also had some difficulties as of late; Team17 parted ways with Beyond Eyes developer Tiger and Squid late last year. More recently, Team17 ended its publisher relationship with Ready or Not developer Void Interactive.
VIDEO GAMES
arxiv.org

Learning-From-Disagreement: A Model Comparison and Visual Analytics Framework

With the fast-growing number of classification models being produced every day, numerous model interpretation and comparison solutions have also been introduced. For example, LIME and SHAP can interpret what input features contribute more to a classifier's output predictions. Different numerical metrics (e.g., accuracy) can be used to easily compare two classifiers. However, few works can interpret the contribution of a data feature to a classifier in comparison with its contribution to another classifier. This comparative interpretation can help to disclose the fundamental difference between two classifiers, select classifiers in different feature conditions, and better ensemble two classifiers. To accomplish it, we propose a learning-from-disagreement (LFD) framework to visually compare two classification models. Specifically, LFD identifies data instances with disagreed predictions from two compared classifiers and trains a discriminator to learn from the disagreed instances. As the two classifiers' training features may not be available, we train the discriminator through a set of meta-features proposed based on certain hypotheses of the classifiers to probe their behaviors. Interpreting the trained discriminator with the SHAP values of different meta-features, we provide actionable insights into the compared classifiers. Also, we introduce multiple metrics to profile the importance of meta-features from different perspectives. With these metrics, one can easily identify meta-features with the most complementary behaviors in two classifiers, and use them to better ensemble the classifiers. We focus on binary classification models in the financial services and advertising industry to demonstrate the efficacy of our proposed framework and visualizations.
COMPUTERS
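
The first step of the learning-from-disagreement idea in the abstract above, identifying instances on which two classifiers disagree, can be sketched in a few lines. This is not the authors' implementation: the two threshold classifiers, the scores, and the labeling convention (recording classifier A's prediction as the discriminator target) are hypothetical stand-ins, and the discriminator training and SHAP interpretation steps are omitted.

```python
def disagreed_instances(rows, clf_a, clf_b):
    """Return (row, prediction_of_a) pairs where the two classifiers differ.

    For binary classifiers, knowing A's prediction on a disagreed instance
    also fixes B's, so A's prediction can serve as a discriminator label.
    """
    selected = []
    for row in rows:
        pred_a, pred_b = clf_a(row), clf_b(row)
        if pred_a != pred_b:
            selected.append((row, pred_a))
    return selected


# Two toy binary classifiers on a single score (hypothetical thresholds).
def clf_a(score):
    return int(score > 0.5)


def clf_b(score):
    return int(score > 0.3)


scores = [0.1, 0.4, 0.6, 0.9]
disagreed = disagreed_instances(scores, clf_a, clf_b)
print(disagreed)
```

Here only the score 0.4 falls between the two thresholds, so it is the single disagreed instance; in the framework described by the abstract, a discriminator would then be trained on meta-features of such instances.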
theblockcrypto.com

Elliptic is developing a new blockchain data product for institutional crypto traders

Blockchain analytics firm Elliptic announced Monday that it is developing a new data product focused on institutional crypto traders. To that end, Elliptic has set up a new "market intelligence unit," headed by its co-founder James Smith. The unit, Elliptic's second business line after its primary blockchain analytics offering, will offer on-chain data to crypto traders.
MARKETS
arxiv.org

Dynamic Cooperative Vehicle Platoon Control Considering Longitudinal and Lane-changing Dynamics

This paper presents a distributed cascade Proportional Integral Derivative (DCPID) control algorithm for the connected and automated vehicle (CAV) platoon considering the heterogeneity of CAVs in terms of the inertial lag. Furthermore, a real-time dynamic cooperative lane-changing model for CAVs, which can seamlessly combine the DCPID algorithm and the improved sine function, is developed. The DCPID algorithm determines the appropriate longitudinal acceleration and speed of the lane-changing vehicle considering the speed fluctuations of the front vehicle on the target lane (TFV). In the meantime, the sine function plans a reference trajectory which is further updated in real time using model predictive control (MPC) to avoid potential collisions until lane changing is completed. Both the local and the asymptotic stability conditions of the DCPID algorithm are mathematically derived, and the sensitivity of the DCPID control parameters under different states is analyzed. Simulation experiments are conducted to assess the performance of the proposed model, and the results indicate that the DCPID algorithm can provide robust control for tracking and adjusting the desired spacing and velocity for all 400 scenarios, even in a relatively extreme initial state. In addition, the proposed dynamic cooperative lane-changing model can guarantee effective and safe lane changing at different speeds and even in emergency situations (such as the sudden deceleration of the TFV).
CARS
arxiv.org

Smoothed Model-Assisted Small Area Estimation

In countries where population census and sample survey data are limited, generating accurate subnational estimates of health and demographic indicators is challenging. Existing model-based geostatistical methods leverage covariate information and spatial smoothing to reduce the variability of estimates but often assume the survey design is ignorable, which may be inappropriate given the complex design of household surveys typically used in this context. On the other hand, small area estimation approaches common in the survey statistics literature do not incorporate both unit-level covariate information and spatial smoothing in a design-consistent way. We propose a new smoothed model-assisted estimator that accounts for survey design and leverages both unit-level covariates and spatial smoothing, bridging the survey statistics and model-based geostatistics perspectives. Under certain assumptions, the new estimator can be viewed as both design-consistent and model-consistent, offering potential benefits from both perspectives. We demonstrate our estimator's performance using both real and simulated data, comparing it with existing design-based and model-based estimators.
SCIENCE
arxiv.org

GenGNN: A Generic FPGA Framework for Graph Neural Network Acceleration

Graph neural networks (GNNs) have recently exploded in popularity thanks to their broad applicability to ubiquitous graph-related problems such as quantum chemistry, drug discovery, and high energy physics. However, meeting demand for novel GNN models and fast inference simultaneously is challenging because of the gap between the difficulty in developing efficient FPGA accelerators and the rapid pace of creation of new GNN models. Prior art focuses on the acceleration of specific classes of GNNs but lacks the generality to work across existing models or to extend to new and emerging GNN models. In this work, we propose a generic GNN acceleration framework using High-Level Synthesis (HLS), named GenGNN, with two-fold goals. First, we aim to deliver ultra-fast GNN inference without any graph pre-processing for real-time requirements. Second, we aim to support a diverse set of GNN models with the extensibility to flexibly adapt to new models. The framework features an optimized message-passing structure applicable to all models, combined with a rich library of model-specific components. We verify our implementation on-board on the Xilinx Alveo U50 FPGA and observe a speed-up of up to 25x against CPU (6226R) baseline and 13x against GPU (A6000) baseline. Our HLS code will be open-source on GitHub upon acceptance.
CODING & PROGRAMMING