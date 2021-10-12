
Continuous Conditional Random Field Convolution for Point Cloud Segmentation

By Fei Yang, Franck Davoine, Huan Wang, Zhong Jin
 10 days ago

Point cloud segmentation is the foundation of 3D environmental perception for modern intelligent systems. For this task, as for image segmentation, conditional random fields (CRFs) are usually formulated as discrete models in label space to encourage label consistency, which in effect makes them a postprocessing step. In this paper, we reconsider the

Business Insider

Facebook is working on AI tech that will monitor your every move

Facebook envisions a future where smartglasses "become as useful in everyday life as smartphones," the company said in a new blog post. In order to achieve that future, such devices will require powerful AI software that can read and respond to the world around the headset's user. And the only way to train AI to see and hear the world like humans do is for it to experience the world like we do: from a first-person perspective.
Explainability-Aware One Point Attack for Point Cloud Neural Networks

With the introduction of neural networks for point clouds, deep learning has started to shine in the field of 3D object recognition, while researchers have shown increased interest in investigating the reliability of point cloud networks by fooling them with perturbed instances. However, most studies focus on imperceptibility or surface consistency, with humans perceiving no perturbations on the adversarial examples. This work proposes two new attack methods, opa and cta, which go in the opposite direction: we restrict the perturbation dimensions to a human cognizable range with the help of explainability methods, which enables the working principle or decision boundary of the models to be comprehensible through the observable perturbation magnitude. Our results show that the popular point cloud networks can be deceived with an almost 100% success rate by shifting only one point from the input instance. In addition, we attempt to provide a more persuasive viewpoint for comparing the robustness of point cloud models against adversarial attacks. We also show the interesting impact of different point attribution distributions on the adversarial robustness of point cloud networks. Finally, we discuss how our approaches facilitate the explainability study for point cloud networks. To the best of our knowledge, this is the first point-cloud-based adversarial approach concerning explainability. Our code is available at this https URL.
ScienceAlert

A Physicist Quantified The Amount of Information in The Entire Observable Universe

In attempts to understand the very nature of our reality, physicists sure have some mind-bending theories. Like what if information is a tangible and fundamental aspect of physical reality itself – alongside matter and energy? Or, alternatively, what if information is the fifth state of matter? Information is, after all, something all matter and energy measurably possess. The rules that govern their existence, like their mass, speed, or charge, are all bits of information they contain. So to allow experimental probing of such ideas, physicist Melvin Vopson from the University of Portsmouth in the UK estimated how much information a single elementary...
UniNet: Unified Architecture Search with Convolution, Transformer, and MLP

Recently, transformer and multi-layer perceptron (MLP) architectures have achieved impressive results on various vision tasks. A few works investigated manually combining those operators to design visual network architectures, and achieved satisfactory performance to some extent. In this paper, we propose to jointly search the optimal combination of convolution, transformer, and MLP for building a series of all-operator network architectures with high performance on visual tasks. We empirically identify that the widely-used strided convolution or pooling based down-sampling modules become the performance bottlenecks when the operators are combined to form a network. To better tackle the global context captured by the transformer and MLP operators, we propose two novel context-aware down-sampling modules, which can better adapt to the global information encoded by transformer and MLP operators. To this end, we jointly search all operators and down-sampling modules in a unified search space. Notably, our searched network UniNet (Unified Network) outperforms the state-of-the-art pure convolution-based architecture, EfficientNet, and the pure transformer-based architecture, Swin-Transformer, on multiple public visual benchmarks: ImageNet classification, COCO object detection, and ADE20K semantic segmentation.
Lightweight Convolutional Neural Networks By Hypercomplex Parameterization

Hypercomplex neural networks have been shown to reduce the overall number of parameters while ensuring valuable performance by leveraging the properties of Clifford algebras. Recently, hypercomplex linear layers have been further improved by involving efficient parameterized Kronecker products. In this paper, we define the parameterization of hypercomplex convolutional layers to develop lightweight and efficient large-scale convolutional models. Our method grasps the convolution rules and the filter organization directly from data without requiring a rigidly predefined domain structure to follow. The proposed approach is flexible enough to operate in any user-defined or tuned domain, from 1D to $n$D, regardless of whether the algebra rules are preset. Such malleability allows processing multidimensional inputs in their natural domain without annexing further dimensions, as is done instead in quaternion neural networks for 3D inputs like color images. As a result, the proposed method operates with $1/n$ of the free parameters of its analog in the real domain. We demonstrate the versatility of this approach across multiple domains of application by performing experiments on various image datasets as well as audio datasets, in which our method outperforms real and quaternion-valued counterparts.
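
The $1/n$ parameter saving mentioned above can be illustrated with a small sketch. It assumes the sum-of-Kronecker-products form commonly used for parameterized hypercomplex layers, $W = \sum_{i=1}^{n} A_i \otimes F_i$, applied here to a dense weight matrix for simplicity; in a convolutional layer the blocks $F_i$ would be filter tensors instead. The names `phm_weight` and `param_counts` are hypothetical and not taken from the paper.

```python
import numpy as np

def phm_weight(A, F):
    """Assemble W = sum_i kron(A[i], F[i]).

    A: (n, n, n) array; each A[i] is a small n x n matrix that plays the role
       of a learned "algebra rule" (an assumption of this sketch).
    F: (n, out_dim // n, in_dim // n) array of weight blocks.
    Returns a dense weight matrix of shape (out_dim, in_dim).
    """
    return sum(np.kron(A[i], F[i]) for i in range(A.shape[0]))

def param_counts(n, out_dim, in_dim):
    """Free parameters of a real dense layer vs. the Kronecker form."""
    real = out_dim * in_dim
    hyper = n**3 + n * (out_dim // n) * (in_dim // n)  # n^3 + out*in/n
    return real, hyper

rng = np.random.default_rng(0)
n, out_dim, in_dim = 4, 64, 128
A = rng.standard_normal((n, n, n))
F = rng.standard_normal((n, out_dim // n, in_dim // n))

W = phm_weight(A, F)                      # same shape as the real-valued weight
real, hyper = param_counts(n, out_dim, in_dim)
ratio = hyper / real                      # close to 1/n when n^3 is small
```

For layers much larger than $n^3$, the ratio approaches $1/n$ because the $n$ small matrices $A_i$ contribute a negligible number of parameters.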
Exact integrability conditions for cotangent vector fields

In Quantum Hydro-Dynamics the following problem is relevant: let $(\sqrt{\rho},\Lambda) \in H^1(\R^d) \times L^2(\R^d,\R^d)$ be a finite energy hydrodynamics state, i.e. $\Lambda = 0$ when $\rho = 0$ and \begin{equation*} E = \int_{\R^d} \frac{1}{2} \big| \nabla \sqrt{\rho} \big|^2 + \frac{1}{2} \Lambda^2 \mathcal L^d < \infty. \end{equation*} The question is under which conditions there exists a wave function $\psi \in H^1(\R^d,\C)$ such that \begin{equation*} \sqrt{\rho} = |\psi|, \quad J = \sqrt{\rho} \Lambda = \Im \big( \bar \psi \nabla \psi \big). \end{equation*} The second equation gives for $\psi = \sqrt{\rho} w$ smooth, $|w| = 1$, that $i \Lambda = \sqrt{\rho} \bar w \nabla w$.
Two-level Group Convolution

Group convolution has been widely used to reduce the computation time of convolution, which takes most of the training time of convolutional neural networks. However, it is well known that a large number of groups significantly reduces the performance of group convolution. In this paper, we propose a new convolution methodology called "two-level" group convolution that is robust with respect to an increase in the number of groups and suitable for multi-GPU parallel computation. We first observe that group convolution can be interpreted as a one-level block Jacobi approximation of the standard convolution, a popular notion in the field of numerical analysis. In numerical analysis, there have been numerous studies on the two-level method, which introduces an intergroup structure that resolves the performance degradation issue without disturbing parallel computation. Motivated by these, we introduce a coarse-level structure which promotes intergroup communication without being a bottleneck in the group convolution. We show that all the additional work induced by the coarse-level structure can be efficiently processed in a distributed memory system. Numerical results that verify the robustness of the proposed method with respect to the number of groups are presented. Moreover, we compare the proposed method to various approaches for group convolution in order to highlight the superiority of the proposed method in terms of execution time, memory efficiency, and performance.
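
The block Jacobi reading of group convolution can be checked numerically. The sketch below is an illustration only: it uses a 1x1 (pointwise) convolution so that the layer reduces to a channel-mixing matrix, and it shows that group convolution gives the same output as a standard convolution whose off-diagonal channel blocks are set to zero. It does not implement the paper's coarse-level structure, and all function names are hypothetical.

```python
import numpy as np

def pointwise_conv(x, W):
    """Standard 1x1 convolution. x: (C_in, H, W_img), W: (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", W, x)

def group_pointwise_conv(x, W, groups):
    """1x1 group convolution: group g only mixes its own slice of channels."""
    c_out, c_in = W.shape
    go, gi = c_out // groups, c_in // groups
    outs = []
    for g in range(groups):
        block = W[g * go:(g + 1) * go, g * gi:(g + 1) * gi]
        outs.append(pointwise_conv(x[g * gi:(g + 1) * gi], block))
    return np.concatenate(outs, axis=0)

def block_diagonal_part(W, groups):
    """Keep the diagonal channel blocks of W and zero the rest (block Jacobi)."""
    c_out, c_in = W.shape
    go, gi = c_out // groups, c_in // groups
    D = np.zeros_like(W)
    for g in range(groups):
        rows = slice(g * go, (g + 1) * go)
        cols = slice(g * gi, (g + 1) * gi)
        D[rows, cols] = W[rows, cols]
    return D

rng = np.random.default_rng(0)
groups, c_in, c_out = 4, 16, 32
x = rng.standard_normal((c_in, 8, 8))
W = rng.standard_normal((c_out, c_in))

y_group = group_pointwise_conv(x, W, groups)
y_jacobi = pointwise_conv(x, block_diagonal_part(W, groups))
same = np.allclose(y_group, y_jacobi)
kept_fraction = np.count_nonzero(block_diagonal_part(W, groups)) / W.size
```

Only a fraction 1/groups of the weights survives, which is the source of both the speedup and the lost intergroup communication that the abstract describes.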
Randomized Extended Kaczmarz is a Limit Point of Sketch-and-Project

The sketch-and-project (SAP) framework for solving systems of linear equations has unified the theory behind popular projective iterative methods such as randomized Kaczmarz, randomized coordinate descent, and variants thereof. We show that the randomized extended Kaczmarz (REK) method - so far not shown to lie within this framework - cannot be formulated as a SAP method, a surprising result as it is of a very similar flavor. We show, in fact, that REK may instead be recovered as a limit point of a particular family of SAP methods. We provide an extensive theoretical analysis of said family, including convergence guarantees and further connections to REK. We follow this with an array of experiments demonstrating these methods and their connections in practice.
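
For orientation, here is a minimal sketch of plain randomized Kaczmarz, one of the methods that the abstract says does fall within the SAP framework. It is not the REK method and not the SAP family analyzed in the paper. Rows are sampled with probability proportional to their squared norms, a common choice that is assumed here; the function name and the test problem are hypothetical.

```python
import numpy as np

def randomized_kaczmarz(A, b, iters=5000, seed=0):
    """Solve a consistent system Ax = b by random row projections."""
    rng = np.random.default_rng(seed)
    row_norms_sq = np.sum(A**2, axis=1)
    probs = row_norms_sq / row_norms_sq.sum()   # sample rows by squared norm
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        i = rng.choice(A.shape[0], p=probs)
        # Project x onto the hyperplane {z : A[i] @ z = b[i]}.
        x += (b[i] - A[i] @ x) / row_norms_sq[i] * A[i]
    return x

# Small consistent overdetermined system with a known solution.
rng = np.random.default_rng(1)
A = rng.standard_normal((50, 10))
x_true = rng.standard_normal(10)
b = A @ x_true

x_hat = randomized_kaczmarz(A, b)
error = np.linalg.norm(x_hat - x_true)
```

Each step is a projection onto the solution set of a single sampled equation, which is the "sketch" (pick a row) and "project" (move to the nearest point satisfying it) pattern in its simplest form.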
Nature.com

Critical heat flux enhancement in microgravity conditions coupling microstructured surfaces and electrostatic field

We run pool boiling experiments with a dielectric fluid (FC-72) on Earth and on board an ESA parabolic flight aircraft able to cancel the effects of gravity, testing both highly wetting microstructured surfaces and plain surfaces and applying an external electric field that creates gravity-mimicking body forces. Our results reveal that microstructured surfaces, known to enhance the critical heat flux on Earth, are also useful in microgravity. An enhancement of the microgravity critical heat flux on a plain surface can also be obtained using the electric field. However, the best boiling performance is achieved when these techniques are used together. The effects created by microstructured surfaces and electric fields are synergistic. They enhance the critical heat flux in microgravity conditions up to 257 kW/m², which is even higher than the value measured on Earth on a plain surface (i.e., 168 kW/m²). These results demonstrate the potential of this synergistic approach toward very compact and efficient two-phase heat transfer systems for microgravity applications.
Long range order for random field Ising and Potts models

We present a new and simple proof for the classic results of Imbrie (1985) and Bricmont-Kupiainen (1988) that for the random field Ising model in dimension three and above there is long range order at low temperatures with presence of weak disorder. With the same method, we obtain a couple of new results: (1) we prove that long range order exists for the random field Potts model at low temperatures with presence of weak disorder in dimension three and above; (2) we obtain a lower bound on the correlation length for the random field Ising model at low temperatures in dimension two (which matches the upper bound in Ding-Wirth (2020)). Our proof is based on an extension of the Peierls argument with inputs from Chalker (1983), Fisher-Fröhlich-Spencer (1984), Ding-Wirth (2020) and Talagrand's majorizing measure theory (1980s) (and in particular, our proof does not involve the renormalization group theory).
Unsupervised Representation Learning for 3D Point Cloud Data

Though a number of point cloud learning methods have been proposed to handle unordered points, most of them are supervised and require labels for training. By contrast, unsupervised learning of point cloud data has received much less attention to date. In this paper, we propose a simple yet effective approach for unsupervised point cloud learning. In particular, we identify a very useful transformation which generates a good contrastive version of an original point cloud; the original and its transformed version form a pair. After the pair passes through a shared encoder and a shared head network, the consistency between the output representations is maximized by introducing two variants of contrastive losses, which respectively facilitate downstream classification and segmentation. To demonstrate the efficacy of our method, we conduct experiments on three downstream tasks: 3D object classification (on ModelNet40 and ModelNet10), shape part segmentation (on the ShapeNet Part dataset), and scene segmentation (on S3DIS). Comprehensive results show that our unsupervised contrastive representation learning enables impressive outcomes in object classification and semantic segmentation. It generally outperforms current unsupervised methods, and even achieves comparable performance to supervised methods. Our source codes will be made publicly available.
TheConversationAU

Facebook wants AI to find your keys and understand your conversations

Facebook has announced a research project that aims to push the “frontier of first-person perception”, and in the process help you remember where you left your keys. The Ego4D project provides a huge collection of first-person video and related data, plus a set of challenges for researchers to teach computers to understand the data and gather useful information from it. In September, the social media giant launched a line of “smart glasses” called Ray-Ban Stories, which carry a digital camera and other features. Much like the Google Glass project, which met mixed reviews in 2013, this one has prompted complaints of...
jaxenter.com

“Data continues to pile up at the edge, in datacenters and in clouds”

What are some of the most pressing problems with searching unstructured data today? We talked with Krishna Subramanian, President and COO of Komprise about new “deep analytics” capabilities for your data management software and best practices around searching/analyzing unstructured data. JAXenter: Today you announced new “deep analytics” capabilities for your...
Decomposing Convolutional Neural Networks into Reusable and Replaceable Modules

Training from scratch is the most common way to build a Convolutional Neural Network (CNN) based model. What if we can build new CNN models by reusing parts from previously built CNN models? What if we can improve a CNN model by replacing (possibly faulty) parts with other parts? In both cases, instead of training, can we identify the part responsible for each output class (module) in the model(s) and reuse or replace only the desired output classes to build a model? Prior work has proposed decomposing dense-based networks into modules (one for each output class) to enable reusability and replaceability in various scenarios. However, this work is limited to the dense layers and is based on the one-to-one relationship between the nodes in consecutive layers. Due to the shared architecture in the CNN model, prior work cannot be adapted directly. In this paper, we propose to decompose a CNN model used for image classification problems into modules for each output class. These modules can further be reused or replaced to build a new model. We have evaluated our approach with CIFAR-10, CIFAR-100, and ImageNet tiny datasets with three variations of ResNet models and found that enabling decomposition comes with a small cost (2.38% and 0.81% for top-1 and top-5 accuracy, respectively). Also, building a model by reusing or replacing modules can be done with a 2.3% and 0.5% average loss of accuracy. Furthermore, reusing and replacing these modules reduces CO2e emission by ~37 times compared to training the model from scratch.
Conditioned local limit theorems for random walks on the real line

Consider a random walk $S_n=\sum_{i=1}^n X_i$ with independent and identically distributed real-valued increments $X_i$ of zero mean and finite variance. Assume that $X_i$ is non-lattice and has a moment of order $2+\delta$. For any $x\geq 0$, let $\tau_x = \inf \left\{ k\geq 1: x+S_{k} < 0 \right\}$ be the first time when the random walk $x+S_n$ leaves the half-line $[0,\infty)$. We study the asymptotic behavior of the probability $\mathbb{P} (\tau_x >n)$ and that of the expectation $\mathbb{E} \left( f(x + S_n ), \tau_x > n \right)$ for a large class of target functions $f$ and various values of $x$, $y$ possibly depending on $n$. This general setting implies limit theorems for the joint distribution $\mathbb{P} \left( x + S_n \in y+ [0, \Delta], \tau_x > n \right)$ where $\Delta>0$ may also depend on $n$. In particular, the case of moderate deviations $y=\sigma \sqrt{q n\log n}$ is considered. We also deduce some new asymptotics for random walks with drift and give explicit constants in the asymptotic of the probability $\mathbb{P} (\tau_x =n)$. For the proofs we establish new conditioned integral limit theorems with precise error terms.
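
To make the quantity $\mathbb{P}(\tau_x > n)$ concrete, the following sketch estimates it by Monte Carlo simulation. Standard Gaussian increments are an assumption of this example (they have zero mean, finite variance, and are non-lattice); the function name is hypothetical. For $x = 0$ and symmetric continuous increments, the classical Sparre Andersen identity gives the exact value $\binom{2n}{n} 4^{-n}$, which behaves like $1/\sqrt{\pi n}$ for large $n$ and serves as a sanity check.

```python
import math
import numpy as np

def survival_probability(x, n, paths=50_000, seed=0):
    """Monte Carlo estimate of P(tau_x > n) for Gaussian increments.

    tau_x > n means x + S_k >= 0 for every k = 1, ..., n.
    """
    rng = np.random.default_rng(seed)
    steps = rng.standard_normal((paths, n))      # i.i.d. increments X_i
    walks = x + np.cumsum(steps, axis=1)         # x + S_1, ..., x + S_n
    return float(np.mean(walks.min(axis=1) >= 0.0))

n = 100
estimate = survival_probability(0.0, n)
exact = math.comb(2 * n, n) / 4**n               # Sparre Andersen, x = 0
estimate_from_2 = survival_probability(2.0, n)   # a larger start survives more often
```

The simulation only reproduces the leading behavior; the conditioned local limit theorems in the abstract concern much finer information, such as the distribution of $x + S_n$ on the event $\tau_x > n$.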
EditVAE: Unsupervised Part-Aware Controllable 3D Point Cloud Shape Generation

This paper tackles the problem of parts-aware point cloud generation. Unlike existing works which require the point cloud to be segmented into parts a priori, our parts-aware editing and generation is performed in an unsupervised manner. We achieve this with a simple modification of the Variational Auto-Encoder which yields a joint model of the point cloud itself along with a schematic representation of it as a combination of shape primitives. In particular, we introduce a latent representation of the point cloud which can be decomposed into a disentangled representation for each part of the shape. These parts are in turn disentangled into both a shape primitive and a point cloud representation, along with a standardising transformation to a canonical coordinate system. The dependencies between our standardising transformations preserve the spatial dependencies between the parts in a manner which allows meaningful parts-aware point cloud generation and shape editing. In addition to the flexibility afforded by our disentangled representation, the inductive bias introduced by our joint modelling approach yields state-of-the-art experimental results on the ShapeNet dataset.
FlexConv: Continuous Kernel Convolutions with Differentiable Kernel Sizes

David W. Romero, Robert-Jan Bruintjes, Jakub M. Tomczak, Erik J. Bekkers, Mark Hoogendoorn, Jan C. van Gemert. When designing Convolutional Neural Networks (CNNs), one must select the size of the convolutional kernels before training. Recent works show CNNs benefit from different kernel sizes at different layers, but exploring all possible combinations is unfeasible in practice. A more efficient approach is to learn the kernel size during training. However, existing works that learn the kernel size have a limited bandwidth. These approaches scale kernels by dilation, and thus the detail they can describe is limited. In this work, we propose FlexConv, a novel convolutional operation with which high bandwidth convolutional kernels of learnable kernel size can be learned at a fixed parameter cost. FlexNets model long-term dependencies without the use of pooling, achieve state-of-the-art performance on several sequential datasets, outperform recent works with learned kernel sizes, and are competitive with much deeper ResNets on image benchmark datasets. Additionally, FlexNets can be deployed at higher resolutions than those seen during training. To avoid aliasing, we propose a novel kernel parameterization with which the frequency of the kernels can be analytically controlled. Our novel kernel parameterization shows higher descriptive power and faster convergence speed than existing parameterizations. This leads to important improvements in classification accuracy.
Convolutional Neural Networks Are Not Invariant to Translation, but They Can Learn to Be

When seeing a new object, humans can immediately recognize it across different retinal locations: the internal object representation is invariant to translation. It is commonly believed that Convolutional Neural Networks (CNNs) are architecturally invariant to translation thanks to the convolution and/or pooling operations they are endowed with. In fact, several studies have found that these networks systematically fail to recognise new objects at untrained locations. In this work, we test a wide variety of CNN architectures, showing that, apart from DenseNet-121, none of the models tested was architecturally invariant to translation. Nevertheless, all of them could learn to be invariant to translation. We show how this can be achieved by pretraining on ImageNet, and it is sometimes possible with much simpler data sets when all the items are fully translated across the input canvas. At the same time, this invariance can be disrupted by further training due to catastrophic forgetting/interference. These experiments show how pretraining a network on an environment with the right 'latent' characteristics (a more naturalistic environment) can result in the network learning deep perceptual rules which would dramatically improve subsequent generalization.
PointAcc: Efficient Point Cloud Accelerator

Deep learning on point clouds plays a vital role in a wide range of applications such as autonomous driving and AR/VR. These applications interact with people in real-time on edge devices and thus require low latency and low energy. Compared to projecting the point cloud to 2D space, directly processing the 3D point cloud yields higher accuracy and lower #MACs. However, the extremely sparse nature of point cloud poses challenges to hardware acceleration. For example, we need to explicitly determine the nonzero outputs and search for the nonzero neighbors (mapping operation), which is unsupported in existing accelerators. Furthermore, explicit gather and scatter of sparse features are required, resulting in large data movement overhead. In this paper, we comprehensively analyze the performance bottleneck of modern point cloud networks on CPU/GPU/TPU. To address the challenges, we then present PointAcc, a novel point cloud deep learning accelerator. PointAcc maps diverse mapping operations onto one versatile ranking-based kernel, streams the sparse computation with configurable caching, and temporally fuses consecutive dense layers to reduce the memory footprint. Evaluated on 8 point cloud models across 4 applications, PointAcc achieves 3.7X speedup and 22X energy savings over RTX 2080Ti GPU. Co-designed with light-weight neural networks, PointAcc rivals the prior accelerator Mesorasi by 100X speedup with 9.1% higher accuracy running segmentation on the S3DIS dataset. PointAcc paves the way for efficient point cloud recognition.
