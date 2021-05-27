

Tracking Without Re-recognition in Humans and Machines

By Drew Linsley, Girik Malik, Junkyung Kim, Lakshmi N Govindarajan, Ennio Mingolla, Thomas Serre
arxiv.org
 22 days ago

Imagine trying to track one particular fruit fly in a swarm of hundreds. Higher biological visual systems have evolved to track moving objects by relying on both appearance and motion features. We investigate whether state-of-the-art deep neural networks for visual tracking are capable of the same. To this end, we introduce PathTracker, a synthetic visual challenge that asks human observers and machines to track a target object in the midst of identical-looking "distractor" objects. While humans effortlessly learn PathTracker and generalize to systematic variations in task design, state-of-the-art deep networks struggle. To address this limitation, we identify and model circuit mechanisms in biological brains that are implicated in tracking objects based on motion cues. When instantiated as a recurrent network, our circuit model learns to solve PathTracker with a robust visual strategy that rivals human performance and explains a significant proportion of human decision-making on the challenge. We also show that the success of this circuit model extends to object tracking in natural videos. Adding it to a transformer-based architecture for object tracking builds tolerance to visual nuisances that affect object appearance, resulting in new state-of-the-art performance on the large-scale TrackingNet object tracking challenge. Our work highlights the importance of building artificial vision models that can help us better understand human vision and improve computer vision.
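To make the core difficulty concrete, here is a minimal toy sketch in Python of tracking without re-recognition. It is not the authors' PathTracker generator or their circuit model; the abstract does not specify the stimulus design, so the function names (`make_trial`, `track_by_motion`), the random-walk motion, and all parameters are assumptions made purely for illustration. Every dot is identical, and each frame is an unordered list of positions, so the target can only be followed through motion continuity. Here that is done by linking the last known position to the nearest dot in the next frame.

```python
import random


def make_trial(n_dots=4, n_frames=5, spacing=20.0, step=1.0, seed=0):
    """Build one toy trial (hypothetical design, not the paper's generator).

    Dot 0 is the target. Dots start `spacing` apart on a line and take
    small random steps. Each frame lists positions in shuffled order, so
    nothing about a dot's appearance or index reveals which is the target.
    """
    rng = random.Random(seed)
    pos = [(i * spacing, 0.0) for i in range(n_dots)]
    start = pos[0]  # only the target's starting location is revealed
    frames = []
    for _ in range(n_frames):
        pos = [(x + rng.uniform(-step, step), y + rng.uniform(-step, step))
               for x, y in pos]
        shuffled = pos[:]
        rng.shuffle(shuffled)  # discard identity information
        frames.append(shuffled)
    return start, frames, pos[0]  # pos[0] is the target's true end point


def track_by_motion(start, frames):
    """Follow the target by linking to the nearest dot in each new frame."""
    current = start
    for dots in frames:
        current = min(
            dots,
            key=lambda p: (p[0] - current[0]) ** 2 + (p[1] - current[1]) ** 2,
        )
    return current


if __name__ == "__main__":
    start, frames, true_end = make_trial()
    print("tracked correctly:", track_by_motion(start, frames) == true_end)
```

With the default settings the dots stay far apart relative to their step size, so nearest-neighbour linking cannot confuse them. Shrinking `spacing` or lengthening the trial lets paths approach and cross, at which point this naive strategy starts to fail. That regime is the one in which a tracking challenge built from identical objects becomes hard.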
