So never lose an opportunity of urging a practical beginning, however small,
for it is wonderful how often in such matters the mustard-seed germinates and roots itself.
The input tensors in each layer of Deep Neural Network (DNN) models are often partitioned/tiled to fit in the limited on-chip memory of accelerators.
Studies show that efficient tiling schedules (commonly referred to as mappings) for a given accelerator and DNN model reduce data movement between the accelerator and the different levels of the memory hierarchy, improving performance.
However, finding a layer-wise optimal mapping for a target architecture within a given energy and latency envelope is an open problem due to the huge mapping search space.
In this paper, we propose a Reinforcement Learning (RL) based automated mapping approach to find optimal schedules of DNN layers for a given architecture model without violating the specified energy and latency constraints.
The learned policies easily adapt to a wide range of DNN models with different hardware configurations, facilitating transfer learning and reducing training time.
Experiments show that the proposed work improves latency and energy consumption by an average of 21.5% and 15.6%, respectively, compared to the state-of-the-art genetic algorithm-based GAMMA approach for a wide range of DNN models running on the NVIDIA Deep Learning Accelerator (NVDLA).
Training with RL-based transfer learning is 15× faster than GAMMA.
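As a rough sketch of how such a search can be cast as an RL problem (illustrative only, not the formulation used in the paper), each episode chooses a tile size for every loop dimension of a layer, and the reward penalizes the estimated latency/energy cost of the running mapping:

    # Minimal sketch: mapping search as an episodic RL problem. Tile sizes,
    # loop dimensions, and the cost model below are hypothetical placeholders.
    import random

    TILE_CHOICES = [1, 2, 4, 8, 16, 32]        # candidate tile sizes per dimension
    LOOP_DIMS = ["N", "C", "K", "H", "W"]      # layer loop dimensions to tile

    def estimated_cost(tiling):
        # Stand-in cost model; a real mapper would query an analytical
        # latency/energy model for the target accelerator.
        return sum(abs(t - 8) for t in tiling.values())

    def run_episode(policy):
        tiling, total_reward = {}, 0.0
        for dim in LOOP_DIMS:                  # one action per loop dimension
            action = policy(dim, tiling)       # choose an index into TILE_CHOICES
            tiling[dim] = TILE_CHOICES[action]
            total_reward -= estimated_cost(tiling)   # reward = negative running cost
        return tiling, total_reward

    random_policy = lambda dim, tiling: random.randrange(len(TILE_CHOICES))
    print(run_episode(random_policy))

A trained policy would replace the random one, and constraint violations (energy or latency envelopes) could be handled by adding large penalty terms to the reward.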
Reinforcement Learning with Structured Actions and Policies
Deep Reinforcement Learning has been very successful in solving a variety of hard problems.
However, many RL architectures treat actions as coming from an unordered set or from a bounded interval.
It is often the case that the actions and policies have a non-trivial structure that can be exploited
for more efficient learning. This ranges from game playing settings where the same action is repeated multiple times,
to supply-chain problems where the action space has a combinatorial structure, to problems that require a
hierarchical decomposition to solve effectively. In this talk, I will present several scenarios in which taking advantage
of the structure leads to more efficient learning. In particular, I will talk about some of our recent work on action repetition,
actions that are related via a graph structure, ensemble policies,
and policies learnt through a combination of hierarchical planning and learning.
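As a rough sketch of one such structure, action repetition can be exposed as a factored action (a, k): the chosen primitive action is executed k consecutive times. The Gym-style step interface below is illustrative and not the formulation used in this work:

    # Minimal sketch, assuming a Gym-style env whose step() returns
    # (obs, reward, done, info): repeat the chosen primitive action k times.
    def step_with_repetition(env, action, repeat):
        total_reward, done, obs = 0.0, False, None
        for _ in range(repeat):
            obs, reward, done, info = env.step(action)
            total_reward += reward
            if done:                     # stop early if the episode ends
                break
        return obs, total_reward, done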
Many real-life applications of machine learning involve very high volumes of data, and annotating such huge data is practically impossible. Many times, we get only partial or weak supervision for such data. Weak supervision can be of various types, and it becomes very challenging for traditional algorithms to work with such weakly supervised data. This calls for new algorithms that can handle weakly supervised data and still achieve the desired accuracy. We will discuss new algorithms which can deal with such weak supervision situations.
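As a toy illustration of one common form of weak supervision, several noisy labelling functions can vote on each example and the aggregated label stands in for the missing ground truth; the names below are illustrative and do not correspond to a specific algorithm from the talk:

    # Minimal sketch: aggregate noisy labelling functions by majority vote.
    from collections import Counter

    def majority_label(example, labelling_functions):
        votes = [lf(example) for lf in labelling_functions]
        votes = [v for v in votes if v is not None]   # a function may abstain
        return Counter(votes).most_common(1)[0][0] if votes else None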
Frauds and Machine Learning Techniques for Their Detection
Almost all government functions and companies in all business domains suffer from frauds of various kinds. Organizations collect huge amounts of data covering their transactions, operations, stores, suppliers, employees and financial activities. Instances of fraud tend to leave tell-tale traces in these business databases. Given the size and complexity of these databases, and the constant emergence of newer methods of fraud, it has become imperative to design and use machine learning algorithms for detecting characteristic patterns and signs of fraud. Since instances of known frauds are few, we need unsupervised techniques for their detection, which efficiently use unlabelled data to detect even new and unknown types of fraud. In part (1) of this talk, we cover the basics of frauds and explain some anomaly detection techniques that can be used for fraud detection. In part (2), we present ML algorithms to detect (a) collusion in stock market trading, and (b) tax evasion.
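As a rough illustration of the unsupervised setting, an off-the-shelf detector such as Isolation Forest can rank unlabelled records by how anomalous they look; the features below are synthetic stand-ins, and the talk's own algorithms are not reproduced here:

    # Minimal sketch: unsupervised anomaly scoring of unlabelled records.
    import numpy as np
    from sklearn.ensemble import IsolationForest

    X = np.random.rand(1000, 5)                # stand-in for transaction features
    detector = IsolationForest(contamination=0.01, random_state=0).fit(X)
    scores = detector.decision_function(X)     # lower score = more anomalous
    suspects = np.argsort(scores)[:10]         # flag the 10 most anomalous records
    print(suspects)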
Vignettes of AI research: Foundations of Unsupervised Learning and Social Robotics
Artificial Intelligence (AI) based systems have made rapid progress in the last decade, leading to revolutionary changes in several disciplines such as medical imaging and autonomous driving. However, most of today's AI systems are largely based on supervised learning, wherein the underlying machines are trained on inputs labelled by humans. The ability to learn from the environment without labels, often called unsupervised learning, is now considered the next big challenge in AI. In this talk we will focus on a few foundational questions of unsupervised learning where many fundamental challenges remain. Some are statistical in nature, such as model complexity and sample complexity, while some are algorithmic, including the challenge of provably learning the parameters of a model from a finite amount of data. I will present several recent results, derived from ideas drawn from computational geometry and statistical mechanics, which further the theory behind several unsupervised learning models. Finally, I will consider social robotics, a sub-field of robotics, as a medium for assisting special educators in providing therapy to children on the autism spectrum. Contrary to existing understanding, our results show that robotic toys can provide effective intervention.
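As a small illustration of the parameter-learning question, a Gaussian mixture can be fitted from a finite sample and its recovered parameters inspected; this off-the-shelf estimator is only a stand-in for the theoretical results discussed:

    # Minimal sketch: estimating mixture parameters from a finite sample.
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(-2, 1, (500, 2)), rng.normal(3, 1, (500, 2))])
    gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
    print(gmm.means_)    # estimates of the true component means (-2, -2) and (3, 3)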
AI is everywhere. It's not just powering applications like smart assistants, machine translation, and automated driving,
it's also giving engineers and scientists a set of techniques for tackling common tasks in new ways.
AI is transforming engineering in nearly every industry and application area.
And yet, many organisations and individual engineers and scientists are deterred by what they see as
challenges of implementing AI:
Belief that to do AI, you need to be an expert in data science
Concern that developing an AI system is time-consuming and expensive
Lack of access to good quality, labeled data
The cost and complexities of integrating AI into existing algorithms and systems
Success with AI requires more than training an AI model,
especially in AI-driven systems that make decisions and take action.
A solid AI workflow involves preparing the data, creating a model, designing the system on which the model
will run, and deploying to hardware or enterprise systems.
In this talk, I will give an overview of how AI can be approached in a pragmatic manner and how systems that incorporate AI can be developed, tested, deployed,
and monitored using consistent and explainable workflows.
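A minimal sketch of such a workflow, using generic open-source tools rather than any specific product (dataset, model, and file names below are illustrative), might look like this:

    # Minimal sketch: prepare data, create a model, evaluate, export for deployment.
    import joblib
    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split
    from sklearn.ensemble import RandomForestClassifier

    X, y = load_digits(return_X_y=True)                        # 1. prepare the data
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    model = RandomForestClassifier().fit(X_tr, y_tr)           # 2. create a model
    print("held-out accuracy:", model.score(X_te, y_te))       # 3. test the system
    joblib.dump(model, "model.joblib")                         # 4. export a deployable artefact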
Data-driven decision frameworks for COVID-19 response - A personal journey
The COVID-19 pandemic turned a few of us into accidental epidemiologists. In this talk, I will speak about our journey on the R&D of data-driven decision frameworks for COVID-19 response. I will cover three aspects -- simulation models, our interactions with policy makers, and efforts in data sharing.
I will touch upon the city-scale agent-based simulator, its use in modelling the impact of the Mumbai locals,
the Campus Rakshak (campus-scale) simulator, the work-place readiness self-assessment tool,
the swabs2labs tool for efficient use of lab capacity, the Karnataka serosurveys,
their use in forecasting and assessing the heterogeneity of COVID-19 spread across the districts,
our struggle with variant modelling, the Rt calculator for Indian states and the districts of Karnataka,
the early warning system, our recent collaborative effort to keep alive the efforts of the covid19india.org volunteers,
and the forecast hub -- or how we learnt to stop pushing our model and embrace half-a-dozen.
We hope a few of these tools will survive to help us in the future.
Multi-label classification (MLC) is a generalization of the traditional single-label/multi-class classification. This talk will be on the what,
why and how of multi-label classification. To understand the basics, we will begin with "what" multi-label data is.
Then, we will discuss "why" it is needed, focussing on some real-life application areas where it is used. Next,
we will move on to "how" multi-label classification is performed, and highlight some popular multi-label classification models in the literature.
Finally, some datasets and metrics relevant to multi-label classification will be discussed.
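As a concrete, minimal example of the "how", the simplest family of methods, binary relevance (one binary classifier per label), can be sketched as follows on a synthetic multi-label dataset:

    # Minimal sketch: binary relevance for multi-label classification.
    from sklearn.datasets import make_multilabel_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.multioutput import MultiOutputClassifier

    X, Y = make_multilabel_classification(n_samples=200, n_labels=3, random_state=0)
    clf = MultiOutputClassifier(LogisticRegression(max_iter=1000)).fit(X, Y)
    print(clf.predict(X[:5]))    # each row is a binary vector over the label set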
Different Approaches to Natural Language Processing
Natural Language Processing (NLP) has evolved significantly over the last few decades and has a wide range of applications across industries.
Different applications use different NLP approaches to arrive at a solution. These approaches broadly span rule-based NLP,
classical NLP, statistical NLP, and deep learning based NLP. This talk will discuss the technicalities of these various NLP approaches.
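As a toy contrast between two of these approaches, a hand-written rule and a statistical bag-of-words classifier can both be applied to a tiny sentiment task (the data below is illustrative):

    # Minimal sketch: rule-based vs statistical NLP on a toy sentiment task.
    import re
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    texts  = ["great movie", "awful plot", "really great acting", "awful and boring"]
    labels = [1, 0, 1, 0]

    def rule_based(text):                       # rule-based: a hand-written pattern
        return 1 if re.search(r"\bgreat\b", text) else 0

    statistical = make_pipeline(CountVectorizer(), MultinomialNB()).fit(texts, labels)
    print(rule_based("a great film"), statistical.predict(["a great film"]))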
A tremendous amount of text content is available in the form of documents,
microblogs, scientific articles, etc., and it keeps growing exponentially over time with
the arrival of new data from multiple sources. To scan through such large volumes of data,
efficient text-mining techniques need to be developed. Summarization
techniques have become popular for extracting relevant information from huge amounts of data.
Moreover, developing supervised techniques requires huge amounts of labeled data, and
annotating data for supervised information extraction systems is time-consuming
and costly. In summarization, the aim is to generate compressed, relevant, and concise information
from the available data. Different facets of summarization, like document summarization,
figure summarization, microblog summarization, and multi-modal microblog summarization, will be
discussed in the talk. The task of summarization is posed as a multiobjective optimization problem
where multiple quality measures, like cohesion, readability, and anti-redundancy, are simultaneously
optimized. To compute these quality measures, different semantic similarity measures and
textual entailment concepts are utilized. Extensive experiments have verified that all our
proposed methods outperform many other state-of-the-art methods when tested on task-related
datasets.
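As a rough sketch of the multiobjective view (not the measures actually used in this work), a candidate extractive summary can be scored on crude stand-ins for cohesion and anti-redundancy, which an optimizer would then trade off:

    # Minimal sketch: scoring a candidate summary on two objectives at once.
    from itertools import combinations

    def jaccard(a, b):
        a, b = set(a.split()), set(b.split())
        return len(a & b) / len(a | b) if (a | b) else 0.0

    def objectives(summary):
        pairs = list(combinations(summary, 2))
        # cohesion: chosen sentences should be topically related (average overlap)
        cohesion = sum(jaccard(a, b) for a, b in pairs) / max(len(pairs), 1)
        # anti-redundancy: no pair should be a near-duplicate (worst-case overlap)
        anti_redundancy = 1.0 - max((jaccard(a, b) for a, b in pairs), default=0.0)
        return cohesion, anti_redundancy   # a multiobjective optimizer trades these off

    print(objectives(["text data keeps growing from many sources",
                      "summarization extracts relevant information from data"]))

Real systems would replace the word-overlap measure with semantic similarity or textual entailment scores, as described above.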
Graph Representation Learning in the Presence of Community Outliers
Graph representation learning has received significant interest in the machine learning community.
Different types of algorithms such as skip-gram based optimization, matrix factorization, deep-autoencoders and more recently,
graph neural networks have been proposed in the literature. Analysis of outliers in a graph is important as all real-life networks
contain outlier nodes. A (community) outlier in a graph is a node that violates the graph's overall community structure.
Recent studies have shown that outlier nodes can affect the embeddings of other regular nodes in a graph. Further,
their embeddings can get mixed easily with other regular nodes, making it difficult to detect them by post-processing.
So, it is crucial to reduce the effect of outlier nodes on the embeddings of other nodes in a graph and detect them in an
integrated way.
In this talk, we characterize different types of outliers present in an attributed graph.
We discuss some of our recently proposed approaches that integrate outlier detection and network embedding into a single
framework and hence minimize the effect of outliers on the embeddings of regular nodes in a graph. We would also present
experimental results to motivate the problem and show the usefulness of such integrated approaches.
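As a rough, self-contained illustration of the problem (not the proposed framework), one can embed nodes with a truncated SVD of the adjacency matrix, cluster the embeddings into communities, and score each node by how far its attributes lie from its community's mean; in an integrated approach these steps would instead inform one another:

    # Minimal sketch: SVD embeddings, community clustering, attribute-based outlier score.
    import numpy as np
    from sklearn.cluster import KMeans

    A = (np.random.rand(60, 60) > 0.9).astype(float)   # stand-in adjacency matrix
    A = np.maximum(A, A.T)                              # make it undirected
    X = np.random.rand(60, 8)                           # stand-in node attributes

    U, S, _ = np.linalg.svd(A)
    Z = U[:, :4] * S[:4]                                # 4-dimensional node embeddings
    communities = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(Z)

    outlier_score = np.array([
        np.linalg.norm(X[i] - X[communities == communities[i]].mean(axis=0))
        for i in range(len(X))
    ])
    print(np.argsort(-outlier_score)[:5])               # five most outlier-like nodes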
The last decade has seen rapid strides in Artificial Intelligence (AI) moving from being a fantasy to a reality that is a part of each one of our lives, embedded in various technologies.
A catalyst of this rapid uptake has been the enormous success of deep learning methods for addressing problems in various domains including computer vision,
natural language processing, and speech understanding. However, as AI makes its way into risk-sensitive and safety-critical applications such as healthcare,
aerospace and finance, it is essential for AI models to not only make predictions but also be able to explain their predictions,
and be robust to adversarial inputs. This talk will introduce the audience to this increasingly important area of explainable and robust AI,
as well as describe some of our recent research in this domain, especially the role of causality in explaining neural networks.
While existing methods for neural network attributions (for explanations) are largely statistical,
we propose a new attribution method for neural networks developed using first principles of causality (to the best of our knowledge, the first such method).
This talk will also include an overview of our other recent efforts in exploring causal inference towards explainable neural networks.
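For contrast, a standard statistical (gradient-based) attribution baseline can be sketched in a few lines of PyTorch; the causal attribution method discussed in the talk is different and is not reproduced here:

    # Minimal sketch: gradient-times-input saliency, a common statistical baseline.
    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 1))
    x = torch.randn(1, 10, requires_grad=True)
    model(x).sum().backward()                  # backpropagate to the input
    attribution = (x.grad * x).detach()        # per-feature saliency scores
    print(attribution)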