Selected Publications

In this work, we investigate a novel perspective of Code annotation for Code retrieval (hence called “CoaCor”), where a code annotation model is trained to generate a natural language annotation that can represent the semantic meaning of a given code snippet and can be leveraged by a code retrieval model to better distinguish relevant code snippets from others. To this end, we propose an effective framework based on reinforcement learning, which explicitly encourages the code annotation model to generate annotations that can be used for the retrieval task. Through extensive experiments, we show that code annotations generated by our framework are much more detailed and more useful for code retrieval, and they can further improve the performance of existing code retrieval models significantly.
The Web Conference (WWW), 2019

Experimental Study of StaQC vs other datasets on Code Summarization
In Deep Learning Day, KDD (Spotlight Presentation - Top 5%), 2018

PyTorch tutorial for Bi-directional LSTM-CNN-CRF Named Entity Recognizer
Presented at ICML (MLTRAIN), 2018

Work Experience

  • Graduate Research Assistant, Nationwide Center for Advanced Customer Insights (Fall 2018 - Present)

    Solving critical problems in the insurance industry using machine learning.

  • Graduate Student Researcher, advised by Dr. Huan Sun (Spring 2018 - Present)

    Currently exploring strategies to use distant supervision techniques to boost the performance of Code Summarization models.

  • Deep Learning Intern, The Climate Corporation (May 2018 - July 2018)

    As part of the Seed and Placement team at The Climate Corporation, I worked on the development of a weather-modelling approach for seeding rate prediction.

    • Developed a deep unsupervised architecture to generate features from multivariate time-series weather data.
    • Designed a transfer learning approach that uses these weather features in a supervised machine learning algorithm to predict the optimal seeding rate.
    • Developed features from soil composition and hybrids, and studied their interaction with weather. The model will be deployed as part of the Climate FieldView application.
  • Natural Language Processing Engineer, Citi Group (July 2016 - June 2017)

    Worked on the development of a chatbot for real-time financial trading at Citi Group. As part of the NLP team, I performed the following tasks:

    • Developed models for information retrieval from financial trade chats using NLP and machine learning.
    • Developed regular expressions to capture financial entities from unstructured financial text data.
    • Developed and tuned classification models to distinguish between different financial entities (swaps vs. bonds).
    • Developed a Named Entity Recognizer to detect entities in a financial chat.
    • Developed a deep learning model that classifies financial vs. social texts, and a second deep learning model that extracts ticker information from unstructured text.
    • Improved performance of the reference data mapper using Bloom filters.

  • Application Developer, Citi Group (July 2015 - June 2016)

    Worked on the development of a Java module for automated runbook generation.

    • The runbook module reduced the man-hours required for runbook creation by 40%.
    • Designed modules for runbook version control, approval workflow, and notification alerts.
    • The runbook module centralized data storage, providing scope for future data analysis on a large data corpus.
    • Developed the module using core Java, JDBC, Google Web Toolkit (GWT), Hibernate, and SQL.
    • Performed a POC on blockchain for reconciliation, demonstrating that blockchain helps avoid the expensive process of end-of-day internal reconciliations.
    • Presented “Blockchain and Its Use Cases in Financial Institutions” at the Citi Annual Town Hall 2016.

Recent & Upcoming Talks

Dual Learning for Machine Translation
Mar 12, 2018 2:00 PM
Task-Oriented Query Reformulation with Reinforcement Learning
Feb 12, 2018 2:00 PM

Projects

Study of Sequence Labeling Architectures in NLP

Studying different approaches and designs for sequence labeling tasks in NLP

Research Project - Time Series

Advisor: Prof. Jihun Hamm | Spring 2018

Video Classification using NLP

Classifying videos based on the actions performed in them, using image captioning

Detection of Quora Question duplicates

Machine learning project to identify question pairs that have the same intent

Invasive Species Monitoring using Computer Vision

Classification of Invasive Species using Computer Vision

Image Classification using Deep Learning

CIFAR image classification using convolutional neural networks

Sentiment Analysis of User Posts

Carried out sentiment analysis to understand the opinion polarity of posts on a Facebook page

Contact

  • Bay Area