I am Shubham Jain, a Research Assistant and Ph.D. student at the Computational Privacy Group, Imperial College London, working with Dr. Yves-Alexandre de Montjoye. My work spans privacy, engineering and fairness.
Prior to this, I worked at Qure.ai as an AI Scientist and was a founding member of the team. I graduated in Computer Science from IIT Bombay in 2016.
I use this blog to record and share what I learn, good blog posts I come across, and my experiences. I like building things, reading, traveling and watching sports. You can get in touch with me through Twitter or email: shubhamjain0594[at]gmail[dot]com.
Publications
Complete list with number of citations on Google Scholar
- Jain, S., Bensaid, E. and de Montjoye, Y.A., 2019, May. UNVEIL: Capture and Visualise WiFi Data Leakages. In The World Wide Web Conference (pp. 3550-3554). ACM. Link
- Patravali, J., Jain, S. and Chilamkurthy, S., 2017, September. 2D-3D fully convolutional neural networks for cardiac MR segmentation. In International Workshop on Statistical Atlases and Computational Models of the Heart (pp. 130-139). Springer, Cham. Link
Projects
Testing the data crawlers
We deployed a website at this link to test the effectiveness of data crawlers.
UNVEIL: Capture and Visualise WiFi data leakages
Abstract
In the past few years, numerous privacy vulnerabilities have been discovered in the WiFi standards and their implementations for mobile devices. These vulnerabilities allow an attacker to collect large amounts of data on the device user, which could be used to infer sensitive information such as religion, gender, and sexual orientation. Solutions for these vulnerabilities are often hard to design and typically require many years to be widely adopted, leaving many devices at risk.
In this paper, we present UNVEIL - an interactive and extendable platform to demonstrate the consequences of these attacks. The platform performs passive and active attacks on smartphones to collect and analyze data leaked through WiFi and communicate the analysis results to users through simple and interactive visualizations.
The platform currently performs two attacks. First, it captures probe requests sent by nearby devices and combines them with public WiFi location databases to generate a map of locations previously visited by the device users. Second, it creates rogue access points with SSIDs of popular public WiFis (e.g. _Heathrow WiFi, Railways WiFi) and records the resulting internet traffic. This data is then analyzed and presented in a format that highlights the privacy leakage. The platform has been designed to be easily extendable to include more attacks and to be easily deployable in public spaces. We hope that UNVEIL will help raise public awareness of privacy risks of WiFi networks.
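The first attack above, matching probed SSIDs against a location database, can be illustrated with a minimal sketch. This is not the UNVEIL implementation: the function name `map_probes_to_locations`, the in-memory `SSID_LOCATIONS` table, the MAC addresses and the coordinates are all hypothetical and used only for illustration. A real platform would capture probe requests from a wireless interface and query a public WiFi location database.

```python
from collections import defaultdict

# Hypothetical SSID -> (latitude, longitude) table with illustrative
# coordinates; a real system would query a public WiFi location database.
SSID_LOCATIONS = {
    "_Heathrow WiFi": (51.4700, -0.4543),
    "CafeCorner": (51.5155, -0.1420),
}


def map_probes_to_locations(probes, ssid_locations):
    """Return, per device, the known locations of the SSIDs it probed for.

    `probes` is an iterable of (device_mac, ssid) pairs, as might be
    extracted from captured probe requests. SSIDs missing from the
    lookup table are ignored, and repeated locations are kept once.
    """
    visited = defaultdict(list)
    for device_mac, ssid in probes:
        location = ssid_locations.get(ssid)
        if location is not None and location not in visited[device_mac]:
            visited[device_mac].append(location)
    return dict(visited)


# Made-up probe records for two devices.
probes = [
    ("aa:bb:cc:00:00:01", "_Heathrow WiFi"),
    ("aa:bb:cc:00:00:01", "CafeCorner"),
    ("aa:bb:cc:00:00:02", "UnknownNet"),
    ("aa:bb:cc:00:00:01", "_Heathrow WiFi"),
]
locations_by_device = map_probes_to_locations(probes, SSID_LOCATIONS)
```

In this sketch a device that only probes for unknown SSIDs does not appear in the result, which mirrors the idea that only networks present in the location database contribute points to the map.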
OPAL Project
OPAL (for “Open Algorithms”) is a non-profit socio-technological innovation developed by a group of partners around the MIT Media Lab, Imperial College London, Orange, the World Economic Forum and Data-Pop Alliance, aiming to unlock the potential of private sector data for public good purposes by “sending the code to the data” in a safe, participatory, and sustainable manner. It is designed to provide a far better picture of human reality to official statisticians, policymakers, businesses, and citizens, while fostering inclusion and inputs of all on the kinds and uses of analysis performed on data about them.
Interpreting Neural Networks
Interpretability of neural networks is a major challenge and also an integral component of the Chest X-Rays diagnostic solution at Qure.ai. I developed an internal library that implements various papers on interpretability to generate heatmaps, and ensured that these algorithms are compatible with all the models being developed. Our paper on this work was presented at RSNA 2017, the largest radiology conference in the world, with 50k+ attendees. The paper received the Roadie 2017 award from auntminnie.com for the most popular abstract by page views. The work is well summarised in this blog post.
Another blog post, written by my team, introducing various visualization algorithms can be found here.
2D-3D fully convolutional neural networks for cardiac MR segmentation
Abstract
In this paper, we develop 2D and 3D segmentation pipelines for fully automated cardiac MR image segmentation using Deep Convolutional Neural Networks (CNN). Our models are trained end-to-end from scratch using the ACD Challenge 2017 dataset comprising 100 studies, each containing Cardiac MR images in End Diastole and End Systole phase. We show that both our segmentation models achieve near state-of-the-art performance scores in terms of distance metrics and have convincing accuracy in terms of clinical parameters. A comparative analysis is provided by introducing a novel dice loss function and its combination with cross entropy loss. By exploring different network structures and comprehensive experiments, we discuss several key insights to obtain optimal model performance, which is also central to the theme of this challenge.
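To make the idea of combining a dice loss with cross entropy concrete, here is a minimal sketch for a binary mask flattened to a list. It uses the standard soft Dice formulation, not the novel variant from the paper (which is not described on this page), and the function names and the `ce_weight` parameter are assumptions chosen for illustration.

```python
import math


def soft_dice_loss(probs, targets, eps=1e-6):
    """Standard soft Dice loss: 1 - 2|P.G| / (|P| + |G|)."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    total = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + eps) / (total + eps)


def binary_cross_entropy(probs, targets, eps=1e-7):
    """Mean binary cross entropy, with probabilities clamped away from 0 and 1."""
    losses = []
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)
        losses.append(-(t * math.log(p) + (1 - t) * math.log(1.0 - p)))
    return sum(losses) / len(losses)


def combined_loss(probs, targets, ce_weight=0.5):
    """Weighted sum of cross entropy and soft Dice (weighting is an assumption)."""
    ce = binary_cross_entropy(probs, targets)
    dice = soft_dice_loss(probs, targets)
    return ce_weight * ce + (1.0 - ce_weight) * dice


# Toy example: a four-pixel mask with a good and a bad prediction.
targets = [1.0, 0.0, 1.0, 0.0]
good = [0.9, 0.1, 0.8, 0.2]
bad = [0.1, 0.9, 0.2, 0.8]
good_loss = combined_loss(good, targets)
bad_loss = combined_loss(bad, targets)
```

The Dice term measures overlap over the whole mask, so it is less sensitive to class imbalance between foreground and background pixels, while the cross entropy term gives a per-pixel signal. In practice these would be computed on tensors in a deep learning framework rather than Python lists.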
Kaggle Ultrasound Nerve Segmentation Challenge
This was my first Kaggle competition and my first experience with deep learning. We finished 28th on the leaderboard out of 923 participants. Our solution was an ensemble of a modified U-Net and Fully Convolutional Networks for Semantic Segmentation. We released a tutorial on torchnet which can be used to get started with this competition.
Real-Time Air Quality Monitoring Network
I developed a low-cost air quality monitoring network of sensors for my undergraduate thesis. This project was funded by Development Impact Lab, UC Berkeley and was done under the guidance of Prof. Bhaskaran Raman, CSE Dept., IIT Bombay. I collaborated with Ph.D. students at UC Berkeley and Professors in Environmental Science and Geographic Information Systems at IIT Bombay across different phases of the project. During the course of the project, we built hardware and drivers to measure and transmit data, a backend server exposing secured APIs to record and serve data, and applications on top of these APIs that consume the generated data. Lastly, we designed validation experiments to compare our sensors against the standard monitors used by government agencies.
The project was presented to the then Hon. Minister of Environment, Mr. Prakash Javadekar, via the IIT Bombay Global Business Forum 2015. Presentations, reports and articles published in relation to the project are listed below:
- Enabling air quality analysis using Berkeley software - By K. Shankari, Amplab, UC Berkeley
- BRAS update: Air quality graphs and twitter feed - By K. Shankari, Amplab, UC Berkeley
- Project Report