Karan Desai

I am a fifth-year Computer Science PhD student at the University of Michigan, advised by Justin Johnson. I work in computer vision, and I study how natural language supervision can be used to tackle computer vision tasks. I am deeply passionate about curating high-quality datasets in a scalable and ethically responsible manner.

During my PhD, I interned twice at Meta AI: in summer 2021 with Laurens van der Maaten and Ishan Misra, and in summer 2022 with Rama Vedantam and Maximilian Nickel. Before joining UMich, I was a visiting scholar at the Georgia Institute of Technology, working with the labs of Devi Parikh and Dhruv Batra. I completed my undergraduate studies at the Indian Institute of Technology Roorkee in 2018, with a major in Electrical Engineering and a minor in Computer Science.

Feel free to say hi: kdexd at umich dot edu

Selected Publications

Hyperbolic Image-Text Representations
Karan Desai, Maximilian Nickel, Tanmay Rajpurohit, Justin Johnson, Ramakrishna Vedantam
ICML 2023 paper bibtex code
Learning Visual Representations via Language-Guided Sampling
Mohamed El Banani, Karan Desai, Justin Johnson
CVPR 2023 paper bibtex code
RedCaps: Web-curated image-text data created by the people, for the people
Karan Desai, Gaurav Kaul, Zubin Aysola, Justin Johnson
NeurIPS 2021 (Datasets and Benchmarks) paper bibtex code website
CASTing Your Model: Learning to Localize Improves Self-Supervised Representations
Ramprasaath R. Selvaraju*, Karan Desai*, Justin Johnson, Nikhil Naik
CVPR 2021 paper bibtex code blog
VirTex: Learning Visual Representations from Textual Annotations
Karan Desai and Justin Johnson
CVPR 2021 paper bibtex code website video
Probabilistic Neural-symbolic Models for Interpretable Visual Question Answering
Ramakrishna Vedantam, Karan Desai, Stefan Lee, Marcus Rohrbach, Dhruv Batra, Devi Parikh
ICML 2019 paper bibtex code website
nocaps: novel object captioning at scale
Harsh Agrawal*, Karan Desai*, Yufei Wang, Xinlei Chen, Rishabh Jain, Mark Johnson, Dhruv Batra, Devi Parikh, Stefan Lee, Peter Anderson
ICCV 2019 paper bibtex code website

Side Projects

(before grad school)

PyTorch implementation of the EMNLP 2017 paper "Natural Language Does Not Emerge 'Naturally' in Multi-Agent Dialog".
Trianglify is a highly customizable library for generating beautiful triangle art views on Android. It uses the Delaunay triangulation algorithm under the hood.
Yolog wraps vanilla git log to give a better display of the commit history graph.

First Projects

These are my humble beginnings; I try to keep them functional over the years!

My first neural network using NumPy (2015), a multi-layer perceptron classifier for MNIST. Back then, this repo made it onto the GitHub trending charts for almost two weeks.
My first GitHub repository (2015), a browser-based snake game implemented in JavaScript. The game still works on GitHub Pages!