Raunak Dey

I am a Physics Ph.D. candidate at the University of Maryland, College Park, advised by Prof. Joshua Weitz and mentored by Dr. Stephen Beckett. I work at the intersection of Bayesian machine learning, scientific AI, and inverse problems — building interpretable models that recover the hidden parameters, dynamics, and interaction structure of complex systems from noisy, indirect, time-series data.

I develop probabilistic and optimal methods for inference and decision-making under realistic constraints, across computational biology, health AI, and physical systems. I am currently an AI R&D intern at Elanco, building computer-vision systems for pharmaceutical research, and I maintain open-source scientific software (B2, InvODE, PHAMILY). My mission is building quantitative models for science.

Mathematical Tools
  • Bayesian Inference & MCMC
  • Deep Learning & Generative AI
  • Optimization & Optimal Control
  • Probabilistic ML & Uncertainty
  • Mechanistic ODE & Time-series Models
Application Domains
  • Computational Biology & Health AI
  • Computer Vision & Behavioral AI
  • Time-series Forecasting
  • Phage Therapy & Treatment Modeling
  • Stochastic Physics & Signal Processing
Impact Created
  • 8 Journal & Conference Papers
  • 1 Patent
  • 3 Manuscripts in Review
  • 8 Awards
  • 3 Open-source Tools
  • 3 Students Mentored

I am fortunate to have worked alongside excellent scientists and engineers at

  • Elanco Animal Health
  • University of Maryland
  • Georgia Tech
  • Simons Foundation
  • University of British Columbia
  • IISER Kolkata

News & Highlights

Mission: Building quantitative models for science

Research

A common thread runs through much of my work: recovering hidden structure from indirect, noisy observations — parameters, states, interaction networks, and traits that cannot be measured directly but must be inferred. I approach these as inverse problems, solved with Bayesian and probabilistic methods that return full posterior distributions rather than point estimates, making uncertainty explicit.

Inference of microbial interaction traits and networks: In natural ecosystems, multiple phage species coexist with their bacterial hosts in ways that are difficult to observe directly. Starting from population dynamics time series, I build hierarchical Bayesian models — using PyMC and HMC/NUTS samplers — to recover latent infection traits, interaction network structure, and trait distributions across microbial communities. A key finding is that pairwise interaction measurements systematically underestimate community-level dynamics: higher-order interactions emerge at scale, enabling coexistence that pairwise models cannot explain. This is the subject of my first-author paper in The ISME Journal (2026). A companion primer on Bayesian learning of microbial traits from time series is currently in review.

Phage therapy and treatment optimization: Phages can be used to treat antibiotic-resistant bacterial infections. Using pharmacokinetic/pharmacodynamic (PK/PD) models and hollow-fiber infection model (HFIM) data, I apply Bayesian inference to estimate latent treatment parameters across 12 clinical datasets — informing questions like: when is combination phage-antibiotic therapy better than monotherapy, and what dosing schedule is optimal?

Sequential Bayesian source localization: In a search-and-rescue setting, a mobile drone must localize an acoustic emitter using only phase-difference-of-arrival measurements. I designed an optimal sequential Bayesian decision framework: at each step, the drone updates a spatial posterior and computes the information-maximizing next position. Counterintuitively, moving directly toward the estimated source is not optimal — the framework discovers better exploration-exploitation strategies.
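
The update-then-choose loop described above can be sketched on a small grid. This is a minimal illustration under stated assumptions, not the framework itself: a noisy range reading stands in for the phase-difference-of-arrival measurement, the expected entropy is approximated with noise-free hypothesized readings, and the grid size, `SIGMA`, and all function names are hypothetical.

```python
import math
import random

GRID = [(x, y) for x in range(11) for y in range(11)]  # candidate source cells
SIGMA = 0.5  # assumed measurement noise, in grid units

def likelihood(z, sensor, cell):
    """Gaussian likelihood of a noisy range reading z (stand-in for PDOA)."""
    d = math.dist(sensor, cell)
    return math.exp(-0.5 * ((z - d) / SIGMA) ** 2)

def update(posterior, z, sensor):
    """Bayes rule on the grid: multiply by the likelihood and renormalize."""
    weights = [p * likelihood(z, sensor, c) for p, c in zip(posterior, GRID)]
    total = sum(weights)
    if total == 0.0:  # numerical underflow guard: keep the previous belief
        return list(posterior)
    return [w / total for w in weights]

def entropy(posterior):
    """Shannon entropy (nats) of the spatial posterior."""
    return -sum(p * math.log(p) for p in posterior if p > 0)

def expected_entropy(posterior, sensor):
    """Posterior entropy averaged over hypothesized sources (noise-free readings)."""
    total = 0.0
    for p_h, h in zip(posterior, GRID):
        if p_h < 1e-6:  # skip negligible hypotheses to save time
            continue
        z = math.dist(sensor, h)
        total += p_h * entropy(update(posterior, z, sensor))
    return total

def next_position(posterior, sensor):
    """Greedy one-step lookahead over unit moves that stay on the grid."""
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    candidates = [(sensor[0] + dx, sensor[1] + dy) for dx, dy in moves]
    candidates = [c for c in candidates if 0 <= c[0] <= 10 and 0 <= c[1] <= 10]
    return min(candidates, key=lambda c: expected_entropy(posterior, c))

rng = random.Random(1)
source, sensor = (7, 3), (0, 0)            # hypothetical true source and start
posterior = [1.0 / len(GRID)] * len(GRID)  # uniform prior over cells
initial_entropy = entropy(posterior)
for _ in range(6):
    z = math.dist(sensor, source) + rng.gauss(0.0, SIGMA)  # simulated reading
    posterior = update(posterior, z, sensor)
    sensor = next_position(posterior, sensor)
final_entropy = entropy(posterior)
```

Choosing the move that minimizes expected posterior entropy is one common way to express "information-maximizing"; a real planner would likely look more than one step ahead and use the actual sensor model.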

As an AI R&D intern at Elanco Animal Health, I am building a production computer-vision system to classify animal behavior from video for pharmaceutical efficacy studies — across many individuals, multiple cameras and scenes, and strongly imbalanced behavior classes.

The core challenge is that raw video is high-dimensional, expensive, and noisy. The behavior labels come from human-annotated clinical protocols that are inconsistent in time format, duplicated, and misaligned with video timestamps. Before any model can run, the data must be made tractable and trustworthy.

Data engineering: I built an automated pipeline that cleans and time-aligns annotations by deduplicating records, normalizing date and time formats, merging behavior intervals, and matching labels to video frames using Gemini as a multimodal timestamp parser. Exploratory analysis then reduced the label space to the most informative behaviors, chosen by class frequency and clinical relevance.

Scalable vision pipeline: With ~1 TB of video, compute choices matter. A pilot sampling study determined the minimum frame rate and resolution that preserve behavioral signal — dramatically cutting storage and GPU cost. Segmentation then crops behavior-relevant regions per frame, reducing redundant background tokens before they reach the model.

Probabilistic modeling: Zero-shot baselines with multimodal foundation models (Gemini) produced weak results on this specialized domain. The approach I am developing layers probabilistic structure on top of fine-tuned representations: location-conditioned priors (behaviors are not equally likely at all locations in the scene) and hidden Markov model / Viterbi decoding to enforce temporal consistency across frames. This is the same inverse-problem instinct as the phage and drone work — injecting structure the way a Bayesian would, rather than letting the model hallucinate transitions freely.
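
To make the temporal-consistency idea concrete, here is a minimal sketch of Viterbi decoding in which per-frame classifier scores are combined with a location-conditioned prior. The two behavior states, the transition matrix, the prior values, and the name `viterbi_decode` are hypothetical and are not taken from the production system.

```python
import math

def viterbi_decode(frame_probs, location_priors, transition, initial):
    """Most likely state sequence under a hidden Markov model.

    frame_probs[t][s]     : classifier score for state s at frame t (assumed input)
    location_priors[t][s] : prior of state s at the animal's location in frame t
    transition[r][s]      : P(state s at t+1 | state r at t)
    initial[s]            : P(state s at t = 0)
    """
    n_states = len(initial)

    def log(x):
        return math.log(max(x, 1e-12))  # avoid log(0)

    def emission(t, s):
        # Treat classifier score times location prior as the emission term.
        return log(frame_probs[t][s]) + log(location_priors[t][s])

    score = [log(initial[s]) + emission(0, s) for s in range(n_states)]
    backptr = []
    for t in range(1, len(frame_probs)):
        new_score, ptr = [], []
        for s in range(n_states):
            best_r = max(range(n_states),
                         key=lambda r: score[r] + log(transition[r][s]))
            ptr.append(best_r)
            new_score.append(score[best_r] + log(transition[best_r][s])
                             + emission(t, s))
        score = new_score
        backptr.append(ptr)

    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    return path[::-1]

# Hypothetical example: state 0 = resting, state 1 = eating.
frame_probs = [[0.9, 0.1], [0.8, 0.2], [0.45, 0.55], [0.9, 0.1], [0.85, 0.15]]
location_priors = [[0.5, 0.5]] * 5        # uninformative location prior
transition = [[0.95, 0.05], [0.05, 0.95]]  # behaviors tend to persist
initial = [0.5, 0.5]

framewise = [max(range(2), key=lambda s: p[s]) for p in frame_probs]
smoothed = viterbi_decode(frame_probs, location_priors, transition, initial)
```

In this toy input the frame-by-frame argmax flips to state 1 for a single frame, while the decoded path stays in state 0 because a one-frame switch is penalized by the sticky transition matrix.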

LLM and foundation-model time-series forecasting: Large language models trained on token sequences can, in principle, forecast numerical time series by treating numbers as tokens and next-token prediction as next-step forecasting. I investigated how well zero-shot and few-shot LLM forecasting works on chaotic and stochastic dynamical systems — probing fundamental limits of predictability. A novel aggregated tokenization scheme improved forecasting accuracy significantly over standard baselines; LoRA adapters fine-tuned across GPT-4, LLaMA-2, and Mistral further improved task-specific performance while halving training time on HPC clusters. I also benchmarked zero/few-shot forecasting with time-series foundation models (TimesFM, Prophet) on these systems.
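
The idea of treating numbers as tokens can be illustrated with a plain digit serialization. This sketch shows the generic baseline only, not the aggregated tokenization scheme mentioned above; the function names, the fixed precision, and the separator are assumptions made for illustration.

```python
def encode_series(values, decimals=2, sep=" , "):
    """Serialize numbers as space-separated digits at fixed precision."""
    chunks = []
    for v in values:
        scaled = round(abs(v) * 10 ** decimals)  # drop the decimal point
        digits = " ".join(str(scaled))           # one digit per token
        chunks.append(("-" if v < 0 else "") + digits)
    return sep.join(chunks)

def decode_series(text, decimals=2, sep=" , "):
    """Invert encode_series: rebuild floats from the digit string."""
    values = []
    for chunk in text.split(sep):
        values.append(int(chunk.replace(" ", "")) / 10 ** decimals)
    return values

prompt = encode_series([0.5, 1.23, -0.07])
recovered = decode_series(prompt)
```

A language model would be asked to continue `prompt`, and its continuation would be passed through `decode_series` to obtain the forecast values.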

Network inference from population dynamics: Given observed abundance trajectories of interacting species, can we recover the underlying interaction network? I developed a multitask ML framework that jointly estimates interaction structure, mechanistic model parameters, and forecasting targets from population-dynamics data — without requiring direct observation of interactions. The framework couples differential-equation constraints with learned representations to regularize the inversion. Presented at NetSci-2024 and APS Global Summit 2025.

GPU-accelerated constrained optimization for microbiome networks: Predicting the dynamics of thousands of coupled microbial interaction networks requires solvers that scale. I designed and deployed a GPU-accelerated adaptive optimizer (TensorFlow/Adam) for large-scale constrained network time-series forecasting, achieving 5× better prediction across 10,000 interaction networks versus a CVXPY baseline.

Generative models for time series: VAE, GAN, and diffusion architectures can approximate underlying probability distributions from samples. For physics-derived dynamical systems with sparse or undersampled observations, I explored generative approaches for predictive extrapolation and uncertainty-aware synthesis of time series.

Before applying probabilistic methods to biological and health problems, I spent several years studying stochastic processes at a fundamental level — building the theoretical and experimental foundations I now use in ML contexts.

Arcsine laws in nonequilibrium systems: In equilibrium statistical mechanics, the fraction of time a Brownian particle spends on one side of its starting point follows a striking distribution: the arcsine law. We experimentally verified that this result extends to nonequilibrium systems driven by active forces — and showed that the distribution of thermodynamic currents exhibits a non-monotonic skewness as a function of observation timescale that classical theory does not predict. Published in Physical Review E (2022, first author) and Physical Review Research (2022).
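
For readers unfamiliar with the arcsine law, the classical (free Brownian) case is easy to check numerically. The sketch below simulates Gaussian random walks and compares the fraction of time spent on the positive side with the arcsine distribution, whose cumulative form is F(x) = (2/π) arcsin(√x). It illustrates the textbook result only, not the experimental analysis in the papers; walk length, sample count, and names are arbitrary choices.

```python
import math
import random

def fraction_time_positive(n_steps, rng):
    """Fraction of steps a Gaussian random walk spends above its start."""
    x, positive = 0.0, 0
    for _ in range(n_steps):
        x += rng.gauss(0.0, 1.0)
        positive += x > 0
    return positive / n_steps

def arcsine_cdf(x):
    """Cumulative distribution of the arcsine law on [0, 1]."""
    return (2.0 / math.pi) * math.asin(math.sqrt(x))

rng = random.Random(0)
fractions = [fraction_time_positive(400, rng) for _ in range(2000)]

# Probability mass near an extreme versus around the middle of [0, 1].
empirical_tail = sum(f < 0.1 for f in fractions) / len(fractions)
empirical_middle = sum(0.45 <= f < 0.55 for f in fractions) / len(fractions)
predicted_tail = arcsine_cdf(0.1)
```

The distribution is U-shaped: a walk is more likely to spend almost all of its time on one side than to split its time evenly, so `empirical_tail` exceeds `empirical_middle` even though both intervals have the same width.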

Random number extraction from stochastic trajectories: True random numbers are generated by physical processes, not algorithms. Using experimentally recorded Brownian trajectories from an optical trap, I developed ML-based extraction algorithms that pass all NIST statistical randomness tests — and showed that the entropy of the extracted bits improves asymptotically with sampling rate, connecting experimental physics to cryptographic-grade validation. Published in Frontiers in Physics (2021, first author) and presented at SPIE Photonics (2021, oral).

Early in my research career I designed algorithms for getting the most out of noisy physical measurement systems — a signal-processing and optimal-control problem that turned out to be excellent training for the Bayesian inference work that followed.

Optimal sensing in microrheology: In optical-tweezers experiments, a microscopic probe embedded in a complex fluid reveals the fluid's viscoelastic properties through its motion. The challenge is that useful signal only exists in a narrow linear-response regime, and noise limits broadband measurement. I designed optimal and feedback-control algorithms — including a multi-sinusoid modulation scheme — that maximize signal-to-noise across a 2 kHz bandwidth, improving measurement speed 20× over single-frequency methods. This work was applied to study the viscoelastic properties of mutated Lamin A (linked to cardiomyopathy) and resulted in an Indian patent (No. 539208, 2024). Published in Physical Review Fluids (2021) and Soft Matter (2021).

Signal processing for radio astronomy: At NCRA (Giant Metrewave Radio Telescope, India), I built a frame-stacking pipeline to filter RFI-corrupted frames from pulsar observations, combining multi-telescope data and frame-to-frame comparison algorithms to recover clean signal from noisy integrations (99% corrupt-frame detection).

Microscopy and imaging systems: At the Li Lab (University of British Columbia), I built a custom dual-lens microscope with 3D motorized control via gaming joysticks and real-time contour-based particle tracking in MATLAB/LabVIEW. At the Silva Lab (Georgia Tech), I worked on time-synchronized dual single-photon detection for quantum spectroscopy with entangled photon pairs.

Journal & Conference Publications and Patents

Density-dependent feedback and higher-order interactions enable coexistence in phage–bacteria community dynamics


Raunak Dey, Ashley Coenen, Nalatie Solonenko, Marie Burris, Anna Mackey, Julia Galasso, Christine Sun, David Demory, Daniel Muratore, Stephen Beckett, Matthew Sullivan, Joshua Weitz
The ISME Journal, 2026
ISME J

Bayesian Learning of Microbial Traits from Population Time Series Data: A Primer


Raunak Dey, Robert Beach, Kennedi M. Hambrick, Ioannis Sgouralis, Paul Fremont, David Demory, Eric Carr, Stephen J. Beckett, Joshua S. Weitz, David Talmy
In Review, 2026
Code

A system for carrying out active microrheology to probe viscoelasticity of protein


Kaushik Sengupta, Chandrayee Mukherjee, Avijit Kundu, Raunak Dey, Shuvojit Paul, Ayan Banerjee
Indian Patent No. 539208, The Patent Office, Government of India, 2024
Patent

Designed an optimal control method for microrheology on proteins in solution that maximizes the signal-to-noise ratio over a broad frequency range (2 kHz).

Experimental verification of arcsine laws in mesoscopic nonequilibrium systems


Raunak Dey, Avijit Kundu, Biswajit Das and Ayan Banerjee.
Physical Review E, 2022
paper / arxiv

Built a numerical framework to analyze stochastic trajectories of Brownian particles and showed that their thermodynamic currents follow the three Lévy arcsine laws.

Non-monotonic skewness of currents in non-equilibrium steady states


Sreekanth K. Manikandan, Biswajit Das, Avijit Kundu, Raunak Dey, Ayan Banerjee, and Supriya Krishnamurthy
Phys. Rev. Research, 2022
Paper / arXiv

Using a feed-forward neural network, we fit the best basis function to compute thermodynamic currents of a Brownian particle in water (a constrained non-convex optimization problem). We showed how the skewness of these stochastic currents changes non-monotonically over time.

Active microrheology using pulsed optical tweezers to probe viscoelasticity of lamin A


Chandrayee Mukherjee, Kaushik Sengupta, Avijit Kundu, Raunak Dey, Ayan Banerjee
Soft Matter, Royal Society of Chemistry, 2021
Paper

Used our optimal control algorithm to maximise the broadband SNR in microrheology experiments on Lamin proteins. Laminopathies such as cardiomyopathy are driven by mutations in these proteins. With some mutations the nuclear wall disintegrates, which changes the viscoelastic properties of the cell. We found, however, that even before the cell disintegrates, the protein in solution already exhibits different viscoelastic properties, which could aid fast detection.

Single-shot wideband active microrheology using modulated optical tweezers


Avijit Kundu, Raunak Dey, Shuvojit Paul, Ayan Banerjee
Phys. Rev. Fluids, 2021
Paper / arXiv

Created an online feedback-control algorithm to maximise the signal-to-noise ratio (SNR) in broadband active microrheology measurements with modulated stochastic probes. We improved the benchmark SNR 10x over a 2 kHz bandwidth.

Random number extraction from optically trapped Brownian oscillator using an iterative algorithm


Raunak Dey, Avijit Kundu, Subhrokoli Ghosh and Ayan Banerjee.
SPIE Nanoscience + Engineering, San Diego, California, United States, 2021
Presentation+Paper / code

Developed a class of machine-learning algorithms to extract random numbers from damped, driven, physics-based stochastic time series. Demonstrated on an Ornstein-Uhlenbeck process with three different potential functions.

Microrheology over a broad frequency range probing multiple-sinusoid oscillating optical tweezer


Raunak Dey, Shuvojit Paul, Ayan Banerjee
SPIE Nanoscience + Engineering, San Diego, California, United States, 2021
Presentation+Paper

Invented a technique for algorithmically optimizing the signal-to-noise ratio (SNR) in wideband active microrheology.

Simultaneous random number generation and optical tweezers calibration employing a learning algorithm based on the Brownian dynamics of a trapped colloidal particle


Raunak Dey, Subhrokoli Ghosh, Avijit Kundu and Ayan Banerjee.
Frontiers in Physics, 2021
code / PDF

Developed an optimization algorithm integrated with the National Institute of Standards and Technology (NIST) randomness test suite to characterize how random experimentally sampled trajectories of Brownian particles are. Our inverse model extracts true random numbers from stochastic trajectories rather than algorithmically generated pseudo-random numbers. Demonstrated on the experimentally measured trajectories of an optically trapped Brownian particle in water.

Probing medium viscoelasticity using signal transmission through coupled harmonic oscillators*


Avijit Kundu, Raunak Dey, Shuvojit Paul, Ayan Banerjee
APS March meeting, 2021
Paper / code

Formulated a novel two-point technique to probe the viscoelasticity of a medium using modulated Brownian oscillators. Using two probes instead of one creates a motional resonance between the probes and provides finer insight into the medium's viscoelasticity.

Software

InvODE

Language: Python

A lightweight, powerful and intuitive Python library for parameter optimization in ordinary differential equation (ODE) systems. InvODE combines global optimization with local refinement techniques to efficiently find optimal parameter sets for complex dynamical systems.
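
The two-stage strategy described above (a global search followed by local refinement) can be sketched without any dependencies. This is an illustration of the general idea only and does not use or reproduce InvODE's API: the logistic-growth model, the random-search and compass-search stages, the bounds, and every function name here are assumptions.

```python
import random

def simulate(r, K, x0=0.5, t_end=10.0, n_steps=50):
    """Integrate logistic growth dx/dt = r*x*(1 - x/K) with fixed-step RK4."""
    def f(x):
        return r * x * (1.0 - x / K)
    dt, x, xs = t_end / n_steps, x0, [x0]
    for _ in range(n_steps):
        k1 = f(x)
        k2 = f(x + 0.5 * dt * k1)
        k3 = f(x + 0.5 * dt * k2)
        k4 = f(x + dt * k3)
        x += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        xs.append(x)
    return xs

def loss(params, data):
    """Sum of squared errors between the simulated and observed trajectory."""
    return sum((m - d) ** 2 for m, d in zip(simulate(*params), data))

def global_search(data, bounds, n_samples, rng):
    """Stage 1: uniform random sampling inside the parameter bounds."""
    samples = [tuple(rng.uniform(lo, hi) for lo, hi in bounds)
               for _ in range(n_samples)]
    return min(samples, key=lambda p: loss(p, data))

def local_refine(start, data, bounds, step=0.1, tol=1e-6, max_iter=2000):
    """Stage 2: compass search with a shrinking step, clipped to the bounds."""
    best, best_loss = list(start), loss(start, data)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i, (lo, hi) in enumerate(bounds):
            for sign in (1.0, -1.0):
                trial = list(best)
                trial[i] = min(hi, max(lo, trial[i] + sign * step * (hi - lo)))
                trial_loss = loss(trial, data)
                if trial_loss < best_loss:
                    best, best_loss, improved = trial, trial_loss, True
        if not improved:
            step *= 0.5  # no better neighbor: shrink the search pattern
    return best, best_loss

# Synthetic, noise-free data from known parameters (r = 0.8, K = 10).
data = simulate(0.8, 10.0)
bounds = [(0.1, 2.0), (1.0, 20.0)]
rng = random.Random(0)
coarse = global_search(data, bounds, n_samples=200, rng=rng)
coarse_loss = loss(coarse, data)
(r_fit, K_fit), refined_loss = local_refine(coarse, data, bounds)
```

The global stage guards against starting the local search in a poor basin, and the local stage supplies the precision that random sampling alone would need many more evaluations to reach.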

B2 — Bayesian for Biology

Language: Python, MATLAB, Julia

Tools and examples for integrating ODE-based models with Bayesian inference for biological systems.

PHAMILY: Phage Microbe Analysis

Language: Python

GUI software for Bayesian analysis of phage and microbes, built with PySide, Qt, and PyMC (co-developed by: Timothy Cai).

Experience

May-Aug 2026

Elanco Animal Health, Indianapolis, IN

AI Research & Development Intern

May 2024 - Dec 2025

Microbiome Center, UMD

Summer Intern
ML-based virus-microbe network inference

Aug 2021 - Present

Georgia Tech & University of Maryland

Graduate Research Assistant (Simons Foundation)
Physics

Summer 2019

University of British Columbia, Canada

Mitacs Graduate Research Intern
Instrumentation Engineering

2018

National Center for Radio Astrophysics (NCRA)

Visiting Summer Research Program
Radio Astrophysics (Pulsar division)

2015 - 2021

Indian Institute of Science Education and Research (IISER) Kolkata

Research (Optimal Control & Stochastic Modeling)
Physics

Awards & Honors

2025
Microbiome Research Fellowship

UMD, Microbiome Center

2024
Microbiome Summer Research Award

UMD, Microbiome Center

2024
Thomas G. Mason Interdisciplinary Research Fund

Department of Physics, UMD

2024
International Conference Student Support Award

Graduate School, UMD

2020
Levenstein Award

University of Syracuse

2020
SPIE Travel Grant

MKS Instruments

2019
Mitacs GRI Award

Canada

2015
Inspire Scholarship

Government of India

Talks

Multi-task inference of virus-microbe interaction networks from population dynamics

APS Global Summit | 2025

External talk on multi-task inference of virus-microbe interaction networks from observed population dynamics.

Statistical inference of network models

NetSci-2024, Canada | June 16, 2024

Delivered a talk on using multimodal machine learning to infer network edges from population dynamics.

Bayesian inference of emergent traits and higher-order interactions in virus-microbe communities

SCOPE annual meeting, Simons Foundation (NYC) | November 2023

Presented Bayesian inference of life-history traits and higher-order emergent effects in phage-microbe communities.

ML extraction of random numbers from stochastic trajectories

SPIE Photonics, San Diego, California | August 1, 2021

Presented an iterative ML algorithm for extracting random numbers from stochastic trajectories.

Teaching

Bootcamp: PyTorch for AI

Instructors: Raunak Dey
Links: Video Lecture; Colab Notebook; Course Webpage

A summer bootcamp on scientific computing for beginners with Python and PyTorch organized by Pratyush Tiwary, University of Maryland.

Intro Physics I, II

Venue: Georgia Tech
Instructors: Ed Greco, Emily Alicea-Munoz, Raunak Dey (GTA)
Course ID: PH2211, PH2212

These introductory physics courses introduce programming and goal-driven problem solving tailored to physicists. They cover topics from basic mechanics to electromagnetism, with hands-on problem-solving sessions.

Intermediate Quantum Mechanics

Venue: IISER Kolkata
Instructors: Soumitro Banerjee, Raunak Dey (GTA)
Course ID: PH3105

A deep dive into the advanced concepts of quantum mechanics, emphasizing modern research methods and computational techniques.

Press and Media