Gaurav Bhole

Research Intern at the Laboratory of Integrative Systems Physiology, École Polytechnique Fédérale de Lausanne (EPFL)

Master's Research Student at the Center for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology Hyderabad (IIITH)

Building solutions to decode the complexities of life sciences using AI

About Me

I'm pursuing an integrated B.Tech in Computer Science and Master of Science in Computational Natural Sciences by Research at IIIT Hyderabad (CGPA: 7.52/10), where I was named a Research List Award recipient for the academic year 2024-25. I have also been awarded the IHub-Data Research-Translation Fellowship for 2025-26. Under Dr. Nita Parekh's guidance at CCNSB, I've contributed to research on long-read DNA sequencing for structural variant detection and on multi-modal mammographic analysis.

Simultaneously, I am a Research Intern at the Laboratory of Integrative Systems Physiology (LISP) at EPFL, working under the guidance of Dr. Johan Auwerx on aging and behavioral analysis research. My current focus is developing hierarchical masked autoencoder frameworks to model mouse movement trajectories and decode aging patterns from behavioral time series data using the Healthspan Diversity Panel (~4000 mice from 82 genetically diverse strains).

My research philosophy centers on bridging the gap between computational innovation and biological understanding. I believe that the most profound scientific breakthroughs emerge at the intersection of rigorous computational methods and deep biological insight. By leveraging the power of artificial intelligence and machine learning, I aim to uncover patterns and relationships in complex biological systems that would otherwise remain hidden, contributing to advances that can improve human health and our understanding of life itself.

Deep Learning

Neural networks, transformers, and autoencoder architectures for biological data analysis

Medical Imaging

Mammography analysis, fMRI processing, and computer-aided diagnosis systems

Genomics

Long-read sequencing, structural variant detection, and multi-omics integration

Behavioral Analysis

Movement trajectory modeling and aging pattern recognition in biological systems

NLP & LLMs

Large language models, machine unlearning, and conversational AI for healthcare

Computational Biology

Systems-level modeling and computational approaches to biological problems

Current Research Focus

At EPFL (LISP)

Developing hierarchical masked autoencoder frameworks for modeling aging trajectories from continuous mouse behavioral data. Working with the Healthspan Diversity Panel to link natural movement patterns with genetic variation and molecular aging signatures.

At IIIT Hyderabad (CCNSB)

Advancing structural variant detection in human genomes using long-read DNA sequencing technologies. Developing multi-modal deep learning approaches for mammographic analysis, and cancer subtype classification using hypergraph contrastive learning.

Professional Journey

EPFL

Research Intern

LISP, École Polytechnique Fédérale de Lausanne (EPFL)

May 2025 - Present • Lausanne, Switzerland

Adapting hierarchical masked autoencoder frameworks to model mouse movement trajectories and decode aging patterns from behavioral time series pose vectors, working with the Healthspan Diversity Panel (~1800 mice from 82 strains).

IIIT-H

Research Student

CCNSB, IIIT Hyderabad

May 2023 - Present • Hyderabad, India

Conducting research in genetics and medical imaging under Dr. Nita Parekh, with a focus on long-read DNA sequencing for structural variant detection and multi-modal classification strategies for mammographic analysis.

IIIT-H

Head Teaching Assistant

IIIT Hyderabad

Aug 2023 - May 2025 • Hyderabad, India

Designed examination papers and led tutorial sessions for Non-Linear Dynamics and Bioinformatics courses.

Global Health X

Research Intern

Global Health X

Jun 2024 - Sept 2024 • Hyderabad, India

Developed conversational AI agents for mental health support using LLM frameworks such as DSPy and LangChain. Fine-tuned Meta-Llama-3.1-8B with PEFT for therapeutic contexts.

Publications & Research

Mammo-Bench: A Large-scale Benchmark Dataset of Mammography Images

Accepted at the 13th International Conference on Computational Advances in Bio and Medical Sciences 2025, Atlanta, USA

Gaurav Bhole, Suba S, Nita Parekh

Medical Imaging, Deep Learning, Mammography, Dataset

HyperCLSA: A Hypergraph Contrastive Learning Pipeline for Multi-Omics Data Integration

Accepted at the 11th International Conference on Pattern Recognition and Machine Intelligence 2025, Delhi, India

Gaurav Bhole, Poorvi HC, Madhav J, Prabhakar Bhimalapuram, P K Vinod

Multi-Omics, Hypergraph Learning, Contrastive Learning, Cancer Genomics

DFANet: A Difference Fusion Attention-based method for Semantic Change Detection

Under review at Journal of the Indian Society of Remote Sensing

Omkar Oak, Rukmini Nazre, Rujuta Budke, Suraj Sawant, Gaurav Bhole

Remote Sensing, Change Detection, Attention Mechanisms, Multi-task Learning

Deep phenotyping via hierarchical learning of mouse movement

Oral and Poster Presentation at the Computational Biology Symposium 2025, UNIL, Switzerland

Gaurav Bhole, Giacomo von Alvensleben, Jon Lecumberri, Andy Bonnetto, Michał Grudzień, Alexander Mathis, Johan Auwerx

Behavioral Analysis, Aging Research, Autoencoder, Systems Biology

Featured Projects

Modeling Brain Activity During Naturalistic Movie Watching

Developed encoding and decoding models using deep MLPs and LSTMs to predict fMRI brain activity from video embeddings, achieving high intra-subject accuracy for short film identification from brain activity patterns.

Neuroscience, fMRI, LSTM, Deep Learning

Machine Unlearning for PII Removal from LLMs

Developed adaptive Representation Misdirection Unlearning techniques to selectively remove personally identifiable information from large language models. Achieved 4th place rankings on both 1B and 7B parameter model leaderboards in SemEval-2025 Task 4.

LLMs, Privacy, Machine Unlearning, NLP

Image Captioning using FAISS-Accelerated Retrieval

Used FAISS with a Distributed Representation-Based Query Expansion approach to significantly reduce retrieval time on large datasets, demonstrating that classical retrieval-based methods can achieve competitive performance for image captioning tasks.

FAISS, Multi-modal, Retrieval, Computer Vision

Parameter Efficient Fine Tuning for Text Summarization

Implemented and compared three fine-tuning approaches (Prompt Tuning, LoRA, and traditional full fine-tuning) on GPT-2 for text summarization using the CNN/Daily Mail dataset. Validated that the parameter-efficient methods achieve comparable performance with significantly reduced computational requirements.

PEFT, LoRA, GPT-2, NLP

Brain Encoding and Decoding for Visual Cognition

Developed bidirectional computational neuroscience pipelines using the Natural Scenes Dataset to map between visual stimuli and fMRI responses. Compared CNN architectures, achieving correlations up to 0.43 and revealing insights into how deep networks model human visual processing mechanisms.

Neuroscience, Visual Processing, CNN, fMRI

Tokenization Effects in Psycholinguistic Surprisal Analysis

Extended research on surprisal theory by comparing character-level n-gram models, token-level GPT-2 surprisal, and character-level surprisal via beam-based marginalization across four eye-tracking corpora. Found that marginalized character-level surprisal consistently outperformed token-based approaches.

Psycholinguistics, Eye-tracking, Surprisal Theory, NLP
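
For readers unfamiliar with surprisal, here is a minimal sketch of the simplest of the three estimators mentioned above, a character-level n-gram model. It is a toy bigram with add-one smoothing, not the project's actual code: the function names, the `^`/`$` boundary symbols, and the tiny corpus are all assumptions made for illustration. A word's surprisal is taken as the sum of -log2 p(character | previous character) over its characters plus an end-of-word symbol.

```python
import math
from collections import Counter


def train_char_bigram(corpus: list[str]) -> tuple[Counter, Counter, set[str]]:
    """Count character bigrams over a list of words (hypothetical helper)."""
    bigrams, contexts = Counter(), Counter()
    vocab = {"$"}  # "$" marks end of word; "^" marks start and is never predicted
    for word in corpus:
        chars = ["^"] + list(word) + ["$"]
        vocab.update(chars[1:])
        for prev, cur in zip(chars, chars[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    return bigrams, contexts, vocab


def word_surprisal(word: str, bigrams: Counter, contexts: Counter, vocab: set[str]) -> float:
    """Surprisal of a word in bits: sum of -log2 p(char | previous char)."""
    chars = ["^"] + list(word) + ["$"]
    total = 0.0
    for prev, cur in zip(chars, chars[1:]):
        # Add-one smoothing keeps unseen transitions from having zero probability.
        p = (bigrams[(prev, cur)] + 1) / (contexts[prev] + len(vocab))
        total += -math.log2(p)
    return total


bigrams, contexts, vocab = train_char_bigram(["the", "then", "they", "cat"])
familiar = word_surprisal("the", bigrams, contexts, vocab)
unfamiliar = word_surprisal("xqz", bigrams, contexts, vocab)
print(f"the: {familiar:.2f} bits, xqz: {unfamiliar:.2f} bits")
```

The familiar word comes out with lower surprisal than the unseen string, which is the quantity that is then related to reading times in eye-tracking data.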

Quantization and Model Compression

Implemented various model quantization techniques for LLMs, including both custom quantization implementations and bitsandbytes integration, focusing on reducing model size while maintaining performance.

LLM, Quantization, Model Compression, Optimization

Age Prediction from Facial Images

Developed and compared various CNN and Vision Transformer models for age prediction using facial images. Achieved 7th place ranking among 200 contestants in a Kaggle competition, demonstrating effective application of computer vision techniques for age estimation.

Computer Vision, CNN, Vision Transformer, Kaggle

Neural Machine Translation with Transformer

Built a Transformer model from scratch for English-French translation based on the "Attention Is All You Need" paper. Implemented a custom encoder-decoder architecture with self-attention mechanisms and positional encodings.

Transformer, Machine Translation, Attention, NLP
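
As an illustration of the positional encodings mentioned above, here is a minimal pure-Python sketch of the sinusoidal scheme described in "Attention Is All You Need". It is not the project's code, and the function name and the small `max_len`/`d_model` values are chosen only for this example. Even dimensions use a sine and odd dimensions a cosine, with each pair of dimensions sharing one frequency.

```python
import math


def sinusoidal_positional_encoding(max_len: int, d_model: int) -> list[list[float]]:
    """Return a max_len x d_model table of sinusoidal position encodings."""
    table = []
    for pos in range(max_len):
        row = []
        for i in range(d_model):
            # Dimensions 2k and 2k+1 share the frequency 1 / 10000^(2k / d_model).
            angle = pos / (10000 ** ((i - i % 2) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


pe = sinusoidal_positional_encoding(max_len=4, d_model=6)
for pos, row in enumerate(pe):
    print(pos, [round(value, 3) for value in row])
```

In a full model, this table is added to the token embeddings so that self-attention, which is otherwise order-agnostic, can distinguish positions.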

Text-Based Brain Encoding and Decoding for Cognitive Science

Developed a comprehensive computational neuroscience pipeline for bidirectional mapping between textual stimuli and fMRI brain activations. Enhanced decoding performance through multi-ROI integration, achieving improved 2V2 accuracy and correlation metrics.

Neuroscience, NLP, fMRI, Cognitive Science

Analysis of Song Lyrics for Global and Indian Top Charts

Analyzed Indian and global chart songs using self-similarity matrix algorithms for lyrics segmentation and NLP techniques for sentiment analysis. Developed extractive summarization methods and applied the Valence-Arousal framework for cross-cultural comparison.

NLP, Sentiment Analysis, Topic Modeling, Cultural Analysis

IMDB Movie Review Sentiment Analysis using RNNs and LSTMs

Implemented and compared RNN and LSTM architectures for binary sentiment classification on IMDB movie reviews, discovering that mean pooling across all timesteps significantly outperformed last-hidden-state approaches.

RNN, LSTM, Sentiment Analysis, NLP
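
To make the two pooling strategies compared above concrete, here is a small framework-free sketch. It is not the project's code: the hidden states are made-up numbers standing in for per-timestep RNN/LSTM outputs, and the function names are hypothetical. Last-hidden-state pooling keeps only the final timestep, while mean pooling averages every timestep into a single vector for the classifier.

```python
def last_hidden_state(hidden_states: list[list[float]]) -> list[float]:
    """Summarize a sequence by its final timestep only."""
    return hidden_states[-1]


def mean_pool(hidden_states: list[list[float]]) -> list[float]:
    """Summarize a sequence by averaging each dimension over all timesteps."""
    n_steps = len(hidden_states)
    dim = len(hidden_states[0])
    return [sum(step[d] for step in hidden_states) / n_steps for d in range(dim)]


# Three timesteps of a 2-dimensional hidden state (illustrative values).
states = [[1.0, 0.0], [0.0, 2.0], [-1.0, 4.0]]
print("last:", last_hidden_state(states))
print("mean:", mean_pool(states))
```

With padded batches, the mean would normally be taken over the non-padding timesteps only; that masking step is omitted here for brevity.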

Technical Expertise

Programming Languages

Python, C/C++, JavaScript, R, MATLAB, Shell, Assembly

Deep Learning Frameworks

PyTorch, TensorFlow, Hugging Face, OpenCV, scikit-learn, FAISS

Bioinformatics

Biopython, Pysam, PyVCF, GATK, BWA, Samtools

LLM & NLP

LangChain, DSPy, LangGraph, PEFT, LoRA, Transformers

Tools & Technologies

Linux, Git, Docker, Vim, Jupyter, SLURM

Web Development

React.js, Node.js, Express.js, MongoDB, HTML/CSS, Tailwind CSS

Leadership & Service

Entrepreneurship Cell

Corporate Relations Head

Techno-Cultural Fest

Corporate Relations Head

Server Administrator

Computational Biology Server

Football Captain

University Team

A Bit of Research Humor

As someone who's experienced the academic journey from undergraduate to master's research student, I find this representation of how people in science see each other both hilarious and surprisingly accurate!

How people in science see each other - Academic hierarchy humor poster

Current Status: Somewhere between the Master's and PhD student phases, definitely seeing my professors as the wise mentors they are (most of the time)! 😄

Let's Connect

Interested in collaboration, research opportunities, or just want to discuss the latest in AI and computational biology? I'd love to hear from you.