Dristanta Das

Resume · LinkedIn · GitHub

Hi, I'm Dristanta Das, an Assistant Manager – Data Science at Genpact, specializing in LLM-based systems, NLP, and AI automation. I design and deploy scalable Text-to-SQL, RAG, and agentic AI solutions using frameworks like LangChain, LangGraph, and PyTorch. Previously at UST, I developed NLP-driven search and anomaly detection systems that improved operational efficiency and data reliability. My focus lies in building robust, production-ready AI solutions that drive measurable business impact.

Data Science & Machine Learning Engineer
Transforming data into actionable insights
3+ Years
12+ Projects
10+ Tech Stack
Learning

Experience

Genpact

Bengaluru, Karnataka
Apr 2025 – Present
Assistant Manager – Data Science
  • Architected end-to-end Text-to-SQL system using LangChain and LangGraph for 50+ users, reducing query creation time by 65%
  • Engineered automated invoice processing pipeline with LLMs, achieving 85%+ accuracy across 10,000+ invoices
  • Built agentic workflows with 92% task completion rate across LLM-powered applications
  • Led integration of LLM solutions with cross-functional teams, scaling to 500+ daily queries

UST

Kolkata, West Bengal
Jul 2022 – Mar 2025
Associate III Data Scientist (Oct 2022 – Mar 2025)
  • Developed provider search system using advanced NLP and LLMs, boosting search efficiency by 30%
  • Implemented NER and semantic search for medical term retrieval, improving efficiency by 25%
Associate II Data Scientist (Jul 2022 – Sep 2022)
  • Evaluated patient data using BigQuery, improving client understanding by 45% for 10,000+ patients
  • Designed data anomaly detection system compliant with HIPAA and GDPR

Skills

AI/ML

NLP, LLMs, NER, RAG, Machine Learning, Deep Learning, Statistical Modeling

Frameworks

LangChain, LangGraph, PyTorch, Hugging Face, Scikit-learn, XGBoost, FastAPI, spaCy

Tools & Cloud

Python, R, SQL, Git, Docker, BigQuery, Vector DBs, AWS, Azure

Design

Architecture Design, Data Flow Modeling, UML, Anomaly Detection, HIPAA & GDPR

Projects

RAG QA Bot

GitHub →
RAG · LangChain · BM25 · DPR · LLMs
Python-based QA bot with hybrid retrieval combining BM25 and Dense Passage Retrieval (DPR), achieving 40% higher accuracy. Integrated GPT-3.5-turbo, Phi-2, and Llama 3 with contextual compression to reduce hallucinations by 60%.

2.5D Visual Sound

GitHub →
Audio Processing · Computer Vision · Deep Learning
Converts monaural audio into binaural audio by leveraging video, giving listeners a 3D sound sensation and a richer perceptual experience.

Topic Modelling of NLP Repositories

GitHub →
Topic Modeling · GitHub API · spaCy
Analyzed popular NLP repositories using GitHub API and spaCy to understand how NLP libraries are being used in the community.

Resume Analysis with spaCy

GitHub →
NLP · Information Extraction · spaCy
Intelligent resume scoring system using spaCy for automated candidate evaluation and ranking based on job requirements.

Bankruptcy Prediction with LDA

GitHub →
LDA · Statistical Modeling · Classification
Statistical data analysis using Linear Discriminant Analysis with advanced dimensionality reduction to predict financial distress in companies.

Financial Time Series Forecasting

GitHub →
Time Series · ARIMA · Financial Analysis
Comprehensive forecasting analysis of major global stock indices, including Nifty 50, DAX, Dow Jones, and Nikkei, using ARIMA models.

FIFA 21 Data Analysis

GitHub →
EDA · Sports Analytics · Python
Comprehensive analysis of FIFA 21 player statistics, uncovering insights about attributes, market values, and performance metrics across different positions and nationalities.

Zomato Food Data Analysis

GitHub →
Data Visualization · R & Python · Business Intelligence
In-depth analysis of Zomato's restaurant and food delivery data, deriving insights about customer preferences, pricing strategies, and restaurant ratings.