Dristanta Das
Data Science & Machine Learning Engineer
Transforming data into actionable insights
3+
Years
12+
Projects
10+
Tech Stack
∞
Learning
Experience
Assistant Manager - Data Science
- Architected end-to-end Text-to-SQL system using LangChain and LangGraph for 50+ users, reducing query creation time by 65%
- Engineered automated invoice processing pipeline with LLMs, achieving 85%+ accuracy across 10,000+ invoices
- Built agentic workflows with 92% task completion rate across LLM-powered applications
- Led integration of LLM solutions with cross-functional teams, scaling to 500+ daily queries
Associate III Data Scientist (Oct 2022 – Mar 2025)
- Developed provider search system using advanced NLP and LLMs, boosting search efficiency by 30%
- Implemented NER and semantic search for medical term retrieval, achieving a 25% efficiency improvement
Associate II Data Scientist (Jul 2022 – Sep 2022)
- Evaluated patient data using BigQuery, improving client understanding by 45% for 10,000+ patients
- Designed data anomaly detection system compliant with HIPAA and GDPR
Skills
AI/ML
NLP, LLMs, NER, RAG, Machine Learning, Deep Learning, Statistical Modeling
Frameworks
LangChain, LangGraph, PyTorch, Hugging Face, Scikit-learn, XGBoost, FastAPI, spaCy
Tools & Cloud
Python, R, SQL, Git, Docker, BigQuery, Vector DBs, AWS, Azure
Design
Architecture Design, Data Flow Modeling, UML, Anomaly Detection, HIPAA & GDPR
Projects
RAG
LangChain
BM25
DPR
LLMs
Python-based QA bot with hybrid retrieval using BM25 and DPR, achieving 40% higher accuracy. Integrated GPT-3.5-turbo, Phi-2, and Llama3 with contextual compression to reduce hallucinations by 60%.
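As a rough illustration of the hybrid retrieval idea, the sketch below fuses sparse BM25 scores with dense similarity scores by min-max normalising each list and taking a weighted sum. It is not the project's actual code: the function names, the `alpha` weight, the toy documents, and the hard-coded `dense` values (a stand-in for DPR question-passage dot products) are all assumptions made for the example.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def min_max(values):
    """Rescale scores to [0, 1] so sparse and dense scores are comparable."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def hybrid_rank(sparse, dense, alpha=0.5):
    """Return document indices ordered by a weighted sum of both score lists."""
    s, d = min_max(sparse), min_max(dense)
    fused = [alpha * a + (1 - alpha) * b for a, b in zip(s, d)]
    return sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)


# Toy corpus, tokenized by whitespace (illustrative only).
docs = [
    "bm25 is a sparse lexical ranking function".split(),
    "dense passage retrieval encodes questions and passages".split(),
    "arima models forecast time series".split(),
]
query = "sparse ranking function".split()

sparse = bm25_scores(query, docs)
# Hypothetical dense similarities; a real system would compute these
# from DPR question and passage embeddings.
dense = [0.62, 0.71, 0.05]

ranking = hybrid_rank(sparse, dense)
print(ranking)
```

Setting `alpha` closer to 1 favours exact lexical matches, while values closer to 0 favour the dense scores; the right balance would need to be tuned on held-out questions.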
Audio Processing
Computer Vision
Deep Learning
Converts monaural audio into binaural audio using the accompanying video, giving listeners a 3D sound sensation and a richer perceptual experience.
Topic Modeling
GitHub API
spaCy
Analyzed popular NLP repositories using GitHub API and spaCy to understand how NLP libraries are being used in the community.
NLP
Information Extraction
spaCy
Intelligent resume scoring system using spaCy for automated candidate evaluation and ranking based on job requirements.
LDA
Statistical Modeling
Classification
Statistical data analysis using Linear Discriminant Analysis with advanced dimensionality reduction to predict financial distress in companies.
Time Series
ARIMA
Financial Analysis
Comprehensive forecasting analysis of major global stock indices, including Nifty 50, DAX, Dow Jones, and Nikkei, using ARIMA models.
EDA
Sports Analytics
Python
Comprehensive analysis of FIFA 21 player statistics, uncovering insights about attributes, market values, and performance metrics across different positions and nationalities.
Data Visualization
R & Python
Business Intelligence
In-depth analysis of Zomato's restaurant and food delivery data, deriving insights about customer preferences, pricing strategies, and restaurant ratings.