Skills

Machine Learning & Deep Learning

Proficient in building and deploying machine learning models using frameworks like TensorFlow, PyTorch, and scikit-learn. Experienced in training SVMs, logistic regression, clustering models, and fine-tuning LLMs for real-world applications.

Natural Language Processing (NLP)

Skilled in entity extraction, text classification, summarization, and conceptual search using tools such as Hugging Face Transformers, spaCy, Gensim, and NLTK. Applied NLP techniques in both traditional and neural network-based systems.

Graph AI & Knowledge Graphs

Built enterprise-grade Graph RAG systems and multi-hop question answering pipelines using Neo4j, NetworkX, and Node2Vec. Designed custom query routers and graph traversal methods for efficient retrieval.

Data Engineering & Pipelines

Experienced in building scalable data pipelines with PySpark, Airflow, and Pandas. Developed robust preprocessing systems for large-scale text and tabular data.

Full Stack Development

Developed APIs and admin interfaces using Ruby on Rails, MySQL, JavaScript, and HTML. Implemented backend logic and designed database schemas to support scalable applications.

Programming & Scripting

Expert in Python, with working knowledge of Ruby, SQL, and JavaScript. Strong understanding of data structures, algorithms, and efficient coding practices for AI and backend development.

LLMs & RAG Systems

Specialized in Retrieval-Augmented Generation (RAG), agentic RAG, and embedding-based search. Integrated OpenAI, LLaMA, and custom small language models (SLMs) into production systems, with query routing tuned for accuracy.

Teaching & Mentorship

Passionate educator and visiting professor with experience delivering lectures on data science, NLP, and AI. Skilled in breaking down complex topics for diverse audiences.

Career Highlights

Intro to GraphRAG with Neo4j

JAVAFEST • 2025

Watch it on YouTube

Presented a live demo exploring Graph-based Retrieval-Augmented Generation (Graph RAG) using Neo4j as a backend, highlighting its strengths in knowledge-intensive AI workflows.

Nitty Gritty Details on Data Science

EPAM • 2021

Delivered a deep dive into Locality Sensitive Hashing (LSH), classification evaluators, and metrics, with performance comparisons for large-scale text mining systems.

String Manipulation Techniques After Java 8

DEVON • 2023

A short clip on YouTube

Compared the performance of string manipulation techniques introduced in Java 8 and later, using JMH benchmarks to measure the efficiency of each approach.

Experience

Dataworkz

Data Scientist • Jun, 2022 — Present

Leading initiatives in Graph-based RAG, LLM fine-tuning, and intelligent query routing systems for enterprise AI applications. Focused on combining knowledge graphs and generative AI for more accurate, contextual responses.

  • Built Graph RAG systems on Neo4j, NetworkX, and MongoDB, using Node2Vec embeddings for structured knowledge extraction
  • Designed a multi-hop question generator and optimized LLM-based query routing (peak F1-score of 0.959 on query classification)
  • Worked on tool-creator engines, agentic RAG, and domain-specific entity extraction using SLMs

Mobicip Technologies

Full Stack Developer • Aug, 2021 — Oct, 2022

Bridged full-stack development with data science, contributing to both frontend and backend architecture for parental control applications.

  • Built APIs for signup and subscription systems using Ruby on Rails
  • Designed and implemented the database schema for the Admin Portal
  • Streamlined FAQ data processing through custom scripts and data cleaning pipelines

Equator Technologies

NLP Engineer • Mar, 2019 — Jul, 2021

Developed ML solutions for document classification, deduplication, and conceptual search across massive datasets.

  • Built and deployed 4 end-to-end ML projects handling over 1.5M text documents
  • Implemented MinHash and LSH for scalable document deduplication
  • Developed a conceptual search engine using shingling and efficient hashing techniques

Projects

Graph RAG - Hybrid Retrieval for Enterprise AI

Architect & Developer • Nov, 2024 — Present

Designed and deployed a Graph-based Retrieval-Augmented Generation system for enterprise use cases, leveraging Neo4j, Node2Vec, and NetworkX to improve retrieval efficiency and support multi-hop question answering.

Agentic RAG with Query Routing Optimization

ML Engineer • Feb, 2024 — June, 2024

Built an Agentic RAG system that combines dynamic tool use with optimized query routing, comparing LLM-based, embedding-based, and dependency-parsing-based routing methods. Achieved a peak F1-score of 0.959 for query classification.

Fine-tuning Multiple LLMs for Domain-Specific Tasks

ML Researcher • Mar, 2023 — Aug, 2023

Researched and fine-tuned LLMs (Llama 3, GPT) for QA and summarization tasks using few-shot learning and context-aware embeddings, with BERTScore for evaluation. Achieved a 30% improvement in response relevance using vector storage in MongoDB and Couchbase.

Large Scale Near Duplicate Identification

Developer & Researcher • Oct, 2020 — Apr, 2021

Implemented a scalable near-duplicate document detection system using Locality Sensitive Hashing (LSH) and shingling to process 1.6 million documents. Achieved O(1) average-case candidate lookup per query with a tuned LSH configuration.
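
A minimal sketch of the shingling, MinHash, and LSH banding approach named in this project, in Python with only the standard library. It is illustrative and not the production implementation: the parameter values (NUM_HASHES, BANDS, the shingle size k) and every function and class name are assumptions chosen for the example. It shows why lookup cost does not grow with corpus size: a query performs a fixed number of dictionary lookups, one per band.

```python
import hashlib
from collections import defaultdict

# Assumed parameters for illustration only.
NUM_HASHES = 64             # length of each MinHash signature
BANDS = 16                  # number of LSH bands
ROWS = NUM_HASHES // BANDS  # signature rows per band (4 here)


def shingles(text, k=5):
    """Return the set of character k-shingles of case- and whitespace-normalized text."""
    text = " ".join(text.lower().split())
    return {text[i:i + k] for i in range(max(1, len(text) - k + 1))}


def _seeded_hash(shingle, seed):
    """64-bit hash of a shingle; each seed acts as a different hash function."""
    data = f"{seed}:{shingle}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(shingle_set):
    """MinHash signature: the minimum hash value under each seeded hash function."""
    return tuple(
        min(_seeded_hash(s, seed) for s in shingle_set)
        for seed in range(NUM_HASHES)
    )


class LSHIndex:
    """Banded LSH index: documents sharing any full band land in the same bucket."""

    def __init__(self):
        self.buckets = defaultdict(set)

    def _band_keys(self, signature):
        for band in range(BANDS):
            yield (band, signature[band * ROWS:(band + 1) * ROWS])

    def add(self, doc_id, signature):
        for key in self._band_keys(signature):
            self.buckets[key].add(doc_id)

    def candidates(self, signature):
        """Return ids of documents that share at least one band with the query."""
        found = set()
        for key in self._band_keys(signature):  # exactly BANDS dictionary lookups
            found |= self.buckets.get(key, set())
        return found
```

To use it, index each document once with `index.add(doc_id, minhash_signature(shingles(text)))`, then call `index.candidates(...)` with the signature of a query document. The result is a candidate set only; a real system would usually confirm each candidate with an exact or estimated Jaccard similarity, and would tune the band and row counts to the similarity threshold it cares about.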