AI Engineer, ML Engineer, RAG Engineer, Applied AI, LLM Infrastructure, NLP Research, Quant ML, Deep Learning Engineer, Agentic AI, Supabase, pgvector, ONNX, CUDA, PyTorch, C++, Python, Next.js, React
systems_engineer_run // ACTIVE
Building Production-Grade AI Systems for the Next Era of Computing
B.Tech Data Science & AI student at IIIT Dharwad working on RAG systems, agentic AI pipelines, multilingual ML research, and low-latency inference infrastructure.
Metrics that demonstrate production execution, low-latency infrastructure design, and mathematical foundations. Hover over or click a card to view its engineering implementation details.
4,582,104+
Market Data Snapshots Processed
// Problem Solved: Training high-capacity deep learning models (DeepLOB & TransformerLOB) on raw LOB (Limit Order Book) snapshots without memory leakage or sliding-window performance bottlenecks.
// Scale Check: 4.5 million snapshots (FI-2010 benchmark dataset), processing 100 levels of order book bids/asks across 5 prediction horizons.
// Business Value: Enabled high-accuracy mid-price movement forecasts under sub-millisecond conditions for high-frequency trading simulations.
STACK: PyTorch, NumPy, C++, Sliding-Window Cache
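The sliding-window idea behind this card can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the `LobWindows` class, the window length, and the toy snapshot values are all hypothetical. It shows one common way to avoid materializing every overlapping window in memory, by slicing each window from the raw sequence only when it is requested.

```python
from typing import List, Sequence, Tuple


class LobWindows:
    """Lazily index fixed-length windows over a snapshot sequence.

    Hypothetical helper: windows are sliced on demand, so memory stays
    proportional to the raw data, not to (number of windows x window length).
    """

    def __init__(self, snapshots: Sequence[Sequence[float]],
                 labels: Sequence[int], window: int) -> None:
        if window <= 0 or window > len(snapshots):
            raise ValueError("window must be in 1..len(snapshots)")
        if len(labels) != len(snapshots):
            raise ValueError("need exactly one label per snapshot")
        self.snapshots = snapshots
        self.labels = labels
        self.window = window

    def __len__(self) -> int:
        # Number of complete windows that fit in the sequence.
        return len(self.snapshots) - self.window + 1

    def __getitem__(self, i: int) -> Tuple[List[Sequence[float]], int]:
        if not 0 <= i < len(self):
            raise IndexError(i)
        end = i + self.window
        # The label of a window is the label of its last snapshot.
        return list(self.snapshots[i:end]), self.labels[end - 1]


# Toy data (assumed values): 6 snapshots with 2 features each.
snaps = [[float(t), float(t) + 0.5] for t in range(6)]
labels = [0, 1, 2, 1, 0, 2]
ds = LobWindows(snaps, labels, window=3)
first_window, first_label = ds[0]
```

A class with `__len__` and `__getitem__` like this matches the map-style dataset protocol that PyTorch's `DataLoader` accepts, which is one reason the lazy-indexing design is a natural fit for this kind of training loop.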
22
Languages Evaluated & Classified
// Problem Solved: Detecting content polarization and framing anomalies across highly diverse low-resource and high-resource languages without training a separate model per language.
// Scale Check: 22 distinct language datasets evaluated using a single unified model ensemble under the ACL SemEval-2026 Task 9 framework.
// Business Value: Achieved a macro-F1 of 0.797, outperforming the majority-class baseline by 33.6 percentage points and zero-shot Llama-3-8B-Instruct by 26.3 percentage points.
// Problem Solved: Eliminating manual query routing and reducing database execution bottlenecks for domain-specific language (DSL) queries requiring relational SQL or Neo4j Graph queries.
// Scale Check: Evaluated on a 10,000+ query hybrid workload, classifying query targets dynamically in real time.
// Business Value: Automated routing decisions with an F1 score of 97.3% and incorporated SHAP-based feature weight explainability, cutting manual analysis by 40%.
STACK: XGBoost, Logistic Regression, SHAP, Scikit-Learn
84.32%
HFT Mid-Price Prediction Accuracy
// Problem Solved: Predicting micro-structural price directions from highly volatile, noisy limit order book tick-level streams in real-time trading.
// Scale Check: High-frequency limit order book order matching, tested across 5 sequential future time horizons.
// Business Value: Achieved state-of-the-art mid-price accuracy of 84.3%, enabling simulated trading strategies to outperform standard random-walk baselines.
// Problem Solved: High token consumption and API cost overhead from redundant natural-language questions that business users ask against NL-to-SQL analytics schemas.
// Scale Check: Tested against 50k-row CSV uploads and 10+ relational database schemas under multi-user concurrency.
// Business Value: Saved ~60% in LLM API fees and cut average query response time from 4s to 1.5s via a vector-based semantic cache.
STACK: Supabase, pgvector, nomic-embed-text, Groq API
1.750x
C++ Inference Speedup
// Problem Solved: Python interpreter overhead and added latency in deep learning model inference, which exceeded the maximum time budget for high-frequency execution.
Standard natural-language database queries consume significant API tokens and suffer from high latency (typically 4s+ per LLM query). DataChat addresses this with pgvector schema matching and an intelligent semantic cache.
90%+
Query Accuracy
~60%
API Call Reduction
1.5s
Response Latency
> Engineering Challenges:
Semantic Cache Tuning: Tuned the cosine-distance threshold in Supabase pgvector to balance the cache-hit rate against the risk of returning a wrong cached answer for a question that is only superficially similar.
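The threshold trade-off described above can be illustrated with a small in-memory sketch. It is not the DataChat implementation: the `SemanticCache` class, the `max_distance` value, and the two-dimensional toy embeddings are assumptions made for illustration. In pgvector, the comparable check is typically written with the `<=>` cosine-distance operator instead of a Python loop.

```python
import math
from typing import List, Optional, Tuple


def cosine_distance(a: List[float], b: List[float]) -> float:
    """Return 1 - cosine similarity for two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / norm


class SemanticCache:
    """Hypothetical cache: reuse an answer if a stored query is close enough."""

    def __init__(self, max_distance: float) -> None:
        # Lower threshold -> fewer hits but fewer wrong reuses;
        # higher threshold -> more hits but more risk of a mismatched answer.
        self.max_distance = max_distance
        self._entries: List[Tuple[List[float], str]] = []

    def put(self, embedding: List[float], answer: str) -> None:
        self._entries.append((embedding, answer))

    def get(self, embedding: List[float]) -> Optional[str]:
        best_answer: Optional[str] = None
        best_distance = math.inf
        for stored, answer in self._entries:
            distance = cosine_distance(embedding, stored)
            if distance < best_distance:
                best_distance, best_answer = distance, answer
        # Only a neighbor inside the threshold counts as a cache hit.
        return best_answer if best_distance <= self.max_distance else None


cache = SemanticCache(max_distance=0.05)  # assumed threshold, not a tuned value
cache.put([1.0, 0.0], "cached SQL result")
hit = cache.get([0.99, 0.05])   # nearly the same direction as the stored query
miss = cache.get([0.0, 1.0])    # orthogonal query, falls through to the LLM
```

A `None` result is the signal to call the LLM and then `put` the fresh answer, so the threshold directly controls how often the paid API path is taken.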
Multilingual polarization detection suffers from heavy syntactic variation and cross-lingual representation drift. This work fine-tunes mDeBERTa-v3-base and XLM-RoBERTa-large with 4-bit QLoRA, a parameter-efficient fine-tuning method. The dual encoders feed their outputs into an XGBoost meta-classifier, combined with a Shannon entropy routing mechanism that dynamically routes predictions to expert nodes.
// Model Ensemble Architecture:
DUAL-ENCODER META-STACKING LAYER
✔ QLoRA Alignment: Fine-tuned embedding matrices across 22 languages simultaneously using single GPU setups, maintaining mathematical representation mapping.
✔ Entropy-Based Routing: Shannon entropy serves as a metric threshold to dynamically bypass the stack and route simple query inputs directly, saving 35% computation.
✔ Expert Meta-Classifier: Blended mDeBERTa-v3 and XLM-R models via XGBoost meta-stacking to resolve model-specific bias profiles.
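The entropy-based routing step can be sketched roughly as below. This is an illustrative reading of the idea, not the paper's code: the function names, the route labels `fast_path` and `meta_stack`, and the 0.5-bit threshold are assumptions. The sketch computes the Shannon entropy $H(p) = -\sum_i p_i \log_2 p_i$ of a predicted class distribution and sends low-entropy (confident) inputs past the stack.

```python
import math
from typing import Sequence


def shannon_entropy(probs: Sequence[float]) -> float:
    """Entropy in bits of a probability distribution (zero terms are skipped)."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def route(probs: Sequence[float], threshold_bits: float) -> str:
    """Pick a path from the uncertainty of a first-pass prediction.

    Low entropy means the first encoder is already confident, so the
    more expensive meta-stack can be skipped for that input.
    """
    if shannon_entropy(probs) <= threshold_bits:
        return "fast_path"
    return "meta_stack"


THRESHOLD_BITS = 0.5  # assumed value for illustration, not a tuned setting

confident = route([0.98, 0.02], THRESHOLD_BITS)   # entropy is about 0.14 bits
uncertain = route([0.55, 0.45], THRESHOLD_BITS)   # entropy is about 0.99 bits
```

For a binary decision the entropy ranges from 0 bits (fully confident) to 1 bit (a coin flip), so the threshold sets how much uncertainty is tolerated before the full ensemble is invoked.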
// Macro-F1 Score Performance Comparison (higher is better):
Dual-Encoder Stacking (Ours): 79.7%
Llama-3-8B-Instruct (Zero-Shot): 53.4%
mDeBERTa-v3-base (Baseline): 68.2%
Majority Class Baseline: 46.1%
* Note: The dual-encoder fusion architecture outperforms Llama-3-8B-Instruct by +26.3 pp and the majority-class baseline by +33.6 pp across the 22-language benchmark.
22-Language Evaluation Matrix (SemEval-2026)
EN  English     F1: 0.842
ES  Spanish     F1: 0.825
FR  French      F1: 0.819
DE  German      F1: 0.814
IT  Italian     F1: 0.808
PT  Portuguese  F1: 0.803
RU  Russian     F1: 0.795
ZH  Chinese     F1: 0.792
AR  Arabic      F1: 0.781
HI  Hindi       F1: 0.789
BN  Bengali     F1: 0.772
TA  Tamil       F1: 0.765
TE  Telugu      F1: 0.762
UR  Urdu        F1: 0.758
FA  Persian     F1: 0.754
TR  Turkish     F1: 0.778
ID  Indonesian  F1: 0.784
VI  Vietnamese  F1: 0.776
KO  Korean      F1: 0.788
JA  Japanese    F1: 0.791
SW  Swahili     F1: 0.741
YO  Yoruba      F1: 0.732
Total Languages: 22 evaluated
Average Multilingual F1: 0.785
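As a quick arithmetic check, the reported average can be reproduced from the per-language scores in the matrix. The sketch below only re-derives the unweighted mean of the 22 listed F1 values. The variable names are illustrative, and it makes no claim about how the scores themselves were computed.

```python
from statistics import mean

# Per-language F1 scores copied from the evaluation matrix.
per_language_f1 = {
    "EN": 0.842, "ES": 0.825, "FR": 0.819, "DE": 0.814, "IT": 0.808,
    "PT": 0.803, "RU": 0.795, "ZH": 0.792, "AR": 0.781, "HI": 0.789,
    "BN": 0.772, "TA": 0.765, "TE": 0.762, "UR": 0.758, "FA": 0.754,
    "TR": 0.778, "ID": 0.784, "VI": 0.776, "KO": 0.788, "JA": 0.791,
    "SW": 0.741, "YO": 0.732,
}

# Unweighted mean across languages, rounded to three decimals.
average_f1 = round(mean(per_language_f1.values()), 3)

# Lowest- and highest-scoring languages, useful for spotting the resource gap.
weakest = min(per_language_f1, key=per_language_f1.get)
strongest = max(per_language_f1, key=per_language_f1.get)
```

The unweighted mean treats every language equally regardless of dataset size, which is why it can differ slightly from a pooled score computed over all examples.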
Engineering Stack
System Infrastructure Rack
Proficiencies categorized by architecture layers. Hovering over a skill shows which production system or paper used it. Hovering over a project badge highlights its technology stack in the cabinet.
Layer 1: AI Automation & LLMs
RAG Pipelines
Agentic AI
Multi-Agent Systems
LangChain
LlamaIndex
Groq LLM
Transformers
Layer 2: ML Infrastructure & Training
PyTorch
TensorFlow
ONNX Runtime
CUDA
QLoRA
XGBoost Stacking
MLflow / W&B
Layer 3: Backend & APIs
Python
C++
FastAPI
Docker
Kubernetes
Git / PR Workflows
Layer 4: Data & Retrieval
pgvector (Supabase)
ChromaDB / FAISS
MongoDB
MySQL / SQL
Neo4j / Graphs
Layer 5: Frontend & Visualization
Next.js 14
React
TailwindCSS
Recharts
shadcn/ui
Growth Path
Engineering Milestones
A chronological git history detailing formal education, published scientific literature, and active code branches.
Expected May 2028 // branch: main
GPA: 9.1, Data Science, Systems
B.Tech in Data Science & AI
IIIT Dharwad (Till 4th Semester)
Maintaining a high academic standing with a CGPA of 9.1 / 10.0. Engaging in deep systems coursework including Database Management Systems, Data Structures, Statistics, and Probability.
January 2026 // branch: research-nlp
QLoRA, XGBoost Stacking, ACL Anthology
ACL SemEval-2026 Author
Association for Computational Linguistics Proceedings
Co-authored a paper on robust multilingual polarization detection across 22 languages. Engineered the stacked classifier using QLoRA and Shannon entropy expert-routing layers.
Late 2025 // branch: hft-dev
C++, ONNX Runtime, High Frequency Trading
DeepLOB C++ Inference Engineering
Quant/HFT Model Acceleration Project
Architected a C++ deployment pipeline for limit order book tick analysis. Exported CNN/Transformer models to ONNX, achieving a 1.75x inference speedup over Python baselines.
Mid 2025 // branch: hft-dev
XGBoost, SHAP Explainability, Neo4j / SQL
HIFUN Router Development
Hybrid Query Optimization System
Co-developed an ML routing node that automatically classifies the target backend of each DSL query. Evaluated on a benchmark workload of 10k+ queries, reaching an F1 score of 97.3%.
Early 2025 // branch: hft-dev
pgvector, Groq API, Semantic Cache
DataChat Architecture Launch
RAG NL-to-SQL Database Interface
Engineered an agentic database querying workflow using pgvector and a semantic caching layer. Reduced API costs by ~60% and cut average response time to 1.5s.
Ongoing // branch: main
Minor GPA: 9.0, Agentic Systems, Prompt Engineering
Generative AI Minor Specialization
Academic Honors Track
Completing dedicated honors coursework focused on LLM optimization, deep representations, prompt architecture, and multi-agent loops. Minor GPA: 9.0 / 10.0.