Overview
This project introduces a modular multi-agent system designed to enhance code generation, debugging, and explanation through specialized agent roles, integrating Reinforcement Learning from Human Feedback (RLHF) and Reinforcement Learning from AI Feedback (RLAIF). A Streamlit UI provides interactive user engagement and collects structured feedback.
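As an illustration of how such a feedback UI might be wired up, here is a minimal Streamlit sketch; the widget layout, session keys, and the stubbed agent result are hypothetical placeholders, not the project's actual interface.

```python
import streamlit as st

st.title("Multi-Agent Code Assistant")

query = st.text_area("Describe your coding task")
if st.button("Run agents") and query:
    # Hypothetical stub standing in for a call into the compiled agent graph.
    st.session_state["result"] = {"code": "# generated code", "explanation": "..."}

if "result" in st.session_state:
    st.code(st.session_state["result"]["code"], language="python")
    st.write(st.session_state["result"]["explanation"])
    # A form keeps the rating and comment together in one submit event.
    with st.form("feedback"):
        rating = st.slider("Rate this answer", 1, 5, 3)
        comment = st.text_input("Optional comment")
        if st.form_submit_button("Submit"):
            # Structured feedback records like these feed the RLHF pipeline.
            st.session_state.setdefault("feedback", []).append(
                {"query": query, "rating": rating, "comment": comment}
            )
            st.success("Feedback recorded")
```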
Technical Highlights
The system leverages a hierarchical agent architecture:
- Planner Agent: Routes and classifies user queries.
- Chain-of-Thought (CoT) Agent: Produces structured reasoning and proposes algorithmic strategies.
- Developer Agent: Generates executable Python code.
- Debugger Agent: Identifies and corrects code issues.
- Explainer Agent: Offers concise, user-friendly code explanations.

Agent communication is managed via a LangGraph workflow, ensuring well-defined state transitions and a clear data flow between agents.
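A minimal sketch of what such a LangGraph workflow could look like is below; the state fields, node functions, and routing labels are illustrative assumptions rather than the project's actual graph.

```python
from typing import TypedDict
from langgraph.graph import StateGraph, END

# Hypothetical shared state passed between agents (field names are assumptions).
class AgentState(TypedDict):
    query: str
    route: str
    plan: str
    code: str
    explanation: str

def planner(state: AgentState) -> AgentState:
    # Toy classification; the real planner would call its underlying model.
    route = "debug" if "error" in state["query"].lower() else "generate"
    return {**state, "route": route}

def cot_agent(state: AgentState) -> AgentState:
    return {**state, "plan": "step-by-step strategy for: " + state["query"]}

def developer(state: AgentState) -> AgentState:
    return {**state, "code": "# generated code based on the plan\n"}

def debugger(state: AgentState) -> AgentState:
    return {**state, "code": "# corrected code\n"}

def explainer(state: AgentState) -> AgentState:
    return {**state, "explanation": "plain-language summary of the code"}

graph = StateGraph(AgentState)
for name, fn in [("planner", planner), ("cot", cot_agent),
                 ("developer", developer), ("debugger", debugger),
                 ("explainer", explainer)]:
    graph.add_node(name, fn)

graph.set_entry_point("planner")
# Route new requests through CoT reasoning; send bug reports to the debugger.
graph.add_conditional_edges("planner", lambda s: s["route"],
                            {"generate": "cot", "debug": "debugger"})
graph.add_edge("cot", "developer")
graph.add_edge("developer", "explainer")
graph.add_edge("debugger", "explainer")
graph.add_edge("explainer", END)

app = graph.compile()
result = app.invoke({"query": "Write a function to merge two sorted lists"})
```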
Challenges and Solutions
- Agent Coordination: Addressed by implementing a LangGraph-based state machine for efficient inter-agent communication.
- Model Efficiency: Employed knowledge distillation to fine-tune lightweight student models using larger teacher models, balancing computational cost and performance (a sketch of the objective follows this list).
- Feedback Integration: Implemented PPO-based fine-tuning driven by both human and AI feedback, iteratively improving agent performance.
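The distillation objective referenced above is typically a blend of a softened KL term against the teacher's logits and a hard-label cross-entropy term; the PyTorch sketch below assumes that standard formulation, with illustrative temperature and mixing-weight values rather than the project's tuned hyperparameters.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Blend soft-target KL (teacher guidance) with hard-label cross-entropy.

    temperature and alpha are illustrative hyperparameters.
    """
    # Soften both distributions so the student learns inter-class structure.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    # The KL term is scaled by T^2 to keep gradient magnitudes comparable.
    kl = F.kl_div(log_soft_student, soft_teacher,
                  reduction="batchmean") * temperature ** 2
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kl + (1 - alpha) * ce
```

For token-level language-model distillation, the logits would be flattened over the vocabulary dimension before applying the same loss.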
Technologies and Tools
- Python, PyTorch, Hugging Face Transformers
- LangGraph, Streamlit
- Reinforcement Learning (RLHF, RLAIF), PPO
My Role
Designed and implemented RLHF and RLAIF frameworks, developed specialized reward models, integrated PPO for agent optimization, and managed feedback-driven enhancements.
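To make the PPO feedback loop concrete, below is a minimal sketch using trl's classic PPOTrainer interface (pre-0.12 API); the base model, hyperparameters, and the toy reward_fn standing in for the learned reward models are all assumptions.

```python
import torch
from transformers import AutoTokenizer
from trl import AutoModelForCausalLMWithValueHead, PPOConfig, PPOTrainer

# Hypothetical base model; the project's actual checkpoints may differ.
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLMWithValueHead.from_pretrained(model_name)
config = PPOConfig(model_name=model_name, batch_size=4, mini_batch_size=2)
ppo_trainer = PPOTrainer(config, model, ref_model=None, tokenizer=tokenizer)

def reward_fn(prompt: str, response: str) -> torch.Tensor:
    # Toy stand-in for the learned reward model: in practice this would score
    # responses using human (RLHF) or AI (RLAIF) preference signals.
    return torch.tensor(1.0 if "def " in response else -1.0)

prompts = ["Write a Python function that reverses a string."] * config.batch_size
query_tensors = [tokenizer.encode(p, return_tensors="pt").squeeze(0)
                 for p in prompts]

# One PPO iteration: sample responses, score them, then update the policy.
response_tensors = [
    ppo_trainer.generate(q, return_prompt=False, max_new_tokens=64).squeeze(0)
    for q in query_tensors
]
responses = [tokenizer.decode(r, skip_special_tokens=True)
             for r in response_tensors]
rewards = [reward_fn(p, r) for p, r in zip(prompts, responses)]
stats = ppo_trainer.step(query_tensors, response_tensors, rewards)
```

In the real system, reward_fn would be replaced by the specialized reward models trained on human and AI feedback, and this step would run over many batches of user queries.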
Key Results
The system demonstrated performance comparable to that of significantly larger models, notably improving code accuracy, robustness, and explanation clarity. Detailed evaluations, feedback loops, and results are included in the project report (at the top of the page).