MJ

Multi-Agent Code Development System

LLM, Distillation, Prompt Engineering, Fine-tuning, LangGraph, RLHF, RLAIF, Streamlit, Hugging Face Transformers, PyTorch, Chatbot

Developed a modular multi-agent system for code generation, debugging, and explanation, enhanced by reinforcement learning from human and AI feedback.

Overview

This project introduces a modular multi-agent system that improves code generation, debugging, and explanation by assigning each task to a specialized agent. The agents are refined with Reinforcement Learning from Human Feedback (RLHF) and Reinforcement Learning from AI Feedback (RLAIF). A Streamlit UI lets users interact with the system and provide structured feedback.

Technical Highlights

The system leverages a hierarchical agent architecture:

  • Planner Agent: Routes and classifies user queries.
  • Chain-of-Thought (CoT) Agent: Provides structured reasoning and algorithm strategies.
  • Developer Agent: Generates executable Python code.
  • Debugger Agent: Identifies and corrects code issues.
  • Explainer Agent: Offers concise, user-friendly code explanations.

[Figure: Project diagram]

Agent communication is managed via a LangGraph workflow, which defines the state transitions between agents and the data passed at each step.
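
To make the routing idea concrete, here is a minimal sketch of the agent flow as a plain-Python state machine. It does not use the LangGraph API and is not the project's actual implementation: the keyword-based planner, the stubbed agent bodies, and the edge layout (which agent follows which) are all assumptions made for illustration.

```python
from typing import Callable, Dict, List

State = Dict[str, str]
END = "END"


def planner(state: State) -> State:
    # Hypothetical keyword routing; a real planner would likely call an LLM.
    query = state["query"].lower()
    if "fix" in query or "bug" in query:
        route = "debugger"
    elif "explain" in query:
        route = "explainer"
    else:
        route = "cot"
    return {**state, "route": route}


def cot(state: State) -> State:
    # Stub: stands in for structured reasoning about the task.
    return {**state, "plan": f"Plan for: {state['query']}"}


def developer(state: State) -> State:
    # Stub: stands in for model-generated Python code.
    return {**state, "code": "def solution():\n    pass"}


def debugger(state: State) -> State:
    # Stub: a trivial "fix" that normalizes tabs to spaces.
    return {**state, "code": state.get("code", "").replace("\t", "    ")}


def explainer(state: State) -> State:
    # Stub: stands in for a model-written explanation.
    size = len(state.get("code", ""))
    return {**state, "explanation": f"Explains {size} characters of code."}


NODES: Dict[str, Callable[[State], State]] = {
    "planner": planner,
    "cot": cot,
    "developer": developer,
    "debugger": debugger,
    "explainer": explainer,
}

# Each edge is a function of the state, so the planner's edge can be conditional.
EDGES: Dict[str, Callable[[State], str]] = {
    "planner": lambda s: s["route"],
    "cot": lambda s: "developer",
    "developer": lambda s: "debugger",
    "debugger": lambda s: "explainer",
    "explainer": lambda s: END,
}


def run(query: str, code: str = "") -> State:
    """Walk the graph from the planner until END, recording visited nodes."""
    state: State = {"query": query, "code": code}
    visited: List[str] = []
    node = "planner"
    while node != END:
        visited.append(node)
        state = NODES[node](state)
        node = EDGES[node](state)
    return {**state, "trace": " -> ".join(visited)}


if __name__ == "__main__":
    print(run("Write a function to sort a list")["trace"])
    print(run("Fix this bug", code="x = 1")["trace"])
```

In LangGraph, the same shape would map onto a StateGraph, with each agent registered through add_node and the transitions declared with add_edge and add_conditional_edges.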

Challenges and Solutions

  • Agent Coordination: Addressed by implementing a LangGraph-based state machine for efficient inter-agent communication.
  • Model Efficiency: Employed knowledge distillation to fine-tune lightweight student models using larger teacher models, balancing computational efficiency and performance.
  • Feedback Integration: Implemented PPO-based fine-tuning with both human and AI feedback, enhancing agent performance iteratively.
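
As an illustration of the Model Efficiency point, the sketch below computes a standard knowledge-distillation loss for a single example: a temperature-softened KL term that pulls the student toward the teacher, mixed with ordinary cross-entropy on the true label. The project's actual loss, temperature, and mixing weight are not stated here, so the values of temperature and alpha are assumptions, and the code uses plain Python rather than PyTorch to stay dependency-free.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(
    student_logits: List[float],
    teacher_logits: List[float],
    label: int,
    temperature: float = 2.0,  # assumed value, not from the project
    alpha: float = 0.5,        # assumed soft/hard mixing weight
) -> float:
    """alpha * T^2 * KL(teacher || student) + (1 - alpha) * cross-entropy."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)

    # Soft-target term: KL divergence between the softened distributions.
    kl = sum(
        pt * (math.log(pt) - math.log(ps))
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0.0
    )

    # Hard-target term: cross-entropy against the ground-truth label (T = 1).
    hard = -math.log(softmax(student_logits)[label])

    # T^2 keeps the soft-term gradient scale comparable across temperatures.
    return alpha * temperature ** 2 * kl + (1.0 - alpha) * hard


if __name__ == "__main__":
    teacher = [2.0, 0.5, -1.0]
    print(distillation_loss([2.0, 0.5, -1.0], teacher, label=0))
    print(distillation_loss([-1.0, 0.5, 2.0], teacher, label=0))
```

A student whose logits match the teacher's incurs no KL penalty, so only the cross-entropy term remains; a student that disagrees with the teacher is penalized on both terms.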

Technologies and Tools

  • Python, PyTorch, Hugging Face Transformers
  • LangGraph, Streamlit
  • Reinforcement Learning (RLHF, RLAIF), PPO

My Role

Designed and implemented RLHF and RLAIF frameworks, developed specialized reward models, integrated PPO for agent optimization, and managed feedback-driven enhancements.
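
For readers unfamiliar with PPO, the sketch below shows its clipped surrogate objective together with one possible way to combine human and AI feedback into a single reward. How the project actually blends the two signals is not described here, so blended_reward and its human_weight parameter are hypothetical, and clip_eps = 0.2 is simply a commonly used default.

```python
import math
from typing import Optional, Sequence


def blended_reward(
    human_score: Optional[float],
    ai_score: float,
    human_weight: float = 0.7,  # hypothetical weighting, not from the project
) -> float:
    """Mix human and AI scores; fall back to the AI score when no human rated."""
    if human_score is None:
        return ai_score
    return human_weight * human_score + (1.0 - human_weight) * ai_score


def ppo_clip_objective(
    new_logprobs: Sequence[float],
    old_logprobs: Sequence[float],
    advantages: Sequence[float],
    clip_eps: float = 0.2,
) -> float:
    """Mean of min(r * A, clip(r, 1 - eps, 1 + eps) * A) over the samples."""
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logprobs, old_logprobs, advantages):
        # Probability ratio between the updated policy and the sampling policy.
        ratio = math.exp(new_lp - old_lp)
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # The pessimistic minimum removes any incentive to move the ratio
        # far outside the clip range in a single update.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


if __name__ == "__main__":
    print(blended_reward(1.0, 0.0))
    print(ppo_clip_objective([0.0], [-1.0], [1.0]))
```

The objective is maximized during training (or its negative minimized). With a positive advantage the gain is capped once the ratio exceeds 1 + clip_eps, while with a negative advantage the unclipped, more pessimistic value is kept.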

Key Results

The system achieved performance comparable to that of significantly larger models, with notable improvements in code accuracy, robustness, and explanation clarity. The detailed evaluation, feedback loops, and results are included in the project report (see the top of the page).