- Deep Learning Intern — LLM Research & Model Safety
Description
We're seeking a Deep Learning Intern passionate about advancing Large Language Model (LLM) research, with a focus on safety, interpretability, and alignment. In this role, you'll investigate model behavior, identify vulnerabilities, and design fine-tuning and evaluation strategies to make AI systems more robust and trustworthy.
You'll collaborate with researchers and engineers to experiment with LLMs, Vision-Language Models (VLMs), and multimodal architectures, contributing to next-generation AI systems that are both powerful and safe.
This is a 12-week, full-time, on-site internship at our San Jose, California office, where you'll work on high-impact projects that directly support our mission. We're looking for motivated students eager to apply their technical and research skills to shape the future of responsible AI.
Your Responsibilities
- Research and prototype methods to improve safety, interpretability, and reliability of LLMs
- Fine-tune pre-trained LLMs on curated datasets for task adaptation and behavioral control
- Design evaluation frameworks to measure robustness, alignment, and harmful output rates
- Conduct adversarial and red-teaming experiments to uncover weaknesses in model responses
- Collaborate with engineering teams to integrate findings into production inference systems
- Explore and experiment with multimodal model extensions, including VLMs and audio-based models
- Stay up-to-date with the latest research on model alignment, parameter-efficient tuning, and safety benchmarks
Qualifications
- Currently enrolled in a Bachelor's, Master's, or PhD program in Computer Engineering or a related field in the U.S. for the full duration of the internship
- Graduation expected between December 2026 and June 2027
- Available for 12 weeks between May and August 2026 or between June and September 2026
- Strong programming skills in Python and experience with deep learning frameworks (PyTorch or TensorFlow)
- Understanding of transformer architectures, attention mechanisms, and scaling laws
- Experience or coursework in LLM fine-tuning, LoRA/QLoRA, or instruction-tuning methods
- Familiarity with evaluation datasets and safety benchmarks (e.g., HELM, TruthfulQA, JailbreakBench)
- Interest in AI safety, interpretability, or bias detection
- Exposure to Vision-Language Models (VLMs), speech/audio models, or multimodal architectures is a plus
- Ability to implement research ideas into working prototypes efficiently
What You'll Gain
- Hands-on experience in LLM and multimodal model research, focusing on safety and performance
- Exposure to fine-tuning, red-teaming, and evaluation of frontier AI models
- Mentorship from experts working at the intersection of deep learning research and AI safety engineering
- Opportunities to publish internal studies or papers and contribute to real-world model safety initiatives
Compensation:
- BS: $50/hour
- MS: $58/hour
- PhD: $65/hour