Research Scientist Intern, Real-Time Multimodal AI (PhD) jobs in United States

Meta · 10 hours ago

Research Scientist Intern, Real-Time Multimodal AI (PhD)

Meta is building the future of connection through world-class AR/VR hardware and software. The company is seeking an exceptional Research Scientist Intern to contribute to the development of real-time multimodal AI systems, focusing on fine-tuning and optimizing large foundation models for agent-based applications.

Computer Software

Responsibilities

Research and develop novel approaches for fine-tuning large multimodal foundation models (vision-language, audio-visual) for real-time applications
Design and implement efficient inference pipelines for deploying fine-tuned models in real-time communication scenarios
Explore agentic architectures that leverage fine-tuned models as tools within larger AI systems
Collaborate with cross-functional teams to integrate models into prototype experiences
Document and present research progress with the goal of publishing findings at top-tier ML/CV conferences
Contribute to building working prototypes that demonstrate the capabilities of fine-tuned multimodal models

Qualifications

Multimodal learning, Vision-language models, Fine-tuning large models, Python programming, Deep learning frameworks, Real-time communication systems, Cloud infrastructure (AWS), Containerization (Docker), Agentic AI systems, Parameter-efficient fine-tuning, Communication

Required

2+ years of research experience in one or more of the following areas: multimodal learning, vision-language models, large language models, or foundation model fine-tuning
Hands-on experience fine-tuning large foundation models (e.g., LLaVA, InternVL, Qwen-VL, LLaMA, or similar)
Strong programming skills in Python
Experience with deep learning frameworks such as PyTorch
Excellent communication skills and ability to work independently
Must obtain work authorization in the country of employment at the time of hire, and maintain ongoing work authorization during employment

Preferred

Proven track record of achieving significant results as demonstrated by first-authored publications at leading conferences such as NeurIPS, ICML, ICLR, CVPR, ICCV, ECCV, ICASSP, Interspeech, ACL, EMNLP, or similar
Experience with speech-to-speech LLMs or audio-visual foundation models
Familiarity with real-time communication systems (e.g., LiveKit, WebRTC) or low-latency inference optimization
Experience with cloud infrastructure (AWS) and containerization (Docker)
Experience with parameter-efficient fine-tuning techniques (LoRA, QLoRA, adapters, etc.)
Experience with agentic AI systems, tool-use, or function-calling in LLMs
Demonstrated software engineering experience via internships, work experience, or contributions to open source repositories (e.g., GitHub)
Intent to return to degree program after completion of the internship

Benefits

Company

Meta's mission is to build the future of human connection and the technology that makes it possible.

Funding

Current Stage: Late Stage

Leadership Team

Kathryn Glickman · Director, CEO Communications
Christine Lu · CTO Business Engineering NA
Company data provided by Crunchbase