Senior Research Scientist (Multimodal Large Language Model) - PICO jobs in United States
cer-icon
Apply on Employer Site
company-logo

ByteDance · 12 hours ago

Senior Research Scientist (Multimodal Large Language Model) - PICO

ByteDance is a VR company committed to developing immersive and interactive VR experiences. They are seeking a Senior Research Scientist to lead the R&D of multimodal large language models tailored for mixed reality scenarios, focusing on integrating various modalities and enhancing user interaction through innovative technologies.

ContentData MiningFoundational AIInternetSocial Media
check
Comp. & Benefits
check
H1B Sponsor Likelynote

Responsibilities

Lead the R&D of multimodal large language models (MLLM) tailored for MR scenarios, integrating vision, point clouds, text, and other multimodal information—including model architecture optimization, cross-modal alignment, data construction, evaluation system enhancement, and end-to-end training/inference acceleration
Drive the research and implementation of MLLM tool-use capabilities in MR environments, enabling models to proficiently utilize spatial interaction and spatial computing-related professional tools, support tool calls for both single-turn and multi-turn conversations, and solve complex user tasks through interaction
Address key challenges in long-horizon, multi-turn tool-augmented tasks in MR, such as context memory management, tool selection strategy, and error correction mechanisms
Keep abreast of cutting-edge technologies in MLLM, multimodal intelligence, and tool-use research, and lead the application and deployment of innovative technologies in PICO's MR products
Collaborate with cross-functional teams (including software engineering, product design, and hardware development) to translate research outcomes into practical features that enhance user experience

Qualification

Multimodal large model expertiseModel optimizationPythonC++Deep learning frameworks2D/3D computer visionReinforcement learningIndependent research abilitiesProblem-solving skillsCollaboration skillsCommunication skills

Required

Master's or Ph.D. degree in Computer Science, Electrical Engineering, Machine Learning, Artificial Intelligence, or a related quantitative field
Expertise in multimodal large model pre-training, post-training, fine-tuning, or cross-modal fusion technologies, with hands-on experience in model optimization, training workflow design, and performance tuning
Proven research experience in LLM tool use, reinforcement learning, LLM agents, or interactive learning, with a deep understanding of single-turn and multi-turn interaction mechanisms
Proficiency in core 2D/3D computer vision tasks, including detection, segmentation, depth estimation, image matching, and 3D scene perception
Skilled in Python and C++, with solid programming capabilities and experience in developing large-scale models using mainstream deep learning frameworks (PyTorch/TensorFlow)
Excellent problem-solving and independent research abilities, capable of addressing complex technical challenges in the integration of MR and MLLM tool use

Preferred

Publications in AI/ML/CV conferences (e.g., NeurIPS, ICML, ICLR, CVPR, ICCV, ECCV, ACL, EMNLP) focusing on multimodal large models, LLM tool use, or agent systems
Hands-on experience in building large-scale MLLM training pipelines, tool-use evaluation systems, or multimodal agent platforms
Familiarity with MR/AR/VR technologies, spatial computing, or 3D scene reconstruction (3DGS, NeRF, etc.) is a strong plus
Experience in addressing long-horizon reasoning or asynchronous agent behavior challenges is highly valued
Award winners of competitions such as ACM-ICPC, NOI/IOI, TopCoder, or AI/ML contests (e.g., Kaggle) are preferred
Strong collaboration and communication skills, able to lead research initiatives and drive cross-team technical alignment

Benefits

Medical, dental, and vision insurance
401(k) savings plan with company match
Paid parental leave
Short-term and long-term disability coverage
Life insurance
Wellbeing benefits
10 paid holidays per year
10 paid sick days per year
17 days of Paid Personal Time (prorated upon hire with increasing accruals by tenure)

Company

ByteDance

company-logo
ByteDance is a technology company that develops content creation platforms and services.

H1B Sponsorship

ByteDance has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Below presents additional info for your reference. (Data Powered by US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship
Represents job field similar to this job
Trends of Total Sponsorships
2025 (1350)
2024 (1123)
2023 (775)
2022 (487)
2021 (417)
2020 (245)

Funding

Current Stage
Late Stage
Total Funding
$9.8B
Key Investors
Capital TodayG42Tiger Global Management
2025-11-20Secondary Market· $300M
2024-07-25Secondary Market
2023-03-14Secondary Market· $100M

Leadership Team

leader-logo
Jochen Bischoff
Head of Global Business Solutions - Africa
linkedin
leader-logo
Matty Lin
General Manager, Global Business Solutions, KR
linkedin
Company data provided by crunchbase