
Together AI · 1 month ago

Network Architect

Together AI is a research-driven artificial intelligence company focused on building the next-generation AI compute platform. As a Network Architect, you will define and evolve the global network architecture that supports AI training and research, collaborating with various teams to ensure optimal performance and resiliency of the network.

AI Infrastructure · Artificial Intelligence (AI) · Generative AI · Internet · IT Infrastructure · Open Source
H1B Sponsor Likely

Responsibilities

Define and evolve Together AI’s global routing and backbone architecture, spanning self-built data centers, partner colocation sites, PoPs, cloud regions, and interconnect fabrics
Establish the end-to-end topology strategy for high-bandwidth AI workloads: east–west fabrics, spine/superspine/core, DCI, and cross-region interconnect
Design traffic engineering, load balancing, and capacity planning models to ensure low latency, deterministic performance, and fault tolerance at scale
Develop the multicloud interconnect and peering strategy, including BGP policy frameworks, route leak mitigation, and security posture across heterogeneous networks
Architect the control-plane stack for programmability, stability, and automation—including routing design, provisioning, configuration management, and state consistency
Establish foundational observability primitives for a global backbone (telemetry, flow sampling, path validation, synthetic testing, health models)
Work closely with compute, storage, hardware, and data platform teams to ensure network design meets the performance demands of distributed AI training workloads
Collaborate with operations and NOC teams to ensure designs are supportable, debuggable, and resilient under real-world failure conditions
Provide architectural direction and mentorship to engineers across the org, influencing long-term strategy for both physical and virtual network domains
Model evolving topologies for next-generation workloads (multi-Tbps east–west, high fan-in/fan-out distributed systems, GPU cluster fabrics)
Evaluate and guide the adoption of emerging technologies: advanced optical transport, RoCEv2, high-speed Ethernet fabrics, InfiniBand overlays, EVPN/VXLAN, SR-MPLS/SRv6, programmable data planes, and hardware offload

Qualifications

GPU cluster design · High-throughput data center fabrics · RoCEv2 architecture · Backbone · DCI architectures · Multi-cloud networking · Telemetry · Observability · Capacity modeling · Automation · Failure-mode analysis · Traffic engineering

Required

Have deep experience designing and operating large-scale GPU clusters or HPC-style compute fabrics, and understand the unique demands these workloads place on network design (east–west dominance, congestion behavior, fan-in/fan-out patterns, loss sensitivity)
Are fluent in building high-throughput data center fabrics (leaf–spine/superspine/core) that support tens of thousands of GPUs, multi-terabit east–west traffic, and strict performance SLAs
Have architected or operated RoCEv2 or lossless Ethernet environments at scale—including PFC/ECN tuning, congestion control, and end-to-end stability considerations
Are experienced designing backbone and DCI architectures that support GPU training clusters across multiple regions, interconnect exotic fabrics, and handle high-volume synchronization traffic
Have led architecture for networks spanning multiple clouds, private backbones, and diverse PoPs, and understand how AI workloads behave across these domains
Design with operational realities in mind: observability, capacity modeling, automation, telemetry, and failure-mode analysis for GPU-heavy environments
Are comfortable setting architectural direction in fast-moving environments where compute, storage, and network evolution are tightly coupled

Benefits

Startup equity
Health insurance
Other competitive benefits

Company

Together AI

Together AI is a cloud platform for building open-source generative AI and the infrastructure for developing AI models.

H1B Sponsorship

Together AI has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for your reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (19)
2024 (6)
2023 (3)

Funding

Current Stage
Growth Stage
Total Funding
$533.5M
Key Investors
Salesforce Ventures, Lux Capital
2025-02-20 · Series B · $305M
2024-03-13 · Series A · $106M
2023-11-29 · Series A · $102.5M

Leadership Team

Vipul Ved Prakash
Co-Founder & CEO
Kae Ike Lim
Executive Assistant to Co-Founder and CEO
Company data provided by Crunchbase