
Job Information
Meta Software Engineering Manager - AI Systems Co-Design in Menlo Park, California
Summary:
Meta is seeking a hands-on engineering manager to join the AI & Systems Co-Design team. The team works at the intersection of hardware, software and AI technologies, contributing directly to Llama, DLRM, DCPerf, MTIA and many other cutting-edge open-source and internal infrastructure projects. The co-design AI team has established relationships with both academia and industry. We frequently collaborate with academia through internships and have a track record of publications in top AI, systems and architecture conferences. We partner closely with industry leaders to influence their roadmaps and build the best products for Meta’s infrastructure. Join us and be a part of the team that is shaping the future of Meta AI infrastructure!
Required Skills:
Software Engineering Manager - AI Systems Co-Design Responsibilities:
Lead and support the communications team that works on collective libraries, and contribute to enabling performance at scale for inference and training of our GenAI (Llama) and Ranking & Retrieval (DLRM) models
Enable the growth of individual contributors, drive the technical roadmap along with technical leads, and expand the impact of the team by growing new skill sets and capabilities
Lead a high-performance team of engineers to deliver new capabilities and efficient compute systems for our fleet
Technical management experience in systems architecture, performance, workload analysis and large-scale distributed systems
Work cross-functionally across hardware and software/services teams to drive engineering efforts
Minimum Qualifications:
Experience leading teams working on high-performance computing (HPC) and AI/ML systems, including:
Communication libraries (e.g., NCCL, RCCL, UCC, MPI)
GPU/ASIC-based kernel development and optimization (e.g., CUDA, ROCm)
Distributed systems for large-scale training and serving
Systems Architecture + Performance
Large-scale distributed systems
Experience running a large-scale program and dealing with ambiguity
Preferred Qualifications:
Experience with collective communication, e.g., one of these libraries: NCCL, RCCL, Gloo, UCC, MPI
Network architecture
Public Compensation:
$177,000/year to $251,000/year + bonus + equity + benefits
Industry: Internet
Equal Opportunity:
Meta is proud to be an Equal Employment Opportunity and Affirmative Action employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender, gender identity, gender expression, transgender status, sexual stereotypes, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics. We also consider qualified applicants with criminal histories, consistent with applicable federal, state and local law. Meta participates in the E-Verify program in certain locations, as required by law. Please note that Meta may leverage artificial intelligence and machine learning technologies in connection with applications for employment.
Meta is committed to providing reasonable accommodations for candidates with disabilities in our recruiting process. If you need any assistance or accommodations due to a disability, please let us know at accommodations-ext@fb.com.