Staff Software Engineer - ML Training Platform
Company
Location
USA
Type
Full Time
Job Description
Location: This role is completely remote-friendly. If you happen to live close to one of our physical office locations, our doors are open for you to come into the office as often as you'd like.
Who We Are: The Machine Learning Platform team at Reddit is a high-impact team that owns the infrastructure that powers recommendations, content discovery, and user and content quantification, while directly impacting other teams such as Growth, Ads, Feeds, and Core Machine Learning.
What You’ll Do: As a Staff Software Engineer, Training Platform, you will work on our wider Machine Learning Platform team and be instrumental in architecting, implementing, and maintaining foundational ML infrastructure that powers Feeds Ranking, Content Understanding, Recommendations, and much more to fulfill Reddit’s mission of bringing community and belonging to everyone in the world. You will build systems and tools that enable machine learning engineers (MLEs) and data scientists (DSs) and continuously improve the ML software development lifecycle. You will deliver a self-service ML platform that enables the continuous iteration and improvement of systems that use ML techniques, including Deep Learning, Natural Language Processing, Recommendation Systems, Representation Learning, and Computer Vision.
- Lead the building, testing, and maintenance of ML infrastructure at Reddit
- Propose, design, and implement high-performance ML platform solutions that significantly advance the deployment of models serving millions of redditors and provide a seamless experience for MLEs
- Play a pivotal role in designing, building, and optimizing the infrastructure and tooling required to support large-scale machine learning workflows
- Design and implement solutions that significantly advance the architecture of the ML Platform
- Analyze bottlenecks in distributed systems and optimize for performance and cost-efficiency
- Work with management on team goal setting and planning, and de-risk project execution
- Mentor other team members in adopting a rigorous DevOps approach to maintain and improve the health and quality of ML platform components and services
Who You Might Be:
- 8+ years of work experience in a production software development environment or building data systems, plus a degree in ML Engineering, Computer Science, or other relevant discipline
- Experience with design and architecture of large-scale ML systems
- Experience with ML frameworks such as TensorFlow, PyTorch, or JAX
- Experience with training workflows, hyperparameter tuning, and resource optimization on CPU and GPU
- Experience with MLOps practices and tools such as Ray and MLflow
- Hands-on experience with Kubernetes, Docker, or other container orchestration systems
- Experience building production-quality code that incorporates testing, evaluation, and monitoring, using object-oriented programming in Python and/or Go
- Comfortable with distributed systems, big data (petabyte scale), and data-intensive systems
Benefits:
- Comprehensive Healthcare Benefits
- 401k Matching
- Workspace benefits for your home office
- Personal & Professional development funds
- Family Planning Support
- Flexible Vacation (please use them!) & Reddit Global Wellness Days
- 4+ months paid Parental Leave
- Paid Volunteer time off
Date Posted
01/29/2025