Research Scientist - Large-Scale Machine Learning Systems (SysML) - Global Tech Research Program - 2027 Start (PhD)
ByteDance
Software Engineering, Data Science
Responsibilities
Team Introduction: The Applied Machine Learning (AML) team is committed to the research and deployment of the next generation of core machine learning technologies. This covers large pre-trained models and device-cloud collaborative learning, as well as broad applications in search, recommendation, advertising, auditing, federated learning, and more. The team has a strong foundation in scientific research, engineering, and product implementation. Our team members have rich backgrounds covering natural language processing (NLP), computer vision (CV), multimodality, graph computing, search and recommendation, federated learning, and other fields, and have published more than 100 top-tier conference papers.
We are looking for talented individuals to join our team in 2027. As a graduate, you will get opportunities to pursue bold ideas, tackle complex challenges, and unlock limitless growth. Launch your career where inspiration is infinite at our Company. Successful candidates must be able to commit to an onboarding date by the end of 2027. Please state your availability and graduation date clearly in your resume.
Topic Content: Large-scale recommendation systems are being increasingly adopted across products such as short-video, text-based community, and image platforms, with modality-specific information playing an ever-growing role in recommendations. In ByteDance's practice, we have found that modality information serves effectively as a generalizable feature to support recommendation and other business scenarios. Research on end-to-end ultra-large-scale multimodal recommendation systems holds significant potential. Building on an algorithm-engineering co-design approach, we aim to further explore directions including multimodal co-training, models with hundreds of billions of parameters, and end-to-end modeling with extended sequence lengths.
On the engineering side, research directions include: multimodal sample representation; high-performance multimodal inference engines built on the PyTorch framework; high-performance multimodal training framework development; and the application of heterogeneous hardware in multimodal recommendation systems.
On the algorithm side, research directions include: designing effective multimodal co-training architectures for recommendation and ads, sparse MoE, memory networks, and mixed precision.
Topic Challenges:
1. High difficulty in unifying multimodal representations and achieving efficient fusion.
2. Extremely high computational costs for training and inference of ultra-large-parameter models.
3. Difficulty balancing stability and efficiency in end-to-end modeling with long sequences.
4. Significant complexity in algorithm-engineering co-design and heterogeneous hardware adaptation.
Topic Value:
1. Technical value: Achieve breakthroughs in multimodal representation fusion and in the training/inference bottlenecks of ultra-large-scale models; refine the co-design framework for algorithms and engineering; advance heterogeneous hardware adaptation and the development and deployment of domestically developed high-performance frameworks.
2. Business value: Enhance recommendation accuracy and generalization capability in multimodal scenarios; overcome the modality limitations of existing recommendation systems; empower multiple products, including short-video and text-based community platforms; reduce computational costs; and drive scalable business growth.
Qualifications
Minimum Qualifications
1. Currently completing, or have recently completed, a PhD in Artificial Intelligence, Computer Science, Computer Engineering, or a related technical discipline;
2. Proficiency in one or more programming languages, such as C/C++, Go, Python, or Java, in a Linux environment;
3. Deep understanding of distributed system principles, with experience in designing, developing, and maintaining large-scale distributed systems.
Preferred Qualifications
1. Familiarity with Kubernetes architecture and extensive experience in cloud-native system development;
2. Experience with at least one mainstream machine learning framework (e.g., TensorFlow, PyTorch, MXNet);
3. Familiarity with Django, Flask, or related technologies, with backend development experience;
4. Experience in one or more of the following areas: AI Infrastructure, HW/SW Co-Design, High-Performance Computing, ML Hardware Architecture (GPU, accelerators, networking), Machine Learning Frameworks, ML for Systems, Distributed Storage.
About Us
Founded in 2012, ByteDance's mission is to inspire creativity and enrich life. With a suite of more than a dozen products, including TikTok, Lemon8, CapCut and Pico as well as platforms specific to the China market, including Toutiao, Douyin, and Xigua, ByteDance has made it easier and more fun for people to connect with, consume, and create content.
Why Join ByteDance
Inspiring creativity is at the core of ByteDance's mission. Our innovative products are built to help people authentically express themselves, discover and connect – and our global, diverse teams make that possible. Together, we create value for our communities, inspire creativity and enrich life - a mission we work towards every day.
As ByteDancers, we strive to do great things with great people. We lead with curiosity, humility, and a desire to make impact in a rapidly growing tech company. By constantly iterating and fostering an "Always Day 1" mindset, we achieve meaningful breakthroughs for ourselves, our Company, and our users. When we create and grow together, the possibilities are limitless. Join us.
Diversity & Inclusion
ByteDance is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe and so does our workplace. At ByteDance, our mission is to inspire creativity and enrich life. To achieve that goal, we are committed to celebrating our diverse voices and to creating an environment that reflects the many communities we reach. We are passionate about this and hope you are too.