What are the responsibilities and job description for the Staff Machine Learning Engineer - AI Foundation position at XPENG?
XPENG is a leading smart technology company at the forefront of innovation, integrating advanced AI and autonomous driving technologies into its vehicles, including electric vehicles (EVs), electric vertical take-off and landing (eVTOL) aircraft, and robotics. With a strong focus on intelligent mobility, XPENG is dedicated to reshaping the future of transportation through cutting-edge R&D in AI, machine learning, and smart connectivity.
We are looking for a full-time Machine Learning Engineer - AI Foundation with deep knowledge of, and strong enthusiasm for, building state-of-the-art ML infrastructure for training very large foundation models and accelerating model training and inference.
Our mission is to solve the autonomous driving problem. You will work with a team of talented software engineers, machine learning engineers, and research scientists to push the boundaries of state-of-the-art machine learning models that will enable the next-generation end-to-end (E2E) solution for autonomous driving.
Job Responsibilities
- Optimize transformer-based LLMs for low-latency and high-throughput inference.
- Optimize kernels and model graphs using tools like CUDA, Triton, and custom fused operators.
- Implement and benchmark model optimization techniques (quantization, knowledge distillation, structured and unstructured pruning, KV-cache optimization, etc.).
- Deploy optimized models across GPUs, CPUs, and edge accelerators.
- Contribute to internal tooling and documentation for model optimization flows.
Qualifications
- Master's degree in CS/CE/EE, or equivalent, with 5-8 years of industry experience.
- Good knowledge of PyTorch.
- Knowledge of transformer architecture and ways to accelerate the training and inference of transformer models.
- Previous experience in the autonomous driving industry.
- Knowledge of TorchScript and NVIDIA TensorRT.
- Strong programming skills in Python and C.
- Familiarity with GPU, CPU, NPU, and DSP architectures.
- Deep understanding of memory bandwidth, compute bottlenecks, and hardware-aware model optimization.
- Ability to solve complex problems efficiently and collaboratively on larger teams.
What We Provide
- A fun, supportive and engaging environment.
- Infrastructure and computational resources to support your work.
- Opportunity to work on cutting-edge technologies with top talent in the field.
- Opportunity to make a significant impact on the transportation revolution by advancing autonomous driving.
- Competitive compensation package.
- Snacks, lunches, dinners, and fun activities.
We are an Equal Opportunity Employer. It is our policy to provide equal employment opportunities to all qualified persons without regard to race, age, color, sex, sexual orientation, religion, national origin, disability, veteran status or marital status or any other prescribed category set forth in federal or state regulations.