What are the responsibilities and job description for the Agentic Infrastructure Engineer Intern position at XPENG?
XPENG is a leading smart technology company at the forefront of innovation, integrating advanced AI and autonomous driving technologies into its vehicles, including electric vehicles (EVs), electric vertical take-off and landing (eVTOL) aircraft, and robotics. With a strong focus on intelligent mobility, XPENG is dedicated to reshaping the future of transportation through cutting-edge R&D in AI, machine learning, and smart connectivity.
XPENG is building the next generation of enterprise AI infrastructure — autonomous, application-driven systems that power real-time decision making across autonomous driving and AI platforms. As part of our AI Enablement team, you will work on internal platforms at the frontier of LLM orchestration, multi-agent coordination, and automated workflow systems.
This role is ideal for candidates passionate about developer productivity, AI-native tooling, and scalable infrastructure. You will build systems for CI/CD, observability, evaluation, and workflow automation, while working closely with senior engineers across AI infrastructure and platform teams.
Key Responsibilities
- Design and implement Electron-based desktop applications for prompt workflow visualization, process inspection, and self-service dashboards for non-engineering stakeholders.
- Contribute to JavaScript/TypeScript components that enable LLM orchestration, AI workflow interoperability, pipeline automation, and MCP-compatible connectors.
- Build and extend evaluation frameworks for verifying agent outputs and system reliability, including LLM-as-judge metrics, structured validation, and automated feedback loops.
- Instrument operational observability tooling (e.g., LangFuse, OpenTelemetry, custom metrics) and develop automated dashboards to surface runtime insights and model performance trends.
- Participate in the full engineering lifecycle including design reviews, implementation, testing, CI/CD integration, and Git-based collaboration workflows.
- Collaborate closely with platform engineers and researchers on benchmark selection, failure analysis, prompt optimization, and quantitative evaluation methodologies.
Minimum Qualifications
- Currently enrolled in a Bachelor’s, Master’s, or Ph.D. program in Computer Science, Software Engineering, Electrical Engineering, or a related technical field.
- Strong proficiency in JavaScript/TypeScript and modern frontend/backend development practices.
- Experience with the Electron framework, including desktop application development, inter-process communication (IPC), and main/renderer process architecture.
- Familiarity with AI agent systems, LLM tooling frameworks, orchestration pipelines, or evaluation workflows.
- Understanding of CI/CD fundamentals, automated testing pipelines, artifact publishing, and developer productivity tooling.
- Experience designing evaluation or verification systems for AI outputs, structured data validation, or workflow automation.
- Strong communication and documentation skills with the ability to clearly present technical ideas, PRs, reports, and experimental findings.
Preferred Qualifications
- Experience building internal AI developer tools, observability platforms, or workflow orchestration systems.
- Familiarity with modern AI infrastructure frameworks such as LangFuse, OpenTelemetry, MCP, or related tooling ecosystems.
- Experience with prompt engineering, automated evaluation pipelines, or agent reliability optimization.
- Knowledge of modern frontend application architecture and performance optimization for Electron-based systems.
- Experience working in fast-paced engineering environments with rapid iteration cycles and cross-functional collaboration.
- Contributions to open-source projects or prior experience developing scalable AI infrastructure platforms.
What We're Looking For
- Self-motivated and proactive in solving problems.
- Comfortable operating in fast-moving, ambiguous environments.
- Strong in debugging, systems thinking, and iterative problem solving.
- Able to quickly learn unfamiliar systems and contribute meaningful improvements.
- Clear communicators who collaborate effectively and provide thoughtful feedback.
- Passionate about building reliable AI-native tools and infrastructure.
What We Offer
- A fun, supportive and engaging environment.
- Infrastructure and computational resources to support your work.
- Opportunity to work on cutting-edge technologies with top talent in the field.
- Opportunity to make a significant impact on the transportation revolution by advancing autonomous driving.
- Competitive compensation package.
- Snacks, lunches, dinners, and fun activities.