What are the responsibilities and job description for the Critical Facilities Engineer position at Nebius?
Why work at Nebius
Nebius is leading a new era in cloud computing to serve the global AI economy. We create the tools and resources our customers need to solve real-world challenges and transform industries, without massive infrastructure costs or the need to build large in-house AI/ML teams. Our employees work at the cutting edge of AI cloud infrastructure alongside some of the most experienced and innovative leaders and engineers in the field.
Where we work
Headquartered in Amsterdam and listed on Nasdaq, Nebius has a global footprint with R&D hubs across Europe, North America, and Israel. The team of over 800 employees includes more than 400 highly skilled engineers with deep expertise across hardware and software engineering, as well as an in-house AI R&D team.
New data center development
We give you the opportunity to work with cutting-edge technologies in data operations, cloud computing and infrastructure management. As global data center operations grow, there will be ample opportunities for career progression. Your work in the data center directly impacts performance, customer satisfaction and efficiency, and you will have the opportunity to contribute to new data center projects. You’ll collaborate with experts in AI data center development and operations, gaining insights from leaders in the field. This environment fosters innovation and allows you to work on solutions that exceed industry standards in design and deployment.
The Role
Nebius is building next-generation AI infrastructure at scale. We are seeking a Data Center Critical Facilities Engineer to provide technical oversight, reporting, and operational governance across our data center environments.
This role sits within the critical facilities domain and is focused on the systems that power and cool our infrastructure, including electrical distribution, UPS systems, and cooling (CRAH/CRAC, chilled water, HVAC).
You will act as a central point of oversight and accountability, ensuring operational activities are executed to standard, properly documented, and aligned with reliability and audit requirements.
Your Responsibilities
Critical Infrastructure Oversight
- Provide technical oversight of power and cooling systems, including UPS, PDUs, generators, and HVAC infrastructure.
- Review and approve change requests impacting critical systems.
- Ensure all maintenance, incidents, and changes are executed in alignment with operational standards.
Audit, Reporting & Documentation
- Own and maintain operational documentation, including SOPs, runbooks, and system records.
- Produce after-action reports, incident summaries, and audit-ready documentation.
- Develop reports, dashboards, and visualizations to provide visibility into data center operations and performance.
- Ensure documentation across systems (e.g., SharePoint or internal tools) is accurate, current, and standardized.
Operational Coordination
- Act as a liaison between facilities teams, vendors, and internal stakeholders.
- Track and coordinate maintenance activities, schedules, and deliverables.
- Ensure vendors and internal teams are aligned, accountable, and meeting expectations.
Process Improvement & Governance
- Identify gaps in operational processes and drive improvements in documentation, reporting, and execution standards.
- Establish best practices for audit readiness and operational consistency.
- Support leadership with technical presentations and summaries of infrastructure performance and risks.
We Expect You To Have
- 5–10 years of experience in data center critical facilities or mission-critical environments.
- Strong understanding of electrical and mechanical infrastructure, including:
  - UPS systems, PDUs, generators.
  - Cooling systems (CRAH/CRAC, chilled water, HVAC).
- Experience working in live data center environments with high availability requirements.
- Proven experience with documentation, reporting, and operational processes (SOPs, audits, runbooks, etc.).
- Ability to review and manage change requests, maintenance activities, and incident follow-ups.
- Strong organizational and communication skills, with the ability to translate technical operations into clear reports and presentations.
Nice to Have
- Experience creating dashboards, reports, or visualizations for operational data.
- Familiarity with tools such as SharePoint, Excel, ticketing systems, or CMMS platforms.
- Experience supporting audits, compliance, or operational governance frameworks.
- Exposure to data center environments supporting high-density or GPU-based workloads.