What are the responsibilities and job description for the Technical Operations Center Engineer position at Tential Solutions?
What We Need
We are seeking a highly motivated Technical Operations Center (TOC) Engineer to join our 24/7 Technical Operations Center team. This role is a vital part of our live service operations: you will serve as a primary escalation point for the Junior TOC team and play a critical part in maintaining the high availability, performance, and reliability of our global game infrastructure.
The ideal candidate is a composed professional with deep technical expertise in incident, problem, and service request management. While you possess the interpersonal skills to lead during a crisis, your focus will be on the technical architecture, comprehensive documentation, and advanced automation of our operational workflows. You must be a strong troubleshooter with a significant bias toward automation to ensure our studios and global community enjoy an uninterrupted experience.
What You Will Do
- Technical Escalation: Act as a key escalation point for both expected and unexpected events involving production applications, systems, and cloud infrastructure.
- Infrastructure Maintenance: Maintain our global infrastructure across both on-premises and public cloud platforms.
- Automation Leadership: Propose and implement complex automation strategies to improve overall efficiency and system performance.
- SRE Collaboration: Work closely with Site Reliability Engineering (SRE) teams to onboard and implement innovative technical solutions.
- Performance Standards: Ensure uptime and performance standards are met for a seamless gaming experience.
- Advanced Troubleshooting: Diagnose and resolve high-level technical issues involving production networks and system operations.
- Documentation & Audit: Ensure all operational activities and technical configurations are thoroughly documented and remain compliant with audits.
- Service Fulfillment: Oversee the resolution of technical service requests and user-submitted tickets, ensuring a high level of customer service.
- Experienced: You have 3 years in large-scale production networks or systems operations, with a strong grasp of reliability engineering principles.
- Collaborative Communicator: You are able to communicate complex concepts clearly in English, whether with technical staff or senior management.
- Adaptable and Agile: You can support a globally distributed team, quickly adapt to new tools, and embrace changes in a dynamic environment.
- Continuous Learner: You are deeply interested in learning and implementing new technologies and architectural concepts.
- Customer Centric: You always put the needs of the customer first and think about problems and requests through the lens of the end user.
- Public Cloud Providers: Expert-level knowledge of AWS (GCP and Azure a plus).
- Operating Systems: Advanced administration of Linux and Windows in production environments.
- Virtualized Environments: Deep experience with VMware required. Other virtualization platforms (Proxmox, KVM, Hyper-V, and WSL) are a plus.
- Infrastructure as Code (IaC): Deep familiarity with Terraform, Ansible, Puppet, and Pulumi.
- Networking: Expertise in protocols, firewall permissions, and advanced network triage.
- Passion for Full Stack: You are passionate about learning the full stack including pursuing formal training and certifications.