Soho Square Solutions is hiring a Systems Reliability Engineer - Ticker Plant and Feeds near New York, NY
On our team, you will work on our most critical services. You will ensure Bloomberg's financial product, the Terminal, is fast, highly available, scalable and able to withstand unprecedented increases in load. You will be at the heart of solving production problems with a scope from the kernel to the application. Flexibility and creativity are essential for taking a holistic approach to troubleshooting. A strong attention to detail is also needed as you will dig deeper into certain issues when required.
We are based in multiple locations and work with the various application development teams, but report directly to the SRE organization for oversight, strategic direction, training and career development. You will build automation tools for system health, as well as production acceptance tests that validate production changes, to ensure the system is well instrumented and highly fault tolerant.
We'll trust you to:
Ensure optimal availability, latency, scalability and efficiency of Bloomberg application development
Respond to and resolve unexpected and potential service problems
Drive capacity planning, performance analysis, instrumentation and other non-functional systems requirements
Review and influence ongoing design, architecture, standards and methodology for improving operating services
Write production software acceptance tests and own systems releases including coordination of coverage and communication plans
You'll need to have:
A Bachelor's degree in Computer Science or equivalent experience
Experience developing customer-facing, high availability, large-scale distributed applications
In-depth knowledge of Linux/UNIX
Exposure to C/C++ or Java technologies
An understanding of a variety of scripting languages
We'd love to see:
Extensive experience working with fault-tolerant approaches in a large-scale distributed environment and high performance systems
Familiarity with complex systems environments
A solid understanding of Internet and networking protocols
A passion for performance excellence and robustness
The ability to analyze and troubleshoot large-scale distributed systems
The ability to handle periodic on-call duty as well as out-of-band requests