Senior Engineer, DevOps (Senior AI Platform Engineer)
W2 Contract
Pay Rate: $62.11 - $72.11 per hour
Location: Carrollton, TX - Onsite Role
Duties and Responsibilities:
- Design, implement, and manage scalable and resilient infrastructure on AWS.
- Architect and maintain Windows/Linux-based environments, ensuring seamless integration with cloud platforms.
- Develop and maintain infrastructure-as-code (IaC) using both AWS CloudFormation/CDK and Terraform/OpenTofu.
- Develop and maintain Configuration Management for Windows & Linux servers using Chef.
- Design, build, and optimize CI/CD pipelines using GitLab CI/CD for .NET applications.
- Integrate and support AI services, including orchestration with AWS Bedrock, Google Agentspace, and other generative AI frameworks, ensuring they can be securely and efficiently consumed by platform services.
- Enable AI/ML workflows by building and optimizing infrastructure pipelines that support large-scale model training, inference, and deployment across AWS and GCP environments.
- Automate model lifecycle management (training, deployment, monitoring) through CI/CD pipelines, ensuring reproducibility and seamless integration with development workflows.
- Collaborate with AI engineering teams to deliver scalable environments, standardized APIs, and infrastructure that accelerate AI adoption at the platform level.
- Implement observability, security, data privacy, and cost-optimization strategies specifically for AI workloads, including monitoring and resource scaling for inference services.
- Implement and enforce security best practices across the infrastructure and deployment processes.
- Collaborate closely with development teams to understand their needs and provide Platform expertise.
- Troubleshoot and resolve infrastructure and application deployment issues.
- Implement and manage monitoring and logging solutions to ensure system visibility and proactive issue detection.
- Contribute to the development and clear, concise documentation of Platform Engineering standards and best practices.
- Stay up-to-date with the latest industry trends and technologies in cloud computing, Platform Engineering, and security.
- Provide mentorship and guidance to junior team members.
Requirements and Qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience).
- 5 years of experience in a Platform Engineering, DevOps, or Site Reliability Engineering (SRE) role.
- 1 year of experience with AI services and LLMs.
- Extensive hands-on experience with Amazon Web Services (AWS).
- Solid understanding of Windows/Linux Server administration and integration with cloud environments.
- Proven experience with infrastructure-as-code tools, specifically AWS CDK and Terraform.
- Strong experience designing and implementing CI/CD pipelines using GitLab CI/CD.
- Experience deploying and managing .NET applications in cloud environments.
- Deep understanding of security best practices and their implementation in cloud infrastructure and CI/CD pipelines.
- Solid understanding of networking principles (TCP/IP, DNS, load balancing, firewalls) in cloud environments.
- Experience with monitoring and logging tools (e.g., New Relic, CloudWatch).
- Strong scripting skills (e.g., PowerShell, Python, Ruby, Bash).
- Excellent problem-solving and troubleshooting skills.
- Strong communication and collaboration skills.
- Experience with containerization technologies (e.g., Docker, Kubernetes) is a plus.
- Relevant AWS and/or GCP certifications are a plus.
- Experience with the configuration management tool Chef.
Preferred Qualifications:
- Strong understanding of PowerShell and Python scripting.
- Strong background with AWS EC2 features and services (Auto Scaling and Warm Pools).
- Understanding of the Windows server build process using tools like Chocolatey for packages and Packer for AMI/image generation.
Desired Skills and Experience:
AWS, GCP, Cloud Infrastructure, DevOps, Platform Engineering, Site Reliability Engineering (SRE), Infrastructure as Code, Terraform, OpenTofu, AWS CloudFormation, AWS CDK, CI/CD, GitLab CI/CD, .NET Deployment, Windows Server Administration, Linux Administration, Configuration Management, Chef, AI/ML Infrastructure, Generative AI, LLMs, AWS Bedrock, Google Agentspace, Model Training Pipelines, Model Deployment, Model Monitoring, MLOps, API Development, Cloud Security, Data Privacy, Observability, Monitoring & Logging, CloudWatch, New Relic, Cost Optimization, Networking (TCP/IP, DNS, Load Balancing, Firewalls), Scripting (PowerShell, Python, Ruby, Bash), Automation, Troubleshooting, Containerization, Docker, Kubernetes, EC2, Autoscaling, Warm Pools, Packer, Chocolatey, AMI/Image Creation, Platform Architecture, Scalability, Resilience, Documentation, Mentorship, Collaboration
Bayside Solutions, Inc. is not able to sponsor any candidates at this time. Additionally, candidates for this position must qualify as a W2 candidate.
Bayside Solutions, Inc. may collect your personal information during the position application process. Please reference Bayside Solutions, Inc.'s CCPA Privacy Policy at www.baysidesolutions.com.