Software Engineer II – AI Infrastructure
Company: Staffed4U
Location: Annapolis Junction, MD
Work Schedule: Full-Time, Onsite
Clearance Required: Active TS/SCI with Full Scope Polygraph (FSP)
Salary Range: $193,000 - $306,000
Overview
Join us in building the next generation of AI infrastructure that will power innovation across critical mission environments.
We are seeking an experienced Software Engineer II to support an advanced AI Infrastructure Team responsible for developing and maintaining the platform that serves as the foundation for enterprise AI capabilities. This role focuses on AI inference services while supporting a broader ecosystem of AI-enabled applications, including Retrieval-Augmented Generation (RAG), autonomous agents, and emerging machine learning technologies.
The ideal candidate is a highly skilled engineer who can independently design, build, deploy, and operate scalable infrastructure solutions while helping shape the future of AI adoption across mission-critical environments.
Key Responsibilities
AI Infrastructure & Platform Engineering
- Design, implement, and optimize infrastructure supporting AI model inference at scale.
- Develop, deploy, and maintain production AI services and applications.
- Support emerging AI technologies, including:
  - Retrieval-Augmented Generation (RAG)
  - Agentic AI Systems
  - Large Language Model (LLM) Platforms
  - AI Inference Services
- Build highly available, reliable, and scalable AI platform components.
- Navigate ambiguous requirements and define practical, scalable technical solutions.
Cloud & Systems Engineering
- Design and manage cloud-native infrastructure within AWS environments.
- Automate infrastructure provisioning and configuration using Infrastructure-as-Code (IaC) principles.
- Support Kubernetes deployments and administration across production environments.
- Integrate systems across diverse platforms and technologies.
- Optimize high-volume web applications and distributed systems for performance and reliability.
Observability & Operations
- Implement monitoring, logging, and observability solutions across AI services and infrastructure.
- Develop operational dashboards and alerting capabilities using:
  - Grafana
  - Prometheus
  - OpenTelemetry
  - Application Performance Monitoring (APM) tools
- Support incident response, troubleshooting, and root cause analysis efforts.
DevOps & Automation
- Develop and maintain CI/CD pipelines.
- Improve deployment automation and operational efficiency.
- Promote DevOps best practices across engineering teams.
- Drive adoption of modern engineering tools and methodologies.
Security & Collaboration
- Contribute to secure AI system design and implementation.
- Support compliance with organizational security requirements.
- Provide technical guidance and informal mentorship to junior engineers.
- Collaborate with software engineers, data scientists, platform engineers, and mission stakeholders.
Required Qualifications
Education
- Bachelor's degree in Computer Science, Software Engineering, Computer Engineering, Information Systems, or a related technical discipline.
Substitution:
- Four (4) additional years of directly related experience may be substituted for a bachelor's degree.
Experience
- Eight (8) or more years of software engineering experience.
- Proven experience building and supporting production systems at scale.
- Experience designing and supporting high-volume web applications.
- Experience integrating complex systems across multiple technologies and platforms.
- Experience supporting cloud-native infrastructure in AWS.
- Experience administering and deploying applications within Kubernetes environments.
Technical Skills
- Python (strong development skills)
- AWS Cloud Engineering
- Kubernetes
- Infrastructure as Code (IaC)
- CI/CD Pipelines
- DevOps Methodologies
- Monitoring and Observability Platforms
- Distributed Systems Architecture
- Performance Optimization
- Systems Integration
Observability Technologies
Experience with one or more of the following:
- OpenTelemetry
- Grafana
- Prometheus
- Application Performance Monitoring (APM) Solutions
Professional Skills
- Strong problem-solving and analytical abilities.
- Ability to thrive in ambiguous and rapidly evolving environments.
- Strong organizational influence and change management skills.
- Excellent written and verbal communication skills.
- Ability to work independently and collaboratively within highly technical teams.
Desired Qualifications
Candidates with one or more of the following qualifications are highly desired:
- Experience with AI inference serving technologies such as:
  - vLLM
  - LiteLLM
  - Similar inference platforms
- Experience with agentic AI frameworks such as:
  - LangChain
  - LangGraph
  - Similar orchestration frameworks
- Experience with:
  - Vector databases
  - Embedding systems
  - Semantic search technologies
- Knowledge of:
  - High-Performance Computing (HPC)
  - Distributed Computing Systems
- Experience supporting production AI/ML environments.
Compensation
Salary Range: $193,000 - $306,000
Compensation is based on experience, education, technical expertise, and overall alignment with program requirements.
Benefits
Medical Coverage
- Choice of three comprehensive medical plans through Aetna
- Company pays 80% of monthly premiums for employees
Health Savings Account (HSA)
- Pre-tax contributions for qualified medical expenses
- Company contributes 50% of the annual deductible (prorated based on start date)
Dental Coverage
- Aetna Passive PPO Max Plan
- Company pays 80% of monthly premiums
Vision Coverage
- Aetna Vision Preferred Premier 24M Plan
- Company pays 80% of monthly premiums
Life Insurance
- 100% Company-Paid Life Insurance
- Accidental Death & Dismemberment (AD&D) Coverage
Short-Term Disability
- 100% Company-Paid
- Pays 60% of earnings up to $1,500 per week for up to 12 weeks
Retirement Plan
- Automatic 6% employer contribution to 401(k)
- Fully vested from day one
- Employee contributions encouraged but not required
Paid Time Off & Holidays
- 5–6 weeks of PTO depending on tenure
- 11 paid holidays annually
Professional Development
- $5,000 annual tuition reimbursement
- Paid training, certifications, and industry conferences
- Ongoing support for technical growth and career advancement
Why Join Us?
This is an opportunity to help shape the future of AI infrastructure while supporting critical mission objectives. You'll work alongside top-tier engineers building scalable AI platforms, deploying cutting-edge technologies, and solving some of the most challenging problems in modern software engineering.