What are the responsibilities and job description for the Sr Advanced AI Platform Engineer position at Honeywell?
We are seeking a Full Stack AI Platform Engineer to join our Data Engineering, AI & ML Platform team. This role is central to designing, building, and scaling the enterprise AI/ML platform that powers intelligent automation across a global portfolio.
As a Full Stack AI Platform Engineer at Honeywell, you will design, build, and scale AI systems end to end: from high-throughput IoT streaming pipelines and knowledge graph infrastructure, through LLM orchestration and RAG services, to the React-based interfaces that surface autonomous insights to plant engineers, facility managers, and OT security analysts.
You will work at the intersection of data engineering, machine learning operations, and edge AI — building production-grade infrastructure that processes billions of IoT events from building management systems, deploys models to edge devices, and enables AI-driven applications including predictive diagnostics, energy monitoring, and RAG-based knowledge systems.
This is a high-impact individual contributor role for someone who thrives in ambiguity, ships production systems, and can operate across the full stack from cloud-native platforms to edge GPU hardware. You will report to our Sr Data Engineering Manager and work from our Atlanta, GA location on a hybrid basis.
- Note: For the first 90 days, new hires must be prepared to work onsite full time, Monday through Friday.
KEY RESPONSIBILITIES
AI/ML Platform Engineering
- Develop high-performance, production-ready Python APIs using FastAPI to serve as the primary interface for on-device model inference.
- Design, build, and maintain enterprise AI/ML platform services on multi-cloud infrastructure, including model deployment, serving, and experiment tracking.
- Build robust CI/CD pipelines to automate the testing of inference logic and the deployment of API services to edge devices.
- Implement ML orchestration workflows using LangGraph, MLflow, and custom orchestration layers for multi-agent AI systems.
- Develop and integrate AI workloads using MLOps and tracing tools such as LangSmith.
- Design and implement automated data processing pipelines within FastAPI to handle real-time sensor or image inputs for the model.
- Bridge the gap between research and deployment by converting experimental code into modular, maintainable Python packages.
Edge AI & Inference
- Ability to integrate and run pre-built AI models on local hardware using standard industry runtimes.
- Skilled at building the software logic required to process data inputs and handle model outputs efficiently.
- Expert at developing Python-based services and automating their deployment to devices via standardized pipelines.
- Capable of monitoring and optimizing software to run reliably within strict memory and hardware limitations.
- Experience deploying containerized models from Azure to edge devices using Azure IoT Edge or managed online endpoints.
Data & Knowledge Engineering
- Experience building pipelines to structure, clean, and store data for model training or real-time retrieval (RAG) on edge devices.
- Ability to convert experimental data processing logic from notebooks into production-ready Python modules.
- Design automated workflows to collect, label, and manage datasets, ensuring high-quality data is available for continuous model improvement.
Production Operations & Reliability
- Own platform reliability for AI services serving multiple business units.
- Implement observability, monitoring, and alerting for ML pipelines and inference services.
- Drive cost optimization across data platform workloads, cloud compute, and storage infrastructure.
- Use Azure Machine Learning Studio to manage the full model lifecycle, including registration, versioning, and monitoring.