What are the responsibilities and job description for the Grafana Observability Architect position at Tekaccel, Inc?
Hi,
I hope you are doing well.
Please share your updated resume if you are available for a new role.
Role: Grafana Observability Architect
Locations: Dallas, TX | Jersey City, NJ | Tampa, FL (Onsite)
Role Summary
We are seeking an experienced Grafana Observability Architect to lead the design, deployment, and optimization of enterprise-grade observability platforms. This role requires deep expertise in Grafana and modern observability tools, along with strong cloud and Kubernetes experience. The architect will play a key role in enabling end-to-end visibility across distributed systems while collaborating with cross-functional teams and stakeholders.
Key Responsibilities
- Design, deploy, configure, and manage Grafana-based observability platforms at enterprise scale
- Architect and maintain comprehensive monitoring, logging, and tracing solutions using:
- Prometheus (PromQL)
- Loki
- Tempo
- SQL-based data sources
- Define and implement observability standards, best practices, and architectures across environments
- Integrate Grafana with cloud-native services and Kubernetes clusters
- Develop and optimize dashboards, alerts, and metrics to support:
- Infrastructure monitoring
- Application performance monitoring (APM)
- Distributed tracing and troubleshooting
- Use scripting and automation (e.g., Python) to automate observability workflows, provisioning, and maintenance tasks
- Collaborate with application, platform, SRE, and DevOps teams to ensure end-to-end observability coverage
- Troubleshoot complex performance and reliability issues using observability insights
- Document architecture designs, monitoring strategies, runbooks, and best practices
- Present technical solutions and observability strategies clearly to engineering leadership and stakeholders
Essential Skills & Technical Expertise
- Grafana (Expert Level)
- 8 years of experience designing, deploying, and managing Grafana environments
- Observability Stack Expertise
- Prometheus (PromQL)
- Loki
- Tempo
- Metrics, logs, and traces correlation
- Cloud & Infrastructure
- Strong experience with public cloud platforms:
- Amazon Web Services (AWS) required
- Microsoft Azure or Google Cloud Platform (GCP) preferred
- Kubernetes monitoring and observability
- Scripting & Automation
- Proficiency in Python or similar scripting languages
- Automating dashboards, alerts, and workflows
- Databases & Querying
- Strong SQL knowledge for observability queries and reporting
- Communication & Documentation
- Ability to explain complex technical concepts to non-technical stakeholders
- Strong documentation and presentation skills
Required Experience
- 10 years of overall IT experience
- 8 years in observability, monitoring, or platform engineering roles
- Hands-on experience designing enterprise-scale Grafana observability architectures
- Proven experience supporting large, distributed, cloud-native environments
Thanks & Regards
Ajay Pratap Singh
Tekaccel, Inc. 2601 Little Elm Pkwy, Suite # 1804, Little Elm, TX 75068