Key Responsibilities:
- Maintain, scale, and harden on-premise Kubernetes clusters (self-hosted, not managed cloud) for high availability and disaster recovery
- Design, implement, and manage Infrastructure as Code (IaC) for on-premise environments using Ansible and other automation tools
- Build, optimize, and monitor Azure DevOps pipelines for CI/CD of containerized applications and infrastructure changes
- Support and extend Azure-based infrastructure components integrated with on-prem workloads
- Implement and maintain the observability stack: Prometheus, Grafana, Kafka (for log/metric streaming), and database monitoring (PostgreSQL, MongoDB)
- Package and deploy applications using Helm, ensuring standardized, repeatable releases
- Write and maintain Bash and Python scripts for automation, troubleshooting, and tooling
- Collaborate with development, security, and operations teams to ensure platform reliability, compliance, and performance
- Document architecture, processes, and runbooks for knowledge sharing
We'd love to hear from you if you have:
- 5+ years of hands-on experience as a DevOps Engineer in enterprise environments
- Proven expertise in on-premise Kubernetes deployment and management (not AKS/EKS/GKE)
- Strong proficiency with Ansible for configuration management and automation
- Extensive experience with Azure DevOps for CI/CD pipeline design and maintenance
- Solid experience with Helm for Kubernetes application packaging
- Hands-on experience with Prometheus, Grafana, and Kafka for monitoring and event streaming
- Experience managing and monitoring PostgreSQL and MongoDB in production
- Proficiency in Bash and Python scripting for automation and tooling
- English at B2 level or higher, with the ability to communicate clearly with global teams and stakeholders
- Demonstrated ability to collaborate effectively within a team and directly with clients
Nice-to-Have Skills:
- Experience with Argo CD for GitOps-based deployment
- Familiarity with Jaeger or OpenTelemetry for distributed tracing
- Exposure to Istio for service mesh implementation
- Experience with ELK Stack (Elasticsearch, Logstash/Filebeat, Kibana) for centralized logging
- Knowledge of HashiCorp Vault for secrets management and dynamic credentials