About the Role
We are looking for a highly motivated Site Reliability Engineer (SRE).
What You’ll Do at Copperpod
• Raise the observability standard of our platform by implementing robust service monitoring, informative dashboards, and reports that give visibility into application health and service performance on production systems, driving down MTTD (mean time to detect) and MTTR (mean time to resolve) for incidents.
• Hands-on application management of 60+ microservices on AWS cloud / Kubernetes infrastructure, ensuring availability, resiliency, scalability, and performance, including full-stack diagnosis and fault resolution (Kubernetes, Unix debugging).
• Review platform architecture and service flows to improve the resiliency and high availability of services, and iteratively ensure the operational readiness of every service before it is introduced to the production environment.
• Identify areas for process automation and develop scripts to automate regular operational activities. Partner with service, data, and DevOps teams to drive changes that ensure optimal application performance and resiliency.
• Provide rotational on-call support to detect, respond to, triage, and resolve production incidents on our platform.
• Conduct root cause analyses, document them, and present the findings to the relevant stakeholders.
You are a good fit if you have
• 4+ years of experience building automation into daily operational processes using one or more scripting languages (preferably Python or shell scripting).
• 4+ years of experience as an SRE in high-volume and/or critical production environments, including quality control, service validation, and deploying, supporting, and managing applications in the cloud.
• Strong experience configuring and automating operational tasks for AWS managed services, including EC2, S3, Lambda, RDS, ElastiCache, Route 53, DMS, Kafka, and WAF.
• Strong experience with observability tools (New Relic, CloudWatch, Prometheus, Graylog, Grafana), container technologies and orchestration (Kubernetes, Docker), AWS networking (route tables, Route 53, API Gateway, NACLs, NAT), infrastructure as code (Terraform or CloudFormation), and SQL/NoSQL databases.
• An understanding of key SRE concepts such as toil, SLIs, SLOs, error budgets, MTTD, and MTTR.
• Experience applying SRE principles to production systems, debugging production issues across services and levels of the stack, and keeping support playbooks and documentation up to date.
• Experience ensuring production reliability by building insightful dashboards and monitors, tuning thresholds to reduce alert noise, and monitoring service performance with any APM tool.
Benefits and Support
• Competitive compensation package.
• Inclusive culture.
• Robust mentorship program with industry stalwarts.
• Opportunity to work with the brightest minds in the industry and drive valuable
outcomes.
Our Values
At Copperpod Digital, we are committed to creating a workplace that reflects the diversity
of our community and the world around us. We celebrate diversity and welcome people
from a variety of backgrounds, ethnicities, cultures, perspectives, experiences, and skill sets.
A diverse team composed of individuals with different perspectives, lived experiences, and
identities is essential to achieving our mission.
If you're passionate about site reliability engineering, have a drive for building
resilient, observable production systems, and meet the requirements above, we encourage
you to apply for this exciting opportunity.
About the Company
Copperpod Digital is the bridge that connects businesses with their audience in a constantly
evolving digital landscape. Our commitment to innovation and experimentation drives us to
push the boundaries of what's possible, empowering brands to create meaningful
connections with consumers. At Copperpod, we believe that the future of digital lies in
web3 technology, and that's where metamaxx.io comes in. As a division of Copperpod
Digital, metamaxx.io specializes in helping businesses develop cutting-edge digital offerings
that bridge the gap between different realities. Together, we're building the future of digital
experiences.