Senior Site Reliability Engineer

Tyk

Software Engineering
Vancouver, BC, Canada
Posted 6+ months ago

Who are Tyk, and what do we do?
The Tyk API Management platform is helping to drive the connected world and power new products and services. We’re changing the way that organisations connect any number of their systems and services. Whether those systems are internal, external, public or highly encrypted, Tyk helps businesses drive value across the retail, finance, telecoms, healthcare and media industries (to name just a few).

If you’ve banked online, used an app to check the news, or perhaps even driven a connected car, APIs, and by extension Tyk, made that possible. Founded in 2015, with offices in London (UK), London (Ontario), Atlanta and Singapore, we have many thousands of users of our B2B platform across the globe. Brands using Tyk range from Lotte, Bell and T-Mobile to RBS, Capital One and Vinci. Our varied user base hails from every continent, even Antarctica.

Our Mission

Tyk is on a mission to connect every system in the world. We’ve started by building an API Management platform.

Total flexibility, default remote, radical responsibility

We offer unlimited paid holidays and remote working from anywhere in the world, for everyone. Why? Tyk was founded on the principle of offering flexibility and autonomy to our employees; we believe this allows them to achieve their best results. It also means we can build the best possible team, because location and working hours are no barrier.

If this sounds like an environment that could work for you, read on to find out more.

At Tyk, we’re obsessed with building software that solves problems. We count on our Site Reliability Engineers (SREs) to give users the rich feature set, high availability, and stellar performance they need to pursue their missions.

Our customer base is growing, so we’re seeking an experienced Senior SRE to optimise, automate, and improve our performance, using insights from massive-scale data in real time. We want an original thinker, a challenger, a technical legend, and an opinionated collaborator who wants to make things better.

Here’s what you’ll be getting up to:

  • Collaborate with the Principal SRE to shape and implement the SRE strategic plan.
  • Lead the SRE team in translating strategy into actionable plans, coordinating these through the Scrum process.
  • Address wellbeing and performance concerns, fostering a positive and productive team environment.
  • Work with the Principal SRE and Scrum Master to analyse wellbeing survey outcomes and develop improvement plans.
  • Champion operational communication, ensuring high-quality and timely updates on team progress.
  • Ensure SLA compliance for our cloud environment through proactive monitoring.
  • Develop and oversee the roadmap for proactive alerting and monitoring.
  • Define and track key performance metrics for cloud services, driving continuous improvement.
  • Design and implement solutions to maintain and enhance KPIs.
  • Lead performance tuning and fault finding by analysing metrics from operating systems and applications.
  • Optimise system and infrastructure performance, focusing on innovation and anticipating customer needs.
  • Engage with commercial teams to understand growth plans and develop corresponding SRE strategies.
  • Direct the analysis of cloud infrastructure, focusing on automation, scalability, and management.
  • Align with the Principal SRE on automation strategies for cloud-operations tasks.
  • Model excellence in software design and automation to enhance Tyk Cloud services, creating runbooks and sharing knowledge.
  • Conduct blame-free root cause analysis postmortems, reporting findings and recommendations.
  • Document operational processes and policies, ensuring replicability and adherence.
  • Provide on-call support, ensuring effective response and resolution in line with SLAs.
  • Plan and execute software upgrades to optimise cloud services.
  • Assist commercial teams with data requests and account management.
  • Champion and adhere to Scrum methodologies within the SRE team.