DevOps Tech Lead

March 17, 2025
Industries IT: Software
Categories Programming, Development, Project management
Remote
Toronto, ON • Full time

Who we are

At illumin, we are transforming the advertising landscape. Our platform offers an integrated space for journey planning, execution, and reporting. It empowers marketers to connect with their audiences in powerful ways through real-time data and easy-to-use visual tools. By seamlessly combining media planning and buying in an intuitive interface, marketers can take complete control of their campaigns, meeting customers wherever they are in the buying journey and maximizing the impact of their ad spend through personalized insights for smarter decision-making.

We are at a pivotal moment, evolving into a product-led company with a team of over 100 skilled professionals and new leadership guiding our path forward. By harnessing the power of data, advancing our AI capabilities, and deeply investing in our people, we are preparing for a future that will redefine what’s possible in journey advertising.

Our work is guided by two beliefs: that the ability to execute is paramount to success and that we are only as good as our people. As we grow and transform, we are looking for team members (illumineers) who share our bias for speed, delivery over perfection, and an entrepreneurial mindset. Joining us now is a chance to be part of our transformation.

Who we need

Reporting to the Manager of DevOps, we are hiring a Senior DevOps Engineer / Technical Lead to design, implement, and lead infrastructure and platform engineering initiatives. In this role, you will serve as a senior technical resource, collaborating closely with leadership to drive infrastructure strategy and mentor other engineers. You will focus on large-scale, high-availability production environments, drawing on your expertise in Kubernetes, enterprise networking, virtualization, and automation. You will play a key role in ensuring the performance, scalability, and reliability of our infrastructure while guiding the team in adopting best practices and cutting-edge technologies.

If you are an innovative thinker who thrives in environments that embrace new ideas rather than resist them, this is the place for you.

This is a hybrid opportunity, working Mondays, Tuesdays and Thursdays on-site in our downtown Toronto office. Our headquarters are located within minutes of St. Andrew and Union subway stations.

What's in it for you

Cutting-edge technology. Our system processes billions of requests daily, requiring hyper-efficient and highly scalable infrastructure. You will work within a highly sophisticated, enterprise-scale system that integrates the latest advancements in infrastructure, networking, and automation. From advanced Kubernetes orchestration to low-level infrastructure optimization, you will be exposed to complex engineering challenges that keep your skills sharp.

Unique engineering challenges. You will operate within an ultra-low latency system, where engineers regularly diagnose and optimize milliseconds of latency—sometimes even troubleshooting an extra five-second delay in response times. You will solve challenges in distributed systems, observability, and deep performance tuning that few other industries can offer.

Autonomy. We value engineers who take the initiative to lead projects and develop new solutions rather than waiting for top-down direction. We push the limits to improve and reach our full potential, individually and as a company.

Professional development. You want to grow your skills, your influence, your career. We are committed to building the strengths of our team. You will work with some of the best minds in the field, continuously learning new technologies, advanced troubleshooting techniques, and best practices that push the boundaries of what is possible.

How you will make an impact:

  • Design, implement, and operate Kubernetes clusters at scale. You will lead the deployment and management of Kubernetes clusters in production environments, ensuring reliability and performance at scale. You will develop and maintain custom Kubernetes Operators and CSI drivers to extend cluster functionality and meet specific operational needs.
  • Engineer and automate on-premises infrastructure. You will design, maintain, and automate bare-metal and co-location (Colo) environments without reliance on public cloud providers. This role requires a deep understanding of physical infrastructure, data center operations, and custom hardware integrations to optimize performance and reliability.
  • Develop automation solutions for enterprise networking. You will create and maintain automation workflows for enterprise networking environments, particularly those using Cisco technologies. Your work will ensure seamless integration of network changes and configurations into CI/CD and Infrastructure as Code (IaC) workflows, improving operational efficiency and reducing manual effort.
  • Build and maintain production-grade software and tools. You will develop infrastructure automation and management tools using Go, Python, Bash, TypeScript, and Rust. Through high-quality, maintainable code, your work will improve system monitoring, operational efficiency, and platform reliability.
  • Design and automate VMware environments. You will architect and manage VMware infrastructure, including vSphere, vCenter, and vSAN, ensuring seamless integration with Kubernetes and CI/CD workflows. Your work will enhance virtualization efficiency, automation, and scalability across environments.
  • Administer Linux and Windows systems. You will manage Linux environments (RHEL, Debian, Ubuntu) at an advanced level, ensuring stability, security, and performance. Additionally, you will support Windows Server environments, including Active Directory integration, to maintain interoperability across platforms.
  • Lead Infrastructure as Code (IaC) and configuration management. You will drive the adoption and implementation of Terraform and Ansible to enable version-controlled, repeatable, and automated infrastructure deployment. Your expertise will ensure consistency, scalability, and efficiency in infrastructure provisioning.
  • Architect and optimize CI/CD pipelines. You will design and maintain CI/CD pipelines and processes using GitLab CI, Jenkins, ArgoCD, and other automation tools. Your contributions will enhance deployment velocity, reliability, and security, supporting continuous delivery and operational excellence.
  • Mentor and support engineering team members. You will provide guidance, mentorship, and code reviews for junior and intermediate engineers, sharing best practices and fostering a collaborative learning environment. Your leadership will help the team overcome challenges and improve technical skills.
  • Implement and manage monitoring and observability tools. You will deploy and maintain real-time monitoring and observability solutions such as NetData, New Relic, Prometheus, and Grafana, ensuring proactive system health monitoring and performance optimization.
  • Work within structured ITIL processes. You will operate within an ITIL-based framework, contributing to incident management, change management, and problem resolution. Your involvement will support continual service improvements and operational efficiency.
  • Apply and advocate for DevOps methodologies. You will promote and implement DevOps principles, ensuring seamless alignment between development, operations, and business objectives. Your work will foster a culture of automation, collaboration, and continuous improvement.
  • Lead technical design discussions and innovation. You will actively participate in architectural discussions and strategic planning, challenging existing approaches and introducing innovative solutions to improve scalability, security, and performance.
  • Identify and deliver innovative solutions. You will proactively identify, propose, and implement solutions to complex infrastructure challenges, often requiring custom-built tools and creative problem-solving approaches.
  • Participate in on-call rotations and incident response. As part of an on-call rotation, you will respond to production incidents and critical issues as needed, ensuring minimal downtime and rapid resolution.

What you bring:

  • The technical expertise. You have worked as a Senior DevOps Engineer, Systems Engineer, or in similar roles, ideally in high-availability production environments. You have extensive hands-on Kubernetes experience in production, including custom controllers/operators, CSI drivers, and multi-cluster management. You have a deep understanding of co-location and bare-metal environments, including rack/stack, PXE booting, provisioning, and physical hardware management. You have networking experience, including Cisco enterprise networking (routing, switching, VLANs, firewalls), and have automated network configurations using Ansible or similar tools.
  • The software development expertise. You have expert-level coding skills in one or more of the following: Go, Python, Bash, TypeScript, or Rust, including building automation tools, operators, and CLIs. You have deep knowledge of VMware (vSphere, vCenter, ESXi, vSAN), with experience scripting and automating tasks using PowerCLI or similar tools. You have expert-level Linux administration skills (Red Hat, Debian/Ubuntu) and a solid working knowledge of Windows Server, including Active Directory integrations and DNS.
  • The infrastructure and automation knowledge. You have strong experience with Infrastructure as Code (Terraform, Ansible) and configuration management principles. You have advanced CI/CD expertise, including pipeline design, artifact management, security scanning, and deployment strategies (GitOps, ArgoCD, FluxCD). You have experience working with observability stacks, such as NetData, New Relic, Prometheus, Grafana, Loki, and ELK. You have strong knowledge of ITIL processes, with experience operating within structured incident, problem, and change management environments. You have exposure to ultra-low-latency or real-time environments.
  • The critical thinking skills. You can innovate and find creative solutions to complex problems without relying on cloud-native offerings. You are highly analytical, able to assess trade-offs, and committed to optimizing performance, security, and reliability. You are comfortable with ambiguity and willing to figure things out when no clear path or process is outlined.
  • The interpersonal skills. You have applied DevOps principles (CI/CD, IaC, GitOps, immutable infrastructure) in enterprise environments and can clearly communicate their value to both technical and non-technical stakeholders. You can build trusting relationships with in-person and remote teams. You can lead, mentor, and collaborate with cross-functional teams, including development, operations, and security teams. You quickly identify when priorities need to shift and take feedback from leaders and peers.

What else should you know about us?

We are undergoing a transformative shift. We are embracing change and the opportunities that come with it, empowering every illumineer to innovate, experiment, and bring forward new ideas. Whether accessing new technology, restructuring workflows, or expanding your team, you will have full support if you can make the business case.

We are a broad and diverse team, but we all share a passion for success, a drive to do more, and a love of creating connections. We hire for talent and commitment and provide the guidelines and guidance to elevate skills, knowledge, and abilities across all areas. This is a place where proven methods meet bold ideas, offering opportunities to grow personally and professionally.

To support a healthy work-life balance, we offer a flexible work environment, a meal credit for your in-office days, and a free in-house massage with an RMT every eight weeks. That is in addition to our comprehensive benefits, which include life, AD&D, and long-term disability insurance, as well as coverage for prescriptions, dental, vision, mental health, and professional health services. You will also have access to a workplace advisor, the Vitality Wellness app, and a $300 annual healthcare spending account.

Apply now

If you want to seize the opportunity to impact a company and influence an industry, and you have 70% of what we are looking for, apply now. We can't promise an interview, but we will consider your whole application.

What you can expect from our interview process:

  • A virtual interview with a senior Talent Advisor to discuss your experience and interest in the role, plus an online technical assessment.
  • A virtual technical interview with the Director, Infrastructure & Operations and the Senior DevOps Engineer to discuss your technical skills and problem-solving approach.
  • An at-home technical assessment.
  • A final interview with the CITO and Hiring Manager to discuss any final questions you have about the product, the team, or the role and gain a deeper understanding of what it’s like to work with us.

illumin is firmly committed to diversity within its community and welcomes applications from racialized persons/persons of colour, Indigenous People of North America and the world, persons with disabilities, 2SLGBTQIA+ persons, and those who may contribute to the further diversification of ideas.

We are committed to providing equitable opportunities in employment and to providing a workplace which is free from discrimination and harassment. We are equally committed to providing an inclusive and accessible workplace. If you require accommodations at any stage of the interview process, please email us at hr@illumin.com.

