Manager, Cloud Operations
Redis
Israel
Who we are
We're Redis. We built the product that runs the fast apps our world runs on. (If you checked the weather, used your credit card, or looked at your flight status online today, you’re welcome.) At Redis, you’ll work with the fastest, simplest technology in the business—whether you’re building it, telling its story, or selling it to our 10,000+ worldwide customers. We’re creating a faster world with simpler experiences. You in?
Why you'll love this job
As a Manager in Cloud Operations at Redis, you will lead a team of senior engineers responsible for keeping Redis Cloud reliable, available, and fast for customers worldwide. You’ll combine hands-on technical leadership with day-to-day people management, helping your team grow while maintaining strong operational standards.
You will work in a dynamic, fast-paced, large-scale, multi-cloud production environment, collaborating closely with R&D and other engineering teams, and continuously improving how we run and scale our distributed systems. You will also drive innovation in how the team operates by leading cross-functional projects that introduce practical new tools, automation, and ways of working into production. Your leadership will help align operations with business priorities, build a strong operations culture, and steadily improve our reliability and efficiency in a fast‑evolving, technical domain.
Our ideal candidate thrives on leading multiple, high‑visibility operational initiatives, communicates clearly with different stakeholders, and enjoys working in an environment that demands proactive risk management, cross‑team collaboration, and continuous improvement. If you are driven by ownership, enabling teams, and delivering reliable cloud services at scale, this is your opportunity.
What you’ll do
- Lead and develop a CloudOps team in Israel as part of the global Cloud Operations organization, setting clear goals, expectations, and ways of working, and investing in people's growth and performance.
- Act as a hands-on technical leader in a cloud‑native, high‑scale environment, with a focus on reliability, resiliency, observability, automation, performance, and cost efficiency.
- Own and improve core operational processes: on‑call and incident response, escalations, change management, runbooks, production‑readiness, and post‑incident reviews that drive real follow‑through.
- Partner directly and proactively with R&D, Platform, Product, Support, and Customer Success to shape and reduce reliability risks, improve deployment safety and performance, and ensure customer‑impacting issues are tracked to closure without constant reminders.
- Use data, metrics, and observability tooling, together with automation and AI‑driven workflows, to measure system health, guide decisions, identify patterns, and drive continuous improvement in reliability and operational excellence.
What you’ll need to have
- 5+ years of experience in Cloud Operations, SRE, Production Engineering, or similar roles in large‑scale production environments, including 3+ years managing or leading engineering teams.
- Strong hands‑on experience with at least one major public cloud (AWS, GCP, or Azure) running production workloads at scale, balancing resilience, performance, and cost.
- Good understanding of Linux and networking fundamentals, and hands-on experience with automation or scripting. Exposure to modern observability and incident management tools and practices, and familiarity with databases or distributed data systems (experience with Redis or similar technologies is a strong plus).
- Proven ability to lead teams through incidents and operational change with clear, calm communication under pressure, and to collaborate effectively across time zones and multiple stakeholder groups.
- A high degree of ownership and accountability, with a data‑driven approach to prioritization and decision‑making, and the ability to balance process discipline with pragmatism and speed to delivery.
Extra great if you have
- Experience leading operational projects (e.g., automation, reliability improvements, cost optimization) and contributing from design to rollout.
- Experience with reliability metrics and incident response processes, including standardizing RCAs and driving long‑term fixes.
- Familiarity with ITIL/ITSM concepts (incident, change, problem management, and security/compliance processes) adapted for modern cloud operations.
- Experience with cost optimization, capacity planning, or FinOps‑related practices in large‑scale cloud environments.
- A track record of automation and process transformation in distributed teams, turning ad‑hoc workflows into scalable, repeatable, and well‑documented operational practices.
#LI-BL1
#LI-HYBRID
As a global company, we value a culture of curiosity, diversity of thought, and innovation from our employees, customers, and partners. Redis is committed to a diverse and inclusive work environment where all employees’ differences are celebrated and supported, and everyone feels safe to bring their authentic selves to work. Redis is dedicated to equal employment opportunities regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, marital status, disability, gender identity, gender expression, Veteran status, or any other classification protected by federal, state, or local law. We strive to create a workplace where every voice is heard, and every idea is respected. Redis is committed to working with and providing access and reasonable accommodation to applicants with mental and/or physical disabilities. If you think you may require accommodations for any part of the recruitment process, please send a request to [email protected]. All requests for accommodations are treated discreetly and confidentially, as practical and permitted by law. Any offer of employment at Redis is contingent upon the successful completion of a background check, consistent with applicable laws. Redis reserves the right to retain data longer than stated in the privacy policy in order to evaluate candidates.