Senior Site Reliability Engineer
At Caffeine, we want to change how people consume live television - making it more friendly, connected, and fun. To do this, we’re building a new social broadcasting platform that features world-class content, easy-to-use broadcasting tools, a social and fun viewing experience, and an engaged broadcaster community.
This is an exciting and enormous challenge, and we will only be successful if we build a supportive and collaborative team. Our teams prioritize delighting our community of viewers and broadcasters, working with intention, taking ownership of our commitments, and acting with resilience and determination, all with the goal to ship greatness, always.
As a Senior Site Reliability Engineer, you will be a part of a fast-growing team of dedicated Developer Platform engineers working to build, iterate on, and maintain the mission-critical systems that run and scale in production. The Developer Platform team is responsible for ensuring the highest levels of availability in production environments. Qualified SREs will come from a software development background and have a passion for Software Architecture, Distributed Systems, Operations, Kubernetes, and DevOps.
What you'll do:
- Work alongside cross-functional development teams in an embedded fashion as an authority on platform and DevOps practices
- Bring legacy and greenfield applications onto the next generation of Caffeine's Kubernetes-based platform
- Manage, secure, and monitor all installed systems and infrastructure
- Learn and adapt to the ever-changing landscape of tools and services responsible for modern software delivery
- Focus on improving the quality of life of each and every developer at Caffeine by building tools and culture
- Partner with development and systems engineering to improve production deployments, management, and environment stability
Who you are & What you've done:
- Solid understanding of performant, scalable Golang
- Demonstrated cloud experience, preferably in AWS and/or GCP
- Strong understanding across SRE, DevOps, Distributed Systems, and Platform
- Experience operating and scaling Kubernetes-based platforms for highly functional development teams. Bonus points for hands-on Red Hat OpenShift experience
- Experience running production workloads through a service mesh such as Istio
- Experience scaling highly elastic distributed systems
- Excellent project management skills and the ability to work in a fast-paced work environment
- Comfortable debugging and developing code written in Java, Golang, or other strongly typed languages
- Hands-on experience writing operators that automate platform operations
- Experience with monitoring systems (e.g., Datadog, SignalFx, Splunk)
- Experience with automation software (e.g., Terraform, Ansible)
- Proven scripting skills (e.g., shell scripts, Perl, Bash, Python)
We are committed to an inclusive and diverse Caffeine. We believe that different perspectives lead to better ideas, and better ideas allow us to better understand the needs and interests of our diverse, global community. We welcome people of different backgrounds, experiences, abilities, and perspectives, and we are an equal opportunity employer.