Site Reliability Engineer
Manticore Games is the developer of Core, a platform and community that is redefining how virtual worlds and games are created, published, and played. Core’s mission is to empower and discover a whole new generation of creators.
We are looking for Senior Site Reliability Engineers who can help build, monitor, and run a live platform that will revolutionize the way we create, play, and share gameplay experiences.
- Monitor and operate Core platform services
- Troubleshoot operational problems as they arise, test fixes, and perform follow-ups to ensure issues have been correctly resolved
- Participate in an on-call rotation to help resolve incidents
- Apply your systems knowledge to triage problems and tune resource usage
- Participate in code reviews for projects written by your team
- 5+ years of experience monitoring and operating a live environment
- Hands-on experience with container technologies such as Docker and Kubernetes
- Experience managing Linux VMs
- Experience with cloud services and architecture (Azure, GCP, AWS)
- Ability to write and maintain tools in a language like PowerShell
- Experience with several of the following tools: Splunk, Fluentd/Fluent Bit, Jenkins, Microsoft Orleans, Datadog, NGINX, Redis
- Familiarity with several different database technologies
- Experience deploying and managing Kubernetes clusters with tools like Helm and Terraform
- Experience with C# or a similar programming language
- Experience leading investigations and resolving live environment outages
Manticore Games provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.