This job listing expired on May 12, 2021

Do you visualize your future at NVIDIA? WE DO!

Our team builds and supports a variety of internal services. Among our most important projects, we support the underlying services that power the GeForce NOW game patching system, which has high stability demands because it serves our external customers. We also develop and operate an internal content delivery network for game builds, spanning multiple NVIDIA sites across the globe and delivering over 2 petabytes of data to clients each month. Additionally, we develop and operate a game build encryption system, keeping up with ever-changing security requirements from our partners. We operate a number of Ceph installations, the largest of which currently exceeds a petabyte. Finally, we develop a unified Linux deployment, configuration, and monitoring infrastructure to boost the productivity of DevOps engineers in surrounding teams. Our infrastructure is hosted in on-premises data centers and machine labs at NVIDIA sites.

We’re looking for new team members to help us bring our operations to the next level, including monitoring and incident tracking processes, first-line support, and documentation/statistics. Our technology stack relies on a combination of industry-standard open source components (such as CentOS, SaltStack, CheckMK, Cassandra, Spark, Ceph, Redis, and PostgreSQL) and proprietary software components based on Golang, NodeJS, Ruby, Python, and Perl.

What You'll Be Doing

  • You will work flexible shifts in a 5/2 schedule in full compliance with the Russian labor code.
  • You will monitor the health of production systems.
  • You will fine-tune metric thresholds and eliminate false positives.
  • You will track incidents and analyze SLA reports.
  • You will act as the first line of support for different projects, performing initial diagnostics and mitigation.
  • You will write post-mortems and update existing documentation if needed.
  • You will participate in enhancements to the monitoring system and diagnostic tools.

What We Would Like To See

  • Experience with basic troubleshooting in a Linux environment, such as CPU/memory usage and disk and network I/O.
  • Ability to diagnose networking issues via ping/traceroute/tcpdump.
  • Bash scripting experience.
  • Good communication skills.
  • Ability to document your work.
  • Self-motivation, engagement, and eagerness to learn.

Ways To Stand Out From The Crowd

  • Upper-intermediate verbal and written technical English.
  • We appreciate additional programming skills in Python or other scripting languages.
  • Real passion for PC/mobile gaming is highly desired.

We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most brilliant and talented people in the world working for us. If you're creative and autonomous, we want to hear from you!