Splunk Cloud Infrastructure Site Reliability Engineer - San Francisco, California
As a Site Reliability Engineer, you will join a team that works closely with our application development team to provide highly available services to customers who use Splunk as a service on cloud infrastructure. You will provide input on and execute security and deployment practices, scaling, and metrics, and handle general day-to-day server management.
Monitor all the things. We’re Splunk; we love the data
Design, scale out, and maintain our AWS cloud-based infrastructure
Design and write code to develop and maintain the systems that power Splunk cloud services hosted in the public cloud
Develop scripts and applications to automate system deployment, scaling, and infrastructure
Implement horizontally scaled-out systems that support thousands of concurrent Splunk users
Build deployment, management and monitoring systems using “infrastructure as code” tools and techniques
Implement zero-downtime production pushes. Provide fanatical production support for applications and infrastructure
AWS (EC2, S3)
Linux system administration skills; Ubuntu Linux a plus
Knowledge of monitoring tools (e.g., CloudWatch, Splunk) and configuration management tools (e.g., Ansible, Chef, Puppet)
Hosted and cloud-based service experience
Shell scripting and high-level language expertise
We like Ruby and Python a lot.
Previous experience in a fast-paced agile development environment using tools such as Git, Jira, and Jenkins
Bachelor's degree in Computer Science or relevant experience