Site Reliability Engineers are critical to keeping both our customers and our engineering team running. As part of the Cloud Platform group, you will build software that helps with operations and support, help resolve support escalations, and play a crucial role in the continuous improvement of people, processes, and technology within our organization.
Site Reliability Engineers have a variety of responsibilities, which may include:
Owning cloud production environments by monitoring availability and taking a holistic view of system health
Designing and developing tools to manage the platform infrastructure and applications
Improving reliability, stability, performance, and security of the overall system
Measuring and optimizing performance, and identifying and resolving bottlenecks
Measuring, identifying, and applying the most cost-effective solutions within AWS
Providing operational support and engineering for Vectra products and teams
Influencing platform architecture decisions
Automating, automating, automating
QUALIFICATIONS:
Relevant Bachelor's degree (Computer Science, Engineering, etc.) or equivalent experience
3 years of hands-on DevOps or operations experience on cloud production systems
Understanding of core SRE concepts (e.g., SLIs, SLOs, chaos engineering, blast radius)
Strong knowledge of networking
Zealous about production
Comfort with cloud deployments and security best practices
Linux system proficiency
Strong container/serverless experience (including Kubernetes)
CI/CD pipeline experience - GitLab, Jenkins, etc.
Proficiency in Python
Hands-on experience with Amazon Web Services (AWS)
HashiCorp ecosystem experience - Terraform, Packer, Consul, Vault
Experience with telemetry tools/pipelines (metrics, logs, and traces with technologies like OpenTelemetry)
Belief in DevOps and Agile principles
ABOUT THE COMPANY:
Vectra delivers a new class of real-time threat detection and advanced analysis of active breaches. Vectra picks up where perimeter security leaves off, using AI to provide deep, continuous analysis of both internal and Internet-facing network activity for all phases of the attack progression as attackers attempt to breach, spy, spread, and steal.
Vectra directly analyzes network traffic and other relevant data sources in real time using a combination of patented data science, machine learning, and behavioral analysis. We detect attacker behaviors and user anomalies within our customers' ecosystems, provide clear prioritization, and include relevant context so that our users can quickly and easily identify active attacks.