About the Role
The Behavox Platform is a scalable, fault-tolerant, and highly performant storage and processing system that allows us to manage and analyze massive volumes of data.
We have an extensive and flexible set of APIs to develop products that allow our clients to work through millions of data items, by searching, filtering, and visualizing relationships between entities in the system.
As a Site Reliability Engineer, you will be responsible for the availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning of all production systems and services.
You will work together with other DevOps, Product, and Engineering teams to design and implement SRE practices at Behavox and to build the foundational infrastructure that supports the rapid growth of the Behavox client base.
This is an incredible opportunity to discover the world of high-load data processing and face the challenges of distributed Big Data systems.
It will also give you the opportunity to:
1. Work with high-load and business-critical services that will have a big impact on the company
2. Implement your ideas in an environment that strives for continuous improvement
3. Be part of a fast-growing, dynamic company and work with modern technologies
More information about the tools and solutions used at Behavox can be found on our engineering blog: https://blog.behavox.engineering
What You'll Bring
- A deep and genuine interest in Behavox as demonstrated by a connection to its mission, marketplace, and/or technologies
- 5+ years of experience as an SRE/DevOps engineer responsible for deployment and maintenance of production systems
- Experience with public clouds (GCP/AWS). Knowledge of Google Cloud Dataflow, Cloud Functions, Pub/Sub, or similar AWS technologies would be a plus
- Automation skills: SaltStack or equivalent tools (Ansible), and knowledge of programming languages (Python, Golang, Java)
- Experience with the HashiCorp stack: Terraform, Nomad, Consul, Vault
What You'll Do
- Perform deployment and maintenance of high-load, large-scale distributed storage and data processing systems in public clouds
- Monitor, develop, and troubleshoot applications to resolve issues, lead incident support, and be part of the on-call team
- Automate routine operations using Python/Golang
- Maintain cloud-based services on public cloud providers (GCP/AWS)
- Administer and troubleshoot Linux operating systems and networks
What We Offer
- A truly global mission with a passionate community in locations all over the world
- Huge impact and learning potential as our aspirations require bold innovation
- Highly competitive compensation with 100% bonus pay already integrated
- Benefits include great health coverage for employees and their families
- Generous time-off policy and flexible work schedule