Senior Site Reliability Engineer (SRE)
Company: Bellota Labs
Location: Redwood City, California
Posted on: February 19, 2026
Job Description:
At Bellota Labs, we are a fast-paced, hypergrowth startup poised to
revolutionize the gaming world with ClubWPT Gold, a groundbreaking
product from the World Poker Tour. Driven by innovation, game
integrity, and exceptional customer experiences, we are on a mission
to set new standards in online gaming.

We are seeking an experienced Senior Site Reliability Engineer (SRE)
to design, build, and maintain highly reliable, scalable, and secure
systems. You will play a critical role in ensuring system
availability, performance, and operational excellence across our
infrastructure and applications. As a senior member of the team, you
will also mentor engineers, influence architecture decisions, and
drive best practices in reliability engineering, automation, and
incident management.

Key Responsibilities:

Reliability & Availability
- Design and implement highly available, scalable, and fault-tolerant systems.
- Define and maintain SLIs, SLOs, and SLAs.
- Lead incident response, root cause analysis (RCA), and postmortems.
- Improve system resiliency and reduce operational toil through automation.

Observability & Monitoring
- Design monitoring, alerting, and logging strategies.
- Implement tools such as Prometheus, Grafana, Datadog, ELK, or similar.
- Establish proactive alerting and capacity planning processes.

Performance & Scalability
- Conduct performance testing and optimization.
- Identify bottlenecks and implement improvements.
- Support system scaling initiatives and architecture reviews.

Collaboration & Leadership
- Partner with engineering teams to embed reliability into development processes.
- Lead reliability initiatives and cross-functional projects.
- Mentor junior engineers and promote SRE best practices.

Experience:
- 5 years of experience in SRE, DevOps, or Infrastructure Engineering.
- Strong experience with cloud platforms (AWS).
- Deep understanding of Linux systems and networking fundamentals.
- Experience with containerization and orchestration (Docker, Kubernetes).
- Proficiency in scripting/programming (Python, Go, Bash, or similar).
- Experience with monitoring and observability platforms (Datadog/Prometheus).

Preferred Technologies (Nice to Have):
- Experience operating high-scale production systems.
- Experience with microservices architecture.
- Background in database reliability (Postgres, MySQL, Redis, etc.).
- Experience implementing SRE practices (error budgets, blameless postmortems).
- Experience with AI-driven SRE.

- Lead High-Impact Projects: Play a key role in delivering innovative gaming experiences to a global audience.
- Collaborate Across Borders: Work with talented teams across Asia and the US.
- Fast-Paced Growth: Be part of a hypergrowth startup with ambitious goals.
- Competitive Benefits: Enjoy a top-tier compensation package in a dynamic company.

We may use artificial intelligence (AI) tools to support parts of the
hiring process, such as reviewing applications, analyzing resumes, or
assessing responses. These tools assist our recruitment team but do
not replace human judgment. Final hiring decisions are ultimately made
by humans. If you would like more information about how your data is
processed, please contact us.