Location: North Bergen, New Jersey
Sr. Staff Engineer, SRE - Equinix Metal
Equinix is the world's digital infrastructure company, operating 210 data centers across the globe and providing interconnections to all the key clouds and networks. Businesses need one place to simplify and bring together fragmented, complex infrastructure that spans private and public cloud environments. With Equinix Metal, customers can rapidly deploy automated single-tenant infrastructure and interact with 1,800+ networks and 2,900+ cloud and service providers. Our global platform allows customers to place infrastructure wherever they need it and connect it to everything they need to succeed.
At Equinix, we help the world's digital leaders scale with agility, speed the launch of digital services, deliver world-class experiences, and transform people's lives. Our culture is based on collaboration and the growth and development of our teams.
We hire hardworking people who thrive on solving challenging problems and give them opportunities to hone new skills and try new approaches as we grow our product portfolio with new software and network architecture solutions. We embrace diversity in thought and contribution and are committed to providing an equitable work environment that is foundational to our core values as a company and is vital to our success.
This role can be remote or based in one of our office locations.
Job Summary: Site Reliability Engineering is a centralized SRE team focused on observability, incident management, and service level objectives. We work in a consultative and enabling model with service teams, product, and leadership to foster service ownership and continually drive for reliability across the whole software engineering process.
Responsibilities
Work with engineers to implement distributed tracing and observability practices
Assist on-call teams with on-call training, sustainability, and alert tuning
Help teams work through the SLO process and specify SLIs
Build and maintain our full service catalog
Facilitate blameless incident retrospectives
Mentor and train engineers to write better retrospectives
Champion reliability and robustness work through the prioritization processes
Qualifications
7+ years of experience
Bachelor's in Computer Science or Computer Engineering
Demonstrated experience with one or more of the core SRE practices (observability, incidents, SLOs)
Commitment to DevOps principles: people and process over code, and maximizing collaboration across the organization
Ability to quickly understand and contribute to code written primarily in Go and Ruby
Desire to build on and evolve our SRE processes, capabilities, and programs that impact the whole software team
Adaptability & flexibility to work with a fast-growing software team to help it build a more reliable production system
Ability to work independently on large reliability initiatives
Equinix is an equal opportunity employer. All applicants will receive consideration for employment without regard to race, religion, color, national origin, sex, sexual orientation, gender identity, age, status as a protected veteran, or status as a qualified individual with disability.