Applies engineering knowledge to design and development work, contributing to a comprehensive logging, monitoring, and alerting (LMA) solution that provides feedback on the system health of the company’s 5G cloud infrastructure. This position will serve as a key contributor as an LMA Engineer responsible for platform alerting for Network Cloud Software Releases and the Cloud Platform Organization, ensuring proper response to cluster and server issues.
Key Roles and Responsibilities:
Develop and leverage Logging, Monitoring, Alerting tools such as Nagios, Alertmanager, Alerta, Kibana, Grafana, Elasticsearch, and Prometheus
Collaborate to gather and review software requirements/user stories, provide estimates, create software design specifications and collaborate with engineers/architects to assess and test hardware and software interactions.
Develop and contribute monitoring and logging alerts to ensure system-level health
Create and execute automated test plans/strategies based on business requirements, and collaborate with engineers/architects, clients, etc. to validate test environments, test data, and test results, design and implement code fixes, validate outcomes against expected results, and produce associated reporting. In addition to unit testing, responsibilities may include dynamic application security testing, interface testing, integration testing, end-to-end testing, and/or user acceptance testing. Support applications and solve configuration and environment issues. Support the software deployment process.
Participate in open source development and productizing of emerging and existing technologies
Leverage CI/CD pipelines to validate and exercise new code
Execute software deployment flows manually and automate the resulting workflows
Debug environmental and deployment issues in cloud stacks, including Jenkins, Kubernetes, Docker, and Kubernetes workloads
Develop Kubernetes declarative intent for software lifecycle management
Provide Tier 5 support for Production Operations on issues impacting tenant experience and functionality
Knowledge and skills:
Experience in LMA tooling (Kibana, Grafana, Elasticsearch, Prometheus, Nagios, Alertmanager)
Experience in Kubernetes, Kubernetes architecture, Docker
Experience with Deployment manifests
Intermediate or expert competency in Python
Expertise in Helm or similar experience with lifecycle managers for Kubernetes workloads
Experience with Kubernetes package management, templated deployments, and lifecycle management
Golang experience
Knowledge of Ansible and Salt
Proven open source development experience
OpenStack, Ceph, and bare-metal awareness
Proven knowledge of and experience in DevSecOps, Scrum, and Agile
Education: Bachelor of Science degree preferred in Computer Engineering, Computer Science, Applied Science, Electrical Engineering, or Math; Developer nanodegree; or equivalent experience.
Experience: Typically requires 2-3 years of related technical experience