Duties & Responsibilities
Medical Science & Computing is searching for a strong DevOps Monitoring Engineer to join our Monitoring team within the DevOps group to build new monitoring infrastructure and tools for unified monitoring of thousands of cloud and on-premises services and hosts. This is a great opportunity to work on challenging problems as part of a dynamic DevOps team, in a technical, scientific, and goal-oriented environment. NCBI offers flexible working hours, remote options, training courses on-site and off-site, conference attendance reimbursement and tuition reimbursement.
NCBI advances science and public health by providing free access over the web to biomedical literature and genomic data, making it one of the 400 most-visited sites in the world. NCBI's diverse staff of smart, talented, and deeply technical people collaborate to build critically valuable services for researchers, physicians, educators, students, and the general public. For example, NCBI develops and delivers PubMed, an index of over 29 million biomedical research abstracts, often with links to full-text literature and supporting data. NCBI is located in Bethesda, Maryland, and is part of the U.S. National Library of Medicine, one of the National Institutes of Health.
- Builds a modern Enterprise DevOps platform with a critically important observability layer, serving as a basis for automatic operations, such as auto-scaling and auto-healing.
- Works with in-house development teams to help them adopt the platform and best practices, including SLI/SLO/SLA.
- Performs research and evaluates new technologies to continuously advance the platform.
- Maintains a high level of education for the team and its customers.
- Practices Agile development and continuous improvement.
Duties and Responsibilities:
Develop and maintain modern monitoring solutions including:
- Application services
- CI/CD pipeline
- System operational monitoring of cloud and on-premises machines
- Synthetic monitoring platform
- Business metrics gathering and data analysis
Requirements:
- Solid Linux skills
- At least three years of professional experience
- Experience with AWS, GCP, Azure or other public cloud providers
- Expert in at least one of the following programming languages: Python, Rust, Go, Java, Scala, Kotlin
- Understanding of distributed systems design
- Customer-focused, team-oriented disposition
- Excellent communication and soft skills to deal with customers, peers and management
- Good judgement, sense of integrity and responsibility
- BS in a STEM field (Engineering, Computer Science, Mathematics, Physics)
- OR equivalent industry experience in Software Development
Desired Skills:
- Presentation skills
- Coaching skills
- Experience with TICK/TIGK stack (Telegraf, InfluxDB, Grafana, Kapacitor)
- Splunk, ELK
- Service mesh (Linkerd)
- Any other DevOps technologies, any prior DevOps experience