Cloud Reliability Engineer
Company: Apex Systems
Location: Fayetteville
Posted on: August 7, 2022
Job Description:
**Please email Cameron Ivey (civey@apexsystems.com) if interested in
this opportunity.
Work with a team of diverse architects and engineers with
backgrounds in PaaS, DevOps, Security, and IaaS operations to share
knowledge and practices, all in support of the Government Program
Office's global cloud and platform services environments.
Responsible for the engineering and life-cycle management of mostly
cloud-based platforms to attain a high degree of reliability and
security.
DESIRED QUALIFICATIONS: BS degree in a STEM-related field + 3 years
of related experience
Responsibilities include:
- Run the production environment by monitoring availability and
taking a holistic view of system health.
- Build software and systems to manage platform infrastructure
and applications.
- Improve reliability, quality, and performance for cloud-hosted
applications.
- Measure and optimize system performance, with an eye toward
pushing our capabilities forward, getting ahead of customer needs,
and innovating to continually improve.
- Provide primary operational support and engineering for
multiple large, distributed software applications.
- Gather and analyze metrics from both operating systems and
applications to assist in performance tuning and fault
finding.
- Partner with development teams to improve services through
rigorous testing and release procedures.
- Participate in system design consulting, platform management,
and capacity planning.
- Create sustainable systems and services through automation and
uplifts.
- Balance feature development speed and reliability with
well-defined service level objectives.
Required qualifications include:
- Bachelor's Degree in a STEM field.
- DoD 8570 Level II (Security+).
- Ability to program (structured and OO) in one or more
high-level languages, such as Python, Java, C/C++, Ruby, or
JavaScript.
- Adept shell/Bash scripter.
- Experience with distributed storage technologies like NFS,
HDFS, Ceph, and S3.
- 2+ years of experience working with container orchestration
technologies, specifically Kubernetes.
- A proactive approach to spotting problems, areas for
improvement, and performance bottlenecks along with an ability to
offer and implement solutions to address these.
- Experience creating dashboards to track service health that
appeal to both technical and non-technical audiences, preferably
with Splunk.
- Excellent written and verbal communication skills, with a
strong attention to detail and a head for problem solving.
- Skilled at working in tandem with a team, or unsupervised as
required.
Preferred qualifications include:
- Bachelor's degree in Computer Science.
- Experience working with identity and access management
technologies and solutions.
- Experience with Agile development methodologies and
collaboration tools such as Jira and Confluence.
- Experience with monitoring and logging solutions, specifically
Splunk
- Any of the following: AWS Certified SysOps Administrator
Associate, AWS Certified Solutions Architect Associate, or the
Professional level of either certification, where applicable.
- 1+ years' experience working with GitLab.
- Skilled at creating Ansible playbooks and working with
AWX/Ansible Tower.
Keywords: Apex Systems, Cloud Reliability Engineer, Engineering, Fayetteville, North Carolina