As a member of the Advisor Services Technology, Service Availability and Engineering team, you will be immersed in a collaborative, innovative, and technically challenging environment. The platforms we support are essential to the success of the investment advisor and offer features such as client relationship management, portfolio accounting, trading, data integration, reporting, and much more.
We are seeking a dynamic, motivated individual to lead and deliver exceptional solutions for the production resiliency of the systems. The role incorporates aspects of software engineering and operations, combining SRE and DevOps skills to come up with efficient ways of managing and operating applications. The role will require a high level of responsibility and accountability to deliver technical solutions.
What you are good at
- Practice and lead the team with a Site Reliability Engineering and DevOps mindset, solving problems through automation and innovation
- Partner with the Architects and SMEs to ensure implementations are architected and designed from the aspect of production resiliency
- Identify opportunities to build innovative tools and solve unique operations problems on large enterprise and mission-critical applications
- Develop tools, frameworks, and instrumentation to validate and increase rollout success for applications.
- Partner with the Support organizations to build and roll out plans for enhanced telemetry and to reduce defect leakage in software delivery to production.
- Troubleshoot mission-critical application workflows in real time and incorporate feedback into product development.
- Work closely with dev teams during the design phase, and build and perform infrastructure upgrades to support our applications' availability and reliability.
- Monitor the current-state solution portfolio to identify deficiencies through aging of the technologies used by the application, or misalignment with business requirements.
- Advocate and augment the Schwab Reliability Engineering principles, guidelines and standards
- Assist with the evaluation and selection of software product standards and services and with the design of standard and custom software configurations; improve our compliance procedures and enhance our risk posture.
What you have
- BS in computer science or a related field, with 10+ years of experience with the listed technical skills
- Extensive experience leading teams in enterprise-level infrastructure orchestration with Ansible, Chef, Salt, Puppet
- Solid experience in High Availability and distributed systems, Linux and Windows administration, Data and SAN Storage Networks, NAS and Networking, leveraging tools to instrument and automate proactive, and eventually predictive, availability solutions
- Proven track record leading complex enterprise production application development and support efforts adhering to a mix of DevOps & SRE frameworks
- Experience transitioning platforms to the cloud, with knowledge of cloud frameworks and design patterns, microservice architectures, and 12-factor design
- Extensive knowledge of networking, including DNS, DHCP, firewalls, load balancers and IP routing
- Ability to grasp difficult concepts, large architectures, and sophisticated designs quickly and troubleshoot with debugging skills across a variety of integrated platforms
- Proven capability to provide operational visibility on environment health to Senior Leadership, Technology and Business partners
- Receptive, approachable teammate, with the ability to positively interact with business partners, technology teams, recruiting personnel, offshore, and professional services
- Strong customer advocate with excellent written and verbal communication skills
- Experience with monitoring tools such as Splunk, Zenoss, Elastic, AppDynamics, Dynatrace
- Database experience with Oracle, SQL Server, MongoDB, Aerospike
- Experience with Atlassian tools (Jira, Confluence, Bamboo, Bitbucket) and Agile frameworks
- Preferred experience with C#, .NET, and scripting