System and Automation Engineer

  • Full-time

Company Description

The California Independent System Operator (ISO) manages the flow of electricity across the high-voltage, long-distance power lines that make up 80 percent of California's power grid. We safeguard the economy and well-being of 30 million Californians by "keeping the lights on" 24/7.

As the impartial grid operator, the California ISO opens access to the wholesale power market that is designed to diversify resources and lower prices. It also grants equal access to 25,865 circuit-miles of power lines and reduces barriers to diverse resources competing to bring power to customers.

The California ISO's function is often compared to that of air traffic controllers. It would be grossly unfair for air traffic controllers to represent one airline and profit from allowing that company's planes to go through before others. In the same way, the California ISO operates independently—managing the electron traffic on a power grid we do not own—making sure electricity is safely delivered to utilities and consumers on time and reliably.

Job Description

Under the general direction of the Manager, maintains and administers the UNIX environment infrastructure across complex environments and redundant sites to ensure the highest level of availability, data integrity and security. Identifies, researches, coordinates and resolves technical problems, and tracks and manages issues to ensure timely resolution. Develops and manages automation of build and deployment tools. Collects and distributes data for capacity planning and performance management review.

What's In It for You

Our purpose is to lead the way to tomorrow's energy network. Make a difference and impact millions of people who depend on electricity in their everyday lives.

  • You get to work on interesting and challenging assignments that will help grow your skill set.
  • You will work in an extremely collaborative environment inside our LEED certified Folsom, California campus.
  • You will be challenged, be a part of a winning team, and your contributions will be rewarded and recognized.

What You Will Be Doing:

  • Installs, configures, monitors, upgrades, tests, and administers UNIX servers, databases and supporting infrastructure systems to ensure the highest level of availability, performance and security for specific project initiatives.
  • Responds to incidents and exercises independent judgement to create effective solutions, workarounds, and/or strategies to quickly resolve issues and minimize the impact to power grid and/or market reliability.
  • Participates in developing and managing automation and deployment of source code from the development team. Creates documentation and procedures and maintains a release repository which includes build and release procedures, dependencies, and notification lists. Measures and monitors progress to ensure releases are delivered on time and meet the acceptance criteria.
  • Collects and distributes data for supported systems capacity planning and performance management. Proactively identifies issues and participates in creating workable solutions and providing recommendations to management. 
  • Uses data replication and other appropriate systems to ensure data integrity and to provide for superior uptime and accurate data collection. Provides daily maintenance and technical support to project teams on the development of application systems and coordinates and deploys approved production-ready applications into multiple environments. 
  • Provides UNIX administrative support to project teams for installation and configuration of servers and software, both purchased and internally developed, to meet new functionality requirements.
  • Collaborates with team members to perform root cause analysis and resolve technical issues related to integrated hardware, servers, software and other supported systems. 
  • Works with representatives from various ISO business units to design technology solutions and strategies that meet business requirements. Responds to incidents and participates in the resolution of systems failures. Creates documentation, standards and procedures to comply with corporate and industry standards.
  • Performs change management, incident management, and problem management associated with day-to-day administration of systems.
  • Builds out, maintains and troubleshoots critical infrastructure, ensuring the highest level of availability, performance and security.

Qualifications

Level of Education and Discipline

  • A Bachelor's degree (BA, BS) or equivalent education, training or experience in Computer Science, Information Technology, Management of Information Systems, Engineering or related technical field.  Master’s degree preferred.

Amount of Experience

  • Equivalent years of education and training, plus two (2) or more years of related experience.

Certifications

  • UNIX certification preferred.
  • Linux administration certification preferred.

Type of Experience

  • Exposure to working on security risks within a UNIX server environment.
  • Experience with Red Hat Linux, Kickstart, IDM (single sign-on), LVM, virtualization, containers and Red Hat Satellite.
  • Knowledge of Apache, F5 (LTM & APM), and Splunk.
  • Strong UNIX/Linux fundamentals, scripting (Python/Perl/Shell), and automation skills.
  • Configuration management experience with Puppet, Ansible, or Chef.
  • Release management experience with RPM, make, ant, apt, and yum.
  • SCM (Software Configuration Management) experience with Git, Subversion, or ClearCase.
  • Proven working experience in installing, configuring and troubleshooting RHEL-based environments. Excellent troubleshooting ability and the ability to perform root cause analysis.
  • Performance tuning, capacity planning, and root cause analysis skills are strongly desired.
  • Experience managing HP DL server farms, including rapid deployments and HP iLO / IPMI management.
  • Understanding of and working experience with backup infrastructure technologies.
  • Ability to collaborate with other teams to resolve complex multi-tier issues.
  • Experience with disaster recovery planning and testing.
  • Experience with monitoring applications such as Nagios.
  • Able to provide recommendations for architectural enhancements and roadmaps.
  • Team player able to work in a fast-paced, dynamic environment.

Additional skills and abilities

  • Strong verbal and written communication and documentation skills required, with a demonstrated attention to detail.  
  • Ability to use deductive reasoning and analytical thinking with sound judgment and decision-making skills.  
  • Strong interpersonal and conflict resolution skills are also essential.  
  • Must be self-starting and willing and able to work independently in a dynamic corporate organization under pressure of tight deadlines and aggressive expectations.  
  • Problem solving skills with the ability to influence others without direct authority. 
  • Must be able to work effectively in a team environment as facilitator and team member.  
  • Must be proficient with Microsoft Office Suite. 
  • Systems knowledge from working in a complex environment and the ability to make recommendations on the evolution of technology within a 6- to 12-month time horizon.
  • Able to apply broad and comprehensive knowledge of principles, practices and procedures.

Additional Information

**We will also consider applicants for a Sr. Systems and Automation Engineer position. This position requires a Bachelor's degree or equivalent education, training or experience in Computer Science, Information Technology, Management of Information Systems, Engineering or related technical field. Master’s degree preferred.**