Senior Server Engineer / Technical Team Lead
The role is responsible for providing professional third-line onsite technical support and field engineering services to clients by proactively identifying and resolving technical incidents and problems.
Through pre-emptive incident and resolution activities, the role restores service to clients by managing incidents and seeing them through to an effective resolution.
The primary objective is to ensure that all requests, process events and incident resolutions are handled with zero missed SLA conditions.
The role manages incidents of high complexity, conducts advanced and complicated tasks, and resolves a diverse range of complex problems.
This position uses considerable judgement and independent analysis within defined policies and practices.
Experience & Qualifications:
- Diploma, degree or relevant qualification in IT/Computing (or demonstrated equivalent work experience)
- MCSE 2012 or later, Exchange (and/or O365), ITIL; SCCM, SCOM, SFB etc. would be advantageous
Any of the above certifications is a plus. The Engineer is expected to gain certifications relevant to the services supported. Certifications carry additional weight in assessing a candidate’s suitability for the role. ITIL certification should be held on joining or will be provided as part of induction.
Role / Responsibilities:
Internal: engage with Operations Centre and Centre of Excellence teams, provide and receive instructions, and manage escalated incidents as necessary following agreed procedures.
External: proactively act as third-line technical and consulting support for clients, handling escalations and complex issues with them.
Ensure resolution of incidents and requests:
They investigate second-line support calls assigned to them and identify the root cause of incidents and problems. They ensure the efficient and comprehensive resolution of incidents and requests. This could involve ensuring that repairs are carried out by coordinating product requests and liaising with other team members. They will also report and escalate issues to third-party vendors if necessary. They take full ownership of managing the incident to resolution within the service level conditions.
They ensure that assigned infrastructure at the client site is configured, installed, tested and operational. In this regard they will perform necessary checks, apply monitoring tools and respond to alerts. Where software is a component of the solution they will also take responsibility for ensuring that the software is installed and configured according to client requirements.
Identify problems and errors:
The MS Engineer (L3) identifies problems and errors before or as they occur. He or she will log all such incidents in a timely manner, with the required level of detail and all necessary information. They liaise with all stakeholders, including client IT teams, vendors, carriers and colleagues, to expedite diagnosis of errors and problems and to identify a resolution.
When required, they will take responsibility for receiving calls and incidents at the service desk. They assist in analysing, assigning and escalating support calls. They also provide telephone support to clients where required.
They update incidents with progress and resolution details.
Ensure continuous feedback:
They provide continuous feedback to clients and affected parties and update all systems and/or portals as prescribed by company procedures.
They will proactively identify opportunities for work optimisation, including opportunities to automate work.
Incident reduction and avoidance:
The MS Engineer (L3) will routinely identify common incidents and opportunities for avoidance, as well as general opportunities for incident reduction. This could include identifying problematic systems in a client environment or regular times at which certain incidents occur. They will identify potential solutions for reduction or avoidance. The MS Engineer (L3) also flags any repeat incidents or service requests as candidates for automation.
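As an illustration only, the kind of repeat-incident analysis described above could be scripted along the following lines. This is a minimal sketch: the record layout, the sample data, the threshold of three and the function names `repeat_incidents` and `busiest_hour` are all assumptions made for the example, not part of any company tool or procedure.

```python
from collections import Counter
from datetime import datetime

# Hypothetical incident records: (system, category, time opened).
INCIDENTS = [
    ("mail-01", "disk full", datetime(2024, 1, 8, 2, 5)),
    ("web-02", "service down", datetime(2024, 1, 10, 14, 30)),
    ("mail-01", "login failure", datetime(2024, 1, 11, 9, 0)),
    ("mail-01", "disk full", datetime(2024, 1, 15, 2, 10)),
    ("mail-01", "disk full", datetime(2024, 1, 22, 2, 3)),
]


def repeat_incidents(incidents, threshold=3):
    """Return (system, category) pairs seen at least `threshold` times."""
    counts = Counter((system, category) for system, category, _ in incidents)
    return {key: n for key, n in counts.items() if n >= threshold}


def busiest_hour(incidents):
    """Return the hour of day (0-23) with the most incidents, or None."""
    hours = Counter(opened.hour for _, _, opened in incidents)
    return hours.most_common(1)[0][0] if hours else None


if __name__ == "__main__":
    print(repeat_incidents(INCIDENTS))
    print(busiest_hour(INCIDENTS))
```

Pairs returned by `repeat_incidents` would be natural candidates to flag for avoidance or automation, and `busiest_hour` hints at a regular time at which incidents cluster.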
Skills and attributes:
Analyses service and component availability, reliability, maintainability and serviceability. Ensures that services and components meet and continue to meet all of their agreed performance targets and service levels. Provides advice, assistance and leadership associated with the planning, design and improvement of service and component availability, to meet or exceed contracted outcomes for a client.
Service Level Management:
Monitors service delivery against service level agreements and maintains records of relevant information. Analyses service records against agreed service levels to identify actions required to maintain or improve levels of service. Ensures that service delivery meets agreed service levels.
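By way of example, checking service records against agreed service levels might be sketched as below. The priority names, the resolution targets in `SLA_TARGETS`, the record layout and the sample data are assumptions for illustration only; actual targets are set by each client's service level agreement.

```python
from datetime import datetime, timedelta

# Hypothetical resolution targets per priority (assumed values).
SLA_TARGETS = {
    "P1": timedelta(hours=4),
    "P2": timedelta(hours=8),
    "P3": timedelta(hours=24),
}

# Hypothetical service records: (priority, opened, resolved).
RECORDS = [
    ("P1", datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 12, 30)),
    ("P2", datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 18, 0)),
    ("P3", datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 2, 8, 0)),
    ("P1", datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 13, 0)),
]


def sla_breached(priority, opened_at, resolved_at):
    """True if resolution took longer than the target for this priority."""
    return resolved_at - opened_at > SLA_TARGETS[priority]


def compliance_rate(records):
    """Fraction of records resolved within target (1.0 if no records)."""
    if not records:
        return 1.0
    met = sum(1 for record in records if not sla_breached(*record))
    return met / len(records)


if __name__ == "__main__":
    print(f"SLA compliance: {compliance_rate(RECORDS):.0%}")
```

In this sketch a resolution that lands exactly on the target counts as met; whether the boundary is inclusive would depend on the wording of the agreement in question.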
Manages configuration items (CIs) and related information. Investigates and implements tools, techniques and processes for managing CIs and verifies that related information is complete, current and accurate. Maintains secure configuration, applying and maintaining tools, techniques and processes to identify, track, log and maintain accurate, complete and current information.
Monitors service component capacity and initiates actions to resolve any shortfalls according to agreed procedures. Applies techniques to control the demand upon a particular resource or service.
Uses available tools & platforms to investigate and diagnose problems, collect performance statistics and create reports, working with users, other staff and suppliers as appropriate. Drafts and maintains procedures and documentation for managed services. Ensures that knowledge articles are used in incident and problem diagnosis and resolution. Where articles are missing, builds them and disseminates them to junior team members to increase first-call resolution in the Operations Centre. Maintains processes and checks that all requests for service are dealt with according to agreed procedures.
Prioritizes and diagnoses incidents according to agreed procedures. Investigates causes of incidents and seeks resolution. Escalates unresolved incidents. Facilitates recovery following resolution of incidents, through adoption of knowledge articles. Documents and closes resolved incidents according to agreed procedures.
Initiates and monitors actions to investigate and resolve problems in systems, processes and services. Determines problem fixes/remedies. Assists with the implementation of agreed remedies and preventative measures.
Utilizes knowledge management systems as part of all work activities to ensure adherence to agreed procedures. Identifies opportunities for additions, modifications or improvements to knowledge management systems in order to reduce incident resolution time and increase the number of activities that can be completed at L1 and L2. Analyzes knowledge articles, both their own and those created by others, to identify candidates to pass to the automation team.
Maintains an in-depth knowledge of specific specialisms, and provides expert advice regarding their application.
Desired Programming / Scripting Skills:
Python, PHP, XML, REST API Programming/Scripting (or similar programming/scripting languages)