We will review problem resolution techniques and help desk management. The goal of problem resolution management is to reduce the impact and frequency of issues and incidents that cause significant problems.
By reducing problems we will improve the customer experience. To improve problem resolution, we will identify root causes, implement workarounds, and develop best practices.
When dealing with incidents and problems, it is important to know that each calls for different tactics. When dealing with incidents, you want to resolve and mitigate quickly. Incidents are typically high impact, such as security breaches, and may disrupt critical services until resolved.
A range of problems may contribute to incidents. Problems should be reviewed for root cause and solution. A good solution should be implemented to prevent future occurrences. Case details should include the problem, the steps taken, and the quickest resolution.
The following are the problem resolution phases. Phase 1 is problem determination, or root cause. In this phase we look at problem trends and analysis. We talk to teams and determine if there are any hot-button problems or issues. We look for areas where we can improve the overall customer experience. We create a road map to chart improvements. This road map looks at the key services and products we deliver and the associated improvements.
The second phase is problem control. We decide which areas to concentrate on and where to dedicate energy and resources. We determine which key products and services may have problems or issues to address. We determine which potential issues or problems may have the highest customer impact and address them accordingly. We will put together a plan to address known issues, apply fixes, and improve the most plaguing problems.
The third phase is error control and problem resolution. We document fixes so we can quickly fix problems and incidents. This documentation should improve problem resolution time. We also want to determine the best workaround for common problems and incidents. Implementing permanent fixes will reduce the number of incidents and the risk to the organization.
In problem resolution management, we will define the following roles. The first is problem director. This role is responsible for improving problem resolution. Develops metrics for continuous improvement to support budgeting and management approval. Approves and manages projects to improve the environment and lower overall IT costs. Uses project management skills to implement solutions that minimize problems and incidents.
The next role in problem management is problem manager. This role leads the problem resolution team, prioritizes investigations, proposes changes to the director, and tracks trends in the environment. Uses performance metrics tied to remediation of pressing issues and provides coaching to the team. Helps manage and document risk. If problems need to be escalated, this role handles the escalation.
The next role is problem coordinator. This role helps gather and organize documentation and can lead low-impact investigations. Assists in reporting and following up on tasks assigned to teams and experts. This role will also help manage major incidents, handover meetings, the knowledge base, known errors, and workarounds.
Next, we look at how to set up and manage key metrics. The key metric in problem management is improving the customer experience. To improve the customer experience we will work to lower the number of incidents, reduce recurring incidents, reduce cost, and improve efficiency.
A key metric is efficacy: what has problem management done for us lately? Host a resolution meeting after major incidents. For success, we manage key incident events. Emphasize changes that improve the customer experience.
A good questioning method is 5W2H: what are the symptoms, where is the location, when was it reported, how often does it occur, who is affected, and how much impact does it have.
To evaluate success we look at return on investment (ROI). We will look at the benefits delivered to the organization and customer versus overall cost. Some of the costs we will review are time, energy, parts, and labor. Comparing the benefits against these costs gives us the ROI.
To fully appreciate the technical environment we must know all of the key management and executive players. To effectively manage the group, it is important to know what each group brings to the table. We will set goals to ensure a great customer experience.
To ensure a great customer experience, we will focus on improving event management. Event management ensures systems, tools, and functionality are tested and monitored. Any time we have an event that disrupts systems or customer service, the incident team springs into action. The team uses logs, device monitoring, and statistical analysis to pinpoint potential system failures. The team should work proactively, correcting problems before they become bigger issues.
Once the incident is resolved, the problem team will do some investigative work. We work to prevent future occurrences.
When working to improve services we include subject matter experts (SMEs). Knowing which group or team SMEs belong to can be helpful. They can help with fact gathering and determining root causes. They also help implement fixes and workarounds.
To improve customer service and problem resolution, we will incorporate data analysis. We will perform regular tracking and analysis of problems. We will use data to get out in front of issues before they become bigger problems. Some of the key metrics we will look at are customer feedback, system uptime, and quality of system performance.
A good question to ask is, what can we do to improve the user experience? The Pareto principle is a good guide for customer service interaction and problem resolution measures: 80% of issues come from 20% of the operational environment.
To break down that 20% into manageable data, we can ask the following questions. Which products and services are generating the most problematic noise? Which hardware platform is causing the most issues? Which cases or problems represent the most trouble tickets? Which present the highest cost for the organization?
Investigate problems by determining the biggest contributors to users' pain, headaches, cost, and complaints. Some of the tools we will utilize are Excel, Power BI, Tableau, and MATLAB.
One of the major initiatives of problem resolution is to reduce rework. Search for duplicate issues and work towards a permanent solution. Ask team members and engineers what problems they have to continually solve. Preventing and minimizing repeat issues will save time and money.
When data analysis is limited, we can audit bridge calls or resolution/incident wrap-up calls. Take note of good and bad behavior. Observe the flow, cadence, speakers, dead air, and progress over time. We can try graphing the process in Excel as a step chart (time vs process) to visualize the information. After the audit is completed, we will review trends and take the appropriate actions.
We should also improve our problem resolution skill set. Learn a data analysis language like R or Python. Build skill in using SQL to retrieve, organize, and group data. The end goal is to improve our ability to quickly and efficiently solve and close issues.
To improve efficiency we will select high value targets to concentrate on. We want to prove the value of the problem solving method and make team members feel valuable. We will prioritize using a data driven model.
To improve the problem solving process, we will review metric requirements. Typically, major incidents should be investigated within 48 hours. This requirement can result in some bad outcomes. Often the results are a high number of cases, stress, fake root causes, and cases closed but not solved within the 48-hour window.
Problem prioritization varies across business units. There are several variables we should consider. An often overlooked one is the probability of determining root cause, based on the team's skill set and the data available. Another is the probability of funding, scheduling, and completing a project to correct the root cause. Determine if the project is worth undertaking based on cost benefit analysis. Finally, we calculate ROI % = probability of root cause x probability of completion x (benefit - cost) / cost x 100.
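As a minimal sketch of the ROI formula above, reading the benefit term as (benefit - cost) / cost, the Python below assumes probabilities are entered as fractions between 0 and 1 and that benefit and cost use the same currency. The function name `expected_roi_percent` and the sample numbers are illustrative assumptions, not figures from a real project.

```python
def expected_roi_percent(p_root_cause, p_completion, benefit, cost):
    """Probability-weighted ROI, returned as a percentage."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    for p in (p_root_cause, p_completion):
        if not 0 <= p <= 1:
            raise ValueError("probabilities must be between 0 and 1")
    return p_root_cause * p_completion * (benefit - cost) / cost * 100


# Hypothetical project: 80% chance of finding root cause,
# 50% chance of completing the fix, 30,000 benefit, 10,000 cost.
roi = expected_roi_percent(0.8, 0.5, benefit=30_000, cost=10_000)
print(f"Expected ROI: {roi:.1f}%")
```

A negative result means the expected benefit does not cover the cost, which is a signal to deprioritize the project.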
To successfully utilize data, we need to accurately perform cost analysis. Become a subject matter expert or ask someone to break down the steps of the project. Identify the work packages and determine how many work hours each work package will take. We can work with the controller to determine average overhead cost and/or all-in labor cost. If we have to estimate labor cost, we will use $75 / hour. We will multiply labor cost by estimated hours to get a job cost.
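The job cost calculation can be sketched in a few lines of Python. This assumes each work package is recorded as a name with an hour estimate; the package names and hours below are made up for illustration, and the $75 rate is the fallback estimate from the text.

```python
DEFAULT_LABOR_RATE = 75.0  # dollars per hour, the fallback estimate from the text


def job_cost(work_packages, hourly_rate=DEFAULT_LABOR_RATE):
    """Multiply total estimated hours across work packages by the labor rate."""
    total_hours = sum(work_packages.values())
    return total_hours * hourly_rate


# Hypothetical work packages and hour estimates.
packages = {"diagnose": 4, "replace hardware": 6, "test and document": 2}
print(job_cost(packages))
```

If the controller provides an all-in labor rate, pass it as `hourly_rate` instead of relying on the default.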
We will use the value formula and enter all details for the projects considered. We will use Excel to organize and manage the data. Once this is completed, we will sort the results into high value targets. We will use this as a to-do list. Additionally, we will present the top three targets by value and let management prioritize.
The end result will provide high value results to the company that will significantly improve the customer experience. To document the team's efforts, we will write a case study to show the results.
The next step in problem resolution is cause analysis. We can tell a story based on the events that occurred. We will document the facts, determine the chain of events, record variables and circumstances, and make recommendations for improvements. The end result will be to recommend and implement actions that reduce or eliminate the recurrence of issues and problems.
To document and resolve issues, we will create accurate problem statements. We will provide a detailed description of what caused the problem. The description should contain a single main object and the deviation that caused the problem. The details should be focused, factual, and evidence based.
To investigate the problem, we will use the five whys. A five whys chain starts with a question such as why was the system down, and each answer leads to the next why. Answers along the chain might include a power outage, a hard drive failure, a damaged power line, a UPS failure, and data corruption. The five whys help determine why the problem occurred.
We can stop the analysis when the information source can no longer provide reasons for the problem or point of failure. Additionally, we can stop gathering information when we can recommend or execute a solution to prevent recurrence of the problem or point of failure. We don’t always have to uncover every detail that caused the problem to recommend a solution.
To get a complete picture of the problem, we will review contributing factors. These factors provide context for our decision making. They will help reduce the impact of the problem.
When reviewing cause analysis and end results, we will present findings utilizing visual aids. Visual aids such as graphs can assist in comparing post-incident results to standards. One of the goals is to keep stakeholders on track and focused on next steps.
A key aspect of problem resolution is to document the incident, recommend appropriate action, and sell the best solutions to the executive team. We can try a flow chart based method such as KT incident mapping.
To minimize the impact of an incident, we’ll implement workarounds. The focus of a workaround is to get some functionality back for the affected user. At times we will encounter known errors. These issues will be put on the back burner to address later.
When addressing temporary items, it should be noted that these items are often forgotten. The trouble ticketing system should have a record of the problem, reason, type, follow-up, and details of the workaround. Maintenance and review of tickets should ensure temporary workarounds are addressed and solutions found.
The benefits of maintaining a good knowledge base include: known errors and workarounds are found easily, engineers do not waste time working on known issues, and problems can be solved quickly.
Customers can be given a good update on time frame and solution. Automating potential solutions can improve the customer experience.
To improve the level of service, we can look at the following levels. Lowest: reactive, customer initiated, engineers do not have a solution. Good: reactive, customer initiated, engineers have a solution. Great: proactive, automated or manually initiated, customer forewarned about the problem.
To improve customer service, we will prioritize known issue and workaround management. We will address customer concerns and communicate key information to defuse concerns and anger.
By improving known issue and workaround management, we can improve customer satisfaction scores and recommendations based on improved services.
A best practice for knowledge management is to create and enforce policies that ensure facts are entered in the knowledge management system.
Customer support agents should be empathetic, supportive, proactive, and professional.
To improve services we document problem systems, problem triggers, impact, and next steps to prevent recurrence. To improve efficiency, we will link known issues to problem investigations. We will develop a process to implement two-way accounting for issues.
To effectively manage the problem database, we will update case status to closed when the issue has been resolved. Additionally, we will delete known issues when the remediation and resolution have been completed. After closing and deleting known issues, produce a report that shows the current number, category, and state of each error.
To review the status and impact of known errors, we will create an impact statement. We will perform an analysis of the cost of each known error and the benefit derived from solving it.
One of the keys to good problem management is to get the most value out of our remediation efforts. Be aware that some known issues are not worth solving. Realize that not everything in the environment is going to be perfect. If the impact and occurrence rate are low, the business can live with the risk.
To improve service, we will review tools in the trouble ticket knowledge management database that link known errors with resolution documentation.
To continue to move forward, we need to develop permanent solutions to problems. We should not allow temporary solutions to become long term. We can develop a process to track how long temporary fixes have been in place. Some of the key statistics to review are how often the error occurs and how often it causes an issue or disruption to service. If these are high, consider a permanent solution.
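One way to track how long temporary fixes have been in place is sketched below. It assumes each workaround is stored with the date it was applied; the ticket IDs, dates, and the 90-day review threshold are hypothetical choices for illustration, not values from the text.

```python
from datetime import date


def overdue_workarounds(workarounds, today, max_days=90):
    """Return (ticket_id, days_in_place) for fixes older than max_days, oldest first."""
    aged = [(ticket, (today - applied).days) for ticket, applied in workarounds.items()]
    overdue = [(ticket, days) for ticket, days in aged if days > max_days]
    return sorted(overdue, key=lambda pair: pair[1], reverse=True)


# Hypothetical workarounds and the dates they were applied.
workarounds = {
    "PRB-101": date(2024, 1, 10),
    "PRB-102": date(2024, 5, 1),
    "PRB-103": date(2023, 11, 20),
}

for ticket, days in overdue_workarounds(workarounds, today=date(2024, 6, 1)):
    print(f"{ticket}: temporary fix in place for {days} days")
```

Passing `today` explicitly keeps the report reproducible; in daily use it could be `date.today()`.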
To improve services, we will report on status, technical cost, and conversion to permanent solution. These results will be reported to the management team. We as a team need to make time for the remediation process. We need to work on minimizing major incidents and making the environment stable. To help manage this process, we will come up with a prioritized list of issues to work on.
When we prioritize resolution work for workarounds, we will use the following variables: probability of funding, probability of success, cost of the solution (labor, parts, down time), and benefits of the project and solution, such as saving time, saving money, and gaining new customers. Additionally, we should quantify the value of the solution. This will include cost savings in labor and time, improved customer service, and process improvement. After finding a solution to a workaround, we will perform an ROI analysis. The ROI calculation is: probability of funding x probability of completion x (benefits - costs) / costs x 100.
We will strive to improve the customer experience by prioritizing known errors and workarounds by ROI. We work to reduce incidents. The end result of our efforts will be improved customer satisfaction. Additionally, this should lead to increased sales.
When working on problem resolution, we should always clarify the issue to ensure accuracy. We should also verify that the problem actually exists by checking the situation with tools and facts. Additionally, we should ask detailed questions to get additional facts about the problem.
After verifying the facts and situation, we can send out notification about the incident and get the other people and resources needed involved.
When we are investigating issues we must write a problem statement. We will develop a requirement that problem statements are accurate and provide a good description of the problem. A good problem statement can improve average resolve time by 18% to 24%.
A problem statement should consist of two elements: a specific object and a specific deviation. A good approach is to focus on one team and get 100% compliance. To verify compliance, we will audit the problem knowledge database. We will review 3 months of closed tickets to determine if the key elements exist. We will review case titles and determine if each case passes or fails on specific object and deviation.
A pass requires a single specific object and a single deviation. A fail is anything with more than a single specific object and deviation, or any other element. Additionally, we will compare the average time to solve between pass and fail cases. We will update the management team on results, push for a mandatory problem statement requirement, and provide training. The end result will be improved problem resolution times and customer service.
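The audit comparison can be sketched as follows. This assumes a reviewer has already marked each closed case as pass or fail by reading its title, and that time to solve is recorded in hours; the sample cases are invented for illustration only and are not audit results.

```python
from statistics import mean


def audit_summary(cases):
    """Summarize an audit where each case is (passed, hours_to_solve)."""
    if not cases:
        raise ValueError("no cases to audit")
    passed = [hours for ok, hours in cases if ok]
    failed = [hours for ok, hours in cases if not ok]
    return {
        "compliance_pct": 100 * len(passed) / len(cases),
        "avg_hours_pass": mean(passed) if passed else None,
        "avg_hours_fail": mean(failed) if failed else None,
    }


# Hypothetical audit verdicts: (title passed the check, hours to solve).
cases = [(True, 4.0), (True, 6.0), (False, 9.0), (False, 11.0), (True, 5.0)]
print(audit_summary(cases))
```

The gap between the two averages is the number to bring to the management team when pushing for a mandatory problem statement requirement.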
One big decision we are going to make is what data to collect. We will work to get all the pertinent facts. A good tip is to separate fact from opinion.
As we document the process and problems, we will make information available to team members. We will strive to keep all information updated with current status. Our goal is to make all information easy to understand and follow.
The next step in the process is to identify and test possible causes. We will use 5W2H to gather and organize facts. Based on the information we will assemble a team with appropriate skills. Once the team is assembled, we will start generating ideas to solve the problem. The first step is to review the case information. During the information review, we will present slides and graphs to highlight key information. We will encourage team members to suggest ideas for solutions. The suggested ideas should be fully developed with specific details.
The next step is to evaluate the ideas. We will focus on evaluating one idea at a time. We will ask questions and review scenarios and assumptions. Eliminate ideas that do not explain the facts. To get a good correlation between facts and assumed solutions, we can use 8D, A3, or Kepner-Tregoe. We may want to eliminate ideas that do not have supporting facts and assumptions. Document facts that could not be explained and then move on.
- Review risks and benefits: what needs to be improved, and what issues need to be addressed and minimized.
To select the best idea and solution we will utilize Occam’s Razor. The best solution will require the fewest assumptions. We will create a list of actions required to verify the cause of the issue.
The next step in the process is decision making and determining what actions will be taken. Some of the criteria used will be requirements and options. We will develop a framework to work through the decision making process.
First, identify the issue in context. Second, perform a risk benefit analysis. Third, identify and analyze options. Fourth, select a strategy and make a decision. Next, we will implement the strategy. Finally, we will monitor and evaluate results.
- Identify issues and goals of the solution. State and specify the solution.
- Analyze available options, perform research, and identify a top 10 list. Write down all of the options. Document whether these options meet the criteria.
- Select a strategy. Which option gives us the most benefit?
- Implement the strategy. Propose the strategy to stakeholders. Make a recommendation and share documentation and performance. Point out benefits, risks, and values.
- Monitor and evaluate results. Monitor progress during implementation. Communicate important actions to stakeholders.
When completed, evaluate and document lessons learned and communicate how well the team did.
We will strive to improve daily decision making by utilizing the 5 principles: what is the goal of the decision, what are the benefits needed, what options do we need, do options meet our needs, what is our best option.
To improve services we will review risk management. We will implement a corrective and preventative action (CAPA) process. We will implement proactive measures and containment measures, and determine future actions to be taken.
To improve services, some steps to consider are: take preventative action, properly assign action items to team members, implement containment measures, note solutions to difficult problems to improve the solution team, and document the solution and the various damage control measures.
To improve services, we implement corrective action. We determine what actions to implement to correct and prevent the issue. Make sure key actions get scheduled and completed. Report important improvements to management.
We will implement containment measures. Note solutions that can solve severe problems. Implement actions that can reduce overall damage.
The last phase involves correction. What can I do to prevent the problem from occurring again? Recommend corrections and decide on which one to implement. We will schedule all important actions and ensure they get completed.
We will improve change management to ensure corrections get implemented by deadlines.
To improve operations we will manage action items and tasks effectively. Problem tasks are work units that help solve a problem as you work toward a goal. We will keep tasks small and manageable. The task to be completed should be clear and concise. We will describe time, cost, and performance constraints and requirements.
Some of the challenges of problem tracking are: lack of follow-up, a backlog that grows and becomes stale, and trouble tickets and current issues that reduce the focus and time available to solve the issue.
We will strive to improve task completion. We will report on task status and completion percentage. Prioritize, maintain, and reduce outstanding tasks. We will prioritize tasks based on impact to the user. Any repeat problems will be addressed first. We can calculate the cost of allowing tickets to stay open by determining the number of hours spent working on the ticket. We can estimate the number of hours required to close the ticket and then request the needed resources.
To improve service, we will utilize problem clarification tools. We should completely understand the problem, accurately describe the current situation, and work to reduce the recurrence rate. To provide further clarification, we will utilize the 5 whys. This helps us understand the cause and effect correlation. The goal is to thoroughly understand the issue and how to reduce any recurrence. If we don’t have a root cause after reviewing the 5 whys, we will launch an investigation.
One important factor in problem solving is prioritization. We will use prioritization tools to improve the performance of the team. We want to prioritize urgent tasks first and then important items.
To improve decision making, we will utilize the Eisenhower matrix. If the task is important and urgent, I’ll do it. If the item is important but not urgent, I need to make a decision about when to do it. If the task is urgent but not important, I will delegate it. If the item is neither urgent nor important, I’ll delete it.
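The four quadrants can be written as a small lookup. This is only a sketch: the function name and the one-word action labels are our own shorthand for the do, decide, delegate, and delete rules described above.

```python
def eisenhower_action(important, urgent):
    """Map a task's importance and urgency to one of the four quadrant actions."""
    if important and urgent:
        return "do"
    if important:
        return "decide"    # important but not urgent: decide when to do it
    if urgent:
        return "delegate"  # urgent but not important
    return "delete"        # neither urgent nor important


# Hypothetical tasks tagged as (important, urgent).
tasks = {
    "restore payment service": (True, True),
    "plan firmware upgrade": (True, False),
    "answer routine status request": (False, True),
    "reformat old report": (False, False),
}
for name, (important, urgent) in tasks.items():
    print(f"{name}: {eisenhower_action(important, urgent)}")
```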
We can also use a technique called relative prioritization. First we brainstorm a list of tasks. Then we mark the most important with a ten. For the rest, we assign a number relative to that. If a task is half as important as the ten, it gets a 5, and so forth. We will complete tasks in this order.
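A short sketch of relative prioritization, assuming the team has already agreed on the scores. The task names and scores are hypothetical, and the check that the top task scores exactly ten is our own guard to keep the scale anchored.

```python
def order_by_relative_score(scores):
    """Return task names from most to least important on a scale anchored at 10."""
    if not scores:
        return []
    if max(scores.values()) != 10:
        raise ValueError("the most important task should be scored 10")
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical brainstormed tasks with relative scores.
scores = {
    "patch firewall": 10,
    "update runbook": 5,
    "clean old tickets": 2,
    "replace UPS battery": 8,
}
print(order_by_relative_score(scores))
```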
The next technique we’ll utilize is pain value analysis. This approach uses a formula to determine the most important tasks. We will review historical data to determine priority. One way to perform the analysis is to export data to Excel and configure a report. Our pain formula will include down time, users impacted, severity, loss of income, etc.
A formula example: pain = users impacted x outage minutes x cost of down time per minute. This will help us focus on the most critical tasks.
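The example formula can be applied to historical outage data as sketched below. Because the formula multiplies by users impacted, the sketch assumes the cost figure is a per-user, per-minute cost; the outage records are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Outage:
    name: str
    users_impacted: int
    outage_minutes: int
    cost_per_minute: float  # assumed to be cost per user per minute

    @property
    def pain(self):
        # pain = users impacted x outage minutes x cost of down time per minute
        return self.users_impacted * self.outage_minutes * self.cost_per_minute


# Hypothetical historical outages.
outages = [
    Outage("email", users_impacted=200, outage_minutes=30, cost_per_minute=2.0),
    Outage("vpn", users_impacted=50, outage_minutes=120, cost_per_minute=2.5),
    Outage("printer", users_impacted=10, outage_minutes=60, cost_per_minute=1.0),
]

ranked = sorted(outages, key=lambda o: o.pain, reverse=True)
for outage in ranked:
    print(f"{outage.name}: pain = {outage.pain:,.0f}")
```

Other factors from the pain formula, such as severity or loss of income, could be added as extra multipliers or weights once the team agrees on how to score them.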
Another technique we can use to improve problem management is Pareto analysis. The basis is that 80% of problems come from 20% of the underlying issues. To evaluate, we will create a bar chart with metrics on the x axis and contributors on the y axis. Sort the data from largest to smallest. Identify the contributors that account for 80% of the metric and focus on those.
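The sorting and 80% cutoff can be sketched without a chart. This assumes the metric is a simple count per contributor, such as tickets per service; the service names and counts are hypothetical.

```python
def pareto_focus(counts, threshold=0.80):
    """Return the largest contributors that together reach the threshold share."""
    total = sum(counts.values())
    if total == 0:
        return []
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    focus, running = [], 0
    for name, count in ranked:
        if running / total >= threshold:
            break  # the contributors collected so far already cover the threshold
        focus.append(name)
        running += count
    return focus


# Hypothetical ticket counts per service.
tickets = {"VPN": 50, "Email": 30, "Printer": 10, "Badge": 6, "Other": 4}
print(pareto_focus(tickets))
```

The same ranked data can then be exported to Excel or another tool mentioned earlier to draw the bar chart.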
The impact, priority, urgency matrix is widely used. Impact levels: high, medium, low. Urgency: critical, major, medium, low. Priority: 1, 2, 3.
To continue to improve services we will review possible cause identification tools. Possible cause or root cause analysis will help determine what caused the issue. There can be multiple possible causes until the cause is narrowed down. Each of the possible causes must be tested against the facts.
The first step in the process is brainstorming. During the brainstorming session, people contribute ideas. These ideas are weighted equally. We will focus on generating as many ideas as possible without passing judgement. We will encourage the team to build on ideas that are recommended.
A technique that can help is Kepner-Tregoe distinctions and changes. We will focus on the facts of what is and what is not. What is different or unique about the is situation vs the is-not situation? We will list what is unique and the various distinctions. Each distinction is checked for changes. These changes become potential causes of issues and problems.
A technique that we will utilize is Fishbone / Ishikawa. This technique brainstorms by categories. Categories can include machines, methods, materials, environment, measurement, and personnel.
List measurements, materials, methods, and processes to help determine problem statements. Each section has possible causes.
Firewall problem determination:
Measurements: log incorrect, monitoring failure.
Environment: load not balanced, traffic too intense, traffic type too heavy.
Materials: LAN cable and port defective, board defective.
People: more users on system, uncontrolled changes, vendor, poor training.
Methods and processes: config has reset, config not optimal, installation incorrect.
Machines and equipment: hardware failure, poor maintenance, power outage.
Another technique is six hat thinking. We will look at the situation from different perspectives. Each hat is used at a different stage to discuss the issues. The hat colors are: Blue (big picture, process, and planning), Red (emotions, reactions, and feelings), Yellow (optimistic and positive effects), White (neutral information and data), Black (critique, risk, and challenges), and Green (creative, alternatives, and solutions).
Cause mapping and avoidance: visually understanding how issues occurred helps prevent recurrence. We can utilize tools to help map out problem stories and to organize and investigate. This will help determine actions that can be implemented to alleviate and rectify the situation.
Another technique is Apollo. We will visually map out causes, effects, and known relationships. We will focus on implementing effective solutions that prevent recurrence, do not cause secondary issues, and meet organizational goals. Solutions are designed to break the causal chain and prevent the root cause from recurring.
Another technique is TapRooT. We visually map out cause and effect factors. We will document the circumstances behind the incident. We will develop a list of corrective actions. This technique is useful for accident and quality investigations.
Failure mode and effects analysis (FMEA): this technique works to address potential failures. A failure mode is any way an error or defect could affect the customer. It is used to prevent failures by looking at any way things could fail and the consequences.
During analysis, we document the consequences of each failure. High-risk, high-consequence failures are prioritized and addressed.
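A simplified way to rank failure modes is sketched below. It scores each failure by likelihood times consequence on a 1 to 3 scale, which is our own simplification for illustration; formal FMEA worksheets commonly rate severity, occurrence, and detection separately. The failure modes, ratings, and the cutoff of 6 are hypothetical.

```python
def prioritize_failures(failure_modes, threshold=6):
    """Score each failure as likelihood x consequence and keep the high scorers."""
    scored = [
        (fm["name"], fm["likelihood"] * fm["consequence"]) for fm in failure_modes
    ]
    high = [item for item in scored if item[1] >= threshold]
    return sorted(high, key=lambda item: item[1], reverse=True)


# Hypothetical firewall failure modes rated 1 (low) to 3 (high).
failure_modes = [
    {"name": "Hardware failure", "likelihood": 1, "consequence": 3},
    {"name": "Config has reset", "likelihood": 2, "consequence": 3},
    {"name": "Load not balanced", "likelihood": 3, "consequence": 3},
    {"name": "Log incorrect", "likelihood": 2, "consequence": 1},
]
print(prioritize_failures(failure_modes))
```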
The last step involves investigation frameworks. A framework provides an end-to-end process to accomplish a goal. Each framework has its own strengths and weaknesses. A good understanding of frameworks can assist you in solving difficult problems and reaching goals.
The A3 framework is considered a continuous improvement process. Plan: identify an opportunity for improvement or change. Do: test the change and conduct a study. Check: review the test, analyze results, and note what was learned. Act: take action based on what was learned.
Another technique is Kepner-Tregoe, a systematic technique and process to think critically. Situation appraisal: prioritize and manage teams. Problem analysis: gather data and determine root cause. Decision analysis: make a formal decision or recommendation. Potential problem analysis: avoid risk.
Another technique is 8D, which is used in the automotive industry. It is a 9-step methodology to identify, correct, and eliminate recurring problems. We should include steps to create core teams and congratulate teams for a job well done.
The last technique is Six Sigma. This is a continuous improvement process. The DMAIC framework is define, measure, analyze, improve, and control.
There are some additional ways to improve skills. We can improve by studying process improvement. Some areas to study are Six Sigma, project management, data analysis, SQL scripting, PowerShell, and Python. Improve critical thinking skills with Kepner-Tregoe. Attend annual conferences such as ServiceNow. Developing and updating skill sets is a continuous improvement process.