What Is Fault Tree Analysis (FTA)?

Find out what a fault tree analysis (FTA) is, how to use one, and why conducting one can improve your facility’s risk management efforts.
Fault tree analysis (FTA) is a systematic graphical risk assessment method for identifying the root causes of operations and maintenance (O&M) issues. It starts with a general problem and works backward to figure out what might have caused it. Fault tree diagrams are used to show how different parts of a system can lead to a failure.
This method of working backward from a problem is called the “top-down approach.” In a fault tree structure, system failures are called “top events,” and they trace down to underlying component failures, referred to as “basic events.”
Fault tree analysis is helpful in many industries because it makes it easier to understand how small issues may affect a whole system. Having clear insights into how individual components impact failure probabilities allows facility managers to strategically plan their future preventative maintenance efforts.
Purpose of Fault Tree Analysis
The purpose of FTA is to identify the cause(s) of a system failure and mitigate the risks before the failure occurs. FTA helps determine what factors contribute to an event and how likely it is to occur. This proactive strategy enhances long-term system reliability and reduces overall failure rates.
Components of Fault Tree Analysis
Fault Tree Diagram
A fault tree diagram maps out the sequence of events leading to a failure. These diagrams provide a simplified, visual representation of how various contributing factors ultimately lead to equipment failure events. A clear representation of these relationships helps decision makers choose where to focus maintenance efforts.
Events and Event Symbols
Events represent specific system or process failures. Each event is categorized to clarify its role in the larger failure, and fault tree diagrams represent these categories using fault tree analysis symbols.
- Basic Events: Represent a primary event that cannot be broken down further
- Intermediate Events: Combine multiple events leading to a larger failure
- Undeveloped Events: Represent a failure that has not been analyzed in detail
- Conditional Events: Indicate specific conditions or restrictions affecting the system
- House Events: Represent events that are either always true or always false at your facility
- Input Events: External factors or triggers that initiate or alter the fault tree analysis pathway
- Output Events: The final outcomes or consequences of a fault tree analysis pathway, whether desired or not
Logic Gates and Gate Symbols
Logic gates link events to represent their interdependence. The goal is to simplify complex systems and make it easier to understand how components interact with each other to cause a failure.
- “And” gates show scenarios where multiple events must occur for the failure to happen
- “Or” gates indicate that any of the connected events could lead to a failure
- “Not” gates invert an event, showing scenarios where a failure becomes possible because an event does not occur
Logic gate symbols are graphical representations used on a fault tree diagram to visually depict the function of logic gates. These symbols are standardized shapes that simplify the process of understanding and analyzing the relationships between events.
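The gate behavior described above can be sketched in a few lines of Python. This is only an illustration: the function names and the cooling-system scenario are hypothetical and do not come from any particular FTA tool.

```python
from typing import Iterable


def and_gate(inputs: Iterable[bool]) -> bool:
    """Output event occurs only if every input event occurs."""
    return all(inputs)


def or_gate(inputs: Iterable[bool]) -> bool:
    """Output event occurs if at least one input event occurs."""
    return any(inputs)


def not_gate(event_occurred: bool) -> bool:
    """Output event occurs only if the input event does not occur."""
    return not event_occurred


def cooling_failure(primary_pump_failed: bool,
                    backup_pump_failed: bool,
                    power_lost: bool) -> bool:
    # Hypothetical scenario: cooling is lost if both pumps fail,
    # or if power is lost.
    both_pumps_failed = and_gate([primary_pump_failed, backup_pump_failed])
    return or_gate([both_pumps_failed, power_lost])
```

In this sketch, a single pump failure does not trigger the top event because the two pumps sit under an “And” gate, while a power loss triggers it on its own through the “Or” gate.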
Steps in Implementing Fault Tree Analysis
1. Define the Top Event
The first step in FTA is to clearly define the undesired event, known as the “top event.” This event represents the specific failure mode or undesirable outcome that you want to analyze. Clearly defining your undesired top event keeps all subsequent steps focused on a specific issue. It’s best to focus on one event at a time for a simpler fault tree analysis process.
2. Understand the System
Once the top event is defined, the next step is to gain a comprehensive understanding of the system being analyzed. Engage with experts, review system documentation, and map out dependencies to learn as much as you can about the target system.
3. Construct the Fault Tree Diagram
With a clear understanding of the system and the top event, the next step is to construct the fault tree diagram. This process involves identifying causes, using logic gates, and continuing the breakdown until reaching basic events.
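One way to picture this breakdown is as a small tree data structure. The sketch below assumes a hypothetical `Event` class and an invented shutdown scenario. It only shows how a top event, gates, and basic events might be recorded once they have been identified.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Event:
    name: str
    gate: str = ""  # "AND" or "OR" for gated events, "" for a basic event
    children: List["Event"] = field(default_factory=list)

    def is_basic(self) -> bool:
        # In this sketch, an event with no children is treated as basic.
        return not self.children


def basic_events(event: Event) -> List[str]:
    """Walk the tree and list each basic event once, in first-seen order."""
    if event.is_basic():
        return [event.name]
    names: List[str] = []
    for child in event.children:
        for name in basic_events(child):
            if name not in names:
                names.append(name)
    return names


# Hypothetical fault tree for illustration only.
top = Event("System shutdown", "OR", [
    Event("Loss of cooling", "AND", [
        Event("Primary pump fails"),
        Event("Backup pump fails"),
    ]),
    Event("Power loss"),
])

print(basic_events(top))
```

The breakdown stops wherever an event has no children, which mirrors the rule of continuing until only basic events remain.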
4. Analyze the Fault Tree
Examine the fault tree to evaluate the probability of the top event occurring. Use quantitative analysis methods to assign probabilities to events and qualitative analysis methods to identify critical paths that may lead to undesired events.
It’s also best practice to identify minimal cut sets, which pinpoint the combinations of failures that lead to the top event. A cut set is a group of basic events that causes the top event when all of them occur. A cut set is minimal when removing any single event from it means the top event no longer occurs. Focusing on minimal cut sets shows the smallest combinations of failures that are enough to cause the top event, and a minimal cut set containing just one event reveals a single point of failure.
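The idea of a minimal cut set can be sketched for a small tree that uses only “And” and “Or” gates. The nested-tuple representation, the function names, and the example events below are assumptions made for illustration, and “Not” gates are deliberately left out.

```python
from itertools import product


def cut_sets(node):
    """Return the cut sets of a node.

    A node is either a basic-event name (str) or a tuple
    (gate, [children]) where gate is "AND" or "OR".
    """
    if isinstance(node, str):
        return [frozenset([node])]
    gate, children = node
    child_sets = [cut_sets(child) for child in children]
    if gate == "OR":
        # Any child's cut set is enough to cause the output event.
        return [cs for sets in child_sets for cs in sets]
    if gate == "AND":
        # One cut set from every child must occur together.
        return [frozenset().union(*combo) for combo in product(*child_sets)]
    raise ValueError(f"unsupported gate: {gate}")


def minimal_cut_sets(node):
    """Drop every cut set that strictly contains another cut set."""
    unique = set(cut_sets(node))
    minimal = [s for s in unique if not any(other < s for other in unique)]
    return sorted(minimal, key=lambda s: (len(s), sorted(s)))


# Hypothetical tree: shutdown if both pumps fail, or power is lost,
# or power is lost together with pump_a failing.
tree = ("OR", [
    ("AND", ["pump_a", "pump_b"]),
    "power_loss",
    ("AND", ["power_loss", "pump_a"]),
])

result = minimal_cut_sets(tree)
```

Here the combination of `power_loss` and `pump_a` is discarded because `power_loss` alone is already a cut set, leaving one single-event cut set and one two-event cut set.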
5. Mitigate Risks
Use the analysis results to design strategies that reduce the likelihood of the top event. For example, if the analysis shows that a water pump failure often causes a system shutdown, replacing the pump or fixing an underlying problem such as overheating can prevent future shutdowns.
Practical Applications of Fault Tree Analysis
Identify System Failures
Fault trees are used to diagnose failures like assembly line stoppages, electrical failure, or equipment malfunctions. By identifying these specific failures, organizations can focus on targeted interventions to minimize downtime and improve operational efficiency.
Break Down Complex Systems
A fault tree analysis involves the creation of a fault tree diagram. This diagram helps both managers and maintenance teams understand complex systems, which leads to better risk management and better-coordinated preventative efforts.
Analyze Root Causes
Your facility can use FTA to uncover the root causes of recurring system failures. Knowing the root cause of the failure means that your team will be better equipped to remediate the issue in a way that reduces the chances of the same system failure occurring again.
Quantify Risks
A fault tree may be used to estimate the likelihood of a system failure by assigning a probability to each contributing event. These probabilities are then combined through the logic gates to calculate the overall probability of the undesired event.
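As a rough sketch of that calculation, the example below assumes the basic events are independent and that no basic event appears more than once in the tree. Under those assumptions, an “And” gate multiplies its input probabilities, and an “Or” gate is one minus the product of the probabilities that each input does not occur. The probability values are invented for illustration.

```python
import math


def and_probability(probs):
    """P(all inputs occur), assuming independent events."""
    return math.prod(probs)


def or_probability(probs):
    """P(at least one input occurs), assuming independent events."""
    return 1.0 - math.prod(1.0 - p for p in probs)


# Hypothetical probabilities for one operating period.
p_primary_pump = 0.05
p_backup_pump = 0.10
p_power_loss = 0.01

# Cooling is lost only if both pumps fail ("And" gate).
p_cooling_loss = and_probability([p_primary_pump, p_backup_pump])

# Shutdown occurs if cooling is lost or power is lost ("Or" gate).
p_shutdown = or_probability([p_cooling_loss, p_power_loss])

print(f"Loss of cooling: {p_cooling_loss:.5f}")
print(f"System shutdown: {p_shutdown:.5f}")
```

With these made-up inputs, the redundant pumps contribute about 0.005 to the top event, while the single power-loss event contributes 0.01, which suggests power supply would be the first place to look for improvements in this hypothetical system.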