Analysis of High Availability Firewalls versus Firewall Sandwich Configurations

Abstract

Firewalls are increasingly being used to protect mission critical systems that require unscheduled outages to be limited to less than six minutes per year. One solution has been to implement a “Firewall Sandwich,” which consists of redundant firewalls placed between two layers of redundant Firewall Load Balancers (FLB). The firewall sandwich uses six or more nodes configured in three layers to achieve High Availability (99.999%). This method requires a significant increase in costs and complexity. The High Availability Redundant PORTUS Systems or “HARPS” achieves High Availability using fewer systems at a fraction of the cost and complexity. A single layer of load balancing fault tolerant firewalls reduces unscheduled outages to a couple of seconds per year. HARPS provides higher availability, lower costs and easier management relative to firewall sandwiches.

Introduction

Firewalls are increasingly being used to protect applications of mission critical importance. This means firewalls need to provide the predictable high levels of security, reliability and survivability required for financial organizations, safety-critical domains such as aviation, healthcare, infrastructure, first-responders and national defense. Applications in these environments require High Availability which is defined as 99.999% availability. High Availability is difficult to achieve as it limits unscheduled outages to less than six minutes per year. Today most servers are down an average of more than 10 hours per year, so a High Availability solution requires a 100-fold reduction in the average downtime.

There are two distinct methods used to achieve High Availability firewall solutions. One method is called a “Firewall Sandwich” which consists of redundant firewalls placed between two layers of redundant Firewall Load Balancers (FLB) that operate as TCP/IP layer 4-7 switches. This method uses six or more nodes configured in three layers to achieve high availability. The other method called a High Availability Redundant PORTUS System “HARPS” uses a cluster of load balancing fault tolerant high availability firewalls. Fault tolerant firewalls are designed to detect, isolate and recover from both hardware and software failures. This method integrates the functions found in the three layer sandwich into a single layer. This reduces complexity, costs and downtime relative to the firewall sandwich.

HARPS uses fault tolerant hardware and software that can detect and recover from component failures without loss of service. This requires hardware redundancy including processors, memory, disk drives, controllers, communication adapters and more. However, hardware redundancy by itself does not make a fault tolerant system. The software has to be capable of detecting failed hardware components and dynamically adjusting its configuration to avoid a lost transaction or system outage. HARPS offers significant enhancements in hardware fault tolerance. In the event of a processor failure, dynamic processor deallocation allows a system to continue running without losing a transaction. Advanced memory error correction increases memory reliability more than 100 times over ECC memory, virtually eliminating system failures. Dynamic Ethernet reassignment prevents loss of communications due to a NIC failure.

HARPS achieves high availability using two or more fault tolerant nodes configured in a single layer. Both methods employ load balancing to distribute the workload across multiple firewall systems. Both approaches are capable of delivering high performance high availability solutions. However, there are significant differences in the number of required systems and levels of complexity required to manage them.

The following sections will show how to calculate the availability of each configuration based on the reliability and availability of the components. A cost analysis will also be conducted.

Firewall Sandwich Configuration

The firewall sandwich serves two purposes. It eliminates the single point of failure of a single firewall, and it balances the workload across two or more firewalls, eliminating potential bottlenecks. Firewall Load Balancers (FLBs) are configured on either side of the firewalls. The FLBs on both sides of the firewall layer ensure that connection-oriented TCP/IP traffic passes through the same firewall in both directions. This is required for correct operation of stateful packet filter firewalls, application proxy firewalls and the more advanced application level protection systems sometimes called Intrusion Prevention Systems. The configuration is symmetrical, as TCP/IP connections can originate and terminate on either side of the firewalls.

The firewall sandwich eliminates the firewall as a single point of failure, but introduces two new single points of failure: the FLB systems themselves. Simply adding an FLB on either side of the redundant firewalls will actually reduce network availability. To overcome this problem one has to configure redundant FLBs on either side of the firewall layer. So now we have gone from a single firewall system to a Firewall Sandwich with a total of six or more systems.

Figure 1 shows a firewall sandwich with a single FLB on either side of a dual redundant firewall system.
Figure 2 shows a firewall sandwich with redundant FLBs on either side of a dual redundant firewall system.

Figure 1: Simple Firewall Sandwich
Figure 2: Dual Redundant Firewall Sandwich


Fault Tolerant High Availability Firewall Configuration

Most discussions regarding the construction of high availability firewall configurations have dealt with the use of firewall sandwiches as shown in the previous section. Sandwich configurations achieve High Availability using standard (non fault tolerant) firewall systems through the use of FLBs that bypass failed firewalls. HARPS uses two or more systems running fault tolerant hardware and software with integrated workload balancers. This approach achieves high levels of redundancy using a single layer configuration rather than the triple layer sandwich. The significant advantages of this approach include (1) higher availability, (2) simpler installation and configuration, (3) simpler management, (4) lower initial cost and (5) lower operating costs.

Figure 3: Redundant Fault Tolerant Firewall Configuration

HARPS uses fault tolerant software running on a fault tolerant hardware platform.

Fault-tolerant Software

PORTUS is unique in that it was designed from the ground up as a high availability Network Intrusion Prevention System (NIPS). Since all hardware and software is subject to failure the key to High Availability is in the architecture of the product. Hardware and software redundancy is an essential but insufficient condition for High Availability. The design must do more than eliminate single points of failure. The software must provide real-time detection, isolation and recovery from errors. In addition it must provide immunity from hostile attacks to prevent catastrophic system wide failures.

Error Isolation: PORTUS was designed to isolate the effects of both hardware and software errors to prevent error propagation. With PORTUS each transaction is handled by a separate process and there is strict separation between processes. This means a failure handling one transaction is isolated and cannot propagate to other transactions. In contrast, all stateful packet filter (SPF) firewalls and intrusion detection systems (IDS) run as part of the kernel, permitting errors to propagate. These errors can cause catastrophic system failure or, worse yet, allow the SPF or IDS to fail wide open, permitting malicious traffic to pass undetected.

Immunity: PORTUS transaction processes run at the application level in a chrooted directory. This means the process handling the transaction is unable to access or modify any system files. As a result, a skilled hacker cannot exploit any possible errors in PORTUS to access or modify the programs or configuration files on the system. This makes PORTUS immune to attack even if there are errors in the code.

Granular scalable redundancy provides multiple copies of selected software components that can be non disruptively replaced if damaged. Regeneration techniques recognize damage and automatically recover operational capability. These features allow it to survive massive and extremely hostile attacks.

Cognitive immunity: PORTUS can recognize attacks that have never been seen before by detecting protocol anomalies designed to attack systems. This makes PORTUS, as well as systems it is protecting, immune to new forms of attack. Protocol anomaly detection will also block attacks which remain to be invented.

Triple Level Software Recovery provides for nondisruptive service in the event of a software failure. The first level of recovery is provided by the child process. It performs extensive error checking and reporting. Most errors are caused by external data errors, which are usually classified as application protocol anomalies. Protocol anomalies are blocked and prevented from impacting PORTUS or any system it is protecting. Software induced errors are documented using first failure capture techniques so that software errors can be quickly diagnosed and corrected. In most cases a graceful recovery can be performed. Error isolation, previously discussed, limits the scope of any error to a single transaction.

The second level of error recovery is performed by the parent process, which spawns child processes and monitors their behavior. If a child process terminates for any reason, the parent process performs the second level of error recovery by regenerating it. Process regeneration, similar to biological response strategies, prevents process depletion and loss of function.

At the third level, specialized system monitors can detect, report and recover from lower level failures. The process monitor checks to make certain the number of processes for each application falls within the normal boundaries defined for the system. If the parent process is not spawning children, the number of child processes will drop. The process monitor issues an alert and tries to correct the problem. This level allows graceful recovery if a parent process (second level) should cease to function. A PORTUS restart command automatically restarts a parent process without disrupting existing child processes. The same procedure is used to dynamically invoke new rules and configuration changes in a non-disruptive fashion. All three levels are available on a single PORTUS system.

A fourth level of error recovery is present when two or more PORTUS systems are configured in a HARPS. At this level the two systems monitor each other for proper operation, and each performs automatic recovery should its sibling system fail.

Process self-regeneration is also used to prevent failures caused by slow system resource depletion. Child processes are automatically retired after one thousand transactions. When a process is retired, all resources it acquired during its lifetime are returned to the system. Thus even if there were a memory leak in the code, all the memory would be returned to the system. When a child process is retired, the parent will automatically regenerate a new child process using the same mechanism that was used to recover from an error. This allows PORTUS to run for years without rebooting.
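The retirement policy described above can be modeled in a few lines. The following is a minimal in-process simulation, not PORTUS code: the `Parent` and `Child` classes, the slot-based dispatch and the pool size of four are hypothetical names and values chosen for illustration, and only the limit of one thousand transactions comes from the text.

```python
class Child:
    """Stands in for one child process; tracks how many transactions it served."""

    def __init__(self, child_id):
        self.child_id = child_id
        self.handled = 0


class Parent:
    """Keeps a fixed-size pool of children and replaces each one when it retires."""

    def __init__(self, pool_size, max_transactions=1000):
        self.max_transactions = max_transactions  # retirement limit from the text
        self.next_id = 0
        self.regenerated = 0
        self.children = [self._spawn() for _ in range(pool_size)]

    def _spawn(self):
        child = Child(self.next_id)
        self.next_id += 1
        return child

    def handle(self, slot):
        """Serve one transaction in the given slot, retiring the child if due."""
        child = self.children[slot]
        child.handled += 1
        if child.handled >= self.max_transactions:
            # Retiring the child drops everything it accumulated; a fresh
            # child takes its place, so the pool never shrinks.
            self.children[slot] = self._spawn()
            self.regenerated += 1


parent = Parent(pool_size=4)  # pool size is an assumption
for n in range(10_000):
    parent.handle(n % 4)

print(parent.regenerated, [c.handled for c in parent.children])
```

Because a retired child is discarded whole, anything it leaked goes with it, which is the property the paragraph relies on.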

Automated System Management routines keep the system running without operator intervention. Specialized monitors alert the system administrators of pending problems allowing preventive action to be taken before the problem causes a disruption in service.

PORTUS automatically rotates logs, then compresses and archives them onto another system. This prevents the file systems from filling up. As a precaution, there is a disk monitor that monitors disk space utilization. When a file system passes one of four configurable thresholds, a notice is automatically sent to the administrator. As each utilization threshold is passed, a notice of higher severity is sent. If a file system reaches 99% utilization, the proxies will automatically be throttled so that no new transactions are served, as the system would not be able to log all the transactions.
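The escalation logic of such a disk monitor can be sketched as follows. This is an illustrative sketch only: the four threshold percentages and the severity labels are assumptions (the text says only that there are four configurable thresholds), while the 99% throttle point is taken from the paragraph above.

```python
# Hypothetical threshold values and labels; the text only says there are four.
THRESHOLDS = [(70.0, "notice"), (80.0, "warning"), (90.0, "error"), (95.0, "critical")]
THROTTLE_AT = 99.0  # utilization at which new transactions are refused (from the text)


def check_filesystem(percent_used):
    """Return (severity, throttle) for a file system at the given utilization."""
    severity = None
    for limit, label in THRESHOLDS:
        if percent_used >= limit:
            severity = label  # keep the highest threshold that has been passed
    throttle = percent_used >= THROTTLE_AT
    return severity, throttle


for used in (50.0, 85.0, 99.5):
    print(used, check_filesystem(used))
```

On a live system the utilization figure could be derived from the standard library call `shutil.disk_usage(path)`, which returns total, used and free bytes.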

Automated garbage collection routines remove expired files preventing loss of disk space.

The Process manager automatically scans the process table to ensure the processes that should be running are, and ones that should not are not. Alerts are generated when there are either too few or too many processes running of a specific type. This enables preventive administrator action to be taken before a disruption occurs. The parent processes automatically ensure that there are sufficient child processes to handle the work load. However, in the unlikely case the parent process should die the process monitor will alert the administrator enabling them to recover the parent before a disruption occurs.
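The bounds check performed by a process monitor of this kind might look like the sketch below. The process names, the bounds and the alert format are hypothetical; the sketch only shows the "too few or too many" comparison described above.

```python
def audit_process_counts(running, bounds):
    """Compare running process counts with the allowed range for each type."""
    alerts = []
    for name, (minimum, maximum) in bounds.items():
        count = running.get(name, 0)  # a missing type counts as zero processes
        if count < minimum:
            alerts.append((name, "too few", count))
        elif count > maximum:
            alerts.append((name, "too many", count))
    return alerts


# Hypothetical process types and limits, for illustration only.
bounds = {"http_proxy": (4, 64), "smtp_proxy": (2, 16)}
running = {"http_proxy": 1, "smtp_proxy": 5}

alerts = audit_process_counts(running, bounds)
print(alerts)
```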

Automated procedures determine if a process has gone into a loop and automatically terminate the process, allowing other work to run without degradation.

Some models of the PORTUS appliance record all hardware and software errors in a system error log. Error reports are automatically generated that inform the systems administrator of temporary error conditions that can be corrected before they become permanent errors.

As a result, PORTUS is far more robust and resistant to hardware and software failures and hostile attacks than any of its competitors. Its self healing architecture permits it to recover from errors that would cause lesser systems to fail.

Fault Tolerant Hardware

PORTUS High Availability systems use multiple fault tolerant hardware features to provide granular scalable hardware redundancy. Multiple levels of redundancy provide nondisruptive operations in spite of one or more component failures. Standard features include redundant hot swap power supplies, redundant hot-swap cooling fans and disk drives. The system components most likely to fail can be swapped out and be replaced without disrupting the system. Some models offer fault tolerance that far exceeds the standard features found on competitive product offerings. The extended availability features include the following.

Dynamic processor deallocation varies a failing processor offline (SMP configurations only), moves the transaction to a working processor, marks the failed processor as inactive and issues an alert. This prevents lost transactions and system crashes as a result of a processor failure. Other systems, including SMPs, will crash when a processor fails.

Chipkill memory can detect and correct multi-bit errors in a single byte, while ECC memory can only correct single bit errors within a byte. A system with multiple gigabytes of ECC memory is as likely to suffer a memory failure as a system with 64 MB of non-ECC memory. Since Chipkill memory is 100 times more reliable than ECC memory it virtually eliminates system crashes due to memory failure.

Dynamic network adapter reconfiguration automatically enables a spare adapter to take over from a failed adapter without disrupting service.

PORTUS was the first firewall/NIPS to share its workload across multiple systems. In addition to providing higher throughput, the system provides higher availability. When two PORTUS systems are configured with the high availability option, each one monitors its sibling. When one fails, the other dynamically assumes the failed system's workload.

First Failure Capture

The hardware and operating system provide first failure capture, so that the component that caused a failure, be it hardware or software, is logged for later review and analysis. The data is logged in nonvolatile storage. This process is so quick that if someone should pull the power plugs, a message will be logged indicating that system power was cut off. Both hardware and software failures are recorded in the log. Often one or more soft failures precede a hard failure, so error reporting enables preventive action to be taken before service is disrupted. The error report is often the first indication that there was an error, since most errors are recoverable and go unnoticed.

High Availability using Redundant Systems

The HARPS option consists of multiple PORTUS Network Intrusion Prevention Systems configured in pairs. Multiple pairs of systems can be added to the High Availability configuration, allowing for nondisruptive growth in capacity. With configured pairs, both systems share the workload while each monitors its sibling for errors. The PORTUS systems have similar configurations, and both are connected to the same networks. This allows each PORTUS system to automatically test its sibling's connectivity using every NIC. If system A detects that system B is not functioning (frozen, crashed, or with one or more inactive NICs), then system A can automatically take over all the traffic of system B.
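The takeover decision can be expressed as a small pure function. The sketch below is a hypothetical model, not the PORTUS mechanism: the function names, the heartbeat flag and the per-NIC result dictionary are assumptions used to illustrate "take over if the sibling is frozen, crashed, or has any inactive NIC".

```python
def sibling_is_healthy(heartbeat_ok, nic_reachable):
    """A sibling is healthy only if it responds and every one of its NICs answers."""
    return heartbeat_ok and all(nic_reachable.values())


def plan_traffic(local_name, sibling_name, heartbeat_ok, nic_reachable):
    """Map each node's share of the traffic to the node that should carry it."""
    if sibling_is_healthy(heartbeat_ok, nic_reachable):
        # Normal operation: both nodes carry their own share of the load.
        return {local_name: local_name, sibling_name: sibling_name}
    # Sibling is frozen, crashed, or has an inactive NIC: take over its traffic.
    return {local_name: local_name, sibling_name: local_name}


print(plan_traffic("A", "B", True, {"eth0": True, "eth1": True}))
print(plan_traffic("A", "B", True, {"eth0": True, "eth1": False}))
```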

Firewall Availability Models

There are multiple methods that can be used to calculate system availability and unavailability. One method uses a Generalized Stochastic Petri Net (GSPN) to describe the system being evaluated. First a GSPN diagram is drawn to serve as a guide to writing the control statements for the Stochastic Petri Net Package (SPNP). The control statements enable the SPNP to calculate the unavailability of the system. This approach can be used to analyze the availability of complex systems with multiple levels of redundancy. Another, simpler approach involves drawing availability diagrams. Once the diagrams are drawn, analytical methods can be used to calculate both availability and unavailability. If one is careful, the quicker analytical method will generate availability results that agree with the more sophisticated GSPN approach to at least six decimal places. We will use the second method for our analysis.

Assumptions

The availability analysis is based on a series of assumptions regarding the nature of hardware and software failures.

1. FLB and FW node failures will generate conditions that are immediately detectable by the other nodes (FLB and FW), generating automated responses. These conditions are recognizable either by an absence of expected responses to an automated query or by an error signal from a failing node.

2. The majority of node failures are soft failures that can be corrected by a reboot. Many of these failures are software failures or transient hardware failures that are often blamed on the software.

3. The vast majority of software found in systems has limited error detection, isolation and correction, which is what causes most of the reboots.

4. Hard failures are not correctable by rebooting. Hard failures require repair or replacement of the failing node or critical components of the failing node.

5. Node failures are most often mutually independent. A hardware failure in one node normally does not induce a hardware failure in another node. Most software errors on one system will not propagate to another. However, there are some software errors that can cause node failures on multiple nodes of similar type and function. For example, a buffer overrun that can be exploited by an attacker to disable one FW might also disable the others.

6. Node failure rates, repair rates and reboot rates are exponentially distributed.

A High Availability system is defined to have 99.999% availability (A), which is expressed as 0.99999, a number that is very close to 1. Its unavailability (U) is calculated by subtracting A from one.

U = 1-A.

So a system with five nines availability has its unavailability calculated as follows.

U = 1-0.99999 = 0.00001

A year has an average of 8765.52 hours. So a system with five nines availability will average 0.08766 hours or 5.26 minutes of unscheduled down time per year.
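The conversion from availability to annual downtime is a one-line calculation. The following sketch uses the 8765.52 hour year from the text; the function name is an assumption introduced for illustration.

```python
HOURS_PER_YEAR = 8765.52  # average year length used in the text


def annual_downtime_hours(availability):
    """Expected unscheduled downtime per year for a given availability."""
    unavailability = 1.0 - availability  # U = 1 - A
    return HOURS_PER_YEAR * unavailability


for a in (0.999, 0.9999, 0.99999):
    hours = annual_downtime_hours(a)
    print(f"A = {a}: {hours:.5f} hours/year = {hours * 60:.2f} minutes/year")
```

Each additional nine cuts the allowed downtime by a factor of ten, which is why five nines leaves only a little over five minutes per year.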

Component Reliability

The values used to describe node reliability are Mean Time Between Failure (MTBF) and Mean Time To Repair (MTTR). To differentiate between hard and soft failures we will subscript the symbols. The node failure rates shown below are not vendor specific but represent reasonable values for computers built with better than average components. The systems considered are server level systems with redundant hot swap power supplies, cooling fans and disk drives, with the drives configured as RAID 1 (mirrored) or RAID 5. Without this level of hardware redundancy the MTBF will be shorter than shown below.

MTBFh is the Mean Time Between hard Failures = 20,000 hours or about 2.28 years.

MTTRh is the Mean Time To Repair a hard failure = 24 hours. This includes the time to do initial problem determination, make a request for on-site hardware repair, wait for the technician's arrival, run the diagnostics, fetch the necessary replacement part, replace the failed component, and reboot the system. Most repairs will take less than a day; some may take more. The time will vary depending on the level of support from the hardware vendors, their proximity and the proximity of replacement parts.

MTBFs is the Mean Time Between soft Failures = 4,000 hours or about 5.5 months.

MTTRs is the Mean Time To Repair soft failures = 0.125 hours or 7.5 minutes.

The MTBF and MTTR for both hard and soft failures represent typical values. Specific hardware and software may vary depending on hardware, software and support quality.

Availability Diagrams

The availability diagram for the Simple Firewall Sandwich (Figure 1) that does not use redundant FLBs is shown below (Figure 4). The availability diagram consists of three levels. Level 1 represents the top FLB. Level 2 represents the dual redundant firewalls. Level 3 represents the internal FLB. A(i,j) is used to represent the availability of each level, where "i" is the number of components and "j" is the level number ranging from 1 to 3.

The availability of the Simple Firewall Sandwich is the product of the availability of all three levels.

A1 = A(1,1)*A(2,2)*A(1,3).

A1 calculation:

The availability of levels 1 and 3 is easy to compute, as each contains only one system.

A(1,1) = 1 - (fraction of time down from hard failures) - (fraction of time down from soft failures)

A(1,1) = 1 - (MTTRh/(MTBFh+MTTRh)) - (MTTRs/(MTBFs+MTTRs))

A(1,1) = 1 - (24/20024) - (0.125/4000.125) = 0.9988327.

The unavailability of a single node is U1.

U1 = 1.0 - A(1,1) = 1.0 - 0.9988327 = 0.0011673.

Since there are an average of 8,765.52 hours per year, the average downtime per year for a single FLB is calculated as follows:

DT(1,1) = 8765.52*U1 = 8765.52*(1.0 - 0.9988327) = 10.2 hours per year.

A(1,1) = A(1,3) since the external and internal FLB are identical systems.
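The single-node formula can be checked numerically with a short sketch. The function and variable names are assumptions introduced here; the inputs are the MTBF and MTTR values from the Component Reliability section. Evaluated literally, with both downtime fractions subtracted from one, the formula yields an unavailability of about 0.00123 (roughly 10.8 hours per year), slightly higher than the 0.0011673 carried through the text, so the output is best read as a cross-check of the formula as written rather than a restatement of the quoted figures.

```python
HOURS_PER_YEAR = 8765.52  # average year length used in the text


def node_availability(mtbf_hard, mttr_hard, mtbf_soft, mttr_soft):
    """A = 1 - MTTRh/(MTBFh+MTTRh) - MTTRs/(MTBFs+MTTRs), all times in hours."""
    down_hard = mttr_hard / (mtbf_hard + mttr_hard)  # fraction of time in hard repair
    down_soft = mttr_soft / (mtbf_soft + mttr_soft)  # fraction of time rebooting
    return 1.0 - down_hard - down_soft


a_node = node_availability(mtbf_hard=20_000, mttr_hard=24,
                           mtbf_soft=4_000, mttr_soft=0.125)
u_node = 1.0 - a_node
downtime_hours = HOURS_PER_YEAR * u_node

print(f"A = {a_node:.7f}, U = {u_node:.7f}, downtime = {downtime_hours:.2f} hours/year")
```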

Figure 4: Simple Firewall Sandwich Availability diagram

The availability of level 2 is calculated by subtracting from one the probability that both firewalls are down at the same time. We will assume that each of the firewalls has availability characteristics identical to the FLBs. This yields the following formula for calculating the availability of level two.

A(2,2) = 1.0 - (U1)*(U1) = 1.0 - (0.0011673)*(0.0011673) = 0.9999986

A1 = A(1,1)*A(2,2) *A(1,3) = (0.9988327)*(0.9999986)*(0.9988327) = 0.9976654

DT1 = 8765.52*(1.0 - 0.9976654) = 20.46 hours per year.

This analysis shows that the use of non-redundant FLBs to create a Firewall Sandwich will roughly double the average annual downtime relative to a single firewall without FLBs. It is clear that firewall load balancers must also be made redundant if one is striving for higher availability. Otherwise you are doubling the number of single points of failure.
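The series and parallel rules used in this calculation can be sketched as two helper functions. The helper names are assumptions; the per-node unavailability of 0.0011673 is the value quoted in the text and is taken as an input rather than recomputed.

```python
HOURS_PER_YEAR = 8765.52
U_NODE = 0.0011673  # per-node unavailability as quoted in the text


def parallel(*unavailabilities):
    """Redundant nodes: the level is down only when every node is down."""
    all_down = 1.0
    for u in unavailabilities:
        all_down *= u
    return 1.0 - all_down


def series(*availabilities):
    """Levels in series: the path is up only when every level is up."""
    up = 1.0
    for a in availabilities:
        up *= a
    return up


a_flb = 1.0 - U_NODE                   # levels 1 and 3: one FLB each
a_fw_pair = parallel(U_NODE, U_NODE)   # level 2: two firewalls
a_simple = series(a_flb, a_fw_pair, a_flb)
downtime_hours = HOURS_PER_YEAR * (1.0 - a_simple)

print(f"A(2,2) = {a_fw_pair:.7f}, A1 = {a_simple:.7f}, "
      f"downtime = {downtime_hours:.2f} hours/year")
```

The two single FLBs dominate the result: the redundant firewall pair contributes almost nothing to the total unavailability.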

Dual Redundant Firewall Sandwich

Figure 5: Dual Redundant Firewall Sandwich Availability diagram

The availability of the dual redundant firewall sandwich is calculated using the following formula.

Afws = A(2,1)*A(2,2)*A(2,3)

Using the same assumptions regarding node availability, all three levels are identical dual redundant pairs, so A(2,1) = A(2,2) = A(2,3).

Afws = A(2,2)**3, where A(2,2) is the same as the value calculated for the firewall pair in the previous section.

A(2,2) = 1 - (U1)*(U1) = 1 - (0.0011673)*(0.0011673) = 0.9999986

Afws = (0.9999986)**3 = 0.9999959, which is slightly better than five nines.

DTfws = 8765.52*(1-0.9999959) = 0.0358 hours/year = 2.2 minutes/year.

As one can see, eliminating single points of failure through the use of high quality redundant nodes can significantly reduce outages. Using the previous assumptions, we estimate the annual unscheduled downtime will be reduced from 10.2 hours for a single firewall to 2.2 minutes.
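The same calculation generalizes to any number of levels and any number of redundant nodes per level. The sketch below is a minimal illustration under the independence assumption stated earlier; the function name and the list-based layout argument are hypothetical, and the node unavailability is the value quoted in the text.

```python
HOURS_PER_YEAR = 8765.52
U_NODE = 0.0011673  # per-node unavailability as quoted in the text


def layered_availability(node_unavailability, nodes_per_level):
    """Availability of levels in series, each level a set of redundant nodes."""
    availability = 1.0
    for count in nodes_per_level:
        level_unavailability = node_unavailability ** count  # all nodes down at once
        availability *= 1.0 - level_unavailability
    return availability


a_fws = layered_availability(U_NODE, [2, 2, 2])  # FLB pair, firewall pair, FLB pair
downtime_minutes = HOURS_PER_YEAR * (1.0 - a_fws) * 60

print(f"Afws = {a_fws:.7f}, downtime = {downtime_minutes:.2f} minutes/year")
```

Calling `layered_availability(U_NODE, [1, 2, 1])` reproduces the simple sandwich with single FLBs for comparison.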

Dual Redundant Load Balancing Firewalls

PORTUS provides High Availability high performance application protection using an architecture that requires only a single level of redundant systems rather than the three levels found in the typical firewall sandwich. This approach is possible because PORTUS integrates firewall workload balancing with automated switch over in the event of a node failure.

Figure 6: HARPS Availability Diagram

The availability diagram for the PORTUS High Availability configuration is simpler than the availability diagram for the firewall sandwich because the FLB layer has been integrated into the PORTUS systems. The integrated solution reduces six systems to only two. The level of system performance is the same for all three configurations, as it is limited by the total throughput of the two firewalls and not by the FLBs.

First we will analyze the availability of this solution using the same assumptions regarding hardware and software reliability. In other words we will use the same values for MTBF and MTTR as were used in the other two configurations. Let Ap be availability of a single PORTUS system running on a high quality but not a fault tolerant server.

Ap = 1 - (MTTRh/(MTBFh+MTTRh)) - (MTTRs/(MTBFs+MTTRs))

Ap = 1 - (24/20024) - (0.125/4000.125) = 0.9988327.

The unavailability of a single PORTUS system is Up.

Up = 1.0 - Ap = 0.0011673.

Let DTp be the expected annual downtime for a single PORTUS system running on a high quality, but not fault tolerant system.

DTp = 8765.52*Up = 8765.52*(1-0.9988327) = 10.2 hours per year.

The availability of dual redundant PORTUS systems is calculated by subtracting from one the probability that both PORTUS firewalls are down at the same time. This yields the following formula for calculating the availability of redundant PORTUS systems. Let Arp be the availability of a dual redundant PORTUS system running on high quality but not fault tolerant servers.

Arp = 1 - (Up )*(Up ) = 1.0 -(0.0011673)*(0.0011673) = 0.9999986

The average annual unscheduled downtime can be calculated as follows.

DTrp = 8765.52*(1-0.9999986) = 0.01194 hours/year = 43 seconds/year.

The reduction in complexity (one level versus three) allows a dual redundant PORTUS system to reduce the expected annual downtime to about one third of the level expected from a firewall sandwich using the same type of systems.
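The one-third relationship follows directly from having one redundant level instead of three, as a short numerical sketch shows. Variable names are assumptions; the node unavailability is the value quoted in the text.

```python
HOURS_PER_YEAR = 8765.52
U_NODE = 0.0011673  # per-node unavailability as quoted in the text

pair_unavailability = U_NODE ** 2            # both nodes down at the same time
a_pair = 1.0 - pair_unavailability           # one redundant level
a_sandwich = (1.0 - pair_unavailability) ** 3  # three redundant levels in series

pair_seconds = HOURS_PER_YEAR * 3600 * (1.0 - a_pair)
sandwich_seconds = HOURS_PER_YEAR * 3600 * (1.0 - a_sandwich)
ratio = pair_seconds / sandwich_seconds

print(f"pair: {pair_seconds:.1f} s/year, sandwich: {sandwich_seconds:.1f} s/year, "
      f"ratio = {ratio:.4f}")
```

With identical nodes, three levels in series have almost exactly three times the unavailability of one level, so the ratio sits at roughly 1/3 regardless of the node quality assumed.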

High Availability Redundant PORTUS Systems

The HARPS NIPS further improves on the availability offered by the Dual Redundant Load Balancing Firewalls through the use of fault tolerant hardware and software. The hardware consists of two or more redundant nodes. Each node contains multiple levels of hardware redundancy. In the configuration used in this example, the processors have quadruple redundancy. If a processor fails, the transaction running on the failing processor is reassigned to one of the remaining processors, where it continues running. The failing processor is marked non-dispatchable and the system continues running.

The estimated availability of this configuration is computed as follows.

The fault tolerant system has quadruple redundant processors with dynamic processor deallocation. This significantly reduces the probability of a system crash or a lost transaction due to a failing processor. Chipkill memory, sometimes referred to as RAID for memory, is 100 times more reliable than ECC memory and virtually eliminates system outages due to a memory failure. The power supplies and cooling fans are triple redundant. Spare Ethernet adapters can be dynamically configured to take over for failing network adapters. The hard drives are configured as a RAID to eliminate outages due to a single disk drive failure. The net result is a significantly longer MTBF, both hard and soft. For this calculation we have used the conservative value of 100,000 hours for MTBFh, even though some models of the hardware have exceeded 320,000 hours MTBF in customer locations. We have used the conservative value of 8,000 hours for MTBFs, even though the actual value is more than 10,000 hours.

MTBFh = 100,000 hours
MTBFs = 8,000 hours

We will examine the availability of this solution using the same formula and the same MTTR values as were used for the other configurations, together with the longer MTBF values given above. Let Aft be the availability of a single fault tolerant PORTUS system.

Aft = 1.0 - ( MTTRh/(MTBFh+MTTRh)) - (MTTRs/(MTBFs+MTTRs))

Aft = 1 - (24/100024) - (0.125/8000.125) = 0.99977568.

The unavailability of a single PORTUS system on a fault tolerant platform is Uft.

Uft = 1.0 - Aft = 0.000224318.

The average annual unscheduled downtime for a fault tolerant PORTUS system is DTft.

DTft = 8765.52*Uft = 8765.52*(0.000224318) = 1.96626 hours per year.

A PORTUS NIPS running on a single fault tolerant system reduces the annual unscheduled downtime from 10.2 hours to only 1.97 hours, an 80% reduction in downtime.

The High Availability Redundant PORTUS Systems (HARPS) configuration further reduces system downtime. Let Aharp be the availability of a HARPS configuration.

Aharp = 1 - (Uft)*(Uft) = 1.0 - (0.000224318)*(0.000224318) = 0.99999995

As you can see, a HARPS configuration far exceeds five nines availability. In fact, it exceeds six nines availability. The average annual unscheduled downtime can be calculated as follows.

DTharp = 8765.52*(1-0.99999995) = 0.00044 hours/year = 2 seconds/year.

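The expected annual downtime of a HARPS configuration is about 23,000 times smaller than that of a single standard firewall (10.2 hours versus 0.00044 hours per year) and about 80 times smaller than that of a firewall sandwich (0.0358 hours versus 0.00044 hours per year).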
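The fault tolerant pair can be checked the same way as the earlier configurations. This sketch takes the single-node unavailability of 0.000224318 quoted in the text as its input; the variable names are assumptions.

```python
HOURS_PER_YEAR = 8765.52
U_FT = 0.000224318  # single fault tolerant node unavailability, as quoted in the text

single_downtime_hours = HOURS_PER_YEAR * U_FT        # one fault tolerant node
a_harps = 1.0 - U_FT ** 2                            # pair is down only if both are down
harps_downtime_seconds = HOURS_PER_YEAR * 3600 * U_FT ** 2

print(f"single node: {single_downtime_hours:.3f} hours/year")
print(f"pair: A = {a_harps:.8f}, downtime = {harps_downtime_seconds:.2f} seconds/year")
```

Because the pair's unavailability is the square of the node's, the fivefold improvement in node unavailability translates into a much larger improvement for the redundant pair.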

Cost Analysis

High speed layer 4-7 web switches used for FLB with gigabit throughput cost upwards of $35,000 each. A high performance application protection system configured with dual 1 gigabit Ethernet adapters will cost approximately $20,000.

Firewall Sandwich
Component     Unit Cost     Qty     Extended Cost
FLB           $35,000       4       $140,000
Firewall      $20,000       2       $40,000
Total                               $180,000

Integrated Load Balancing Redundant Firewall
Component     Unit Cost     Qty     Extended Cost
Firewall      $20,000       2       $40,000
Total                               $40,000

High Availability Redundant PORTUS System
Component     Unit Cost     Qty     Extended Cost
Firewall      $40,000       2       $80,000
Total                               $80,000
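The totals in the three tables, and the saving relative to the firewall sandwich, can be reproduced with a short sketch. The dictionary layout and function name are assumptions; the unit costs and quantities are those listed in the tables.

```python
# (component, unit cost in dollars, quantity) per configuration, from the tables
CONFIGURATIONS = {
    "Firewall Sandwich": [("FLB", 35_000, 4), ("Firewall", 20_000, 2)],
    "Integrated Load Balancing Redundant Firewall": [("Firewall", 20_000, 2)],
    "High Availability Redundant PORTUS System": [("Firewall", 40_000, 2)],
}


def total_cost(items):
    """Sum of unit cost times quantity over all components."""
    return sum(unit_cost * quantity for _, unit_cost, quantity in items)


totals = {name: total_cost(items) for name, items in CONFIGURATIONS.items()}
baseline = totals["Firewall Sandwich"]

for name, cost in totals.items():
    saving = 1.0 - cost / baseline  # fraction saved relative to the sandwich
    print(f"{name}: ${cost:,} ({saving:.1%} below the sandwich)")
```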

Architectural Differences

HARPS has two redundant load sharing firewalls, each with quadruple redundant processors. Up to three of the four processors in a system can fail without loss of that system. A total of seven of the eight processors in the two systems can fail and service will still be available. A firewall sandwich, on the other hand, can fail with the loss of only two processors: if both processors are on the same level of the sandwich there will be a complete loss of service. There are three ways a dual processor failure can take down the firewall sandwich, one at each level. Loss of a single processor in a HARPS configuration will reduce throughput by only 12.5%. Loss of a single processor in a firewall sandwich can reduce system throughput by up to 50%. These are some of the many reasons why the HARPS configuration is more reliable and experiences less downtime than the conventional firewall sandwich.

The reliability (MTBF) of the integrated high availability firewall is greater than that of the dual redundant firewall sandwich. With the firewall sandwich there are six systems, each of which can be expected to be rebooted on average once every 4,000 hours. With six systems the average time between reboot events will be about 667 hours. That is, on average there will be a minor disruption on a monthly basis. The disruption may last 5 to 10 seconds while the systems automatically compensate for the failure. However, one can expect system administrators to become involved in the recovery process. With the redundant high availability solution, two systems that each reboot no more than once every 8,000 hours give an average time between reboot events of 4,000 hours or more, a sixfold increase in system reliability relative to the firewall sandwich.
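The reboot interval comparison follows from the assumption of independent, exponentially distributed failures, under which the failure rates of the individual nodes add. The sketch below illustrates this; the function name is an assumption, and the 8,000 hour figure is the conservative MTBFs value used earlier for the fault tolerant nodes.

```python
def hours_between_reboot_events(node_mtbf_soft_hours, node_count):
    """Mean time between reboot events anywhere in a group of identical nodes.

    With independent exponential failures the rates add, so the combined
    rate is node_count / MTBFs and the mean interval is its reciprocal.
    """
    return node_mtbf_soft_hours / node_count


sandwich_interval = hours_between_reboot_events(4_000, 6)  # six standard nodes
harps_interval = hours_between_reboot_events(8_000, 2)     # two fault tolerant nodes

print(f"sandwich: {sandwich_interval:.0f} hours, HARPS: {harps_interval:.0f} hours, "
      f"improvement: {harps_interval / sandwich_interval:.1f}x")
```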

Conclusion

The use of redundant fault tolerant high availability firewalls with integrated workload balancing provides higher levels of availability, less complexity and lower purchase and operational costs than firewall sandwiches. A firewall sandwich using redundant firewalls sandwiched between redundant Firewall Load Balancers can reduce unscheduled outages from 10 hours per year to less than 3 minutes per year. Use of fault tolerant firewall systems with integrated workload balancing can further reduce unscheduled outages to less than 3 seconds per year. The HARPS solution is simpler to configure and reduces the initial installation cost by about 56% ($80,000 versus $180,000) relative to the firewall sandwich.