Analysis of High Availability Firewalls versus Firewall Sandwich Configurations
Abstract
Firewalls are
increasingly being used to protect mission critical systems that require
unscheduled outages to be limited to less than six minutes per year.
One solution has been to implement a Firewall Sandwich,
which consists of redundant firewalls placed between two layers of redundant
Firewall Load Balancers (FLB). The firewall sandwich uses six or more
nodes configured in three layers to achieve High Availability (99.999%).
This method requires a significant increase in costs and complexity.
The High Availability Redundant PORTUS System (HARPS)
achieves High Availability using fewer systems at a fraction of the
cost and complexity. A single layer of load-balancing, fault tolerant
firewalls reduces unscheduled outages to a couple of seconds per year.
HARPS provides higher availability, lower costs and easier management
relative to firewall sandwiches.
Introduction
Firewalls are
increasingly being used to protect applications of mission critical
importance. This means firewalls need to provide the predictable high
levels of security, reliability and survivability required for financial
organizations, safety-critical domains such as aviation, healthcare,
infrastructure, first-responders and national defense. Applications
in these environments require High Availability which is defined as
99.999% availability. High Availability is difficult to achieve as it
limits unscheduled outages to less than six minutes per year. Today
most servers are down an average of more than 10 hours per year, so
a High Availability solution requires a 100-fold reduction in the average
downtime.
There are two
distinct methods used to achieve High Availability firewall solutions.
One method is called a Firewall Sandwich which consists
of redundant firewalls placed between two layers of redundant Firewall
Load Balancers (FLB) that operate as TCP/IP layer 4-7 switches. This
method uses six or more nodes configured in three layers to achieve
high availability. The other method, called a High Availability Redundant
PORTUS System (HARPS), uses a cluster of load-balancing, fault
tolerant high availability firewalls. Fault tolerant firewalls are designed
to detect, isolate and recover from both hardware and software failures.
This method integrates the functions found in the three layer sandwich
into a single layer. This reduces complexity, costs and downtime relative
to the firewall sandwich.
HARPS uses fault
tolerant hardware and software that can detect and recover from component
failures without loss of service. This requires hardware redundancy
including processors, memory, disk drives, controllers, communication
adapters and more. However, hardware redundancy by itself does not make
a fault tolerant system. The software has to be capable of detecting
failed hardware components and dynamically adjusting its configuration
to avoid a lost transaction or system outage. HARPS offers significant
enhancements in hardware fault tolerance. In the event of a processor
failure, dynamic processor deallocation allows a system to continue
running without losing a transaction. Advanced memory error correction
increases memory reliability more than 100 times over ECC memory, virtually
eliminating system failures caused by memory errors. Dynamic Ethernet
reassignment prevents loss of communications due to a NIC failure.
HARPS achieves
high availability using two or more fault tolerant nodes configured
in a single layer. Both methods employ load balancing to distribute
the workload across multiple firewall systems. Both approaches are capable
of delivering high performance high availability solutions. However,
there are significant differences in the number of required systems
and levels of complexity required to manage them.
The following
sections will show how to calculate the availability of each configuration
based on the reliability and availability of the components. A cost
analysis will also be conducted.
Firewall
Sandwich Configuration
The firewall sandwich
serves two purposes. It eliminates the single point of failure of a
single firewall and it balances the workload across two or more firewalls
eliminating potential bottlenecks. Firewall Load Balancers (FLBs)
are configured on either side of the firewalls. The FLBs on both sides
of the firewall layer ensure that connection-oriented TCP/IP traffic passes
through the same firewall in both directions. This is required for correct
operation of stateful packet filter firewalls, application proxy firewalls
and the more advanced application level protection systems sometimes
called Intrusion Prevention Systems. The configuration is symmetrical,
as TCP/IP connections can originate and terminate on either side of the
firewalls.
The firewall sandwich
eliminates the firewall as a single point of failure, but introduces
two new single points of failure: the FLB systems themselves. Simply
adding an FLB on either side of the redundant firewalls will actually
reduce network availability. To overcome this problem one has to configure
redundant FLBs on either side of the firewall layer. We have now
gone from a single firewall system to a Firewall Sandwich with
a total of six or more systems.
Figure 1 shows
a firewall sandwich with a single FLB on either side of a dual redundant
firewall system.
Figure 2 shows a firewall sandwich with redundant FLBs on either side
of a dual redundant firewall system.
Figure 1: Simple Firewall Sandwich
Figure 2: Dual Redundant Firewall Sandwich
Fault
Tolerant High Availability Firewall Configuration
Most discussions
regarding the construction of high availability firewall configurations
have dealt with the use of firewall sandwiches as shown in the previous
section. Sandwich configurations achieve High Availability using standard
(non fault tolerant) firewall systems through the use of FLBs that bypass
failed firewalls. HARPS uses two or more systems running fault tolerant
hardware and software with integrated workload balancers. This approach
achieves high levels of redundancy using a single layer configuration
rather than the triple layer sandwich. The significant advantages of
this approach include (1) higher availability, (2) simpler installation
and configuration, (3) simpler management, (4) lower initial cost and
(5) lower operating costs.
Figure 3: Redundant Fault Tolerant Firewall Configuration
HARPS uses fault
tolerant software running on a fault tolerant hardware platform.
Fault-tolerant
Software
PORTUS is unique
in that it was designed from the ground up as a high availability Network
Intrusion Prevention System (NIPS). Since all hardware and software
is subject to failure the key to High Availability is in the architecture
of the product. Hardware and software redundancy is an essential but
insufficient condition for High Availability. The design must do more
than eliminate single points of failure. The software must provide real-time
detection, isolation and recovery from errors. In addition it must provide
immunity from hostile attacks to prevent catastrophic system wide failures.
Error
Isolation:
PORTUS was designed to isolate the effects of both hardware and software
errors to prevent error propagation. With PORTUS each transaction is
handled by a separate process and there is strict separation between
processes. This means a failure while handling one transaction is isolated
and cannot propagate to other transactions. In contrast, all stateful
packet filter (SPF) firewalls and intrusion detection systems (IDS)
run as part of the kernel, permitting errors to propagate. These errors
can cause catastrophic system failure or, worse yet, allow the SPF or
IDS to fail wide open, permitting malicious traffic to pass undetected.
Immunity:
PORTUS transaction processes run at the application level in a chrooted
directory. This means the process handling the transaction is unable
to access or modify any system files. As a result, a skilled hacker
cannot exploit any possible errors in PORTUS to access or modify the
programs or configuration files on the system. This makes PORTUS immune
to attack even if there are errors in the code.
Granular
scalable redundancy
provides multiple copies of selected software components that can be
non disruptively replaced if damaged. Regeneration techniques recognize
damage and automatically recover operational capability. These features
allow it to survive massive and extremely hostile attacks.
Cognitive
immunity:
PORTUS can recognize attacks that have never been seen before by detecting
protocol anomalies designed to attack systems. This makes PORTUS, as
well as systems it is protecting, immune to new forms of attack. Protocol
anomaly detection will also block attacks which remain to be invented.
Triple
Level Software Recovery
provides for nondisruptive service in the event of a software failure.
The first level of recovery is provided by the child process. It performs
extensive error checking and reporting. Most errors are caused by external
data errors, which are usually classified as application protocol anomalies.
Protocol anomalies are blocked and prevented from impacting PORTUS or any
system it is protecting. Software induced errors are documented using
first failure capture techniques so that software errors can be quickly
diagnosed and corrected. In most cases a graceful recovery can be performed.
Error isolation, previously discussed, limits the scope of any error
to a single transaction.
The second level
of error recovery is performed by the parent process, which spawns child
processes and monitors their behavior. If a child process terminates
for any reason, the parent process performs the second level of error recovery.
Process regeneration, similar to biological response strategies, prevents
process depletion and loss of function.
At the third level,
specialized system monitors can detect, report and recover from lower
level failures. The process monitor checks to make certain the number
of processes for each application falls within the normal boundaries
defined for the system. If the parent process is not spawning children,
the number of child processes will drop. The process monitor issues
an alert and tries to correct the problem. This level allows graceful
recovery if a parent process (second level) should cease to function.
A PORTUS restart command automatically restarts a parent process without
disrupting existing child processes. The same procedure is used to dynamically
invoke new rules and configuration changes in a non-disruptive fashion.
All three levels are available on a single PORTUS system.
A fourth level
of error recovery is present when two or more PORTUS systems are configured
in a HARPS. At this level the two systems monitor each other for proper
operation, and each performs automatic recovery should its sibling system fail.
Process self-regeneration
is also used to prevent failures caused by slow system resource depletion.
Child processes are automatically retired after one thousand transactions.
When a process is retired, all resources it acquired during its lifetime
are returned to the system. Thus even if there were a memory leak in
the code, all the memory would be returned to the system. When a child
process is retired, the parent will automatically regenerate a new child
process using the same mechanism that was used to recover from an error.
This allows PORTUS to run for years without rebooting.
Automated
System Management
routines keep the system running without operator intervention. Specialized
monitors alert the system administrators of pending problems allowing
preventive action to be taken before the problem causes a disruption
in service.
PORTUS automatically
rotates logs, compresses them and archives them onto another system. This
prevents the file systems from filling up. As a precaution, there is
a disk monitor that tracks disk space utilization. When a file
system passes one of four configurable thresholds, a notice is automatically
sent to the administrator. As each successive threshold is passed,
a notice of higher severity is sent. If a file system reaches 99% utilization,
the proxies are automatically throttled so that no new transactions
are served, since the system would not be able to log all the transactions.
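The threshold logic of the disk monitor can be sketched as follows. This is a
minimal illustration, not the PORTUS implementation: the four threshold values,
the severity names and the function names are assumptions chosen for the
example. Only the 99% throttle point comes from the text.

```python
import shutil

# Hypothetical thresholds (percent used) and severity names. The text says
# only that there are four configurable thresholds of increasing severity.
THRESHOLDS = [(70.0, "notice"), (80.0, "warning"), (90.0, "error"), (95.0, "critical")]
THROTTLE_AT = 99.0  # utilization at which new transactions are refused


def classify(percent_used, thresholds=THRESHOLDS):
    """Return (highest severity passed or None, whether to throttle)."""
    severity = None
    for limit, name in sorted(thresholds):
        if percent_used >= limit:
            severity = name
    return severity, percent_used >= THROTTLE_AT


def percent_used(path="/"):
    """Current utilization of the file system that holds path."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


if __name__ == "__main__":
    severity, throttle = classify(percent_used("/"))
    print(f"severity={severity} throttle={throttle}")
```

A real monitor would run this check periodically for every file system and
send a notice only when the severity changes.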
Automated garbage
collection routines remove expired files preventing loss of disk space.
The Process manager
automatically scans the process table to ensure the processes that should
be running are, and ones that should not are not. Alerts are generated
when there are either too few or too many processes running of a specific
type. This enables preventive administrator action to be taken before
a disruption occurs. The parent processes automatically ensure that
there are sufficient child processes to handle the work load. However,
in the unlikely case the parent process should die the process monitor
will alert the administrator enabling them to recover the parent before
a disruption occurs.
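The boundary check performed by the process manager can be illustrated with a
short sketch. The application names and limits below are hypothetical, and the
observed counts are passed in as a dictionary rather than read from a real
process table, so this shows only the alerting rule described above.

```python
# Hypothetical per-application limits: name -> (minimum, maximum) process count.
LIMITS = {"http_proxy": (4, 64), "smtp_proxy": (2, 32)}


def check_process_counts(observed, limits=LIMITS):
    """Return one alert string for every application outside its limits."""
    alerts = []
    for name, (low, high) in limits.items():
        count = observed.get(name, 0)
        if count < low:
            alerts.append(f"{name}: only {count} running, expected at least {low}")
        elif count > high:
            alerts.append(f"{name}: {count} running, expected at most {high}")
    return alerts


# Too few http_proxy children suggests the parent has stopped spawning them.
alerts = check_process_counts({"http_proxy": 1, "smtp_proxy": 10})
for line in alerts:
    print(line)
```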
Automated procedures
determine if a process has gone into a loop and automatically terminate
the process, allowing other work to run without degradation.
Some models of
the PORTUS appliance record all hardware and software errors in a system
error log. Error reports are automatically generated that inform the
systems administrator of temporary error conditions that can be corrected
before they become permanent errors.
As a result, PORTUS
is far more robust and resistant to hardware and software failures and
hostile attacks than any of its competitors. Its self healing architecture
permits it to recover from errors that would cause lesser systems to
fail.
Fault
Tolerant Hardware
PORTUS High Availability
systems use multiple fault tolerant hardware features to provide granular
scalable hardware redundancy. Multiple levels of redundancy provide
nondisruptive operations in spite of one or more component failures.
Standard features include redundant hot swap power supplies, redundant
hot-swap cooling fans and disk drives. The system components most likely
to fail can be swapped out and be replaced without disrupting the system.
Some models offer fault tolerance that far exceeds the standard features
found on competitive product offerings. The extended availability features
include the following.
Dynamic
processor deallocation
varies a failing processor offline (SMP configurations only), moves
the transaction to a working processor, marks the failed processor as
inactive and issues an alert. This prevents lost transactions and system
crashes as a result of a processor failure. Other systems, including
SMPs, will crash when a processor fails.
Chipkill
memory can
detect and correct multi-bit errors in a single byte, while ECC memory
can only correct single bit errors within a byte. A system with multiple
gigabytes of ECC memory is as likely to suffer a memory failure as a
system with 64 MB of non-ECC memory. Since Chipkill memory is 100 times
more reliable than ECC memory it virtually eliminates system crashes
due to memory failure.
Dynamic
network adapter
reconfiguration automatically enables a spare adapter to takeover from
a failed adapter without disrupting service.
PORTUS was the
first firewall/NIPS to share its workload across multiple systems. In
addition to providing higher throughput, the system provides higher
availability. When two PORTUS systems are configured with the high availability
option, each one monitors its sibling. When one fails, the other dynamically
assumes the failed system's workload.
First
Failure Capture
The hardware and
Operating System provide first failure capture so that the components that
caused a failure, be they hardware or software, are logged for later review
and analysis. The data is logged in non-volatile storage. This process
is so quick that if someone should pull the power plugs, a message will
be logged indicating the system power was cut off. Both software
and hardware failures are recorded in the log. Often one or more soft
failures precede a hard failure, so error reporting enables preventive
actions to be taken before service is disrupted. The error report is
often the first indication that there was an error, since most errors
are recoverable and go unnoticed.
High
Availability using Redundant Systems
The HARPS option
consists of multiple PORTUS Network Intrusion Prevention Systems configured
in pairs. Multiple pairs of systems can be added to the High Availability
configuration allowing for nondisruptive growth in capacity. With configured
pairs, both systems share the workload while they monitor their sibling
for errors. The PORTUS systems have similar configurations, and both
are connected to the same networks. This allows each PORTUS system to
automatically test its sibling's connectivity using every NIC. If system
A detects that system B is not functioning (frozen, crashed, or with one or
more inactive NICs), then system A can automatically take over
all the traffic of system B.
Firewall
Availability Models
There are multiple
methods that can be used to calculate system availability and unavailability.
One method uses a Generalized Stochastic Petri Net (GSPN) to describe
the system being evaluated. First a GSPN diagram is drawn to serve as
a guide to writing the control statements for Stochastic Petri Net Package
(SPNP). The control statements enable the SPNP to calculate the unavailability
of the system. This approach can be used to analyze the availability
of complex systems with multiple levels of redundancy. Another, simpler
approach involves drawing availability diagrams. Once the diagrams
are drawn, analytical methods can be used to calculate both availability
and unavailability. If one is careful, the quicker analytical
method will generate availability results that agree with the more sophisticated
GSPN approach to at least six decimal places. We will use the second
method for our analysis.
Assumptions
The availability
analysis is based on a series of assumptions regarding the nature
of hardware and
software failures.
1. FLB and FW
node failures generate conditions that are immediately detectable
by the other nodes (FLB and FW), triggering automated responses. These
conditions are recognizable either by an absence of expected responses
to an automated query or by an error signal from a failing node.
2. The majority
of node failures are soft failures that can be corrected by a reboot.
Many of these failures are software failures or transient hardware
failures that are often blamed on the software.
3. The vast
majority of software found in systems has limited error detection,
isolation and correction, which is what causes most of the reboots.
4. Hard failures
are not correctable by rebooting. Hard failures require repair or
replacement of the failing node or critical components of the failing
node.
5. Node failures
are most often mutually independent. A hardware failure in one node
normally does not induce a hardware failure in another node. Most
software errors on one system will not propagate to another. However,
there are some software errors that can cause node failures on multiple
nodes of similar type and function. For example, a buffer overrun
that can be exploited by an attacker to disable one FW might also
disable the others.
6. Node failure
rates, repair rates and reboot rates are exponentially distributed.
A High Availability
system is defined to have 99.999% availability (A), which is expressed
as 0.99999, a number that is very close to 1. Its unavailability (U)
is calculated by subtracting A from one.
U = 1-A.
So a system
with five nines availability has its unavailability calculated as
follows.
U = 1-0.99999
= 0.00001
A year has an
average of 8765.52 hours. So a system with five nines availability
will average 0.08766 hours or 5.26 minutes of unscheduled down time
per year.
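The conversion from availability to annual downtime is used throughout the rest
of the analysis, so a small helper makes it explicit. The function name is our
own. The year length is the 8765.52 hour average used in the text.

```python
HOURS_PER_YEAR = 8765.52  # average year length used in the text


def annual_downtime_hours(availability):
    """Expected unscheduled downtime per year for a given availability."""
    return HOURS_PER_YEAR * (1.0 - availability)


five_nines = annual_downtime_hours(0.99999)
print(f"{five_nines:.5f} hours = {five_nines * 60:.2f} minutes per year")
```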
Component
Reliability
The values used
to describe node reliability are Mean Time Between Failure (MTBF)
and Mean Time To Repair (MTTR). To differentiate between hard and
soft failures we will subscript the symbols. The node failure rates
shown below are not vendor specific but represent reasonable values
for computers built with better than average components. The systems
considered are server level systems with redundant hot-swap power
supplies, cooling fans and disk drives configured as RAID 1
(mirrored) or RAID 5. Without this level of hardware redundancy MTBF
will be shorter than shown below.
MTBFh
is the Mean Time Between hard Failures = 20,000 hours or about 2.28
years.
MTTRh
is the Mean Time To Repair a hard failure = 24 hours. This includes
the time to do initial problem determination, make a request for on-site
hardware repair, wait for the technician to arrive, run the diagnostics, fetch
the necessary replacement part, replace the failed component,
and reboot the system. Most repairs will take less than a day; some
may take more. The time will vary depending on the level of support from
the hardware vendors, their proximity and the proximity of replacement
parts.
MTBFs
is the Mean Time Between soft Failures = 4,000 hours or about 5.5
months.
MTTRs
is the Mean Time To Repair soft failures = 0.125 hours or 7.5 minutes.
The MTBF and
MTTR for both hard and soft failures represent typical values. Specific
hardware and software may vary depending on hardware, software and
support quality.
Availability
Diagrams
The availability
diagram for the Simple Firewall Sandwich (Figure 1) that does not
use redundant FLBs is shown below (Figure 4). The availability diagram
consists of three levels. Level 1 represents the top FLB. Level 2
represents the dual redundant firewalls. Level 3 represents the internal
FLB. A(i,j) is used to represent the availability of each level,
where i is the number of components in the level and j
is the level number ranging from 1 to 3.
The availability
of the Simple Firewall Sandwich is the product of the availability
of all three levels.
A1 = A(1,1)*A(2,2)*A(1,3).
A1 calculation:
The availability
of levels 1 and 3 is easy to compute as each contains only one system.
A(1,1) = 1
- fraction of time down from hard failures - fraction of time down
from soft failures.
A(1,1) = 1
- ( MTTRh/(MTBFh+MTTRh)) - (MTTRs/(MTBFs+MTTRs))
A(1,1) = 1
- (24/(20024) - ( 0.125/(4000.125)) = 0.9988327.
The unavailability
of a single node is U1.
U1 = 1.0 - A(1,1) = 1.0 - 0.9988327 = 0.0011673.
Since there
are an average of 8,765.52 hours per year, the average downtime per
year for a single FLB is calculated as follows:
DTflb = 8765.52*U1
= 8765.52*(1.0 - 0.9988327) = 10.2 hours per year.
A(1,1) = A(1,3)
since the external and internal FLB are identical systems.
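The single-node formula can be checked with a few lines of Python. This is a
sketch that evaluates the expression exactly as it is written above, with both
downtime fractions subtracted from one. The function and variable names are
ours.

```python
HOURS_PER_YEAR = 8765.52


def node_availability(mtbf_hard, mttr_hard, mtbf_soft, mttr_soft):
    """1 minus the fraction of time down from hard and from soft failures."""
    hard_down = mttr_hard / (mtbf_hard + mttr_hard)
    soft_down = mttr_soft / (mtbf_soft + mttr_soft)
    return 1.0 - hard_down - soft_down


# Component reliability values from the text (all in hours).
a_node = node_availability(20_000, 24, 4_000, 0.125)
u_node = 1.0 - a_node
print(f"A = {a_node:.7f}  U = {u_node:.7f}  "
      f"downtime = {HOURS_PER_YEAR * u_node:.2f} hours/year")
```

Evaluated this way, the availability comes out near 0.99877, or roughly 10.8
hours of downtime per year, slightly more than the 10.2 hours quoted in the
text. The quoted 0.9988327 is what one obtains if the soft-failure fraction is
added back rather than subtracted. The difference is small and does not change
the ranking of the configurations.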
Figure 4: Simple Firewall Sandwich Availability Diagram
The availability
of level 2 is calculated by subtracting from one the probability that
both firewalls are down at the same time. We will assume that each
of the firewalls has availability characteristics identical to the
FLBs. This yields the following formula for calculating the availability
of level two.
A(2,2) = 1.0
- (U1)*(U1) = 1.0 - (0.0011673)*(0.0011673) = 0.9999986
A1 = A(1,1)*A(2,2)
*A(1,3) = (0.9988327)*(0.9999986)*(0.9988327) = 0.9976654
DT1 = 8765.52*(1.0 - 0.9976654) = 20.46 hours per year.
This analysis
shows that the use of non-redundant FLBs to create a firewall sandwich
roughly doubles the average annual downtime relative to a single
firewall without FLBs. It is clear that firewall load balancers
must also be made redundant if one is striving for higher
availability. Otherwise one is doubling the number of single points
of failure.
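The availability diagram reduces to two rules: levels in series multiply their
availabilities, and a redundant level fails only when all of its nodes are down.
The sketch below applies those rules to the simple firewall sandwich, taking the
node unavailability of 0.0011673 quoted in the text as its input. The helper
names are ours.

```python
from math import prod

HOURS_PER_YEAR = 8765.52
U1 = 0.0011673  # single-node unavailability as quoted in the text


def parallel(unavailabilities):
    """Availability of a redundant level: down only if every node is down."""
    return 1.0 - prod(unavailabilities)


def series(availabilities):
    """Availability of levels that must all be up at the same time."""
    return prod(availabilities)


a_flb = 1.0 - U1                 # levels 1 and 3: one FLB each
a_fw_pair = parallel([U1, U1])   # level 2: two redundant firewalls
a_simple = series([a_flb, a_fw_pair, a_flb])
downtime = HOURS_PER_YEAR * (1.0 - a_simple)
print(f"A1 = {a_simple:.7f}, downtime = {downtime:.2f} hours/year")
```

The two single FLBs dominate the result. The redundant firewall pair
contributes almost nothing to the total downtime.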
Dual
Redundant Firewall Sandwich
Figure 5: Dual Redundant Firewall Sandwich Availability Diagram
The availability
of the dual redundant firewall sandwich is calculated using the following
formula, where A(i,2) is the availability of level i with two redundant nodes.
Afws = A(1,2)*A(2,2)*A(3,2)
Using the same
assumptions regarding node availability it is easy to see that A(1,2)
= A(2,2) = A(3,2).
Afws = A(1,2)**3,
and A(1,2) is the same as the previous A(2,2).
A(1,2) =
1 - (U1)*(U1) = 1 - (0.0011673)*(0.0011673) = 0.9999986
Afws = (0.9999986)**3
= 0.9999959, which is slightly better than five nines.
DTfws = 8765.52*(1-0.9999959)
= 0.0358 hours/year = 2.2 minutes/year.
As one can see,
eliminating single points of failure through the use of high quality
redundant nodes can significantly reduce outages. Using the previous
assumptions we estimate the annual unscheduled downtime will be reduced
from 10.2 hours to 2.2 minutes.
Dual
Redundant Load Balancing Firewalls
PORTUS provides
High Availability high performance application protection using an
architecture that requires only a single level of redundant systems
rather than the three levels found in the typical firewall sandwich.
This approach is possible because PORTUS integrates firewall workload
balancing with automated switch over in the event of a node failure.
Figure 6: HARPS Availability Diagram
The availability
diagram for the PORTUS High Availability configuration is simpler
than the availability diagram for the firewall sandwich because the
FLB layer has been integrated into the PORTUS systems. The integrated
solution reduces six systems to only two. The level of system performance
is the same for all three configurations, as it is limited by the
total throughput of the two firewalls and not by the FLBs.
First we will
analyze the availability of this solution using the same assumptions
regarding hardware and software reliability. In other words we will
use the same values for MTBF and MTTR as were used in the other two
configurations. Let Ap be the availability of a single PORTUS system running
on a high quality, but not fault tolerant, server.
Ap = 1 -(
MTTRh/(MTBFh+MTTRh)) - (MTTRs/(MTBFs+MTTRs))
Ap = 1 -(24/(20024)
- ( 0.125/(4000.125)) = 0.9988327.
The unavailability
of a single PORTUS system is Up.
Up = 1.0 - Ap = 0.0011673.
Let DTp be the
expected annual downtime for a single PORTUS system running on a high
quality, but not fault tolerant, system.
DTp = 8765.52*Up
= 8765.52*(1 - 0.9988327) = 10.2 hours per year.
The availability
of dual redundant PORTUS systems is calculated by subtracting from
one the probability that both PORTUS firewalls are down at the same
time. This yields the following formula for calculating the availability
of redundant PORTUS systems. Let Arp be the availability of a dual redundant
PORTUS system running on high quality, but not fault tolerant, servers.
Arp = 1 -
(Up)*(Up) = 1.0 - (0.0011673)*(0.0011673) = 0.9999986
The average
annual unscheduled downtime can be calculated as follows.
DTrp = 8765.52*(1-0.9999986)
= 0.01194 hours/year = 43 seconds/year.
The reduction
in complexity (one level versus three) allows a dual redundant PORTUS
system to reduce the expected annual downtime to about one third
of the level expected from a firewall sandwich using the same type of
systems.
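The effect of the number of layers can be seen by writing the availability as a
function of the layer count. The sketch below assumes identical nodes with the
unavailability of 0.0011673 quoted in the text. The function name and its
parameters are ours.

```python
SECONDS_PER_YEAR = 8765.52 * 3600
U_NODE = 0.0011673  # single-node unavailability as quoted in the text


def layered_availability(u_node, nodes_per_layer, layers):
    """Availability of `layers` redundant levels connected in series."""
    level = 1.0 - u_node ** nodes_per_layer
    return level ** layers


a_sandwich = layered_availability(U_NODE, nodes_per_layer=2, layers=3)
a_single_layer = layered_availability(U_NODE, nodes_per_layer=2, layers=1)

dt_sandwich = SECONDS_PER_YEAR * (1.0 - a_sandwich)
dt_single_layer = SECONDS_PER_YEAR * (1.0 - a_single_layer)
print(f"sandwich:     {dt_sandwich:.0f} seconds/year")
print(f"single layer: {dt_single_layer:.0f} seconds/year")
print(f"ratio:        {dt_single_layer / dt_sandwich:.3f}")
```

Because each redundant level is almost always up, the downtime of three levels
in series is very nearly three times the downtime of one level.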
High
Availability Redundant PORTUS Systems
The HARPS NIPS
further improves on the availability offered by the Dual Redundant
Load Balancing Firewalls through the use of fault tolerant hardware
and software. The hardware consists of two or more redundant nodes.
Each node contains multiple levels of hardware redundancy. In the
configuration used in this example, the processors have quadruple
redundancy. If a processor fails, the failing
processor is marked non-dispatchable and the transaction that was
running on it continues on one of the remaining processors, so the
system keeps running.
The estimated
availability of this configuration is computed as follows.
The fault tolerant
system has quadruple redundant processors with dynamic processor deallocation.
This significantly reduces the probability of a system crash or a
lost transaction due to a failing processor. Chipkill memory,
sometimes referred to as RAID, is 100 times more reliable than ECC
memory and virtually eliminates system outages due to a memory failure.
The power supplies and cooling fans are triple redundant. Spare Ethernet
adapters can be dynamically configured to take over for failing network
adapters. The hard drives are configured as a RAID to eliminate outages
due to a single disk drive failure. The net result is a significantly
longer MTBF, both hard and soft. For this calculation we have used
the conservative value of 100,000 hours for MTBFh, even though some models
of the hardware have exceeded 320,000 hours MTBF in customer locations.
We have used the conservative value of 8,000 hours for MTBFs even
though the observed value is more than 10,000 hours.
MTBFh = 100,000 hours
MTBFs = 8,000 hours
We will examine
the availability of this solution using these longer MTBF values and
the same MTTR values as were used in the other two
configurations. Let Aft be the availability of a single fault tolerant
PORTUS system.
Aft = 1.0
- ( MTTRh/(MTBFh+MTTRh)) - (MTTRs/(MTBFs+MTTRs))
Aft = 1 -(24/(100024)
- ( 0.125/(8000.125)) = 0.99977568.
The unavailability
of a single PORTUS system running on a fault tolerant system is
Uft.
Uft = 1.0
- Aft = 0.000224318.
The average
annual unscheduled downtime for a fault tolerant PORTUS system is
DTft.
DTft = 8765.52*Uft
= 8765.52*(0.000224318) = 1.96626 hours per year.
A PORTUS NIPS
running on a single fault tolerant system reduces the annual unscheduled
downtime from 10.2 hours to only 1.97 hours, an 80% reduction in downtime.
The High Availability
Redundant PORTUS System (HARPS) further reduces system downtime. Let
Aharp be
the availability of a HARPS system.
Aharp = 1
- (Uft)*(Uft) = 1.0 - (0.000224318)*(0.000224318) = 0.99999995
As you can see,
a HARPS system far exceeds five nines availability. In fact, it exceeds
six nines availability. The average annual unscheduled downtime can
be calculated as follows.
DTharp = 8765.52*(1-0.99999995)
= 0.00044 hours/year = 2 seconds/year.
The expected
annual downtime of a HARPS system is roughly 23,000 times smaller than
that of a standard firewall (10.2 hours) and roughly 80 times smaller
than that of a firewall sandwich (0.0358 hours).
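The "number of nines" of a configuration can be read directly from its
unavailability. The sketch below does this for the redundant fault tolerant
pair, taking the single-node unavailability of 0.000224318 quoted in the text as
its input. The helper name is ours.

```python
import math

SECONDS_PER_YEAR = 8765.52 * 3600
U_FT = 0.000224318  # single fault tolerant node, as quoted in the text


def nines(unavailability):
    """Count of leading nines in the availability (0.0000041 -> 5)."""
    return math.floor(-math.log10(unavailability))


u_harps = U_FT ** 2  # service is lost only if both nodes are down together
downtime_seconds = SECONDS_PER_YEAR * u_harps
print(f"unavailability = {u_harps:.3e}")
print(f"nines          = {nines(u_harps)}")
print(f"downtime       = {downtime_seconds:.2f} seconds/year")
```

With these inputs the pair reaches seven nines, and the expected downtime is
between one and two seconds per year.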
Cost
Analysis
High speed layer
4-7 web switches used for FLB with gigabit throughput cost upwards
of $35,000 each. A high performance application protection system
configured with dual 1 gigabit Ethernet adapters will cost approximately
$20,000.
Firewall Sandwich
| Component | Unit Cost | Qty | Extended Cost |
| FLB       | $35,000   | 4   | $140,000      |
| Firewall  | $20,000   | 2   | $40,000       |
| Total     |           |     | $180,000      |
Integrated Load Balancing Redundant Firewall
| Component | Unit Cost | Qty | Extended Cost |
| Firewall  | $20,000   | 2   | $40,000       |
| Total     |           |     | $40,000       |
High Availability Redundant PORTUS System
| Component | Unit Cost | Qty | Extended Cost |
| Firewall  | $40,000   | 2   | $80,000       |
| Total     |           |     | $80,000       |
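The totals in the cost tables, and the saving relative to the firewall sandwich,
can be reproduced with a short calculation. The unit prices and quantities are
those given in the tables. The data layout and names are ours.

```python
# (component, unit cost in dollars, quantity) for each configuration.
CONFIGS = {
    "Firewall sandwich": [("FLB", 35_000, 4), ("Firewall", 20_000, 2)],
    "Integrated load balancing redundant firewall": [("Firewall", 20_000, 2)],
    "HARPS": [("Firewall", 40_000, 2)],
}


def total_cost(items):
    """Sum of unit cost times quantity over all components."""
    return sum(unit_cost * qty for _, unit_cost, qty in items)


totals = {name: total_cost(items) for name, items in CONFIGS.items()}
baseline = totals["Firewall sandwich"]
for name, total in totals.items():
    saving = 100.0 * (1.0 - total / baseline)
    print(f"{name}: ${total:,} ({saving:.1f}% below the sandwich)")
```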
Architectural
Differences
The HARPS has
two redundant load sharing firewalls, each with quadruple redundant
processors. Up to three of the four processors in a system can fail without
loss of that system. A total of seven of the eight processors in the two
systems can fail and service will still be available. On the other
hand, a firewall sandwich can fail with the loss of only two processors.
If both processors are on the same level of the sandwich there will
be a complete loss of service. There are three ways in which a dual
processor failure can take down the firewall sandwich (one at
each level). Loss of a single processor on a HARPS system will
reduce throughput by only 12.5%. Loss of a single processor on a firewall
sandwich can reduce system throughput by up to 50%. These are some of the
many reasons why the HARPS system is more reliable and experiences
less downtime than the conventional firewall sandwich.
The reliability
(MTBF) of the integrated high availability firewall is greater than
that of the dual redundant firewall sandwich. With the firewall sandwich there
are six systems, each of which can be expected to be rebooted on
average every 4000 hours. With six systems the average time between reboot
events will be about 667 hours. That is, on average there will be a
minor disruption on a monthly basis. The disruption may last 5 to
10 seconds, while the systems automatically compensate for the failure.
However, one can expect system administrators to become involved in
the recovery process. With the redundant high availability solution
the average time between system reboots should be 4000 hours or more,
at least a six-fold increase in system reliability relative to the firewall
sandwich.
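The reboot intervals follow from assumption 6: when failures are exponentially
distributed, the failure rates of independent nodes add. The sketch below uses
the soft-failure MTBF of 4,000 hours for the six sandwich nodes and the
conservative 8,000 hours for the two fault tolerant nodes. The function name is
ours.

```python
def mean_time_between_events(node_mtbfs):
    """Mean time between failures anywhere in a group of independent nodes."""
    total_rate = sum(1.0 / mtbf for mtbf in node_mtbfs)
    return 1.0 / total_rate


sandwich_interval = mean_time_between_events([4_000] * 6)
harps_interval = mean_time_between_events([8_000] * 2)
print(f"sandwich: one reboot event every {sandwich_interval:.0f} hours")
print(f"HARPS:    one reboot event every {harps_interval:.0f} hours")
print(f"ratio:    {harps_interval / sandwich_interval:.1f}")
```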
Conclusion
The use of redundant
fault tolerant high availability firewalls with integrated workload
balancing provides higher levels of availability, less complexity
and lower purchase and operational costs than firewall sandwiches.
A firewall sandwich using redundant firewalls sandwiched between redundant
Firewall Load Balancers can reduce unscheduled outages from 10 hours
per year to less than 3 minutes per year. Use of fault tolerant firewall
systems with integrated workload balancing can further reduce unscheduled
outages to less than 3 seconds per year. The HARPS solution is simpler
to configure and reduces the initial installation cost by about 56%
relative to the firewall sandwich.