High Availability Clustering
They don’t have big TV awards shows for our industry. No red carpet for the tech guys. We’ve come to terms with this. But if we were famous, it would be for the breadth of High Availability Architecture solutions we’ve designed and implemented over the years. We excel in delivering the highest level of quality in the HA world.
Of course, HA isn’t the only component of systems availability; availability has to be measured across the entire business process. End users don’t care why their application isn’t working, only whether it works. We have a thorough, detailed methodology for evaluating the big picture.
In addition to server failover within datacenters, OST has extensive experience with clustering between sites using software-based remote data replication products such as Quest SharePlex, Oracle, Veritas Volume Replicator, and GoldenGate, as well as array-based remote data replication products such as SRDF, MirrorView, Continuous Access, and others. Some of our customers fail over to their disaster recovery sites for maintenance purposes, and others use their disaster recovery servers for production applications.
OST has senior-level skills in designing and implementing the following high availability clustering (system failover) solutions:
- IBM HACMP
- HP ServiceGuard
- Veritas Cluster Server
- Oracle RAC (Real Application Clusters)
- Microsoft Cluster Services (MSCS)
OST has developed custom scripts for many applications, which allow for quicker implementation times. Enterprise software applications for which OST offers specific clustering solutions include:
- Oracle Applications
- JD Edwards
- Dynamics AX
OST realizes that the high availability cluster is just one component of systems availability. Availability must be considered broadly, as meeting the service level for the entire business process. When architecting data centers for availability, OST considers all areas of availability, including the following:
- Power and Facilities
- Hardware – redundant systems
- Physical Infrastructure – power, cooling, etc.
- Network – redundant NICs, cabling, switches
- Disk – mirrored disk, RAID, volume and file system management
- SAN – redundant SAN switches, cabling, HBAs, multipathing
- WAN – facilities, diverse routing, multiple paths, multiple carriers
- Cluster software
- Multiple App Servers with Load Balancing or Application Brokers, Secure Messaging Infrastructure
- Event Monitoring
- Network Services (DNS, NFS, WAN, NIS, etc.)
- Administrative Test Systems
- Backup and Recovery
- Data Replication
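One way to see why the cluster is only one piece of the picture: when every layer in the list above has to be up for the business process to work, the end-to-end availability is roughly the product of the individual availabilities. The sketch below illustrates that arithmetic in Python. It assumes component failures are independent, and the component names, availability figures, and function names are hypothetical and chosen only for illustration.

```python
# Illustrative only: component names and availability figures are hypothetical.
from math import prod


def serial_availability(availabilities):
    """Availability of a chain in which every component must be up.

    Assumes component failures are independent; values are fractions (0..1).
    """
    return prod(availabilities)


def redundant_availability(availability, copies):
    """Availability of `copies` redundant components where one is enough.

    Assumes independent failures and instantaneous, flawless failover.
    """
    return 1 - (1 - availability) ** copies


if __name__ == "__main__":
    chain = {"power": 0.9999, "network": 0.999, "san": 0.9995, "cluster": 0.9999}
    print(f"end-to-end: {serial_availability(chain.values()):.4%}")
    print(f"two 99% servers: {redundant_availability(0.99, 2):.4%}")
```

Under these assumptions the chain can never be more available than its weakest layer, which is why redundancy is considered at each layer rather than only at the cluster. Real failover is neither instantaneous nor perfectly independent, so treat the redundancy figure as an upper bound.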
The following chart is useful when calculating the number of hours or days of downtime based on annual uptime percentages. For example, when a vendor claims “5 nines” of uptime, it equates to about 5 minutes and 15 seconds of downtime per year.
| Uptime Percentage | Downtime Percentage | Downtime Per Year | Downtime Per Week |
|---|---|---|---|
| 98 | 2 | 7.3 days | 3.3 hours |
| 99 | 1 | 3.65 days | 1.7 hours |
| 99.9 | 0.1 | 8.76 hours | 10.1 min |
| 99.99 | 0.01 | 52.5 min | 1 min |
| 99.999 | 0.001 | 5.25 min | 6 sec |
| 99.9999 | 0.0001 | 31.5 sec | 0.6 sec |
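The arithmetic behind the chart can be sketched in a few lines of Python. This is a minimal illustration, not a tool OST ships: it assumes a 365-day year, and the function and constant names are hypothetical.

```python
# Illustrative helper: convert an uptime percentage into allowed downtime.
# Assumes a 365-day year; the names used here are hypothetical.

SECONDS_PER_YEAR = 365 * 24 * 60 * 60
SECONDS_PER_WEEK = 7 * 24 * 60 * 60


def downtime_seconds(uptime_percent: float, period_seconds: int) -> float:
    """Return the downtime, in seconds, allowed in a period at the given uptime."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return period_seconds * (100 - uptime_percent) / 100


if __name__ == "__main__":
    for uptime in (98, 99, 99.9, 99.99, 99.999, 99.9999):
        per_year = downtime_seconds(uptime, SECONDS_PER_YEAR)
        per_week = downtime_seconds(uptime, SECONDS_PER_WEEK)
        print(f"{uptime}%: {per_year / 60:.2f} min/year, {per_week:.1f} sec/week")
```

For “5 nines” this works out to roughly 315 seconds, or a little over 5 minutes and 15 seconds, of downtime per year.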