TruCluster Server High Availability Case Study
Test Bed Software: V1.6 Archive
The High Availability Software Configuration table lists the operating system, network, and services software configured on the production-level cluster.
The test bed software configuration duplicates the production-level environment's operating system version and patch level, as well as its NFS service, disk service, and tape service. It also uses LSM (Logical Storage Manager) extensively to mirror the root (/), /usr, and /var file systems.
The test environment does not duplicate all of the system software running on the production system, nor does it run every software application.
Our use of the production-level cluster and the test bed environment led to the following conclusions:
- It is more important to synchronize the test bed with the versions of a select group of software than to duplicate everything on the production-level cluster. Select that software based on the needs of your critical applications or on the functions being exercised (for instance, networking or security).
- Configure the test bed so that you can run load tests that simulate actual usage in the production environment.
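The first conclusion can be checked mechanically. The following is a minimal sketch, not part of the case study's tooling: it assumes you can export a name-to-version inventory from each environment, and every component name and version string shown is a hypothetical placeholder.

```python
def find_version_drift(production, test_bed, critical):
    """Return {name: (production_version, test_bed_version)} for every
    critical component whose versions differ between the environments.
    A component missing from an inventory is reported with version None."""
    drift = {}
    for name in critical:
        prod_version = production.get(name)
        test_version = test_bed.get(name)
        if prod_version != test_version:
            drift[name] = (prod_version, test_version)
    return drift


# Hypothetical inventories; names and versions are illustrative only.
production = {"operating_system": "5.1", "patch_kit": "3", "nfs_service": "2.0", "reporting_app": "7.4"}
test_bed = {"operating_system": "5.1", "patch_kit": "2", "nfs_service": "2.0"}

# Only the select group is compared; reporting_app is deliberately left out.
critical = ["operating_system", "patch_kit", "nfs_service"]

drift = find_version_drift(production, test_bed, critical)
print(drift)
```

Restricting the comparison to the `critical` list reflects the point above: the test bed does not need to match production everywhere, only for the software that matters to the functions under test.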
In addition to testing, you must develop a consistent roll-out procedure for quickly deploying software
changes from the test bed to your production-level cluster.
The test bed roll-out procedure that we use with the production-level cluster includes the following:
1. Before making software changes, measure performance on the production and test bed environments.
2. To modify the test bed, use standard tools, document each step, and record the downtime. For example, to patch system code or a TruCluster Software product, use the dupatch utility to install an aggregate patch kit on the test bed.
3. Run the test bed system. Depending on the complexity of the change, we run the test bed system for 10 to 20 days. For example, after installing a patch kit, we run the test bed for 10 days. After upgrading the operating system, we run the test bed for 20 days. Running the test bed includes the following activities:
   - Load tests that cover typical workloads as well as atypical situations (such as the simulation of low memory or full disks).
   - Execution of key business applications.
   - Standard and automated tests that exercise the subsystems that were changed.
4. Upon completion of step 3, re-measure test bed performance and compare it against the initial measurement. If the results are satisfactory, use the downtime records to schedule the deployment of the changed software to the production environment.
5. Use standard tools and the documented procedure from step 2 to install the changes on the production system.
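The before-and-after performance comparison in the roll-out procedure can be sketched as follows. This is an illustration under stated assumptions, not the procedure's actual tooling: the metric names, the sample values, and the 5 percent tolerance are hypothetical, and every metric is assumed to be one where a lower value is better.

```python
def find_regressions(baseline, after, tolerance=0.05):
    """Return {metric: fractional_increase} for metrics that worsened by
    more than `tolerance` relative to the baseline measurement.
    Assumes lower values are better and baseline values are nonzero."""
    regressions = {}
    for metric, before_value in baseline.items():
        change = (after[metric] - before_value) / before_value
        if change > tolerance:
            regressions[metric] = change
    return regressions


# Hypothetical measurements taken before the change and after the test run.
baseline = {"nfs_read_ms": 10.0, "failover_s": 40.0}
after = {"nfs_read_ms": 10.2, "failover_s": 50.0}

regressions = find_regressions(baseline, after)
if regressions:
    print("Do not schedule deployment; regressions:", regressions)
else:
    print("Results satisfactory; schedule deployment using the downtime records.")
```

What counts as "satisfactory" is a site decision; the tolerance here only marks where that decision would be encoded.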