Definition of Terms
Definition: Data Center Automation
Consists of the processes and procedures implemented in a data
center environment for the purpose of automating day-to-day activities.
These automated activities consist of the following:
- System deployment, configuration, and implementation
- High Availability deployment, configuration, and implementation
- Disaster Recovery deployment, configuration, and implementation
- Business Continuity compliance, configuration, and implementation
- Network resources allocation and deallocation
- Storage resources allocation and deallocation
- Dynamic CPU and Memory allocation and deallocation
- Performance Monitoring
- Error detection and change request submissions
- Security Management
- Document Management
- Change Management
- Audit Management
- Service Level Agreements
Definition: Business Continuity
Consists of the activities performed on a daily basis to ensure the
business operates normally today, tomorrow, and beyond. Business
continuity is sometimes confused with disaster recovery, but they are
separate entities. Disaster recovery is a small subset of business
continuity. The business continuity plan may be thought of as a
methodology, or as an enterprise-wide mentality of conducting day-to-day
business.
The components of the business continuity plan and methodology are:
- Policies - those things that shall be done
- Guidelines - those things that should be done
- Standards - technical specifications derived from policies and guidelines
- Procedures - step-by-step instructions for implementation of standards
- Resource planning and deployment
- Organizational Structure
- Business Impact Analysis (BIA)
- Security Management
- Document Management
- Change Management
- Audit Management
- Service Level Agreements
Definition: Disaster Recovery
The implementation of a project plan which describes the tasks
necessary to recover critical business functions after a disaster. The
recovery occurs between geographically separated data centers using one
or more methods of storage replication between the data centers.
- Disaster Recovery Planning is the process of creating the disaster recovery project plan.
- The goal is to minimize downtime for business functions, not systems.
- Business function recovery times and maximum allowable data loss are specified during the business impact analysis.
- Implementation of a disaster recovery plan requires management declaration of a disaster.
- Initiation of the disaster recovery plan is a manual process.
Definition: High Availability (HA)
An automated process for minimizing business function downtime
associated with planned or unplanned outages. Typically utilizes
replicated hardware platforms to eliminate single points of failure. The
business function fail-over normally occurs between two or more physical
frames within the same data center using a single shared storage
system.
- Elimination of single points of failure (SPOFs) is a necessary part of HA.
- The goal is to minimize downtime for business functions, not systems.
- This is NOT non-stop computing; downtime will be experienced during fail-over.
Definition: Business Impact Analysis (BIA)
An examination of ALL business functions to determine those regarded
as critical. Should be conducted as a workshop involving executive
management and all departments in an organization. This is NOT an
information technology activity.
- Assignment of recovery time objectives for each business function.
- Assignment of recovery point objectives for each business function.
- Assignment of support tier associated with each business function.
- Assignment of Service Level Agreement associated with each support tier.
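The four assignments above can be captured as one small record per business function. The following is a minimal sketch, not a prescribed format: the class name `BusinessFunction`, the field names, the tier numbers, the availability percentages, and the sample functions are all hypothetical, and ordering by shortest recovery time objective is an assumption about how the results might be used.

```python
from dataclasses import dataclass

# Hypothetical mapping of support tier to SLA availability (percent per year).
TIER_SLA_PERCENT = {1: 99.9, 2: 99.5, 3: 99.0}


@dataclass(frozen=True)
class BusinessFunction:
    """One row of a business impact analysis (illustrative fields only)."""
    name: str
    rto_hours: float    # recovery time objective: maximum tolerable downtime
    rpo_hours: float    # recovery point objective: maximum tolerable data loss
    support_tier: int   # key into TIER_SLA_PERCENT

    @property
    def sla_percent(self) -> float:
        # The SLA follows from the support tier, not from any machine.
        return TIER_SLA_PERCENT[self.support_tier]


def recovery_order(functions):
    """Sort by shortest RTO, then shortest RPO (an assumed priority rule)."""
    return sorted(functions, key=lambda f: (f.rto_hours, f.rpo_hours))


# Made-up sample data for illustration only.
bia = [
    BusinessFunction("payroll", rto_hours=24, rpo_hours=4, support_tier=2),
    BusinessFunction("order entry", rto_hours=2, rpo_hours=0.25, support_tier=1),
    BusinessFunction("archive search", rto_hours=72, rpo_hours=24, support_tier=3),
]

for fn in recovery_order(bia):
    print(f"{fn.name}: RTO {fn.rto_hours}h, RPO {fn.rpo_hours}h, "
          f"tier {fn.support_tier}, SLA {fn.sla_percent}%")
```

Keeping the SLA on the tier rather than on the individual record mirrors the last assignment in the list: the agreement is attached to the support tier, and each business function inherits it.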
Definition: Service Level Agreement (SLA)
An agreement between a business function owner and the service
provider which designates the amount of time, on an annualized basis,
the business function will be available. Conversely, the SLA also
designates the amount of time, on an annualized basis, for which the
business function will NOT be available.
This should not be regarded as allowable downtime, but rather as
mandatory downtime that requires management approval to reschedule.
The SLA is associated with a business function, not with any
particular machine, system, or frame.
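Because the SLA is stated on an annualized basis, an availability percentage converts directly into hours of availability and unavailability per year. The following is a minimal sketch of that arithmetic, assuming a 365-day year; the function names and the example percentages are hypothetical and are not taken from any actual agreement.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours; leap years are ignored (assumption)


def annual_downtime_hours(availability_percent: float) -> float:
    """Hours per year the business function will NOT be available."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100 percent")
    return HOURS_PER_YEAR * (1.0 - availability_percent / 100.0)


def annual_uptime_hours(availability_percent: float) -> float:
    """Hours per year the business function will be available."""
    return HOURS_PER_YEAR - annual_downtime_hours(availability_percent)


# Example availability levels chosen for illustration only.
for percent in (99.0, 99.9, 99.99):
    print(f"{percent}% available: "
          f"{annual_uptime_hours(percent):.2f} h up, "
          f"{annual_downtime_hours(percent):.2f} h down per year")
```

For example, a 99.9% agreement works out to roughly 8.76 hours of downtime per year, which under the definition above is time to be scheduled rather than merely tolerated.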
Refers to a parameter that has one distinct value across any or all
platforms throughout the entire enterprise.
Definition: Frame
A physical computing device, which may host one or more partitions or
logical systems.
Definition: Partition (LPAR)
A logical grouping of resources such as CPU, memory, and network and SAN
adapters.
Definition: CPU
- May be shared between multiple systems simultaneously.
- Allocated and deallocated on an as-needed basis.
- SLA may address number of CPUs provided, not underlying processing units.
Definition: Memory
- Assigned to a single system at a time.
- May be allocated and deallocated on an as-needed basis.
- May be reassigned during off-peak processing times.
Definition: Virtual Adapter
- Virtual representations of physical adapters
- May be shared between multiple systems simultaneously
Definition: Live Partition Mobility
- An IBM POWER6 capability for moving a live, running partition from one frame to another
- Requires virtualized I/O
- Requires equal or greater CPU and Memory on target frame
- Requires synchronization of slot numbers between frames
- Not a high availability or disaster recovery solution
- Used to eliminate business function downtime when moving between two healthy systems, for example during:
  - Planned maintenance outages
  - Preemptive problem management
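The requirements listed above lend themselves to a pre-move checklist. The following is a minimal sketch of such a check, not an IBM tool or API: the `Partition` and `Frame` classes, their fields, and the sample values are hypothetical. Two interpretations are assumptions: "equal or greater CPU and Memory" is read as available capacity on the target frame, and slot synchronization is modeled as the partition's slot numbers being free on the target.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    name: str
    cpus: float
    memory_gb: float
    virtualized_io: bool
    slot_numbers: frozenset  # virtual adapter slot numbers in use


@dataclass(frozen=True)
class Frame:
    name: str
    available_cpus: float
    available_memory_gb: float
    free_slot_numbers: frozenset  # slot numbers not yet used on this frame


def mobility_blockers(partition: Partition, target: Frame) -> list:
    """Return reasons the move cannot proceed; an empty list means all checks pass."""
    blockers = []
    if not partition.virtualized_io:
        blockers.append("partition does not use virtualized I/O")
    if target.available_cpus < partition.cpus:
        blockers.append("target frame has insufficient CPU")
    if target.available_memory_gb < partition.memory_gb:
        blockers.append("target frame has insufficient memory")
    # Assumed model of slot synchronization: every slot number the
    # partition uses must be free on the target frame.
    if not partition.slot_numbers <= target.free_slot_numbers:
        blockers.append("slot numbers are not synchronized with target frame")
    return blockers


# Made-up sample data for illustration only.
lpar = Partition("lpar01", cpus=2.0, memory_gb=16.0,
                 virtualized_io=True, slot_numbers=frozenset({10, 11}))
roomy = Frame("frameB", available_cpus=4.0, available_memory_gb=32.0,
              free_slot_numbers=frozenset({10, 11, 12}))
tight = Frame("frameC", available_cpus=1.0, available_memory_gb=8.0,
              free_slot_numbers=frozenset({10}))

print(mobility_blockers(lpar, roomy))
print(mobility_blockers(lpar, tight))
```

Returning the full list of blockers, rather than stopping at the first failure, lets every gap on the target frame be corrected before a planned maintenance window.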