Lecture Two - System Availability Flashcards by Libby Griffiths

Information Technology (IT):

Encompasses technologies related to storage, retrieval, manipulation, and communication of information.
Includes computers, networks, phones, and fax machines.

Infrastructure Definition:

The underlying framework or features of a system or organization.
Fundamental facilities and systems serving a country, city, or area, such as transportation and communication systems.

Benefits of IT Infrastructure

Commonly Accepted Benefits:
Automates manual activities.
Handles increased volumes of data efficiently.
Extends the range of tasks that can be performed.
Enhances customer service quality.
Increases the quality of finished products.
Improves information sharing and manipulation capabilities.

Components of IT Infrastructure - Elements of IT Infrastructure

Business Process: Operations that support business goals.
Information and Data: Key resources for decision-making.
Applications and Servers: Software and hardware systems.
Buildings and Electricity Providers: Physical and power resources.
Hardware and Software: Essential computing equipment and programs.
Data & Storage: Systems for data management and retention.
Network Services: Connectivity and communication services.

IT System Model - System Layers

Process/Information: Core business processes and data handling.
Applications: Software tools and systems.
Application Integration: Ensures seamless operation and data flow.
Infrastructure: Physical and virtual resources.

IT System Model - Considerations

Availability: System uptime and reliability.
Performance: Efficiency and speed of operations.
Security: Protection against threats and vulnerabilities.
End User Devices: Interfaces for user interaction.
Operating Systems, Servers, Networks, Virtualisation, Data Centres: Core components for IT operation.

System Availability - Formula

Availability % = (Uptime / Measured Time Period) × 100
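
A minimal Python sketch of this calculation, treating availability as the share of the measured period during which the system was up. The function name and the example figures (a 720-hour month with 2 hours of downtime) are illustrative assumptions, not part of the lecture.

```python
def availability_percent(uptime_hours, period_hours):
    """Return availability as a percentage of the measured period."""
    if period_hours <= 0:
        raise ValueError("measured period must be positive")
    if not 0 <= uptime_hours <= period_hours:
        raise ValueError("uptime must lie between 0 and the measured period")
    return uptime_hours / period_hours * 100


# Hypothetical example: 30-day month (720 hours) with 2 hours of downtime.
print(round(availability_percent(718, 720), 3))
```

The result can never exceed 100, because uptime cannot be longer than the period it is measured in.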

System Availability and SLAs - Common Availability Levels

99.0%, 99.9%, 99.95% typically specified in SLAs.
99.999% known as carrier-grade availability.

System Availability and SLAs - Downtime Estimates

99.8%: 17.5 hours/year, 86.4 minutes/month, 20.2 minutes/week.
99.9%: 8.8 hours/year, 43.2 minutes/month, 10.1 minutes/week.
99.99%: 52.6 minutes/year, 4.3 minutes/month, 1.0 minute/week.
99.999%: 5.3 minutes/year, 25.9 seconds/month, 6.1 seconds/week.
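
Downtime budgets like these can be derived directly from the availability percentage. The sketch below is one way to do it; the 30-day month is an assumption, chosen because it matches the 43.2 minutes/month figure for 99.9%, and the names are illustrative.

```python
# Minutes in each period. The 30-day month is an assumption.
MINUTES_PER_PERIOD = {
    "year": 365 * 24 * 60,
    "month": 30 * 24 * 60,
    "week": 7 * 24 * 60,
}


def allowed_downtime_minutes(availability_percent):
    """Return the downtime budget in minutes per year, month, and week."""
    unavailable_fraction = 1 - availability_percent / 100
    return {
        period: minutes * unavailable_fraction
        for period, minutes in MINUTES_PER_PERIOD.items()
    }


for level in (99.8, 99.9, 99.99, 99.999):
    budget = allowed_downtime_minutes(level)
    print(level, {period: round(m, 2) for period, m in budget.items()})
```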

Unavailability Intervals - Definition

Used in conjunction with availability percentage to define acceptable downtime.
Example for 99.9% Availability:
525 minutes of downtime/year should not occur as a single event.
Downtime can be spread across many short events.

Unavailability Intervals - Interval Specifications

0 - 5 minutes: ≤ 35 times/year
5 - 10 minutes: ≤ 10 times/year
10 - 20 minutes: ≤ 5 times/year
20 - 30 minutes: ≤ 2 times/year
> 30 minutes: ≤ 1 time/year
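
The limits above can be checked against a year's outage log. This is a sketch under stated assumptions: outages are given as durations in minutes, and each band is taken to include its upper bound and exclude its lower bound, which the card does not specify.

```python
# (lower bound, upper bound, maximum events per year), bounds in minutes.
INTERVAL_LIMITS = [
    (0, 5, 35),
    (5, 10, 10),
    (10, 20, 5),
    (20, 30, 2),
    (30, float("inf"), 1),
]


def find_violations(outage_minutes):
    """Return (low, high, count, limit) for each band whose limit is exceeded."""
    violations = []
    for low, high, limit in INTERVAL_LIMITS:
        # Assumption: a band covers durations d with low < d <= high.
        count = sum(1 for d in outage_minutes if low < d <= high)
        if count > limit:
            violations.append((low, high, count, limit))
    return violations


# Hypothetical log: ten 3-minute outages and three outages of 20 to 30 minutes.
print(find_violations([3] * 10 + [25, 28, 22]))
```

In this hypothetical log the short outages stay within their limit, while the three outages in the 20 to 30 minute band exceed the limit of two.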

Estimating System Availability - SLAs

Provide upfront availability guarantees; actual availability is computed afterward.

Estimating System Availability - Estimation Factors

Mean Time to Repair (MTTR): Average time to repair/recover failed components.
Mean Time Between Failures (MTBF): Average time between failures.

Estimating System Availability - Timeline

Failure: Time when a system component fails.
Recovery: Time taken to repair the system.
MTTR and MTBF: Key metrics for assessing system reliability.

Estimating Availability with MTBF and MTTR

Estimated Availability % = (MTBF / (MTBF + MTTR)) × 100
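
As a rough illustration, the estimate can be computed as follows. The function name and the example values (an MTBF of 5,000 hours and an MTTR of 8 hours) are assumptions made for this sketch.

```python
def estimated_availability_percent(mtbf_hours, mttr_hours):
    """Estimate availability from mean time between failures and mean time to repair."""
    if mtbf_hours <= 0:
        raise ValueError("MTBF must be positive")
    if mttr_hours < 0:
        raise ValueError("MTTR cannot be negative")
    return mtbf_hours / (mtbf_hours + mttr_hours) * 100


# Hypothetical component: fails every 5,000 hours on average, 8 hours to repair.
print(round(estimated_availability_percent(5000, 8), 3))
```

Shortening MTTR raises the estimate just as lengthening MTBF does, which is why fast recovery matters as much as reliable components.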

Observed Availability and Failures

Failure Probability: Changes over time, typically following a bathtub curve.
Failure Phases:
Early Failures: Initial phase with higher failure rates.
Random Failures: Stable phase with constant failure rate.
Wear-Out Failures: Increased failures as components age.
Observed Availability: Influenced by component reliability and failure rates.

Multi-Component Availability

Systems comprise multiple components, each with its own availability.
A (system) = A1 x A2 x A3 x …, where A1, A2, A3, … are the availabilities of the individual components.
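
A short sketch of the product rule, assuming the components depend on each other in series (the system is up only when every component is up) and that availabilities are given as fractions between 0 and 1. The component values are hypothetical.

```python
from math import prod


def series_availability(component_availabilities):
    """Multiply the availabilities of components that must all be up."""
    for a in component_availabilities:
        if not 0 <= a <= 1:
            raise ValueError("each availability must be between 0 and 1")
    return prod(component_availabilities)


# Hypothetical chain: server 99%, network 99.9%, storage 99.5%.
print(round(series_availability([0.99, 0.999, 0.995]), 4))
```

The combined figure is always at or below the availability of the weakest component.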

System Availability and Components - Graphical Representation

System availability decreases as the number of components increases.
Visualizes availability for different component reliability (99%, 95%, 90%).
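
The trend in the graph can be reproduced numerically. This sketch assumes every component has the same availability, so the system figure is that availability raised to the number of components; the component counts chosen are arbitrary.

```python
def identical_series_availability(component_availability, component_count):
    """Availability of a chain of identical components that must all be up."""
    return component_availability ** component_count


# One row per component count; columns are 99%, 95%, and 90% components.
for count in (1, 5, 10, 20):
    row = [
        round(identical_series_availability(a, count) * 100, 1)
        for a in (0.99, 0.95, 0.90)
    ]
    print(count, row)
```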

System Availability with Multiple Components - Insight

Increasing the number of components increases the likelihood of system failures.

Redundancy in IT Systems - Purpose

Improves system availability and robustness by duplicating components/functions.
Acts as a backup to mitigate failures.

Redundancy in IT Systems - Cost Implications

Pros: Enhances reliability and reduces downtime.
Cons: Increases overall system cost.

Parallel System Availability

Availability improves as the number of systems/components in parallel increases.
Formula -
A (Parallel) = 1 - (1 - A)^m
A is the availability of a single system/component, m is the number in parallel.
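
A minimal sketch of the parallel formula, assuming the redundant units fail independently and any single working unit keeps the service up. The function name and example values are illustrative.

```python
def parallel_availability(a, m):
    """Availability of m identical units in parallel, each with availability a."""
    if not 0 <= a <= 1:
        raise ValueError("availability must be between 0 and 1")
    if m < 1:
        raise ValueError("at least one unit is required")
    # The group is down only when all m units are down at the same time.
    return 1 - (1 - a) ** m


# Hypothetical example: 99% units, from a single unit up to three in parallel.
for units in (1, 2, 3):
    print(units, round(parallel_availability(0.99, units), 6))
```

Each added unit multiplies the remaining unavailability by (1 - A), so the gain per unit shrinks while the cost per unit does not.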

Business Continuity

Disaster Events: Potential incidents like fires, natural disasters, or social unrest.
Preparedness: Businesses must prepare for contingencies to ensure continuity.
Disaster Recovery Plan (DRP):
Outlines procedures to protect and recover IT infrastructure.
Ensures minimal disruption and swift recovery from incidents.

Business Continuity Concepts

Downtime and Data Loss Metrics:
Recovery Time Objective (RTO):
Time needed to restore a business process.
Indicates the maximum allowable downtime.
Recovery Point Objective (RPO):
Maximum acceptable age of the data restored after an incident.
Commonly set at 24 hours, meaning up to a day of data created between the last backup and the incident may be lost.
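
One way to picture the two objectives is to check a single incident against them. The sketch below is illustrative only: the timestamps, the 4-hour RTO, and the function name are assumptions, and the 24-hour RPO is the common value mentioned above.

```python
from datetime import datetime, timedelta


def check_objectives(last_backup, incident, restored, rpo, rto):
    """Compare the data-loss window with the RPO and the downtime with the RTO."""
    data_loss_window = incident - last_backup  # data created since the last backup
    downtime = restored - incident             # time the process was unavailable
    return {
        "rpo_met": data_loss_window <= rpo,
        "rto_met": downtime <= rto,
    }


# Hypothetical incident: backup at 02:00, failure at 20:00, service back at 01:00.
result = check_objectives(
    last_backup=datetime(2024, 1, 1, 2, 0),
    incident=datetime(2024, 1, 1, 20, 0),
    restored=datetime(2024, 1, 2, 1, 0),
    rpo=timedelta(hours=24),
    rto=timedelta(hours=4),
)
print(result)
```

Here 18 hours of data fall within the 24-hour RPO, but 5 hours of downtime exceed the assumed 4-hour RTO.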

Trends in IT Infrastructure

Emerging Trends:
Cloud Computing
Bring Your Own Device (BYOD)
Green IT
Big Data Analytics

Cloud Computing

Paradigm Shift: Enabled by virtualization technologies.
Shared Resources: Applications utilize resources from a virtualized pool.
On-Demand: Resources scale up/down based on demand.
Deployment Models:
Public Cloud: Available to the general public.
Private Cloud: Dedicated to a single organization, managed internally or by a third party.
Hybrid Cloud: Combines public and private models.

Cloud Computing Service Models

Service Models:
Software-as-a-Service (SaaS): Provides software applications over the internet.
Platform-as-a-Service (PaaS): Offers hardware and software tools over the internet.
Infrastructure-as-a-Service (IaaS): Delivers computing infrastructure over the internet.
Future Lecture Topic: Further details on Cloud Computing to be covered later.

Bring Your Own Device (BYOD) - Trend Description

Employees use personal devices (smartphones, tablets, laptops) for work.
These devices access organizational applications and data for information processing and communication.

Benefits of Bring Your Own Device (BYOD)

No commitment to maintaining devices.
Low capital expenditure.
Agile decision-making and minimal operational expenditure.
Familiarity increases efficiency and productivity.
Potential cost savings from using employee contracts.

Risks of Bring Your Own Device (BYOD)

Security and Management Risks:
Multiple devices not fully controlled by system managers.
Data privacy and confidentiality concerns.
Potential malware carriers from external networks.
Ownership Concerns:
Employee-owned phone numbers might be dialled by business contacts.

Green IT

Environmental Goal: Reduce the environmental impact of IT infrastructures.
Strategies:
Reduce electricity usage and CO2 emissions.
Use greener equipment and increase efficiency.
Examples:
Flash disks instead of rotating disks.
Blade servers sharing power supplies to reduce consumption.

Power Usage Effectiveness (PUE)

Efficiency Metric: Measures the energy efficiency of a data centre.
PUE = Total Facility Energy / IT Equipment Energy

Power Usage Effectiveness (PUE) - Interpretation

PUE of 2.0 indicates that for every watt of IT power, an additional watt is used for cooling and distribution.
PUE closer to 1.0 indicates higher efficiency, with more energy used for computing.
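
A small sketch of this interpretation, taking PUE as total facility energy divided by IT equipment energy, which is the reading under which a value of 2.0 means one extra watt of overhead per watt of IT power. The function names and the energy figures are hypothetical.

```python
def pue(total_facility_energy_kwh, it_equipment_energy_kwh):
    """Power Usage Effectiveness: total facility energy per unit of IT energy."""
    if it_equipment_energy_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    if total_facility_energy_kwh < it_equipment_energy_kwh:
        raise ValueError("total facility energy includes the IT equipment energy")
    return total_facility_energy_kwh / it_equipment_energy_kwh


def overhead_per_it_unit(pue_value):
    """Energy spent on cooling and distribution per unit of IT energy."""
    return pue_value - 1


# Hypothetical data centre: 2,000 kWh drawn in total, 1,000 kWh reaching IT equipment.
value = pue(2000, 1000)
print(value, overhead_per_it_unit(value))
```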

Big Data Analytics - Data Generation

Generated by ubiquitous sensors, mobile telephony, surveillance cameras, RFID tags, and social networks.

Big Data Analytics - Data Characteristics

Raw data is often unstructured.
Investments focus on managing and maintaining large datasets.

Big Data Analytics

Analyzes large datasets to discover meaningful patterns, trends, and associations.
Enhances decision-making and strategic planning.