Revision as of 06:06, 28 March 2012

Intro

How can we design an architecture that achieves the desired quality attributes?
Sources of architecture
- Theft: From previous systems, literature
- Method: Systematic and conscious, derived from requirements via transformations and heuristics.
- Intuition: Ability to conceive without conscious reasoning. Greater reliance on intuition increases risk.
The ratio in which these three sources are used varies with the architect's experience and the novelty of the system.

What is a tactic? A tactic is a design decision that influences the control of a quality attribute response.
A collection of tactics is an architectural strategy.
Each tactic is a design option for the architect.

Availability Tactics

All approaches to maintaining availability involve some type of redundancy, some type of health monitoring and some type of recovery when a failure is detected.
Availability tactics fall into three groups: fault detection, fault recovery and fault prevention.

Fault Detection

Ping/echo and heartbeat generally operate among distinct processes, whereas the exception tactic operates within a single process.

Ping/Echo

One component issues a ping to the component being checked and expects to receive an echo back within a predefined time.
The response time also allows performance to be assessed.
If the bandwidth consumed by pings is an issue, the ping/echo detectors can be organized in a hierarchy.
- Low-level detectors ping low-level processes, and higher-level fault detectors ping the lower-level detectors.
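
The ping/echo check described above can be sketched as follows. This is a minimal illustration, not a networked implementation: the `Component` class, its `latency` field and the 0.5 second timeout are all hypothetical, and the echo delay is simulated rather than measured.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Component:
    """Hypothetical monitored component; `latency` simulates its echo delay in seconds."""
    name: str
    latency: Optional[float]  # None means the component never answers

    def echo(self) -> Optional[float]:
        # A real component would answer over the network; here we only report the delay.
        return self.latency


def ping(target: Component, timeout: float) -> bool:
    """Return True if the target echoes within `timeout` seconds."""
    delay = target.echo()
    return delay is not None and delay <= timeout


def failed_components(targets: List[Component], timeout: float = 0.5) -> List[str]:
    """Names of the targets that did not echo in time."""
    return [t.name for t in targets if not ping(t, timeout)]


components = [Component("db", 0.1), Component("cache", None), Component("api", 0.9)]
print(failed_components(components))
```

A hierarchy of detectors could reuse `failed_components` at each level, with a higher-level detector pinging the lower-level detectors instead of the processes themselves.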

Heartbeat

One component emits a heartbeat message periodically and another component listens for it.
If the heartbeat is absent, the originating component is assumed to have failed.
Heartbeat messages can also carry useful data.
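
A listener for the heartbeat tactic might look like the sketch below. The `HeartbeatMonitor` class, the `max_silence` threshold and the explicit `now` timestamps are assumptions made for illustration; timestamps are passed in by the caller so the example stays deterministic.

```python
from typing import Any, Dict, List, Tuple


class HeartbeatMonitor:
    """Hypothetical listener that records the last heartbeat seen from each sender."""

    def __init__(self, max_silence: float) -> None:
        self.max_silence = max_silence
        # sender name -> (time of last heartbeat, payload carried by that heartbeat)
        self.last_seen: Dict[str, Tuple[float, Any]] = {}

    def beat(self, sender: str, now: float, payload: Any = None) -> None:
        """Record a heartbeat; the payload shows how useful data can ride along."""
        self.last_seen[sender] = (now, payload)

    def failed(self, now: float) -> List[str]:
        """Senders that have been silent for longer than `max_silence`."""
        return sorted(
            sender
            for sender, (seen_at, _) in self.last_seen.items()
            if now - seen_at > self.max_silence
        )


monitor = HeartbeatMonitor(max_silence=3.0)
monitor.beat("sensor-a", now=0.0, payload={"temp": 21})
monitor.beat("sensor-b", now=0.0)
monitor.beat("sensor-a", now=2.0, payload={"temp": 22})
print(monitor.failed(now=4.0))
```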

Exceptions

An exception is raised when a fault is encountered during execution.
The exception handler is then invoked; it typically executes in the same process that raised the exception.
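
As a small illustration of the exception tactic, the handler below runs in the same process as the code that raised the fault. The function names and the fallback value are hypothetical.

```python
def parse_reading(raw: str) -> float:
    """Convert a raw sensor string to a number; raises ValueError on malformed input."""
    return float(raw)


def safe_reading(raw: str, fallback: float = 0.0) -> float:
    """Detect the fault via the exception and handle it in the same process."""
    try:
        return parse_reading(raw)
    except ValueError:
        # The handler executes here, in the process that raised the exception.
        return fallback


print(safe_reading("3.5"))
print(safe_reading("n/a"))
```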

Fault Recovery

Fault recovery consists of preparing for recovery and making the actual system repair.

Voting

Processes running on redundant processors each take equivalent input and compute a simple output value that is sent to a voter.
If the voter detects deviant behaviour from a single processor, it fails that processor.
The voting algorithm can vary, e.g. "majority wins" or "preferred component".
Often used in control systems to correct faulty algorithms or processors.
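
A "majority wins" voter can be sketched as below. The function name, the dictionary of processor outputs and the choice to raise an error when no majority exists are assumptions for this example, not part of the notes.

```python
from collections import Counter
from typing import Dict, Hashable, List, Tuple


def majority_vote(outputs: Dict[str, Hashable]) -> Tuple[Hashable, List[str]]:
    """Return the majority value and the names of processors that deviated from it."""
    if not outputs:
        raise ValueError("no outputs to vote on")
    value, votes = Counter(outputs.values()).most_common(1)[0]
    if votes <= len(outputs) / 2:
        # Assumption: with no strict majority the voter refuses to pick a value.
        raise RuntimeError("no majority among redundant processors")
    deviants = sorted(name for name, out in outputs.items() if out != value)
    return value, deviants


value, to_fail = majority_vote({"p1": 42, "p2": 42, "p3": 41})
print(value, to_fail)
```

The returned list of deviants is what the voter would mark as failed.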

Active Redundancy (Hot restart)

There are N redundant components - all of which respond to events in parallel.
Only the response from one component is used; the rest are discarded.
Downtime is minimal, because backups are current and time to recover is only the switching time.
Example: a LAN with a number of parallel paths, with each redundant component on a separate path.
Synchronization is done by ensuring that all messages to any component are sent to all redundant components; a reliable transmission protocol may therefore be required.
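
The following sketch illustrates active redundancy under simplifying assumptions: the `Replica` and `ActiveGroup` classes are hypothetical, every event is delivered to all live replicas in a single process, and the first live replica's response is the one that is used.

```python
from typing import List, Tuple


class Replica:
    """Hypothetical redundant component that keeps a running total of the events it sees."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.total = 0
        self.alive = True

    def handle(self, amount: int) -> int:
        self.total += amount
        return self.total


class ActiveGroup:
    def __init__(self, replicas: List[Replica]) -> None:
        self.replicas = list(replicas)

    def dispatch(self, amount: int) -> Tuple[str, int]:
        """Send the event to every live replica; use one response and discard the rest."""
        results = [(r.name, r.handle(amount)) for r in self.replicas if r.alive]
        if not results:
            raise RuntimeError("all replicas have failed")
        return results[0]


group = ActiveGroup([Replica("a"), Replica("b")])
first = group.dispatch(5)
group.replicas[0].alive = False  # simulate failure of the replica whose output was used
second = group.dispatch(2)
print(first, second)
```

Because replica `b` processed the first event as well, it already holds the current total when it takes over, so recovery costs only the switch.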

Passive Redundancy (Warm restart)

One component (the primary) responds to events and informs the other components (the standbys) of status updates.
When a fault occurs, the system must ensure that the backup state on the standby is sufficiently fresh before resuming services.
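
A minimal sketch of passive redundancy follows. The `Node` and `PassiveGroup` classes are hypothetical, and the sketch assumes the primary pushes a status update to every standby after each event; a real system might update the standbys less often.

```python
from typing import Any, Dict, List


class Node:
    """Hypothetical component holding key/value state."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.state: Dict[str, Any] = {}


class PassiveGroup:
    def __init__(self, primary: Node, standbys: List[Node]) -> None:
        self.primary = primary
        self.standbys = list(standbys)

    def handle(self, key: str, value: Any) -> None:
        """Only the primary responds to the event, then informs the standbys."""
        self.primary.state[key] = value
        for standby in self.standbys:
            standby.state[key] = value  # status update keeps the backup state fresh

    def fail_over(self) -> Node:
        """Promote the first standby to primary after the primary fails."""
        if not self.standbys:
            raise RuntimeError("no standby available")
        self.primary = self.standbys.pop(0)
        return self.primary


group = PassiveGroup(Node("primary"), [Node("standby-1"), Node("standby-2")])
group.handle("x", 1)
new_primary = group.fail_over()
print(new_primary.name, new_primary.state)
```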

Spare

A standby spare platform is held ready to replace a failed component.
When a failure occurs, it must be rebooted into the appropriate software configuration and its state initialized to the point at which the failure occurred.
Therefore checkpoints of the system state must be made regularly.
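
One way to bring a spare up to the point of failure is sketched below. The notes only mention regular checkpoints; the event log replayed on top of the last checkpoint is an added assumption, as are the `Service` and `CheckpointStore` names and the checkpoint interval.

```python
import copy
from typing import Dict, List


class Service:
    """Hypothetical service whose state is a single counter."""

    def __init__(self) -> None:
        self.state: Dict[str, int] = {"count": 0}

    def apply(self, event: int) -> None:
        self.state["count"] += event


class CheckpointStore:
    """Takes a checkpoint every `interval` events and logs events since the last one."""

    def __init__(self, interval: int) -> None:
        self.interval = interval
        self.snapshot: Dict[str, int] = {"count": 0}
        self.log: List[int] = []

    def record(self, service: Service, event: int) -> None:
        """Call after the service has applied `event`."""
        self.log.append(event)
        if len(self.log) >= self.interval:
            self.snapshot = copy.deepcopy(service.state)
            self.log.clear()

    def restore(self) -> Service:
        """Initialize a spare from the last checkpoint, then replay the logged events."""
        spare = Service()
        spare.state = copy.deepcopy(self.snapshot)
        for event in self.log:
            spare.apply(event)
        return spare


store = CheckpointStore(interval=2)
primary = Service()
for event in [1, 2, 3]:
    primary.apply(event)
    store.record(primary, event)

spare = store.restore()
print(spare.state)
```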

Repair Tactics / Component Reintroduction

When a redundant component fails, it may be reintroduced after it has been repaired.

Shadow operation

A previously failed component may run in shadow mode, mimicking the behaviour of the working components, for a short time before it is made operational again.
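
Shadow operation could be checked along the lines of the sketch below. The acceptance rule (a fixed number of consecutive outputs matching a working component) is an assumption chosen for illustration, and all names are hypothetical.

```python
from typing import Callable, Iterable


def run_in_shadow(
    candidate: Callable[[int], int],
    reference: Callable[[int], int],
    inputs: Iterable[int],
    required_matches: int,
) -> bool:
    """Return True once the candidate matches the reference on enough consecutive inputs."""
    matches = 0
    for x in inputs:
        if candidate(x) == reference(x):
            matches += 1
        else:
            matches = 0  # any deviation restarts the trial period
        if matches >= required_matches:
            return True
    return False


def working(x: int) -> int:
    return x * 2


def repaired(x: int) -> int:
    return x + x


def still_broken(x: int) -> int:
    return x * 2 if x < 2 else 0


print(run_in_shadow(repaired, working, range(5), required_matches=3))
print(run_in_shadow(still_broken, working, range(5), required_matches=3))
```

While in shadow mode the candidate's outputs are only compared, never used; the component would be made operational only after the function returns True.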

State resynchronization

Checkpoint/Rollback

Difference between revisions of "Architecture Tactics"
