Difference between revisions of "Architecture Tactics"
From Suhrid.net Wiki
Latest revision as of 10:14, 23 April 2012
Contents
- 1 Intro
- 2 Availability Tactics
- 3 Modifiability Tactics
- 4 Performance Tactics
- 5 Security Tactics
- 6 Usability Tactics
- 7 Testability Tactics
Intro
- How can we design an architecture that will achieve the desired quality attributes?
- Sources of architecture
- Theft: From previous systems, literature
- Method: Systematic and conscious, derived from requirements via transformations and heuristics.
- Intuition: Ability to conceive without conscious reasoning. Increased reliance on intuition increases the risk.
 
- Ratio of usage of the above three sources varies according to the architect's experience and the novelty of the system.
- What is a tactic? A tactic is a design decision that influences the control of a quality attribute response.
- A collection of tactics is an architectural strategy.
- Each tactic is a design option for the architect.
Availability Tactics
- All approaches to maintaining availability involve some type of redundancy, some type of health monitoring and some type of recovery when a failure is detected.
- Faults cause failures. Availability tactics focus on dealing with faults.
- Availability tactics involve- Fault detection, fault recovery and fault prevention.
Fault Detection
- Ping/echo and heartbeat generally operate among distinct processes, while the exception tactic operates within a single process.
Ping/Echo
- One component issues a ping to a component to be checked and expects to receive back an echo within a predefined time.
- Response time allows performance to be assessed.
- If bandwidth consumption of pings is an issue, then the ping/echo detectors can be organized in a hierarchy.
- Low-level detector pings low level processes and higher level fault detectors ping lower level ones.
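A minimal sketch of the ping/echo check described above. The `ping` callable is a hypothetical stand-in for the real round trip to the monitored component, and the default timeout is an arbitrary assumption. Note that this sketch measures the response time after the fact; it does not interrupt a ping that blocks.

```python
import time
from typing import Callable


def check_component(ping: Callable[[], bool], timeout_s: float = 0.5) -> bool:
    """Return True if the component echoes successfully within timeout_s seconds."""
    start = time.monotonic()
    try:
        echoed = ping()  # hypothetical round trip to the component
    except Exception:
        return False  # an error while pinging counts as a missed echo
    elapsed = time.monotonic() - start
    # The echo must arrive and must arrive within the predefined time.
    return bool(echoed) and elapsed <= timeout_s
```

The measured `elapsed` value is also what would let a detector assess performance, as the notes mention.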
 
Heartbeat
- One component emits a heartbeat message periodically and another component listens for it.
- Absence of heartbeat means originating component has failed.
- Heartbeat messages can be combined with useful data.
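The listening side of the heartbeat tactic could be sketched as follows. The class name, the `max_silence_s` threshold and the injectable `clock` are assumptions made for illustration.

```python
import time


class HeartbeatMonitor:
    """Tracks the last heartbeat per component and reports the silent ones."""

    def __init__(self, max_silence_s: float, clock=time.monotonic):
        self.max_silence_s = max_silence_s
        self.clock = clock          # injectable so the monitor can be tested
        self.last_seen = {}         # component name -> time of last heartbeat

    def beat(self, component: str) -> None:
        """Called whenever a heartbeat message arrives from a component."""
        self.last_seen[component] = self.clock()

    def failed(self) -> list:
        """Components whose heartbeat has been absent for too long."""
        now = self.clock()
        return sorted(
            name for name, seen in self.last_seen.items()
            if now - seen > self.max_silence_s
        )
```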
Exceptions
- Exceptions encountered during operation.
- Exception handler is invoked which typically executes in the same process that introduced the exception.
Fault Recovery
- Fault recovery consists of preparing for recovery, making the actual system repair, and reintroducing components after repair.
Preparation and Repair Tactics
Voting
- Processes running on redundant processors each take equivalent input and compute a simple output value that is sent to a voter.
- If the voter detects deviant behaviour from a single processor, it fails that processor.
- Different choices of voting algorithm - "majority wins" or "preferred component".
- Often used in control systems to correct faulty algorithms or processors.
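A minimal sketch of a "majority wins" voter, assuming each redundant processor reports a single hashable output value. The function name and the choice to raise when there is no strict majority are illustrative assumptions.

```python
from collections import Counter


def majority_vote(outputs: dict):
    """outputs maps processor name -> computed value.

    Returns (winning value, sorted list of deviant processors).
    """
    if not outputs:
        raise ValueError("no outputs to vote on")
    value, votes = Counter(outputs.values()).most_common(1)[0]
    if votes <= len(outputs) / 2:
        raise ValueError("no majority")
    # Processors that disagree with the majority are the ones to fail.
    deviants = sorted(p for p, v in outputs.items() if v != value)
    return value, deviants
```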
Active Redundancy (Hot restart)
- There are N redundant components - all of which respond to events in parallel.
- Response/output from only one component is used though and rest are discarded.
- Downtime is minimal, because backups are current and time to recover is only the switching time.
- E.g. LAN with a number of parallel paths and redundant component in a separate path.
- Synch is done by ensuring that all msgs to any component are sent to all redundant components, therefore a reliable transmission protocol may be required.
Passive Redundancy (Warm restart)
- One component (the primary) responds to events and informs the other components (the standbys) of status updates.
- When a fault occurs, backup state on standby must be fresh before resuming services.
Spare
- Standby spare platform.
- Must be rebooted to the appropriate software config and the state must be initialized to the point where the failure occurs.
- Therefore checkpoints of the system state must be made regularly.
Repair Tactics / Component Reintroduction
- When a redundant component fails, it may be reintroduced after it has been repaired.
Shadow operation
- The previously failed component may be made to run in shadow mode to mimic behaviour of working components for a short time before making it operational.
State resynchronization
- Restored component must have its state upgraded before return to service.
- Passive and active redundancy tactics require this.
- Ideal approach to update the state is a single atomic message. Incremental state upgrades lead to complicated software.
Checkpoint/Rollback
- A checkpoint is a recording of a consistent state, made either periodically or in response to specific events.
- System can be restored using a previous consistent checkpoint and a log of transactions since the last checkpoint was taken.
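The checkpoint-plus-log recovery described above can be sketched with an in-memory key/value store. The class and method names are hypothetical, and a real system would persist both the checkpoint and the log.

```python
import copy


class CheckpointedStore:
    """Key/value state with a checkpoint and a log of later operations."""

    def __init__(self):
        self.state = {}
        self._checkpoint = {}   # last consistent snapshot
        self._log = []          # operations applied since that snapshot

    def apply(self, key, value):
        self.state[key] = value
        self._log.append((key, value))

    def checkpoint(self):
        """Record the current consistent state and start a fresh log."""
        self._checkpoint = copy.deepcopy(self.state)
        self._log.clear()

    def recover(self):
        """Rebuild state from the last checkpoint plus the logged operations."""
        state = copy.deepcopy(self._checkpoint)
        for key, value in self._log:
            state[key] = value
        self.state = state
        return state
```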
Fault Prevention
Removal from Service
- Removes a component from operation to undergo activities to prevent anticipated failures.
- For e.g. rebooting a component regularly to prevent memory leaks from causing a failure.
- The architectural strategy must be designed to support it.
Transactions
- Bundling together of several actions so that entire bundle can be undone at once.
- If one action fails, the entire transaction fails.
- Intermediate data doesn't corrupt output or affect the rest of the system.
- Lock shared data when multiple threads access it.
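A minimal all-or-nothing sketch of the transaction tactic. It works on a private copy so that intermediate data never reaches the shared dictionary, and it holds a lock while doing so. The function name and the copy-then-swap approach are assumptions, not the only way to implement transactions.

```python
import copy
import threading

_lock = threading.Lock()


def run_transaction(data: dict, actions) -> bool:
    """Apply every action to data, or none of them if any action raises."""
    with _lock:  # lock shared data so other threads see no intermediate state
        working = copy.deepcopy(data)
        try:
            for action in actions:
                action(working)
        except Exception:
            return False  # the whole bundle is undone; data is untouched
        data.clear()
        data.update(working)
        return True
```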
Process Monitor
- Detect and shut down failed processes.
- New process instance created and state recovered.
Modifiability Tactics
- Goal is to control time and cost to implement, test and deploy changes.
Localize Modifications
- Goal of these tactics is to assign responsibilities to modules during design such that anticipated changes will be limited in scope.
Maintain semantic coherence
- Responsibilities should work together without excessive reliance on other modules.
Abstract common services
- Providing common services through specialized modules supports modifiability: a change to a service is made in one place.
Anticipate expected changes
- Considering set of future changes helps to evaluate assignment of responsibilities.
Generalize the module
- Make a module compute a broader range of functions based on input. For e.g. constants can be passed in as input parameters.
- Basically, more general a module is, the more likely that requested changes can be made by adjusting the input rather than by modifying the module.
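As a small illustration of passing constants in as input parameters, the hypothetical function below takes its rate and fee as arguments instead of hard-coding them. The names and default values are invented for the example.

```python
def shipping_cost(weight_kg: float, rate_per_kg: float = 2.0, flat_fee: float = 5.0) -> float:
    """Cost computed from inputs, so a pricing change needs no code change."""
    return flat_fee + rate_per_kg * weight_kg
```

A requested change such as a new rate is then made by adjusting the input, for example `shipping_cost(3, rate_per_kg=1.0, flat_fee=0.0)`, rather than by modifying the module.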
Limit possible options
- Restricting possible change options can reduce effect of modifications.
- For e.g. restricting processors to members of a certain family limits the options and reduces the effect of modifications.
Prevent ripple effects
- A ripple effect from a modification is the necessity of making changes to modules not directly affected by it.
- Various types of dependencies one module can have on another:
- Syntax of data and service.
- Semantics of data and service.
- Sequence of data : e.g. protocol sequence
- Sequence of control: e.g. A must have executed no more than 5 ms before B executes.
- Identity of an interface of a module: Id (name/handle) of an interface of A must be consistent with assumptions of B.
- Runtime location of A: must be consistent with B's assumptions for B to execute correctly.
- QOS of service/data provided by A. e.g. accuracy must be within a certain range.
- Existence of A: A must exist for B to execute correctly.
- Resource behaviour of A: e.g. use of memory or resource ownership.
 
Hide Information
- Oldest technique. Hide private data.
Maintain existing interfaces
- Creating abstract interfaces to mask variations.
- Add interfaces, adapters, providing a stub (proxy pattern).
Restrict communication paths
- Reduce the number of modules that produce data for, or consume data from, the module.
Use an intermediary
- For non-semantic dependencies, add an intermediary between B and A that manages activities associated with the dependency.
- Data (syntax) : Convert the syntax produced by A into the syntax B expects.
- Service (syntax) : Facade, Proxy, Factory : provide intermediaries that convert syntax of a service from A to B.
- Identity of an interface: Broker pattern
- Location of A (Runtime) : Name server. LDAP etc.
- Resource behaviour: Introduce a resource manager.
- Existence of A: Factory pattern.
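A minimal sketch of an intermediary for a data-syntax dependency. Module A is assumed to emit a `key=value;key=value` string while module B is assumed to expect a dictionary; both modules and both formats are hypothetical.

```python
def module_a_report() -> str:
    """Hypothetical producer A: emits readings in its own text syntax."""
    return "temp=21.5;unit=C"


def adapt_a_to_b(raw: str) -> dict:
    """Intermediary: converts A's syntax into the dictionary B expects."""
    return dict(pair.split("=", 1) for pair in raw.split(";") if pair)


def module_b_display(reading: dict) -> str:
    """Hypothetical consumer B: only knows about dictionaries."""
    return f"{reading['temp']} {reading['unit']}"
```

If A later changes its output syntax, only `adapt_a_to_b` needs to change, so the ripple stops at the intermediary.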
 
Defer Binding Time
- Deferring binding time reduces time to deploy and allows non-developers (sys admins and end users) to make changes.
- Tactics:
- Runtime registration: Pub/sub registration.
- Config files: set params at startup.
- Polymorphism: Late binding of method calls.
- Component replacement: allows load time binding.
- Adherence to defined protocols: Allows runtime binding of independent processes.
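Runtime registration in the publish/subscribe style could be sketched as below. `EventBus` and its method names are hypothetical; the point is that handlers are bound to topics while the system runs rather than at compile time.

```python
from collections import defaultdict


class EventBus:
    """Handlers register for topics at runtime; publishers never name them."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, payload) -> int:
        """Deliver payload to every current subscriber; return how many ran."""
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(payload)
        return len(handlers)
```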
Performance Tactics
- Goal of performance tactics is to generate a response to an event arriving at the system within some time constraint.
- Main thing is to control the time within which a response is generated - the latency.
- Two basic contributors to response time:
- Resource consumption: CPU, database, network, memory, internal entities such as buffers. All these contribute to latency.
- Blocked time: Blocking can happen due to various reasons:
- Contention: Multiple events compete for the resource.
- Availability: Resource may be unavailable for some reason (e.g. failure - network down)
- Dependency on other computation: e.g. data must be loaded from the DB into a cache before it can be read, which adds latency.
 
 
Resource Demand
One tactic is to reduce the resources required:
Increase Computational Efficiency
- Use efficient algorithms.
Reduce computational overhead
- Eliminate intermediaries (e.g. RMI, which adds a lot of overhead).
- This is a trade-off between modifiability and performance.
Another tactic is to reduce the number of events processed:
Manage Event Rate
- Reduce sampling rate - there can be unnecessary oversampling.
Control Frequency of Sampling
- If no control over the arrival of external events - queued requests can be sampled at lower frequency.
Control the use of resources
Bound execution times
- Place a limit on how much execution time is used to respond to an event - e.g. limit the time given to an algorithm.
Bound queue sizes
- Control the maximum number of queued arrivals.
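A minimal sketch of bounding a queue with Python's standard `queue.Queue`. The helper name `offer` and the policy of rejecting new arrivals when the queue is full are assumptions; dropping the oldest item would be another valid policy.

```python
import queue


def offer(q: queue.Queue, item) -> bool:
    """Try to enqueue without blocking; report whether the item was accepted."""
    try:
        q.put_nowait(item)
        return True
    except queue.Full:
        return False  # arrival rejected because the bound was reached
```

A queue created with `queue.Queue(maxsize=2)` then never holds more than two pending arrivals, which caps the resources spent on processing them.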
Resource Management
- Even when resource demand is not controllable, the management of resources affects response times.
Introduce Concurrency
- Parallelizing processing can reduce blocking times.
Maintain multiple copies of either data or computations
- In client-server architecture use caching to reduce contention.
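Caching on the client side can be sketched with `functools.lru_cache`. Here `fetch_profile` is a hypothetical stand-in for a call to the server, and the `calls` counter exists only to show how often the server would really be contacted.

```python
from functools import lru_cache

calls = {"count": 0}  # counts simulated trips to the server


@lru_cache(maxsize=128)
def fetch_profile(user_id: int) -> str:
    """Stand-in for a server request; repeated ids are served from the cache."""
    calls["count"] += 1
    return f"profile-{user_id}"
```

Keeping a copy trades freshness for speed: the cached copy has to be kept consistent with the server's data, which this sketch does not address.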
Increase available resources
- Faster processors, additional processors
- Add more memory, network bandwidth.
- Trade-off between cost and performance.
Resource Arbitration
- Whenever there is contention for a resource, the resource must be scheduled.
- Basically, a scheduling policy for the resource.
- Scheduling policies can be:
FIFO Scheduling
- All requests for resources are treated equally and are satisfied in turn.
Fixed Priority Scheduling
- Assigns each request a particular priority and assigns resources in priority order.
- Priority can be assigned according to
- Semantic importance: According to domain characteristics.
- Deadline monotonic: Higher priority to shorter deadlines.
- Rate monotonic: Higher priority to streams with shorter periods.
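Rate monotonic priority assignment can be sketched as a sort by period. The stream names and periods used in the test are invented, and breaking ties by name is an arbitrary assumption.

```python
def rate_monotonic_order(periods_ms: dict) -> list:
    """Return stream names from highest to lowest priority.

    Shorter period means higher priority; ties are broken by name.
    """
    return sorted(periods_ms, key=lambda name: (periods_ms[name], name))
```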
 
Dynamic Priority Scheduling
- Round robin: Orders requests and assigns resource to next request in round robin order.
- Earliest deadline first (EDF): Assigns priorities based on pending requests with the earliest deadline.
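A minimal sketch of earliest-deadline-first selection using a heap. `EdfScheduler` is a hypothetical name, deadlines are assumed to be comparable numbers, and a sequence counter keeps requests with equal deadlines in arrival order.

```python
import heapq


class EdfScheduler:
    """Always hands the resource to the pending request with the earliest deadline."""

    def __init__(self):
        self._heap = []
        self._seq = 0  # tie-breaker so equal deadlines keep arrival order

    def submit(self, deadline, request):
        heapq.heappush(self._heap, (deadline, self._seq, request))
        self._seq += 1

    def next_request(self):
        """Pop the most urgent pending request, or None if nothing is pending."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]
```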
Static Scheduling
- Sequence of assignment of resources is determined offline.
Security Tactics
- Can be divided into three types of tactics: resisting attacks (e.g. lock), detecting attacks (e.g. sensor), recovering from attacks (e.g. insurance).
Resisting attacks
- Address the requirements of security of a system.
- Authenticate users: Users are who they claim to be. Passwords etc.
- Authorize users: Authenticated user has the rights to access and modify either data or services. Access Control Systems etc.
- Maintain data confidentiality: Encryption, public key authentication.
- Maintain integrity: Redundant information encoded in it - e.g. checksums, hash results.
- Limit exposure: Attacks typically exploit a single weakness to reach all data and services on a host. The architect can allocate services to hosts so that only limited services are available on each host.
- Limit access: Limit access from sources using firewalls. Establish DMZ. DMZ is a subnet that exists between the firewall protecting the internal LAN and the wider internet. Hosts within DMZ have limited connectivity to internal hosts while communicating with external hosts.
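The integrity tactic above (redundant information such as checksums or hash results) can be sketched with the standard `hashlib` module. The function names are hypothetical. Note the assumption: a plain hash detects changes only if the stored digest itself is kept out of the attacker's reach.

```python
import hashlib
import hmac


def digest(data: bytes) -> str:
    """Redundant information stored or sent alongside the data."""
    return hashlib.sha256(data).hexdigest()


def is_intact(data: bytes, expected_digest: str) -> bool:
    """Recompute the hash and compare it with the stored digest."""
    return hmac.compare_digest(digest(data), expected_digest)
```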
Detecting attacks
- Intrusion detection systems. e.g. compare network traffic patterns against known ones.
- Artificial immune systems
- Set traps
Recovering from attacks
- Can be divided into restoring state and identifying attacker.
- Recovering state: related to availability tactics.
- Especially maintain redundant copies of sys admin data such as passwords, ACLs, and user profile data.
 
- Tactics for identifying attackers: Maintain an audit trail. The audit trail can be used to trace the actions of an attacker, support non-repudiation and support system recovery.
Usability Tactics
Runtime Tactics
- Support user initiative:
- Undo, redo, aggregate: All require architectural consideration.
 
- Support system initiative:
- Maintain a task model: Task analysis. e.g. auto correct beginning of sentences to capital.
- Maintain a model of user: e.g. Personas. How much the user knows about the system, the user's behaviour in terms of expected response times. e.g. Page scrolling rate.
- Maintain a model of the system: Determines expected system behaviour so that appropriate feedback can be given to user. e.g. time needed to complete certain activity.
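To illustrate why undo and redo from the user-initiative tactic need architectural consideration, here is a minimal sketch that keeps two history stacks. `UndoableText` is a hypothetical class, and storing full snapshots rather than reversible commands is a simplifying assumption.

```python
class UndoableText:
    """Text buffer that remembers earlier states so edits can be reversed."""

    def __init__(self):
        self.text = ""
        self._undo = []  # states to return to on undo
        self._redo = []  # states to return to on redo

    def type(self, s: str) -> None:
        self._undo.append(self.text)
        self._redo.clear()  # a new edit invalidates the redo history
        self.text += s

    def undo(self) -> None:
        if self._undo:
            self._redo.append(self.text)
            self.text = self._undo.pop()

    def redo(self) -> None:
        if self._redo:
            self._undo.append(self.text)
            self.text = self._redo.pop()
```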
Design time Tactics
- Separate UI from rest of application: Allows UI developers to frequently change the UI and maintain code separately.
- Classic design pattern to implement this tactic: Model-View-Controller (MVC).
Testability Tactics
Manage input/output
- Record/playback: Capture information crossing an interface and use it as input to the test harness.
- Separate interface from implementation: Stubbing implementation allows rest of system to be tested in absence of stubbed component.
- Specialize access routes/interfaces: Have specialized testing interfaces.
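Record/playback could be sketched as a wrapper that captures what crosses an interface, plus a helper that replays the capture against another implementation. `Recorder` and `playback` are hypothetical names, and only positional arguments are handled.

```python
class Recorder:
    """Wraps a callable and records every (arguments, result) pair."""

    def __init__(self, func):
        self.func = func
        self.calls = []

    def __call__(self, *args):
        result = self.func(*args)
        self.calls.append((args, result))
        return result


def playback(calls, candidate):
    """Replay recorded inputs; return the recordings the candidate fails to match."""
    return [(args, expected) for args, expected in calls
            if candidate(*args) != expected]
```

An empty list from `playback` means the candidate reproduced every recorded result.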
Internal Monitoring
- Built in monitors: Record events when monitoring states have been activated. Increased visibility into activities of the component.
