Systems and software engineering — Systems and software Quality Requirements and Evaluation (SQuaRE) — Evaluation module for recoverability

ISO/IEC 25045:2010 is one of the SQuaRE series of International Standards, which provides a framework for software product quality requirements and evaluation, including the requirements for methods of software product measurement and evaluation. ISO/IEC 25045:2010 uses a methodology involving two types of evaluation for recoverability. The first uses the disturbance injection methodology and a list of disturbances based on common categories of operational faults and events to evaluate the quality measure of resiliency. The second uses a set of questions defined for each disturbance to evaluate the quality measure of autonomic recovery index by assessing how well the system detects, analyses, and resolves the disturbance without human intervention. ISO/IEC 25045:2010 is applicable to information systems executing transactions in a system supporting single or multiple concurrent users, where speedy recovery and ease of managing recovery are important to the acquirer, owner/operator, and the developer.

Ingénierie des systèmes et du logiciel — Exigences de qualité et évaluation des systèmes et du logiciel (SQuaRE) — Module d'évaluation pour la possibilité de récupération

General Information

Status
Published
Publication Date
23-Aug-2010
Current Stage
9093 - International Standard confirmed
Start Date
09-Aug-2024
Completion Date
30-Oct-2025
Standard
ISO/IEC 25045:2010 - Systems and software engineering — Systems and software Quality Requirements and Evaluation (SQuaRE) — Evaluation module for recoverability Released:8/24/2010
English language
37 pages

Standards Content (Sample)


INTERNATIONAL STANDARD ISO/IEC 25045
First edition
2010-09-01
Systems and software engineering —
Systems and software Quality
Requirements and Evaluation
(SQuaRE) — Evaluation module for
recoverability
Ingénierie des systèmes et du logiciel — Exigences de qualité et
évaluation des systèmes et du logiciel (SQuaRE) — Module
d'évaluation pour la possibilité de récupération

Reference number
© ISO/IEC 2010

©  ISO/IEC 2010
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means,
electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or
ISO's member body in the country of the requester.
ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland

Contents
Foreword
Introduction
1 Scope
1.1 Characteristics
1.2 Level of evaluation
1.3 Technique
1.4 Applicability
2 Conformance
3 Normative references
4 Terms and definitions
5 Inputs and measures
5.1 Evaluation methodology
5.1.1 Practical considerations relating to the methodology
5.1.2 Disturbances
5.2 Input for the evaluation
5.2.1 The SUT description
5.2.2 The workload description
5.2.3 The fault load description
5.3 Data elements
5.3.1 Output from the baseline run
5.3.2 Output from the test run
5.3.3 Completion of the Autonomic Maturity Questionnaire
5.4 Quality Measures
5.4.1 Summary of the Quality Measures and Quality Measure Elements (QME)
5.4.2 Quality Measure - Resiliency
5.4.3 Quality Measure - Autonomic Recovery Index
5.4.4 Quality Measure Element (QME) - Number of transactions under disturbance
5.4.5 Quality Measure Element (QME) - Number of transactions under no disturbance
5.4.6 Quality Measure Element (QME) - Autonomic Maturity Score
6 Interpretation of results
6.1 Mapping of measures
6.2 Reporting
6.3 Application Procedure
Annex A (informative) Sample Report
Bibliography

Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are members of
ISO or IEC participate in the development of International Standards through technical committees
established by the respective organization to deal with particular fields of technical activity. ISO and IEC
technical committees collaborate in fields of mutual interest. Other international organizations, governmental
and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information
technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of the joint technical committee is to prepare International Standards. Draft International
Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as
an International Standard requires approval by at least 75 % of the national bodies casting a vote.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.
ISO/IEC 25045 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee SC 7, Software and systems engineering.
ISO/IEC 25045 is one of the SQuaRE series of International Standards, which consists of the following
divisions under the general title Systems and software engineering — Systems and software Quality
Requirements and Evaluation (SQuaRE):
• Quality Management Division (ISO/IEC 2500n),
• Quality Model Division (ISO/IEC 2501n),
• Quality Measurement Division (ISO/IEC 2502n),
• Quality Requirements Division (ISO/IEC 2503n),
• Quality Evaluation Division (ISO/IEC 2504n).

Introduction
The evaluation of software product quality is vital to both the acquisition and development of software that
meets quality requirements. The relative importance of the various characteristics of software quality depends
on the mission or objectives of the system of which it is a part; software products need to be evaluated to
decide whether relevant quality characteristics meet the requirements of the system.
The essential parts of software quality evaluation are a quality model, the method of evaluation, software
measurement, and supporting tools. To develop good software, quality requirements should be specified, the
software quality assurance process should be planned, implemented and controlled, and both intermediate
products and end products should be evaluated.
This International Standard is part of the SQuaRE series of International Standards. It contains general
requirements for specification and evaluation of systems and software quality and clarifies the associated
general concepts. It provides a framework for evaluating the quality of software products and states the
requirements for methods of software product measurement and evaluation.
The general goal of creating the SQuaRE series of International Standards is to move to a logically organized,
enriched and unified series covering two main processes: software quality requirements specification and
software quality evaluation, supported by a software quality measurement process. The purpose of the
SQuaRE series of International Standards is to assist those developing and acquiring software products with
the specification and evaluation of quality requirements. It establishes criteria for the specification of systems
and software quality requirements, their measurement, and evaluation. It includes a two-part quality model for
aligning customer definitions of quality with attributes of the development process. In addition, the series
provides recommended measures of software product quality attributes that can be used by developers,
acquirers, and evaluators.
SQuaRE provides
• terms and definitions,
• reference models,
• a general guide,
• individual division guides, and
• International Standards for requirements specification, planning and management, measurement and
evaluation purposes.
SQuaRE includes International Standards on quality model and measures, as well as on quality requirements
and evaluation.
SQuaRE replaces the current ISO/IEC 9126 series and the ISO/IEC 14598 series.
ISO/IEC 25040, Systems and software engineering — Systems and software Quality Requirements and
Evaluation (SQuaRE) — Evaluation reference model and guide will replace a part of ISO/IEC 14598-1,
Information technology — Software product evaluation — Part 1: General overview.
ISO/IEC 25041, Systems and software engineering — Systems and software Quality Requirements and
Evaluation (SQuaRE) — Evaluation modules will replace ISO/IEC 14598-6, Software engineering — Product
evaluation — Documentation of evaluation modules.
ISO/IEC 25001, Software engineering — Software product Quality Requirements and Evaluation
(SQuaRE) — Planning and management replaces ISO/IEC 14598-2, Software engineering — Product
evaluation — Part 2: Planning and management.

[Figure 1 diagram: Quality Management Division (2500n), Quality Model Division (2501n), Quality Measurement Division (2502n), Quality Requirements Division (2503n), Quality Evaluation Division (2504n)]
Figure 1 – Organization of the SQuaRE series of International Standards
Figure 1 illustrates the organization of the SQuaRE series, representing families of standards, also called
divisions.
The divisions within the SQuaRE model are:
• ISO/IEC 2500n - Quality Management Division. The International Standards that form this division
define all common models, terms and definitions further referred to by all other International Standards
from the SQuaRE series. Referring paths (guidance through SQuaRE documents) and high level practical
suggestions in applying proper standards to specific application cases offer help to all types of users. The
division also provides requirements and guidance for a supporting function which is responsible for the
management of software product requirements specification and evaluation.
• ISO/IEC 2501n - Quality Model Division. The International Standard that forms this division presents a
detailed quality model including internal, external and quality in use characteristics. Furthermore, the
internal and external software quality characteristics are decomposed into sub-characteristics. Practical
guidance on the use of the quality model is also provided.
• ISO/IEC 2502n - Quality Measurement Division. The International Standards that form this division
include a software product quality measurement reference model, mathematical definitions of quality
measures, and practical guidance for their application. Presented measures apply to internal software
quality, external software quality and quality in use. Measurement primitives forming foundations for the
latter measures are defined and presented.
• ISO/IEC 2503n - Quality Requirements Division. The International Standard that forms this division
helps in specifying quality requirements. These quality requirements can be used in the process of quality
requirements elicitation for a software product to be developed or as input for an evaluation process. The
requirements definition process is mapped to technical processes defined in ISO/IEC 15288, Systems and
software engineering — System life cycle processes.
• ISO/IEC 2504n - Quality Evaluation Division. The International Standards that form this division provide
requirements, recommendations and guidelines for software product evaluation, whether performed by
evaluators, acquirers or developers. The support for documenting a measure as an Evaluation Module is
also presented.

This International Standard is part of the Quality Evaluation Division (ISO/IEC 2504n), which consists of the
following International Standards (see Figure 2).
• ISO/IEC 25040 1), Systems and software engineering — Systems and software Quality Requirements and
Evaluation (SQuaRE) — Evaluation reference model and guide, contains general requirements for
specification and evaluation of software quality and clarifies the general concepts. It provides a process
description for evaluating the quality of software products and states the requirements for the application
of this process. The evaluation process is the basis for software product quality evaluation for different
purposes and approaches. Therefore, the process can be used for the evaluation of quality in use,
external software quality and internal software quality. It can also be applied to evaluate the quality of pre-
developed software or custom software during its development process. The software product quality
evaluation can be conducted by an acquirer, a developer organization, a supplier or an independent third
party evaluator.
• ISO/IEC 25041 2), Systems and software engineering — Systems and software Quality Requirements and
Evaluation (SQuaRE) — Evaluation modules, defines the structure and content of the documentation to
be used to describe an evaluation module. These evaluation modules contain the specification of the
quality model (i.e. characteristics, sub-characteristics and corresponding internal, external or quality in use
measures), the associated data and information about the planned application of the model and the
information about its actual application. Appropriate evaluation modules are selected for each evaluation.
In some cases, it might be necessary to develop new evaluation modules. Guidance for developing new
evaluation modules is found in ISO/IEC 25041. This International Standard can also be used by
organizations producing new evaluation modules.
• ISO/IEC 25045, Systems and software engineering — Systems and software Quality Requirements and
Evaluation (SQuaRE) — Evaluation module for recoverability provides the specification to evaluate the
sub-characteristic of recoverability defined under the characteristic of reliability of the quality model. The
ability of a software product and thereby a system to remain available or to recover within an acceptable
timeframe from disturbance has always been important since downtime often has economic and other
consequences. The emphasis in recent years has extended to the autonomic ability of the software
product and thereby a system to be self-managed with minimal involvement by human operators. There
is interest in the user domain and industry in how well a software product and thereby a system
handles such disturbances in the way it detects, analyses, adjusts or recovers. This International
Standard determines the quality measures of resiliency and autonomic recovery index when the
information system composed of one or more software products executing transactions is subjected to a
series of disturbances. A disturbance could be an operational fault (e.g. an abrupt shutdown of an
operating system process that brings down a system) or an event (e.g. a significant increase of users to
the system).
1) To be published.
2) Under preparation.

INTERNATIONAL STANDARD ISO/IEC 25045:2010(E)

Systems and software engineering — Systems and software
Quality Requirements and Evaluation (SQuaRE) — Evaluation
module for recoverability
1 Scope
This International Standard is one of the SQuaRE series of International Standards, which contains general
requirements for specification and evaluation of systems and software quality and clarifies the associated
general concepts. SQuaRE provides a framework for evaluating the quality of software products and states
the requirements for methods of software product measurement and evaluation.
This International Standard uses a methodology involving two types of evaluation for recoverability. One part
of the method makes use of the disturbance injection methodology and a list of disturbances based on
common categories of operational faults and events to evaluate the quality measure of resiliency. The second
quality measure is based on a set of questions that is defined for each disturbance to evaluate the quality
measure of autonomic recovery index by assessing how well the system detects, analyses, and resolves the
disturbance without human intervention.
This International Standard is applicable to information systems executing transactions in a system supporting
single or multiple concurrent users, where speedy recovery and ease of managing recovery are important to the
acquirer, owner/operator, and the developer.
1.1 Characteristics
This evaluation module evaluates the quality measures defined under the following characteristic and
sub-characteristics of the quality model as defined in ISO/IEC 9126-1:2001.
NOTE The reference to ISO/IEC 9126-1 will be replaced by a reference to ISO/IEC 25010 when published.
Characteristic – Reliability
Sub-characteristic – Recoverability
Quality measure – Resiliency
Quality measure – Autonomic recovery index
1.2 Level of evaluation
Level D as defined in ISO/IEC 14598-5. This evaluation is intended for a system with executable products.
NOTE The reference to ISO/IEC 14598-5 will be replaced by a reference to ISO/IEC 25040 when published.
1.3 Technique
A disturbance injection methodology is a test methodology where disturbances are injected against the
application and other components of the system while it is running a workload of interest to the acquirer. A
disturbance injection methodology and a list of disturbances based on common categories of operational
faults and events are used to evaluate the quality measure of Resiliency. For each disturbance, the Resiliency
of the system is calculated based on the ratio between the number of transactions that complete successfully
while the system is under disturbance and the number of transactions that complete successfully in a system
that does not encounter the disturbance. A set of disturbances is defined under the following categories:
• Unexpected shutdown — e.g. abrupt operating system (OS) shutdown, process shutdown, network
shutdown;
• Resource contention — e.g. CPU/memory/IO hogs, memory leak, database management system (DBMS)
runaway query, DBMS deadlock, DBMS and queuing server storage exhaustion;
• Loss of data — e.g. DBMS loss of data, DBMS loss of file, DBMS and queuing server loss of disk;
• Load resolution — e.g. a moderate or significant increase of users or workload;
• Restart failures — e.g. restart failure on OS and middleware server process.
Other disturbance categories may be identified if appropriate.
A set of questions to assess how well the system detects, analyses, and resolves the disturbance is defined
for each disturbance to evaluate the quality measure of autonomic recovery index. A score is calculated for
each disturbance based on the answers to those questions.
The overall Resiliency and autonomic recovery index are calculated respectively as an average of those
individual scores.
The detailed evaluation methodology involved is given in 5.1.
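The ratio and averaging described above can be illustrated with a minimal sketch. The function names and transaction counts are hypothetical, and the "average" is assumed here to be the arithmetic mean; the normative definitions are those of 5.4.

```python
from statistics import mean


def resiliency(tx_under_disturbance: int, tx_no_disturbance: int) -> float:
    """Ratio of successful transactions with the disturbance to those without it."""
    if tx_no_disturbance <= 0:
        raise ValueError("transaction count without disturbance must be positive")
    return tx_under_disturbance / tx_no_disturbance


def overall(scores: list[float]) -> float:
    """Overall measure, assumed here to be the arithmetic mean of the individual scores."""
    return mean(scores)


# Hypothetical counts of successfully completed transactions.
baseline_tx = 1000
runs = [("abrupt OS shutdown", 500), ("memory hog", 750), ("load increase", 1000)]

per_disturbance = [resiliency(tx, baseline_tx) for _, tx in runs]
print(per_disturbance)           # [0.5, 0.75, 1.0]
print(overall(per_disturbance))  # 0.75
```

The same averaging step would apply to the per-disturbance autonomic recovery scores once the questionnaire answers have been scored.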
1.4 Applicability
This evaluation module is applicable to an information system that involves a software product and other
software components. The information system must have a workload that has a consistently reproducible
performance result to properly assess the impact of disturbance and recovery.
The evaluation module can be used in the following situations:
a) evaluation as part of the system verification testing;
b) evaluation against the test environment of a production system to gauge recoverability and identify
weakness;
c) evaluation of the recoverability of different solutions proposed by vendors using a common workload.
The evaluation result is applicable only to the specific release and configuration of the software and hardware
components that were evaluated. Two results are comparable if they use the same workload and
workload parameter set defined in 5.2.2.2 and the same fault load and fault load parameter set defined in 5.2.3.2 for the
evaluation.
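The comparability rule can be expressed as a simple equality check. The following is a sketch only: the record type, its field names and the parameter values are hypothetical and are not defined by this International Standard.

```python
from dataclasses import dataclass
from typing import Tuple

Params = Tuple[Tuple[str, str], ...]  # hypothetical encoding of a parameter set


@dataclass(frozen=True)
class EvaluationSetup:
    workload: str
    workload_params: Params     # workload parameter set (see 5.2.2.2)
    fault_load: str
    fault_load_params: Params   # fault load parameter set (see 5.2.3.2)


def comparable(a: EvaluationSetup, b: EvaluationSetup) -> bool:
    """Two results are comparable only if all four elements are identical."""
    return a == b


setup_a = EvaluationSetup("order-entry", (("users", "200"),), "standard", (("slots", "10"),))
setup_b = EvaluationSetup("order-entry", (("users", "200"),), "standard", (("slots", "10"),))
setup_c = EvaluationSetup("order-entry", (("users", "400"),), "standard", (("slots", "10"),))

print(comparable(setup_a, setup_b))  # True
print(comparable(setup_a, setup_c))  # False
```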
2 Conformance
An evaluation of the recoverability of a software product conforms to this International Standard if it complies
with Clause 5.
3 Normative references
The following referenced documents are indispensable for the application of this document. For dated
references, only the edition cited applies. For undated references, the latest edition of the referenced
document (including any amendments) applies.
ISO/IEC 25000:2005, Software Engineering — Software product Quality Requirements and Evaluation
(SQuaRE) — Guide to SQuaRE
4 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO/IEC 25000 and the following apply.
4.1
performance baseline
result from a normal execution of a performance workload against a system without performing disturbance
injection
4.2
disturbance
operational fault (e.g. an abrupt shutdown of an OS process that brings down a system) or event (e.g. a
significant increase of users to the system), or anything that could change the state of the system
NOTE For the context of this evaluation module, the disturbances are limited to external faults or events, rather than
internal faults that require modifying the application or OS code.
4.3
injection slot
point where the recoverability of the system under test (SUT) is tested by injecting a disturbance while a
workload is being run
5 Inputs and measures
5.1 Evaluation methodology
The evaluation shall follow the methodology outlined below utilizing an existing performance workload,
injecting disturbances which are faults or events as the workload is executing, and measuring the
performance under disturbance as compared to a stable environment.
The evaluation methodology consists of three phases, as outlined in Figure 2 below. These are the Baseline
phase, the Test phase, and the Check phase. Note that prior to running a Baseline phase or Test phase, the
workload must be allowed to ramp up to steady state, in which the workload runs at a consistent level of
performance.
[Figure 2 timeline: Baseline phase, then Test phase divided into Injection Slot 1, Injection Slot 2, ..., Injection Slot N, then Check phase]
Figure 2 – Three Phases of the Evaluation Methodology
The Baseline phase determines the operational characteristics of the system in the absence of the injected
disturbances. This phase is run to generate a performance baseline against which the result from the Test phase
shall be compared, and shall comply with all requirements defined by the performance workload.
The Test phase determines the operational characteristics of the system when the workload is run in the
presence of the disturbances. This phase shall use the same setup and configuration as the Baseline phase.
The Test phase is divided into a number of consecutive Disturbance Injection Slots. These injection slots shall
be run one after another in a specified sequence.
The Check phase ensures that the reaction of the system to the disturbance did not affect the integrity of the
system. During this phase, a check shall be made to ensure that the system is in a consistent state.
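The order of the three phases can be sketched as a small harness. This is an illustration under stated assumptions: the function, its parameters and the stub callables are hypothetical, and a real harness would drive an actual workload and SUT.

```python
from typing import Callable, Dict, Sequence


def run_evaluation(
    ramp_up: Callable[[], None],
    run_baseline: Callable[[], int],
    run_slot: Callable[[str], int],
    check_consistency: Callable[[], bool],
    slots: Sequence[str],
) -> Dict[str, object]:
    """Run the Baseline, Test and Check phases in order; all callables are caller-supplied."""
    ramp_up()                      # reach steady state before the Baseline phase
    baseline_tx = run_baseline()   # workload only, no disturbance injected
    ramp_up()                      # reach steady state again before the Test phase
    slot_tx = {name: run_slot(name) for name in slots}  # slots run in the given sequence
    consistent = check_consistency()                    # Check phase
    return {"baseline": baseline_tx, "slots": slot_tx, "consistent": consistent}


# Stub harness that only records the order of calls.
log = []
result = run_evaluation(
    ramp_up=lambda: log.append("ramp-up"),
    run_baseline=lambda: log.append("baseline") or 1000,
    run_slot=lambda name: log.append(f"slot:{name}") or 800,
    check_consistency=lambda: log.append("check") or True,
    slots=["abrupt OS shutdown", "memory hog"],
)
print(log)
```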
During each injection slot, the fault load driver initiates the injection of a disturbance into the system under test
(SUT). Ideally, the SUT detects the problem and responds to it. This response can consist of either fixing the
problem or bypassing the problem by transferring work to a standby machine without resolving the original
problem. If the SUT is not capable of detecting and then either fixing or bypassing the problem automatically,
the fault load driver waits an appropriate interval of time, to simulate the time it takes for human operator
intervention, and initiates an appropriate human-simulated operation to recover from the problem.
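The decision taken by the fault load driver in each injection slot can be sketched as follows. All names are hypothetical, and the way the driver learns that the SUT has recovered on its own (here a single callable) is an assumption made for illustration.

```python
import time
from typing import Callable


def run_injection_slot(
    inject: Callable[[], None],
    recovered_automatically: Callable[[], bool],
    operator_recovery: Callable[[], None],
    operator_delay_s: float,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Inject one disturbance and fall back to a simulated operator if needed."""
    inject()
    if recovered_automatically():
        return "automatic"          # the SUT fixed or bypassed the problem itself
    sleep(operator_delay_s)         # simulate the time a human operator would need
    operator_recovery()             # human-simulated recovery operation
    return "simulated operator"


# Stub run for an SUT that does not recover by itself; sleep is replaced by a recorder.
events = []
outcome = run_injection_slot(
    inject=lambda: events.append("inject"),
    recovered_automatically=lambda: False,
    operator_recovery=lambda: events.append("operator recovery"),
    operator_delay_s=600,
    sleep=lambda s: events.append(f"wait {s}s"),
)
print(outcome, events)
```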

Figure 3 — Injection slot sub-intervals
As Figure 3 demonstrates, each injection slot consists of five sub-intervals.
• The Injection Interval is the predefined time that the system is allowed to run at steady state before a
particular fault is injected into the SUT. The benchmark driver waits for the predefined injection
interval before injecting the fault. The purpose of the injection interval is to demonstrate that the
system is functioning correctly before any disturbance is injected.
• The Detection Interval is the time from when a fault is injected to the time when a fault is detected.
For an SUT that is not capable of detecting a fault automatically, the driver will be configured to wait
for a predefined Detection Interval before initiating a recovery action. This is to simulate the time it
takes for the human operator to detect a fault.
• The Recovery Initiation Interval is the time from when a fault is detected to the time when a
recovery action begins. For an SUT that is not capable of detecting the fault or initiating a recovery
action automatically, the driver will be configured to wait for a predefined Recovery Initiation Interval
before initiating the recovery action. This is to simulate the time it takes for a human operator to
initiate recovery.
• The Recovery Interval is the time that it takes the system to perform recovery.
• The Keep Interval is the time to ramp-up again and run at steady state after the recovery. This is the
time remaining in a measurement interval. If a steady state is not achieved or is lower than that prior
to the Disturbance Injection, this should be noted in the report.

It is important to note two things.
• First, the breakdown of the slot interval into sub-intervals is for expository purposes only. During a
benchmark run, the benchmark driver only distinguishes the boundaries of these sub-intervals when
the SUT requires simulated human intervention.
• And second, only the operations processed during the last four of the five intervals are part of the
measurement interval, and are, therefore, counted when calculating the throughput for the run.
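The relationship between the five sub-intervals and the measurement interval can be sketched as a small data structure. The class, the field names and the durations are hypothetical, and throughput is assumed here to mean operations per second of measurement interval.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InjectionSlot:
    """Durations of the five sub-intervals, in seconds."""
    injection: float
    detection: float
    recovery_initiation: float
    recovery: float
    keep: float

    def measurement_interval(self) -> float:
        # Only the last four sub-intervals count; the injection interval is excluded.
        return self.detection + self.recovery_initiation + self.recovery + self.keep


def throughput(operations: int, slot: InjectionSlot) -> float:
    """Operations processed per second of measurement interval."""
    return operations / slot.measurement_interval()


slot = InjectionSlot(injection=600, detection=300, recovery_initiation=900, recovery=600, keep=600)
print(slot.measurement_interval())  # 2400
print(throughput(4800, slot))       # 2.0
```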
5.1.1 Practical considerations relating to the methodology
A test is more controllable if each injection slot is run in isolation such that the system is stopped, reset, and
started and ramped-up between each injection slot. This will require the Check phase after each injection slot
instead of after all Injection Slots as described in Figure 2.
If the customer wants or agrees (for example, to speed up the test or to see how the system reacts to multiple
disturbances that occur one after another), the test could be set up to run some injection slots one after
another without stopping, resetting, starting, and ramping up to steady state between each injection slot. This
might be suitable for disturbances that do not bring down the SUT. Otherwise the resulting database recovery
would take much longer due to the need to recover all prior transactions from previous injection slots. If the
test is to be run to compare different systems, and the injection slots are not run in isolation, the same
sequence and grouping of injection slots should be used for all the systems.
The interval length of the run depends on the workload. A larger workload with higher throughput tends to
require a longer ramp-up period to reach the steady state where an injection slot could begin. The following is
one example that has been used to balance efficiency against the need to allow an SUT
sufficient time to detect and recover from the injected disturbances: for a baseline run, allow the system to warm
up for 5 minutes, and then use a 50-minute Baseline Phase. For a test run, allow the system to warm up for 5
minutes, and then use a 50-minute Test Phase, which is broken up into 10 minutes for the Injection Interval,
20 minutes for the combined Detection Interval and Recovery Initiation Interval, and 20 minutes for the combined
Recovery Interval and Keep Interval.
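The example schedule can be checked with a few lines of arithmetic. The variable names are hypothetical; the durations, in minutes, are those of the example above.

```python
WARM_UP_MIN = 5
PHASE_MIN = 50

# Test Phase breakdown from the example, in minutes.
test_phase = {
    "injection interval": 10,
    "detection + recovery initiation intervals": 20,
    "recovery + keep intervals": 20,
}

# The sub-intervals must add up to the length of the Test Phase.
assert sum(test_phase.values()) == PHASE_MIN

# The measurement interval excludes the injection interval.
measurement_min = PHASE_MIN - test_phase["injection interval"]
total_run_min = WARM_UP_MIN + PHASE_MIN
print(measurement_min, total_run_min)  # 40 55
```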
5.1.2 Disturbances
The disturbances and categories of disturbances in the execution runs are not intended to be comprehensive,
and the user of this International Standard can extend the list based on experience and context. The list of
disturbances is intended to cover common operation faults and events, where some disturbances could be
due to operator mistakes or even malicious action but the list does not handle security issues. It is not the
intention of this evaluation module to evaluate system security.
All five disturbance categories described below shall be used for conformance.
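A fault load can be checked against this requirement with a simple set difference. The enumeration and function below are a hypothetical sketch; the category names are taken from 1.3.

```python
from enum import Enum


class DisturbanceCategory(Enum):
    UNEXPECTED_SHUTDOWN = "unexpected shutdown"
    RESOURCE_CONTENTION = "resource contention"
    LOSS_OF_DATA = "loss of data"
    LOAD_RESOLUTION = "load resolution"
    RESTART_FAILURES = "restart failures"


def missing_categories(used):
    """Return the required categories that the fault load does not cover."""
    return set(DisturbanceCategory) - set(used)


fault_load = [DisturbanceCategory.UNEXPECTED_SHUTDOWN, DisturbanceCategory.LOSS_OF_DATA]
missing = missing_categories(fault_load)
print(sorted(c.value for c in missing))
# ['load resolution', 'resource contention', 'restart failures']
```

An empty result indicates that all five categories are represented in the fault load.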
5.1.2.1 Unexpected shutdown
Disturbances in this category simulate the unexpected shutdown of an OS, one or more application processes,
or the network link between components in the SUT.
Table 1 — Disturbances for unexpected shutdown
Disturbance name | Description
Abrupt OS shutdown for DBMS, application, HTTP, and messaging servers | This fault scenario represents the shutdown of the server OS. It is intended to simulate the situation where an operator accidentally issues an OS shutdown command either remotely or at the console. All the processes on the server are stopped and the OS is halted gracefully. This is different from a system crash due to a software defect, a power failure (which is tied to the hardware), or accidentally shutting down by using the power switch.
Abrupt process shutdown for DBMS, application, HTTP, and messaging servers | This fault scenario represents the shutdown of one or more processes supplying the component of the SUT. It is intended to simulate the situation where an operator accidentally issues an OS command to end the processes. This is different from issuing a command to the processes to inform them of the need to terminate. The only alert provided to the processes that "the end is near" is that supplied by the OS to all processes that are to be ended. (E.g. signal 9 in Linux).
Network shutdown for DBMS, application, HTTP, and messaging servers | This fault scenario represents the shutdown of the network link between critical components of the SUT. It is intended to simulate the situation where the network becomes unavailable because of a pulled cable, faulty switch, or OS level loss of network control.
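As an illustration of the abrupt process shutdown disturbance, the following sketch sends signal 9 (SIGKILL) to a stand-in process. It is an assumption-laden example for POSIX systems only: the child process is a sleeping Python interpreter rather than a real server, and the function name is hypothetical.

```python
import signal
import subprocess
import sys


def abrupt_process_shutdown(proc: subprocess.Popen) -> int:
    """Send SIGKILL (signal 9) to the process and return its exit status."""
    proc.send_signal(signal.SIGKILL)  # cannot be caught or handled by the process
    return proc.wait(timeout=10)


# Stand-in for a server process: a child Python interpreter that just sleeps.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
status = abrupt_process_shutdown(child)

# On POSIX, a negative status -N means the child was terminated by signal N.
print(status == -signal.SIGKILL)  # True
```

By contrast, SIGTERM can be handled by the process and therefore corresponds to informing the process of the need to terminate, which is not the scenario intended here.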
5.1.2.2 Resource contention
Disturbances in this category simulate the case in which resources on a machine in the SUT are exhausted
because of an unexpected process, user action, or application error.
Table 2 — Disturbances for resource contention
Disturbance name | Description
Memory hog on DBMS, application, HTTP, and messaging servers | This fault scenario represents the case where all the physical memory on the system is exhausted. It is intended to simulate the situation in which a certain process in the machine stops being a good citizen and takes over all the physical memory. All the free physical memory of the system is taken up by the hog process. This disturbance is complicated by the virtual memory system, so the current implementation is to request all physical memory and randomly access within this memory to simulate page requests.
I/O hog on DBMS server | This fault scenario represents the case where the disk bandwidth of the physical drive containing the business data is saturated. It is intended to simulate the situation in which a certain process in the machine stops being a good citizen and creates unplanned heavy disk I/O activities. The disk actuator is busy servicing read or write requests all the time. This shouldn’t be confused with the case where the bandwidth of the I/O bus is saturated.
DBMS runaway query | This fault scenario represents the case where the DBMS is servicing a runaway query. It is intended to simulate the situation in which a long-running, resource-intensive query is accidentally kicked off during operation hours. It shouldn’t be confused with a batch of smaller queries being executed.
Messaging server poison message flood | This fault scenario represents the case where the message queue is flooded with many poison messages. A poison message is a message that the receiving application is unable to process, possibly because of an unexpected message format. It is intended to simulate the situation in which the operator configures a wrong queue destination. A large number of poison messages are sent to the message queue. This shouldn’t be confused with the case where the application is causing a queue overflow.
DBMS and messaging server storage exhaustion | This fault scenario represents the case where the system runs out of disk space. It is intended to simulate the situation in which a certain process in the machine stops being a good citizen and abuses the disk quota. All the disk space of the drives containing the business data is taken up by the hog process.
Network hog on HTTP, This fault scenario represents the case where the network link between two
application, DBMS, and systems in the SUT is saturated with network traffic. It is intended to simulate the
messaging servers situation where a certain process in the machine stops being a good citizen and
transfers excessive data on a critical network link. This test should be performed in
a private network such as with its own private switch or contained within a network
segment to avoid impacting other systems in the wider network.
Deadlock on DBMS This fault scenario represents the case in which a deadlock involving one or more
server applications leaves a significant number of resources (rows or tables) in the DBMS
locked, making them inaccessible to all applications. Any queries on the DBMS that
require these locked resources will not complete successfully.
Memory leak in a user This fault scenario represents the case in which a user application causes a
application memory leak that exhausts all available memory on the system. It is intended to
simulate the case in which a poorly written application is deployed onto an
application server.
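
The memory hog description above (request memory, then access it at random to simulate page requests) can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the implementation referred to in Table 2: the function name `memory_hog`, the fixed `PAGE_SIZE` of 4096 bytes, and the caller-supplied size are all hypothetical. A real disturbance would size the request from the physical memory of the machine.

```python
import random

PAGE_SIZE = 4096  # assumed page size in bytes; the real value is OS-specific


def memory_hog(num_bytes: int, touches: int, seed: int = 0) -> int:
    """Allocate num_bytes and write to randomly chosen pages.

    Returns the number of distinct pages that were touched.
    """
    if num_bytes <= 0:
        raise ValueError("num_bytes must be positive")
    rng = random.Random(seed)
    buf = bytearray(num_bytes)              # request the memory
    pages = max(1, num_bytes // PAGE_SIZE)  # number of whole pages in the buffer
    touched = set()
    for _ in range(touches):
        page = rng.randrange(pages)
        buf[page * PAGE_SIZE] = 1           # write to the first byte of the page
        touched.add(page)
    return len(touched)


if __name__ == "__main__":
    # Small demonstration size (64 pages) so the sketch is safe to run anywhere.
    print("pages touched:", memory_hog(64 * PAGE_SIZE, touches=1000))
```

The random access pattern matters because memory that is allocated but never touched may not be backed by physical pages, so it would not put pressure on the system.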

5.1.2.3 Loss of data
Disturbances in this category simulate a scenario in which business-critical data is lost.
Table 3 — Disturbances for loss of data
Disturbance name | Description
DBMS loss of file | This fault scenario represents the loss of a database file which contains critical business data. It is intended to simulate the situation where an operator accidentally issues an OS command to delete one or more database files that contain data for a particular database object. The DBMS can no longer address the file from the file system. This is different from an OS file handle loss, which is considered a bug in the OS.
DBMS and messaging loss of disk | This fault scenario represents the loss of a physical hard drive that contains the business data. It is intended to simulate the case where a hard drive is damaged such that the disk controller marks the targeted hard drive as offline.

5.1.2.4 Load resolution
Disturbances in this category simulate a sudden increase in the workload on the system.
Table 4 — Disturbances for load resolution
Disturbance name | Description
Significantly increased load handling and resolution | This fault scenario represents the case where the load on the SUT increases drastically (generally about 10 times the previous load). It is intended to simulate the situation where a significantly heavy load is introduced because of a catastrophic event or failure of the primary system. The optimal result for this disturbance is to handle at least the same amount of business as before without being overwhelmed by the extreme increase in requests. Technologies that illustrate this characteristic would be flow control and quality of service monitors.

5.1.2.5 Detection of restart failure
Disturbances in this category simulate a situation in which an application or the component it depends on is
corrupted and cannot be restarted.
Table 5 — Disturbances for restart failure
Disturbance name | Description
Process restart failure of DBMS, application, HTTP, and messaging servers | This fault scenario represents the case where the software component fails to restart. It is intended to simulate the case where a key file, or data, that is required during the start-up process is lost. When the software program is restarted, it fails at the point where the key file or data cannot be loaded.
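
One possible way to stage the loss of a key start-up file reversibly is sketched below in Python. This is an assumption made for illustration, not a method prescribed by this International Standard: the file is renamed rather than deleted so that it can be restored after the test, and the names `key_file_lost` and the `.disturbance` suffix are hypothetical.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def key_file_lost(path: str):
    """Make the file at path unavailable by renaming it, then restore it."""
    hidden = path + ".disturbance"  # hypothetical suffix for the hidden copy
    os.rename(path, hidden)
    try:
        yield
    finally:
        os.rename(hidden, path)     # restore even if the test body raises


if __name__ == "__main__":
    # A temporary file stands in for the key file needed during start-up.
    fd, key_file = tempfile.mkstemp()
    os.close(fd)
    with key_file_lost(key_file):
        # The restart of the software component would be attempted here.
        print("key file present during test:", os.path.exists(key_file))
    print("key file present after test:", os.path.exists(key_file))
    os.remove(key_file)
```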

5.2 Input for the evaluation
NOTE Sections A.5 to A.8 in the sample report provide an example of the type of output documented here.
5.2.1 The SUT description
5.2.1.1 Specification of the hardware and OS configuration
The properties of the hardware architecture and configuration shall be described in sufficient detail to allow replication of
the hardware and OS configuration. These include, but are not limited to, the following:
• vendor and model number;
• system availability date;
• CPU (processor type, number and speed (MHz/GHz) of the CPUs);
• cache (L1, L2, L3, etc.);
• main memory (in megabytes);
• disks and file system used;
• network interface;
• number of systems with this exact same configuration;
• OS (product name, vendor, and availability date);
• OS tuning parameters and options changed from the defaults;
• compilation and linkage options and run-time optimizations used to create/install OS;
• logical or physical partitioning used on this system to host software instances;
• which software components, application, and additional software from 5.2.1.2, 5.2.1.3, and 5.2.1.4 run
on this hardware.
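
The list above can be captured as a structured record so that omissions are easy to spot. The following Python sketch is one hypothetical way to do so. The class name `HardwareOsSpec`, the field names, and the convention of writing "none" for items that genuinely do not apply are assumptions made for illustration, not part of this International Standard.

```python
from dataclasses import dataclass, field, fields


@dataclass
class HardwareOsSpec:
    """One record per distinct hardware and OS configuration in the SUT."""
    vendor_model: str = ""
    availability_date: str = ""
    cpu: str = ""                    # processor type, number, and speed
    cache: str = ""                  # L1, L2, L3, etc.
    main_memory_mb: int = 0
    disks_and_file_system: str = ""
    network_interface: str = ""
    system_count: int = 0            # systems with this exact configuration
    os: str = ""                     # product name, vendor, availability date
    os_tuning: str = ""              # parameters changed from the defaults
    os_build_options: str = ""       # compilation, linkage, run-time options
    partitioning: str = ""           # logical or physical partitioning
    hosted_software: list = field(default_factory=list)


def missing_items(spec: HardwareOsSpec) -> list:
    """Return the names of fields still left at their empty default."""
    return [f.name for f in fields(spec) if not getattr(spec, f.name)]


if __name__ == "__main__":
    spec = HardwareOsSpec(vendor_model="ExampleCo X1", main_memory_mb=8192)
    print("still to be described:", missing_items(spec))
```

Because an empty value is treated as missing, an item with nothing to report (for example, no OS tuning changes) would be recorded explicitly as "none" under this convention.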
5.2.1.2 Specification of the software component configuration
The properties of the software components such as web server, application server, message server, database
server, JVM, etc., that the applications use shall be described in sufficient detail to allow replication of the
software configuration. These include, but are not limited to, the following:
• vendor name, product name and version, and availability date;
• tuning parameters and options changed from the defaults;
• compilation and linkage options and run-time optimization used to create/install the software
component;
• number of instances on each system.
5.2.1.3 The application programs
All programs used by the emulated users shall be presented on a digital storage medium. These programs
shall be ready for use on the SUT (either as an executable program or the complete source code). They shall
be described in sufficient detail to allow replication of the software configuration. These include, but are not
limited to, the following:
• vendor name, product name and version, and availability date;
• tuning parameters and options changed from the defaults;
• compilation and linkage options and run-time optimization used to create/install the software
component;
• number of instances on each system.
5.2.1.4 Additional software required
All additional software components or standard system software modules which are needed to run
shall be listed and described in sufficient detail to allow replication of the software configuration. These include, but are not
limited to, the following:
• vendor name, product name and version, and availability date;
• tuning parameters and options changed from the defaults;
• compilation and linkage options and run-time optimization used to create/install the software
component;
• number of instances on each system.
For the baseline run, this shall include the test driver that simulates the multiple users and drives the test
scripts.
For the test run, this shall include the fault injection software.
5.2.1.5 The stored data
All data that are needed by the programs for their correct operation, or that have any influence on the
performance of the SUT, so long as they are not contained in the descriptions of the task type input, shall be
presented in their entirety on a digital storage medium. They shall be formatted ready for immediate use and
storage on the SUT without any further modification. Examples of such data can be:
• data files,
...
