ISO/IEC 19795-2:2007
Information technology - Biometric performance testing and reporting - Part 2: Testing methodologies for technology and scenario evaluation
ISO/IEC 19795-2:2007 addresses two specific biometric performance testing methodologies: technology and scenario evaluation. The large majority of biometric tests are of one of these two generic evaluation types. Technology evaluations evaluate enrolment and comparison algorithms by means of previously collected corpuses, while scenario evaluations evaluate sensors and algorithms by processing samples collected from Test Subjects in real time. The former is intended for generation of large volumes of comparison scores and candidate lists indicative of the fundamental discriminating power of an algorithm; the latter is intended for measurement of performance in modeled environments, inclusive of Test Subject-system interactions.

ISO/IEC 19795-2:2007 provides requirements and recommendations on data collection, analysis and reporting specific to these two primary evaluation types. It specifies requirements in the following areas: development and full description of protocols for technology and scenario evaluations; execution and reporting of biometric evaluations reflective of the parameters associated with each evaluation type.
Technologies de l'information — Essais et rapports de performance biométriques — Partie 2: Méthodologies d'essai pour l'évaluation des technologies et du scénario
General Information
- Status
- Published
- Publication Date
- 11-Jan-2007
- Technical Committee
- ISO/IEC JTC 1/SC 37 - Biometrics
- Drafting Committee
- ISO/IEC JTC 1/SC 37 - Biometrics
- Current Stage
- 90.93 - International Standard confirmed
- Start Date
- 06-Sep-2024
- Completion Date
- 30-Oct-2025
Relations
- Effective Date
- 06-Jun-2022
- Effective Date
- 14-Oct-2020
Overview
ISO/IEC 19795-2:2007 defines standardized methodologies for biometric performance testing and reporting, focusing on the two primary evaluation types used across biometric systems: technology evaluation and scenario evaluation. It builds on the principles in Part 1 and provides requirements and recommendations for protocol development, data collection, analysis and reporting so results are meaningful, repeatable and comparable.
- Technology evaluation: offline testing of enrolment and comparison algorithms using pre‑collected corpuses to generate large volumes of comparison scores and candidate lists indicative of an algorithm’s discriminating power.
- Scenario evaluation: online or real‑time testing that includes sensors, system behavior and Test Subject - system interactions to measure performance in modeled environments.
Key topics and requirements
ISO/IEC 19795-2:2007 prescribes the structure and content of biometric tests and reports, including:
- Test protocol development: full description of test objectives, environments, population characteristics and procedures.
- Test design: guidance on selecting technology vs scenario methodologies and tailoring tests for identification or verification use-cases.
- Data collection and corpuses: requirements for assembling appropriate sample corpuses for technology tests and for real‑time sample capture in scenario tests.
- Performance measurement: standardized metrics and measurement processes (e.g., failure‑at‑source, false match/non‑match considerations) to ensure repeatability and comparability.
- Execution and reporting: template-style expectations for reporting methods, results, and assumptions so stakeholders can interpret outcomes reliably.
- Test crew, fairness and legal considerations: roles, human factors (effort, habituation), fairness expectations, release of source code and supplier comment procedures.
- Conformance rules: evaluation conformance differs by methodology (technology vs scenario) and comparison type (identification vs verification).
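As a concrete illustration of the error-rate metrics mentioned under "Performance measurement" above, the sketch below estimates false match and false non-match rates from lists of comparison scores at a fixed decision threshold. This is a minimal, hypothetical helper (the function name and the convention that higher scores mean closer matches are assumptions); ISO/IEC 19795-1 gives the authoritative formulae.

```python
def fmr_fnmr(genuine_scores, impostor_scores, threshold):
    """Estimate false match rate (FMR) and false non-match rate (FNMR)
    at a decision threshold, assuming higher scores indicate closer
    matches. Illustrative helper, not the standard's normative formulae.
    """
    # An impostor comparison scoring at or above threshold is a false match.
    false_matches = sum(1 for s in impostor_scores if s >= threshold)
    # A genuine comparison scoring below threshold is a false non-match.
    false_non_matches = sum(1 for s in genuine_scores if s < threshold)
    fmr = false_matches / len(impostor_scores)
    fnmr = false_non_matches / len(genuine_scores)
    return fmr, fnmr
```

Sweeping the threshold over the pooled score range yields the points of a DET or ROC curve in the same way.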
Applications and who uses it
ISO/IEC 19795-2 is practical for organizations involved in biometric performance testing, including:
- Biometric system developers and algorithm researchers - to benchmark algorithms using standardized technology evaluation methods.
- System integrators and vendors - to validate sensors and end-to-end systems via scenario evaluations.
- Test laboratories and certification bodies - to design repeatable test protocols and produce comparable reports.
- Deployers and procurement teams - to evaluate candidate systems under representative operational scenarios.
- Academics and R&D groups - for reproducible experimental design and cross-algorithm comparisons.
Keywords: biometric performance testing, biometric evaluation methodologies, ISO/IEC 19795-2:2007, technology evaluation, scenario evaluation, biometric algorithms, biometric sensors, test protocol.
Related standards
- ISO/IEC 19795-1 - Principles and framework for biometric performance testing and reporting (normative reference).
- Other parts under development (modality‑specific testing, data interchange formats, access control performance) expand modality and interoperability guidance.
Frequently Asked Questions
ISO/IEC 19795-2:2007 is a standard published jointly by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC). Its full title is "Information technology - Biometric performance testing and reporting - Part 2: Testing methodologies for technology and scenario evaluation". Its scope is summarized in the abstract and Overview above.
ISO/IEC 19795-2:2007 is classified under the following ICS (International Classification for Standards) categories: 35.040 - Information coding; 35.240.15 - Identification cards. Chip cards. Biometrics. The ICS classification helps identify the subject area and facilitates finding related standards.
ISO/IEC 19795-2:2007 has the following relationships with other standards: it has inter-standard links to ISO 2320:2015 and ISO/IEC 19795-2:2007/Amd 1:2015. Understanding these relationships helps ensure you are using the most current and applicable version of the standard.
Standards Content (Sample)
INTERNATIONAL ISO/IEC
STANDARD 19795-2
First edition
2007-02-01
Information technology — Biometric
performance testing and reporting —
Part 2:
Testing methodologies for technology
and scenario evaluation
Technologies de l'information — Essais et rapports de performance
biométriques —
Partie 2: Méthodologies d'essai pour l'évaluation des technologies et du
scénario
Reference number ISO/IEC 19795-2:2007(E)
© ISO/IEC 2007
© ISO/IEC 2007
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means,
electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or
ISO's member body in the country of the requester.
ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland
ii © ISO/IEC 2007 – All rights reserved
Contents ... Page
Foreword ... iv
Introduction ... v
1 Scope ... 1
2 Conformance ... 1
3 Normative references ... 1
4 Terms and definitions ... 2
4.1 Biometric data ... 2
4.2 Components of a biometric system ... 2
4.3 User interaction with a biometric system ... 2
4.4 Performance measures ... 3
5 Overview of technology evaluations and scenario evaluations ... 3
6 Technology evaluation ... 6
6.1 Test design ... 6
6.2 Assembling an appropriate test corpus ... 8
6.3 Performance measurement ... 11
6.4 Reporting ... 16
7 Scenario evaluation ... 18
7.1 Test design ... 18
7.2 Test crew ... 23
7.3 Performance measurement ... 24
7.4 Reporting ... 26
8 Other issues applicable to technology and scenario evaluations ... 29
8.1 Parties to a test ... 29
8.2 Fairness ... 29
8.3 Basis for inclusion of test systems ... 29
8.4 Use of Frequently Asked Questions ... 30
8.5 Legal issues ... 30
8.6 Release of test source code ... 30
8.7 Supplier comment on test report ... 30
Annex A (informative) Phases and activities for primary technology test types ... 31
Annex B (informative) Relationship between presentations, attempts, and transactions ... 37
Annex C (informative) Reporting effort levels ... 38
Annex D (informative) Client-server testing ... 40
Annex E (informative) Comparing results across systems in multi-system tests ... 41
Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are members of
ISO or IEC participate in the development of International Standards through technical committees
established by the respective organization to deal with particular fields of technical activity. ISO and IEC
technical committees collaborate in fields of mutual interest. Other international organizations, governmental
and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information
technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of the joint technical committee is to prepare International Standards. Draft International
Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as
an International Standard requires approval by at least 75 % of the national bodies casting a vote.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.
ISO/IEC 19795-2 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee SC 37, Biometrics.
ISO/IEC 19795 consists of the following parts, under the general title Information technology — Biometric
performance testing and reporting:
⎯ Part 1: Principles and framework
⎯ Part 2: Testing methodologies for technology and scenario evaluation
The following parts are under preparation:
⎯ Part 3: Modality-specific testing [Technical Report]
⎯ Part 4: Performance and interoperability testing of data interchange formats
⎯ Part 5: Performance of biometric access control systems
Introduction
This part of ISO/IEC 19795 addresses two specific biometric performance testing methodologies: technology
and scenario evaluation. The large majority of biometric tests are of one of these two generic evaluation types.
Technology evaluations evaluate enrolment and comparison algorithms by means of previously collected
corpuses, while scenario evaluations evaluate sensors and algorithms by processing of samples collected
from Test Subjects in real time. The former is intended for generation of large volumes of comparison scores
and candidate lists indicative of the fundamental discriminating power of an algorithm. The latter is intended
for measurement of performance in modeled environments, inclusive of Test Subject-system interactions.
This part of ISO/IEC 19795 builds on requirements and best practices specified in ISO/IEC 19795-1, which
addresses specific philosophies and principles that can be applied over a broad range of test conditions.
This part of ISO/IEC 19795 is meant to provide biometric system developers, deployers and end users with
mechanisms for design, execution and reporting of biometric performance tests in a fashion that allows
meaningful benchmarking of biometric performance within and across technologies, usage scenarios and
environments.
INTERNATIONAL STANDARD ISO/IEC 19795-2:2007(E)
Information technology — Biometric performance testing and
reporting —
Part 2:
Testing methodologies for technology and scenario evaluation
1 Scope
This part of ISO/IEC 19795 provides requirements and recommendations on data collection, analysis and
reporting specific to two primary types of evaluation: technology evaluation and scenario evaluation.
This part of ISO/IEC 19795 specifies requirements in the following areas:
⎯ development and full description of protocols for technology and scenario evaluations;
⎯ execution and reporting of biometric evaluations reflective of the parameters associated with biometric
evaluation types.
2 Conformance
A test shall claim conformance to either the technology evaluation or scenario evaluation clauses of this part
of ISO/IEC 19795.
The set of clauses to which a scenario test shall conform differs from the set of clauses to which a technology
test shall conform. In addition, the set of clauses to which an identification-system test shall conform differs
from the set of clauses to which a verification-system test shall conform. To conform to this part of
ISO/IEC 19795, an evaluation shall conform to clauses of this part of ISO/IEC 19795 as shown in Table 1.
Table 1 — Conformance for evaluation methodologies and comparison types
Evaluation methodology Comparison type Required clauses
Technology or scenario Identification or verification Clauses 5 and 8
Technology Identification All of Clause 6, except 6.3.3
Technology Verification All of Clause 6, except 6.3.4
Scenario Identification All of Clause 7, except 7.3.4
Scenario Verification All of Clause 7, except 7.3.5
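Table 1 is effectively a lookup from evaluation methodology and comparison type to the required clause set. A minimal sketch of that mapping (the function and constant names are illustrative, not part of the standard):

```python
# Clause sets transcribed from Table 1, keyed by (methodology, comparison type).
# Clauses 5 and 8 are required for every conforming evaluation.
REQUIRED_CLAUSES = {
    ("technology", "identification"): "Clauses 5 and 8; all of Clause 6 except 6.3.3",
    ("technology", "verification"): "Clauses 5 and 8; all of Clause 6 except 6.3.4",
    ("scenario", "identification"): "Clauses 5 and 8; all of Clause 7 except 7.3.4",
    ("scenario", "verification"): "Clauses 5 and 8; all of Clause 7 except 7.3.5",
}

def required_clauses(methodology: str, comparison: str) -> str:
    """Return the clause set an evaluation must conform to per Table 1."""
    return REQUIRED_CLAUSES[(methodology.lower(), comparison.lower())]
```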
3 Normative references
The following referenced documents are indispensable for the application of this document. For dated
references, only the edition cited applies. For undated references, the latest edition of the referenced
document (including any amendments) applies.
ISO/IEC 19795-1, Information technology — Biometric performance testing and reporting — Part 1: Principles
and framework
4 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO/IEC 19795-1:2006 and the following
apply.
4.1 Biometric data
4.1.1
biometric reference
〈template, model〉 user’s stored reference measure based on features extracted from enrolment samples
4.2 Components of a biometric system
4.2.1
feature extractor
apparatus that extracts features from a sample
4.2.2
biometric reference generator
apparatus that transforms a sample into a biometric reference
4.3 User interaction with a biometric system
4.3.1
acclimatization
reduction, over the course of an evaluation, in a temporal condition of a biometric characteristic that may
impact the ability of a sensor to process a sample
4.3.2
effort level
number of presentations, attempts or transactions needed to successfully enrol or match in a biometric system
4.3.3
enrolment attempt
submission of one or more biometric samples for a Test Subject for the purpose of enrolment in a biometric
system
NOTE 1 One or more enrolment attempts may be permitted or required to constitute an enrolment transaction. An
enrolment attempt may comprise one or more enrolment presentations.
NOTE 2 See Annex B for illustration of the relationship between presentation, attempt and transaction.
4.3.4
enrolment attempt limit
maximum number of attempts, or the maximum duration, a Test Subject is permitted before an enrolment
transaction is terminated
4.3.5
enrolment presentation
submission of an instance of a biometric characteristic for a Test Subject for the purpose of enrolment
NOTE One or more enrolment presentations may be permitted or required to constitute an enrolment attempt. An
enrolment presentation may or may not result in an enrolment attempt.
4.3.6
enrolment presentation limit
maximum number of presentations, or the maximum duration, a Test Subject is permitted before an enrolment
attempt is terminated
4.3.7
guidance
direction provided by an Administrator to a Test Subject in the course of enrolment or recognition
NOTE Guidance is separate from feedback provided by a biometric system or device in the course of enrolment or
recognition, such as audible or visual presentation cues.
4.3.8
habituation
degree of familiarity a Test Subject has with a device
NOTE A Test Subject having substantial familiarity with a biometric device, such as that gained in the course of
employment, is referred to as a habituated Test Subject.
4.3.9
comparison attempt
submission of one or more biometric samples for a Test Subject for the purpose of comparison in a biometric
system
4.3.10
comparison attempt limit
maximum number of attempts, or the maximum duration, a Test Subject is permitted before a comparison
transaction is terminated
4.3.11
comparison presentation
submission of an instance of a single biometric characteristic for a Test Subject for the purpose of comparison
NOTE One or more comparison presentations may be permitted or required to constitute a comparison attempt. A
comparison presentation may or may not result in a comparison attempt.
4.3.12
comparison presentation limit
maximum number of presentations, or the maximum duration, a Test Subject is permitted before a
comparison attempt is terminated
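The definitions in 4.3 describe a hierarchy in which presentations aggregate into attempts and attempts into transactions, each bounded by a limit (see Annex B). A toy sketch of how a tester's instrumentation might enforce count-based limits during enrolment; all names and the termination policy are illustrative assumptions, not normative:

```python
from dataclasses import dataclass

@dataclass
class EnrolmentCounter:
    """Tracks presentations within an attempt and attempts within a
    transaction, terminating each level when its limit is reached.
    Illustrative only; duration-based limits are also permitted."""
    attempt_limit: int
    presentation_limit: int
    attempts: int = 0
    presentations: int = 0  # presentations within the current attempt

    def presentation(self) -> str:
        """Record one enrolment presentation; return the resulting state."""
        self.presentations += 1
        if self.presentations >= self.presentation_limit:
            return self.end_attempt()
        return "attempt in progress"

    def end_attempt(self) -> str:
        """Close the current attempt and check the transaction limit."""
        self.attempts += 1
        self.presentations = 0
        if self.attempts >= self.attempt_limit:
            return "transaction terminated"
        return "attempt terminated"
```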
4.4 Performance measures
4.4.1
failure at source rate
proportion of samples discarded from the corpus either manually or by use of an automated biometric system
prior to use in a technology evaluation
EXAMPLE A proportion of images collected in a face data collection effort may be discarded due to lack of a face in
the image.
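The failure-at-source rate can be computed directly when corpus assembly records which samples were discarded. A minimal sketch (the function name and predicate are assumptions; the predicate stands in for either manual inspection or an automated check such as a face detector):

```python
def partition_corpus(samples, usable):
    """Split a collected corpus into the retained evaluation corpus and
    report the failure-at-source rate: the proportion of collected
    samples discarded before use in a technology evaluation."""
    kept = [s for s in samples if usable(s)]
    discarded = len(samples) - len(kept)
    return kept, discarded / len(samples)
```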
5 Overview of technology evaluations and scenario evaluations
This standard addresses two types of evaluation methodologies: technology evaluations and scenario
evaluations. A test report shall state whether it presents results from a technology evaluation, a scenario
evaluation, or an evaluation that combines aspects of both technology and scenario evaluations.
Technology evaluation is the offline evaluation of one or more algorithms for the same biometric modality
using a pre-existing or specially-collected corpus of samples. The utility of technology testing stems from its
separation of the human-sensor acquisition interaction and the recognition process, whose benefits include
the following:
⎯ Ability to conduct full cross-comparison tests. Technology evaluation affords the possibility to use the
entire testing population as claimants to the identities of all other members (i.e. impostors), and this allows
estimates of false match rates to be made on the order of one in N², rather than one in N.
⎯ Ability to conduct exploratory testing. Technology evaluation can be run with no real-time output
demands, and is thus well-suited to research and development. For example, the effects of algorithmic
improvements, changes in run time parameters such as effort levels and configurations, or different
image databases, can be measured in, essentially, a closed-loop improvement cycle.
⎯ Ability to conduct multi-instance and multi-algorithmic testing. By using common test procedures,
interfaces, and metrics, technology evaluation affords the possibility to conduct repeatable evaluations of
multi-instance systems (e.g. three views of a face) and multi-algorithmic (e.g. supplier A and supplier B)
performance, or any combination thereof.
⎯ Provided the corpus contains appropriate sample data, technology testing is potentially capable of testing
all modules subsequent to the human-sensor interface, including: a quality control and feedback
module(s), signal processing module(s), image fusion module(s) (for multi-modal or multi-instance
biometrics), feature extraction and normalization module(s), feature-level fusion module(s), comparison
score computation and fusion module(s), and score normalization module(s).
⎯ The nondeterministic aspects of the human-sensor interaction preclude true repeatability and this
complicates comparative product testing. Elimination of this interaction as a factor in performance
measurement allows for repeatable testing. This offline process can be repeated ad infinitum with little
marginal cost.
⎯ If sample data is available, performance can be measured over very large target populations, utilizing
samples acquired over a period of years.
NOTE 1 Collecting a database of samples for offline enrolment and calculation of comparison scores allows greater
control over which samples and attempts are to be used in any transaction.
NOTE 2 Technology evaluation will always involve data storage for later, offline processing. However, with scenario
evaluations, online transactions might be simpler for the tester — the system is operating in its usual manner and storage
of samples, although recommended, is not absolutely necessary.
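The first benefit above, full cross-comparison, can be sketched as follows: each subject's sample is compared against every other subject's enrolment template, yielding N(N-1) impostor scores from a corpus of N Test Subjects, alongside N genuine scores. The data layout and the `compare` scoring function are illustrative assumptions:

```python
from itertools import permutations

def cross_comparison(templates, samples, compare):
    """Full cross-comparison over a corpus of N subjects.

    templates, samples: dicts mapping subject id -> enrolment template
    and subject id -> probe sample. compare(template, sample) returns a
    comparison score (higher = closer match). Returns N genuine scores
    and N*(N-1) impostor scores."""
    subjects = list(templates)
    genuine = [compare(templates[s], samples[s]) for s in subjects]
    impostor = [compare(templates[a], samples[b])
                for a, b in permutations(subjects, 2)]
    return genuine, impostor
```

With the scores in hand, false match rates down to roughly 1/(N(N-1)) become estimable, which is the point made in the bullet above.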
Scenario evaluation is the online evaluation of end-to-end system performance in a prototype or simulated
application. The utility of scenario testing stems from the inclusion of human-sensor acquisition interaction in
conjunction with the enrolment and recognition processes, whose benefits include the following:
⎯ Ability to gauge impact of additional attempts and transactions on system's ability to enrol and recognize
Test Subjects.
⎯ Ability to collect throughput results for enrolment and recognition trials inclusive of presentation and
sample capture duration.
NOTE 3 In online evaluations, the Experimenter may decide not to retain biometric samples, reducing storage
requirements and in certain cases ensuring fidelity to real-world system operations. However, retention of samples in
online tests is recommended for auditing and to enable subsequent offline analysis.
NOTE 4 Testing a biometric system will involve the collection of input images or signals, which are used for biometric
reference generation at enrolment and for calculation of comparison scores at later attempts. The images/signals collected
can either be used immediately for an online enrolment, verification or identification attempt, or may be stored and used
later for offline enrolment, verification or identification.
Information on differences between technology and scenario evaluations is presented in Table 2.
Table 2 — Distinctions between technology and scenario evaluations
| | Technology evaluations | Scenario evaluations |
|---|---|---|
| What is tested | Biometric component (comparison or extraction algorithm). | Biometric system. |
| Objective of test | Measure performance of algorithm(s) on a standardized corpus. | Measure performance of end-to-end system in simulated application. |
| Ground truth | Known associations between data samples and source of samples, subject to data collection errors and intersections in merged data sets. | Known associations between system decisions and independently recorded sources of presented samples, subject to data collection errors and tester failure to note unwanted Test Subject behaviour. |
| Test Subject behaviour controlled by Experimenter | Not applicable during testing. May be known to be controlled when biometric data recorded, otherwise considered to be uncontrolled. | Controlled (unless Test Subject behaviour is an independent variable). |
| Test Subject has real-time feedback of the result of attempt | No. | Yes. |
| Repeatability of results | Repeatable. | Quasi-repeatable (if test environment conditions and human factors variables are controlled). |
| Control of physical environment | May be known to be controlled when biometric data recorded, otherwise considered to be uncontrolled. | Controlled and/or recorded. |
| Test Subject interaction recorded | Not applicable during testing. May be recorded when biometric data recorded. | Recorded. |
| Typical results reported | Relative robustness of biometric components or versions of components (e.g., comparison or extraction algorithms). Determine critical performance factors. | Relative robustness of biometric systems. Measure simulated performance. Determine critical performance factors. |
| Typical metrics | Most error rates. Not end-to-end throughput. Good for large-scale identification system performance where difficult to assemble large test crew. | Predicted end-to-end throughput. False match rate, false non-match rate. Failure to acquire, failure to enrol. GFAR, GFRR. |
| Constraints | Appropriate test database, e.g., gathered with one or more sensors, the identity of which may or may not be known. | Operational, instrumented system. |
| Human test population | Recorded. | Real time participation. |
NOTE 5 Although in some cases there may be exceptions to the entries in this table, these are the main distinctions.
6 Technology evaluation
6.1 Test design
6.1.1 Goals
An evaluation shall be designed to evaluate a system's enrolment, acquisition and matching functions on the
target application.
6.1.2 Application realism
If the test intends to evaluate performance within an application or concept of operations, the test shall be
designed and executed so that it mimics the functional (input to output) and procedural (e.g. enrolment or
verification processes) aspects of such an application or concept of operations.
EXAMPLE If several images are typically gathered to constitute an enrolment transaction in a real-world enrolment
attempt, technology test design should follow a similar process.
For testing purposes, the implementations under test should, if possible, return the comparison score of each
comparison attempt.
6.1.3 Determination of appropriate performance measures
Experimenters shall determine which performance measures are applicable to their evaluation, in addition to
those listed at clause 6.3.
Test design shall ensure that all required metrics can be generated.
Experimenters shall determine and report on the type(s) of comparison functionality to be incorporated within
the technology test. One or more of the following types of comparison shall be specified:
a) verification
b) open-set identification
c) closed-set identification
The rationale for selection of one or more types of comparison functionality within a technology test shall be
reported. The comparison functionality evaluated should be applicable to the algorithm in question, such that
systems designed to conduct a specific type of comparison such as watchlist identification are tested in a
fashion that generates the appropriate type of result.
NOTE Formulae for error rate calculation are provided in ISO/IEC 19795-1:2006, Clause 7.
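The distinction between open-set and closed-set identification in b) and c) can be illustrated with a small decision function: a closed-set search always returns the best-ranked enrolled identity, while an open-set search may return no identity when even the best candidate score falls below threshold. This sketch is illustrative only; names and score conventions are assumptions:

```python
def identification_result(candidate_scores, threshold=None):
    """Return the outcome of one 1:N identification search.

    candidate_scores: dict mapping enrolled identity -> comparison score
    (higher = closer match). With threshold=None the search is treated
    as closed-set (the best candidate is always returned); with a
    threshold it is open-set (None means no enrolled identity matched)."""
    best_id = max(candidate_scores, key=candidate_scores.get)
    if threshold is not None and candidate_scores[best_id] < threshold:
        return None  # open-set: probe judged not enrolled
    return best_id
```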
6.1.4 Implementation primacy
The test plan shall not dictate the method(s) by which the biometric recognition system implements its
functions. It is the responsibility of the biometric recognition implementation to perform its functions in its own
way.
NOTE The separation of what a tested biometric system does from how it does it is the fundamental construct for
allowing offline testing to be done. It is primarily useful in establishing the responsibilities of tester versus supplier. The
system under test should be regarded wherever possible as a black box: Its essential function is to render decisions on
input samples. The internal details of how this occurs may be proprietary, but in any case, are of no concern to the tester.
This construction facilitates the testing of arbitrary biometric samples.
EXAMPLE 1 If a fingerprint is sampled at 1000 dpi, and a test device is known to process only half that, then the tester
should a) not execute the down-sampling on the basis that the method for doing so is non-trivial, and b) apprise the
supplier of the need to handle the down-sampling internally.
EXAMPLE 2 A set of simultaneously acquired non-frontal face images could be processed by a biometric system and
device in at least three ways: selection of the best image; fusion of all images; or stereoscopic synthesis of a 3D model. In
any case the biometric system or device decides.
EXAMPLE 3 Most automated fingerprint identification system (AFIS) machines (i.e. those identifying multi-fingerprint
records) implement some binning mechanism to partition the database according to some criterion (most simply, the
Henry class) and to search only that part of the database of the same category as a user or impostor sample, thereby
obtaining throughput benefits, but possibly incurring accuracy losses. Such a tradeoff is achieved by the supplier setting
internal binning parameters, and measured by conducting full-scale repeats of the test for each configuration.
EXAMPLE 4 In a study seeking to demonstrate the utility of multiple fingers' prints in a recognition system, the tester
should not pass separate samples through the device and perform subsequent score-level fusion, but should instead
compose all the imagery as a sample (e.g. in an American National Standards Institute [ANSI]-National Institute of
Standards and Technology [NIST] record, or a Common Biometric Exchange Formats Framework [CBEFF] wrapped
instance of ISO/IEC 19794-2) so as to let the biometric device perform the fusion internally. See ISO/IEC 19794-2, Information
technology — Biometric data interchange formats — Part 2: Finger minutiae data for further information on CBEFF
wrapped instances. See ANSI/NIST-ITL 1-2000 NIST Special Publication 500-245 for information on ANSI-NIST records.
6.1.5 Policies on disclosure of information to suppliers
The tester shall formulate policies before testing begins that govern what information will be disclosed to the
suppliers a) before test equipment is configured, shipped or installed, and b) at execution time.
6.1.6 Non-interchangeability of identification and verification attempts
Comparison scores that result from a one-to-many identification search shall not be presented as the results
of verification attempts without justification.
NOTE 1 The principle of operational realism indicates that performance shall be estimated from outcomes of attempts
(i.e. rejects and acceptances). A verification system shall be evaluated on the results of a sequence of user claims of
identity. Likewise an identification system shall be tested over one-to-many searches. Even in the case that a one-to-many
search produces a full candidate list, the candidate list is atomic, meaning that it should not be regarded as the result of N
verification attempts (to be used in the computation of verification performance).
NOTE 2 The non-equivalence of a single identification attempt and N one-to-one verifications arises because verification can be improved by comparing the user sample with additional hidden samples in a process known as cohort normalization. The method adjusts the raw single comparison score in order to drive down false accept rates by effectively setting user-specific thresholds. The method trades throughput for accuracy because the additional comparisons mean 1:1 verification incurs the expense of a 1:M search, where M is the size of the hidden biometric reference set.
NOTE 3 The use of cohort normalization is properly conducted internally to the device, making private use of an
internally selected enrolled population.
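The cohort normalization of NOTE 2 can be illustrated with a zero-normalization (z-norm) style adjustment. The cohort set and the score values below are illustrative assumptions; the standard does not prescribe a particular normalization scheme.

```python
import statistics

def cohort_normalize(raw_score, cohort_scores):
    # Compare the same probe against M hidden cohort references, then
    # express the raw score relative to the cohort score distribution.
    # The M extra comparisons are why 1:1 verification incurs 1:M cost.
    mu = statistics.mean(cohort_scores)
    sigma = statistics.stdev(cohort_scores)
    return (raw_score - mu) / sigma

# A genuine score far above its cohort normalizes to a large value;
# an impostor score near the cohort normalizes toward zero.
z = cohort_normalize(0.9, [0.2, 0.3, 0.25, 0.35])
```

The effect is a user-specific threshold: the same raw score is accepted or rejected depending on how the cohort scores for that probe.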
6.1.7 Acknowledgement of models
If a model, approximation or prediction of identification performance is reported in place of, or in addition to, an
empirical trial, the model shall be verified to the extent possible with the available data and fully documented.
6.1.8 Sequential use
The test plan shall define the order of use of the test data. This order shall be appropriate to the application.
The implementation should process the test data in this sequence.
NOTE 1 Transactions are ordinarily executed separately. Therefore the implementation would need to complete one
transaction before commencing the next transaction.
NOTE 2 The majority of biometric applications involve the sequential and separate use of biometric systems or devices by individuals who, in the case of genuine users, have previously enrolled.
NOTE 3 Certain identification tasks may not be sequential. For example batch identification of all persons in a closed
room is easier because it reduces to the linear assignment problem.
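The closed-room case of NOTE 3 can be sketched as a linear assignment problem: choose the one-to-one pairing of probes to enrolled identities that maximizes total similarity. The brute-force search below is for illustration only; practical systems would use an efficient method such as the Hungarian algorithm, and the similarity values are invented.

```python
from itertools import permutations

def batch_identify(similarity_matrix):
    # Row i holds probe i's similarity to each enrolled identity.
    # Return, for each probe, the index of its assigned identity under
    # the globally best one-to-one assignment.
    n = len(similarity_matrix)
    best = max(permutations(range(n)),
               key=lambda p: sum(similarity_matrix[i][p[i]] for i in range(n)))
    return list(best)

assignment = batch_identify([[0.9, 0.2, 0.1],
                             [0.3, 0.8, 0.4],
                             [0.2, 0.5, 0.7]])
```

Because everyone in the room must map to a distinct identity, a weak individual score can still be resolved correctly by the global constraint, which is why the batch problem is easier than independent sequential searches.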
6.1.9 Pre-test procedures
6.1.9.1 Installation and validation of correct operation
The test organization shall take steps to ensure that the hardware/software is installed and configured
appropriately and shall verify that the system is operating correctly.
NOTE Installation, configuration, and verification of system operations may involve supplier(s).
6.1.9.2 Data preparation
Data preparation shall ensure that Test Subject-identifying information and any associated metadata that would not ordinarily be available to the application (e.g. sex, age) are expunged from the samples. Otherwise a supplier might deduce the true identities and game the test.
6.1.10 Generic test execution sequence
The following is a generic description of the sequence of technology test execution:
⎯ Enrolment samples are converted to biometric references and may be stored in a linear collection.
⎯ Identification and verification samples are converted to sample features.
⎯ Verification attempts are a direct comparison of sample features to a biometric reference.
⎯ Closed-set identification attempts are a search of the enrolled population intended to return the user’s
identifier.
⎯ Open-set identification attempts search the enrolled database and
⎯ return one or more identities;
⎯ return a null identity, indicating Test Subject is not found in the enrolled database.
NOTE 1 The above functionality may be implemented at the API level, or by scripting around executables.
NOTE 2 Annex A describes test execution sequences for specific types of technology tests.
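The generic execution sequence above can be sketched as a driver loop around a black-box implementation. Everything here is an assumption for illustration: `enrol`, `extract` and `compare` stand in for the supplier's API, and the score threshold applies only to the open-set case.

```python
def run_technology_test(enrol, extract, compare,
                        enrol_samples, probe_samples, threshold):
    # 1. Enrolment samples -> biometric references in a linear collection.
    references = {ident: enrol(sample) for ident, sample in enrol_samples}
    results = []
    for ident, sample in probe_samples:
        features = extract(sample)  # 2. probe sample -> sample features
        # Open-set identification: search every enrolled reference and
        # keep identities scoring at or above threshold; an empty
        # candidate list corresponds to the null identity.
        scores = {e: compare(features, ref) for e, ref in references.items()}
        candidates = sorted((s, e) for e, s in scores.items() if s >= threshold)
        results.append((ident, [e for _, e in reversed(candidates)]))
    return results

# Toy usage with numeric "samples" and a distance-based similarity.
out = run_technology_test(lambda s: s, lambda s: s,
                          lambda f, r: -abs(f - r),
                          [("A", 1.0), ("B", 5.0)],
                          [("A", 1.1), ("X", 9.0)],
                          threshold=-0.5)
```

Probe "A" returns itself on the candidate list; probe "X", an impostor, returns the null identity (an empty list).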
6.2 Assembling an appropriate test corpus
6.2.1 General
Technology evaluation is designed to evaluate one or more biometric algorithms for enrolment and
comparison performance. Technology test planning is contingent on the type of data an Experimenter wishes
to generate.
6.2.2 Unique enrolment
All corpus samples should correspond to real people. An evaluation design should not intentionally enrol
different samples from the same individual as if they were from different individuals. For tests in which each
identity corresponds to a different individual, the testing organization shall report processes implemented to
ensure this.
If it is possible that an individual has multiple identities in the corpus, the corpus may be "cleaned" to reconcile
such instances if practical. Otherwise the test should proceed under the assumption that each identity
corresponds to a different individual.
NOTE 1 Biometric systems are intended to uniquely identify single individuals. If more than one image or signal is
available for an individual it should be encapsulated as a single sample and used for enrolment or comparison.
NOTE 2 Populating an identification system with more than one sample per individual (from within one or more
modalities) and then regarding the enrolments as nominally separate is a deprecated practice for the following reasons:
Identification entails a search through enrolled samples and generation of a candidate list. When multiple samples are
separately enrolled a score-level fusion of each user’s samples using the max criterion is implied because the largest
scored entry wins. Even if the number of samples per person is equal for all persons, the practice is deprecated because it
is the supplier’s responsibility to combine each individual’s samples in what it deems the best way.
Error metrics that depend on the size of the enrolled population, N, will be incorrect if the size of the enrolled population is
not the number of separate individuals.
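The implied max-rule fusion described in NOTE 2 can be made concrete. The score values are invented; the point is only that when one person's samples are enrolled separately, the largest of their scores is what effectively competes on the candidate list.

```python
def implied_max_fusion(scores_per_person):
    # scores_per_person maps an identity to the scores of each of that
    # person's separately enrolled samples; the max wins the candidate
    # list, so separate enrolment implies max-rule score fusion.
    return {ident: max(scores) for ident, scores in scores_per_person.items()}

fused = implied_max_fusion({"A": [0.41, 0.78, 0.63], "B": [0.52]})
```

This is why the practice is deprecated: the tester has silently imposed a particular fusion rule, a choice that properly belongs to the supplier.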
NOTE 3 An evaluation that seeks to investigate the effect of multiple (separate) enrolled biometric references per Test
Subject is exempted from this clause, provided that this is documented in both the test plan and the test report.
6.2.3 Recurrence of data acquisition
Depending on the Experimenter’s level of access to the test population, each Test Subject may be able to
provide data multiple times over the course of multiple visits. The number of transactions and visits can be
maximized in order to enable granular measurement of biometric reference aging effects, though this will also be informed by habituation effects.
6.2.4 Test Subject identification
The Experimenter shall report information related to Test Subject identification, including at a minimum the
following:
a) types of identifiers used to identify Test Subjects;
b) amount and type of personal data collected.
6.2.5 Provision of non-biometric information
If available in the corpus, metadata normally available to a deployed system shall be provided to the system(s)
under test. The test report shall state the names and types of any metadata variables that were made
available to the systems under test.
EXAMPLE Such data may be sensor-specific (e.g. sensor settings), environmental (e.g. temperature, humidity), Test
Subject-specific (e.g. gender, age), or any other germane information.
NOTE Technology testing is unable to incorporate multiple aspects of real-world biometric operations, but evaluation
design should not exclude aspects of real-world biometric operations unnecessarily.
6.2.6 Representativeness of corpus
Evaluation design shall consider, and a test report shall document, whether the data in the test corpus is
appropriate for the goals of the test or the applications of interest.
If data is acquired under the supervision or control of the test organization, information pertaining to
Experimenter-Test Subject interaction shall be recorded in the areas of acclimatization, training, habituation,
and guidance.
NOTE 1 The utility of technology evaluation in producing predictive estimates of deployed performance is predicated
upon the assumption that it is possible to consistently acquire samples from users in the same format and with the same
quality as the data used in the test.
NOTE 2 Ideally, data collected for different modalities has equivalent levels of habituation, acclimatization, guidance,
etc.
6.2.7 Untainted corpus
A corpus may be considered “tainted” to a greater or lesser extent if
a) any implementation supplier has had possession of the corpus;
b) any implementation supplier has provided equipment used in collecting or processing the corpus,
particularly if this activity influenced the nature or quality of the corpus such as by excluding samples;
c) a system being tested has previously been tested and tuned using the corpus.
When use of a tainted corpus is unavoidable, this fact shall be documented in the test report.
Sample data should not be used in an evaluation if one or more of the participating suppliers has had
possession of it. Previous testing / tuning of the system using the test corpus (in whole or in part) shall be
documented in the test report.
NOTE 1 This clause is necessary because performance may be improved via gaming.
NOTE 2 It is generally insufficient to trivially alter samples to circumvent the reuse prohibition. Gaming may still be possible if any identifiable trait of the previously seen samples remains.
6.2.8 Retirement of corpus
Samples should not be reused in an evaluation if one or more systems under test have been tuned on the
basis of performance measured in a previous test with that data.
NOTE 1 This is most readily achievable by using sequestered data.
NOTE 2 This may be expensive in that it implies additional collection activity.
6.2.9 Corpus validation
Validation is the process whereby Test Subject data is screened so that data unsuitable for the purpose of the evaluation can be removed.
Validation may include checking to ensure that Test Subject data is present, that the data is in the correct
format, that the correct instance has been collected, and that ground truth errors are identified.
Experimenters shall report whether Test Subject data has been validated. If data has been validated, the
Experimenter shall detail the method(s) applied in validating the data. The proportion and criteria for data
removal shall be reported.
EXAMPLE 1 Database quality control might be used to avoid images with bad contrast in the Test Subject data.
EXAMPLE 2 Data samples might be excluded that do not show a face in a face recognition technology test (e.g. no face at all, or a full body) or that do not show a fingerprint in a fingerprint recognition technology test (e.g. a palm print).
NOTE 1 Since some types of biometric data may be more easily validated than others, use of data validation could
introduce a bias in performance results.
NOTE 2 Data removed by “corpus validation” is distinct from that discarded as “Failure at source.” Sometimes a
judgement call will be needed as to whether excluded data should be considered invalid or failure at source.
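A validation pass of the kind described in 6.2.9 can be sketched as below. The check functions, field values and removal criteria are invented for illustration; the clause only requires that the methods, proportion and criteria for removal be reported.

```python
def validate_corpus(samples, checks):
    # checks: list of (criterion_name, predicate) pairs. A sample is
    # removed if any check fails; failed criteria are recorded so the
    # removal criteria can be reported alongside the proportion removed.
    kept, removed = [], []
    for sample in samples:
        failed = [name for name, check in checks if not check(sample)]
        (removed if failed else kept).append((sample, failed))
    proportion_removed = len(removed) / len(samples) if samples else 0.0
    return kept, removed, proportion_removed

checks = [("present", lambda s: s is not None),
          ("in_range", lambda s: isinstance(s, int) and 0 <= s <= 255)]
kept, removed, p = validate_corpus([10, None, 300, 42], checks)
```

Recording the failed criterion per sample also supports the judgement call in NOTE 2, since a sample rejected for absence may be failure at source rather than invalid data.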
6.2.10 Corpus collection environment
Environmental conditions present during data collection may be known or specified. Such collection would
typically be intended to measure performance under specific environmental conditions relative to baseline
environmental conditions. Such controls may be established for temperature, lighting, humidity, and other
factors known or suspected to impact biometric performance.
Available information pertaining to environmental conditions during corpus acquisition relevant to the
modalities under evaluation should be reported, such as the following:
⎯ temperature;
⎯ exposure to elements;
⎯ lighting, including type, direction, intensity;
⎯ ambient noise;
⎯ vibration.
If such information is not available, Experimenters shall report this fact.
NOTE See ISO/IEC 19795-1:2006, C.2.6 for information on environmental factors that can impact performance.
6.2.11 Failure at source
Offline tests use stored biometric samples, which may have been gathered with or without a biometric system
in the acquisition process. The test report shall disclose any known information about how the data was
processed at any stage before use in the test. In particular, if samples were discarded, either manually or by use of an automated biometric system, then a failure at source rate (FAS) shall be reported.
NOTE 1 FAS may relate to a different biometric sensor or image quality assessment algorithm than the system under
test.
NOTE 2 A judgment call may be needed: For example if a few legacy image samples are found to be entirely blank
then these could be legitimately not counted in FAS, unless of course such samples would routinely occur in the
application that the test is intended to mimic.
6.3 Performance measurement
6.3.1 Enrolment
Offline tests shall record, as the failure to enrol rate (FTE), the proportion of Test Subjects for whom an
implementation elects to reject enrolment of their designated enrolment samples in the corpus. Criteria by
which failure to enrol is declared shall be defined.
NOTE 1 Failure to enrol measured in a technology test is only partially representative of the failure modes possible in a
live acquisition.
NOTE 2 A failure to enrol can be declared by a system for any reason. A frequent reason is that the system (as
configured with its native image or signal detection and processing capability and with some quality acceptance criteria)
fails to detect the needed signal on the basis of low quality.
NOTE 3 By declaring a failure to enrol, a system can attain better comparison performance. This trade-off must be
accounted for by combining failure to enrol and false non-match rate to produce generalized false reject rate (GFRR).
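The combination described in NOTE 3 can be sketched numerically. The form below follows the generalized error-rate construction of ISO/IEC 19795-1, with the failure-to-acquire term included for completeness; the rate values are invented and the sketch is not normative text.

```python
def generalized_frr(fte, fta, fnmr):
    # A transaction is falsely rejected if enrolment fails, or (having
    # enrolled) acquisition fails, or (having acquired) the comparison
    # falsely non-matches.
    return fte + (1 - fte) * fta + (1 - fte) * (1 - fta) * fnmr

gfrr = generalized_frr(fte=0.02, fta=0.01, fnmr=0.05)
```

A system that lowers its FNMR by declaring more failures to enrol does not improve GFRR, which is the point of the trade-off accounting.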
The Experimenter shall specify the minimum number of samples required, and the maximum number of
samples permitted, for successful enrolment.
For each biometric system tested, the Experimenter should calculate the following:
a) distribution of enrolment quality scores, if available;
b) failure to enrol for different demographic groups, or associated with different environmental conditions, or
for other logical segments of the corpus.
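The segmented failure-to-enrol calculation of item b) can be sketched as a simple group-wise rate. The group labels and outcome records below are illustrative assumptions.

```python
from collections import defaultdict

def fte_by_group(enrolment_outcomes):
    # enrolment_outcomes: iterable of (group_label, enrolled_ok) pairs,
    # one per Test Subject; returns the failure-to-enrol rate per group.
    totals, failures = defaultdict(int), defaultdict(int)
    for group, ok in enrolment_outcomes:
        totals[group] += 1
        if not ok:
            failures[group] += 1
    return {g: failures[g] / totals[g] for g in totals}

rates = fte_by_group([("indoor", True), ("indoor", False),
                      ("outdoor", True), ("outdoor", True)])
```

The same grouping applies equally to demographic segments or any other logical partition of the corpus.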
6.3.2 Failure to acquire
Offline tests shall record the proportion of verification or identification attempts for which the system fails to
capture or locate an image or signal of sufficient quality. This is the failure to acquire rate (FTA).
NOTE 1 The failure to acquire rate is the comparison-phase analogue of the enrolment-phase failure to enrol, and thus
Notes 1 and 2 in Clause 6.3.1 apply analogously.
NOTE 2 The failure to acquire must be used along with false match rate to compute false accept rate.
NOTE 3 In a technology test, FTA is typically declared by an encoding or comparison component and is attributable to
failure to process an attempt.
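The relation implied by NOTE 2 can be written out as a sketch: an impostor attempt can only be falsely accepted if acquisition succeeded first, so the false accept rate discounts the false match rate by the acquisition failures. The rate values are invented for illustration.

```python
def false_accept_rate(fmr, fta):
    # FAR = FMR * (1 - FTA): a false match can only occur on the
    # fraction of attempts for which acquisition succeeded.
    return fmr * (1 - fta)

far = false_accept_rate(fmr=0.001, fta=0.02)
```

A high FTA therefore lowers the reported FAR, which is why the two rates must always be reported together.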
Th
...









