ASTM E2849-13
Standard Practice for Professional Certification Performance Testing
SIGNIFICANCE AND USE
3.1 This practice provides guidance to performance test sponsors, developers, and delivery providers for the planning, design, development, administration, and reporting of high-quality performance tests. It assists stakeholders from both the user and consumer communities in determining the quality of performance tests. It includes requirements, processes, and intended outcomes for the entities that issue the performance test, for those that develop, deliver, and evaluate it, and for the users and test takers who interpret it, and it sets out the specific quality characteristics of performance tests. This practice provides the foundation for both the recognition and accreditation of a specific entity to issue and effectively use a quality performance test.
3.2 Accreditation agencies are presently evaluating performance tests with criteria that were developed primarily or exclusively for multiple-choice examinations. The criteria by which performance tests shall be evaluated and accredited are ones appropriate to performance testing. As accreditation becomes more critical for acceptance by federal and state governments, insurance companies, and international trade, it becomes more critical that appropriate standards of quality and application be developed for performance testing.
SCOPE
1.1 This practice covers both the professional certification performance test itself and specific aspects of the process that produced it.
1.2 This practice does not include management systems. In this practice, the test itself and its administration, psychometric properties, and scoring are addressed.
1.3 This practice primarily addresses individual professional performance certification examinations, although it may be used to evaluate exams used in training, educational, and aptitude contexts. This practice is not intended to address on-site evaluation of workers by supervisors for competence to perform tasks.
1.4 This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility of the user of this standard to establish appropriate safety and health practices and determine the applicability of regulatory limitations prior to use.
Standards Content (Sample)
NOTICE: This standard has either been superseded and replaced by a new version or withdrawn. Contact ASTM International (www.astm.org) for the latest information.
Designation: E2849 − 13
An American National Standard
Standard Practice for Professional Certification Performance Testing¹
This standard is issued under the fixed designation E2849; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.
¹ This practice is under the jurisdiction of ASTM Committee E36 on Accreditation & Certification and is the direct responsibility of Subcommittee E36.80 on Personnel Performance Testing and Assessment. Current edition approved Dec. 1, 2013. Published December 2013. DOI: 10.1520/E2849-13.
1. Scope
1.1 This practice covers both the professional certification performance test itself and specific aspects of the process that produced it.
1.2 This practice does not include management systems. In this practice, the test itself and its administration, psychometric properties, and scoring are addressed.
1.3 This practice primarily addresses individual professional performance certification examinations, although it may be used to evaluate exams used in training, educational, and aptitude contexts. This practice is not intended to address on-site evaluation of workers by supervisors for competence to perform tasks.
1.4 This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility of the user of this standard to establish appropriate safety and health practices and determine the applicability of regulatory limitations prior to use.
2. Terminology
2.1 Definitions—Some of the terms defined in this section are unique to the performance testing context. Consequently, terms defined in other standards may vary slightly from those defined in the following.
2.1.1 candidate, n—someone who is eligible to be evaluated through the use of the performance test; a person who is or will be taking the test.
2.1.2 construct validity, n—degree to which the test evaluates an underlying theoretical idea resulting from the orderly arrangement of facts.
2.1.3 differential system responsiveness, n—measurable difference in response latency between two systems.
2.1.4 examinee, n—candidate in the process of taking a test.
2.1.5 gating item, n—unit of evaluation that shall be passed to pass a test.
2.1.6 inter-rater reliability, n—measurement of rater consistency with other raters.
2.1.6.1 Discussion—See rater reliability.
2.1.7 item, n—scored response unit.
2.1.7.1 Discussion—See task.
2.1.8 item observer, n—human or computer element that observes and records a candidate’s performance on a specific item.
2.1.9 on the job, n—another term for “target context.”
2.1.9.1 Discussion—See target context.
2.1.10 performance test, n—examination in which the response modality mimics or reflects the response modality required in the target context.
2.1.11 power test, n—examination in which virtually all candidates have time to complete all items.
2.1.12 practitioners, n—people who practice the contents of the test in the target context.
2.1.13 rater reliability, n—measurement of rater consistency with a uniform standard.
2.1.13.1 Discussion—See inter-rater reliability.
2.1.14 reconfiguration, n—modification of the user interface for a process, device, or software application.
2.1.14.1 Discussion—Reconfiguration ranges from adjusting the seat in a crane to importing a set of macros into a programming environment.
2.1.15 reliability, n—degree to which the test will make the same prediction with the same examinee on another occasion with no training occurring during the intervening interval.
2.1.16 rubric, n—set of rules by which performance will be judged.
2.1.17 speeded test, n—examination that is time-constrained so that more than 10 % of candidates do not finish all items.
2.1.18 target context, n—situation within which a test is designed to predict performance.
2.1.19 task, n—unit of performance requested for the candidate to do; a task can be scored as one item; a task may also be comprised of multiple components each of which is scored as an item.
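As an informal illustration of definition 2.1.17 (speeded test), the sketch below checks whether more than 10 % of candidates failed to finish all items. It is not part of the standard. The function names, the input layout (one completed-item count per candidate), and the sample data are assumptions made for illustration. Definition 2.1.11 (power test) gives no numeric threshold for "virtually all candidates", so the sketch does not try to classify a test as a power test.

```python
from typing import Sequence

# Threshold from definition 2.1.17: a test is speeded when more than
# 10 % of candidates do not finish all items.
SPEEDED_THRESHOLD = 0.10


def unfinished_fraction(items_completed: Sequence[int], total_items: int) -> float:
    """Return the fraction of candidates who did not complete every item.

    items_completed holds one entry per candidate (hypothetical layout):
    the number of items that candidate finished before time ran out.
    """
    if not items_completed:
        raise ValueError("at least one candidate record is required")
    unfinished = sum(1 for done in items_completed if done < total_items)
    return unfinished / len(items_completed)


def is_speeded(items_completed: Sequence[int], total_items: int) -> bool:
    """True when strictly more than 10 % of candidates left items unfinished."""
    return unfinished_fraction(items_completed, total_items) > SPEEDED_THRESHOLD


if __name__ == "__main__":
    # Illustrative data only: 20 candidates on a 10-item test.
    three_unfinished = [10] * 17 + [7, 8, 9]   # 3 of 20 did not finish
    two_unfinished = [10] * 18 + [8, 9]        # 2 of 20 did not finish
    print(is_speeded(three_unfinished, 10))    # 15 % unfinished: above the threshold
    print(is_speeded(two_unfinished, 10))      # exactly 10 %: not "more than 10 %"
```

The comparison is strict because the definition says "more than 10 %", so a test in which exactly 10 % of candidates do not finish is not flagged.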
Copyright © ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959. United States
2.1.20 test, n—sampling of behavior over a limited time in which an authenticated examinee is given specific tasks under specified conditions, tasks that are scored by a uniformly applied rubric.
4.3.1.1 User Interface Preparation—A practice test or tests to familiarize candidates with the user interface shall be made available to the candidate at no charge. The practice test shall be sufficient to assure adequate candidate practice t
...
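Definitions 2.1.5 (gating item), 2.1.7 (item), and 2.1.19 (task) together imply a simple pass rule: an examinee who fails any gating item fails the test regardless of the total score. The sketch below illustrates that rule only. It is not taken from the standard. The `ItemResult` record, the `cut_score` parameter, and the sample items are hypothetical, and the standard's sample text does not say how a cut score is set or combined with gating items.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ItemResult:
    """One scored response unit (an "item" in the sense of 2.1.7)."""
    item_id: str
    score: float          # points awarded under the rubric
    passed: bool          # whether the rubric's pass criterion was met
    gating: bool = False  # True if the item must be passed to pass the test


def failed_gating_items(results: Iterable[ItemResult]) -> List[str]:
    """Return the ids of gating items that the examinee did not pass."""
    return [r.item_id for r in results if r.gating and not r.passed]


def passes_test(results: Iterable[ItemResult], cut_score: float) -> bool:
    """Pass only if no gating item was failed AND the total meets cut_score.

    cut_score is an assumption of this sketch, not a value from the standard.
    """
    results = list(results)
    if failed_gating_items(results):
        return False
    return sum(r.score for r in results) >= cut_score


if __name__ == "__main__":
    # Hypothetical task scored as three items, one of them gating.
    attempt = [
        ItemResult("lockout_procedure", 0.0, passed=False, gating=True),
        ItemResult("weld_quality", 9.0, passed=True),
        ItemResult("final_inspection", 9.0, passed=True),
    ]
    print(failed_gating_items(attempt))
    print(passes_test(attempt, cut_score=15.0))
```

Checking the gating items before the total keeps the two conditions separate, so a report can say which gating item caused a failure even when the point total would otherwise have been sufficient.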