Discuss the Testing and Measurement Concepts of Assessment.


Assessment techniques must meet the four technical criteria: standardization, norms, reliability and validity.


Standardization means the uniform procedures adopted in the administration and scoring of an assessment tool. For example, with a self-report scale, the examiner must ensure that every subject reads and understands the same instructions, responds to the same questions and stays within the specified time limits. Standardization also covers information about who should or should not take the test, the conditions under which the test should be given, the specific procedures for scoring it and the interpretative significance of the scores.


Norms refer to information regarding whether a particular “raw score” ranks low, high, or average relative to other “raw scores” on the test. Test norms give standards with which the scores of people who have taken the test can be compared. The raw scores are generally converted into percentile scores which show the individual’s relative rank compared to others.
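
The conversion of a raw score into a percentile score can be sketched as follows. This is a minimal illustration, not a prescribed scoring method: the function name `percentile_rank`, the convention of counting half of the tied scores, and the norm-group data are all assumptions made for the example.

```python
def percentile_rank(raw_score, norm_scores):
    """Percentage of norm-group scores below raw_score, counting half of any ties.

    The half-tie convention is one common choice, assumed here for illustration.
    """
    below = sum(1 for s in norm_scores if s < raw_score)
    equal = sum(1 for s in norm_scores if s == raw_score)
    return 100.0 * (below + 0.5 * equal) / len(norm_scores)


# Hypothetical norm group of ten raw scores (illustrative data only).
norm_group = [12, 15, 18, 20, 22, 25, 27, 30, 33, 38]

# Rank of a raw score of 25 relative to the norm group.
print(percentile_rank(25, norm_group))
```

A higher percentile rank means the raw score sits higher relative to the people in the norm group; the raw score alone carries no such meaning.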


Reliability is the consistency or stability of an assessment technique when given to the same group of people on two different occasions. It implies that repeated administrations of the same test should produce reasonably the same results. This is also called test-retest reliability. Test-retest reliability is determined by correlating the scores from the first administration with those of the second by a simple correlation procedure.

The magnitude of the resulting correlation coefficient provides an estimate of the test’s consistency over a period of time. A reliability coefficient above .70 is typical of most psychological tests.

The reliability of a test can also be determined by splitting the test into two sets (for example, odd-numbered and even-numbered items), then summing the scores for each set and correlating them. This is called split-half reliability. It shows the test’s internal consistency. If the test items are consistently measuring the same dimension, then people who score high on odd items should also score high on even items.

Another reliability check is the correlation between two versions of the same test given to the same people. If the scores on the two versions are closely similar, the test shows parallel-forms reliability. This indicates that the items on both versions measure the same thing.

Reliability also refers to the degree of agreement between two or more judges in scoring the same test. This is called inter-scorer reliability. It must be demonstrated whenever scoring involves subjective interpretations.


Validity refers to whether a test measures what it is intended to measure. There are three types of validity:

  • Content validity,
  • Criterion-related validity,
  • Construct validity.

Content validity means the test must include items which are representative of the entire dimension it measures. For example, a personality test measuring shyness should cover each of the components that define the construct of shyness.

Criterion-related validity means the assessment test should accurately forecast an agreed-upon criterion. For example, the behavioural criterion being predicted may include academic performance in management school and occupational success. There are two types of criterion-related validity: (i) predictive validity and (ii) concurrent validity. Predictive validity means a test’s capacity to predict some criterion behaviour in the future. An intelligence test, for example, has predictive validity if it accurately predicts subsequent performance in school.

Concurrent validity means the extent to which a test correlates with another existing criterion measure obtained at about the same time. For example, if a person whose scores on a test indicate paranoid tendencies is also judged by clinical psychologists, on the basis of an interview, to show paranoid tendencies, the test has concurrent validity.

Construct validity addresses the question of how well a test measures a hypothetical construct, that is, a useful abstract invention. The abstract nature of constructs such as ego-identity, self-actualization, social interest and repression complicates validation and makes the results uncertain. In the construct validation process, evidence is gathered to show that a test measures the hypothetical construct.

Convergent Validity: In the convergent validity procedure, test scores for the construct in question are correlated with scores from another test that measures the same construct. For example, if a new test measures the construct of self-esteem, it should correlate positively with another established measure of self-esteem.

Divergent Validity: Divergent validity demonstrates the extent to which the assessment tool does not correlate with measures of qualities that it was not intended to measure. For example, if a new self-esteem test does not correlate with measures of other, conceptually distinct qualities, this is evidence of divergent (discriminant) validity.
