A subsequent study found the BINS to possess a sensitivity of 70% and a specificity of 71% when a population of infants born prematurely was screened at 12 and 24 months.38
Both studies used the Bayley Scales of Infant Development–II as the gold standard. The BINS requires only about 10 minutes to administer, but requires experience in standardized assessment and familiarity with infant development.39 A recent study found the BINS insensitive in the detection of developmental delays as compared with the Bayley Scales of Infant Development–II in environmentally at-risk children at ages of 6 months and 12 months.40
Tests of questionable value
Denver–II
The Denver Developmental Screening Test (DDST) was introduced in 1967. Research has consistently found it lacking in sensitivity. In response to this criticism, a revised version, the Denver II, was released in 1992.41
The Denver II is the most commonly used developmental screening tool.6 It combines direct observation and parental report. The tool consists of 125 items, organized into 4 developmental domains: gross motor, fine motor/adaptive, language, and personal/social. Items are displayed in bars that indicate the ages at which 25%, 75%, and 90% of children in the standardization study mastered a given task. A Denver test kit consists of scoring pads, materials used in eliciting skills, and a technical manual that details the appropriate administration and scoring of the test. Thirty-one percent of items can be addressed by parental report; the remainder require observation of elicited skills.42
Though the Denver II used more than 2000 children to establish normative data, all of them were from Colorado, which limits how far these data can be generalized to a more heterogeneous population. Furthermore, both versions of the test were published without data on the test’s validity, sensitivity, and specificity. The authors have instead relied on the significance of a child falling outside of the normal range as evidence of delay. This approach has been criticized.
Two studies have examined the validity of the Denver–II. In 1992, Glascoe et al43 studied a demographically representative sample of 102 children and found that though the Denver II had a high sensitivity (83%), it had an unacceptably low specificity (43%). Attempts to improve specificity by categorizing questionable/untestable scores as normal raised specificity to 80%, but at the expense of sensitivity, which dropped to 56%.43 Assuming a 16% prevalence of developmental disorders, the low specificity of the Denver–II would produce suspect scores in nearly 3 out of 5 children tested, but true problems could be expected in only about 1 of 4 children with suspect scores.
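The arithmetic behind these estimates can be reproduced from the reported sensitivity, specificity, and assumed prevalence. The following is a minimal sketch in Python, not part of the original studies; the function name `screening_yield` is illustrative, and the inputs are the rounded percentages quoted above, so the results are approximate.

```python
def screening_yield(sensitivity, specificity, prevalence):
    """Return (fraction of children screening positive, positive predictive value).

    Illustrative helper (not from the cited studies); all inputs are
    proportions between 0 and 1.
    """
    # Children with a true problem who receive a suspect score
    true_positives = sensitivity * prevalence
    # Children without a problem who nonetheless receive a suspect score
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    positive_rate = true_positives + false_positives
    ppv = true_positives / positive_rate
    return positive_rate, ppv


# Figures quoted in the text: sensitivity 83%, specificity 43%, prevalence 16%
positive_rate, ppv = screening_yield(0.83, 0.43, 0.16)
print(f"suspect scores: {positive_rate:.0%}, true problems among suspects: {ppv:.0%}")
```

With these inputs, about 61% of children receive suspect scores (nearly 3 in 5), and about 22% of those have a true problem, which is in line with the rounded figures given above.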
A follow-up study of 89 children by Glascoe and Byrne44 found the Denver–II to possess excellent sensitivity (83%) but similarly disappointing specificity (26%), producing a positive predictive value of 28% in the study population (20% prevalence of disabilities).
In both studies, a battery of tests similar to those used to determine eligibility for special services was used as the gold standard. Properly performed, administration of the Denver–II requires approximately 20 minutes.42 Shortened versions or informal scoring of the Denver–II can only further degrade the questionable validity of this measure.
Child Development Inventories
The Child Development Inventory (CDI), formerly known as the Minnesota Child Development Inventory, was created to provide a systematic, standardized method for parents to report on their children’s strengths, problems, and present development. The original 300-item instrument has been broken down into instruments that apply to 3 age intervals.
The CDI measures a child’s development in 8 areas: social, self-help, gross motor, fine motor, expressive language, language comprehension, letters, and numbers. It consists of a 300-item booklet and answer sheet for the parent to complete and a profile sheet for recording the results. It was standardized on a sample of 568 children from South St. Paul, Minnesota, a predominantly Caucasian, working-class community near a large metropolitan area.44
Parents complete the questionnaire by circling Yes/No responses to the statements. Children are considered “borderline” if their CDI scores are 25% below chronological age (1.5 standard deviations [SD] below the mean) and “delayed” if their scores are >30% below chronological age (2.0 SD). The CDI has been researched in presumed normal populations and in high-risk populations such as children born prematurely.
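The cutoff rule described above can be sketched as a short Python function. This is an illustration only, not the published CDI scoring procedure. It assumes that the CDI result is expressed as an age-equivalent in months, that "borderline" covers deficits from 25% up to and including 30%, and that the label "within range" (not used in the text) applies to everything else; the function name `classify_cdi` is hypothetical.

```python
def classify_cdi(developmental_age_months, chronological_age_months):
    """Classify a CDI age-equivalent score against chronological age.

    Hypothetical helper: the 25% and 30% thresholds come from the text,
    but the handling of values between them is an assumption.
    """
    if chronological_age_months <= 0:
        raise ValueError("chronological age must be positive")
    # Fraction by which the age-equivalent falls below chronological age
    deficit = (chronological_age_months - developmental_age_months) / chronological_age_months
    if deficit > 0.30:
        return "delayed"
    if deficit >= 0.25:
        return "borderline"
    return "within range"  # label assumed for illustration


# A 24-month-old whose CDI age-equivalent is 16 months is one third below age
print(classify_cdi(16, 24))
```

In this sketch a 24-month-old scoring at an 18-month level (25% below) would be labeled borderline, while one scoring at a 16-month level (about 33% below) would be labeled delayed.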
In a high-risk population of infants and children, it was found to have a sensitivity of 80% and specificity of 96% for detecting developmental delay (ie, CDI scores >2.0 SD below the mean) when compared with the Bayley Scales of Infant Development–II, using 2 SD below the mean as the cutoff.45 It seems to have particular utility for screening at-risk children even when applied to a population of low socioeconomic status and low education level.46 In addition to validity, good predictive value has been established for future cognitive, reading, academic, intellectual, and adaptive functioning.