Abstract:
This study was conducted to develop a math proficiency test using IRT, based on
NAEP’s adapted Mathematics Framework and aligned with the newly developed
standards and benchmarks of the National Mathematics Curriculum. Mathematical
proficiency is the ability to use mathematical power for conceptual understanding,
procedural knowledge, and problem-solving in the real world using appropriate
strategies. One hundred and ninety-six items were developed; owing to the limitations
of the study, all were dichotomously scored multiple-choice or short
constructed-response items. Items were spot
tested on 200 students and piloted on 550 students. A final 60-item math proficiency
test was constructed. A stratified cluster sampling technique was used to administer
the final math proficiency test. All 9th-grade students studying in public high and
higher secondary schools in the province of Punjab comprised the population of the
study. A sample of 2680 students and 134 schools was selected using sample design
tables and the requirements of IRT-based analysis. The final math proficiency test was
administered, and data were received from 2617 students. Data were analyzed using
SPSS, ConQuest, and MULTILOG software. The data fitted the Rasch model well. The
infit and outfit mean-square statistics of the items were within 0.80 to 1.30. The
reliability and discrimination index of the final math proficiency test were above 0.90
and 0.45, respectively. The IRT-based person-item map showed that the test items
covered the ±3 range of abilities. Test characteristic and item characteristic curves,
factor analysis, and dimensionality analysis also supported the reliability and validity
of the test. Different
estimation methods (WLE, EAP, and MLE) also supported the reliability and validity
of the test items. Each of these methods has advantages and disadvantages relative to
the others. For this study, the average of the three estimates was used as the
proficiency score for each student. The results indicate that the math proficiency test
is appropriately constructed. Replication of this study using the 2PLM and 3PLM of
Item Response Theory (IRT) is recommended.
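
The quantities reported above can be illustrated with a minimal sketch, assuming the standard textbook definitions of the Rasch model and of the infit and outfit mean-square statistics. It is not the ConQuest or MULTILOG implementation used in the study. The function names (rasch_prob, item_fit, average_proficiency) and the sample data are hypothetical and serve only as an illustration.

```python
import math


def rasch_prob(theta, b):
    """Probability of a correct response under the Rasch (1PL) model.

    theta: person ability, b: item difficulty (same scale).
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def item_fit(responses, thetas, b):
    """Return (infit, outfit) mean-square statistics for one item.

    responses: dichotomous 0/1 scores, one per person.
    thetas: person ability estimates, in the same order.
    """
    sq_resid = []   # squared raw residuals (x - p)^2
    variances = []  # model variances p * (1 - p)
    for x, theta in zip(responses, thetas):
        p = rasch_prob(theta, b)
        sq_resid.append((x - p) ** 2)
        variances.append(p * (1.0 - p))
    # Outfit: unweighted mean of squared standardized residuals.
    outfit = sum(r / v for r, v in zip(sq_resid, variances)) / len(sq_resid)
    # Infit: information-weighted mean square.
    infit = sum(sq_resid) / sum(variances)
    return infit, outfit


def average_proficiency(wle, eap, mle):
    """Simple average of three ability estimates for one student
    (assumed here to be an unweighted mean)."""
    return (wle + eap + mle) / 3.0


# Hypothetical data: four persons of ability 0 answering an item of difficulty 0.
infit, outfit = item_fit([1, 0, 1, 0], [0.0, 0.0, 0.0, 0.0], 0.0)
print(infit, outfit)
print(average_proficiency(0.3, 0.6, 0.9))
```

Under these definitions, mean-square values close to 1 indicate that responses vary about as much as the model predicts, which is the sense in which the 0.80 to 1.30 range reported above is read as acceptable fit.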