Keywords:
standard setting, benchmark method, bookmark method, mathematics education
Persian abstract:
Standard setting is one of the assessment techniques for the valid classification of examinees. This study analyzed the effect of using two standard-setting methods, the benchmark method and the bookmark method, on the results of a large-scale study conducted to assess sixth-grade mathematics learning among students in the city of Tehran.
Method: The two methods were compared on data from a provincial large-scale assessment administered to 9,720 sixth-grade students in the city of Tehran. Participants in this survey answered a total of 264 mathematics items, and their responses were analyzed using the plausible values method.
Findings: The results showed that applying the benchmark method leads to 75, 48, 18, and 2 percent of students attaining the minimum required scores at the low, intermediate, high, and advanced performance levels, respectively. In addition, with this method 23.9 percent of the items fell in the same level that the subject-matter experts had assigned. In contrast, comparing the distances between successive means of the location parameter with the standard deviation of location within the performance levels called into question the quality of the experts' initial classification for use in the bookmark method. Furthermore, the effect of using five response probabilities (0.52, 0.57, 0.62, 0.67, and 0.75) on the classification of students showed that, despite the emphasis in the research literature on a response probability of 0.67, the lowest response probability (0.52) produces more realistic results than the others, yet it still appears to be a strict standard compared with the benchmark method.
Conclusion: Standard setting should receive more attention as a technical matter in all assessments in which grading or a pass/fail decision is a consequence of taking the test.
English abstract:
Standard setting is one of the assessment techniques used to create valid
classifications of examinees. The present study examined the effect of two
standard-setting methods, the benchmark method and the bookmark method, on the
results of a large-scale study designed to assess mathematics learning among
sixth-grade students in the city of Tehran.
Methods: The two methods were compared using data from a provincial large-scale
assessment administered to 9,720 sixth-grade students in the city of Tehran.
Participants answered a total of 264 mathematics items, and their responses were
analyzed using plausible values.
Results: Applying the benchmark method showed that 75, 48, 18, and 2 percent of
students attained the minimum scores for the low, intermediate, high, and
advanced performance levels, respectively. In addition, 23.9 percent of items
were located in the same level that content experts had assigned to them. In
contrast, comparing the distances between successive means of the location
parameter with its standard deviations within performance levels called into
question the quality of the experts' initial classification for use in the
bookmark method. Moreover, the effect of using five response probabilities
(0.52, 0.57, 0.62, 0.67, and 0.75) on the classification of students indicated
that, despite the recommendation of a response probability of 0.67 in the
literature, the lowest response probability (0.52) produced more realistic
results than the others; however, it still yields a strict standard compared
with the benchmark method.
Conclusion: Standard setting should receive more attention as a technical issue
in all assessments in which grading or a pass/fail decision is a consequence of
taking the test.
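
The role of the response probability (RP) in the bookmark method can be illustrated with a minimal sketch. It assumes a Rasch (one-parameter logistic) model, which the abstract does not state; the function names `rp_location` and `rp_shifts` and the example difficulty of 0.0 logits are hypothetical and chosen only for illustration. Under that assumption, the ability at which an examinee answers an item correctly with probability RP is the item difficulty plus ln(RP / (1 - RP)), so a higher RP moves every item location, and any cut score read from it, upward.

```python
import math

# Response probabilities compared in the abstract.
RESPONSE_PROBABILITIES = [0.52, 0.57, 0.62, 0.67, 0.75]


def rp_location(difficulty: float, rp: float) -> float:
    """Ability (in logits) at which P(correct) equals rp.

    Assumes a Rasch model: P(correct) = 1 / (1 + exp(-(theta - difficulty))).
    Solving P = rp for theta gives difficulty + ln(rp / (1 - rp)).
    """
    if not 0.0 < rp < 1.0:
        raise ValueError("rp must be strictly between 0 and 1")
    return difficulty + math.log(rp / (1.0 - rp))


def rp_shifts(difficulty: float = 0.0) -> dict:
    """Map each response probability to the RP-adjusted item location."""
    return {rp: rp_location(difficulty, rp) for rp in RESPONSE_PROBABILITIES}


if __name__ == "__main__":
    # Hypothetical item with difficulty 0.0 logits.
    for rp, location in rp_shifts().items():
        print(f"RP = {rp:.2f} -> item location {location:+.3f} logits")
```

In this sketch the locations increase monotonically with RP, which is consistent with the reported finding that the lowest response probability (0.52) gives the least strict of the five bookmark standards. The study's actual scaling model and software may differ.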