DocumentCode :
2373123
Title :
Designing human benchmark experiments for testing software agents
Author :
Grant, R.D. ; DeAngelis, D. ; Luu, D. ; Perry, D.E. ; Ryall, K.
Author_Institution :
Empirical Software Eng. Lab., Univ. of Texas at Austin, Austin, TX, USA
fYear :
2011
fDate :
11-12 April 2011
Firstpage :
124
Lastpage :
128
Abstract :
Background: Software agents are becoming increasingly common in the engineering of software systems. We explore the use of humans in creating benchmarks for the evaluation of software agents. In our case studies, we address the domain of instructable software agents (e-students) as proposed by the Bootstrapped Learning project [Oblinger, 2006]. Aim: Our aim is to define and refine requirements, problem-solving strategies, and evaluation methodologies for e-students, paving the way for rigorous experiments comparing e-student performance with human benchmarks. Method: Because little was known about which factors would be critical, our empirical approach is a set of exploratory case studies. In two studies covering three distinct groups, we use human subjects to develop an evaluation curriculum for e-students, collecting quantitative data through online quizzes and tests and qualitative data through observation. Results: Although we collect quantitative data, our most important results are qualitative. We uncover and address several intrinsic challenges in comparing software agents with humans, including the greater semantic understanding of humans, the eidetic memory of e-students, and the importance of various study parameters (including timing issues and lesson complexity) to human performance. Conclusions: Important future work will be controlled experiments based on the experience of these case studies; these will provide benchmark human performance results for specific problem domains for comparison with e-student results.
Keywords :
Internet; computer aided instruction; program testing; software agents; benchmark human performance; bootstrapped learning project; e-student; eidetic memory; evaluation curriculum; human benchmark experiment; instructable software agents; online quizzes; online test; qualitative data; quantitative data; software agent testing; software system;
fLanguage :
English
Publisher :
IET
Conference_Titel :
15th Annual Conference on Evaluation & Assessment in Software Engineering (EASE 2011)
Conference_Location :
Durham, UK
Electronic_ISBN :
978-1-84919-509-6
Type :
conf
DOI :
10.1049/ic.2011.0015
Filename :
6083170