Establishing Human Observer Criterion in Evaluating Artificial Social Intelligence Agents in a Search and Rescue Task.

Huang, Lixiao; Freeman, Jared; Cooke, Nancy J; Cohen, Myke C; Yin, Xiaoyun; Clark, Jeska; Wood, Matt; Buchanan, Verica; Corral, Christopher; Scholcover, Federico; Mudigonda, Anagha; Thomas, Lovein; Teo, Aaron; Colonna-Romano, John

Affiliation

Huang L; Center for Human, Artificial Intelligence, and Robot Teaming, Arizona State University.
Freeman J; Aptima, Inc.
Cooke NJ; Center for Human, Artificial Intelligence, and Robot Teaming, Arizona State University.
Cohen MC; Center for Human, Artificial Intelligence, and Robot Teaming, Arizona State University.
Yin X; Center for Human, Artificial Intelligence, and Robot Teaming, Arizona State University.
Clark J; Center for Human, Artificial Intelligence, and Robot Teaming, Arizona State University.
Wood M; Aptima, Inc.
Buchanan V; Center for Human, Artificial Intelligence, and Robot Teaming, Arizona State University.
Corral C; Center for Human, Artificial Intelligence, and Robot Teaming, Arizona State University.
Scholcover F; Center for Human, Artificial Intelligence, and Robot Teaming, Arizona State University.
Mudigonda A; Center for Human, Artificial Intelligence, and Robot Teaming, Arizona State University.
Thomas L; Center for Human, Artificial Intelligence, and Robot Teaming, Arizona State University.
Teo A; Center for Human, Artificial Intelligence, and Robot Teaming, Arizona State University.
Colonna-Romano J; Aptima, Inc.

Top Cogn Sci; 2023 Apr 13.

Article in English | MEDLINE | ID: mdl-37052261

ABSTRACT

Artificial social intelligence (ASI) agents have great potential to aid the success of individuals, human-human teams, and human-artificial intelligence teams. To develop helpful ASI agents, we created an urban search and rescue task environment in Minecraft to evaluate ASI agents' ability to infer participants' knowledge training conditions and to predict the type of victim participants would rescue next. We evaluated ASI agents' capabilities in three ways: (a) comparison to ground truth, that is, the actual knowledge training condition and participant actions; (b) comparison among different ASI agents; and (c) comparison to a human observer criterion, whose accuracy served as a reference point. The human observers and the ASI agents used video data and timestamped event messages from the testbed, respectively, to make inferences about the same participants and topic (knowledge training condition) and the same instances of participant actions (rescue of victims). Overall, ASI agents performed better than human observers in inferring knowledge training conditions and predicting actions. Refining the human criterion can guide the design and evaluation of ASI agents for complex task environments and team composition.
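
Comparisons (a) and (c) above can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: it assumes that accuracy is simply the proportion of inferences matching ground truth, and the function names (`accuracy`, `compare_to_criterion`), the labels (`type_a`, `type_b`), and the toy values are all hypothetical and not taken from the study.

```python
from typing import Dict, Sequence


def accuracy(predictions: Sequence[str], ground_truth: Sequence[str]) -> float:
    """Proportion of predictions that match ground truth (assumed metric)."""
    if len(predictions) != len(ground_truth):
        raise ValueError("predictions and ground_truth must have equal length")
    if not ground_truth:
        raise ValueError("ground_truth must not be empty")
    correct = sum(p == t for p, t in zip(predictions, ground_truth))
    return correct / len(ground_truth)


def compare_to_criterion(
    agent_preds: Sequence[str],
    human_preds: Sequence[str],
    ground_truth: Sequence[str],
) -> Dict[str, float]:
    """Score an agent and a human observer on the same instances.

    Both are scored against the same ground truth, so the human accuracy
    acts as a reference point for the agent accuracy.
    """
    agent_acc = accuracy(agent_preds, ground_truth)
    human_acc = accuracy(human_preds, ground_truth)
    return {
        "agent": agent_acc,
        "human": human_acc,
        "difference": agent_acc - human_acc,
    }


# Toy, made-up labels for four prediction instances (not study data).
truth = ["type_a", "type_b", "type_b", "type_a"]
agent = ["type_a", "type_b", "type_a", "type_a"]
human = ["type_b", "type_b", "type_a", "type_a"]

result = compare_to_criterion(agent, human, truth)
print(result)
```

Under the same assumption, comparison (b) among agents would amount to calling `accuracy` once per agent on the same instances and ranking the resulting scores.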

Keywords

Artificial social intelligence; Baseline; Evaluation; Human observer criterion; Minecraft; Search and rescue; Theory of mind

Full text: 1 Collection: 01-international Database: MEDLINE Study type: Prognostic_studies Language: En Journal: Top Cogn Sci Year: 2023 Document type: Article Country of publication: United States
