EGPAI 2016 : 1st International Workshop on Evaluating General-Purpose AI


When Aug 30, 2016 - Aug 30, 2016
Where The Hague, The Netherlands
Submission Deadline Jun 12, 2016
Notification Due Jun 28, 2016
Final Version Due Jul 15, 2016
Categories    artificial intelligence   cognitive science   machine learning   reinforcement learning

Call For Papers


The 1st International Workshop on Evaluating General-Purpose AI
In conjunction with the 22nd European Conference on Artificial Intelligence (ECAI 2016)

August 30th, 2016 in The Hague, The Netherlands


The aim of this workshop is to analyse all aspects of the evaluation of general AI systems. Most AI systems are tested on specific tasks. However, to be considered truly intelligent, a system must be flexible enough to learn how to perform a wide variety of tasks, some of which may not be known until the system is deployed. This workshop will examine formalisations, methodologies and test benches for evaluating the numerous aspects of this type of general AI system. We are interested in theoretical or experimental research focused on the development of concepts, tools and clear metrics to characterise and measure the intelligence, and other cognitive abilities, of general AI agents.

== TOPICS ==

We welcome regular papers, demo papers about benchmarks or tools, and position papers, and encourage discussions over a broad list of topics (not exhaustive):

* Analysis and comparisons of AI benchmarks and competitions. Lessons learnt.
* Proposals for new general tasks, evaluation environments, workbenches and general AI development platforms.
* Theoretical or experimental accounts of the space of tasks, abilities and their dependencies.
* Evaluation of development in robotics and other autonomous agents, and cumulative learning in general learning systems.
* Tasks and methods for evaluating: transfer learning, cognitive growth, structural self-modification and self-programming.
* Evaluation of social, verbal and other general abilities in multi-agent systems, video games and artificial social ecosystems.
* Evaluation of autonomous systems: cognitive architectures and multi-agent systems versus general components: machine learning techniques, SAT solvers, planners, etc.
* Unified theories for evaluating intelligence and other cognitive abilities, independently of the kind of subject (humans, animals or machines): universal psychometrics.
* Analysis of reward aggregation and utility functions, environment properties (Markov, ergodic, etc.) in the characterisation of reinforcement learning tasks.
* Methods supporting automatic generation of tasks and problems with systematically introduced variations.
* Better understanding of the characterisation of task requirements and difficulty (energy, time, trials needed, etc.), beyond algorithmic complexity.
* Evaluation of AI systems using generalised cognitive tests for humans. Computer models taking IQ tests. Psychometric AI.
* Application of (algorithmic) information theory, game theory, theoretical cognition and theoretical evolution for the definition of metrics of cognitive abilities.
* Adaptation of evaluation tools from comparative psychology and psychometrics to AI: item response theory, adaptive testing, hierarchical factor analysis.
* Evaluation methods for multiresolutional perception in AI systems and agents.


== IMPORTANT DATES ==

Workshop paper submissions: June 12, 2016
Workshop paper notifications: June 28, 2016
Final submission of workshop program and materials: July 15, 2016
Workshop date: August 30, 2016


== SUBMISSION ==

We welcome submissions describing work in progress as well as more mature work related to AI evaluation. Submitted papers must be formatted according to the camera-ready style for ECAI'16 and submitted electronically in PDF format.

Papers (technical, demo, position) are allowed a maximum of eight (8) pages. One additional page containing only the list of references is allowed. Authorship is not anonymous (single-blind review). Papers will be reviewed by the program committee.

Authors of accepted papers will be asked to give a presentation (short or long) at the workshop. Pre-proceedings containing all accepted papers will be provided electronically on the workshop web page. The final workshop proceedings will be distributed electronically together with the ECAI conference proceedings. After the workshop, we intend to invite some of the contributing authors to submit a paper to a special issue (journal to be announced).


== WORKSHOP FORMAT ==

The workshop will begin with a short presentation, followed by four sessions (two in the morning, two in the afternoon) of around 80 minutes each, with breaks between them. Technical sessions will consist of an invited speaker followed by short paper presentations, with a substantial share of the time devoted to discussion and interaction. The demo session will present real platforms and ways to evaluate AI systems on several tasks in these platforms. The discussion session will include a panel on "how to benchmark cumulative learning, one-shot learning and learning from a small number of examples" and a more open discussion about the research challenges around the workshop topics, continuation of the workshop, future initiatives, etc. Confirmed speakers include Katja Hofmann from Microsoft Research, Cambridge, UK, who will talk about a "Benchmark for AI implemented within a Game Engine".


== PROGRAM COMMITTEE ==

Jordi Bieger, CADIA, Reykjavik University.
Angelo Cangelosi, Plymouth University.
David L. Dowe, Monash University.
Devdatt Dubhashi, Chalmers University of Technology.
Helgi Pall Helgason, Activity Stream.
Sean B. Holden, Cambridge University.
Jan Koutnik, IDSIA.
Edward Keedwell, Exeter University.
Frans A. Oliehoek, University of Amsterdam.
Henri Prade, IRIT, Université Paul Sabatier.
Ute Schmid, Bamberg University.
Bas Steunebrink, IDSIA.
Peter Sunehag, Google DeepMind.
Joel Veness, Google DeepMind.
Pei Wang, Temple University.


== ORGANIZERS ==

Christos Dimitrakakis, Chalmers University of Technology and University of Lille
José Hernández-Orallo, Universitat Politècnica de Valencia
Martin Sandsmark, The Magma Company AS
Claes Strannegård, Chalmers University of Technology
Kristinn R. Thórisson, Reykjavik University and the Icelandic Institute for Intelligent Machines

