Proposal for the Research Advancement Program

Assistant Professor Neil Heffernan

Computer Science Department

May 1, 2003

Requested Budget: $7,500

 

Title: Programming by Demonstration for Intelligent Tutoring Systems

 

Introduction

I am initiating a new line of research in machine learning that is related to my existing line of research in intelligent tutoring systems.  In order to establish a track record in this area, I am requesting funds to employ an excellent student for the summer in order to create some initial results.  I have a reasonable belief that this work will provide the basis for attracting external funding.[1]  I stumbled across this interesting and hard problem while working on a project that the Office of Naval Research is funding me for; the goal of the project is to build tools to make it easier to create cognitive models that are used in intelligent tutoring systems.  We have built a tool that allows an “author” (i.e., a subject matter expert like a teacher) to demonstrate how a problem should be solved.  (See Koedinger, Aleven & Heffernan, 2003 at http://nth.wpi.edu/pubs_and_grants/03-BRIMS-063.doc for more details on this tool.)  Later a programmer writes the code needed to make the examples the author has demonstrated work in general for cases other than those that the author specified.  This programmer is assisted by our tools by being able to do automated testing to make sure that the code she writes works for all of the examples.  We propose to use these examples for the purposes of automatically generating the rules (in first order logic) needed to fit the examples. This research problem is in to the exciting new area of Programming by Example (Lieberman, 2001), where the goal is for the computer to infer a human’s intent from a few positive and negative examples.  Programming by Example is revolutionary in that the large numbers of people don’t know how to use a traditional programming language, could program a computer by showing a computer several examples of what they would like to happen, as well as a few negative example of what they would not like to happen.  The goal is for the computer to infer, using some available background knowledge, the steps needed to reproduce all the positive examples.  This would enable a non-programmer (i.e., a subject matter expert such as a school teacher) to be able to build an intelligent tutoring system, a task that currently is very hard.  The Armed Forces is greatly interested in that they have enormous training needs and would like their subject matter experts to be able to quickly produce intelligent tutoring systems. The general problem of programming by example is very difficult but we think the version of the problem we seek to address should be tractable due to the fact that we have additional data we can bring to bear due to the fact that we are doing this in the context of building intelligent tutoring system.  For a budget, I request $6,000 to pay my undergraduate student and $1,500 for myself (and possibly him) to present our results at a conference[2]. Below, we lay out a sketch of the problem and some background literature.

 

           

Details of the Problem

 

            The focus of this research project is to develop tools that aid in the construction of Intelligent Tutoring Systems. Specifically, we sought to construct a system with which to automate the process of rule generation. In current intelligent tutoring systems, CLIPS or JESS rules, which describe each type of problem, are a critical and time-consuming part of development. Our aim was to create a system to author such rules, but not require the developer to actually write the rule, only provide examples and possibly hints about the rule. The end goal is a system that allows the developer to provide a set of examples (question and answer), and the system will induce a rule completely describing the problem. A simplified but instructive example instance is this: the developer provides the following sets of numbers: {14,3,5,6}, {56,3,30,23}, {5,1,2,2}. The system is informed that the first three numbers in each set are input parameters, and the last number is an output. It is also given a set of possible functions (or knowledge base), such as arithmetic operators, modulo, or more complex relationships. The system will then generate a rule that does indeed describe all three examples, along these lines:

           

            Subtract B from A, saving into temporary result E

            Subtract C from E, giving us output D

            Where the examples are of the form {A,B,C,D}

 

            This provides a simple instance of the type and quality of rule we are attempting to learn.

            We have initially approached this problem as an inductive logic problem, exploring the methods of FOIL (Quinlan, 1993) and PROGOL (Muggleton, 1995). FOIL builds rules by successively adding literals from the knowledge base to the right hand side of the rule. Literals are considered for addition if they exclude negative instances or introduce new variables that may be needed for future literals. It evaluates the effectiveness of new literals by considering how well the new rule covers the positive examples given, and that it does not cover the possible negative examples.

            PROGOL is an inductive logic programming algorithm that uses inverse entailment to generate an initial set of possible rules. This generated “lattice” of terms contains a number of correct generalizations of the examples. Thus, a search is conducted through the “lattice” to determine the rule that covers the examples best. We have done most of our testing with the PROGOL algorithm, using the CPROGOL system or the Aleph system (simulator for several ILP techniques).

            We initially encountered fair success using the aforementioned systems when approaching simple arithmetic problems such as the one given as an example instance. However, as the problem size increased beyond 4 steps, the search time required forced us to encode problems stepwise. For instance, in an attempt to induce a complete rule to describe multi-column addition using standard arithmetic operators and modulo, each column operation had to be learned incrementally. The search time problem became more acute as we ventured into problems with a larger knowledge base and more steps, such as inducing symbolic algebra equation solution rules.

            At the time of this writing, our focus has turned to improving the PROGOL algorithm or developing our own function modeling method. We have noted that our problem differs in some important respects from the standard ILP problem. We are almost always looking for a single rule to generalize and cover all the examples presented. Also, our number of examples will always be fairly limited (usually 4 or 5 instances). In addition, at least for our initial research, our examples are all mathematical in nature, so the rules we are learning have an infinite set of possible inputs and outputs. Finally, no partial rules are acceptable, or even useful in developing a search. This in particular is contrary to most ILP algorithms, which use partial covering of the examples as a heuristic in developing the rule.

            These differences from standard ILP problems provide us with a focus and useful information about solving our specific problem.  Our efforts thus far give us a clear direction to continue research. We anticipate that our continued research will provide new and useful results in this exciting field.

 

Koedinger, K. R., Aleven, V., & Heffernan, N. T. (2003) Toward a rapid development environment for cognitive tutors. 12th Annual Conference on Behavior Representation in Modeling and Simulation. Simulation Interoperability Standards Organization.

Lieberman, H editor (2001) Your Wish is My Command: Programming by Example. Morgan Kaufmann, San Francisco, 2001

Muggleton, S. Inverse Entailment and Progol. New Generation Computing, Special issue on Inductive Logic Programming, 13, 1995.

Quinlan, J.R., and R.M. Cameron-Jones. FOIL: A Midterm Report. Sydney: University of Sydney, 1993.


Budget and Justifications

 

Requesting $6,000 dollars to pay undergraduate Goss Nuzzo-Jones during the summer months of 2003.

 

I am also requesting $1,500 to pay for my (and possibly Goss) to present our work at an appropriate conference (i.e., International Conference on Machine Learning).

 

 

 

Statement of Relevance to an Interdisciplinary Research Area and Thrust Areas

 

It is my understanding that WPI wants to improve the quality of research done in the computer science department, and as such wants to do a better job of attracting funding.  I believe that if the Research Advancement Program decided to fund this project there is good chance that I will be able to attract external funding to support further work in this exciting and growing area.

 

 

 

 


Neil T. Heffernan

A. Professional Preparation

Ph. D.  Computer Science.  Carnegie Mellon University.  2001

M.S.  Computer Science. Carnegie Mellon University. 1997

B.A. Computer Science and History.  Amherst College.  Summa cum laude. 1993.

B. Recent Appointments

Assistant Professor.   Worcester Polytechnic Institute.  Department of Computer Science.  July, 2002 to present.

Post-Doctoral Researcher. Human-Computer Interaction Institute.  School of Computer Science, Carnegie Mellon University. 2001to 2002.

C. Career Summary

Dr. Neil Heffernan graduated summa cum laude from Amherst College in History and Computer Science.  Neil taught mathematics and science to eighth grade students as part of Teach for America, a program that selectively recruits top candidates to teach in inner-city schools, Neil attended Carnegie Mellon University's Computer Science department to do research in creating educational software that leads to higher student achievement.  For his dissertation, Neil built the first intelligent tutoring system that incorporated a model of tutorial dialog.  This system has been shown to lead to higher student learning, by getting students to think more deeply about problems.  It is based upon detailed studies of student learning as well as studies of experienced human teachers.  The system (free at www.AlgerbaTutor.org) is the most widely used web-based intelligent tutoring system.  It has been used by thousands of students and teachers and has been awarded many educational awards.  Carnegie Mellon has applied for a patent for this unique web-based tutor.  As a post-doc, Neil managed a team of four programmers and PhDs to create authoring tools to make it easier to build intelligent tutoring systems.   The first version of these tools was successful used in the “CIRCLE 2002 Summer Schools for Building Intelligent Tutoring Systems” where researchers from around the world came to learn how to build intelligent tutoring systems.  Neil is a Spencer Foundation / National Academy of Education Postdoctoral Research Fellow.  Neil is now an assistant professor at Worcester Polytechnic Institute, where one of his projects is organizing "The Learning Open" (www.LearningOpen.org), an interdisciplinary collaboration between educational software researchers and classroom teachers to study what are the benefits of different instructional approaches.

D. Pertinent Publications

Heffernan, N. T., & Koedinger, K. R. (Submitted) An Intelligent Tutoring System Incorporating a Model of an Experienced Human Tutor. International Journal of Artificial Intelligence in Education.

Heffernan, N. T. (accepted 2003) Web-Based Evaluations Showing both Cognitive and Motivational Benefits of the Ms. Lindquist Tutor 11th International Conference Artificial Intelligence in Education

Heffernan, N. T., & Koedinger, K. R. (2002) An Intelligent Tutoring System Incorporating a Model of an Experienced Human Tutor. In the Proceedings of the Sixth International Conference on Intelligent Tutoring System 2002. Biarritz, France.

Heffernan, N. T., & Koedinger, K. R. (2001) Results from a Web-Based Tutor for Writing Algebra Expressions for Word-Problems. Sciences et Techniques Educatives. A French journal named "Educational Sciences and Technology."

Heffernan, N. T (2001)  Intelligent Tutoring Systems are Forgotten the Tutor: Adding a Cognitive Model of Human Tutors. Dissertation.  Computer Science Department, School of Computer Science, Carnegie Mellon University. Technical Report CMU-CS-01-127.

Heffernan, N. T., & Koedinger, K. R. (2000) Intelligent Tutoring Systems are Missing the Tutor: Building a More Strategic Dialog-Based Tutor. AAAI Fall Symposium on Building Dialogue Systems for Tutorial Applications.

Heffernan, N. T. & Koedinger, K. R. (1998). A developmental model for algebra symbolization: The results of a difficulty factors assessment. In Proceedings of the Twentieth Annual Conference of the Cognitive Science Society, (pp. 484-489). Hillsdale, NJ: Erlbaum

 Heffernan, N. T. & Koedinger, K.R. (1997). The composition effect in symbolizing: The role of symbol production vs. text comprehension. In Proceedings of the Nineteenth Annual Conference of the Cognitive Science Society, (pp. 307-312). Hillsdale, NJ: Erlbaum. [The Marr Prize winner for best student paper.]

 

E. Funding

(Full test of these papesr and proposals at http://www.algebratutor.org/pubs.html)

US Dept of Education: Institute of Education Sciences Using Web-Based Cognitive Assessment Systems for Predicting Student Performance on State Exams Requested $1.4 million over 4 years. Submitted with colleagues at Carnegie Mellon University, April 18th, 2003. Pending

US Army STTR Grant ARMY03-T01. Requested $100,000 over 6 months for Phase One STTR. Submitted with Sonalysts Inc, April 16th, 2003. Pending

Office of Naval Research Affordable Cognitive Modeling Authoring Tools using HCI Methods" Three year grant beginning in 2003. $203,304 over three years. Grant number N00014-0301-0221

Office of Naval Research "Cognitive Tutor Tools for Advanced Instructional Strategies" April 1st -Sept, 30th. $200,000. Co-pi's with Ken Koedinger and Vincent Aleven. Grant was used to build authoring tools to make the construction of cognitive tutors easier. I hired 4 individual researchers, including one PhD, and then supervised them to help produce the tools.

Spencer Foundation. $50,000 One-year. 2002-2003. "A comparison of student learning under multiple conditions: classroom instruction, one-on-one human tutoring, and different types of computer tutoring". Used money to buy out one year of teaching time (3 classes). Result in progress.

 



[1] The Department of Defense has huge training needs.  I just visited Ralph Chatham at DARPA and discussed this idea with him for 2 and a half hours. Ralph’s focus is on funding systems that will lead to better training for the military.  Ralph came to the week long summer school that I helped teach at Carnegie Mellon University and is writing me a letter of support for a grant I just submitted to the Army that was requesting research on how to integrate intelligent tutoring systems into their warrior simulation systems.  (See http://nth.wpi.edu/ResearchTopics/WarriorTutoring/Warrior%20Tutoring.htm for more about that proposal.)  ONR where I have already got funding another possibility.  NSF is yet a third funding possibility and I just met with 5 program officers there searching for the best place to request send this. This idea was well received by two program officer in the Directorate for Education and Human Resources that has previously funding work in intelligent tutoring systems.  Finally,  the U.S. Department of Education, who I just sent a 1.4 million dollar proposal (http://nth.wpi.edu/pubs_and_grants/Grant_to_IES_with_WPS.pdf)  to, is another potential target, because if we can show how a normal teacher could create an intelligent tutoring systems without having to learn how to program, this could have huge impact. But first I need to demonstrate the feasibility of this approach.

[2] I believe in this project so much I after talking with Bill Durgin I have chosen to gamble and hire this student for the summer using my own startup funds with the hope that the RDC will choose to fund this project.  Bill told me that if RDC funds the project he will reimburse my startup funds for the amount that I will have already paid this student.