Reinforcement learning

As an important part of my thesis I have been reading and learning about this subject. My effort has been split between modifying an existing implementation in the sandbox and writing a new implementation that can take advantage of a regression tree that a fellow sandbox AI programmer has been working on.

For the first part I loaded the step animations I had created and modified the reward function of the existing RL solution to get the behaviour I wanted. The demo works by simply having Marvin (the sandbox's star character) react to the mouse cursor. For me this meant stepping away from the cursor and to the side of it.
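To give a rough idea of what such a reward function could look like, here is a minimal Python sketch. It is not the sandbox code: the function name `step_reward`, the 2D tuple representation and the `side_weight` parameter are all assumptions made for illustration. It splits a candidate step into a part pointing away from the cursor and a part pointing sideways, and rewards both.

```python
import math

def step_reward(agent, step, cursor, side_weight=0.5):
    """Hypothetical reward for one step, given 2D positions as (x, y) tuples.

    agent  -- agent position before the step
    step   -- displacement produced by the step animation
    cursor -- current cursor position
    side_weight -- assumed tuning factor for how much sideways motion counts
    """
    # Unit vector pointing from the cursor towards the agent.
    dx, dy = agent[0] - cursor[0], agent[1] - cursor[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return 0.0  # direction undefined when the agent sits on the cursor
    ux, uy = dx / dist, dy / dist

    # Component of the step that moves directly away from the cursor
    # (negative when the step moves towards it).
    away = step[0] * ux + step[1] * uy
    # Magnitude of the component perpendicular to that direction.
    side = abs(step[0] * uy - step[1] * ux)
    return away + side_weight * side
```

With the agent at (1, 0) and the cursor at (0, 0), a step straight back scores higher than a pure side step, and a step towards the cursor is penalised. The relative weighting is exactly the kind of thing one would tune by watching the resulting behaviour.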

Once I was pretty happy with the result, I was presented with the opportunity to work on a new implementation. The primary source of inspiration was the paper Real-Time Planning for Parameterized Human Motion by Wan-Yen Lo and Matthias Zwicker. I started out following the paper and eventually ended up doing things pretty differently, but a new implementation was nevertheless completed. The biggest difference from the former implementation is that with a regression tree you get a continuous space for each dimension, and can use several dimensions.
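To illustrate the continuous-space point, here is a small, generic regression tree in plain Python. This is only a sketch of the general technique, not the tree used in the sandbox or the method from the paper; the names `fit_tree`, `predict` and `sse`, and the example features (distance and angle), are assumptions. The tree learns its own split thresholds on each dimension, so any real-valued state can be looked up without a fixed grid.

```python
def sse(samples):
    """Sum of squared errors of the targets around their mean."""
    targets = [t for _, t in samples]
    mean = sum(targets) / len(targets)
    return sum((t - mean) ** 2 for t in targets)

def fit_tree(samples, depth=0, max_depth=3, min_size=2):
    """Fit a regression tree to a list of (features, target) pairs.

    features is a tuple of floats, one per dimension. Returns nested dicts.
    """
    targets = [t for _, t in samples]
    mean = sum(targets) / len(targets)
    if depth >= max_depth or len(samples) < 2 * min_size:
        return {"value": mean}

    best = None
    for f in range(len(samples[0][0])):
        values = sorted({x[f] for x, _ in samples})
        # Candidate thresholds lie halfway between neighbouring values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [s for s in samples if s[0][f] <= threshold]
            right = [s for s in samples if s[0][f] > threshold]
            if len(left) < min_size or len(right) < min_size:
                continue
            err = sse(left) + sse(right)
            if best is None or err < best[0]:
                best = (err, f, threshold, left, right)

    if best is None:
        return {"value": mean}
    _, f, threshold, left, right = best
    return {
        "feature": f,
        "threshold": threshold,
        "left": fit_tree(left, depth + 1, max_depth, min_size),
        "right": fit_tree(right, depth + 1, max_depth, min_size),
    }

def predict(tree, x):
    """Walk down the tree until a leaf is reached and return its value."""
    while "value" not in tree:
        branch = "left" if x[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["value"]

# Made-up samples: (distance to player, angle) -> estimated value.
samples = [((0.1, 0.0), 1.0), ((0.2, 1.0), 1.0),
           ((0.8, 0.0), 0.0), ((0.9, 1.0), 0.0)]
tree = fit_tree(samples)
```

Calling `predict(tree, (0.15, 0.3))` returns a value for a state that never appeared in the samples, which is what a lookup table with fixed cells cannot do without choosing a discretisation up front.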

I am going to try to take some footage of the result at a later date, closer to the end of the thesis work.

Some links for Reinforcement Learning:

http://webdocs.cs.ualberta.ca/~sutton/RL-FAQ.html

http://aigamedev.com/open/reviews/planning-parameterized-human-motion/

Richard

Teaching autonomous agents to get out of your way

Or for short, taatgoyw.

Over the coming months I am going to be working on my thesis project. My goal is to solve a pretty simple problem in an interesting way. Simple as it is, I see this problem all over the place: games where the NPC doesn’t even blink to indicate that you are running into them.

The work will be done using the AIGameDev.com sandbox. The benefit of this is an array of great people to ask for help, a website with a lot of information on my subject and an engine with all the components I need to start working on this project.

The idea is to start from a simple model of the interaction and design a system that can solve this. The system will use reinforcement learning and a policy aimed at teaching the agent how to step out of the way of the player. With everything in place, a demo will be constructed to show this behaviour.

  • Design a simple model for the interaction.
  • Animate a number of steps appropriate for this behaviour.
  • Learn the sandbox animation system.
  • Implement the behaviour using reinforcement learning in the sandbox.
  • Create a small demo.
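
As a rough picture of what the "simple model" and the learning step in the plan above could look like, here is a toy tabular Q-learning sketch in Python. Everything in it is an assumption made for illustration: the one-dimensional model, the constants `ACTIONS` and `SAFE_DISTANCE`, the reward of -1 per step and the function names. The state is the agent's offset from the player in cells, and an episode ends once the agent is far enough away.

```python
import random

ACTIONS = (-1, 0, 1)   # step left, stay, step right (assumed action set)
SAFE_DISTANCE = 2      # cells from the player that count as "out of the way"

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn Q-values for the toy model with epsilon-greedy Q-learning."""
    rng = random.Random(seed)
    # One Q-value per (offset, action); offset = agent cell minus player cell.
    q = {(s, a): 0.0
         for s in range(-SAFE_DISTANCE, SAFE_DISTANCE + 1)
         for a in ACTIONS}
    for _ in range(episodes):
        state = rng.choice((-1, 0, 1))      # start too close to the player
        for _ in range(20):                 # cap the episode length
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)                         # explore
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])   # exploit
            next_state = state + action
            done = abs(next_state) >= SAFE_DISTANCE
            reward = -1.0                   # every step spent in the way costs 1
            future = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            # Standard Q-learning update towards reward + discounted best future.
            q[(state, action)] += alpha * (reward + gamma * future - q[(state, action)])
            if done:
                break
            state = next_state
    return q

def greedy_action(q, state):
    """Return the action with the highest learned value in this state."""
    return max(ACTIONS, key=lambda a: q[(state, a)])

q = train()
```

After training, `greedy_action(q, 1)` and `greedy_action(q, -1)` pick the step that moves the agent further from the player on the side it is already on. A real version would replace the offsets with the interaction model and the three actions with the animated steps.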

My goal is to work with reinforcement learning, an area that seems to need more work before it can fully bloom. Hopefully the result will be simple and effective, and pave the way for other uses of the system in an animation context.

Richard