HIERARCHICAL SPATIAL KEYFRAMING
IGOR MORDATCH 2005/2006
UNIVERSITY OF TORONTO CSC490/CSC491 PROJECT
PROFESSORS: KARAN SINGH, RAVIN BALAKRISHNAN
MARCH 29 2006
Pencils are down, the project is over. A report covering most of the details and additional videos are available here.

MARCH 23 2006
As mentioned before, one of the more interesting applications of "Suggestion Mode" is finding the best cursor position given constraints on only certain joints, rather than on the entire character pose. Pose mixing is then no longer tied to direct manipulation of the cursor. This is useful when the cursor is difficult to manipulate in keyframe space (for example, when there are many keyframes laid out in an unintuitive manner, which happens when keyframes are generated from an existing animation). In those cases a more traditional control scheme (controlling joint end effectors) is preferable. Similar to Style IK [3], this control scheme is provided while keeping all the advantages of 3D pose mixing (i.e. generating poses similar to the examples given).

A problem arises when the system becomes underconstrained. The current implementation does not handle the case where several poses satisfy the constraints equally well; the usual result is the character jerking between different suitable poses. A nice improvement would be to modify the optimization algorithm to take the previous optimal solution into account, preventing rapid changes when choosing between many optimal candidates. Another limitation of the implementation is that it only tries to find the best-fitting cursor for its own node in the hierarchy (leaving the cursors of other nodes unchanged). A better solution would be to find optimal positions for all cursors in the hierarchy.
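
The idea of constraining only some joints while discouraging jumps between equally good solutions can be sketched as a cost function. This is only an illustration, not the plugin's code: `constrained_cost`, `pose_from_cursor`, `toy_pose` and the `smoothness` weight are all hypothetical names, and the quadratic penalty on the distance to the previous cursor is just one assumed way of "taking the previous optimal solution into account".

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def constrained_cost(cursor, pose_from_cursor, targets,
                     previous_cursor=None, smoothness=0.1):
    """Fit error over the constrained joints only, plus an optional
    penalty for moving away from the previous solution (assumed form)."""
    pose = pose_from_cursor(cursor)  # dict: joint name -> (x, y, z)
    cost = sum(squared_distance(pose[joint], goal)
               for joint, goal in targets.items())
    if previous_cursor is not None:
        cost += smoothness * squared_distance(cursor, previous_cursor)
    return cost

# Toy stand-in for the real pose generator (hypothetical): the hand height
# depends only on cursor x, so cursor y is left unconstrained by the hand.
def toy_pose(cursor):
    return {"hand": (0.0, cursor[0], 0.0), "foot": (cursor[1], 0.0, 0.0)}

targets = {"hand": (0.0, 1.0, 0.0)}   # only the hand is constrained
previous = (1.0, 0.1)                 # cursor chosen on the previous frame

near = constrained_cost((1.0, 0.0), toy_pose, targets, previous)
far = constrained_cost((1.0, 5.0), toy_pose, targets, previous)
```

Without `previous_cursor` both candidates fit the hand constraint equally well; with it, the candidate closer to the previous cursor gets the lower cost, which is the tie-breaking behaviour wished for above.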

FEBRUARY 08 2006
One of the more interesting ideas brought up at the last meeting was associating a character pose with the most fitting cursor position in keyframe space (in a way an inverse of what the system is currently doing). I have decided to approach this problem by trying to find a cursor position that minimizes the difference between the desired pose and the pose generated by that cursor position. A particularly useful optimization algorithm for this situation is Particle Swarm Optimization. Instead of starting with a certain number of random particles, the algorithm was modified to start with particles corresponding to keyframe positions, which then "flock" to a (hopefully) global minimum. This method has the benefit of choosing a minimum that is close to the keyframes (where the system is most certain about poses it generates).
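
A minimal sketch of the modified Particle Swarm Optimization described above, with particles seeded at the keyframe positions instead of at random points. The function name `pso_fit_cursor`, the coefficient values and the toy cost are assumptions made for illustration; in the real system the cost would be the difference between the desired pose and the pose generated at a cursor position.

```python
import random

def pso_fit_cursor(cost, keyframe_positions, iterations=100,
                   inertia=0.7, cognitive=1.5, social=1.5, seed=0):
    """Minimise cost(position), starting one particle at each keyframe."""
    rng = random.Random(seed)
    dims = len(keyframe_positions[0])
    positions = [list(p) for p in keyframe_positions]
    velocities = [[0.0] * dims for _ in positions]
    best_pos = [list(p) for p in positions]       # per-particle best
    best_cost = [cost(p) for p in positions]
    g = min(range(len(positions)), key=lambda i: best_cost[i])
    global_pos, global_cost = list(best_pos[g]), best_cost[g]

    for _ in range(iterations):
        for i, pos in enumerate(positions):
            for d in range(dims):
                r1, r2 = rng.random(), rng.random()
                # Standard velocity update: inertia + pull towards the
                # particle's own best + pull towards the swarm's best.
                velocities[i][d] = (inertia * velocities[i][d]
                                    + cognitive * r1 * (best_pos[i][d] - pos[d])
                                    + social * r2 * (global_pos[d] - pos[d]))
                pos[d] += velocities[i][d]
            c = cost(pos)
            if c < best_cost[i]:
                best_cost[i], best_pos[i] = c, list(pos)
                if c < global_cost:
                    global_cost, global_pos = c, list(pos)
    return global_pos, global_cost

# Toy usage: four keyframes in a 3D keyframe space and a made-up cost.
keyframes = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
target = (0.3, 0.2, 0.1)

def toy_cost(p):
    return sum((a - b) ** 2 for a, b in zip(p, target))

best_position, best_cost_found = pso_fit_cursor(toy_cost, keyframes)
```

Because the swarm starts at the keyframes, the result is never worse than the best keyframe itself, and the search stays biased towards the region the keyframes cover. Nothing here guarantees a global minimum.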

I have experimented with various metrics of difference between poses; the distance between vectors containing every joint's position (in the world coordinate system) seemed to yield the best results. The largest drawback is that the fit is not sensitive to the orientations of the joints.
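
The position-based metric can be written down in a few lines. This sketch assumes a pose is given as a list of world-space (x, y, z) joint positions in a fixed joint order; `pose_distance` is a hypothetical name, not a function from the plugin.

```python
import math

def pose_distance(pose_a, pose_b):
    """Euclidean distance between two poses, each a list of (x, y, z)
    world-space joint positions in the same joint order."""
    if len(pose_a) != len(pose_b):
        raise ValueError("poses must have the same number of joints")
    return math.sqrt(sum((a - b) ** 2
                         for joint_a, joint_b in zip(pose_a, pose_b)
                         for a, b in zip(joint_a, joint_b)))

# Two-joint toy poses: only the second joint differs, by a 3-4-5 offset.
rest = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
moved = [(0.0, 0.0, 0.0), (4.0, 4.0, 0.0)]
difference = pose_distance(rest, moved)
```

Joint rotations never enter the computation, which is exactly the drawback noted above: a joint twisted in place leaves the distance unchanged.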

Much can be done to improve the performance of this process, but I decided not to focus on such things until the design is more mature. The next interesting step would be to create best-fit cursor positions given certain constraints (positions of the hands and feet, for example) rather than the entire pose. This would be similar to Style IK, with the added bonus of the user being able to modify the cursor position if the best fit is unsatisfactory.

JANUARY 11 2006
Stage 2 of the project is finished, and the animation tree editor is complete. The Schematic View UI turned out to be less flexible than expected, but ended up working well. Below is a video of how the animation tree editor behaves. Each node in the editor represents a separate keyframe space.

NOVEMBER 16 2005
Stage 1 of the project is finished. My predictions were:

Get familiar with 3D Studio MAX SDK and implement simplified spatial keyframing as a plugin. This should behave as a prototype, but must support saving/loading of the scene and being placed in a tree unlike the prototype (design has to be changed). Don't use radial basis functions for interpolation; use simpler methods instead (which, however, can't do extrapolation).
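
This log does not name the "simpler methods" used in place of radial basis functions. One candidate with the stated property (interpolation only, no extrapolation) is inverse-distance weighting, sketched below purely as an assumption. `blend_pose` and the flat-list pose representation are hypothetical.

```python
import math

def blend_pose(cursor, keyframes, power=2.0):
    """Blend keyframe poses with inverse-distance weights.

    keyframes: list of (position, pose) pairs, where position is an
    (x, y, z) tuple and pose is a flat list of floats (assumed layout).
    """
    weights = []
    for position, pose in keyframes:
        d = math.dist(cursor, position)
        if d < 1e-9:
            return list(pose)          # cursor sits exactly on a keyframe
        weights.append(1.0 / d ** power)
    total = sum(weights)
    dims = len(keyframes[0][1])
    # Weights are positive and normalised, so each output value is a convex
    # combination of the keyframe values and cannot leave their range.
    return [sum(w * pose[i] for w, (_, pose) in zip(weights, keyframes)) / total
            for i in range(dims)]

keys = [((0.0, 0.0, 0.0), [0.0]), ((2.0, 0.0, 0.0), [10.0])]
midway = blend_pose((1.0, 0.0, 0.0), keys)
outside = blend_pose((10.0, 0.0, 0.0), keys)
```

Even with the cursor well outside the keyframes, the blended value stays between the keyframe values, which is the "can't do extrapolation" limitation.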

I've actually finished more than I predicted. Besides implementing spatial keyframe nodes as a plugin, I also implemented management and interpolation for entire trees of these nodes (which was a chunk of stage 2). Below is a video showing how a tree of spatial keyframe nodes behaves. Each cross on the left represents a tree node (for which I will be creating a GUI in stage 2).

There are a few rough edges at this point, however. Firstly, all controlled objects still use a scripted controller which the user is free to edit (and thereby break the system's functionality). It would be nice to create a plugin controller so these details are hidden from the user. Secondly, when nodes are deactivated, the corresponding cursor and spatial keyframes are simply hidden (but still exist in the scene). It would probably get annoying to have many objects in the scene that the user did not create. A nice thing to do would be to create/delete them as nodes are activated/deactivated, or to group them. Finally, spatial keyframes only work on still poses right now. It would be nice to blend entire animation tracks (as in [2]) instead of just poses.

NOVEMBER 2 2005
3D Studio MAX SDK is big. It will still take some time to get comfortable with it, but I think I figured out how to do most of what I want with it for now. More specifically, I now have a C++ plugin that can maintain complex data structures, save information with the scene, and perform heavy computation that's just not practical in a scripting language. At the moment, the plugin does not have a GUI, instead exposing its functionality to MAXScript. Thus, my project will probably end up as a MAXScript/plugin hybrid.

Working with the plugin, I implemented dimensionality reduction for character poses through Principal Component Analysis. It could be useful on its own for UI (automatically suggesting spatial keyframe positions for a series of poses), but would probably be most useful for my Stage 5's PCA+MoG last resort. The implementation is straightforward, but does make use of David Eberly's Symmetric Eigensolver. Below is a demo video of an animation and of 40 poses reduced to 2 and 3 dimensions (the red marker indicates the currently active pose):
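
The PCA reduction described above can be sketched as follows. The plugin used David Eberly's symmetric eigensolver; this sketch swaps in plain power iteration with deflation to stay self-contained, and the names `pca_project` and `dominant_eigenvector` are hypothetical. Poses are assumed to be flat lists of floats of equal length.

```python
import math
import random

def dominant_eigenvector(matrix, rng, iterations):
    """Power iteration on a symmetric matrix; returns (vector, eigenvalue),
    or (None, 0.0) when the matrix is numerically zero."""
    dims = len(matrix)
    v = [rng.uniform(-1.0, 1.0) for _ in range(dims)]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(dims)) for i in range(dims)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:
            return None, 0.0
        v = [x / norm for x in w]
    w = [sum(matrix[i][j] * v[j] for j in range(dims)) for i in range(dims)]
    return v, sum(a * b for a, b in zip(v, w))

def pca_project(poses, n_components=2, iterations=200, seed=0):
    """Return (coordinates, components) for the top principal components."""
    rng = random.Random(seed)
    n, dims = len(poses), len(poses[0])
    mean = [sum(p[d] for p in poses) / n for d in range(dims)]
    centered = [[p[d] - mean[d] for d in range(dims)] for p in poses]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(dims)]
           for i in range(dims)]
    components = []
    for _ in range(n_components):
        v, eigenvalue = dominant_eigenvector(cov, rng, iterations)
        if v is None:
            break                      # no variance left to explain
        components.append(v)
        # Deflation: remove the found component from the covariance matrix.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(dims)]
               for i in range(dims)]
    coords = [[sum(c[d] * comp[d] for d in range(dims)) for comp in components]
              for c in centered]
    return coords, components

# Toy data: five 3-value "poses" lying on a straight line.
line_poses = [[float(t), 2.0 * t, 0.0] for t in range(5)]
line_coords, line_components = pca_project(line_poses, n_components=1)
```

Each row of the returned coordinates could serve as a suggested low-dimensional position for the corresponding pose. The sign of each component is arbitrary, so the layout may come out mirrored.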

OCTOBER 17 2005
Still considering the idea of making spatial keyframes sensitive to orientation in addition to position, I realized I would need to know how "close" the keyframe orientation is to my cursor orientation. There are a few proposed solutions to this and all claim to be only approximations. Luckily, somebody thought distance metrics were important enough to write an entire paper on them. It suggests that finding the distance between two quaternions is as cheap as taking their inner product. Another source uses the distance between the logarithms of the quaternions, while yet another just states 2acos(|<q1,q2>|) without any justification... Looks like I'll have to try each of them and see how they behave. As long as I'm consistent, any of them will probably do.
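
The three candidates can be compared side by side with a small sketch. Assumptions: quaternions are unit-length (w, x, y, z) tuples; `1 - |<q1,q2>|` is used as one common way of turning the inner product into a distance (the paper's exact form is not quoted here); and the logarithm variant first flips each quaternion to w >= 0 so that q and -q are treated as the same rotation. All function names are hypothetical.

```python
import math

def quat_dot(q1, q2):
    return sum(a * b for a, b in zip(q1, q2))

def inner_product_distance(q1, q2):
    """0 for identical rotations, growing as the rotations diverge."""
    return 1.0 - abs(quat_dot(q1, q2))

def angle_distance(q1, q2):
    """2*acos(|<q1,q2>|): the rotation angle (radians) taking q1 to q2."""
    return 2.0 * math.acos(min(1.0, abs(quat_dot(q1, q2))))

def quat_log(q):
    """Logarithm of a unit quaternion as a 3-vector (half-angle * axis)."""
    w, x, y, z = q if q[0] >= 0.0 else tuple(-c for c in q)
    v_norm = math.sqrt(x * x + y * y + z * z)
    if v_norm < 1e-12:
        return (0.0, 0.0, 0.0)
    half_angle = math.atan2(v_norm, w)
    return tuple(half_angle * c / v_norm for c in (x, y, z))

def log_distance(q1, q2):
    """Euclidean distance between the logarithms of the two quaternions."""
    return math.dist(quat_log(q1), quat_log(q2))

identity = (1.0, 0.0, 0.0, 0.0)
# 90 degree rotation about the z axis.
quarter_turn = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))

d_inner = inner_product_distance(identity, quarter_turn)
d_angle = angle_distance(identity, quarter_turn)
d_log = log_distance(identity, quarter_turn)
```

The three values are on different scales (for this pair, the angle metric gives the full rotation angle while the logarithm metric gives half of it), so whichever is chosen has to be used consistently, as noted above.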

This of course would mean that distance between positions and distance between orientations will be unrelated to each other and I'm wondering if this would cause problems. Another possible option is to somehow find a distance between two transform matrices, which would tie position and orientation together.

OCTOBER 17 2005
My roadmap:

Core:

Stage 1: Get familiar with 3D Studio MAX SDK and implement simplified spatial keyframing as a plugin. This should behave as a prototype, but must support saving/loading of the scene and being placed in a tree unlike the prototype (design has to be changed). Don't use radial basis functions for interpolation; use simpler methods instead (which, however, can't do extrapolation).

Stage 2: Implement the animation tree editor with a GUI and integrate the work in Stage 1 as a tree node. This is not a trivial step as MAX does not favor node-based editors. Hopefully Schematic View UI base can be used to edit animation hierarchy. If not, there's always the last-resort option of hacking in some non-standard node-editing UI such as Bobo's TreeMatoGraph (in no way is offense meant to Bobo's great script)

Stage 3: Finish tree editor GUI and create a GUI for setting the animation node's influence on a per-object basis. Switch to radial basis functions for interpolation. Investigate the viability of setting spatial keyframes per bone (instead of per character as in [1]) and viability of having keyframes sensitive to orientation in addition to position.

Extra:

Stage 4: Work with the system to create sample animations for presentations and to make sure it's actually usable. Investigate other innovative uses of spatial keyframes and hierarchical animation (both concepts provide a lot of power, but that is not immediately obvious). Create any necessary GUI graphics. One possible animation project is to create a walk along a complex path only by specifying the placement of the feet, while blending between different walk styles and layering other actions on top of the walk (and then focus on how easy it is to modify the result). Another possible project is to create a complex animation (possibly involving martial arts) and then draw only the cursors the animator had to control to create it (and focus on how simple it is to control complex animations).

Stage 5: Implement support for performance-driven animation (this would involve getting MAX to recognize one of the DGP lab devices as a motion capture input) or implement automatic keyframe space generation from examples. The latter would work as a new node type. Ideally, this would be implemented using SGPLVM as described in [3] using Neil Lawrence's codebase [5]. Even though the implementation is said to be straightforward, I don't think I have enough background in Gaussian Processes to deal with all the subtle issues that would invariably come up. Another less effective method would be to use Principal Component Analysis and Mixture of Gaussians, which I am at least familiar with.

Stage 6: Continue implementation of automatic keyframe generation and/or implement any extra ideas (discovered as a result of Stage 4) and make final adjustments. One possible idea for extras right now is to make several other tree nodes to show that the system can be extended (such as an IK solver node that blends IK results with the rest of the nodes, so that IK is not a special-case operation as in [1]).

My predictions:

Best Case Scenario: By the end of Stage 2, the core is complete as a MAX plugin. The tree editor uses standard MAX GUI elements and is well integrated into the software. By the end of the second semester, there are well-designed animations showcasing the capabilities of the system. Automatic keyframe space generation is implemented similar to [3]. Any extra improvements are implemented as well.

Worst Case Scenario: By the end of Stage 2, the core is complete but is MAXScript-based instead of plugin-based. The tree editor uses a hacked-in interface or simply a collection of dummy objects in the scene representing tree nodes (no interface). By the end of the second semester, the core usability is improved. There are some animations created by the system. Automatic keyframe space generation is either a shoddy implementation using PCA and Mixture of Gaussians or not implemented at all. No extra improvements are implemented.

OCTOBER 16 2005
A productive day! I started and finished a MAXScript-based prototype/interactive mockup of how a single spatial keyframe node should behave. Below is a video of it in action and a mockup of the tree editor GUI: