Travels of Code Monkey: Final Thoughts

January 9, 2010 sporksmith

This is part 4 of a series describing the making of the Travels of Code Monkey project.

Technology choices

I owe a lot to Processing for allowing me to quickly sketch my ideas for this project, even with my limited graphics and animation programming experience. That being said, it’s not without its limitations. A lot of the benefit of Processing is in hiding the complexity of Java and its graphics APIs. The Processing IDE rewrites your Processing code to Java code, which is then handed to a standard Java compiler. Errors from the compiler or run-time refer to that intermediate Java code rather than the original Processing code. Usually I could still locate the source of the error, but sometimes it could be a bit tricky. There’s also no debugger; it’s ‘println’-style debugging all the way.

The Processing framework can also be used from within Java code. Once a project gets to a certain level of complexity, it probably makes sense to rewrite it as Java code that calls the Processing library. I never bit the bullet and took that step, but I would seriously consider doing so before doing much more with the current code. The Processing syntax is pretty close to Java syntax, so theoretically it should be fairly straightforward.

I also owe a big thanks to Blender, which was a huge help for the animation and video sequence editing portions of the project. Blender is a powerful tool, but its learning curve is a bit steep. In particular, it can be a bit tough to get your bearings when using the Python API. For my animation export plugin, I managed to get by with looking at and stealing code from other plugins, and reading the individual API function documentation.

Sharing and motivation

I debated a lot whether and how much to share about this project when it was in progress. I’ve really enjoyed reading other technical works in progress, such as Shamus Young’s Procedural City series. Given that I only had a few hours here and there to work on this project, though, I was worried that I wouldn’t be able to hold people’s interest, or that someone with more time and/or expertise would get tired of waiting for me to finish the project and beat me to the punch.

I needed to share with someone though, and I ended up sharing progress updates with Hack Pittsburgh and DevHouse instead of the Internet at large. It definitely paid off to share – seeing other people get excited about the project really helped motivate me to bear down and finish it, and I had a lot of good technical feedback along the way. I tried not to subject people to too many updates so that they wouldn’t get tired of the whole thing before the final result came out.

In the end I probably wouldn’t have had all that much to fear by sharing on the Internet, and maybe I would have had that much more motivation and feedback as a result. I’ll never know for sure.

What next?

I think there’s a lot more that could be done with this style of animation, and maybe with the tools I’ve written so far. For now, I’m taking a break from this direction for a bit, but releasing the code in case anyone else would like to give it a try. It should be possible to make some interesting animations with the tools in their current state with the monkey pictures. It should also be straightforward to extend the tools into generalized tools for opportunistic stop-motion using arbitrary objects. Some ideas for improving the tools include:

  • Better matching/rendering performance. Faster is always better, and I think with a little work it should be perfectly feasible to perform real-time matching and rendering. It could then be used to create on-the-fly animations by again ‘driving’ the character around, or through digital puppetry techniques.
  • Command-line matching/rendering tool. As I was iterating on the animation, I’d often end up needing to re-render parts of a sequence, or several small sequences. It would have been nice to be able to set up a Makefile to re-render everything that needed to be re-rendered, rather than having to choose individual sequences through the GUI.
  • Generalize to easily support other characters. This will require a little bit of refactoring, but should be straightforward for the most part. Right now the biggest remaining pain is that the character must be modeled separately in Processing and Blender. The ideal solution is probably to model it once in Blender, and import the model into Processing.
  • Collaborative labeling. The more labeled photos that are available, the better this technique will work. It may be possible to integrate with existing photo sharing sites such as Flickr, using tags to encode which characters are present in a photo and their positions/orientations. For an example of this sort of tool, see Colr Pickr. Alternatively, an open and independent database with a web services API could store this metadata, along with links to the photos.
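As a rough illustration of the Makefile idea above, the staleness check that such a command-line tool would need might look like the following Python sketch. It is not taken from the project's code: the function names, the one-directory-per-sequence layout, and the ".jpg" frame suffix are all assumptions made for the example.

```python
import os


def needs_render(animation_path, output_dir):
    """Return True if the rendered frames are missing or older than the animation file."""
    if not os.path.isdir(output_dir):
        return True
    # Assumed layout: one directory of .jpg frames per sequence.
    frames = [os.path.join(output_dir, name)
              for name in os.listdir(output_dir)
              if name.endswith(".jpg")]
    if not frames:
        return True
    source_time = os.path.getmtime(animation_path)
    oldest_frame = min(os.path.getmtime(frame) for frame in frames)
    return oldest_frame < source_time


def stale_sequences(sequences):
    """sequences maps a name to (animation_path, output_dir); return the names to re-render."""
    return [name
            for name, (animation_path, output_dir) in sorted(sequences.items())
            if needs_render(animation_path, output_dir)]
```

This is the same timestamp comparison that make performs, so a real Makefile could replace it entirely once the renderer can be invoked without the GUI.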

Feedback

I’m taking a break from this project for now, but depending on people’s interests I may make more animations and/or improve the tools.

The code is still quite rough, but if you’d like to take a look, it’s now available on GitHub: EepEepMotion

Let me know what you think!

Categories: Uncategorized

Monkey Animation Project Part 3: Animation

November 27, 2009 sporksmith

This is part 3 of a series describing the making of the Travels of Code Monkey project.

Up to this point, I wasn’t sure exactly what sort of animation I wanted to make. The original intent was to make the simplest possible animation that used all of the photos exactly once, perhaps at a speed where you could actually look at the photos. In fact, I made a couple of these without too much trouble, basically by sorting the monkey pictures, and then moving through them. The first version of this just had Mr. Monkey sitting in the middle and gradually rotating a full 360 degrees. This was cute, but it was a bit boring, and many of the pictures were scaled too large, or too small, or cropped quite a bit, etc. The next step was to divide the screen into sections, and have Mr. Monkey gradually move around to use all of the pictures, with each one left mostly intact.

Here is one of those test animations. In this one, the pictures are sorted into a number of ‘size’ buckets, and then within each size bucket sorted by rotation. The result is that Mr. Monkey starts out small / far away, and gradually gets larger / closer, while rotating.
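The bucket-then-rotation ordering described here can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Label` fields, the equal-width buckets, and the bucket count are hypothetical and not taken from the original Processing code.

```python
from collections import namedtuple

# Hypothetical label record: photo path, head size in pixels, rotation in degrees.
Label = namedtuple("Label", "path size rotation")


def bucket_order(labels, bucket_count):
    """Sort labels into equal-width size buckets, then by rotation within each bucket."""
    if not labels:
        return []
    smallest = min(label.size for label in labels)
    largest = max(label.size for label in labels)
    span = (largest - smallest) or 1.0  # avoid dividing by zero when all sizes match

    def bucket(label):
        index = int((label.size - smallest) / span * bucket_count)
        return min(index, bucket_count - 1)  # the largest photo lands in the last bucket

    return sorted(labels, key=lambda label: (bucket(label), label.rotation))
```

Playing the photos back in this order gives the small-to-large progression, with the rotation sweeping once per bucket.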

I was tempted to call the project done at that point, but, thanks partially to some prodding from my then-fiancée (now wife) BlackCatBonifide, I decided to take it to the next level, and make a real animation. I started off trying to create some animation tools within my Processing app. First I added a mode where I could “drive” mock-up Mr. Monkey around like an airplane with the arrow keys, after which the app would go back and “render” the animation using the real photos. This was amusing, but the results were unsurprisingly quite sloppy. It turns out there’s a reason you don’t see many 3rd person 3D flight simulators.

Suzanne meets Mr. Monkey

My cheap tricks having failed, I resigned myself to doing “real” animation, with keyframes and tweening. I spent a while debating whether or not to implement these things myself inside the Processing app. The advantage to that approach would have been to have the entire process smoothly integrated inside one application. In the end, though, I decided against reinventing the wheel. Instead, I used Blender, an open-source 3D modeling, animation, and non-linear video editing package. Coincidentally, Blender already comes with a model of a monkey head: Suzanne. Unfortunately, she has somewhat different proportions from our own Mr. Monkey, so I ended up re-creating him anyway.

I modified one of Blender’s existing export scripts (using its Python API) to export a given animation of a single object in a very simple text format: each line has the desired x and y coordinates, size, and rotations for a single frame of animation. Finally, I extended my Processing app to read the animation files exported from Blender. It then performs the previously described matching algorithm for each frame of the animation, and dumps each frame as a sequentially named JPEG. I eventually imported these back into Blender’s non-linear video editing tool to create the final product.
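To make the one-line-per-frame format concrete, here is a minimal Python sketch of writing and reading such a file. The exact layout of the real export is not given in this post, so the field order (x, y, size, then yaw, pitch, roll), the whitespace separator, and the names `Frame`, `format_frame`, and `parse_animation` are all assumptions. The sketch deliberately avoids the Blender API and covers only the text format.

```python
from collections import namedtuple

# Assumed field order; the real export script may differ.
Frame = namedtuple("Frame", "x y size yaw pitch roll")


def format_frame(frame):
    """Write one frame as a single whitespace-separated line."""
    return " ".join("%.4f" % value for value in frame)


def parse_frame(line):
    """Read one line back into a Frame of floats."""
    return Frame(*(float(field) for field in line.split()))


def parse_animation(text):
    """Parse a whole exported file, one frame per non-blank line."""
    return [parse_frame(line) for line in text.splitlines() if line.strip()]
```

A format this simple is easy to produce from a Blender script and just as easy to read with a few lines of string splitting on the Processing side.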

The Real Work: Animating

With the tool chain more or less complete, I was left to finally create the animation itself, using Blender’s built-in animation tools. For the verses, I started out by choosing photos that I wanted to freeze on at key parts of the song, and setting up those poses as key frames. Of course, these initial animations were quite boring, with Mr. Monkey drifting in a straight line between each key frame. I spent quite a bit of time making these animations more interesting, by having Mr. Monkey travel in circular paths, rock his head to the music, jump around the screen, etc. Creating interesting, smooth, natural-looking animations even for something as simple as a monkey head is not easy, and I have a lot of respect for those who can do it well. Luckily for me, I only had to do it not-too-terribly; I relied on the photos to make things interesting. Once I had animated Mr. Monkey’s disembodied head flying around in Blender, I exported the animation to text, imported that into my Processing tool, and used that to render the animation using the pool of labeled photos.

Cheap isolation, using mock Mr. Monkey's silhouette as a mask.

For the chorus animations, I utilized my secret weapon: my wife Bonnie (BlackCatBonifide), who does professional work in multimedia, particularly in animation and music. She had already given me a lot of inspiration and feedback up to this point, and she proposed that she do the chorus animations to mix up the animation style a bit. I happily agreed. I created some simple monkey animation loops for her to work with, including versions with Mr. Monkey’s head isolated from the rest of the photo, using Mock Mr. Monkey’s head rendered in silhouette as a mask. Finally, Bonnie combined these with other photos from the Travels of Code Monkey pool, and with other creative-commons photos from Flickr, animating everything together using Final Cut.

Until next time

That just about sums up the technical process. There will probably be at least one more post in the near future with some final thoughts, including what I might have done differently knowing what I know now, and what’s next for animation tools.


Monkey Animation Project: Now online!

November 21, 2009 sporksmith 3 comments

Here is the finished video!

Unfortunately, YouTube recompresses everything, and I think the unusual animation style used here does not play well with their compression techniques. I’m still pretty happy with it, but I’m thinking about posting a better version somewhere else. One solution may be to seed it on BitTorrent, but I don’t know if my frail Comcast upload pipe would fare too well. Any ideas, lazy web?
EDIT: Actually, it looks good when it doesn’t drop frames. Unfortunately it sometimes drops a lot of frames. Letting the video load 100% seems to help a little bit. It also happens more on some computers than others.

I’m still working on writing up more about the development process, including posting some videos of variations of this animation technique. Stay tuned!


Monkey Animation Project Part 2: Matching

November 17, 2009 sporksmith 5 comments

If you haven’t already, you might want to read Part 1: Labeling.

The next feature I needed was, given a desired position for Mr. Monkey, to come up with a photo with Mr. Monkey in that position. There are a number of ways to come at this problem, but let’s first consider the problem of taking any of the labeled photos, and making it match the desired position and orientation. We can actually come pretty close for any of the photos, using the following transformations:

  • Scale. Scale the photo so that Mr. Monkey in the photo is exactly the desired size.
  • Pan. Move the photo so that Mr. Monkey in the photo is at exactly the desired location.
  • Rotate. Unfortunately, I don’t have access to the magic CSI computers to rotate Mr. Monkey in the photo freely in all dimensions, so it’s not generally possible to make Mr. Monkey in the photo face exactly the desired direction. I can’t make him face the camera if he’s facing away from the camera in the photo, but if he’s facing the camera straight-on, I can rotate the photo to make him lean to the left. There’s some annoying 3d geometry required to do this in a fully general way, of which we shall not speak.
  • Mirror. Here’s one more dirty trick I can pull to make Mr. Monkey face closer to the desired location. If he’s facing to the left, I can make him face to the right instead by mirroring the photo.
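The four transformations above can be sketched as a small Python function that works out what to do to one labeled photo for a given target pose. This is a simplified 2D illustration, not the project's actual code: it only handles in-plane rotation (roll), and the `Pose` fields, the sign convention for yaw (negative meaning facing left), and the choice to scale and rotate about the head's centre are assumptions.

```python
from collections import namedtuple

# Hypothetical pose record: head centre (x, y), head size in pixels,
# in-plane roll in degrees, and yaw in degrees (negative = facing left).
Pose = namedtuple("Pose", "x y size roll yaw")
Transform = namedtuple("Transform", "scale dx dy rotate mirror")


def fit_photo(label, target):
    """Work out how to transform a labeled photo so the head matches the target pose."""
    # Mirror when the head faces the opposite side from the one we want.
    mirror = label.yaw * target.yaw < 0
    # Flipping the photo horizontally also flips the sign of the in-plane roll.
    label_roll = -label.roll if mirror else label.roll
    return Transform(
        scale=target.size / label.size,   # make the head the desired size
        dx=target.x - label.x,            # pan the head centre onto the target
        dy=target.y - label.y,
        rotate=target.roll - label_roll,  # remaining in-plane rotation
        mirror=mirror,
    )
```

Because the scaling and rotation are taken about the head's centre, the pan is just the difference between the two centres. Whatever yaw and pitch mismatch remains after mirroring cannot be fixed by a 2D transform and has to be treated as a cost when choosing between photos.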

While we can force any of the photos to sort-of match the desired position and rotation, it’s not going to look equally good for every photo. Here are some of the problems we can end up with:

  • Rotation still doesn’t match closely. There’s only so much I can do to change which direction Mr. Monkey is facing.
  • Some of the picture is no longer in frame. After applying the above transformations (scaling, panning, and rotating), some of the photo may be effectively cropped off (out of frame). If Mr. Monkey is talking to someone in the photo, cropping that person out of the photo causes it to no longer make sense.
  • Some of the frame might be unused. If Mr. Monkey was all the way on the right side of the photo, and we moved him to be all the way on the left side of the frame, not only is the picture out of frame, but a lot of the frame is unused, revealing a boring blank background. This can also happen if we scale the photo down smaller than the frame size.
  • Pixelation from scaling. Mr. Monkey starts looking chunky if we scale him from fifty pixels to two hundred…twunkey.
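Two of the problems above, photo out of frame and frame left unused, come down to rectangle overlap. The following Python sketch shows one way to measure them, along with a simple upscaling factor for pixelation. It is an illustration under assumptions: rectangles are axis-aligned (rotation is ignored), and the function names and the (left, top, right, bottom) convention are hypothetical rather than taken from the project.

```python
def area(rect):
    """Area of a rectangle given as (left, top, right, bottom)."""
    return (rect[2] - rect[0]) * (rect[3] - rect[1])


def overlap_area(a, b):
    """Area shared by two axis-aligned rectangles (0 if they do not touch)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0) * max(height, 0)


def coverage_penalties(photo, frame):
    """Return (fraction of photo out of frame, fraction of frame left unused)."""
    shared = overlap_area(photo, frame)
    out_of_frame = 1.0 - shared / area(photo)
    unused_frame = 1.0 - shared / area(frame)
    return out_of_frame, unused_frame


def upscale_factor(label_size, target_size):
    """How much the head is enlarged; 1.0 means no enlargement (no pixelation)."""
    return max(target_size / label_size, 1.0)
```

For example, a photo shifted halfway off the right edge of a same-sized frame gives 0.5 for both fractions, and scaling a fifty-pixel head up to two hundred pixels gives an upscale factor of 4.0.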

So, to find the best match, we just work out which transformations are needed to make each candidate photo match, see which candidate suffers least from those undesired effects, and go with that one. There are smarter things we could do here, but this is good enough for now. I’ve thought about using a more sophisticated data structure so that I don’t need to consider the entire photo pool for each match, but for now the simple way is good enough: it takes a little under one second to find the best match, load the selected photo from disk, and apply the transformations.

A couple of notes about the video: first, I had the mirroring transformation turned off, so none of the matches demonstrated are mirrored. Of course, the mirroring is typically only noticeable if there happens to be some text in the photo. Second, if you are somewhat obsessed with arithmetic, you might have noticed that the components of the match score don’t add up to the total score. This is because there is a weighting applied to each sub-score before adding them all together. For example, in the current settings, the fraction out-of-frame is multiplied by 0.1 before adding it to the other scores. In the actual animation, having a large portion of an individual photo go out of frame doesn’t matter much, because it won’t be on-screen long enough for you to notice anyway; it’s more important that the rotations match, and that it isn’t scaled too horrifically.
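The weighting can be sketched as a weighted sum followed by a linear scan over the candidates. In this Python illustration only the 0.1 weight on the out-of-frame fraction comes from the text above; the other weights, the sub-score names, and the convention that a lower total is better are assumptions.

```python
# Only the 0.1 weight for "out_of_frame" comes from the post; the rest are placeholders.
WEIGHTS = {
    "rotation": 1.0,
    "out_of_frame": 0.1,
    "unused_frame": 1.0,
    "pixelation": 1.0,
}


def total_score(sub_scores, weights=WEIGHTS):
    """Weighted sum of the penalty sub-scores (lower is better in this sketch)."""
    return sum(weights[name] * value for name, value in sub_scores.items())


def best_match(candidates, weights=WEIGHTS):
    """candidates maps a photo path to its sub-scores; return the path with the lowest total."""
    return min(candidates, key=lambda path: total_score(candidates[path], weights))
```

With these weights, a photo that is 90% out of frame but has a rotation penalty of 0.2 totals 0.29 and beats a fully visible photo with a rotation penalty of 0.5, which is why the displayed components do not simply add up to the total.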

Next time: Animation!


Monkey Animation Project Part 1: Labeling

November 16, 2009 sporksmith 4 comments

This Friday, we (BlackCatBonifide and I) release our Code Monkey video, comprising 600 photos of a stuffed monkey, traveling across the world at 12 frames per second, to the tune of Jonathan Coulton’s song, Code Monkey. This week I will post a few essays describing the steps leading up to the final product.

This project has been almost two years in the making, all starting with this forum comment from Colleenky:

OK, this might be a crazy idea. What if I mailed a stuffed monkey to one of you, and you mailed it to another JoCo fan, and so on, until it finally reached JoCo one year from now on JoCo Day 2008? We could set up a Picasa site for pictures of the monkey in every location.

Over the course of the project, Mr. Monkey visited JoCo fans all over the U.S., including Bon and myself in Pittsburgh, and even ventured across the pond to visit fans in Sweden and the U.K. The fans were good enough to document Mr. Monkey’s travels, and upload them to a shared Picasa album. I had the idea almost immediately to put together an animation where Mr. Monkey stays in one place, with a flickering background, à la “Paris Hilton’s Face Never Changes”. I pretty quickly realized, though, that most of the pictures being uploaded did not feature Mr. Monkey in the same pose; the distance and viewing angle from the camera varied pretty drastically. This made the problem more difficult, but also much more interesting. Instead of just making a simple gimmick animation where Mr. Monkey stays in one place, I got to create a complex gimmick animation where Mr. Monkey is semi-smoothly animated monkeying his way around the screen!

I wasn’t sure exactly where I was going with this when I started the project. It was exploratory coding, with the vague intention of creating some sort of animation. Many shortcuts were taken. Since I typically only had a couple hours at a time to work on it, I made it a point to take the shortest path to making something cool happen. Given a choice between a dirty hack I could pound out in one session, and doing something “right” across multiple sessions, I did the quick hack every time, and only returned to the more difficult path when and if the hack turned out not to be good enough.

Step 1: Labeling

The first step was to figure out how to record Mr. Monkey’s position, orientation, and scale in each picture. In accordance with my philosophy of doing the simplest, dumbest thing that might work, my first attempt was to manually use GIMP to measure Mr. Monkey’s (x,y) position and size in pixels, guesstimate his yaw, pitch, and roll, and annotate all of these as a prefix to the filename of each picture. This approach turned out to be a bit too simple to work. I actually did get as far as adding a prefix to all of the pictures specifying Mr. Monkey’s pitch, sorting by filename, and using that as a test animation where Mr. Monkey smoothly turned his head, but jumped around all over the screen. It was a start, but it quickly became evident that this labeling approach was too slow and inaccurate to be practical. The project got put on hold for a while at this point.
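For readers curious how a filename prefix can double as a sort key, here is a hypothetical Python sketch. The actual prefix scheme used in that first attempt is not described, so the zero-padded three-digit prefix, the 180-degree offset, and the function names are assumptions chosen only so that a plain filename sort orders the photos by pitch.

```python
def labeled_name(pitch_degrees, original_name):
    """Prefix a filename with the pitch, shifted and zero-padded so text sorting works.

    Assumes the pitch lies between -180 and 180 degrees.
    """
    # Offset by 180 so that negative angles still sort correctly as text.
    return "%03d_%s" % (round(pitch_degrees) + 180, original_name)


def pitch_from_name(name):
    """Recover the pitch (in whole degrees) from a labeled filename."""
    return int(name.split("_", 1)[0]) - 180
```

Sorting a directory listing of such names then yields the photos in order of increasing pitch, which is all that the first test animation needed.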

Some weeks or months later, I stumbled across Processing. Processing is a programming environment meant to make graphics programming accessible to non-programmers. It is highly simplified, yet surprisingly capable. While I am a computer engineer by trade, I wasn’t quite motivated enough to learn full-fledged 3D graphics and GUI programming for the purposes of this side-project. Processing turned out to be the shortest path to making something cool happen. While I did bump up against some of its limitations, it was capable enough to do what I needed to do, and simple enough to keep the cool-stuff-happening to pain ratio above my screw-this-project threshold.

If I remember correctly, I managed to make the first version of the labeling application and do a first pass at all the labeling in a couple weeknights and a weekend. In the application there is a very simple mock-up of Mr. Monkey’s head. For each picture, I just drag the mock-up head over Mr. Monkey’s real head in the photo, scale it, and rotate it, until it exactly overlaps. Then I hit save, and the path-name to that picture, and Mr. Monkey’s coordinates within the picture, get appended to an index-file.
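The save step amounts to appending one record per photo to an index file. The real file layout is not given here, so the following Python sketch assumes a tab-separated line holding the photo path followed by x, y, size, yaw, pitch, and roll; the function and field names are hypothetical.

```python
import csv

# Assumed column layout; the real index file may differ.
FIELDS = ["path", "x", "y", "size", "yaw", "pitch", "roll"]


def append_label(index_path, photo_path, x, y, size, yaw, pitch, roll):
    """Append one labeled photo to the index file as a tab-separated line."""
    with open(index_path, "a", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow([photo_path, x, y, size, yaw, pitch, roll])


def read_labels(index_path):
    """Read the index back as a list of dicts, with the numeric fields as floats."""
    with open(index_path, newline="") as handle:
        rows = csv.reader(handle, delimiter="\t")
        return [dict(zip(FIELDS, [row[0]] + [float(value) for value in row[1:]]))
                for row in rows]
```

Appending rather than rewriting means a labeling session can be interrupted at any point without losing the photos already labeled.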

A lot of people ask me why I didn’t do something more sophisticated for this step. Again, my philosophy was to take the shortest path towards the immediate goal. Yes, it would be cool to use some sort of computer-vision technique to make at least a first pass over the labels automatically. However, for the number of pictures that I needed to label (about 600), the time spent building such a system would have far exceeded the time it would have saved me, especially since the results would almost certainly need to be manually double-checked and tweaked anyway.

Another option which I did consider more carefully was to crowd-source the work. In fact, that is one reason I chose Processing: the resulting application can be compiled as a Java applet, which people could then run from a web page without installing any software. From there, I could have probably rounded up some help amongst the JoCo forum members and other friends. If I really wanted to crank up production, maybe I could have actually paid people to do it on Mechanical Turk. Once I got started doing the actual labeling though, I realized it was only going to take me half of a day to just do all of the labels myself. The extra time needed to make the labeling program nice enough for people-who-aren’t-me to use, round up those people, divide the work, get everyone to label in a consistent way and/or double-check the results, and merge the results just wasn’t worth it. That said, if I were going to scale up this project to animations using a lot more photos, this is one of the first features I would consider adding.

That’s all for today. Next post: using the labels to render mock-Mr.Monkey using photographed Mr. Monkey.
