TheRealFishTracker - a project to assist the Gerlai laboratory at UTM.


Version 0.4.0 (June 19, 2011). Builds: Win32, Linux64.

When using the Windows version, be sure to extract all the files to the same folder. When using the Linux version, remember to set the permission bit to allow program execution. A number of shared object libraries will also be required.

The changelist details what has changed in each version.

If you don't know what to get, get the Win32 build. :)


The general purpose of the program is to track a moving object over frames of a video sequence. A large number of video formats are supported. One specifies the tracking region within the video, as well as what additional measures to track other than the object's position in the video. The algorithm works by taking the current frame's difference with an "average image" computed over the last several frames in the video. The object's position is calculated by comparing the absolute differences between the current and averaged frames, while considering the prior known position of the object. Summed area tables are used to compute frame differences over areas, making the algorithm efficient for tracking large objects as well.

Tracking multiple moving objects coherently from frame to frame is also supported, but many of the other features (measures, tracking data post-processing) are not available in this mode.

Detailed Instructions:

Opening the video

The button for opening a video brings up an Open File dialog to load the video. Many standard formats are compatible (.avi, .mpg, .mod, .wmv). A view of the video will be shown on the right side of the window, and the first frame in the video will be displayed:

The slider below the video will allow you to scrub through the video in time if you left-click drag on it (useful if you want to start tracking at a particular point in the video other than the beginning). The time is shown in two formats: hours:minutes:seconds.milliseconds and seconds.

Setting the tracking area

You must always set the "Tracking Area". This is a rectangular region, specified by you, within which the fish you are tracking will be moving. Press the button. Then, left-click drag a rectangle onto the video view (from top left to bottom right). Make sure the rectangle bounds the whole region the fish can move in, because the fish will not be tracked outside of it.

In the example below, we set the tracking zone (always shown light green) to frame the water-portion of the tank (unless your fish is a serious jumper!):


You can specify rectangular zones that detect when the fish enters and exits them over time, which you can later use in analysis. The data given by zones is binary - true or false, based on whether the tracked fish is in that zone. During tracking, the zone will be drawn thicker when the fish is within the zone. There is no imposed limit on the number of zones you can add.

Press the button to add each new zone. You will be asked to give the zone a label, which should be a concise one-word description of the zone:

The label you choose comes after a numeric prefix, e.g. if you enter 'leftside' and it's the first zone, the full name for the zone will be 'Zone1-leftside'. This convention avoids duplicate names. Do not use label names with whitespace characters.
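The zone behaviour described above can be sketched in a few lines of Python. This is only an illustration, not the tracker's own code: the Zone class, its field names, and the choice to treat the rectangle edges as inside the zone are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    """Hypothetical model of a zone: a labelled rectangle in video pixels."""
    index: int      # 1-based order in which the zone was added
    label: str      # user-supplied one-word label
    left: float
    top: float
    right: float
    bottom: float

    def __post_init__(self):
        # Labels must not contain whitespace characters.
        if not self.label or any(ch.isspace() for ch in self.label):
            raise ValueError("zone label must be one word with no whitespace")

    @property
    def name(self):
        # Numeric prefix plus label, e.g. 'Zone1-leftside'.
        return f"Zone{self.index}-{self.label}"

    def contains(self, x, y):
        # Binary zone value: True while the tracked position is inside.
        # Treating the edges as inside is an assumption of this sketch.
        return self.left <= x <= self.right and self.top <= y <= self.bottom


zone = Zone(1, "leftside", 0, 0, 100, 200)
print(zone.name)               # Zone1-leftside
print(zone.contains(50, 50))   # True
print(zone.contains(150, 50))  # False
```

Because the numeric prefix comes from the order in which zones are added, two zones given the same label still end up with different full names.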

Note that the zone does not have to be fully within the tracking area. In fact, it probably should extend past it, to ensure that the rectangle covers the whole part of the tracking area intended for the zone:

BAD - gaps in tracking area at the edges:

GOOD - no gaps in tracking area, making the zone effectively a line the fish crosses:


Rulers allow you to specify actual known measurements and dimensions in your video, and to track the fish with respect to these measurements. Without such measurements, it is not possible to generate real-world statistics (e.g. speed of the fish, since we need to know the distance it covers in the world over time). You can use whatever unit you wish (e.g. cm, inch), just be consistent!

To add a ruler, first click the button.

You will be asked to provide a label for the ruler (some sort of identifier for it). In our example we will be specifying the width of the tank so we just enter 'width' here:

You will then be asked to specify the length of the measurement you are providing, in your chosen unit. Typically, this measurement would be something like the known dimensions of the tank. In our example, our tank is 100 cm across so we enter '100' here:

We then drag a line over the video to overlay the measurement. I did it on the bottom of the tank here. The ruler is shown as a dashed grey line, with short orthogonal bars at either end.

Note that the ruler can be used for 'distance to a line' measurements. The point where you first click is the 'zero point' for the measurement. In our case, since we left-clicked on the left side and then dragged the ruler line to the right, when the tracked fish is at the left side of the tank its 'Ruler1-width' measurement will be 0.0, but when it's at the far right of the tank it will be 100.0. While the tracker is running, this value is constantly updated and shown (once you use rulers a few times, this subtle point becomes clear):

Fish at left side of the tank, the ruler measures a value of 3.643 cm.

Fish at right side of the tank, the ruler measures a value of 90.337 cm.

So effectively, the 'width' measurement here is also a measure of the fish's 'rightness' in the tank, or could also be thought of as its distance to a line going up and down the left side of the tank. Again: where you first left click when specifying the ruler is where the ruler measures 0.
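One way to picture the ruler measurement is as a projection of the fish's position onto the ruler line, scaled by the known length. The sketch below assumes exactly that; the function name, the pixel coordinates, and the fact that values are not clamped to the ruler's ends are assumptions for illustration, not details taken from the tracker.

```python
def ruler_value(fish, start, end, known_length):
    """Project the fish position onto the ruler and scale to real units.

    start is the first-clicked endpoint (measures 0), end is where the drag
    was released (measures known_length). All points are (x, y) pixels.
    """
    dx, dy = end[0] - start[0], end[1] - start[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        raise ValueError("ruler endpoints must differ")
    # Fraction of the way along the ruler: 0 at start, 1 at end.
    t = ((fish[0] - start[0]) * dx + (fish[1] - start[1]) * dy) / length_sq
    return t * known_length


# Hypothetical ruler dragged left to right along the tank bottom, 100 cm long.
start, end = (20, 400), (620, 400)
print(ruler_value((20, 150), start, end, 100.0))   # 0.0
print(ruler_value((320, 200), start, end, 100.0))  # 50.0
```

Only the position along the ruler direction matters here, which is why the value behaves like a distance to a line through the first-clicked point.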

Another point: to combat the distortion of perspective projection, you may decide to use ruler endpoints not at the extents of the tank, but somewhere in the middle between the front and rear extent, or the rear extents. That is up to you.

Distance to a Point

You can also track the distance of the fish to a specific point. First press the button . As with the ruler, you specify first a label for this point-distance measure (we are putting ours at the tank's centre so we called it 'tankcentre'):

We know the tank is 100 cm across, and we will end up specifying a circle that is centered at the point, and we need to have a known measure for the radius of this circle. Thus we use the value 50 cm for the radius value.

We then left-click drag the measure in. We start out placing an outer point of the circle (so here we left click at one of the tank boundaries, and while still dragging, release once we are satisfied with the centre-point, which is shown as a cross):

While the tracker runs, we can see the measurement being made of the distance between fish and the point chosen, which is derived from the known measurement of 50 previously specified:
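The point-distance measure can be thought of as a pixel distance converted to real units using the circle's known radius. A minimal sketch under that assumption follows; the function name and the pixel coordinates are made up for the example.

```python
import math


def point_distance(fish, centre, edge, known_radius):
    """Distance from fish to centre in real units.

    edge is the point on the circle where the drag started, centre is where
    it was released, and known_radius is the real-world radius (e.g. 50 cm).
    """
    pixel_radius = math.hypot(edge[0] - centre[0], edge[1] - centre[1])
    if pixel_radius == 0:
        raise ValueError("the circle must have a non-zero radius")
    units_per_pixel = known_radius / pixel_radius
    pixel_distance = math.hypot(fish[0] - centre[0], fish[1] - centre[1])
    return pixel_distance * units_per_pixel


# Hypothetical values: circle of radius 300 pixels standing for 50 cm.
print(point_distance((470, 240), (320, 240), (20, 240), 50.0))  # 25.0
```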

Running the tracker

To start the tracker, use the button around the bottom left of the window. The tracker will begin processing the video from whatever point you are at in the video (either at the beginning, or, if you played with the time slider, whatever part of the video you are currently at).

Note that the tracker only writes the tracking data out to a file when either:

  1. The end of the video is reached
  2. You stop the tracker prematurely by pressing the button

Say the video you loaded was located at:


The output tracking data file (including all measurements, etc., that you specify) goes out to a file with the path:


Note that the writing of this file obliterates whatever data was in that file previously - so make sure if you have processed the tracking data for a video to copy the tracking data file to a safe location in case you ever decide to run the tracker on that video again!

So, it's a file in the same folder, with the same filename, but with '.data.txt' appended to it. The file is in a plain text format. Naturally, these files will tend to be much smaller than the input videos, making them easier to transfer around and back up if need be. Feel free to examine the contents to see what's stored. I have tried to keep the format simple and explanatory enough that others could write parsers for it. (Fortunately this shouldn't be necessary, as I have written a whole part of this application to do some basic mean/variance analysis on this data over user-specified intervals. More on that later.)
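Since re-running the tracker overwrites the data file, a small helper script can copy it somewhere safe first. The sketch below is an assumption-laden illustration: the function names and the example path are hypothetical, and it takes "appended" literally, so the video's own extension is kept in front of '.data.txt'.

```python
import os
import shutil


def tracking_data_path(video_path):
    # Same folder, same filename, with '.data.txt' appended (assumed literal).
    return video_path + ".data.txt"


def backup_tracking_data(video_path, backup_dir):
    """Copy an existing data file into backup_dir.

    Returns the path of the copy, or None if there is no data file yet.
    """
    data_path = tracking_data_path(video_path)
    if not os.path.exists(data_path):
        return None
    os.makedirs(backup_dir, exist_ok=True)
    target = os.path.join(backup_dir, os.path.basename(data_path))
    shutil.copy2(data_path, target)  # copy2 also preserves timestamps
    return target


print(tracking_data_path("trial1.avi"))  # trial1.avi.data.txt
```

Calling backup_tracking_data before starting the tracker again on the same video keeps the earlier results available.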

Back to details of the tracker itself. When the tracker is confident ("believes it is doing a good job") of tracking the fish, the rectangle showing the fish's tracked position will be white (the "tail" extending to the left shows a trail of the last second of tracked positions):

The tracker can lose the target. The reason can vary, but is broadly one of two things: the fish is no longer visible, or the fish is visible but some tracking parameters have been set incorrectly. When this happens, the tracker position remains motionless and is red:

The tracker might also track something moving that isn't the target fish. This might be caused for example by some moving reflection, or something moving elsewhere in the tank. In this case, and when the tracker fails to detect the fish, it would be useful to pause, go back, and manually fix the tracking path...

Manually adjusting the tracking path

There are three controls down by the time slider that let you manipulate the tracking data. While tracking, only the "Pause" button is enabled:

Pressing Pause halts the tracker. The button then switches to a "Resume" button, which resumes the tracking right where you left off when you pressed Pause.

More importantly, while paused you can freely move back (and once back, forward again) in time to observe the tracked path and make sure it's accurate. (Once paused, try pressing Back a few times and you'll see a path that looks like the following):

You see the trail of the last second of data move back in time, as well as a path in light blue showing the tracked path up to the last point in time at which the fish has been tracked so far. I will hit Back a few more times (pressing "Back" moves you back roughly a third of a second each time, i.e. 10 frames in a 30 frame per second video):

If I now click somewhere in the video frame, I can "snap" the tracking point to that position (I clicked a spot somewhat above the fish's shown position):

If I hit "Forward" once then click another spot, I edit another tracking point a third of a second later:

You may notice it does a linear blend with the tracking path in a local area around the current tracking point (a third of a second in either direction). That way you don't have to edit and specify the fish's position for every frame of the video you need to modify (for a 30 frame per second video this could be time consuming). Once you are content with your editing, press Resume to let the tracker continue from where it left off, and your tracking data changes will be preserved. (To confirm this, you can always pause again and move back in time).
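To make the linear blend concrete, here is one plausible way such an edit could work. It is a guess at the behaviour rather than the actual implementation: it assumes the correction is applied in full at the edited frame and fades linearly to nothing 10 frames away on either side (a third of a second at 30 frames per second). The function name and the path representation are hypothetical.

```python
def apply_manual_edit(path, frame, new_pos, radius=10):
    """Return a copy of path with one frame snapped to new_pos.

    path is a list of (x, y) positions, one per frame. Neighbouring frames
    within `radius` frames are shifted by a linearly decreasing share of the
    same correction, so the path bends smoothly towards the new point.
    """
    edited = list(path)
    off_x = new_pos[0] - path[frame][0]
    off_y = new_pos[1] - path[frame][1]
    first = max(0, frame - radius)
    last = min(len(path) - 1, frame + radius)
    for i in range(first, last + 1):
        weight = 1.0 - abs(i - frame) / radius  # 1 at the edit, 0 at the ends
        x, y = path[i]
        edited[i] = (x + weight * off_x, y + weight * off_y)
    return edited


# A fish tracked at the same spot for 31 frames, corrected at frame 15.
path = [(0.0, 0.0)] * 31
edited = apply_manual_edit(path, 15, (0.0, 10.0))
print(edited[15])  # (0.0, 10.0)
print(edited[10])  # (0.0, 5.0)
print(edited[5])   # (0.0, 0.0)
```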

Tracking parameters - Confidence Threshold

I will survey the available parameters in order of importance, and provide some discussion.

The Confidence Threshold you set is the most important value. Intuitively, it expresses a minimum threshold on how clearly the fish is visible in terms of its contrast in relation to the background. Detecting a fish that was totally white on a totally black background would have a confidence of 255, which is the maximum possible. In practice, the value you choose will be a minimum threshold that is much lower. In most cases, a value between 20 and 40 would likely work well for a video that was not particularly awful.
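As a rough mental model, the threshold can be read as a gate on the strongest contrast found in the frame. The sketch below assumes 8-bit greyscale values and a simple "greater than or equal" comparison; both the function name and the comparison rule are assumptions for illustration.

```python
def detection_state(best_difference, confidence_threshold):
    """Return 'tracked' when the contrast clears the threshold, else 'lost'."""
    return "tracked" if best_difference >= confidence_threshold else "lost"


# A totally white fish (255) on a totally black background (0) gives the
# largest difference an 8-bit image can produce.
max_confidence = abs(255 - 0)
print(max_confidence)               # 255

print(detection_state(35, 30))      # tracked
print(detection_state(12, 30))      # lost
```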

If you find the tracker occasionally losing the fish for a moment, yet it still seems quite visible even when this happens, try making the confidence threshold a little smaller.

If you find the tracker jumping around on and off target, try a value a little higher (the tracker may be following noise above the threshold, meaning you need to raise the threshold above the level of noise).

If neither of these solves your issue, the "signal to noise" ratio of your video may not be good. The differential method employed struggles to track the fish in the presence of noise that is as dominant as the appearance of the fish itself.

To gain insight into how well the tracker is working, examine the "Tracker View" while the tracker is running. To do this, check the box . You can toggle it at any time the tracker is running. Instead of seeing the original video, you will see the difference between an "average frame" (over some past interval) and the "current frame". Ideally, the fish should be standing out (look like a bright white blob) and the rest of the tracking view should be dark blue/black, like this:

If you examine the above closely, you will see speckles of noise throughout the rectangle. Since our fish is much "brighter", this indicates we have a good setting for the confidence threshold. If instead the noise is also bright, raise the confidence threshold. If the fish seems dim/grey/not bright white, lower the confidence threshold so the fish will be picked up by the tracker. If the noise then also becomes very bright, the video is not ideal :)

Tracking parameters - Mean Filter Size

The tracker can work on objects of all different sizes. Knowing how large the objects look in the video, you can use this prior knowledge to further guide the performance of the tracker.

Instead of just looking for that single pixel of the video where the difference between object and background is greatest, it is better (handles noise more robustly) to consider some x by x square region of the video where, for all the pixels in the box, the average difference is greatest. As long as the object you are tracking fills the x by x box up, you should get a good result.

If you set this value to 1, the result is equivalent to looking for the pixel in the video with the greatest difference. The default value of 3 computes the maximal difference averaged over a 3x3 box (9 distinct pixels), and is generally a minimal size for objects you may track in your video that are not "too far away" to track. If the projected size of the fish you track is significantly larger, by all means try larger values. As my implementation uses a summed area table, there is no performance penalty for choosing large values.

The size of the mean filter is shown by the size of the rectangle used to show the tracker position. Compare:

Mean Filter Size 3:

Mean Filter Size 7:

Mean Filter Size 11:

(In order to make the tracker work when I set the value to 11, I had to reduce the confidence threshold. This is because the fish doesn't "fill up" the whole 11x11 box, so the unoccupied pixels "pull down" the average, making the average contrast across the box lower.)
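The mean filter and the summed area table can be illustrated with a short, pure-Python sketch. It is written for clarity rather than speed and is not the tracker's code; the function names and the tiny 5x5 difference image are invented for the example.

```python
def summed_area_table(img):
    """sat[y][x] holds the sum of img over rows < y and columns < x."""
    h, w = len(img), len(img[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def box_mean(sat, x, y, size):
    """Mean over the size x size box with top-left corner (x, y).

    Four table lookups, so the cost does not depend on the box size.
    """
    total = (sat[y + size][x + size] - sat[y][x + size]
             - sat[y + size][x] + sat[y][x])
    return total / (size * size)


def best_box(diff, size):
    """Return (mean, x, y) of the box with the greatest mean difference."""
    sat = summed_area_table(diff)
    best = None
    for y in range(len(diff) - size + 1):
        for x in range(len(diff[0]) - size + 1):
            mean = box_mean(sat, x, y, size)
            if best is None or mean > best[0]:
                best = (mean, x, y)
    return best


# A 3x3 "fish" with difference 90 in the middle of a 5x5 dark background.
diff = [[90 if 1 <= x <= 3 and 1 <= y <= 3 else 0 for x in range(5)]
        for y in range(5)]
print(best_box(diff, 3))  # (90.0, 1, 1)
print(best_box(diff, 5))  # (32.4, 0, 0)
```

The second result shows the "pull down" effect: once the box is larger than the fish, the empty pixels lower the mean from 90 to 32.4, so a lower confidence threshold is needed.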

Tracking parameters - Frames to Average

This specifies the number of frames before the current frame to average together to compute an "average image" which is used to compare with the current frame, the difference of which determines where the fish is.

The default value is 300, which for a 30 frame per second video amounts to 10 seconds. So from the time the tracker is started, it requires this 10 seconds essentially to "warm up" or collect those frames to make the average, before it can start the process of tracking (this is why the tracker does not start working immediately). There are pros and cons to choosing small or large values:

Choosing a large value:

  • Pro: Noise gets averaged out more
  • Pro: The tracker does not lose fish that go motionless as quickly
  • Con: If the camera is bumped, or something in the background permanently changes suddenly during the video, the tracker will get confused for the duration
  • Con: More frames are skipped at the start before the tracker begins to work
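The warm-up period and the "motionless fish" effect both follow from how a running average behaves. Here is a minimal sketch, assuming frames are flat lists of greyscale values and that the average covers exactly the last N frames; the class and method names are hypothetical.

```python
from collections import deque


class RunningAverage:
    """Average image over the most recent `frames_to_average` frames."""

    def __init__(self, frames_to_average):
        self.n = frames_to_average
        self.frames = deque()
        self.sums = None

    def ready(self):
        # Tracking can only start once enough frames have been collected.
        return len(self.frames) == self.n

    def push(self, frame):
        """Add a frame, dropping the oldest one when the window is full."""
        if self.sums is None:
            self.sums = [0] * len(frame)
        if len(self.frames) == self.n:
            oldest = self.frames.popleft()
            self.sums = [s - o for s, o in zip(self.sums, oldest)]
        self.frames.append(frame)
        self.sums = [s + p for s, p in zip(self.sums, frame)]

    def average(self):
        return [s / len(self.frames) for s in self.sums]


def abs_difference(average, frame):
    return [abs(p - a) for a, p in zip(average, frame)]


# Warm-up time for the default setting on a 30 frame per second video.
print(300 / 30)  # 10.0 seconds

avg = RunningAverage(3)
for _ in range(3):
    avg.push([0, 0])                              # empty background
print(abs_difference(avg.average(), [0, 90]))     # [0.0, 90.0]
for _ in range(3):
    avg.push([0, 90])                             # fish sits still
print(abs_difference(avg.average(), [0, 90]))     # [0.0, 0.0]
```

Once a motionless fish has filled the whole averaging window it becomes part of the "background" and its difference drops to zero, which is why a larger window keeps a still fish visible for longer.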

Tracking parameters - Gaussian Variance

The Gaussian Variance parameter relates to a key assumption about fish: a detected fish in one frame will not move drastically (i.e. teleport) in the next frame.

While tracking the fish, we weight pixels near the current tracking position higher; that is, the tracker has a higher preference for points near the current tracking position. This weight is modeled as a Gaussian function whose mean is centered at the current tracking position. The Gaussian Variance specifies the variance (essentially the width) of that function.

The explicit function used in the implementation to incorporate this weighting is as follows:

  • call imgdiff_x the absolute value of the image difference between the average and current frame for pixel x
  • call dist the 2-norm distance of pixel x from the current tracking position (normalized by the length of the diagonal of the tracking area)
  • call gaussvar this Gaussian variance parameter
  • the weighted difference for pixel x is then:

imgdiff_x * (0.25 + 0.75 * e^(-(dist^2 / gaussvar)))

I have somewhat arbitrarily chosen a non-zero constant weight (0.25) to multiply all pixels in the tracking area by, leaving the other 0.75 to the Gaussian function. Thus each point in the tracking area gets a weight between 0.25 and 1. In the future, I may expose these parameters to the user.
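The weighting formula above translates directly into code. The sketch below follows the formula term by term; the function names, the pixel coordinates, and the gaussvar value of 0.05 are placeholders chosen for the example, not defaults taken from the program.

```python
import math


def position_weight(pixel, track_pos, diagonal, gaussvar):
    """Weight in [0.25, 1] that favours pixels near the current track point."""
    # dist is the 2-norm distance, normalized by the tracking area diagonal.
    dist = math.hypot(pixel[0] - track_pos[0], pixel[1] - track_pos[1]) / diagonal
    return 0.25 + 0.75 * math.exp(-(dist * dist) / gaussvar)


def weighted_difference(imgdiff_x, pixel, track_pos, diagonal, gaussvar):
    return imgdiff_x * position_weight(pixel, track_pos, diagonal, gaussvar)


diagonal = math.hypot(600, 400)   # hypothetical 600x400 tracking area
track = (300, 200)
print(position_weight(track, track, diagonal, 0.05))   # 1.0 at the track point
far = position_weight((0, 0), track, diagonal, 0.05)
print(0.25 <= far < 1.0)                               # True
```

A smaller gaussvar makes the weight fall towards 0.25 more quickly with distance, so the tracker sticks more strongly to the neighbourhood of its previous position.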

Tracking parameters - Number of Fish to Track

This is a special-case feature, used specifically for calculating the average distance between pairs of fish in a shoal when this value is not set to 1. While the tracker is in fact capable of tracking multiple fish, many other functionalities of the application (e.g. measures) are not yet well tied in with this feature.

Note that for shoal average distance: you need to specify at least one ruler with a known measure, so the shoal average distance can be computed as a real world unit.
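For reference, the average distance between pairs of fish can be computed as below. This is a sketch under the assumption that every unordered pair counts once and that a ruler supplies a single units-per-pixel scale; the function name and the sample positions are made up.

```python
import math
from itertools import combinations


def shoal_average_distance(positions, units_per_pixel):
    """Mean distance over all unordered pairs of fish, in real-world units."""
    pairs = list(combinations(positions, 2))
    if not pairs:
        raise ValueError("need at least two fish")
    total = sum(math.hypot(a[0] - b[0], a[1] - b[1]) for a, b in pairs)
    return total / len(pairs) * units_per_pixel


# Three fish whose pairwise pixel distances are 6, 8 and 10.
fish = [(0, 0), (6, 0), (0, 8)]
# Hypothetical scale: a 100 cm ruler drawn 200 pixels long gives 0.5 cm/pixel.
print(shoal_average_distance(fish, 100 / 200))  # 4.0
```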

Tracking parameters - Signed Image Difference

If you are certain your fish are light on a dark background, or the reverse, you can optionally enable the "Signed Image Difference" mode to incorporate the sign of the difference between average and current frames. If the tracker totally fails to track the fish, you know you have the radio button setting wrong, so try the other one :)

Unfortunately, as noise tends to have a normal distribution (evenly sampled from both the positive and negative sides of a bell curve), enabling this mode will only eliminate half of the normally-distributed noise from the input video.
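The difference between the default and signed modes can be sketched per pixel. The mode names below and the choice to clip unwanted differences to zero are assumptions made for this illustration.

```python
def pixel_difference(average, current, mode="absolute"):
    """Difference between the current and averaged pixel value.

    mode names are hypothetical:
      'absolute'      - default, the sign is ignored
      'light_on_dark' - keep only pixels that got brighter
      'dark_on_light' - keep only pixels that got darker
    """
    diff = current - average
    if mode == "absolute":
        return abs(diff)
    if mode == "light_on_dark":
        return max(diff, 0)
    if mode == "dark_on_light":
        return max(-diff, 0)
    raise ValueError(f"unknown mode: {mode}")


# A light fish passing over a dark background makes the pixel brighter.
print(pixel_difference(20, 110, "light_on_dark"))  # 90
# With the wrong setting the same fish produces no signal at all.
print(pixel_difference(20, 110, "dark_on_light"))  # 0
# Noise that darkens a pixel is ignored in light-on-dark mode.
print(pixel_difference(20, 12, "light_on_dark"))   # 0
```

Noise that brightens a pixel still gets through in light-on-dark mode, which is the "only half of the noise" limitation described above.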

Data Analysis

Pressing the Data Analysis button brings up a separate window:

First click Browse which brings up an Open File dialog to specify the tracking data file (the data file has the same path as the video file but has the .data.txt extension).

If you want the Speed of a single fish being tracked, or the average distance of a shoal of fish being tracked, you'll need to specify the ruler(s) here (they will automatically get populated with some guesses when you first load the video data file):

You can then add all the variables that you want to analyze into the table below. To add a new row (variable) to the table, click the Add Variable button .

The first column for the row that pops up lets you specify the variable itself from a drop-down menu.

It contains all the zones, rulers, point distances that you specified in the tracking stage that are in the data file, as well as some custom measures such as speed and relative and absolute turn angle that can be calculated.

Concerning turn angle: a counter-clockwise rotation in the observed orientation of the fish is defined as a positive change in turning angle, a clockwise rotation is a negative change. For each frame tracked, a turn angle value for a given frame is calculated as the angular difference in orientation (i.e. tangent vector) of the fish in the last frame and the current frame. Units are in degrees, and since direction of orientation matters the angle may be positive or negative. With all this in mind:

  • Relative turn angle uses the sign of the angle, making the direction of rotation relevant. A fish turning an equal amount clockwise and counter-clockwise over an interval would have a relative turn angle of zero, since the mean value would be zero.
  • Absolute turn angle uses the absolute ("always positive") value for the angle, which essentially makes the fish's direction of rotation irrelevant. A fish turning an equal amount clockwise and counter-clockwise over an interval would have an absolute turn angle that was strictly positive, since the mean of a set of positive numbers is positive.

A point worth noting is that since a mean value is computed across all the frames in the interval, the relative and absolute turn angles do not give the total turn angle or "amount of turning" over the interval. Also, they do not provide the average change in orientation per second. That might be a more desirable measure than the average change in orientation between frames, as the current measure is not invariant to differences in recording framerate, and it may be added in a future release.
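The two turn angle measures can be illustrated from a list of tracked positions. This sketch assumes the heading is taken from consecutive positions, that y increases upward (with image coordinates where y points down, the sign of the angle would flip), and that angle changes are wrapped into the range from -180 up to 180 degrees. The function names are hypothetical.

```python
import math


def turn_angles(path):
    """Signed change in heading between frames, in degrees (CCW positive)."""
    headings = [math.atan2(y1 - y0, x1 - x0)
                for (x0, y0), (x1, y1) in zip(path, path[1:])]
    angles = []
    for before, after in zip(headings, headings[1:]):
        delta = math.degrees(after - before)
        # Wrap so that e.g. a 350 degree change reads as -10.
        angles.append((delta + 180.0) % 360.0 - 180.0)
    return angles


def relative_turn_angle(angles):
    return sum(angles) / len(angles)


def absolute_turn_angle(angles):
    return sum(abs(a) for a in angles) / len(angles)


# The fish swims right, turns left (counter-clockwise), then turns right again.
path = [(0, 0), (1, 0), (1, 1), (2, 1)]
angles = turn_angles(path)
print(angles)                        # [90.0, -90.0]
print(relative_turn_angle(angles))   # 0.0
print(absolute_turn_angle(angles))   # 90.0
```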

The Offset Time specifies how far to seek into the video data before starting the first interval to measure. In our example, we will start analyzing intervals right away. Note that you can use a decimal value for any of these times to mean a fraction of a second; they don't have to be integers.

Time to Measure specifies the interval length. For each variable, a mean value and variance value will be computed for all data recorded in the interval. I have chosen 10 seconds here. Again, a decimal value for the interval length is perfectly fine.

Time between specifies a time to wait (essentially skip by) before starting the next interval. I have used 0 here, so the intervals will be back-to-back every 10 seconds.
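The three timing settings can be read as a simple schedule of intervals. The sketch below shows one interpretation; the function names, the use of population variance, and the handling of a partial final interval are assumptions rather than facts about the application.

```python
def interval_starts(offset, measure, between, end_time):
    """Start times (seconds) of the measured intervals before end_time."""
    if measure <= 0:
        raise ValueError("Time to Measure must be positive")
    starts = []
    start = offset
    while start < end_time:
        starts.append(start)
        start += measure + between
    return starts


def interval_stats(samples, offset, measure, between, end_time):
    """samples is a list of (time_in_seconds, value) pairs for one variable.

    Returns (start, mean, variance) for each interval that has data;
    intervals with no samples are left out.
    """
    rows = []
    for start in interval_starts(offset, measure, between, end_time):
        values = [v for t, v in samples if start <= t < start + measure]
        if not values:
            continue
        mean = sum(values) / len(values)
        variance = sum((v - mean) ** 2 for v in values) / len(values)
        rows.append((start, mean, variance))
    return rows


print(interval_starts(0, 10, 0, 30))   # [0, 10, 20]
print(interval_starts(5, 10, 5, 40))   # [5, 20, 35]

# A zone variable is 1 while the fish is inside and 0 otherwise, so its mean
# over an interval is the fraction of time spent in the zone.
zone = [(t, 1) for t in range(10)] + [(t, 1 if t < 12 else 0) for t in range(10, 20)]
for start, mean, variance in interval_stats(zone, 0, 10, 0, 20):
    print(start, round(mean, 4), round(variance, 4))
```

With Time between set to 0 the intervals are back-to-back; a non-zero value leaves unmeasured gaps between them.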

When you have specified all your variables and interval information, you are ready to produce the data analysis (i.e. the means and variances for each of your variables over the specified cyclic intervals):

Press the analyze button.

All the results for each variable show up just below in the text box (to be manipulated, copy/pasted, etc.). You can always regenerate this data as long as you keep the tracking data file. Here are some example results:

For each variable we first see the name of the variable. On the next line we have the word "Time" with space-delimited values for the start of each measured interval. On the next line is the word "Mean" followed by the means for each interval, and finally the word "Variance" followed by the variance values for each interval. Copy-pasted into a spreadsheet program, and noting that space delimits fields, the interval data all aligns by column.
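If you would rather process the results with a script than a spreadsheet, the layout described above is simple to parse. The sketch below assumes exactly that layout (a name line followed by space-delimited "Time", "Mean" and "Variance" lines); the function name is hypothetical and the sample text contains made-up numbers, not real tracker output.

```python
def parse_results(text):
    """Parse the results text into {variable: {'Time': [...], 'Mean': [...], ...}}."""
    results = {}
    current = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] in ("Time", "Mean", "Variance") and current is not None:
            results[current][fields[0]] = [float(v) for v in fields[1:]]
        else:
            # Any other non-empty line starts a new variable.
            current = line.strip()
            results[current] = {}
    return results


# Made-up sample in the described layout.
sample = """Speed
Time 860 870
Mean 8.0 9.5
Variance 20.0 25.0
"""
parsed = parse_results(sample)
print(parsed["Speed"]["Mean"])  # [8.0, 9.5]
```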

We see the mean speed of the fish between 860 and 910 seconds in the video varies from 8 to 9.5. We could compute a single mean across that duration by changing the interval length for the Speed variable from 10 to, say, 1000:

We see that the average speed of the fish over that interval (where data was present) was 8.846387 cm/sec. A variance of 23 indicates no drastic changes in speed over the interval.

Between 860 and 870 seconds into the video, we can see that the fish spent all of its time in zone Zone1-left. However, between 870 and 880 seconds, it only spent 18.6667% of its time there.

Note that intervals with no recorded tracking data are not included in the final output, since there's no data for them. For example, say you skipped past the first 5 minutes of video with the time slider and then started the tracker: intervals in the first 5 minutes would not be included in the output. Or say you stopped the tracker 2 minutes before the end of the video: there would be no interval information for those last 2 minutes of video.

Happy tracking! :)