So I’ve been playing with video tracking code in Python for a while. In my first week as a post-doc I bashed together some code to automatically track chickadees during captive trials. I’ve also tracked several crickets in various test arenas. These systems are steadily getting better, and now we’re hoping to ramp up the cricket work, both here and at Carleton University. We’re also hoping to do something with tracking the chickadee captive trials.

Clearly at this stage, I can’t just send undergrads and volunteers my code. It’s too specialised for each particular problem and I can’t be everywhere at once. It therefore felt like it was time to start sticking a graphical user interface on these things.

I had come up with a few quality-of-life-improving pieces of code before. I’d written a script that automatically downloaded RFID data from an SD card, sorted it into its appropriate folder and then removed it from the SD card. That still looked like a command prompt though, even if all you had to do was hit enter.
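The guts of that script were about as simple as scripts get. A sketch of the idea, with the file naming convention invented purely for illustration:

```python
import shutil
from pathlib import Path

def sort_and_clear(card_dir, data_dir):
    """Move every logfile off the SD card into a per-antenna folder.

    Assumes a (made-up) naming convention like ANT01_2016-01-20.txt,
    where the bit before the underscore says which reader it came from.
    """
    moved = []
    for logfile in sorted(Path(card_dir).glob("*.txt")):
        antenna = logfile.name.split("_")[0]
        dest = Path(data_dir) / antenna
        dest.mkdir(parents=True, exist_ok=True)  # create the folder on first sight
        shutil.move(str(logfile), str(dest / logfile.name))  # works across drives
        moved.append(dest / logfile.name)
    return moved
```

All you have to do is hit enter, as advertised.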

I hadn’t really had any experience with coding user interfaces before, but Tkinter seemed the way to go in terms of putting a friendly face on my Python code. So I sat down and sketched out what I needed my software to do initially.

  • I needed it to be able to automatically load videos from a file structure, based on whatever variables the user had chosen.
  • I needed users to be able to refer it to a datafile containing data such as the time to start tracking, end tracking or find a clean background plate.
  • I needed users to be able to define areas of interest in the video such as maze arms or trees.
  • I needed a button to start tracking videos.
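That first bullet point, for instance, boils down to a few lines once you lean on pathlib. The treatment/day folder layout here is made up for illustration:

```python
from pathlib import Path

def find_trial_videos(base_folder, chosen, extension=".avi"):
    """Walk base/<variable values...> and list every video file under it.

    `chosen` is the list of variable values the user picked in the GUI,
    e.g. ["treatmentA", "day1"] for a hypothetical base/treatment/day layout.
    """
    trial_dir = Path(base_folder).joinpath(*chosen)
    # rglob recurses, so deeper subfolders are picked up too
    return sorted(trial_dir.rglob(f"*{extension}"))
```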

This seemed like a simple enough initial list. The first thing I decided was to have separate programs for doing the video tracking and defining the areas of interest, though they would share features. I decided to start with this “polygon clicker” as I referred to it. I had written some basic code to let me do this in the past, but it was set up to only work for a particular problem, and I wanted something far more generalised and user friendly.

I started pulling together a GUI, selecting a base folder, selecting variable names, automatically generating the required list of trials based on those variables, etc.

I quickly learnt that there were always more features to add. The first thing I bolted on was a config file so that I didn’t have to click through all the dialogues every time I wanted to test. This of course required a whole set of extra functions to save it and load it when the program was opened and closed.
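The config file itself is nothing fancy. A sketch of the save/load pair, with the setting names invented for illustration:

```python
import json
from pathlib import Path

# Setting names here are illustrative, not the real program's
DEFAULTS = {"base_folder": "", "variables": [], "video_extension": ".avi"}

def load_config(path):
    """Return saved settings, falling back to the defaults on first run."""
    try:
        saved = json.loads(Path(path).read_text())
    except FileNotFoundError:
        saved = {}
    # anything missing from the file keeps its default value
    return {**DEFAULTS, **saved}

def save_config(path, config):
    """Write settings out so the next session starts where we left off."""
    Path(path).write_text(json.dumps(config, indent=2))
```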

Eventually I got to the stage where I could add the code to define polygons. This wasn’t without a few hiccups though.

Accidental modern art?
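Once the polygons behave, their main job downstream is answering “is the animal inside this area of interest?”, which is the classic ray-casting test. A minimal pure-Python sketch:

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: cast a ray rightwards from (x, y) and count how
    many polygon edges it crosses; an odd count means the point is inside.

    `polygon` is a list of (x, y) vertices in order.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside
```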

Finally I had the program in a form that I was happy with.

Now it’s time to work on the video tracker.
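The heart of the tracker is nothing exotic: subtract a clean background plate from each frame, threshold the difference, and take the centroid of whatever pixels changed. A toy version on plain lists standing in for greyscale frames (the real code works on actual video frames):

```python
def track_blob(frame, background, threshold=30):
    """Return the (row, col) centroid of pixels that differ from the background.

    `frame` and `background` are equal-sized grids of greyscale values
    (lists of lists here; arrays in anything real).
    """
    row_sum = col_sum = count = 0
    for r, (frow, brow) in enumerate(zip(frame, background)):
        for c, (f, b) in enumerate(zip(frow, brow)):
            if abs(f - b) > threshold:  # this pixel changed: part of the animal
                row_sum += r
                col_sum += c
                count += 1
    if count == 0:
        return None  # nothing moved in this frame
    return (row_sum / count, col_sum / count)
```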

Once again I had some code ready, but there were always more features to add. What about cropping? What about the fact that our cricket videos had two trials going on in the same frame? What about the ability to copy and paste polygons? Or assign polygons based on a variable rather than on a trial-by-trial basis? What about making sure it works on a Mac? What about shiny progress bars?

I have to confess I like shiny progress bars:

Progress shall continue.

If this works well, I’ll hopefully make it public. It might well have applications for other people!

Oh, in other news, it was my birthday!


As I have advanced in years, I have been granted WISDOM by this bird. Admittedly it was mainly “CAWWW”, but every little helps.




So a while back I reported I was going a tad insane translating code from Matlab to R. I’ve since written a guide that I hope will help others with similar issues. The code I translated is now available as part of the ASNIPE package for R. ASNIPE stands for Animal Social Network Inference and Permutations for Ecologists. The function I translated is gmmevents, which runs a Gaussian mixture model on time series data. A Gaussian mixture model is a form of unsupervised machine learning that sorts one-dimensional data into clusters, similar to k-means clustering. To illustrate:

Adapted from Fig. 2 – Psorakis et al., J. R. Soc. Interface (2012)
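For the curious, the idea can be sketched as a toy two-component expectation-maximisation loop in pure Python. This is only an illustration of the general technique, not the asnipe implementation:

```python
import math

def gauss_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def gmm_1d(data, k=2, iters=100):
    """Fit a 1-D Gaussian mixture by expectation-maximisation (toy version)."""
    data = sorted(data)
    n = len(data)
    # spread the initial means through the sorted data, one per cluster-to-be
    means = [data[(i * n) // k + n // (2 * k)] for i in range(k)]
    variances = [(data[-1] - data[0]) ** 2 / k ** 2 + 1e-6] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E step: how responsible is each component for each observation?
        resp = []
        for x in data:
            probs = [w * gauss_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(probs) or 1e-300
            resp.append([p / total for p in probs])
        # M step: re-estimate each component from its weighted observations
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            variances[j] = sum(r[j] * (x - means[j]) ** 2
                               for r, x in zip(resp, data)) / nj + 1e-6
    return means, variances, weights
```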

This obviously has a lot of applications in social network analysis, as we can infer that individuals present in the same event likely have some interaction. I hope people find it useful, and that it was worth me going slightly mad over!

Translating between Matlab and R

I mentioned in a previous post that I’d write a guide about translating Matlab code to work in R, so that others can avoid the same mistakes I made. This should also function as an R users guide to learning Matlab syntax and vice versa. I hope some people find it useful!

Full article below.



Translation Issues

Where on earth did the time go? One minute I was looking at ice sculptures and wading through snow drifts and now the snow is gone (mostly) and various wildlife has emerged from hiding.

Look, beavers!

I’ve also managed to get out on the water myself. And eat bacon while doing it. Canadian bacon is Different.

River bacon!

Perhaps the main reason I’ve lost track of time in the last few weeks is that whether I’ve been awake or asleep, this keeps on flashing in front of my eyes.


I haven’t been able to escape it. My office desktop has looked like this on and off for the whole month. I hope to later show what this has resulted in, but in the meantime I’m going to moan about the amount of pain it’s caused me.

The main source of trouble was that this code was originally written in Matlab. I decided to save us from having to acquire a Matlab license by translating the many scripts that make up the code into R.

Initially this was tedious. There are enough syntax differences (not to mention different names of functions, etc.) between R and Matlab that this required me to go through the code line by line. Then, even once I had done this, a number of errors arose simply due to the differing ways that the two programs handle data. I’ll post a guide based on what I learnt in a separate post, featuring fewer pictures of beavers.

Much hair pulling later I got the code running, fed it my data and got a result. These results were consistent with some previous findings obtained using simpler methods. So far so good.

I then decided that instead of feeding my data to the code all in one go, it would be useful to give it one day at a time and then collate the results. “Fine” I thought. “Just modify my overarching processing code, no trouble”.

I was wrong.

(Found via googling “evil Matlab”)

Once again the way in which the two programs handle data required me to make a lot of modifications to the various scripts. Cue more hair pulling. I should also mention that I’ve written this code to be run in parallel, utilising all of my computer’s cores to increase speed, which means R’s normal debugging tools don’t work.

Finally I got the code to run again and got a result. However, something had changed. A previously suggested relationship had completely reversed in direction. Was this simply due to the new way of feeding the data in? Or was it due to a bug in my code? Or due to me deleting some faulty data? I ran the code again using the original way of processing the data.

Even using parallel processing, this code can take anything from several hours, to all night to run. This meant that getting results was a slow process. So after waiting several hours for the code to run again using the original data processing, once again I got results.

The relationship had flipped direction in these results too.


This suggested that the changes I’d made to accommodate the new data processing method had resulted in a COMPLETELY DIFFERENT RESULT. On the one hand, this was good. It meant that the biologically unrealistic result was due to my error rather than a fundamental problem with the methods. On the other hand, this is the sort of thing that can wake a scientist up at night screaming. A series of small changes in the way data was analysed leading to completely misleading results. In this case we’d caught it before we went too far, but if we’d approached this naively it might have been very easy to miss.

So, now I needed to work out which of my changes had caused this change. Luckily I save all my working files in Dropbox, which keeps a backup of all previous versions. I found a word document containing graphs I’d made to show my supervisor before I’d made changes and reverted all my code to a date before then. Then one by one I reinstated my changes.

In the end I pinned it down to one file. In that file, one line of code.


One line of code had resulted in huge, significant changes to my final result. As I said, the stuff of nightmares.

In the end I stripped out all the changes I’d made and carefully rewrote the scripts to deal with the new method of data processing. So my tale of woe has a happy ending: the code now works, and perhaps I’ll even have some results soon. For everyone who made it this far, here is a view of Gatineau Park:


Snowy owls, snowmobile trails and beaver tails

So I think I’ve been in Canada for nearly two weeks now. I am gradually learning things.

For example, when someone talks about getting beaver tails, they actually mean some sort of delicious pastry:


I also learned how to write some basic video analysis stuff in Python, and that a single chickadee in a controlled environment is an easier thing to track than multiple shags on the ocean.

I learnt that they deliberately thicken the ice here:


This is so the canal (which I walk along on the way to the office) can be turned into a massive ice skating rink. It opened for the first time today. I saw many people gliding effortlessly along, as well as some children being dragged along on sleds, which looks more my speed.  Since I got here I have occasionally been asked if I skate, to which I give a rueful laugh. I don’t think me and skates would really mix.

Another thing I learned is that batteries drain very quickly in the cold weather. Today the grad students in my research group and I went to see if our lab vehicle still worked after sitting in a garage for a month or so. The answer was, it didn’t.


Battery is flat.

However, once a man from the Canadian AA turned up and jump started it, the car had to be driven somewhere to charge the battery. So I got to go on a jaunt to one of the field sites where chickadees are studied, alongside a snowmobile trail. This was great as I was keen to get out of the city, see some Canadian countryside, and see some chickadees in the wild.


We found many chickadees, but also a bald eagle!



Shortly after seeing this I left the trail to go and look at a heap of rusty farm machinery buried in the snow. When I came back to the path, I knew something was afoot. It was then that I learnt that Canadian ambushes are exceptionally polite, as I was warned I might want to put my binoculars away:

Arg! (Photo by Teri Jones)

It was then decided that we would head back toward Ottawa and try and find some snowy owls that had been reported in the area. On the way we would stop for “Timbits”. I did not know who Tim was, or the answers to any related questions. I found out:


It seems that Timbits are a big box full of the centres of various doughnuts! These tided us over as we headed toward where we might find snowy owls.

We stopped at the edge of a field in the general area where the owls had been seen. We all climbed out of the car and had a general scan of the trees and hedgerows. Nothing. We got back in the car to try a different spot. This was rather similar to my previous experience of attempting to see specific birds, so I wasn’t overly hopeful about finding a snowy owl in an expanse of snowy fields.

Then Shannon, who had been looking out of the moving car with her binoculars (I am fairly certain doing this would make me carsick), suddenly spotted something on an electricity pylon in the middle of a field. We parked as close as we could to have a look. The pylon was quite a distance from the road, but through our binoculars we could clearly see a snowy owl! I had never even seen an owl in daylight, let alone that clearly.

I tried to get a picture but even at maximum zoom it wasn’t enormously clear.


Still, the view through the binoculars was great. We stood and watched the owl for a bit, before deciding to head back down the road to try and find the female that was also supposed to be about. Once again I was sceptical. I think I’d just finished muttering something along those lines when I suddenly had to ask:

“What’s that on the post?”


There, on a post right next to the road was another snowy owl. We parked up right next to it, getting a much better view than before. I decided I had to try and take a photo.

It was at this point that I was once again reminded that batteries drain incredibly fast in the cold. Like the car earlier, my camera refused to start up. I fumbled with the various spare batteries. None of them worked. This was absolutely typical, but luckily the owl was fairly accommodating. Eventually, through luck and strategically warming up the batteries, my camera finally fired up:


Tomorrow: statistics course!

Python is Nice.

During my undergraduate degree, we had one statistics module in our second year. While this module gave us a rather nice theoretical introduction to the use of basic statistical tests such as t-tests, the practical side was a little hazier. We had a few scheduled practical sessions in which, theoretically, we would be introduced to the various options for statistical analysis available. In practice, however, we were encouraged very heavily to use SPSS. R was mentioned as an option, but only as a SCARY option. I had a brief look through some lists of commands, but without any real sense of direction, and soon reverted to using SPSS like the rest of my class.

(I should add that this has all changed a lot at my university now: students get introduced to R and other statistical techniques in their first year.)

It was only when doing my masters course that I first started getting to grips with R, my first basic coding language. While initially daunting, it soon appealed to me for a variety of reasons, laziness being one of them. In SPSS, deciding to change a parameter on a test would require re-clicking through a whole host of dialogues. In R, a simple edit to the code produces the new result immediately. The same was true of graphs. While SPSS had a friendly WYSIWYG interface, it could prove immensely fiddly when you had a particular graph in mind or if you needed to produce a series of identical graphs.

Beyond laziness, we started playing with real, messy datasets. The advantage of being able to manipulate large amounts of data quickly and efficiently soon became apparent, not to mention the various powerful tools R provides to analyse these datasets.

When I began my PhD I also started using Matlab to play with mathematical models. While similar to R, it had some annoying differences in syntax that frequently tripped me up when I was starting out. Still, I was very much coming round to the idea that “Code is Good” and decided that being able to do all this stuff would save me an infinite number of headaches. This very much proved to be the case, even for something as daft as wanting all the individual birds I had GPS coordinates for to share a similar naming convention. Suddenly I was playing around with regular expression commands to manipulate strings. I can’t pretend I 100% understand these even now, but I can understand how powerful they can be.
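The renaming job was essentially this sort of thing (the labels and convention here are invented, not my real data):

```python
import re

def standardise_id(raw):
    """Map assorted tag labels ('bird_7', 'B07', 'Bird 7') onto one convention."""
    match = re.search(r"(\d+)", raw)  # grab the first run of digits
    if not match:
        raise ValueError(f"no numeric id in {raw!r}")
    return f"BIRD{int(match.group(1)):03d}"  # zero-padded to three digits
```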

I don’t know perl 😦

Throughout my MSc and even into my PhD I knew others for whom code just would not click. I saw this again when demonstrating on undergrad and masters stats modules. Occasionally (especially at undergrad level) I saw students making self-fulfilling prophecies that anything involving writing code was not for them, and that engaging with it was therefore worthless. I’d always argue that it is definitely worth engaging with. The stuff commonly used by bioscientists is infinitely friendlier than “proper” programming languages once you get to know it, and is so ridiculously useful.

This brings me round to the main thrust of this blog post, python. Python is Nice. It’s really really nice. You just won’t begin to imagine how very nice it is. R is powerful for statistics, but starts to fall down when you start getting into the realms of image analysis, file manipulation etc. Matlab is pretty powerful and flexible, but costs MONEY.

Python is free, friendly and powerful. Using Python I have been able to chuck files around different directories automatically, process large data files, run Bayesian statistics and create some pretty graphs. Here is one that ended up getting cut from what I’m currently working on, so it’s probably OK to show.



The initial setup can seem slightly confusing, and I highly recommend using a distribution like WinPython, which installs everything required and comes with lots of useful packages (including the essential numpy, which lets you do all the sorts of n-dimensional matrix manipulations you might be used to from R or Matlab). Using a prepackaged version largely avoids the confusion between firing up Python at the Windows command line and using a Python interpreter, which certainly puzzled me initially. It also comes with an interpreter/editor, Spyder, which makes writing and trialling big scripts extremely straightforward, in that it combines the script editor and console in the same way as Matlab or R commander might.

Will Python be replacing R for me? No, definitely not; R does too much good statistical stuff and has too many clever people writing packages for it. Will I be using it to do some of the things I used to do that pushed R out of its comfort zone? Yes. Will I be using it instead of Matlab? Yes, because it is free!

I suppose the other main point of this rather rambly post is for students just starting out with all this: Don’t be afraid of code. Getting to grips with it is daunting at first, but incredibly rewarding in the long run.



Back to Reality

When I eventually got back into the office after my trip to the USA, I turned on my computer and stared at it for a while, trying to remember how it worked.

Once the basics of how to operate a computer came back to me, I took a long hard look at where I am in my PhD. My basic thought process was something like this:

  • I have some stuff.
  • I need to write about this stuff.
  • Most of this stuff is going to require the doing of additional stuff before it is in a state where I can write about this stuff.
  • I REALLY need to write about this stuff.
  • I wonder where I have put all this stuff?

By which I mean I need to consolidate a few years’ worth of data that has, in the past, been exported rather haphazardly to various folders in my Dropbox. I also probably need to redo some statistics to include additional data collected this year. I then need to try and hammer all this out into writing that people other than myself can understand and find interesting.

Let the gathering of Excel files commence.