Translating between Matlab and R

I mentioned in a previous post that I’d write a guide about translating Matlab code to work in R, so that others can avoid the same mistakes I made. This should also function as an R users guide to learning Matlab syntax and vice versa. I hope some people find it useful!

Full article below.

rmatlab

Continue reading

Translation Issues

Where on earth did the time go? One minute I was looking at ice sculptures and wading through snow drifts and now the snow is gone (mostly) and various wildlife has emerged from hiding.

Look, beavers!

I’ve also managed to get out on the water myself. And eat bacon while doing it. Canadian bacon is Different.

riverbaconRiver bacon!

Perhaps the main reason I’ve lost track of time in the last few weeks is that whether I’ve been awake or asleep, this keeps on flashing in front of my eyes.

codegif.gif

I haven’t been able to escape it. My office desktop has looked like this on and off for the whole month. I hope to later show what this has resulted in, but in the meantime I’m going to moan about the amount of pain it’s caused me.

The main source of trouble was that this code was originally written in Matlab. I decided to save us from having to acquire a Matlab license by translating the many scripts that make up the code into R.

Initially this was tedious. There are enough syntax differences (not to mention different names of functions etc) between R and Matlab that this required me to go through the code line by line. Then, even when I done this a number of errors arose simply due to the differing ways that the two programs handle data. I’ll post a guide based on what I learnt in separate post, featuring less pictures of beavers.

Much hair pulling later I got the code running, fed it my data and got a result. These results were consistent with some previous findings obtained using simpler methods. So far so good.

I then decided that instead of feeding my data to the code all in one go, it would be useful to give it one day at a time and then collate the results. “Fine” I thought. “Just modify my overarching processing code, no trouble”.

I was wrong.

c149182a696e73890de876f3d392e2da(Found via googling evil Matlab)

Once again the way in which the two programs handle data required me to make a lot of modifications to the various scripts. Cue more hair pulling. I should also mention that I’ve written this code to be run in parallel, utilising all of my computers cores to increase speed, which means R’s normal debugging tools don’t work.

Finally I got the code to run again and got a result. However, something had changed. A previously suggested relationship had completely reversed in direction. Was this simply due to the new way of feeding the data in? Or was it due to a bug in my code? Or due to me deleting some faulty data? I ran the code again using the original way of processing the data.

Even using parallel processing, this code can take anything from several hours, to all night to run. This meant that getting results was a slow process. So after waiting several hours for the code to run again using the original data processing, once again I got results.

The relationship had flipped direction in these results too.

Moving-animated-clip-art-dont-panic-picture

This suggested that the changes I’d made to accommodate the new data processing method had resulted in a COMPLETELY DIFFERENT RESULT. On the one hand, this was good. It meant that the biologically unrealistic result was due to my error rather than a fundamental problem with the methods. On the other hand, this is the sort of thing that can wake a scientist up at night screaming. A series of small changes in the way data was analysed leading to completely misleading results. In this case we’d caught it before we went too far, but if we’d approached this naively it might have been very easy to miss.

So, now I needed to work out which of my changes had caused this change. Luckily I save all my working files in Dropbox, which keeps a backup of all previous versions. I found a word document containing graphs I’d made to show my supervisor before I’d made changes and reverted all my code to a date before then. Then one by one I reinstated my changes.

In the end I pinned it down to one file. In that file, one line of code.

as.matrix(Y)

One line of code had resulted in huge, significant changes to my final result. As I said, the stuff of nightmares.

In the end I stripped out all the changes I’d made and carefully rewrote the scripts to deal with the new method of data processing. So my tale of woe has a happy ending, the code now works and perhaps I’ll even have some results soon. For everyone who made it this far, here is a view of Gatineu Park:

IMG_3201