Part 1
Disclaimer
For those unfamiliar with my work, this is yet another LinkedIn contribution where I will be praising the merits of Python for academic research and rapid prototyping. The idea of writing a piece on how easy (or hard) it is to reproduce a published paper on Computer Vision using Python has been in my head for a while, but my interest sparked when I read a paper that cited my main PhD contribution on road detection, using it as a baseline algorithm. What better test for reproducibility than something I have published myself?
More details to follow.
Prelude
Motivation
As I already mentioned, I started writing this article, after I read this paper called: “Road Scene Simulation Based on Vehicle Sensors: An Intelligent Framework Using Random Walk Detection and Scene Stage Reconstruction”.
In the paper, my earlier work on road detection was cited and improved, also being used as a baseline. I was surprised to see that a 6 years old method was still on-par with state of the art (well, almost). I was also surprised to see that it scored quite close to Deep Learning approaches (hint: my algorithm is the only one with a surname):
I am only copying the least favorable table from the paper cited above, because it is the only reproducible one, since it uses a dataset from KITTI. The others are new datasets used in the paper are not yet released. The columns with numbers describing each algorithm’s score are, from left to right: Precision, Recall, F-score.
Reading all these, two very important questions were raised:
- How did these guys reproduce my research?
- Did they do it better than I would have?
In any case, they deserve big kudos for their work. I swear I am not biased because they chose mine as a foundation :)
So, I set out to answer these questions by, well, reproducing my own research, as best as I could. My main goals in this quest:
- Demystify the process of reproducing results from academic papers.
- Publicly shame my old self and my old ways to encourage young researchers who may be facing similar issues to pursue the path of doing-things-right.
- Correct my omissions and wrong-doings from my PhD era and provide a decent, non-DL-using, baseline for road detection methods.
- Write about a highly technical issue with as much informality as I can, as a follow up from my recent rant on STEM papers and their possible extinction.
This time around, I will be using Python instead of Matlab (not a student anymore, so no freebies). But first, let’s see why I had to do these things at all. Where was my code? Why was Matlab used? Why so many questions? Let’s take one thing at a time…
Background
As a (not so young) PhD candidate back in the previous decade, I chose my scientific tools by staying inside my comfort zone. Therefore, Matlab on Windows was the obvious choice for my Computer Vision applications, since:
- I had already worked with Matlab extensively in my Undergraduate and MSc years, so I had a very good knowledge of how to program efficiently and rapidly in it.
- Matlab and Windows were given to me for free. Little did I know at the time, that the academic haven that provided such expensive tools to students for free wouldn’t be anything like the real world.
- Matlab at the time (around 2006) had no serious competition when it came to rapid prototyping. OpenCV was too cumbersome (no C++ interfaces), and had too little documentation to offer for someone who wanted results quickly. Python was a baby with too many problems and too few and limited tools for Computer Vision.
- Matlab for Linux was way too problematic when it came to video processing, especially for a noob like me, so Windows was the OS I decided to go for.
- I was too lazy to fight the issues mentioned above. Procrastination is a PhD candidate’s essential skill and I was no exception to the rule.
Image taken from https://www.slideshare.net/embeddedvision/e04-open-cvbradsk
Of course, these excuses were true at the time I started, but not in 2013, when I finally (and with a great delay) defended my thesis, or even in 2011-2012 when I was developing most of my novel work. Why didn’t I switch to Python + OpenCV then? Well, see (4) above and combine it with my very small reserves of patience and perseverance at the time and you will get your answer…
For the same reasons, I committed several deadly sins during my PhD years:
- I didn’t use any versioning system (well, other than Dropbox, which of course isn’t one).
Source of meme: Reddit
- I didn’t release my code during or after my PhD, mainly because of (1) and also because I was too ashamed of my ugly, sloppy, get-things-done codebase.
- I didn’t spend time on annotating a basic dataset that could be used for my research and failed to find any BSc/MSc students working with me to take over such a task. It is a great surprise that my papers are cited at all with no publicly available datasets and no code.
- Of course, all of the aforementioned sins were more than common at the time, and things changed only after the Deep Learning big bang of 2012. This is, in my personal opinion, the biggest contribution of Deep Learning to the scientific community, but it is a different subject, so I may revisit it in the future.
Back to my sins, at least they came with a virtue, so I am in academia/industry limbo instead of either place’s hell: you’ll have to take my word for it, but I was always an honest researcher. So, my results were always real, even if they couldn’t be reproduced. As I say to my daughter ever so often… I pinky swear :)
New tools, old methods
Now that you know this part of my story, I am ready to show you my new shiny tools which will be assisting me in this exercise, as well as my old, rusty methods. Let’s start with the shiny parts.
- I will be using Python 3.6 and there is probably no reason why not to use 3.7. I will not indulge in any discussions about not choosing earlier versions, as I feel it is a resolved issue by now.
- I will be using Anaconda to manage the dependencies. Well, no. Miniconda will do just fine for those not willing to install every module on planet earth without needing it - does this remind of usage of Matlab in academia to anyone? The modules I will be using, the way to install them and the need for them will be demonstrated in next sections.
- I will also be using Linux (my flavor is Mint, no need for holy wars), but Windows will be just the same, if you have them and prefer them.
- For developing I will be using VS Code as an IDE. I love Spyder, but it brings too many memories of Matlab and I pinky-swore I won’t look back.
- For version control, I will be using git, because I already have a github account. My first ever public personal repository at the age of 41. I feel very ashamed and strangely proud at the same time! I promise I will contribute more now that I started (sounds like a good new year resolution for 2019).
- For a nice presentation of my progress with code, commentary and images I might be using JupyterLab. However, I will also first give a try to VS Code’s neuron.
- For running my old Matlab code as a baseline… well, I have no legal way of doing that, but I may use GNU Octave and hope for the best (something tells me I am in for a lot of pain).
The methods and the datasets are not new. I will however enhance them a bit, by adding new datasets to make benchmarking more interesting:
- The original road detection paper is from 2012 and can be found at IEEExplore.
- For those (many, many normal and sane people) who have no access to IEEE, I will be mentioning the core steps of the algorithm as I try to reproduce it from scratch.
- In the paper, I used an ancient, but very useful, image sequence for road detection algorithms, which can be found at: https://tev.fbk.eu/databases/diplodoc-road-stereo-sequence
- In my PhD, I also used Jose Alvarez’s more challenging sequences from: https://rsu.data61.csiro.au/people/jalvarez/research_bbdd.php
- In the last step of this exercise, I will be aiming to reproduce the new numbers using the road dataset from KITTI, which can be downloaded at: http://www.cvlibs.net/datasets/kitti/eval_road.php End of Part 1. Those interested, please stay tuned for Part 2.