Bioinformatics – What it is, What it isn’t

Every undergraduate who enters an academic programme wishes to do well and graduate with a handsome CGPA. Students who are sufficiently motivated and adequately prepared to cope with the demands of the course are more likely to do so. After teaching bioinformatics majors for two semesters, I conclude that these two elements are lacking in students of our programme. Since sufficient motivation hinges on how clearly the objectives of a field are defined, I will first address the common misconception that bioinformatics is a biology course.

Biology today has become such an interdisciplinary subject that it is barely recognisable as the kind of biology done 100 or even 50 years ago. Mathematical modelling and statistical reasoning are rife in current biology, and rightly so. Quantitative models generate falsifiable hypotheses that can be tested using data. Bioinformatics arose as a discipline to provide quantitative tools to help biologists solve problems. As such, its foundation lies firmly in the mathematical, statistical and computing sciences. A bioinformatics major is expected to provide solutions to a biological problem, often through computational means. His or her skills must therefore lie mainly on the technical side, with enough knowledge of biology to put the problem into its proper context and interpret the results. Knowing lots of biology but having little technical know-how is like putting the cart before the horse.

The proliferation of bioinformatics tools in the form of free software has helped many biologists in their research. However, routine use of bioinformatics software such as CLUSTALW, MEGA or PHYLIP does not make one a bioinformatician, just as routine use of statistical software does not make a person a statistician, unless you understand how things work inside the software. Students may get the wrong idea that bioinformatics is about mastering a couple of such programs and trying to earn a living with them. Knowing how to use the software is not a big deal. Anyone who spends enough time going through the manual and uses the software routinely can become an expert user. A piece of software is merely an implementation of a model; if the model assumptions are not met in the data, then any output must be interpreted with caution. You have to understand how the model works in order to judge the usefulness of the results. Again, this requires adequate technical background.
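
To make the point about models concrete, here is a minimal sketch in Python. It is my own illustration, not code taken from any of the packages named above. It assumes two aligned DNA sequences of equal length with no gaps, and compares the raw proportion of differing sites (the p-distance) with the Jukes-Cantor corrected distance, d = -(3/4) ln(1 - 4p/3). The correction rests on a model assumption, namely that all substitutions are equally likely; if the data violate that assumption, the number that comes out still looks tidy but means less. The function names are hypothetical.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of aligned sites at which the two sequences differ."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, non-empty and of equal length")
    mismatches = sum(a != b for a, b in zip(seq_a, seq_b))
    return mismatches / len(seq_a)


def jukes_cantor_distance(p: float) -> float:
    """Jukes-Cantor corrected distance for an observed p-distance.

    Assumes every substitution is equally likely. The formula is
    undefined once p reaches 0.75, so we refuse to return a number.
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("Jukes-Cantor distance is undefined for p >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    p = p_distance("ACGTACGT", "ACGTACGA")  # one mismatch in eight sites
    print(p, jukes_cantor_distance(p))
```

The corrected distance is always at least as large as the raw one, because the model allows for multiple substitutions at the same site. A user who only reads the output table would not know why, or when to distrust it.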

I hope that by now the student is convinced of the need to learn more maths, statistics and programming. Let's take a look at the programme structure. If a student avoids the non-core technical subjects, only 19% (23 credits) of the programme is technical in nature, and most of these credits are concentrated in the second year. On the other hand, a student may try to take as many technical papers as possible; this stretches the coverage to about 32% (39 credits) of the total programme credits. By comparison, bioinformatics programmes overseas may consist of up to 80% technical subjects, and they are often placed in departments of mathematics, statistics or computer science.

Of course, the student can always treat a BSc like any basic degree and move on to do something else. In that case, he or she would probably satisfy the minimum technical requirements and go on to take as many soft papers as possible, since these are more likely to return an A with less effort than a technical paper. The CGPA would look quite good at the end. Those who are seriously thinking about a career in bioinformatics may want to take the difficult path. This involves taking papers from the Institute of Mathematical Sciences such as Calculus I, Introduction to Probability, Probability and Statistical Inference I, Linear Algebra and Combinatorics. You'll have to plan carefully, since some of these papers have prerequisites. Surviving and doing reasonably well in these papers is likely to strengthen your foundation skills and also instil some confidence for future work. Computing is important to a bioinformatician, so serious students need to sharpen their programming skills. Try to pick up an additional language such as Python, Perl or R on your own during the long holidays, and test your understanding by writing programs to solve computational problems. The website Project Euler contains many problems for self-study, and solving them is sure to improve your skills.
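
To give a flavour of the kind of exercise I mean, here is a short Python sketch for the first Project Euler problem, which asks for the sum of all multiples of 3 or 5 below a given limit. The function name is my own choice, and this brute-force approach is only one of several ways to solve it. The problem statement itself gives a small case to check against: below 10 the multiples are 3, 5, 6 and 9, which sum to 23.

```python
def sum_of_multiples(limit: int) -> int:
    """Sum of all natural numbers below `limit` divisible by 3 or 5."""
    return sum(n for n in range(limit) if n % 3 == 0 or n % 5 == 0)


if __name__ == "__main__":
    # Check against the small case given in the problem statement.
    assert sum_of_multiples(10) == 23
    # The actual problem asks for the limit 1000.
    print(sum_of_multiples(1000))
```

Checking your program against a case you can work out by hand, before trusting it on the real input, is a habit worth forming early.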

So, despite the programme's structural shortcomings, I think it is still possible to get a decent background in technical bioinformatics. This will doubtless involve a lot of hard work, but when you finally feel a sense of elation watching a program that you have written return the correct solution, you will know you have discovered the joy of doing real bioinformatics.

About Tsung Fei

A teacher and researcher in the bioinformatics division at the University of Malaya

One Response to Bioinformatics – What it is, What it isn’t

  1. Sophos says:

    Good to read this.