Programming Languages
In computational science, we face a fundamental trade-off between two opposing needs: human efficiency and machine efficiency.
We want to write code quickly. We want the syntax to be readable, forgiving, and close to plain English. But we also need our programs to run fast. When simulating millions of molecules or training a massive neural network, a delay of milliseconds per calculation adds up to weeks of lost time.
Historically, no single programming language could give us both. This dilemma is known as the Three-Language Problem. It segments our world into three distinct tiers:
- The Interface (Glue): Languages like Python. They are easy to learn and write, but they are computationally slow.
- The Systems: Languages like C++ and Rust. They are incredibly fast and memory-efficient, but they are difficult to learn and cumbersome to write.
- The Accelerators: Specialized languages like CUDA and OpenCL. They talk directly to hardware accelerators (GPUs and TPUs), offering raw, blazing speed, but they require deep expertise to manage.
Alex’s Soap Box
Mojo is a new programming language designed to solve the Three-Language Problem. It unifies the stack. You can write high-level scripts and low-level system code in the same file. Because of this potential, I am currently transitioning my entire science stack to Mojo. While it will take years for the broader scientific community to catch up, adopting this technology now provides a massive strategic advantage. In both research and the startup world, being able to build faster, more efficient tools than your competitors can help get that sweet, sweet venture capital.
If you walk into any computational lab today, you will find that Python is the undisputed winner. It is the standard “glue” language of modern science.
We call it a Glue Language because its primary job is not to do the heavy lifting itself, but to stick different powerful tools together. Python excels at this because it prioritizes you, the human. It handles the boring details of memory management and system calls so you can focus on the biology.
You might ask: If Python is slow, why do we use it for high-performance supercomputing?
The answer lies in how we use it. When you run a heavy calculation in Python, Python isn’t actually doing the math. It is merely the steering wheel. Under the hood, Python passes that command down to a highly optimized library written in a systems language like C, C++, or Rust.
These libraries are the engines. They are written by systems programmers who enjoy obsessing over memory layout and processor instructions so that you don’t have to. You write a simple line of Python, and it triggers a blazing-fast C++ routine deep inside the machine.
Key Idea: The Steering Wheel vs. The Engine
Think of Python as the steering wheel and the underlying libraries (like NumPy or PyTorch) as the engine. The steering wheel doesn’t make the car move; it directs the power. You can drive a Formula 1 car with a comfortable leather steering wheel. Python allows you to “drive” complex, high-performance C++ code with a simple, comfortable syntax.
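You can see the steering wheel and the engine in action with a few lines of code. This sketch (the function name `python_sum` is just an illustrative choice) sums a million numbers twice: once with a pure-Python loop, where the interpreter performs every addition itself, and once with NumPy's `sum`, which hands the whole reduction to a compiled C routine.

```python
import timeit

import numpy as np

# One million double-precision numbers.
data = np.random.rand(1_000_000)

def python_sum(values):
    """Pure-Python loop: the interpreter executes each addition."""
    total = 0.0
    for v in values:
        total += v
    return total

# Time each approach over five runs.
slow = timeit.timeit(lambda: python_sum(data), number=5)
fast = timeit.timeit(lambda: data.sum(), number=5)

print(f"pure Python: {slow:.4f}s   NumPy: {fast:.4f}s   speedup: {slow / fast:.0f}x")
```

Both calls compute the same total, but the NumPy version is typically one to two orders of magnitude faster. The Python line you typed is the steering wheel; the C loop inside NumPy is the engine.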
Note
You will inevitably encounter R, a language built specifically for statistics and data visualization. In certain niches of biology, particularly bioinformatics and ecology, R is deeply entrenched. Many researchers use it because they were trained on specific R-based tools (like ggplot2 or Bioconductor) that were standard in their labs.
There is nothing wrong with using R for these specific tasks. However, it is essential to recognize its limits. R is a specialized tool for a specialized trade. Outside of specific statistical niches, Python dominates the entire landscape. From running web servers and automating cloud infrastructure to training state-of-the-art AI models, Python is the universal language.