A program written in a compiled language needs to be turned into an executable binary by a compiler before it can run. In contrast, interpreted languages do not need this extra step: their programs are executed on-the-fly by an interpreter. So, which group does Python belong to?
CPython - the reference implementation of Python - is an interpreter. However, the Python language specification does not impose any restrictions in this regard, so different Python implementations can apply different strategies. For example, PyPy is an alternative, highly compatible Python implementation that uses a JIT (just-in-time) compiler.
Also, to improve the performance of the interpreter, CPython uses an intermediate representation called bytecode.
If that does not make sense on first read, do not worry: I’ll explain all of it below.
Compiled vs Interpreted Languages - What’s The Difference?
If you already know the difference between compiled, interpreted and JIT-compiled languages, you can skip this part and jump to the part about Python.
Computers understand only machine code - code consisting of CPU instructions. Each type of CPU has its own set of available instructions, so machine code is system-specific. To translate a piece of code written in a high-level programming language into low-level machine code, a compiler can be used. This compilation process results in an executable binary.
The target CPU can understand and run the instructions contained in this executable, without the original source code.
Some of the most well-known compiled languages:
- Common Lisp
Advantages of Compiled Languages
- Performance: Compiled languages tend to run faster, as the program statements do not have to be translated into machine code at runtime. That translation already happened during the compilation step.
- Safety and robustness: Many errors that would only surface while an interpreter is running the script (syntax errors and, in statically typed languages, type errors) are caught at compile-time. This tends to make compiled programs more robust than interpreted ones.
Disadvantages of Compiled Languages
- The code is not portable: The executables are platform-dependent. To run the code on a different type of CPU, it needs to be recompiled.
- The binaries are not modifiable without the original source code: Reverse engineering them is a difficult process. You need to have the source files to be able to modify and recompile the executables. (This can also be an advantage if you want to distribute proprietary software and stop users from modifying/redistributing it.)
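To see the point about compile-time error checking from the Python side, here is a minimal sketch. The function names (`total_price`, `broken`) are made up for illustration. It shows that CPython's byte-compilation step accepts code containing a type error, and that the error only appears when the faulty statement actually runs:

```python
# A deliberately buggy snippet: adding an int to a str is a type error.
# The function names are hypothetical, chosen only for this example.
source = '''
def total_price(price, tax):
    return price + tax

def broken():
    return total_price(100, "8%")
'''

# Byte-compiling succeeds: only the syntax is checked at this stage.
code = compile(source, "<example>", "exec")

namespace = {}
exec(code, namespace)  # defines both functions, still no error

try:
    namespace["broken"]()  # the faulty statement finally executes
    error = None
except TypeError as exc:
    error = exc
    print("Caught at run time:", exc)
```

A compiler for a statically typed language would typically reject the equivalent program before producing a binary.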
Translating the source code into machine code can also happen on-the-fly, with the help of interpreters. Interpreters are programs that take the source file (oftentimes called a script) and execute it line-by-line (or rather statement-by-statement), without a separate compilation step.
Some of the more popular interpreted programming languages (often called scripting languages) include:
- UNIX shell
Advantages of Interpreted Languages
- Development speed and developer happiness: Skipping the time-consuming compilation step provides a short feedback loop for developers, which can result in faster turnaround times during the software development process.
- Portability: The source code can be run on any supported platform without modification - the program is portable. No need for recompilation.
- Flexibility: Interpreted languages are usually more flexible: in many cases they provide dynamic typing, powerful reflection features and run-time code evaluation.
Disadvantages of Interpreted Languages
- Performance: Interpreted languages are usually slower than compiled languages due to the overhead of the interpreter.
- Security: Flexibility comes at a price: dynamic typing and run-time code evaluation can lead to serious security issues, especially when untrusted input ends up being evaluated as code.
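The flexibility features listed above, and the security concern that comes with them, can be sketched in a few lines of Python. This is only an illustration with made-up values. It contrasts the built-in `eval()`, which executes arbitrary expressions, with `ast.literal_eval()`, which accepts plain literals only:

```python
import ast

# Dynamic typing: the same name can refer to values of different types.
value = 42
value = "forty-two"

# Reflection: look up and call a method by its name at run time.
method = getattr(value, "upper")
shouted = method()

# Run-time code evaluation: eval() runs any expression it is given,
# which is why it should never be fed untrusted input.
computed = eval("2 ** 10")

# ast.literal_eval() only accepts Python literals (numbers, strings,
# lists, dicts, ...), so an expression containing a call is rejected.
parsed = ast.literal_eval("[1, 2, 3]")
try:
    ast.literal_eval("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True

print(shouted, computed, parsed, rejected)
```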
We also need to mention JIT compilers. JIT stands for just-in-time. JIT compilation is a hybrid technique that tries to strike a balance between compilation and interpretation. The aim is to have the flexibility of an interpreter without most of the performance penalty.
The compilation step does not happen before running the program; instead, a JIT compiler compiles the necessary code chunks on-the-fly. The running code is continuously analyzed, and frequently executed parts are recompiled and optimized if needed. Usually this involves an intermediate step of compiling the source code into bytecode.
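The "compile only what turns out to be hot" idea can be imitated with a toy sketch. This is an analogy, not how a real JIT compiler works: a real JIT emits machine code, while the hypothetical `ToyJit` class below merely caches a Python code object once an expression has been evaluated `HOT_THRESHOLD` times (both names are assumptions made for this example).

```python
HOT_THRESHOLD = 3  # hypothetical: number of runs before we "compile"


class ToyJit:
    """Evaluates an expression string; caches a code object once it is hot."""

    def __init__(self, expression):
        self.expression = expression
        self.calls = 0
        self.compiled = None  # filled in lazily, only for hot expressions

    def run(self, **variables):
        self.calls += 1
        if self.compiled is None and self.calls >= HOT_THRESHOLD:
            # Hot path detected: parse and compile once, reuse afterwards.
            self.compiled = compile(self.expression, "<toy-jit>", "eval")
        # Cold runs re-parse the string each time; hot runs reuse the code object.
        target = self.compiled if self.compiled is not None else self.expression
        return eval(target, {"__builtins__": {}}, variables)


square_plus_one = ToyJit("x * x + 1")
results = [square_plus_one.run(x=i) for i in range(5)]
print(results, "compiled:", square_plus_one.compiled is not None)
```

The first two runs take the slow path; from the third run on, the cached code object is reused.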
Some examples of languages using JIT compilers:
- Java (JVM - Java Virtual Machine)
- C# (CLR - Common Language Runtime)
The Python Language Specification And Its Implementations
There is a little confusion here: when we talk about Python, we can mean several things. First of all, Python is a language specification - a set of documents that specify the syntax and semantics of the Python language. It is maintained and owned by the Python Software Foundation.
In addition to the Python specification, they also maintain a reference implementation of it: CPython. This is the oldest, most mature and most popular Python implementation: when we talk about the Python interpreter, we usually mean CPython.
However, there are several alternative - partially or fully compliant - Python implementations, developed by various organizations and individuals. Let’s have a look at some of them, and see what compilation strategy they use:
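If you are not sure which implementation is running your code, the standard library can tell you. A small sketch (the printed values depend on your installation):

```python
import platform
import sys

# Name of the running implementation, e.g. "CPython" or "PyPy".
implementation = platform.python_implementation()

# sys.implementation carries the same information in lowercase form.
print(implementation, sys.implementation.name, platform.python_version())
```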
We already mentioned CPython: it is the reference implementation - mature, robust, performant and fully compliant with the language specification. It is considered to be an interpreter; however, it uses an intermediate compilation step (somewhat similar to PHP) that translates the program into bytecode, which is then run by the Python Virtual Machine.
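You can look at this bytecode yourself with the standard `dis` module. The sketch below uses a made-up `add` function; the exact opcode names in the listing differ between CPython versions, so treat the output as illustrative:

```python
import dis


def add(a, b):
    return a + b


# Every function carries a code object that holds its compiled bytecode.
raw_bytecode = add.__code__.co_code
print(type(raw_bytecode), len(raw_bytecode))

# dis.dis() prints a human-readable listing of the instructions.
dis.dis(add)

# The same instructions are available programmatically.
instructions = [ins.opname for ins in dis.get_instructions(add)]
print(instructions)
```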
Jython works in a very similar manner to CPython. The main difference is that instead of relying on the Python Virtual Machine, it compiles Python code to Java bytecode, which can be run by the Java Virtual Machine.
PyPy is a Python implementation with a JIT compiler - it aims to be highly compatible with CPython. PyPy strives to solve the performance issues of the standard Python interpreter by taking advantage of just-in-time compilation.
As you can see, most Python implementations are bytecode interpreters: they compile the source code to bytecode that is then run on a virtual machine.