How Does Python Work? Debunking Myths about Python Names and Python Variables

An Introduction to the Anatomy and Core Concepts of the Python Execution Model

Yusup
AI³ | Theory, Practice, Business

Python has become the de facto programming language for machine learning. However, if you are jumping into Tensorflow, PyTorch, or any other Python machine learning framework without previous Python experience, you might be overwhelmed by its complexity.

(P.S. I’ll be writing about the Tensorflow programming model soon. Follow my blog to stay tuned!)

I made my own transition into a Python machine learning framework a few months ago and found it quite confusing. Fortunately, I’ve since come to a much better understanding of the Python execution model.

In this article, I’m going to share what I’ve learned by showcasing and deconstructing a few Python code samples.

Simple Python Examples

Although Python is a simple language, it has some subtleties that can make it tricky to understand. Now, without further ado, let’s dig into the code!

I’m going to show you four examples. Have a look and see if you can figure out why some work and others don’t. In the next section, we’ll deconstruct them to explain the behavior.

Code Sample1: This code works and runs without errors.

import math
bar = math.pi
def foo1():
    print(bar)
print(bar)  # prints 3.141592653589793
foo1()      # prints 3.141592653589793

Code Sample2: This runs without errors, but the global bar is unchanged after foo2() is called.

import math
bar = math.pi
def foo2():
    bar = math.e
    print(bar)
print(bar)  # prints 3.141592653589793
foo2()      # prints 2.718281828459045
print(bar)  # still prints 3.141592653589793

Code Sample3: This code does not work. Why, oh why?

import math
bar = math.pi
def foo3():
    print(bar)
    bar = math.e
foo3()  # raises UnboundLocalError

Code Sample4: Why is the global statement necessary to change the global bar?

import math
bar = math.pi
def foo4():
    global bar
    bar = math.e
    print(bar)
print(bar)  # prints 3.141592653589793
foo4()      # prints 2.718281828459045
print(bar)  # now prints 2.718281828459045

Deconstruction of Python Code Samples

It may be difficult to grasp the behavior of bar in the previous examples, especially if you assume that bar operates in the same manner as a global variable would in a language such as C, JavaScript, or Go.

Well, it does and it doesn’t. Python works with names and bindings, although it also talks about variables. If you try to print a name that has never been bound, such as print(hahahahahahahahahaha), you’ll get NameError: name 'hahahahahahahahahaha' is not defined thrown in your face.
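To see this for yourself, here is a minimal sketch that triggers the error and catches it so the script keeps running. The helper name message is only there for illustration.

```python
# Looking up a name that has no binding raises NameError.
try:
    print(hahahahahahahahahaha)  # this name was never bound anywhere
except NameError as exc:
    message = str(exc)
    print(message)
```

Note that the error message talks about a name, not a variable.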

Now, before we can grasp why some of the samples work and others don’t, we need to stop and clarify some core terminology.

Understanding Python Code Terminology

The NameError message above calls the undefined identifier a name. Yet if you run Code Sample3, it reports the following error: UnboundLocalError: local variable 'bar' referenced before assignment (newer Python versions word this as: cannot access local variable 'bar' where it is not associated with a value). This message calls the same kind of identifier a variable.

Wait, now the name is a variable? Yes. Even the official documentation uses names and variables interchangeably. In most cases they refer to the same thing, but the word variable comes with some baggage from other languages.

In the classic sense, a variable denotes some location in memory, which is the common case for most languages. In Python, a name is simply bound to an object and can later be rebound to a different one. So, to make sense of what is happening above, we can abandon the notion of a variable for now and think in terms of names and bindings.
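A small sketch of what thinking in names and bindings means in practice. The names first and second are made up for this illustration; the point is that assignment binds a name to an object rather than copying a value into a memory slot.

```python
first = [1, 2, 3]    # bind the name `first` to a new list object
second = first       # bind a second name to the same object (no copy is made)
second.append(4)     # mutate the object through one of its names
same_object = first is second
print(first, same_object)    # [1, 2, 3, 4] True

second = [0]         # rebinding `second` leaves `first` and its object untouched
print(first, second)         # [1, 2, 3, 4] [0]
```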

Moving on. To better understand the resolution of names, we have to talk about scopes.

Let’s expand our understanding of scope with this definition from official Python documentation:

“A scope defines the visibility of a name within a block. If a local variable is defined in a block, its scope includes that block. If the definition occurs in a function block, the scope extends to any blocks contained within the defining one, unless a contained block introduces a different binding for the name.”
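As a rough illustration of that definition, the sketch below uses two nested functions (all names are hypothetical). One only reads a name from the defining block; the other introduces its own binding, which hides the outer one.

```python
def outer():
    label = "outer binding"

    def reads_enclosing():
        # No binding for `label` in this block, so the enclosing one is visible.
        return label

    def rebinds_locally():
        # This assignment introduces a different binding for `label`,
        # so the enclosing binding is hidden inside this block only.
        label = "inner binding"
        return label

    return reads_enclosing(), rebinds_locally(), label

result = outer()
print(result)  # ('outer binding', 'inner binding', 'outer binding')
```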

Next, here’s how the official documentation defines a binding:

“The following constructs bind names: formal parameters to functions, import statements, class and function definitions (these bind the class or function name in the defining block), and targets that are identifiers if occurring in an assignment, for loop header, or after as in a with statement or except clause. The import statement of the form from … import * binds all names defined in the imported module, except those beginning with an underscore. This form may only be used at the module level.”

To simplify this, here are a few commonly used binding constructs:

  • import foo
  • import foo as bar
  • foo = bar
  • with foo() as bar
  • except FooError as f1
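The constructs listed above, plus a couple more from the quoted definition, can be sketched in one short script. The names (m, radius, area, handle, and so on) are placeholders chosen for this example.

```python
import math                        # import binds the name `math`
import math as m                   # import ... as binds the name `m`
from io import StringIO            # from ... import binds `StringIO`

radius = 2                         # assignment binds `radius`

def area(r):                       # def binds `area`; `r` is bound on each call
    return m.pi * r * r

circle = area(radius)

for index in range(3):             # the for loop header binds `index`
    pass

with StringIO("text") as handle:   # with ... as binds `handle`
    first_char = handle.read(1)

try:
    1 / 0
except ZeroDivisionError as err:   # except ... as binds `err` inside the handler
    error_name = type(err).__name__

print(circle, index, first_char, error_name)
```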

Deciphering The Code Samples: Why Some Work and Others Don’t

Unlike Java, C, and Go, Python has no declaration statement, so the compiler scans the whole block to determine the scope of each name. If a name is never bound within the function body, Python looks it up following the LEGB (Local, Enclosing, Global, Built-in) rule. This explains why Code Sample1 works: bar is not bound inside foo1, so the lookup finds the global binding.
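Here is a minimal sketch of the LEGB lookup order. The name size and the four small functions are invented for the example; each function returns whatever binding the lookup finds first.

```python
import builtins

size = "global"                # G: module-level binding

def outer():
    size = "enclosing"         # E: binding in the enclosing function

    def local_wins():
        size = "local"         # L: binding in the innermost function
        return size

    def enclosing_wins():
        return size            # no local binding, so E is found next

    return local_wins(), enclosing_wins()

def global_wins():
    return size                # no L or E binding, so G is used

def builtin_wins():
    return len                 # not bound in L, E, or G, so B is used

print(outer(), global_wins(), builtin_wins() is builtins.len)
# ('local', 'enclosing') global True
```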

As for Code Sample2, bar is treated as a local name because the function body contains the binding statement bar = math.e. Hence, the global name bar is not affected.

The same goes for Code Sample3. When the compiler scans the function body, it finds a binding for bar, so bar is again treated as local throughout the whole function. At runtime, print(bar) raises an error because it executes before the local bar has been bound.
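The following sketch reruns Code Sample3 but catches the error, and also peeks at the function's code object to show that bar was classified as local before the function ever ran. The variables is_local and error_type are illustrative helpers, not part of the original sample.

```python
import math

bar = math.pi

def foo3():
    print(bar)      # compiled as a local lookup because of the line below
    bar = math.e    # this binding makes `bar` local to the whole function

# The classification is recorded in the code object at compile time.
is_local = "bar" in foo3.__code__.co_varnames

try:
    foo3()
except UnboundLocalError as exc:
    error_type = type(exc).__name__
    print(error_type, exc)

print(is_local)                                  # True
print(issubclass(UnboundLocalError, NameError))  # True
```

UnboundLocalError being a subclass of NameError fits the earlier point that names and variables are two words for the same thing here.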

FAQ: Is Python Interpreted or Compiled?

In case you were wondering, the answer to this frequently asked question is “yes”: it is both. Python source code is first compiled to bytecode by the compiler; the bytecode is then interpreted and executed by the Python virtual machine. When you run into performance issues, or are in doubt about language semantics, the bytecode is one of the first things worth checking.
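The two steps can be made visible with the built-in compile() and exec() functions. This is only a sketch: the source string, the "<example>" file name, and the namespace dictionary are arbitrary choices for the demonstration.

```python
source = "answer = 6 * 7"

# Step 1: the compiler turns source text into a code object holding bytecode.
code_object = compile(source, "<example>", "exec")
bytecode = code_object.co_code        # the raw bytecode, as a bytes object

# Step 2: the virtual machine executes that bytecode in a namespace.
namespace = {}
exec(code_object, namespace)

print(type(code_object).__name__, len(bytecode) > 0, namespace["answer"])
# code True 42
```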


Troubleshooting with Bytecodes in Python

In Python, most statements compile to one or more bytecode instructions. The global and nonlocal statements are different: they produce no instructions of their own and are merely hints that help the compiler determine a given name’s binding scope. (The pass statement likewise does nothing at runtime.)

In Code Sample4, the statement global bar tells the compiler that bar refers to the binding in the global scope. Therefore, inside foo4 we can both read the global bar and rebind it.
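To illustrate that global and nonlocal only change how the compiler classifies a name, here is a small sketch that inspects the resulting code objects. The functions add_global and make_counter are hypothetical examples, not taken from the samples above.

```python
total = 0

def add_global(amount):
    global total             # compiler hint: `total` is the module-level name
    total += amount

def make_counter():
    count = 0

    def increment():
        nonlocal count       # compiler hint: `count` belongs to make_counter
        count += 1
        return count

    return increment

add_global(5)
counter = make_counter()
counter()
second_call = counter()

print(total, second_call)                           # 5 2
print("total" in add_global.__code__.co_varnames)   # False: not a local
print(counter.__code__.co_freevars)                 # ('count',)
```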

To understand how the compiler interprets our code, let’s use dis (pun intended). dis is short for disassembler; the module shows the bytecode generated by compilation. The documentation of the dis module describes what each bytecode instruction does.
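A minimal way to produce such a listing yourself, using foo1 from Code Sample1. The variable listing is just a helper name for this sketch.

```python
import dis
import math

bar = math.pi

def foo1():
    print(bar)

dis.dis(foo1)                         # print the disassembly to stdout

# The same listing as a string, handy for logging or comparisons.
listing = dis.Bytecode(foo1).dis()
```

The exact instructions and offsets depend on the Python version you run this with, so your output may differ from the listings in this article.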

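For foo1, the corresponding bytecode looks something like this (the exact instructions vary between Python versions; CALL_FUNCTION, for example, was replaced in Python 3.11):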

      0 LOAD_GLOBAL              0 (print)
      2 LOAD_GLOBAL              1 (bar)
      4 CALL_FUNCTION            1
      6 POP_TOP
      8 LOAD_CONST               0 (None)
     10 RETURN_VALUE

For foo2, the corresponding bytecode looks something like this:

      0 LOAD_GLOBAL              0 (math)
      2 LOAD_ATTR                1 (e)
      4 STORE_FAST               0 (bar)
      6 LOAD_GLOBAL              2 (print)
      8 LOAD_FAST                0 (bar)
     10 CALL_FUNCTION            1
     12 POP_TOP
     14 LOAD_CONST               0 (None)
     16 RETURN_VALUE

In foo1, LOAD_GLOBAL looks up the global name bar and pushes the object bound to it onto the stack. In foo2, by contrast, STORE_FAST and LOAD_FAST are used: they store and load bar as a local variable of the function, so the global bar is never touched.
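The same comparison can be done programmatically. The sketch below defines a hypothetical helper, ops_on_bar, that lists only the instructions whose argument is the name bar, and applies it to foo1, foo2, and foo4 from the samples above. None of the functions is actually called, so bar keeps its value.

```python
import dis
import math

bar = math.pi

def foo1():
    print(bar)

def foo2():
    bar = math.e
    print(bar)

def foo4():
    global bar
    bar = math.e
    print(bar)

def ops_on_bar(func):
    """Return the opcode names of the instructions whose argument is `bar`."""
    return [ins.opname for ins in dis.get_instructions(func) if ins.argval == "bar"]

for func in (foo1, foo2, foo4):
    print(func.__name__, ops_on_bar(func))
```

I would expect foo1 to show only a global load, foo2 only the FAST family of instructions, and foo4 a STORE_GLOBAL followed by a global load. The precise opcode names can differ slightly between Python versions.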

In Conclusion

Sure, the word variable can mislead Python newcomers, but once you put scopes, names, and bindings into perspective and use bytecode to help you out, you’ll find that you can in fact wrap your head around how these elements work together. From there, you’re well on your way to writing efficient, correct code!

Thank you for reading! Feel free to share your questions and comments, as well as some claps if you’ve enjoyed this post.

Come back next week for an article on the Tensorflow programming model.

Cheers!
