Seunghyun Yoo


[EN] Performance implication of using global variables in Python

Code snippets

sum.py

s = 0
for i in range(1,10000000):
    s += i
print(s)

sum_local.py

def foo():
    s = 0
    for i in range(1,10000000):
        s += i
    print(s)

foo()

I was a little surprised to find that these two snippets, which compute the same sum, differ so much in running time.

$ time python3 sum.py
49999995000000
python3 sum.py  1.43s user 0.01s system 99% cpu 1.439 total
$ time python3 sum_local.py
49999995000000
python3 sum_local.py  0.69s user 0.00s system 99% cpu 0.691 total

The version that keeps s and i as local variables inside a function finishes in 0.691 seconds, roughly twice as fast as the module-level version with global variables (1.439 seconds). This is not just a matter of code style: the Python interpreter generates different bytecode for the two cases.
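The comparison can also be reproduced inside a single script instead of with the shell's time command. The following is a minimal sketch, not the measurement used above: the helper names (sum_local, sum_module_level, timed) are made up for illustration, N is reduced to keep the run short, and exec on a compiled source string is assumed to be a fair stand-in for running sum.py as a module. Absolute timings will vary by machine and Python version.

```python
import time

N = 1_000_000  # smaller than the 10,000,000 used above, to keep the run short

# Same loop as sum.py, kept as source so it compiles as module-level code.
MODULE_SRC = """
s = 0
for i in range(1, N):
    s += i
"""


def sum_local(n):
    # Same loop as sum_local.py: s and i are local variables of the function.
    s = 0
    for i in range(1, n):
        s += i
    return s


def sum_module_level(n):
    # Run the loop as module-level code in a fresh namespace dictionary,
    # so s and i are looked up by name on every iteration.
    namespace = {"N": n}
    exec(compile(MODULE_SRC, "<sum.py>", "exec"), namespace)
    return namespace["s"]


def timed(fn, n):
    # Return the function's result and the elapsed wall-clock seconds.
    start = time.perf_counter()
    result = fn(n)
    return result, time.perf_counter() - start


global_result, global_secs = timed(sum_module_level, N)
local_result, local_secs = timed(sum_local, N)

print(f"module-level: {global_secs:.3f}s")
print(f"local:        {local_secs:.3f}s")
```

Both variants return the same sum; only the elapsed time should differ.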

Bytecode

sum.py

  1           0 LOAD_CONST               0 (0)
              2 STORE_NAME               0 (s)

  2           4 SETUP_LOOP              26 (to 32)
              6 LOAD_NAME                1 (range)
              8 LOAD_CONST               1 (1)
             10 LOAD_CONST               2 (10000000)
             12 CALL_FUNCTION            2
             14 GET_ITER
        >>   16 FOR_ITER                12 (to 30)
             18 STORE_NAME               2 (i)

  3          20 LOAD_NAME                0 (s)
             22 LOAD_NAME                2 (i)
             24 INPLACE_ADD
             26 STORE_NAME               0 (s)
             28 JUMP_ABSOLUTE           16
        >>   30 POP_BLOCK

sum_local.py

  4           0 LOAD_CONST               1 (0)
              2 STORE_FAST               0 (s)

  5           4 SETUP_LOOP              26 (to 32)
              6 LOAD_GLOBAL              0 (range)
              8 LOAD_CONST               2 (1)
             10 LOAD_CONST               3 (10000000)
             12 CALL_FUNCTION            2
             14 GET_ITER
        >>   16 FOR_ITER                12 (to 30)
             18 STORE_FAST               1 (i)

  6          20 LOAD_FAST                0 (s)
             22 LOAD_FAST                1 (i)
             24 INPLACE_ADD
             26 STORE_FAST               0 (s)
             28 JUMP_ABSOLUTE           16
        >>   30 POP_BLOCK

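The listings above come from dis.dis(foo) for the function and python3 -m dis sum.py for the module. They differ only in how s and i are accessed. At module level, every read and write of s and i compiles to LOAD_NAME and STORE_NAME, which look the variable up by name in the namespace dictionaries (local, then global, then built-in for a load). Inside the function, the same accesses compile to LOAD_FAST and STORE_FAST, which index directly into a fixed array of local variable slots and need no dictionary lookup. The loop body runs about ten million times, so the cost of those lookups accounts for the gap in running time.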
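The opcode difference can be checked programmatically as well. The sketch below is an illustration under a few assumptions: variable_opcodes is a hypothetical helper, the loop bound is shortened because only the compiled bytecode matters here, and exact opcode names vary between CPython versions (newer releases may emit fused or renamed variants of LOAD_FAST), so the helper matches on substrings rather than exact names.

```python
import dis

# Module-level version of the loop, kept as source so it can be compiled
# the same way the interpreter compiles sum.py.
MODULE_SRC = "s = 0\nfor i in range(1, 10):\n    s += i\n"


def foo():
    # Function version of the loop, as in sum_local.py.
    s = 0
    for i in range(1, 10):
        s += i
    return s


def variable_opcodes(code):
    # Collect the names of opcodes that access variables, whether by
    # name lookup (NAME, GLOBAL) or by local slot (FAST).
    return {
        ins.opname
        for ins in dis.get_instructions(code)
        if any(tag in ins.opname for tag in ("NAME", "FAST", "GLOBAL"))
    }


module_ops = variable_opcodes(compile(MODULE_SRC, "<sum.py>", "exec"))
local_ops = variable_opcodes(foo.__code__)

print("module level:", sorted(module_ops))
print("function:    ", sorted(local_ops))
```

The module-level code should list only name-based opcodes, while the function should list slot-based ones plus LOAD_GLOBAL for range.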