Note

This page was generated from a Jupyter notebook.

A super-brief intro to Python and NumPy

Python is:

  • interpreted (high level)

  • readable

  • concise

  • cross-platform

  • dynamically typed

  • object oriented

  • automatically memory-managed

Almost all of the below is explained much more fully at various places online. For a nice entry level tutorial set, try the Software Carpentry intros: http://swcarpentry.github.io/python-novice-inflammation/

The main Python documentation is also an extremely readable source of knowledge. Just Google!

PROGRAM FILES AND INTERACTIVE ENVIRONMENTS

Put Python code in .py text files (AKA “scripts”). Run these from a shell, as:

python myscript.py

OR

Use one of Python’s interactive environments (e.g., IPython):

ipython

In an interactive environment:

  • Run code line-by-line, preserving variables

  • Run your scripts, using the magic command %run (and preserve the variables)

This Jupyter notebook is an interactive environment.

MODULES

Python has some built-in functions, but everything else comes in a library module.

Modules are imported, then the functions they hold run with a dot syntax:

[ ]:
import math  # comments with a hash

x = math.cos(2.0 * math.pi)
print(x)  # print is a built-in function

OR import the functions and properties individually:

[ ]:
from numpy import cos, pi  # numpy, numeric python, also has these functions

x = cos(2.0 * pi)
print(x)
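
A third common pattern, sketched here as an illustration, is to import the whole module under a short alias. The alias np for numpy is a widespread convention, and it keeps it obvious which library each function comes from. The variable names below are only for illustration.

```python
import math

import numpy as np  # "np" is the conventional alias for numpy

x_math = math.cos(2.0 * math.pi)  # math works on single numbers
x_np = np.cos(2.0 * np.pi)        # numpy has the same function...

# ...but numpy's version also works element-by-element on arrays
angles = np.array([0.0, np.pi])
cosines = np.cos(angles)

print(x_math, x_np, cosines)
```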

Get help in an interactive shell with a leading or trailing ? (e.g., ?pi or pi?). If the help opens in a pager, quit it with q

[ ]:
?pi

TYPES

Python distinguishes:

  • integer (int), e.g., 3

  • float (float), e.g., 1.0 or 2.5

  • boolean (bool), e.g., True

  • complex (complex), e.g., 3.2 + 2.4j (Python writes the imaginary unit as j, not i)

  • string (str), e.g., ‘Hello world!’

You may also encounter NumPy types, like numpy.float64

[ ]:
type(pi)

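Typecasting (switching between types on-the-fly) is automatic where it is safe, e.g., an int is promoted to a float when the two are mixed in arithmetic. Repeated conversions cost time, though, which matters if you want efficient code.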
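
As a rough sketch of what that means in practice, the snippet below shows one automatic promotion, a few explicit conversions using the built-in constructors, and one case where Python refuses to convert for you. The variable names are only for illustration.

```python
# Implicit promotion: mixing an int and a float gives a float
total = 3 + 2.5
print(type(total))

# Explicit conversion with the built-in constructors
as_int = int(2.9)        # drops the fractional part, giving 2
as_float = float("1.5")  # parses a string into a float
as_str = str(42)         # gives the string "42"

# Some conversions are never automatic: str + int raises a TypeError
try:
    "3" + 4
except TypeError:
    mixed_ok = False
else:
    mixed_ok = True
print(f"Could add a str and an int directly? {mixed_ok}")
```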

Python’s inbuilt data structures are:

  • tuples, with parentheses: immutable (you can’t change values once they exist)

  • lists, with square brackets: mutable (you can change values once they exist)

  • sets, with curly brackets or set(): unordered collection with no duplicates

  • dictionaries, with curly brackets: associated pairs of key:value

Note that all these data structures let you happily mix data types… But the cost is that Python runs more slowly than, e.g., C++.

[ ]:
mytuple = (0, 1, 2, 3)
print(
    f"You can index: {mytuple[1]}"
)  # this is an f-string, which replaces each {expression} with its value

Tuples are immutable, meaning that you cannot reassign their values. Python will raise a TypeError if you try to do so. We can test this using a try...except block: if a TypeError occurs in the assignment statement below, we should see the printed message:

[ ]:
try:
    mytuple[0] = 100
except TypeError:
    print("A TypeError occurred")

… and indeed we do.

[ ]:
mylist = [0, 1, 2, 3]
print("This time reassignment works:")
mylist[0] = "I can store a string!"
print(mylist)
[ ]:
myset = {0, 1, 2, 3}
print(myset)
[ ]:
myset.add("string!")  # you can use both ' and " to declare a string
print(f"Adding is easy: {myset}")
[ ]:
myset.add(0)
print(f"But remember, duplicates don't count! {myset}")

Any hashable object (e.g., a number, string, or tuple) can be a key, and anything at all can be a value:

[ ]:
mydict = {"firstkey": 1, "secondkey": 2, 3: "three"}
print(mydict)
[ ]:
print(mydict["firstkey"])
print(mydict["secondkey"])
print(mydict[3])
[ ]:
print(f"Get the keys (in insertion order, since Python 3.7): {mydict.keys()}")
print(f"Get the values: {mydict.values()}")
[ ]:
try:
    print("The next line should generate a KeyError...")
    print(f" {mydict[2]}")
except KeyError:
    print("...and indeed it did.")
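
If you would rather not catch a KeyError, dictionaries also let you test for a key first or ask for a fallback value. The sketch below rebuilds the same dictionary so that it runs on its own; the fallback string is just an example.

```python
mydict = {"firstkey": 1, "secondkey": 2, 3: "three"}

# `in` tests for a key without raising an error
has_two = 2 in mydict

# .get() returns a default instead of raising when the key is missing
missing = mydict.get(2, "no such key")
present = mydict.get("firstkey", "no such key")

# .items() walks through the key/value pairs together
for key, value in mydict.items():
    print(f"{key!r} -> {value!r}")
```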

INDEXING

  • Indexing starts from 0

  • Index with square brackets [start : stop : step]

  • “stop” is exclusive of the named index

  • Colon alone means “all of these” or “to the start/end”

[ ]:
x = list(range(10))
print(f"x[3] gives {x[3]}")
print(f"x[1:5:2] gives {x[1:5:2]}")
[ ]:
print(f"x[8:] gives {x[8:]}")
print(f"x[:7] gives {x[:7]}")
print(f"x[::3] gives {x[::3]}")
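
Negative indices and negative steps follow the same rules and are worth knowing about. A short sketch, using the same ten-element list:

```python
x = list(range(10))

last = x[-1]          # negative indices count back from the end
last_three = x[-3:]   # from the third-to-last item to the end
all_but_last = x[:-1] # "stop" is still exclusive
reversed_x = x[::-1]  # a step of -1 walks backwards

print(last, last_three, all_but_last, reversed_x)
```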

PYTHON IS LIKE, BUT ISN’T, MATLAB

  • This is a power:

[ ]:
x = 10.0**2  # …or…
import numpy as np

x = np.square(10.0)  # NEVER 10.^2.

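Likewise, it’s also useful to know about the “floor” (//) and “remainder” (%) division operators. Note that // rounds down towards negative infinity rather than truncating towards zero; the two only differ for negative results:

[ ]:
print(f"floor divide: {(13 // 4)}")
print(f"remainder: {(13 % 4)}")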
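
With positive numbers, rounding down and truncating give the same answer, so the behaviour of // only becomes visible with a negative operand. A small sketch of the difference, using math.trunc for comparison:

```python
import math

floor_result = -13 // 4          # rounds down (towards -infinity): -4
remainder = -13 % 4              # takes the sign of the divisor: 3
truncated = math.trunc(-13 / 4)  # rounds towards zero: -3

# divmod returns the floor quotient and the remainder in one call
quotient, rem = divmod(13, 4)

print(floor_result, remainder, truncated, quotient, rem)
```
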
  • End indices are NOT inclusive

[ ]:
len(range(0, 100))  # in Matlab this would be 101
[ ]:
[
    x for x in range(5)
]  # this is called "list comprehension", and is a readable way to make a list
  • Assignment in Python binds a name to an object; it does not copy the object. So if two names refer to the same mutable object (such as a list), a change made in place through one name is also visible through the other:

[ ]:
x = [0] * 3
y = [1, 2, 3]
print(f"x starts as {x}")
print(f"y starts as {y}")
[ ]:
x = y
print(f"After setting equal, x is {x}")
[ ]:
y[1] = 100
print(f"After modifying y, x is {x}")
[ ]:
# one way to stop this automatic behaviour is by forcing a copy with [:]
x = y[:]
print(x)
print(y)
[ ]:
y[1] = 1000000
print(f"After forcing a copy, x is still {x} but y is now {y}")
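
One caveat: a [:] copy is shallow, so it copies the outer list but not any lists nested inside it. The sketch below contrasts it with copy.deepcopy from the standard library; the variable names are only for illustration.

```python
import copy

nested = [[1, 2], [3, 4]]

shallow = nested[:]           # new outer list, same inner lists
deep = copy.deepcopy(nested)  # new outer list AND new inner lists

nested[0][0] = 100            # change an inner list in place

print(f"shallow copy sees the change: {shallow}")
print(f"deep copy does not: {deep}")
```
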
  • In Matlab, assigning a value to a variable triggers output unless you suppress it with a semi-colon at the end of the line; this isn’t necessary in Python:

[ ]:
x = range(10)  # …see?
  • Python doesn’t use brackets or end statements to delineate code blocks. It uses indentation, which must be consistent within a block (normally 4 spaces). This also applies to for loops, while loops, if statements, try/except statements, class declarations, function declarations, etc.

[ ]:
def myfunction(arg1, arg2, **kwds):
    # **kwds is a special (optional) dictionary input type,
    # that you can use as an input "wildcard"
    try:
        print_this = kwds["printme"]
    except KeyError:
        x = arg1 * arg2
        return x  # ...no brackets needed; indentation alone marks the block
    else:
        print(print_this)
[ ]:
print("first time:")
myfunction(3.0, 4.0)
[ ]:
print("second time…")
myfunction(5, 6, printme="Printed this time!")
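
When you know the name of the optional input in advance, a keyword argument with a default value is often simpler than digging it out of **kwds. The function below is a hypothetical variant written for illustration (myfunction_default is not part of the original notebook); it behaves the same way for these two calls.

```python
def myfunction_default(arg1, arg2, printme=None):
    # printme is optional; None means "the caller did not supply it"
    if printme is None:
        return arg1 * arg2
    print(printme)
    return None


first = myfunction_default(3.0, 4.0)
second = myfunction_default(5, 6, printme="Printed this time!")
print(first, second)
```
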
  • Python’s most widely used plotting interface is closely modelled on Matlab’s, and lives in the library matplotlib.pyplot:

[ ]:
%matplotlib inline
# that command tells this notebook to put plots into the notebook
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(10)  # like range(), but produces a numpy array
y = np.random.rand(10)  # ten random floats, 0->1
plt.plot(x, y, "*--")
plt.xlabel("xaxis")
plt.ylabel("yaxis")
plt.title("my plot!")

NumPy and Landlab

Landlab makes extensive use of the NumPy (Numeric Python) library. It allows significant acceleration of standard Python operations on matrix-like data arrays. Here we look at some of the key features and differences from pure Python, along with some NumPy best practice.

[ ]:
import numpy as np

Initialize NumPy arrays from standard Python iterables (lists, tuples):

[ ]:
myarray = np.array([0, 1, 3, 6, 18])

…or with one of the many standard array creation methods in NumPy. Particularly useful ones are:

[ ]:
a = np.zeros(10, dtype=int)
print(f"a: {a}")
[ ]:
b = np.ones(5, dtype=bool)
print(f"b: {b}")
[ ]:
c = np.random.rand(10)
print(f"c: {c}")
[ ]:
d = np.arange(5.0)
print(f"d: {d}")
[ ]:
e = np.empty((3, 3), dtype=float)
e.fill(100.0)
print(f"e: {e}")

Arrays also have some built-in methods and properties. We see ‘fill’ above, but also noteworthy are:

[ ]:
print(f"e has shape: {(e.shape)}")
print(f"e has size: {e.size} ")  # preferred to len() when working with arrays
c.max(), c.min(), c.mean(), c.sum()
[ ]:
f = c.copy()
print(f"flatten: {e.flatten()}")

Slicing works as in pure Python, and extends naturally to more than one dimension:

[ ]:
print(d[2:])
[ ]:
e[1, 1] = 5.0
print(e)
[ ]:
print(e[1:, 1:])
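
One difference from lists is easy to miss: a basic slice of a NumPy array is a view onto the same data, not a copy, so writing to the slice changes the original. A minimal sketch, which rebuilds a 3x3 array with np.full so that it runs on its own:

```python
import numpy as np

e = np.full((3, 3), 100.0)

view = e[1:, 1:]    # a view: shares memory with e
view[0, 0] = -1.0   # this also changes e[1, 1]

independent = e[1:, 1:].copy()  # .copy() gives separate data
independent[0, 0] = 999.0       # e is unaffected

# by contrast, slicing a plain list always makes a new list
pylist = [0, 1, 2, 3]
pyslice = pylist[1:]
pyslice[0] = 50

print(e)
print(np.shares_memory(e, view), np.shares_memory(e, independent))
print(pylist)
```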

Note that logical operations with NumPy tend to require NumPy-native functions, rather than pure Python and, or, not etc.

[ ]:
bool1 = np.array([True, True, False, False])
bool2 = np.array([True, False, True, False])
print(f"AND: {np.logical_and(bool1, bool2)}")
print(f"OR: {np.logical_or(bool1, bool2)}")
print(f"NOT: {np.logical_not(bool1)}")
[ ]:
print(f"ANY: {np.any(bool1)}")
print(f"ALL: {np.all(bool1)}")
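
For boolean arrays, the operators &, | and ~ are element-wise shorthands for the functions above. The sketch below also shows why the plain Python keywords don't work, and one thing to watch: comparisons need parentheses when combined, because & binds more tightly than > or <.

```python
import numpy as np

bool1 = np.array([True, True, False, False])
bool2 = np.array([True, False, True, False])

and_op = bool1 & bool2  # same result as np.logical_and
or_op = bool1 | bool2   # same result as np.logical_or
not_op = ~bool1         # same result as np.logical_not

# the Python keyword `and` needs a single True/False, so it raises here
try:
    bool1 and bool2
except ValueError:
    raised = True
else:
    raised = False

# parentheses are required around each comparison
values = np.arange(6)
in_range = (values > 1) & (values < 5)

print(and_op, or_op, not_op, raised, in_range)
```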

Now, let’s demonstrate the speed of NumPy over pure Python:

[ ]:
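f_list = list(range(1000))
f_array = np.arange(1000, dtype=int)


def addone_py(list_in):
    # update each element in place, one at a time
    for i in range(len(list_in)):
        list_in[i] += 1


def addone_np(array_in):
    # a single vectorized operation updates the whole array
    array_in += 1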
[ ]:
print("time for list:")
%timeit addone_py(f_list)  # a magic command for timing things
[ ]:
print("time for array:")
%timeit addone_np(f_array)
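
The %timeit magic only exists in IPython and Jupyter. In a plain .py script, the standard-library timeit module does a similar job. The sketch below is one way to set up the comparison; the helper names addone_list and addone_array are made up for this example, and the timings will vary from machine to machine.

```python
import timeit

import numpy as np

f_list = list(range(1000))
f_array = np.arange(1000, dtype=int)


def addone_list(list_in):
    # pure Python: build a new list one element at a time
    return [value + 1 for value in list_in]


def addone_array(array_in):
    # NumPy: one vectorized operation over the whole array
    return array_in + 1


# total time in seconds for 1000 calls of each function
t_list = timeit.timeit(lambda: addone_list(f_list), number=1000)
t_array = timeit.timeit(lambda: addone_array(f_array), number=1000)

print(f"list:  {t_list:.4f} s for 1000 calls")
print(f"array: {t_array:.4f} s for 1000 calls")
```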

In particular, never loop to do a logical test:

[ ]:
# NOT THIS:
myoutput_slow = np.zeros(10, dtype=float)
for i in range(len(c)):  # c is our random number array
    if c[i] > 0.5:
        myoutput_slow[i] = c[i]

# DO THIS INSTEAD:
myoutput_fast = np.zeros(10, dtype=float)
greater_than_half = c > 0.5
myoutput_fast[greater_than_half] = c[greater_than_half]

print(np.all(np.equal(myoutput_slow, myoutput_fast)))
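
For this particular pattern (keep a value where a test passes, otherwise use a fallback), np.where does the same job in one call. A minimal sketch, using a seeded random generator so the result is repeatable; the seed and variable names are arbitrary choices for this example.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
c = rng.random(10)  # ten random floats in [0, 1)

# boolean-mask version, as above
masked = np.zeros(10, dtype=float)
mask = c > 0.5
masked[mask] = c[mask]

# np.where(condition, value_if_true, value_if_false)
where_version = np.where(c > 0.5, c, 0.0)

# counting how many elements pass the test also needs no loop
count_above = np.count_nonzero(c > 0.5)

print(np.array_equal(masked, where_version), count_above)
```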

The online NumPy help is actually an extremely readable resource, and is highly recommended to find out more about the family of available NumPy methods.

