Note
This page was generated from a Jupyter notebook.
A super-brief intro to Python and NumPy¶
Python is:
* interpreted (high level)
* readable
* concise
* cross-platform
* dynamically typed
* object oriented
* automatically memory-managed
Almost all of the below is explained much more fully at various places online. For a nice entry level tutorial set, try the Software Carpentry intros: http://swcarpentry.github.io/python-novice-inflammation/
The main Python documentation is also an extremely readable source of knowledge. Just Google!
PROGRAM FILES AND INTERACTIVE ENVIRONMENTS¶
Put Python code in .py text files (AKA “scripts”). Run these from a shell, as:
python myscript.py
OR
Use one of Python’s interactive environments (e.g., IPython)
ipython
In an interactive environment:
Run code line-by-line, preserving variables
Run your scripts using the magic command %run (which preserves the variables)
This Jupyter notebook is an interactive environment.
MODULES¶
Python has some built-in functions, but everything else comes in a library module.
See the built-in functions here: https://docs.python.org/3/library/functions.html
Modules are imported, and the functions they hold are then called with dot syntax:
[ ]:
import math # comments with a hash
x = math.cos(2.0 * math.pi)
print(x) # print is a built-in function
OR import the functions and properties individually:
[ ]:
from numpy import cos, pi # numpy, numeric python, also has these functions
x = cos(2.0 * pi)
print(x)
Get help in an interactive shell with a leading or trailing ? (for example, ?pi or pi?), and quit it with q
[ ]:
?pi
TYPES¶
Python distinguishes:
* integer (int), e.g., 3
* float (float), e.g., 1.0 or 2.5
* boolean (bool), e.g., True
* complex (complex), e.g., 3.2 + 2.4j
* string (str), e.g., 'Hello world!'
You may also encounter NumPy types, like numpy.float64
[ ]:
type(pi)
Typecasting (switching between types on-the-fly) is automatic in some cases, such as an int being promoted to a float in mixed arithmetic, but repeated conversions can be time-expensive if you want efficient code.
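As a minimal sketch of where conversion is and is not automatic: Python promotes an int to a float in mixed arithmetic, but it will not silently turn a string into a number, so that case needs an explicit call such as int() or float(). The variable names below are illustrative only.

```python
# Implicit promotion: an int combined with a float gives a float
total = 3 + 1.5
print(type(total))

# Explicit conversion between types
n = int("42")        # str -> int
n_float = float(n)   # int -> float
as_text = str(n_float)  # float -> str
print(n, n_float, as_text)

# A str and an int are NOT converted automatically
try:
    "3" + 4
except TypeError:
    print("A TypeError occurred: convert explicitly, e.g. int('3') + 4")
```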
Python’s inbuilt data structures are:
* tuples, with parentheses: immutable (you can’t change values once they exist)
* lists, with square brackets: mutable (you can change values once they exist)
* sets, with set() or curly brackets: unordered collections with no duplicates
* dictionaries, with curly brackets: associated pairs of key:value
Note that all these data structures let you happily mix data types… But the cost is that Python runs more slowly than, e.g., C++.
[ ]:
mytuple = (0, 1, 2, 3)
print(f"You can index: {mytuple[1]}")  # an f-string replaces {} with the value of the expression inside it
Tuples are immutable, meaning that you cannot reassign values to them. Python will raise a TypeError if you try to do so. We can test this using a try...except block: if a TypeError occurs in the assignment statement below, we should see the printed message:
[ ]:
try:
    mytuple[0] = 100
except TypeError:
    print("A TypeError occurred")
… and indeed we do.
[ ]:
mylist = [0, 1, 2, 3]
print("This time reassignment works:")
mylist[0] = "I can store a string!"
print(mylist)
[ ]:
myset = {0, 1, 2, 3}
print(myset)
[ ]:
myset.add("string!") # you can use both ' and " to declare a string
print(f"Adding is easy: {myset}")
[ ]:
myset.add(0)
print(f"But remember, duplicates don't count! {myset}")
Almost anything can be a value, and any hashable object (such as a string, a number, or a tuple) can be a key:
[ ]:
mydict = {"firstkey": 1, "secondkey": 2, 3: "three"}
print(mydict)
[ ]:
print(mydict["firstkey"])
print(mydict["secondkey"])
print(mydict[3])
[ ]:
print(f"Get the keys: {mydict.keys()}")  # dictionaries keep insertion order in Python 3.7+
print(f"Get the values: {mydict.values()}")
[ ]:
try:
    print("The next line should generate a KeyError...")
    print(f" {mydict[2]}")
except KeyError:
    print("...and indeed it did.")
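If a missing key is an expected situation rather than an error, a lookup can avoid the exception altogether. The sketch below rebuilds the same dictionary and uses the `in` operator and the `dict.get` method; the default string passed to `get` is just an illustrative choice.

```python
mydict = {"firstkey": 1, "secondkey": 2, 3: "three"}

# Test for membership before indexing
has_two = 2 in mydict
print(has_two)

# .get() returns a default (None unless you supply one) instead of raising KeyError
value = mydict.get(2, "no such key")
print(value)

# Loop over keys and values together
for key, val in mydict.items():
    print(f"{key!r} -> {val!r}")
```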
INDEXING¶
Indexing starts from 0
Index with square brackets [start : stop : step]
“stop” is exclusive of the named index
Colon alone means “all of these” or “to the start/end”
[ ]:
x = list(range(10))
print(f"x[3] gives {x[3]}")
print(f"x[1:5:2] gives {x[1:5:2]}")
[ ]:
print(f"x[8:] gives {x[8:]}")
print(f"x[:7] gives {x[:7]}")
print(f"x[::3] gives {x[::3]}")
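Negative indices and negative steps follow the same [start : stop : step] rule and are worth knowing. A short sketch, using the same list of ten integers:

```python
x = list(range(10))

last = x[-1]          # negative indices count back from the end
last_three = x[-3:]   # the final three items
reversed_x = x[::-1]  # a negative step walks backwards through the list

print(f"x[-1] gives {last}")
print(f"x[-3:] gives {last_three}")
print(f"x[::-1] gives {reversed_x}")
```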
PYTHON IS LIKE, BUT ISN’T, MATLAB¶
This is a power:
[ ]:
x = 10.0**2 # …or…
import numpy as np
x = np.square(10.0) # NEVER 10.^2.
Likewise, it’s also useful to know about the “floor division” (//) and “remainder” (%) operators:
[ ]:
print(f"floor division: {(13 // 4)}")
print(f"remainder: {(13 % 4)}")
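One detail that matters with negative numbers: // rounds towards negative infinity rather than towards zero, and % takes the sign of the divisor. The sketch below contrasts this with int(), which does truncate towards zero.

```python
print(13 // 4, 13 % 4)      # 3 1
print(-13 // 4, -13 % 4)    # -4 3: // rounds towards negative infinity
print(int(-13 / 4))         # -3: int() truncates towards zero

# divmod returns both results at once; quotient * divisor + remainder
# always recovers the original number
quotient, remainder = divmod(-13, 4)
print(quotient * 4 + remainder)  # -13
```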
End indices are NOT inclusive
[ ]:
len(range(0, 100)) # in Matlab this would be 101
[ ]:
[x for x in range(5)]  # this is called a "list comprehension", and is a readable way to make a list
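Comprehensions can also transform and filter items, and the same syntax works for dictionaries. A brief sketch with illustrative variable names:

```python
squares = [n**2 for n in range(5)]            # transform each item
evens = [n for n in range(10) if n % 2 == 0]  # keep only items passing a test
square_of = {n: n**2 for n in range(3)}       # dictionary comprehension

print(squares)
print(evens)
print(square_of)
```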
Assignment in Python binds a name to an existing object rather than copying it. In other words, if you set two names equal and then later modify a mutable object (such as a list) through the first one, the second one will also show the change (and vice versa):
[ ]:
x = [0] * 3
y = [1, 2, 3]
print(f"x starts as {x}")
print(f"y starts as {y}")
[ ]:
x = y
print(f"After setting equal, x is {x}")
[ ]:
y[1] = 100
print(f"After modifying y, x is {x}")
[ ]:
# one way to stop this automatic behaviour is by forcing a copy with [:]
x = y[:]
print(x)
print(y)
[ ]:
y[1] = 1000000
print(f"After forcing a copy, x is still {x} but y is now {y}")
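To check whether two names refer to the same object, use the `is` operator. The sketch below also shows a caveat of the [:] copy: it is shallow, so lists nested inside the copied list are still shared. The standard-library function `copy.deepcopy` is one way around that; the variable names are illustrative.

```python
import copy

y = [1, 2, 3]
alias = y        # a second name for the same list
shallow = y[:]   # a new list (y.copy() and list(y) do the same)
print(alias is y)    # True
print(shallow is y)  # False

# A slice copy is shallow: the inner lists are still shared
nested = [[1, 2], [3, 4]]
shallow_nested = nested[:]
deep_nested = copy.deepcopy(nested)
nested[0][0] = 100
print(shallow_nested[0][0])  # 100
print(deep_nested[0][0])     # 1
```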
In Matlab, assigning a value to a variable triggers output unless you suppress it with a semicolon at the end of the line; this isn’t necessary in Python:
[ ]:
x = range(10) # …see?
Python doesn’t use brackets to delineate code blocks. It uses indentation with a consistent number of spaces (normally 4). This applies to for loops, while loops, if statements, try/except statements, class declarations, function declarations, etc.
[ ]:
def myfunction(arg1, arg2, **kwds):
    # **kwds is a special (optional) dictionary input type,
    # that you can use as an input "wildcard"
    try:
        print_this = kwds["printme"]
    except KeyError:
        x = arg1 * arg2
        return x  # ...no brackets needed; both lines are indented 4 spaces past "except"
    else:
        print(print_this)
[ ]:
print("first time:")
myfunction(3.0, 4.0)
[ ]:
print("second time…")
myfunction(5, 6, printme="Printed this time!")
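The same behaviour can be written without try/except by using `dict.get` on the keyword dictionary. This is only a sketch of an alternative style, and the name myfunction_get is hypothetical.

```python
def myfunction_get(arg1, arg2, **kwds):
    # kwds collects any extra keyword arguments into a dictionary
    print_this = kwds.get("printme")  # None if the keyword was not supplied
    if print_this is None:
        return arg1 * arg2
    print(print_this)


result = myfunction_get(3.0, 4.0)
print(result)
nothing = myfunction_get(5, 6, printme="Printed this time!")
```

Note that a function that finishes without reaching a return statement returns None, which is what the second call gives back.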
Python’s plotting is a blatant clone of Matlab’s, and lives in the library matplotlib.pyplot:
[ ]:
%matplotlib inline
# that command tells this notebook to put plots into the notebook
import matplotlib.pyplot as plt
import numpy as np
x = np.arange(10) # like range(), but produces a numpy array
y = np.random.rand(10) # ten random floats, 0->1
plt.plot(x, y, "*--")
plt.xlabel("xaxis")
plt.ylabel("yaxis")
plt.title("my plot!")
NumPy and Landlab¶
Landlab makes extensive use of the NumPy (Numerical Python) library. It allows significant acceleration of standard Python operations on matrix-like data arrays. Here we look at some of the key features and differences from pure Python, along with some NumPy best practice.
[ ]:
import numpy as np
Initialize NumPy arrays from standard Python iterables (lists, tuples):
[ ]:
myarray = np.array([0, 1, 3, 6, 18])
…or with one of the many standard array creation methods in NumPy. Particularly useful ones are:
[ ]:
a = np.zeros(10, dtype=int)
print(f"a: {a}")
[ ]:
b = np.ones(5, dtype=bool)
print(f"b: {b}")
[ ]:
c = np.random.rand(10)
print(f"c: {c}")
[ ]:
d = np.arange(5.0)
print(f"d: {d}")
[ ]:
e = np.empty((3, 3), dtype=float)
e.fill(100.0)
print(f"e: {e}")
Arrays also have some built-in methods and properties. We see ‘fill’ above, but also noteworthy are:
[ ]:
print(f"e has shape: {(e.shape)}")
print(f"e has size: {e.size} ") # preferred to len() when working with arrays
c.max(), c.min(), c.mean(), c.sum()
[ ]:
f = c.copy()
print(f"flatten: {e.flatten()}")
Slicing works like (better than?) in pure Python:
[ ]:
print(d[2:])
[ ]:
e[1, 1] = 5.0
print(e)
[ ]:
print(e[1:, 1:])
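One difference from pure Python is easy to miss: slicing a list with [:] makes a copy, whereas a basic slice of a NumPy array is a view onto the same data. The sketch below rebuilds a 3 x 3 array with `np.full` (used here only for brevity) and shows that writing to a slice changes the original unless you call `.copy()`.

```python
import numpy as np

e = np.full((3, 3), 100.0)
corner = e[1:, 1:]     # a view onto e, not a copy
corner[0, 0] = -1.0
print(e[1, 1])         # -1.0: the original array changed too

independent = e[1:, 1:].copy()  # force an independent copy
independent[0, 0] = 42.0
print(e[1, 1])         # still -1.0

print(np.shares_memory(e, corner), np.shares_memory(e, independent))
```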
Note that logical operations with NumPy tend to require NumPy-native functions, rather than pure Python and, or, not, etc.
[ ]:
bool1 = np.array([True, True, False, False])
bool2 = np.array([True, False, True, False])
print(f"AND: {np.logical_and(bool1, bool2)}")
print(f"OR: {np.logical_or(bool1, bool2)}")
print(f"NOT: {np.logical_not(bool1)}")
[ ]:
print(f"ANY: {np.any(bool1)}")
print(f"ALL: {np.all(bool1)}")
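For boolean arrays, the operators &, | and ~ are element-wise shorthands for the functions above, while the Python keywords raise an error because they try to reduce the whole array to a single truth value. A short sketch; note that comparisons combined with & need parentheses, since & binds more tightly than > or <.

```python
import numpy as np

bool1 = np.array([True, True, False, False])
bool2 = np.array([True, False, True, False])

print(f"AND: {bool1 & bool2}")
print(f"OR: {bool1 | bool2}")
print(f"NOT: {~bool1}")

# The keyword "and" asks for the truth value of each whole array
try:
    bool1 and bool2
except ValueError:
    print("A ValueError occurred: 'and' is ambiguous for arrays")

values = np.arange(6)
in_range = (values > 1) & (values < 5)  # parentheses are required
print(values[in_range])
```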
Now, let’s demonstrate the speed of NumPy over pure Python:
[ ]:
f_list = range(1000)
f_array = np.arange(1000, dtype=int)
def addone_py(list_in):
    for i in list_in:
        i += 1
def addone_np(array_in):
    array_in += 1
[ ]:
print("time for list:")
%timeit addone_py(f_list) # a magic command for timing things
[ ]:
print("time for array:")
%timeit addone_np(f_array)
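The %timeit magic only exists in IPython and Jupyter. In a plain script, a similar comparison can be sketched with the standard-library `timeit` module. The function names below are hypothetical, and the measured times will vary from machine to machine.

```python
import timeit

import numpy as np

values_list = list(range(1000))
values_array = np.arange(1000)


def addone_list(list_in):
    # pure Python: visit every item in turn
    return [i + 1 for i in list_in]


def addone_array(array_in):
    # NumPy: one vectorized operation on the whole array
    return array_in + 1


time_list = timeit.timeit(lambda: addone_list(values_list), number=1000)
time_array = timeit.timeit(lambda: addone_array(values_array), number=1000)
print(f"list: {time_list:.4f} s, array: {time_array:.4f} s for 1000 calls each")
```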
In particular, avoid looping over array elements to do a logical test; use a boolean mask instead:
[ ]:
# NOT THIS:
myoutput_slow = np.zeros(10, dtype=float)
for i in range(len(c)):  # c is our random number array
    if c[i] > 0.5:
        myoutput_slow[i] = c[i]
# DO THIS INSTEAD:
myoutput_fast = np.zeros(10, dtype=float)
greater_than_half = c > 0.5
myoutput_fast[greater_than_half] = c[greater_than_half]
print(np.all(np.equal(myoutput_slow, myoutput_fast)))
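The same "keep values above a threshold, otherwise zero" pattern can also be written in one line with `np.where`. The sketch below is self-contained and builds its own random array; the seed is an assumption made only so that the example is repeatable.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only for repeatability
c = rng.random(10)              # ten random floats in [0, 1)

# Boolean-mask assignment, as in the cell above
masked = np.zeros(10, dtype=float)
masked[c > 0.5] = c[c > 0.5]

# np.where(condition, value_if_true, value_if_false) builds the result directly
with_where = np.where(c > 0.5, c, 0.0)

print(np.array_equal(masked, with_where))
```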
The online NumPy help is actually an extremely readable resource, and is highly recommended to find out more about the family of available NumPy methods.