import addutils.toc ; addutils.toc.js(ipy_notebook=True)
After learning the basic Python concepts, there are still some skills to learn to start working effectively.
In this Notebook we will see how to manage functions and how to work with imports, namespaces and packages. Then we will see how to read and write external data and how to manage the external environment. Since most of our customers work on Windows-based systems, this notebook is mainly oriented to that OS. Nevertheless, many of the concepts you will find here also apply to Mac or Linux systems.
from addutils import css_notebook
css_notebook()
pwd
'/home/matteo/Projects/tutorials/python-ipython'
If there is a single thing you have to keep in mind, this is it: comment your code! This is particularly important when you start to write structured code involving classes and functions and when you start to collaborate with someone else. As you can see in the following code, there are two types of comments in Python:
# Single line comments are defined by the # sign
'''
Multi-line comments are defined using
three consecutive single quote signs.
'''
But remember:
Local functions are used to avoid code repetition and to keep your code tidy. Have a look at the code in the next cell and notice the following things:
spacing_string
def local(spacing_string, n=5):
    '''Print spacing_string repeated n times
    "spacing_string" must be provided
    "n" can be omitted and gets the default value'''
    # Variables defined inside functions are local
    print(spacing_string*n)
local('-') # n = 5 (default value)
local('*',n=9) # n = 9 (named argument)
----- *********
# Since you wrote a nice description for your function
# you can invoke help with help(local) or alternatively with local?
local?
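The calls above combine a default value with a named argument. As a minimal sketch (the function repeat and all variable names here are illustrative and not part of the notebook), the same mechanics can be seen with a returned value, together with the fact that a name assigned inside a function stays local:

```python
def repeat(spacing_string, n=5):
    '''Return spacing_string repeated n times (n defaults to 5).'''
    result = spacing_string * n   # 'result' is local to the function
    return result

result = 'outer value'            # a different, module-level 'result'
default_call = repeat('-')        # n takes its default value
named_call = repeat('*', n=9)     # n passed as a named argument
all_named = repeat(n=2, spacing_string='ab')  # named arguments in any order

print(default_call, named_call, all_named)
print(result)                     # not modified by the calls above
```

Returning the string instead of printing it makes the function easier to reuse and to test.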
External functions are saved in external files. As an example, you will find in this folder a file named my_module.py. This is the code:
import addutils.my_module as my
%pfile my.my_function
# Check the code below ↓
my_function(name) accepts a tuple made of two strings and calls _my_private_function. Functions whose names begin with '_' are meant to be private and are not supposed to be called from outside. Let's try a call:
import addutils.my_module as my
print(my.my_function(('rick', 'bayes')))
rick [BAYES]
*Try by yourself* the following commands:
my.MODULE_CONSTANT
my.module_variable
my.my_function?
my?
my?
Testing your code with if __name__ == '__main__': To write reliable code, one of the most important things is continuous testing. In Python there is an easy way to test your code every time you modify your functions. When the check __name__ == '__main__' is True, it means that the module is being run as a script (for example from the command line) rather than imported. You can use this check to write your unit-testing code:
if __name__ == '__main__':
    ''' This is a Unit Test: use "run my_module" from Python interpreter'''
    print('This is the testing code:')
    print(my_function(('John', 'Doe')))
Try to call your module from the command line:
%run -m addutils.my_module
This is the testing code: Johnn [DOE]
/home/matteo/anaconda3/envs/addfor_tutorials/lib/python3.6/runpy.py:125: RuntimeWarning: 'addutils.my_module' found in sys.modules after import of package 'addutils', but prior to execution of 'addutils.my_module'; this may result in unpredictable behaviour warn(RuntimeWarning(msg))
Methods whose names start with '_' are meant to be private: this means you aren't supposed to access them from outside the module. This is an example:
def _my_private_function(first_name, second_name):
    ...
    return full_name
If you "import my_module as my" and try to type my.[TAB] you'll see just the public methods and variables.
Actually, Python allows you to call private methods anyway, but we advise you to do so only once you are much more proficient in the language. Try it if you want:
my._my_private_function('John', 'McEnroe')
#my._my_private_function('John', 'McEnroe')
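To see that the leading underscore is a naming convention and not an access restriction, here is a small sketch that builds a throwaway module in memory (the module name demo and both function names are hypothetical):

```python
import types

# Build a throwaway module in memory (hypothetical names, for illustration)
demo = types.ModuleType('demo')
source = '''
def public_function():
    return _private_function() + 1

def _private_function():
    return 41
'''
exec(source, demo.__dict__)

# Names without a leading underscore form the public interface
public_names = [n for n in dir(demo) if not n.startswith('_')]
print(public_names)

# The underscore is only a convention: the call still works
print(demo._private_function())
```

Filtering on the leading underscore is roughly what tab completion does when it hides private names.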
*Try by yourself* some more examples:
# Explore other private methods with: my._ + TAB
import numpy as np
print(my.__doc__)
print(my.my_function.__doc__)
name = ('Graham', 'Chapman')
print(my.my_function(name))
my? # Module documentation: OBJECT INTROSPECTION
my?? # will also show the function's source code if possible
np.*load*? # ? can be also used to search the namespace
my.my_function? # Module function documentation
help(my) # Module Help: notice private functions not listed
#help(my)
Python supports the creation of anonymous functions (i.e. functions that are not bound to a name) at runtime, using a construct called "lambda".
This piece of code shows the difference between a normal function definition f and a lambda function g:
def f(x):
    return x**2
g = lambda x: x**2
print(f(4), g(4))
16 16
Note that the lambda definition does not include a 'return' statement (it always contains an expression which is returned).
Also note that you can put a lambda definition anywhere a function is expected, and you don't have to assign it to a variable at all.
Check the following code to get an idea of the typical usage of lambda functions: here we sanitize a list of strings by mapping a cleaning function over it:
import re
states = [' Alabama ', 'Georgia!', ' ## Georgia', ' ? georgia', 'FlOrIda']
clean = lambda s: re.sub('[!#?]', '', s.strip()).title()  # 's' avoids shadowing the built-in str
for c in map(clean, states):
    print(c)
Alabama Georgia Georgia Georgia Florida
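A lambda is often passed directly as an argument without ever being bound to a name. The sketch below (the list contents are illustrative) uses one as a sort key and one as a filter predicate:

```python
states = ['Florida', 'Alabama', 'Georgia', 'Ohio']

# Sort by string length; ties keep their original order (sorted is stable)
by_length = sorted(states, key=lambda s: len(s))

# Keep only the names starting with a given letter
starts_with_g = list(filter(lambda s: s.startswith('G'), states))

print(by_length)
print(starts_with_g)
```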
In Python it is very easy to work with files. *Try by yourself* this self-explanatory code:
import os.path
path = os.path.join(os.path.curdir, "example_data", "my_input.txt")
ifile = open(path, 'r')
for l in ifile:       # ifile is an iterator
    print(l, end='')  # end='' suppresses the extra newline added by print
ifile.close()
First Second 10 0.32432 20 1.324 21 7.237923 36 .83298932 56 237.327823
# Read all the lines into a list
ifile = open(path, 'r')
lines = ifile.readlines()
print(lines)
ifile.close()
['First Second\n', '10 0.32432\n', '20 1.324\n', '21 7.237923\n', '36 .83298932\n', '56 237.327823\n']
Read a file, format and write back
ifile = open(path) # 'read mode' is default
path_2 = os.path.join(os.path.curdir, "tmp", "my_input2.txt")
ofile = open(path_2, 'w') # Open Output file in 'write mode'
for line in ifile:                  # Read ONE line at a time
    s = line.split()
    try:
        ofile.write('{:g} {:14.3e}\n'.format(float(s[0]), float(s[1])))
        print('{:g} {:14.3e}\n'.format(float(s[0]), float(s[1])), end='')
    except ValueError:              # float() failed: this is the header line
        ofile.write('{0} {1}\n'.format(s[0], s[1]))
        print('{} {}\n'.format(s[0], s[1]))
        # Notice: 'print' automatically adds a newline at the end of the string
ifile.close()
ofile.close()
First Second 10 3.243e-01 20 1.324e+00 21 7.238e+00 36 8.330e-01 56 2.373e+02
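The format specifications used above can be tried in isolation. In this sketch the helper format_line is a hypothetical name: '{:g}' prints a float in its shortest general form, while '{:14.3e}' prints scientific notation with three decimals, right-aligned in a field 14 characters wide.

```python
def format_line(line):
    '''Reformat a two-column text line; non-numeric lines pass through.'''
    fields = line.split()
    try:
        return '{:g} {:14.3e}'.format(float(fields[0]), float(fields[1]))
    except ValueError:
        # float() failed: treat the line as a header
        return '{} {}'.format(fields[0], fields[1])

numeric = format_line('10 0.32432\n')
header = format_line('First Second\n')
print(repr(numeric))
print(repr(header))
```

Catching only ValueError keeps unrelated errors, such as a misspelled variable name, visible instead of silently sending them down the header branch.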
Whenever possible, use the "with" syntax: it closes the file automatically, even when an exception prevents the program flow from reaching the 'close' statements. This is also considered a *"more pythonic"* style.
with open(path) as fid:
    for line in fid:
        s = line.split()
        try:
            print('{:g} {:14.3e}\n'.format(float(s[0]), float(s[1])), end='')
        except ValueError:          # float() failed: this is the header line
            print('{} {}\n'.format(s[0], s[1]))
First Second 10 3.243e-01 20 1.324e+00 21 7.238e+00 36 8.330e-01 56 2.373e+02
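The automatic closing can be checked directly. This sketch works in a temporary folder (the file name demo.txt and the simulated error are assumptions made for the demonstration) and inspects the closed attribute after an exception interrupts the block:

```python
import os
import tempfile

# Work in a temporary folder so no existing file is touched
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, 'demo.txt')

with open(demo_path, 'w') as fid:
    fid.write('First Second\n10 0.32432\n')

try:
    with open(demo_path) as fid:
        first_line = fid.readline()
        raise RuntimeError('simulated failure while reading')
except RuntimeError:
    pass

closed_after_error = fid.closed   # the with block closed the file anyway
print(first_line.strip(), closed_after_error)
```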
Pickling is the most common way to serialize and save to disk almost any type of Python object. Mind that if you need to save complex, structured data and share it, pickle is not the preferred method: consider using a specific file format like hdf5 instead.
A Python pickle file is (and always has been) a byte stream, which means that you should always open a pickle file in binary mode: "wb" to write it, and "rb" to read it. The Python docs contain correct example code.
See also Programming Python for absolute beginners: Chapter 7 Storing Complex Data on stackoverflow.
import pickle # in Python 3 cPickle doesn't exist anymore
ls = ['one', 'two', 'three']
with open('tmp/out_ascii.pkl', 'wb') as f:   # Can choose an arbitrary extension
    pickle.dump(ls, f, 0)                    # dump with protocol '0': readable ASCII
with open('tmp/out_compb.pkl', 'wb') as f:   # Can choose an arbitrary extension
    pickle.dump(ls, f, 2)                    # dump with protocol '2': binary format (not compressed)
with open('tmp/out_compb.pkl', 'rb') as f:
    d2 = pickle.load(f)                      # Protocol is automatically detected
print(d2)
['one', 'two', 'three']
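The same round trip can be done in memory, without touching the disk, using pickle.dumps and pickle.loads. The dictionary below is illustrative data; the point of the sketch is that the serialized form is a bytes object and that loading it yields an equal but distinct object:

```python
import pickle

record = {'name': ('Graham', 'Chapman'), 'values': [10, 0.32432], 'ok': True}

payload = pickle.dumps(record, protocol=pickle.HIGHEST_PROTOCOL)  # bytes
restored = pickle.loads(payload)   # protocol is detected automatically

print(type(payload).__name__, restored == record, restored is record)
```

Only unpickle data from sources you trust: loading a pickle can execute arbitrary code.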
Python gives you extensive possibilities to access your PC's operating system. There are three modules in the Python Standard Library that you must be aware of: sys, os and glob.
Try some example commands by running the following cells
import sys
# Platform identifier
sys.platform
'linux'
# Version number of the Python interpreter
sys.version
'3.6.4 |Anaconda, Inc.| (default, Jan 16 2018, 18:10:19) \n[GCC 7.2.0]'
# sys.path: folders searched for modules (includes PYTHONPATH entries)
for p in sys.path:
    print(p.strip())
/home/matteo/anaconda3/envs/addfor_tutorials/lib/python36.zip /home/matteo/anaconda3/envs/addfor_tutorials/lib/python3.6 /home/matteo/anaconda3/envs/addfor_tutorials/lib/python3.6/lib-dynload /home/matteo/anaconda3/envs/addfor_tutorials/lib/python3.6/site-packages /home/matteo/anaconda3/envs/addfor_tutorials/lib/python3.6/site-packages/IPython/extensions /home/matteo/.ipython
# Shows where the Python files are installed
print(sys.exec_prefix)
/home/matteo/anaconda3/envs/addfor_tutorials
# Information about the float DataType
sys.float_info
sys.float_info(max=1.7976931348623157e+308, max_exp=1024, max_10_exp=308, min=2.2250738585072014e-308, min_exp=-1021, min_10_exp=-307, dig=15, mant_dig=53, epsilon=2.220446049250313e-16, radix=2, rounds=1)
# In Python 2.x the largest plain integer was sys.maxint;
# in Python 3 integers have arbitrary precision (limited only by memory). Example:
int(2**4000)
13182040934309431001038897942365913631840191610932727690928034502417569281128344551079752123172122033140940756480716823038446817694240581281731062452512184038544674444386888956328970642771993930036586552924249514488832183389415832375620009284922608946111038578754077913265440918583125586050431647284603636490823850007826811672468900210689104488089485347192152708820119765006125944858397761874669301278745233504796586994514054435217053803732703240283400815926169348364799472716094576894007243168662568886603065832486830606125017643356469732407252874567217733694824236675323341755681839221954693820456072020253884371226826844858636194212875139566587445390068014747975813971748114770439248826688667129237954128555841874460665729630492658600179338272579110020881228767361200603478973120168893997574353727653998969223092798255701666067972698906236921628764772837915526086464389161570534616956703744840502975279094087587298968423516531626090898389351449020056851221079048966718878943309232071978575639877208621237040940126912767610658141079378758043403611425454744180577150855204937163460902512732551260539639221457005977247266676344018155647509515396711351487546062479444592779055555421362722504575706910949376
# Largest container index: the maximum size lists, strings, dicts can have
sys.maxsize
9223372036854775807
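A short check makes the distinction concrete: Python 3 integers grow as needed, while sys.maxsize only bounds the size of containers. A minimal sketch:

```python
import sys

big = 2**4000
print(big > sys.maxsize)       # far beyond the container-size limit
print(big.bit_length())        # number of bits needed to represent it
print(type(big).__name__)      # still a plain int
```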
Try some example commands by running the following cells
import os
for counter, osvariable in enumerate(os.environ):
    if counter >= 10:
        print('AND MORE ...')
        break
    print('{:>25s}: {:s}'.format(osvariable, os.environ[osvariable][:64]))
else:
    print('============ No more OS Variables ============')
XDG_VTNR: 7 LC_PAPER: it_IT.UTF-8 LC_ADDRESS: it_IT.UTF-8 XDG_SESSION_ID: c2 XDG_GREETER_DATA_DIR: /var/lib/lightdm-data/matteo LC_MONETARY: it_IT.UTF-8 CLUTTER_IM_MODULE: xim SESSION: ubuntu GPG_AGENT_INFO: /home/matteo/.gnupg/S.gpg-agent:0:1 TERM: xterm-color AND MORE ...
# How to check an environment variable (NUMBER_OF_PROCESSORS is set on Windows):
if 'NUMBER_OF_PROCESSORS' in os.environ:
    print('Number of processors in this machine:', os.environ['NUMBER_OF_PROCESSORS'])
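An alternative to testing membership first is os.environ.get, which returns a default when the variable is missing. The variable name below is hypothetical and chosen so that it is almost certainly undefined. For the processor count specifically, os.cpu_count() does not depend on any environment variable:

```python
import os

# Hypothetical variable name, chosen so that it is almost certainly undefined
missing = os.environ.get('DEMO_UNDEFINED_VARIABLE_12345', 'not set')
print(missing)

# Portable way to ask for the processor count
n_cpus = os.cpu_count()   # may be None if it cannot be determined
print('Number of processors in this machine:', n_cpus)
```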
# Working directory
print(os.getcwd())
/home/matteo/Projects/tutorials/python-ipython
# List the files in the current directory
for filename in sorted(os.listdir(os.getcwd())):
    print(filename)
.ipynb_checkpoints example_data index.ipynb py01v04_ipython_notebook_introduction.ipynb py02v04_python_basics.ipynb py03v04_python_getting_started.ipynb py04v04_python_style_guide.ipynb py05v04_python_more_examples.ipynb py06v04_python_object_oriented.ipynb py07v04_python_speed-up_with_C.ipynb py08v04_Unicode.ipynb py09v04_python_regular_expressions.ipynb py10v04_ipython_notebook_widgets.ipynb tmp utilities
Difference between os.name and sys.platform:
sys.platform will distinguish between Linux, other Unixes, and OS X
os.name is "posix" for all of them
os.name
'posix'
# Correctly handle paths and filenames
# In Python 3, os.name can be 'posix', 'nt' or 'java'
if os.name == 'posix':
    full_path = "/Users/dani/myfile.py"
elif os.name == 'nt':
    full_path = 'C:\\myfile.py'
print(os.path.splitdrive(full_path))
print(os.path.split(full_path))
('', '/Users/dani/myfile.py') ('/Users/dani', 'myfile.py')
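os.path follows the rules of the OS you are running on. To compare both conventions on a single machine, the standard library also exposes them explicitly as the ntpath and posixpath modules. This sketch reuses the two example paths from the cell above:

```python
import ntpath      # Windows path rules, importable on any OS
import posixpath   # POSIX path rules, importable on any OS

win_drive = ntpath.splitdrive('C:\\myfile.py')
posix_drive = posixpath.splitdrive('/Users/dani/myfile.py')
posix_parts = posixpath.split('/Users/dani/myfile.py')
stem, extension = posixpath.splitext('myfile.py')

print(win_drive)
print(posix_drive)
print(posix_parts)
print(stem, extension)
```

In everyday code, os.path.join and the other os.path functions are usually enough, because they pick the right rules automatically.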
if os.name == 'posix':
    os.system('ls')
else:
    os.system('dir')
Try some example commands by running the following cells
import glob
print(glob.glob('*.txt'))
[]
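The empty list above simply means that the working directory contains no .txt files. To see a pattern actually match, this sketch creates a few files in a temporary folder (the file names are illustrative) and globs inside it:

```python
import glob
import os
import tempfile

# Create a temporary folder with two .txt files and one .csv file
tmp_dir = tempfile.mkdtemp()
for filename in ('a.txt', 'b.txt', 'c.csv'):
    with open(os.path.join(tmp_dir, filename), 'w') as fid:
        fid.write('demo\n')

# glob returns full paths in arbitrary order, so strip the folder and sort
matches = glob.glob(os.path.join(tmp_dir, '*.txt'))
txt_files = sorted(os.path.basename(m) for m in matches)
print(txt_files)
```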
Visit www.add-for.com for more tutorials and updates.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.