Python Manual

 2020/11/29 

multi-line statements:

total = item_one + \
        item_two + \
        item_three
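
As an alternative to the backslash, an expression wrapped in parentheses can span several lines. This is a minimal sketch; the three values are placeholders chosen only for illustration.

```python
item_one, item_two, item_three = 1, 2, 3  # placeholder values

# Inside parentheses, line breaks need no backslash
total = (
    item_one
    + item_two
    + item_three
)
print(total)
```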

List

Find the median of a list: statistics.median(lst) (requires import statistics)
Union of several lists with duplicates removed: res = list(set().union(lst1, lst2, lst3, ...)). The order of the result is not guaranteed.
Concatenate two lists: lst1 + lst2
Hash a list: a list is mutable and therefore not hashable. To hash its contents, convert it to a tuple first: hash(tuple([1, 2, 3])).
Take every x-th element to form a new list: lst[::x]. The third slice argument is the step, just as in range.
Generate a range of floats: np.arange(start, stop, step) or np.linspace(start, stop, num_wanted) (requires import numpy as np). np.linspace includes stop by default, while np.arange excludes it.
Unzip a zipped list: list(zip(*test_list)). The * operator unpacks the iterable into separate elements, which are then passed as arguments to zip().
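
The list recipes above can be tried together in one short script. This is only a sketch; the sample lists and variable names are made up for illustration.

```python
import statistics

lst1, lst2, lst3 = [1, 2, 3], [3, 4], [4, 5]

median = statistics.median([5, 1, 3])          # middle value after sorting
union = sorted(set().union(lst1, lst2, lst3))  # sorted() makes the order deterministic
every_second = [10, 11, 12, 13, 14][::2]       # slice with a step of 2
pairs = [(1, "a"), (2, "b"), (3, "c")]
numbers, letters = zip(*pairs)                 # "unzip" into two tuples
key = hash(tuple(lst1))                        # a tuple is hashable, a list is not
```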

Dict

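Sort a dictionary: dct = dict(sorted(dct.items(), key=lambda item: item[0])) sorts by keys; change item[0] to item[1] to sort by values. This relies on dicts preserving insertion order (Python 3.7+).
Remove an item from a dict by key: dct.pop(your_key) removes the key and returns its value. It raises KeyError if the key is missing unless you pass a default, as in dct.pop(your_key, None).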
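
A small sketch of both dict recipes, using a made-up dictionary:

```python
dct = {"b": 2, "c": 1, "a": 3}

by_key = dict(sorted(dct.items(), key=lambda item: item[0]))    # sort by keys
by_value = dict(sorted(dct.items(), key=lambda item: item[1]))  # sort by values

removed = dct.pop("b")          # returns the removed value
missing = dct.pop("zzz", None)  # the default avoids a KeyError
```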

Function

Access and change a global variable inside a function: you can read a global variable inside a function without any extra keyword. However, if you want to assign to it inside the function, you have to declare it with the global keyword.

With the global keyword, you can either create a new global variable from inside a function or bind to a global variable that already exists.
x = "h"

def myfunc():
    global x
    x = "fantastic"

myfunc()

Change Variable in an Outer Scope: Similar to the global keyword, we have a nonlocal keyword for this purpose.

def foo():
    a = 1
    def bar():
        nonlocal a
        a = 2
    bar()
    print(a)  # Output: 2

Variable number of arguments to a function:

def foo(a, b, c, *others):
    print(a, b, c)
    print("And all the rest are ", list(others))
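
The same idea extends to keyword arguments through **. The sketch below uses a hypothetical function, describe_args, that returns its arguments instead of printing them, so the result is easy to inspect.

```python
def describe_args(a, b, c, *others, **options):
    """Return the fixed arguments, the extra positionals, and the keywords."""
    return (a, b, c), list(others), options

# 4 and 5 are collected into others, verbose=True into options
fixed, rest, opts = describe_args(1, 2, 3, 4, 5, verbose=True)
```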

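Import a module by name at runtime: this works like import, but the module name can be a variable instead of a static identifier. The built-in __import__ does the job; the standard library's importlib.import_module is the documented, preferred way to do the same thing.
package_name = "numpy"
package = __import__(package_name)
arr = package.array([1, 2, 3])  # same as numpy.array([1, 2, 3])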
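
A minimal sketch of the same technique with importlib.import_module, using the standard-library math module so it runs without third-party packages. The variable names are illustrative only.

```python
import importlib

module_name = "math"  # could come from a config file or user input
module = importlib.import_module(module_name)
root = module.sqrt(16.0)

# For dotted names, import_module returns the submodule itself
path_module = importlib.import_module("os.path")
```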

Class

Print a class instance like Java's toString:

class Test:
    def __repr__(self):  # shown at the interactive prompt and by repr(obj)
        return "Test()"
    def __str__(self):  # shown by print(obj) and str(obj)
        return "member of Test"

Self-defined comparator:

class CustomNumber:
    def __init__(self, value):
        self.value = value

    def __lt__(self, obj):
        """self < obj"""
        return self.value < obj.value

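    def __le__(self, obj): return self.value <= obj.value  # self <= obj
    def __eq__(self, obj): return self.value == obj.value  # self == obj
    def __ne__(self, obj): return self.value != obj.value  # self != obj
    def __gt__(self, obj): return self.value > obj.value   # self > obj
    def __ge__(self, obj): return self.value >= obj.value  # self >= obj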
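
If writing all six methods feels repetitive, functools.total_ordering can derive the missing ones from __eq__ and one ordering method. The sketch below uses a hypothetical Version class, not the class from the notes above.

```python
from functools import total_ordering

@total_ordering
class Version:
    """Hypothetical example class ordered by a single integer."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return self.value == other.value

    def __lt__(self, other):
        return self.value < other.value

# sorted() only needs __lt__; <=, > and >= are filled in by the decorator
versions = sorted([Version(3), Version(1), Version(2)])
ordered_values = [v.value for v in versions]
```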

hash on a custom object:

class Emp:
    def __init__(self, emp_name, id):
        self.emp_name = emp_name
        self.id = id
 
    def __hash__(self):
        # when you want to get the hash, use hash(instance_of_custom_object)
        return hash((self.emp_name, self.id))
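
For sets and dict keys, __hash__ is normally paired with __eq__, so that two objects with equal fields count as the same element. A minimal sketch follows; the field name emp_id and the sample data are assumptions made for this example.

```python
class Emp:
    def __init__(self, emp_name, emp_id):
        self.emp_name = emp_name
        self.emp_id = emp_id

    def __eq__(self, other):
        if not isinstance(other, Emp):
            return NotImplemented
        return (self.emp_name, self.emp_id) == (other.emp_name, other.emp_id)

    def __hash__(self):
        # equal objects must produce equal hashes
        return hash((self.emp_name, self.emp_id))

# the duplicate ("Ada", 1) collapses into one set element
staff = {Emp("Ada", 1), Emp("Ada", 1), Emp("Bob", 2)}
```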

Class attributes vs. instance attributes:
- Names assigned in the class body, outside the __init__ method, are static elements; they belong to the class and are shared by all instances.
- Names assigned on self inside the __init__ method belong to the individual object; they don't belong to the class.
class MyClass:
    static_elem = 123  # static, shared by the class
    def __init__(self):
        self.object_elem = 456  # specific to each instance
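
The difference shows up as soon as two instances exist. This sketch uses a hypothetical Counter class to illustrate it.

```python
class Counter:
    total = 0              # class attribute, shared by every instance

    def __init__(self):
        self.count = 0     # instance attribute, one per object

    def hit(self):
        self.count += 1
        Counter.total += 1  # assign through the class to update the shared value

a, b = Counter(), Counter()
a.hit()
a.hit()
b.hit()
```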

Exception

Custom exception:

class MyCustomError(Exception):
    def __init__(self, *args):
        if args:
            self.message = args[0]
        else:
            self.message = None

    def __str__(self):
        if self.message:
            return 'MyCustomError, {0} '.format(self.message)
        else:
            return 'MyCustomError has been raised'

try/except clause in Python:

try:
    print(x)
except NameError:  # catch the specific exception rather than using a bare except
    print("Exception thrown. x does not exist.")
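
Custom exceptions and try/except are usually used together, along with the optional else and finally branches. In this sketch, ValidationError and parse_age are hypothetical names introduced only for the example.

```python
class ValidationError(Exception):
    """Hypothetical custom exception used only for this example."""

def parse_age(text):
    try:
        age = int(text)
    except ValueError as exc:
        # re-raise as a domain-specific error, keeping the original cause
        raise ValidationError(f"not a number: {text!r}") from exc
    if age < 0:
        raise ValidationError("age must be non-negative")
    return age

log = []
try:
    parse_age("abc")
except ValidationError as err:
    log.append(f"caught: {err}")
else:
    log.append("no error")  # runs only if nothing was raised
finally:
    log.append("done")      # always runs
```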

String

Convert a string to an int: int(s)
Remove leading and trailing whitespace: my_string.strip()
Concatenate a list of strings: "".join(str_lst)
Join with a separator: ",".join(str_lst)
Advanced split with re: re.split("split_on_what_in_regex", s) (requires import re)
Extract only the letters from a string: "".join(re.findall("[a-zA-Z]+", s))
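
A short sketch that runs the string recipes above on one made-up input string:

```python
import re

raw = "  a1, b22;c333  "

stripped = raw.strip()                           # drop surrounding whitespace
parts = re.split(r"[,;]\s*", stripped)           # split on comma or semicolon
letters = "".join(re.findall("[a-zA-Z]+", raw))  # keep only the letters
joined = ",".join(parts)                         # join with a separator
number = int("42")
```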

Convert a number into a list of its digits, either as integers or as characters:

num = 2019

# If you want a list of integers
res = [int(x) for x in str(num)]
# If you are good with a list of characters
res = list(str(num))

Format a float in scientific notation: print("a = %.2e" % num)

String formatting in general: f-strings have been available since Python 3.6 and are the recommended convention for string formatting, e.g. f"iter: {i}".

Sign options in a format specification (useful for aligning signed numbers):

'+' puts a sign on both positive and negative numbers.
'-' puts a sign on negative numbers only (this is the default behavior).
' ' puts a leading space on positive numbers and a minus sign on negative numbers (the most commonly used option).

# scientific notation with an f-string
f'{num:.5e}'
# float; the space flag reserves a column for the sign so positive and negative numbers line up
f'{num: .3f}'
# pad an integer to a fixed minimum width
f'{num:3d}'
# expressions inside the braces are evaluated, including function and method calls;
# before Python 3.12 the inner quotes must differ from the outer ones
f"{'Eric Idle'.lower()} is funny."
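
To see what these format specifications produce, the following sketch applies them to a sample value chosen for illustration:

```python
num = 3.14159

sci = f"{num:.2e}"           # scientific notation, 2 decimals
spaced_pos = f"{num: .3f}"   # leading space in place of a plus sign
spaced_neg = f"{-num: .3f}"  # minus sign occupies the same column
signed = f"{num:+.1f}"       # explicit plus sign
padded = f"{42:5d}"          # right-aligned in a field of width 5
lowered = f"{'Eric Idle'.lower()} is funny."
```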

Data Structures

Queue: Python uses put and get rather than enqueue and dequeue.
import queue
q = queue.Queue()
q.put("item")  # enqueue
v = q.get()    # dequeue; blocks until an item is available

Priority Queue:

from queue import PriorityQueue
q = PriorityQueue()
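
A common way to use PriorityQueue is to put (priority, item) tuples, so that get returns the entry with the smallest priority first. The tasks below are made up for illustration.

```python
from queue import PriorityQueue

pq = PriorityQueue()
for priority, task in [(3, "write report"), (1, "fix bug"), (2, "review PR")]:
    pq.put((priority, task))  # tuples compare by their first element

order = []
while not pq.empty():
    priority, task = pq.get()  # smallest priority comes out first
    order.append(task)
```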

Functional Programming

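Basic usage of filter: lst = list(filter(func, lst)). For a dict, filter its items: dct = dict(filter(func, dct.items())), where func receives a (key, value) tuple.
Usage of map() in Python 3: map(func, lst) returns a lazy iterator; wrap it in list(...) to get a list.
reduce(): similar to fold_left: reduce(lambda acc, x: ..., lst, init) (requires from functools import reduce).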
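
The three functions side by side, as a minimal sketch with made-up data:

```python
from functools import reduce

nums = [1, 2, 3, 4, 5]
prices = {"apple": 3, "pear": 7, "plum": 5}

evens = list(filter(lambda x: x % 2 == 0, nums))
squares = list(map(lambda x: x * x, nums))
total = reduce(lambda acc, x: acc + x, nums, 0)  # fold from the left, starting at 0

# filtering a dict: the function receives (key, value) tuples
cheap = dict(filter(lambda kv: kv[1] < 6, prices.items()))
```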

IO

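Read a multi-line file: lines = f.readlines()

import os

file_path = os.path.join(os.getcwd(), file_name)  # portable across operating systems
f = open(file_path, 'r', encoding='utf-8')
lines = f.readlines()  # list of lines, each keeping its trailing newline
f.close()

f = open(file_path, 'w', encoding='utf-8')  # 'w' truncates the file
f.write(string)
f.writelines(lst_of_strs)  # does not add newlines between the strings
f.close()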
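
The with statement closes the file automatically, even when an error occurs, so the explicit close() calls can be dropped. The sketch below writes to a temporary directory so it leaves nothing behind; the file name is arbitrary.

```python
import os
import tempfile

lines_to_write = ["first line\n", "second line\n"]

with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = os.path.join(tmp_dir, "notes.txt")

    with open(file_path, "w", encoding="utf-8") as f:
        f.writelines(lines_to_write)

    with open(file_path, "r", encoding="utf-8") as f:
        lines_read = f.readlines()
```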

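Several ways to write a CSV file
Summary of several ways to write a CSV file
List the names of all entries (files and folders) in a directory:
filenames = os.listdir(path)
Get the parent directory of the current working directory: os.path.dirname(os.getcwd()) returns its full path; wrap it in os.path.basename(...) to get only the name.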
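
One possible way to write a CSV file is the standard-library csv module. The sketch below is an assumption about what is wanted, not a summary of the linked articles. It also shows how to keep only the folders returned by os.listdir. All file and folder names are made up.

```python
import csv
import os
import tempfile

rows = [["name", "score"], ["Ada", 95], ["Bob", 87]]

with tempfile.TemporaryDirectory() as tmp_dir:
    os.mkdir(os.path.join(tmp_dir, "subfolder"))
    csv_path = os.path.join(tmp_dir, "scores.csv")

    # newline="" lets the csv module control line endings itself
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)

    # values come back as strings when read
    with open(csv_path, newline="", encoding="utf-8") as f:
        rows_read = list(csv.reader(f))

    entries = sorted(os.listdir(tmp_dir))  # files and folders
    folders = [e for e in entries if os.path.isdir(os.path.join(tmp_dir, e))]
```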

Profiling

Creating Profiling Data

import cProfile
profiler = cProfile.Profile()
profiler.enable()
# Code goes here 
profiler.disable()
profiler.dump_stats("execution.stats")

Inspecting Profiling Data

import pstats
stats = pstats.Stats("execution.stats")  # the file written by dump_stats above
stats.print_stats()

The following columns will be shown:

ncalls: number of times function was called
tottime: amount of time spent in the function (not counting any time spent in subfunctions)
percall: tottime / ncalls
cumtime: all the time spent in the function and subfunctions
percall: cumtime / ncalls
filename:lineno(function): name of function that was called and where it is defined

A common practice is to sort the output by one of the columns above, or to inspect a function's callees to see where it spends its time. You can also go the other way and look up a function's callers, which helps when a function takes a lot of time but you do not know who is calling it.

stats.sort_stats("cumtime").print_stats(2)  # print the 2 functions with the highest cumulative time
stats.print_callers("cprofile_example.py:7")  # restrict to entries matching cprofile_example.py, line 7
stats.print_callees("cprofile_example.py:3")
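
The two steps can also be combined in memory, without writing a stats file. In this sketch, slow_sum is a hypothetical workload, and the report is captured in a string buffer rather than printed.

```python
import cProfile
import io
import pstats

def slow_sum(n):
    """Hypothetical workload used only for this example."""
    return sum(i * i for i in range(n))

profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(100_000)
profiler.disable()

# Stats accepts a Profile object directly; stream redirects the report
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumtime").print_stats(5)
report = buffer.getvalue()
```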

You can also use the visualization tool snakeviz.

Reference: Profiling Python Code with cProfile

Profile Memory

import tracemalloc

tracemalloc.start()
# Code goes here 
peak_bytes = tracemalloc.get_traced_memory()[1]  # get_traced_memory() returns (current, peak) in bytes
print(f"maximum memory usage is {peak_bytes / 1024**3:.3f} GiB")
tracemalloc.stop()
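
A self-contained sketch of the same measurement follows. It allocates roughly 1 MiB of made-up data so that the peak is visibly non-zero.

```python
import tracemalloc

tracemalloc.start()
data = [bytes(1024) for _ in range(1000)]  # about 1 MiB of payload
current, peak = tracemalloc.get_traced_memory()  # both values are in bytes
tracemalloc.stop()

peak_mib = peak / 1024 / 1024
```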

Pip

pip freeze to show all installed packages
pip show <package_name> to show a specific package

NumPy

Difference between max and maximum:
- numpy.maximum(A, B) compares two arrays element-wise and returns the larger value at each position
- numpy.max(A) returns the largest value inside A
Matrix/Vector Multiplication:
- np.matmul(A, B): returns the matrix product of A and B
- np.multiply(A, B): returns the element-wise product of A and B
- np.dot(A, B): returns the dot product of A and B (for 2-D arrays this is the same as the matrix product)
numpy.diagonal(M): returns the diagonal of a 2-D matrix M
numpy.tile(A, reps): builds a new array by repeating A reps times
numpy.where(cond, A, B): element-wise selection on arrays. A really useful function; at each position it is just A if cond else B
Solve "TypeError: only integer scalar arrays can be converted to a scalar index" when you execute a[a == b]: this happens because a is not a NumPy array. It is a list, and the message comes from the list type. Convert it first with a = np.array(a).
Convert a scalar to an array of any shape: np.reshape(scalar, (1, 1))
When your matrix operations involve an inverse $A^{-1}$, it is better to use the inverse implicitly than to compute it explicitly, because forming it takes extra computation and can harm numerical stability. That is, use np.linalg.solve(A, b) instead of np.linalg.inv(A) @ b.
np.frompyfunc to apply a Python function element-wise on NumPy arrays: it wraps an ordinary Python function as a NumPy ufunc with broadcasting. This is useful when calling the function directly on an array does not give the output you expect. The result has dtype object, and the function is still called once per element in Python, so do not expect a large speed-up.
# general form: np.frompyfunc(func, nin, nout), where nin and nout are the numbers of inputs and outputs
double = lambda x: 2 * x
npf = np.frompyfunc(double, 1, 1)
# npf(arr) gives the same values as double(arr) in this particular case (with dtype object)
For each row, extract the entry in a given column: Qs = network(states)[np.arange(actions.shape[0]), actions]

network(states) is a $B \times dim_A$ array holding, for each sample, the value of taking each action. actions is a vector of length $B$ storing which action was actually taken. This command extracts, for each state, the value of the action actually taken there. Note that there are $B$ (state, action) pairs in total.
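
Several of the NumPy recipes above can be checked on tiny arrays. This is a sketch with made-up numbers; q_values stands in for the output of network(states), which is not defined in these notes.

```python
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])

bigger = np.maximum([1, 5, 2], [4, 0, 2])  # element-wise maximum of two arrays
largest = np.max([1, 5, 2])                # single largest value
clipped = np.where(b > 8.5, b, 0.0)        # keep b where the condition holds, else 0
x = np.linalg.solve(A, b)                  # solves A @ x = b without forming the inverse

# Row-wise column selection: one chosen column per row
q_values = np.array([[0.1, 0.9], [0.7, 0.3], [0.5, 0.6]])
actions = np.array([1, 0, 1])
chosen = q_values[np.arange(actions.shape[0]), actions]
```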

Pandas

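# Iterating over a df directly iterates over the column names
for col in df:
    print(col)

# To iterate over the data row by row, use iterrows()
# each item is a tuple (row_index, data: pd.Series)
for row in df.iterrows():
    print(row)

# To read a single row by its index label, use loc[i]; it returns a pd.Series
row0 = df.loc[0]

# When filtering with loc on two or more conditions, combine them with & (using `and` raises an error)
# and wrap each condition in parentheses
df.loc[ (df["att1"] == "012") & (df["code"] == "2A") ]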

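AttributeError: 'float' object has no attribute 'split': in pandas this usually means the column contains missing values (NaN is a float), so a string method such as split is being called on a float.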
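
The pandas notes above can be tried on a small, made-up DataFrame. The tags column is an assumption added to show how the .str accessor handles a missing value without raising the AttributeError.

```python
import pandas as pd

df = pd.DataFrame({
    "att1": ["012", "012", "999"],
    "code": ["2A", "2B", "2A"],
    "tags": ["x y", None, "z"],  # None becomes a missing value
})

# iterrows() yields (index, Series) pairs, so unpack them
codes = [row["code"] for idx, row in df.iterrows()]

# combine conditions with & and parenthesize each one
matches = df.loc[(df["att1"] == "012") & (df["code"] == "2A")]

# the .str accessor propagates missing values instead of raising
split_tags = df["tags"].str.split()
```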

Matplotlib

import matplotlib.pyplot as plt

Change where the y range starts: plt.ylim(bottom=x)
Rotate the x-axis labels by 90 degrees: plt.xticks(rotation=90). This helps when the x-axis labels are too long.
Output/Save Plot: plt.savefig('filename.png')

Change labels, ticks, …

Changing the ticks is useful when your x-axis values are discrete, like [1, 2, 5, 10], and you want every pair of neighboring values to be one unit apart instead of, say, 2 and 5 being three units apart.

plt.xlabel('X axis', fontsize=15)
plt.ylabel('Y axis', fontsize=15)
  
plt.xticks(lst_of_tick_position, labels, color='blue', rotation=60)  
  
# disabling yticks by setting yticks to an empty list
plt.yticks([])
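
For the discrete-axis case described above, one approach is to plot against evenly spaced positions and relabel the ticks with the real values. The following is a sketch with made-up data, assuming the non-interactive Agg backend so that no display is needed.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

x_values = [1, 2, 5, 10]                # discrete, unevenly spaced values
y_values = [3, 4, 8, 6]                 # made-up data
positions = list(range(len(x_values)))  # evenly spaced: one unit apart

plt.plot(positions, y_values, marker="o")
plt.xticks(positions, [str(v) for v in x_values], rotation=60)
plt.xlabel("X axis", fontsize=15)

tick_labels = [t.get_text() for t in plt.gca().get_xticklabels()]
plt.clf()  # reset the figure afterwards
```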

Different Kinds of Plot:
- scatter plot: plt.scatter(x, y)
- histogram: plt.hist(x, bins)
- line plot:
  x = np.arange(-10, 10, 0.1)
  y = 2 * x
  plt.plot(x, y)
Reset the plot: plt.clf()

Plot lines w/ custom line label:

#plot individual lines with custom colors, styles, and widths
plt.plot(df['leads'], label='Leads', color='green')
plt.plot(df['prospects'], label='Prospects', color='steelblue', linewidth=4)
plt.plot(df['sales'], label='Sales', color='purple', linestyle='dashed')

plt.legend()

Json

JSON doesn't dump UTF-8: when your JSON output contains escapes like \u2019, it may not be your fault. By default, Python's json module escapes all non-ASCII characters, even though the JSON format does not require it. You can override this by passing ensure_ascii=False:
with open('output.json', 'w', encoding='utf-8') as f:
    json.dump(posts, f, indent=4, ensure_ascii=False)
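
The effect of ensure_ascii is easy to see with json.dumps, which returns a string instead of writing a file. The sample post is made up.

```python
import json

post = {"title": "It\u2019s done"}  # contains U+2019, a non-ASCII apostrophe

escaped = json.dumps(post)                       # default: ensure_ascii=True
readable = json.dumps(post, ensure_ascii=False)  # keeps the character as-is
```

Both strings decode back to the same object; only the on-disk representation differs.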

Author：Yao Lirong

Link：https://yao-lirong.github.io/blog/2020-11-29-Python-%E5%B8%B8%E8%A7%81%E9%97%AE%E9%A2%98/

Publish date：November 29th 2020, 12:00:00 am

Update date：April 12th 2023, 4:38:42 am

License：This article is licensed under Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
