multi-line statements:
total = item_one + \
        item_two + \
        item_three
List
- Find Median of List in Python:
statistics.median(list)
- Union of two lists: to remove all repetitions, use
res = list(set().union(lst1, lst2, lst3, ...))
- concatenate two lists:
lst1 + lst2
- Hash a list: you cannot hash a list, because list is mutable. You can only hash immutable objects. Therefore, to hash a list, you first convert it to a tuple:
hash(tuple([1,2,3]))
- Take every x-th element (indices 0, x, 2x, ...) to form a new list:
lst[::x]
The third argument of a slice is the step, just as in range.
- Generate a range with floats: use
np.arange(start, stop, step)
or
np.linspace(start, stop, num_wanted)
- Unzip a zipped list:
list(zip(*test_list))
Note: * is the unpacking operator; it unpacks the iterable into separate elements that are then passed as arguments to zip().
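The list recipes above can be tried together in one short session. This is only a sketch; the sample lists and variable names are made up for illustration.

```python
import statistics

# Sample data (hypothetical values chosen for illustration)
lst1, lst2 = [3, 1, 2], [2, 5]

median = statistics.median(lst1)            # middle value of the sorted list
union = sorted(set().union(lst1, lst2))     # duplicates removed; sorted for a stable order
concatenated = lst1 + lst2                  # duplicates kept
every_second = concatenated[::2]            # step of 2: indices 0, 2, 4
key = hash(tuple(lst1))                     # a tuple is hashable, a list is not

pairs = [("a", 1), ("b", 2), ("c", 3)]
letters, numbers = zip(*pairs)              # "unzip" into two tuples
```

Note that a set has no guaranteed order, which is why the union is sorted before use here.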
Dict
- Sort a dictionary:
dct = dict(sorted(dct.items(), key=lambda item: item[0]))
to sort by keys; change to item[1] to sort by values.
- Remove an item from dict by key:
dct.pop(your_key)
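A small sketch of both dict recipes, using a made-up dictionary. It relies on dicts preserving insertion order (guaranteed since Python 3.7), which is what makes the rebuilt dict "sorted".

```python
dct = {"b": 2, "c": 0, "a": 1}   # hypothetical sample data

by_key = dict(sorted(dct.items(), key=lambda item: item[0]))
by_value = dict(sorted(dct.items(), key=lambda item: item[1]))

removed = by_key.pop("a")                # returns the removed value
missing = by_key.pop("not_there", None)  # a default avoids KeyError for absent keys
```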
Function
Access & change a global variable inside a function: you can read a global variable inside a function without any extra keyword. However, if you want to assign to the global variable inside the function, you have to declare it with the
global
keyword. With the
global
keyword, you can either create a new global variable from inside a function or rebind a global variable that already exists.
x = "h"
def myfunc():
    global x
    x = "fantastic"
myfunc()
print(x)  # Output: fantastic
Change a variable in an outer (enclosing) scope: similar to the
global
keyword, there is a
nonlocal
keyword for this purpose.
def foo():
    a = 1
    def bar():
        nonlocal a
        a = 2
    bar()
    print(a)  # Output: 2
Variable number of arguments to a function:
def foo(a, b, c, *others):
    print(a, b, c)
    print("And all the rest are", list(others))
Import a custom module: the same as
import
, but the module name can be a variable instead of a static string:
package_name = "numpy"
package = __import__(package_name)
package.array([1, 2, 3])  # array() needs an argument
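As an alternative to calling __import__ directly, the standard library offers importlib.import_module, which behaves more intuitively for dotted names. The sketch below uses the standard-library modules math and os.path purely as stand-ins.

```python
import importlib

module_name = "math"                       # stand-in; any importable module name works
module = importlib.import_module(module_name)
root = module.sqrt(16.0)

# For dotted names the two approaches differ:
top = __import__("os.path")                # returns the top-level package (os)
sub = importlib.import_module("os.path")   # returns the submodule itself
```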
Class
Print a class like Java’s toString:
class Test:
    def __repr__(self):  # what is displayed when the object is inspected at an interactive prompt
        return "Test()"
    def __str__(self):  # what is printed by print(Test())
        return "member of Test"
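Make a custom class comparable by defining the rich comparison methods:
class CustomNumber:
    def __init__(self, value):
        self.value = value
    def __lt__(self, obj):
        """self < obj"""
        return self.value < obj.value
    def __le__(self, obj):
        """self <= obj"""
        return self.value <= obj.value
    def __eq__(self, obj):
        """self == obj"""
        return self.value == obj.value
    def __ne__(self, obj):
        """self != obj"""
        return self.value != obj.value
    def __gt__(self, obj):
        """self > obj"""
        return self.value > obj.value
    def __ge__(self, obj):
        """self >= obj"""
        return self.value >= obj.value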
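If writing all six methods feels repetitive, functools.total_ordering can derive the rest from __eq__ plus one ordering method. The sketch below is an alternative to the hand-written version, and the Version class is a made-up example.

```python
from functools import total_ordering

@total_ordering
class Version:  # hypothetical example class
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return self.value == other.value

    def __lt__(self, other):
        return self.value < other.value

# sorted() only needs __lt__; the other comparisons are filled in by the decorator
versions = sorted([Version(3), Version(1), Version(2)])
ordered_values = [v.value for v in versions]
```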
Make a custom class hashable:
class Emp:
    def __init__(self, emp_name, id):
        self.emp_name = emp_name
        self.id = id
    def __hash__(self):
        # called when you use hash(instance_of_custom_object)
        return hash((self.emp_name, self.id))
Local variable in a class:
- Elements defined outside the __init__ method are static elements; they belong to the class.
- Elements defined inside the __init__ method are elements of the object (self); they don’t belong to the class.
class MyClass:
    static_elem = 123  # static: shared by all instances
    def __init__(self):
        self.object_elem = 456  # specific to each instance
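The difference shows up as soon as there are two instances. In this sketch (Counter is a hypothetical class), the class attribute is shared, and assigning through an instance creates a separate instance attribute that shadows it.

```python
class Counter:  # hypothetical example class
    created = 0                      # class (static) attribute, shared by all instances

    def __init__(self, name):
        self.name = name             # instance attribute
        Counter.created += 1         # update the shared value through the class

a = Counter("a")
b = Counter("b")
shared_before = (a.created, b.created)   # both read the class attribute

a.created = 100   # creates an instance attribute on `a` only; the class value is untouched
```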
Exception
Self-defined exception:
class MyCustomError(Exception):
try/except clause in Python:
try:
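A minimal sketch of both ideas together: a custom exception class and a full try/except/else/finally clause. The class body, the code attribute, and the two helper functions are illustrative assumptions rather than anything prescribed.

```python
class MyCustomError(Exception):
    """Raised when a value fails validation (hypothetical use case)."""

    def __init__(self, message, code=None):
        super().__init__(message)
        self.code = code             # extra field, assumed for illustration

def parse_positive(text):
    value = int(text)                # may raise ValueError
    if value <= 0:
        raise MyCustomError(f"expected a positive number, got {value}", code=1)
    return value

attempts = []

def safe_parse(text):
    try:
        result = parse_positive(text)
    except MyCustomError as err:     # most specific handler first
        return f"custom error {err.code}: {err}"
    except ValueError:
        return "not a number"
    else:                            # runs only if no exception was raised
        return f"ok: {result}"
    finally:                         # always runs, even when returning
        attempts.append(text)
```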
String
convert string to int:
int(s)
How to remove the leading and trailing spaces in Python:
my_string.strip()
Join a list of strings:
"".join(str_lst)
Join with a separator:
",".join(str_lst)
Advanced split with re:
re.split("split_on_what_in_regex", s)
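The string helpers so far chain together naturally. Here is a small sketch on a made-up input string; the separator pattern is just an example.

```python
import re

raw = "  10, 20;30  "                       # hypothetical input
cleaned = raw.strip()                       # drop leading/trailing whitespace
parts = re.split(r"[,;]\s*", cleaned)       # split on "," or ";" plus optional spaces
numbers = [int(p) for p in parts]           # convert each piece to int
joined = ",".join(parts)                    # join with a separator
glued = "".join(parts)                      # join without a separator
```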
Extract the letters from a string:
"".join(re.findall("[a-zA-Z]+", s))
Convert a number into a list of its digits, either as integers or as characters:
num = 2019
# If you want a list of integers
res = [int(x) for x in str(num)]
# If a list of characters is enough
res = list(str(num))
Format a float in scientific notation:
print("a = %.2e" %(num))
String formatting in general: f-strings have been available since Python 3.6 and are the recommended string-formatting convention.
f"iter: {i}"
Sign options in a format specification:
- '+' indicates that a sign should be used for both positive and negative numbers.
- '-' indicates that a sign should be used only for negative numbers (this is the default behavior).
- ' ' indicates that a leading space should be used on positive numbers, and a minus sign on negative numbers (most used).
# scientific format with f-string
f'{num:.5e}'
# float number, use a space so positive and negative numbers align
f'{num: .3f}'
# pad integers to a fixed width
f'{num:3d}'
# f-string braces can evaluate arbitrary expressions (including function calls);
# use different quotes inside the braces on Python versions before 3.12
f"{'Eric Idle'.lower()} is funny."
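To see what these format specifications produce, here is a short sketch with made-up numbers.

```python
num = 1234.56789                      # hypothetical value

sci = f"{num:.5e}"                    # scientific notation, 5 decimals
pos = f"{3.14159: .3f}"               # leading space for a positive number
neg = f"{-3.14159: .3f}"              # minus sign takes the place of the space
padded = f"{7:3d}"                    # right-aligned in a field of width 3
signed = f"{7:+d}"                    # explicit plus sign
name = f"{'Eric Idle'.lower()} is funny."
```

With the ' ' option, pos and neg have the same width, so columns of mixed-sign numbers line up.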
Data Structures
Queue: Python uses put and get rather than enqueue and dequeue.
import queue
q = queue.Queue()
q.put(s)
v = q.get()
Priority Queue:
from queue import PriorityQueue
q = PriorityQueue()
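A quick sketch of both queues in action. The tasks and priorities are invented; PriorityQueue returns the smallest entry first, so (priority, item) tuples are a common way to order items.

```python
from queue import PriorityQueue, Queue

fifo = Queue()
for item in ["first", "second"]:
    fifo.put(item)
fifo_order = [fifo.get(), fifo.get()]       # first in, first out

pq = PriorityQueue()
pq.put((2, "write tests"))                  # (priority, task); lower number comes out first
pq.put((1, "fix bug"))
pq.put((3, "refactor"))

pq_order = []
while not pq.empty():
    priority, task = pq.get()
    pq_order.append(task)
```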
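Functional Programming
- Basic usage of filter:
lst = list(filter(func, lst)), dct = dict(filter(func, dct.items()))
- Usage of map() in Python 3 (it returns a lazy iterator):
map(func, lst)
- reduce(): similar to
fold_left
; in Python 3 it must be imported from functools:
reduce(lambda acc, x: ..., lst, init)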
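The three functions side by side, as a sketch on made-up data. Note that filtering a dict goes through dct.items(), so the predicate receives (key, value) pairs.

```python
from functools import reduce

numbers = [1, 2, 3, 4, 5]                                   # hypothetical data

evens = list(filter(lambda n: n % 2 == 0, numbers))         # keep matching elements
squares = list(map(lambda n: n * n, numbers))               # transform each element
total = reduce(lambda acc, n: acc + n, numbers, 0)          # fold from the left, starting at 0

prices = {"apple": 3, "pear": 12, "fig": 20}
expensive = dict(filter(lambda kv: kv[1] > 10, prices.items()))
```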
IO
Read a multi-line file:
lines = file1.readlines()
import os
file_path = os.path.join(os.getcwd(), file_name)  # portable across operating systems
f = open(file_path, 'r', encoding='utf-8')
lines = f.readlines()
f.close()
f = open(file_path, 'w', encoding='utf-8')
f.write(string)
f.writelines(lst_of_strs)
f.close()
List the files in a directory:
filenames = os.listdir(path)
Get Parent Directory Name:
os.path.dirname(os.getcwd())
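The same operations are often written with a with block, which closes the file even if an error occurs. The sketch below runs in a throwaway temporary directory; the file name is hypothetical.

```python
import os
import tempfile

lines_to_write = ["first line\n", "second line\n"]

with tempfile.TemporaryDirectory() as tmp_dir:          # throwaway directory for the demo
    file_path = os.path.join(tmp_dir, "notes.txt")      # hypothetical file name

    with open(file_path, "w", encoding="utf-8") as f:   # closed automatically on exit
        f.writelines(lines_to_write)                    # writelines does not add newlines

    with open(file_path, "r", encoding="utf-8") as f:
        lines_read = f.readlines()                      # each line keeps its trailing "\n"

    filenames = os.listdir(tmp_dir)
    same_parent = os.path.dirname(file_path) == tmp_dir
```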
Profiling
Creating Profiling Data
import cProfile
Inspecting Profiling Data
import pstats
The following columns will be shown:
- ncalls: number of times the function was called
- tottime: amount of time spent in the function itself (not counting any time spent in subfunctions)
- percall: tottime / ncalls
- cumtime: all the time spent in the function and its subfunctions
- percall: cumtime / ncalls
- filename:lineno(function): name of the function that was called and where it is defined
A fairly common practice is to sort by one of the above attributes, or to look at a function’s callees to see where it wound up spending time. You can also do the inverse and look up a function’s callers, which helps when a function takes a lot of time but you don’t know who is calling it.
stats.sort_stats("cumtime").print_stats(2)  # print the 2 functions with the highest cumulative time
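Putting the pieces together, one possible end-to-end session looks like the sketch below. The workload functions are invented, and the report is written to an in-memory buffer instead of stdout so it can be inspected as a string.

```python
import cProfile
import io
import pstats

def slow_sum(n):                       # hypothetical workload
    return sum(i * i for i in range(n))

def workload():
    return [slow_sum(20_000) for _ in range(5)]

# Collect profiling data around the code of interest
profiler = cProfile.Profile()
profiler.enable()
results = workload()
profiler.disable()

# Inspect it: sort by cumulative time, then look at callers and callees
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumtime").print_stats(2)
stats.print_callers("slow_sum")        # who called slow_sum?
stats.print_callees("workload")        # what did workload call?
report = buffer.getvalue()
```

Use print(report) to read the table; the column names match the list above.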
You can also use the visualization tool snakeviz.
Reference: Profiling Python Code with cProfile
Profile Memory
import tracemalloc
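A minimal sketch of measuring memory with tracemalloc. The allocation being measured is made up; the pattern is start, run the code, read the counters or take a snapshot, then stop.

```python
import tracemalloc

tracemalloc.start()

data = [str(i) * 10 for i in range(50_000)]      # hypothetical allocation to measure

current, peak = tracemalloc.get_traced_memory()  # bytes currently traced, and the peak
snapshot = tracemalloc.take_snapshot()           # must be taken before stop()
tracemalloc.stop()

top_stats = snapshot.statistics("lineno")        # grouped by source line, largest first
biggest = top_stats[0]
```

Printing biggest shows the file, line number, total size, and allocation count of the heaviest line.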
Pip
pip freeze
to show all installed packages
pip show <package_name>
to show a specific package
NumPy
Difference between max and maximum:
- numpy.maximum(A, B) returns the element-wise larger of the two arrays
- numpy.max(A) returns the maximum value inside A
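A short sketch of the difference on two made-up arrays.

```python
import numpy as np

A = np.array([[1, 5], [7, 2]])         # hypothetical data
B = np.array([[4, 4], [4, 4]])

elementwise = np.maximum(A, B)         # compares A and B position by position
overall = np.max(A)                    # single largest value in A
per_column = np.max(A, axis=0)         # largest value in each column of A
```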
- np.matmul(A, B): returns the matrix product of A and B
- np.multiply(A, B): returns the element-wise product of A and B
- np.dot(A, B): returns the dot product of A and B
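The three products on small made-up matrices. For 2-D inputs np.dot gives the same result as np.matmul; for 1-D inputs it is the inner product.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])         # hypothetical data
B = np.array([[5, 6], [7, 8]])

matrix_product = np.matmul(A, B)       # same as A @ B
elementwise = np.multiply(A, B)        # same as A * B
dot_2d = np.dot(A, B)                  # equals the matrix product for 2-D arrays
dot_1d = np.dot(np.array([1, 2, 3]), np.array([4, 5, 6]))   # inner product of vectors
```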
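- numpy.diagonal(M): returns the diagonal of a 2-D matrix M
- numpy.tile(A, reps): repeats A reps times
- numpy.where(cond, A, B): element-wise conditional on arrays. It picks from A where cond is true and from B otherwise, i.e. an element-wise version of A if cond else B. A really useful function.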
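A sketch of the three helpers on made-up arrays; the np.where line implements a ReLU as an example of the element-wise conditional.

```python
import numpy as np

M = np.array([[1, 2], [3, 4]])             # hypothetical data
diag = np.diagonal(M)                      # main diagonal

tiled = np.tile(np.array([1, 2]), 3)       # the whole array repeated 3 times

x = np.array([-2, -1, 0, 1, 2])
relu = np.where(x > 0, x, 0)               # x where positive, otherwise 0
```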
Solve
TypeError: only integer scalar arrays can be converted to a scalar index
when you execute a[a == b]: this happens because a is not a NumPy array. It is a list, and the message above comes from indexing the list with an array. Converting a with np.array(a) first avoids the error.
Convert a scalar to an array of any shape:
np.reshape(scalar, (1,1))
When your matrix operations involve an inverse $A^{-1}$, it is usually better to use the inverse implicitly than to form it explicitly, because forming it takes extra computation that may harm numerical stability. That is, to compute $A^{-1}b$, use
np.linalg.solve(A, b)
instead of
np.linalg.inv(A) @ b
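For a concrete comparison, the sketch below solves a small made-up system $Ax = b$ both ways. On a well-conditioned system like this one the answers agree; the difference matters for large or ill-conditioned matrices.

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])             # hypothetical, well-conditioned system
b = np.array([9.0, 8.0])

x_solve = np.linalg.solve(A, b)        # solves A x = b without forming the inverse
x_inv = np.linalg.inv(A) @ b           # forms the inverse explicitly, then multiplies

residual = A @ x_solve - b             # should be close to zero
```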
np.frompyfunc
to apply a Python function element-wise on NumPy arrays: it wraps an arbitrary Python function as a NumPy universal function with a given number of inputs and outputs. If calling your function directly on an
np.array
does not give the output you expect, you can use it to specify how the function should be applied. Note that the result has dtype object.
double = lambda x: 2 * x
# general form: np.frompyfunc(f, <input_number>, <output_number>)
npf = np.frompyfunc(double, 1, 1)
# npf(arr) <==> double(arr) in this particular case
For each row, extract the corresponding column:
Qs = network(states)[np.arange(actions.shape[0]), actions]
network(states) is a $B \times dim_A$ matrix holding, for each sample, the value of taking each action. actions is a vector of length $B$ storing which action we actually took. Using this command, we extract the value of taking a specific action at a specific state. Note that there are a total of $B$ (state, action) pairs.
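The indexing trick is easier to see on a tiny array. In this sketch q_values is a made-up stand-in for the output of network(states), with $B = 4$ and $dim_A = 3$.

```python
import numpy as np

# Stand-in for network(states): one row per sample, one column per action (assumed values)
q_values = np.array([[0.1, 0.9, 0.0],
                     [0.5, 0.2, 0.3],
                     [0.7, 0.1, 0.2],
                     [0.0, 0.4, 0.6]])
actions = np.array([1, 0, 2, 2])           # action taken for each sample

rows = np.arange(actions.shape[0])         # [0, 1, 2, 3]
Qs = q_values[rows, actions]               # picks q_values[i, actions[i]] for every i
```

Pairing a row index array with a column index array selects one element per row, which is why the result has shape (B,).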
Pandas
# Iterating directly over a df iterates over its column names
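A small sketch of the different ways to loop over a DataFrame. The frame and its column names are invented; iterrows and itertuples are the usual choices when you need rows rather than column names.

```python
import pandas as pd

df = pd.DataFrame({"city": ["a", "b"], "score": [1, 2]})   # hypothetical data

column_names = [col for col in df]                  # plain iteration yields column names

# iterrows yields (index, row-as-Series) pairs
rows = [(idx, row["city"], row["score"]) for idx, row in df.iterrows()]

# itertuples yields one namedtuple per row and is usually faster
tuples = [(r.city, r.score) for r in df.itertuples(index=False)]
```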
Matplotlib
import matplotlib.pyplot as plt
Change where y range starts in matplotlib:
plt.ylim(bottom = x)
Rotate the labels in x-axis by 90 degrees: this trick helps you when you have too long x-axis labels.
plt.xticks(rotation = 90 )
Output/Save Plot:
plt.savefig('filename.png')
Change labels, ticks, …
Changing ticks is useful when your x-axis is discrete, like [1, 2, 5, 10], and you want neighboring values to be evenly spaced (one unit apart) instead of, say, 2 and 5 being three units apart.
plt.xlabel('X axis', fontsize=15)
plt.ylabel('Y axis', fontsize=15)
plt.xticks(lst_of_tick_position, labels, color='blue', rotation=60)
# disable yticks by setting yticks to an empty list
plt.yticks([])
Different Kinds of Plot:
scatter plot:
plt.scatter(x,y)
histogram (the second argument is the number of bins or the bin edges):
plt.hist(x, bins)
Ordinary line plot:
x = np.arange(-10, 10, 0.1)
y = 2*x
plt.plot(x, y)
reset plot:
plt.clf()
Plot lines w/ custom line label:
# plot individual lines with custom colors, styles, and widths
plt.plot(df['leads'], label='Leads', color='green')
plt.plot(df['prospects'], label='Prospects', color='steelblue', linewidth=4)
plt.plot(df['sales'], label='Sales', color='purple', linestyle='dashed')
plt.legend()
Json
JSON output is not UTF-8: when your JSON output contains escapes like
\u2019
, it may not be your fault. By default, Python’s json module escapes non-ASCII characters even though JSON itself does not require it. You can override this with ensure_ascii=False:
with open('output.json', 'w', encoding='utf-8') as f:
    json.dump(posts, f, indent=4, ensure_ascii=False)
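The effect of the flag is easy to check with json.dumps, which returns a string instead of writing a file. The posts value here is a made-up example containing a right single quotation mark (U+2019).

```python
import json

posts = {"title": "It\u2019s done"}                 # hypothetical data with a non-ASCII character

escaped = json.dumps(posts)                         # default: non-ASCII is escaped
readable = json.dumps(posts, ensure_ascii=False)    # keeps the character as-is

# Both forms decode back to the same data
round_trip_equal = json.loads(escaped) == json.loads(readable) == posts
```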