Yao Lirong's Blog

TensorFlow 1.x Manual

2021/05/28

TensorFlow notes taken during my internship at Haier

Basic Notion

  • Graph: often refers to Computation Graph, which describes how to compute the output

  • Eager execution: evaluates operations immediately, without building graphs

    Enabling eager execution changes how TensorFlow operations behave: they now evaluate immediately and return their values to Python. tf.Tensor objects reference concrete values instead of symbolic handles to nodes in a computational graph. Since there isn't a computational graph to build and run later in a session, it's easy to inspect results using print() or a debugger. Evaluating, printing, and checking tensor values does not break the flow for computing gradients.


  • Operation: a node in the graph. It takes Tensor objects as input and produces Tensor objects as output.

  • Tensor: a multi-dimensional array with a uniform type (called dtype). It represents an n-dimensional array or list and has a static type, a rank, and a shape.

    In graph mode, a Tensor does not hold the values of an operation's output; instead it provides a means of computing those values. It is a symbolic handle to the input or output of an Operation.

    All data passed between operations in the graph are Tensors: a Tensor can be passed as an input to another Operation. This builds a dataflow connection between operations, which enables TensorFlow to execute an entire Graph that represents a large, multi-step computation.

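  • Session: launches the computation of a graph

  • InteractiveSession: a Session that installs itself as the default session when it is created, so Tensor.eval() and Operation.run() work without passing the session explicitly. This makes it convenient to evaluate operations one at a time in a shell or notebook, instead of only fetching the final result as with a plain Session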

import tensorflow as tf

# Build a dataflow graph.
a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
b = tf.constant([[1.0, 1.0], [0.0, 1.0]])
c = tf.matmul(a, b)

# Construct a `Session` to execute the graph.
sess = tf.compat.v1.Session()

# Execute the graph and store the value that `c` represents in `result`.
result = sess.run(c)

Here a, b, and c are all Tensors.

c = tf.matmul(a, b) creates an Operation of type “MatMul” (Matrix Multiplication) that takes tensors a and b as input, and produces c as output.


  • Variable: represents shared, persistent state that your program manipulates (for example, the parameters of the model)

    It behaves like a tf.Tensor whose value can be changed by running ops on it

  • Placeholder: a tensor whose value is fed in later, at run time (for example through feed_dict in Session.run).
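
To tie Session, Variable, and Placeholder together, here is a minimal sketch. It assumes a TensorFlow version that ships tf.compat.v1 (1.14 or later, or 2.x); the names x, w, y, and double_w are only for illustration and do not come from any official example.

```python
import tensorflow as tf

# Graph mode is the default in 1.x; this call is only needed on 2.x.
tf.compat.v1.disable_eager_execution()

graph = tf.Graph()
with graph.as_default():
    # Placeholder: its value is supplied at run time through feed_dict.
    x = tf.compat.v1.placeholder(tf.float32, shape=[None, 2], name="x")
    # Variable: persistent state that keeps its value across run() calls.
    w = tf.compat.v1.Variable([[1.0], [2.0]], name="w")
    y = tf.matmul(x, w)
    # An op that changes the variable's value when it is run.
    double_w = tf.compat.v1.assign(w, w * 2.0)
    init = tf.compat.v1.global_variables_initializer()

with tf.compat.v1.Session(graph=graph) as sess:
    sess.run(init)                                       # variables must be initialized first
    before = sess.run(y, feed_dict={x: [[1.0, 1.0]]})    # 1*1 + 1*2 = 3
    sess.run(double_w)                                   # w becomes [[2.0], [4.0]]
    after = sess.run(y, feed_dict={x: [[1.0, 1.0]]})     # 1*2 + 1*4 = 6
```

Nothing is computed while the graph is being built; values only appear when sess.run is called, and the change made by double_w persists into the next run call.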

Operations on Tensors

  • tf.reduce_xxx(t, axis=i): Given a tensor t of shape $d_1 \times d_2 \times \dots \times d_n$, r = tf.reduce_xxx(t, axis=i) collapses all entries along axis i into a single entry, so r has shape $d_1 \times \dots \times d_{i-1} \times d_{i+1} \times \dots \times d_n$. (The formula counts axes from 1; TensorFlow's axis argument is 0-based, so axis=1 in the example removes the second dimension.)

    import numpy as np
    import tensorflow as tf

    a = np.random.randint(1, 10, (2, 3, 4))
    '''
    One possible draw: 2 arrays of dimension 3 x 4
    [[[8 5 7 1]
      [9 7 2 2]
      [7 7 4 6]]
     [[7 7 8 4]
      [7 4 3 6]
      [5 3 2 8]]]
    '''
    sess = tf.Session()
    with sess.as_default():
        r = tf.reduce_sum(a, axis=1).eval()  # reduce along the axis of length 3
    '''
    For the sample above, r has shape (2, 4):
    [[24 19 13  9]
     [19 14 13 18]]
    '''
  • tf.reshape(t, list): Returns a tensor r holding the same elements as t, in the same row-major order, arranged into the new shape $d'_1 = list[0], d'_2 = list[1], \dots$ At most one entry of list may be -1; that dimension is inferred so that the total number of elements stays the same: $d'_i = \frac{d_1 \times d_2 \times \dots \times d_n}{\prod_{j \neq i} d'_j}$. For example, with a of shape (2, 3, 4):

    r = tf.reshape(a, [-1, 2, 2]).eval()  # a has 2*3*4 = 24 elements, so r has shape (6, 2, 2)
  • tf.concat([t1, t2, ...], axis=i): joins the tensors along axis i. They must have the same length along every other axis. In the result, the length along axis i is the sum of the inputs' lengths along that axis; the lengths of the other axes stay the same.

  • tf.tile(t, [m1, m2, ...]): repeats t $m_i$ times along axis i, so the result has shape $(d_1 \times m_1, d_2 \times m_2, \dots)$
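
The shape rules above can be checked quickly without a session. The sketch below uses NumPy stand-ins rather than TensorFlow itself, on the assumption that np.sum, np.reshape, np.concatenate, and np.tile follow the same shape arithmetic as their tf counterparts for these inputs; the array t is made up for illustration.

```python
import numpy as np

t = np.arange(24).reshape(2, 3, 4)                  # shape (2, 3, 4)

reduced = t.sum(axis=1)                             # like tf.reduce_sum(t, axis=1)
reshaped = t.reshape(-1, 2, 2)                      # like tf.reshape(t, [-1, 2, 2])
joined = np.concatenate([t, t[:, :1, :]], axis=1)   # like tf.concat([t, t[:, :1, :]], axis=1)
tiled = np.tile(t, (1, 2, 3))                       # like tf.tile(t, [1, 2, 3])

print(reduced.shape)   # (2, 4): the axis of length 3 is collapsed
print(reshaped.shape)  # (6, 2, 2): 24 / (2 * 2) = 6
print(joined.shape)    # (2, 4, 4): only axis 1 grows, 3 + 1 = 4
print(tiled.shape)     # (2, 6, 12): each d_i is multiplied by m_i
```

Reshape keeps the element order, so flattening reshaped gives back the same sequence 0 to 23 as flattening t.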

Debug with Tensorboard

  • tf.summary: Follow the official guide
  • tf.estimator: Specify model_dir when initializing your estimator. Everything about the trained model will be stored in this directory, including the event files that log the training process. Reference
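
As a rough idea of what the tf.summary route looks like in graph mode, here is a minimal sketch, not a copy of the official guide. It assumes tf.compat.v1 is available (TensorFlow 1.14 or later, or 2.x); the toy loss, the tag "loss", and the temporary log directory are invented for illustration.

```python
import tempfile
import tensorflow as tf

# Graph mode is the default in 1.x; this call is only needed on 2.x.
tf.compat.v1.disable_eager_execution()

logdir = tempfile.mkdtemp()  # illustrative location for the event files

graph = tf.Graph()
with graph.as_default():
    x = tf.compat.v1.placeholder(tf.float32, shape=[], name="x")
    loss = tf.square(x - 3.0, name="loss")        # toy quantity to track
    tf.compat.v1.summary.scalar("loss", loss)     # register a scalar summary
    merged = tf.compat.v1.summary.merge_all()     # one op that evaluates all summaries

with tf.compat.v1.Session(graph=graph) as sess:
    # The writer also stores the graph so TensorBoard can draw it.
    writer = tf.compat.v1.summary.FileWriter(logdir, sess.graph)
    for step in range(5):
        summary = sess.run(merged, feed_dict={x: float(step)})
        writer.add_summary(summary, global_step=step)
    writer.close()
```

Pointing TensorBoard at the directory (tensorboard --logdir followed by the path stored in logdir) should then show the loss curve and the graph.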

Utility

  • Sometimes we encounter the error module 'tensorflow' has no attribute ... because TensorFlow renamed or moved the function. We can use this list to update all changed names manually, or run this script directly.