ie_test: On TensorFlow's RNN implementation

# Goal
The inputs to an RNN are fairly complicated, so I want to check that the network is actually wired the way I intended.

## RNN code

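Consider the code below. To make the result easy to follow, every element of the kernel $$ W \in R^{8 \times 5}$$ is set to 1 and every element of the bias $$b \in R^5$$ is set to 0 before running. (The kernel has 8 rows because it covers both the 3 inputs and the 5 previous outputs.)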
```
#coding:utf-8

import tensorflow as tf
import numpy as np

n_inputs = 3
n_neurons = 5

X0 = tf.placeholder( tf.float32, [None, n_inputs] )
X1 = tf.placeholder( tf.float32, [None, n_inputs] )

basic_cell = tf.contrib.rnn.BasicRNNCell( num_units=n_neurons )

output_seq, states = \
    tf.contrib.rnn.static_rnn( basic_cell, [X0, X1], dtype=tf.float32 )

Y0, Y1 = output_seq
init = tf.global_variables_initializer()

X0_batch = np.array( [ [0, 0.1, 0.2] ] )
X1_batch = np.array( [ [0, 0.1, 0.2] ] )

assign_ops = []
for v in tf.global_variables():
    # prints e.g. rnn/basic_rnn_cell/kernel:0 (8, 5)
    print(v.name,v.shape)
    if v.name == "rnn/basic_rnn_cell/bias:0":
        assign_ops.append ( tf.assign( v, tf.zeros( v.shape, dtype=tf.float32 ) ) )
    else:
        assign_ops.append ( tf.assign( v, tf.ones( v.shape, dtype=tf.float32 ) ) )
assign_ops_exec = tf.group(*assign_ops)

with tf.Session() as sess:
    init.run()
    sess.run( assign_ops_exec )
    Y0_val, Y1_val = sess.run( [Y0,Y1], feed_dict={ X0: X0_batch, X1:X1_batch } )
    print(Y0_val) # each element should be math.tanh( 0.3 )
    print(Y1_val) # each element should be math.tanh( 5*math.tanh(0.3) + 0.3 )

```

This is in fact the same network as the hand-written code below (same structure; here the weights are initialized randomly).

```
#coding:utf-8

import tensorflow as tf

n_inputs = 3
n_neurons = 5

X0 = tf.placeholder( tf.float32, [None, n_inputs] )
X1 = tf.placeholder( tf.float32, [None, n_inputs] )

# Wx: input-to-output weights, Wy: recurrent weights applied to the previous output
Wx = tf.Variable( tf.random_normal( shape=(n_inputs, n_neurons), dtype=tf.float32 ) )
Wy = tf.Variable( tf.random_normal( shape=(n_neurons, n_neurons), dtype=tf.float32 ) )
b  = tf.Variable( tf.zeros( [1, n_neurons], dtype=tf.float32 ) )

Y0 = tf.tanh( tf.matmul( X0, Wx) + b )
Y1 = tf.tanh( tf.matmul( Y0, Wy) + tf.matmul( X1, Wx ) + b )
print("ok")
```
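
Written out as equations, the two lines that define Y0 and Y1 above amount to the recurrence below. This is only a restatement of the hand-written code, with $$W_x \in R^{3 \times 5}$$ and $$W_y \in R^{5 \times 5}$$. The stacked form on the last line assumes that the cell concatenates the input and the previous output as $$[X_t, Y_{t-1}]$$ before multiplying by a single kernel, which is what the (8, 5) kernel shape suggests.

$$
\begin{aligned}
Y_0 &= \tanh( X_0 W_x + b ) \\
Y_1 &= \tanh( Y_0 W_y + X_1 W_x + b ) \\
    &= \tanh\left( [X_1, Y_0] \begin{bmatrix} W_x \\ W_y \end{bmatrix} + b \right)
\end{aligned}
$$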

## Verification
Running rnn.py produces the following output.
```
[[0.2913126 0.2913126 0.2913126 0.2913126 0.2913126]]
[[0.94211787 0.94211787 0.94211787 0.94211787 0.94211787]]
```
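The variable listing confirms that W is Wx and Wy stacked together, giving shape = (8,5), that is (n_inputs + n_neurons, n_neurons).
Also,
math.tanh(0.3) = 0.2913126124515909
math.tanh( 5*math.tanh(0.3) + 0.3 ) = 0.9421178971878335
which matches the output above.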
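
The same check can be redone without TensorFlow. The following is a minimal NumPy sketch, not the library's actual code: the helper `rnn_step` is a name made up here, and the concatenation order `[x, y_prev]` is an assumption (with an all-ones kernel the order does not change the result anyway).

```python
import numpy as np

n_inputs, n_neurons = 3, 5

# Same setup as the TensorFlow script: all-ones kernel, all-zeros bias.
W = np.ones((n_inputs + n_neurons, n_neurons))   # stacked kernel, shape (8, 5)
b = np.zeros(n_neurons)

def rnn_step(x, y_prev):
    """One basic RNN step (hypothetical helper): tanh([x, y_prev] @ W + b)."""
    return np.tanh(np.concatenate([x, y_prev], axis=1) @ W + b)

X0_batch = np.array([[0, 0.1, 0.2]])
X1_batch = np.array([[0, 0.1, 0.2]])

Y_init = np.zeros((1, n_neurons))      # the initial state is all zeros
Y0_val = rnn_step(X0_batch, Y_init)
Y1_val = rnn_step(X1_batch, Y0_val)

# Splitting the kernel into Wx (first 3 rows) and Wy (last 5 rows)
# gives the same result as the hand-written two-matrix form.
Wx, Wy = W[:n_inputs], W[n_inputs:]
Y1_split = np.tanh(Y0_val @ Wy + X1_batch @ Wx + b)

print(Y0_val)
print(Y1_val)
print(np.allclose(Y1_val, Y1_split))
```

If the stacked-kernel reading is right, the first two printed rows should agree with the rnn.py output, and the last line compares the stacked form against the Wx / Wy form.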

## LSTM code
Below is part of some code that forces an LSTM onto MNIST.
https://blog.scimpr.com/2018/01/26/tensorflow%E3%81%AE%E3%83%A2%E3%83%87%E3%83%AA%E3%83%B3%E3%82%B0%EF%BC%92%E3%80%9Crnn/
```
#coding:utf-8
import tensorflow as tf

x = tf.placeholder( tf.float32, [None, 784] )
x2 = tf.reshape( x, [-1, 28, 28] )
x3 = tf.transpose( x2, [1, 0, 2] )  # tensor of shape [28, 50, 28] (for a batch of 50)
x4 = tf.reshape( x3, [-1, 28] ) # tensor of shape [28*50, 28]
x5_list = tf.split( x4, 28 ) # list of 28 tensors, each with shape=(*,28)
# print( "x4: ", x4.shape )
# print( "type(x5_list): ", type(x5_list) )
# print( "type(x5_list[0]): ", type(x5_list[0]) )
# print( "x5_list[0]: ", x5_list[0].shape )

lstm_cell = tf.contrib.rnn.BasicLSTMCell( 13, forget_bias=1.0)
outputs, states = tf.contrib.rnn.static_rnn( lstm_cell, x5_list, dtype=tf.float32 )
print( "len(outputs): ", len(outputs) )
print( "outputs[0].shape: ", outputs[0].shape )
print( "len(states): ", len(states) )
print(  type(states[0]), type(states[1]) )
print(  states[0].shape, states[1].shape )
print("variable print")
for v in tf.global_variables():
    print( v.name, v.shape )

```
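
The reshape / transpose / split chain at the top of this code is the part that is easy to get wrong, so here is a small NumPy sketch of the same steps. The batch size of 4 and the dummy data are assumptions made up for illustration. It checks that element t of the list holds row t of every image in the batch, which is the per-time-step input that `static_rnn` expects.

```python
import numpy as np

batch = 4                                   # hypothetical small batch size
x = np.arange(batch * 784, dtype=np.float32).reshape(batch, 784)

x2 = x.reshape(-1, 28, 28)                  # [batch, 28 rows, 28 columns]
x3 = x2.transpose(1, 0, 2)                  # [28 rows, batch, 28 columns]
x4 = x3.reshape(-1, 28)                     # [28 * batch, 28]
x5_list = np.split(x4, 28)                  # 28 arrays of shape [batch, 28]

# Element t of the list holds row t of every image in the batch.
for t in range(28):
    assert np.array_equal(x5_list[t], x2[:, t, :])

print(len(x5_list), x5_list[0].shape)       # prints: 28 (4, 28)
```

In TensorFlow, the same list could presumably also be obtained in one step with `tf.unstack( x2, axis=1 )`.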

## LSTM verification
As shown below, a variable with shape=(input+output, 4*output) = (28+13, 4*13) is allocated internally.
```
len(outputs):  28
outputs[0].shape:  (?, 13)
len(states):  2
<class 'tensorflow.python.framework.ops.Tensor'> <class 'tensorflow.python.framework.ops.Tensor'>
(?, 13) (?, 13)
variable print
rnn/basic_lstm_cell/kernel:0 (41, 52)
rnn/basic_lstm_cell/bias:0 (52,)
ok

```
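
The numbers 41 and 52 in the listing above can be reproduced with a little arithmetic. In the sketch below, `basic_lstm_variable_shapes` is a hypothetical helper written for this post, not a TensorFlow function. It assumes the four gates share one kernel whose rows cover the input plus the previous output.

```python
def basic_lstm_variable_shapes(n_inputs, n_units):
    """Shapes of the kernel and bias for one LSTM cell with four gates."""
    kernel_shape = (n_inputs + n_units, 4 * n_units)
    bias_shape = (4 * n_units,)
    return kernel_shape, bias_shape

# 28 inputs per time step (one image row), 13 LSTM units.
kernel_shape, bias_shape = basic_lstm_variable_shapes(n_inputs=28, n_units=13)
print(kernel_shape, bias_shape)   # prints: (41, 52) (52,)
```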
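It would be better to go further and confirm that those tedious LSTM equations are really what gets computed,
but for that it would be better to build the data from a somewhat smaller sample.
(I ran out of energy.)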
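
As a starting point for such a smaller sample, here is a rough NumPy sketch of a single LSTM step with 3 inputs and 2 units. It is not verified against TensorFlow here. The helper names are made up, and the gate order i, j, f, o and the way `forget_bias` is added to the forget gate are assumptions about how `BasicLSTMCell` lays out its kernel. With an all-ones kernel and a zero bias, as in the RNN check, the gate order does not affect the result.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, c, h, kernel, bias, forget_bias=1.0):
    """One LSTM step; the gate order i, j, f, o is an assumption."""
    gate_inputs = np.concatenate([x, h], axis=1) @ kernel + bias
    i, j, f, o = np.split(gate_inputs, 4, axis=1)
    new_c = c * sigmoid(f + forget_bias) + sigmoid(i) * np.tanh(j)
    new_h = np.tanh(new_c) * sigmoid(o)
    return new_c, new_h

n_inputs, n_units = 3, 2                       # deliberately tiny sizes
kernel = np.ones((n_inputs + n_units, 4 * n_units))
bias = np.zeros(4 * n_units)

x0 = np.array([[0, 0.1, 0.2]])
c0 = np.zeros((1, n_units))                    # zero initial cell state
h0 = np.zeros((1, n_units))                    # zero initial output

c1, h1 = lstm_step(x0, c0, h0, kernel, bias)

# With a zero initial state, every gate pre-activation is 0 + 0.1 + 0.2 = 0.3.
expected_c1 = sigmoid(0.3) * np.tanh(0.3)
expected_h1 = np.tanh(expected_c1) * sigmoid(0.3)
print(c1, expected_c1)
print(h1, expected_h1)
```

Setting the TensorFlow kernel to ones and the bias to zeros, as in the RNN script, and comparing the first output with `h1` would be one way to carry out the check.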


Wednesday, August 1, 2018
