ie_test: On batch normalization in TensorFlow

# Purpose
I debugged a small example to see what the batch normalization layer actually computes.

# Code
The code below is a training run that, with a batch size of 3 and a network of two stacked single-unit layers, simply regresses the function
 $$f = x^3 + 1 $$

```
#coding: utf-8

import random

import tensorflow as tf
import numpy as np

random.seed(0)

training = tf.placeholder_with_default( False, shape=(), name="training" )

x = tf.placeholder( tf.float32, shape=(None,1) )

#w = tf.Variable( tf.zeros([1,1]) )
w = tf.Variable( tf.ones([1,1]) )
#b = tf.Variable( tf.zeros([1]) )
b = tf.Variable( tf.ones([1]) )
hidden_tmp = tf.matmul( x, w ) + b # matrix multiplication
#hidden = tf.sigmoid( hidden_tmp )
bn1 = tf.layers.batch_normalization( hidden_tmp, training=training, momentum=0.9 )
hidden = tf.sigmoid( bn1 )
#print( dir(bn1) )
#exit(1)

w2 = tf.Variable( tf.ones([1,1]) )
#b2 = tf.Variable( tf.zeros([1]) )
b2 = tf.Variable( tf.ones([1]) )
f = tf.sigmoid( tf.matmul( hidden, w2 ) + b2 ) # matrix multiplication

f_ = tf.placeholder( tf.float32, shape=(None,1) )
loss = tf.reduce_mean( tf.abs( f_ - f) )
learn_rate = 0.5
trainer = tf.train.GradientDescentOptimizer( learn_rate )
extra_update_ops = tf.get_collection( tf.GraphKeys.UPDATE_OPS )
with tf.control_dependencies(extra_update_ops):
    trainer_op = trainer.minimize(loss)
#trainer_op = trainer.minimize( loss )

batch_size = 3
epochs = 10
with tf.Session() as sess:
    init = tf.global_variables_initializer()
    init.run()
    for i in range( epochs ):
        batch_xs, batch_fs = [], []
        #print( w.eval() )
        for j in range( batch_size ):
            x1 = random.random()
            f1 = x1*x1*x1 + 1 # the target function to learn
            batch_xs.append( [ x1 ] )
            batch_fs.append( [ f1 ] )

        # Inspect the batch, all variables, and the layer input/output
        # in inference mode (training defaults to False).
        print( batch_xs )
        for v in tf.global_variables():
            print(v.name, v.eval() )
        print( hidden_tmp.eval( feed_dict={x: batch_xs } ) )
        print( bn1.eval( feed_dict={x: batch_xs } ) )

        # One training step; this also runs the moving-average update ops.
        sess.run( [trainer_op, extra_update_ops],
                  feed_dict={x: batch_xs, f_: batch_fs, training:True } )

        result_loss = loss.eval( feed_dict={x: batch_xs, f_: batch_fs } )
        print( "result_loss:", result_loss )
```

# Output
The script prints output like the following.
```
[[0.6108869734438016], [0.9130110532378982], [0.9666063677707588]]
Variable:0 [[0.9991983]]
Variable_1:0 [1.]
batch_normalization/gamma:0 [0.9833956]
batch_normalization/beta:0 [0.12067804]
batch_normalization/moving_mean:0 [0.9730693]
batch_normalization/moving_variance:0 [0.41279468]
Variable_2:0 [[1.2562904]]
Variable_3:0 [1.5393883]
[[1.6103973]
 [1.9122791]
 [1.9658315]]
[[1.0949917]
 [1.5564928]
 [1.638361 ]]
....
```
# Discussion

```
hidden_tmp = tf.matmul( x, w ) + b # matrix multiplication
bn1 = tf.layers.batch_normalization( hidden_tmp, training=training, momentum=0.9 )
```
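Comparing the printed values of hidden_tmp and bn1 above, and writing
m_m : moving_mean
m_v : moving_variance
we can confirm that, in inference mode (training=False), the relation
bn1 = gamma*( (hidden_tmp - m_m) /math.sqrt( m_v+0.001) ) + beta
holds. The constant 0.001 is the default epsilon of tf.layers.batch_normalization; the printed values do not match if 0.00001 is used instead.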
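The relation can be checked numerically by plugging the printed values into it. The following is a minimal NumPy sketch, not part of the original script; the helper name `bn_inference` is hypothetical, and the constants are copied from the output shown earlier.

```python
import numpy as np

def bn_inference(h, gamma, beta, m_m, m_v, eps=0.001):
    """Batch normalization in inference mode, using the moving statistics.

    eps=0.001 is assumed to match the default epsilon of
    tf.layers.batch_normalization.
    """
    return gamma * (h - m_m) / np.sqrt(m_v + eps) + beta

# Values copied from the printed output.
hidden_tmp = np.array([[1.6103973], [1.9122791], [1.9658315]])
bn1_printed = np.array([[1.0949917], [1.5564928], [1.638361]])

bn1_manual = bn_inference(
    hidden_tmp,
    gamma=0.9833956,
    beta=0.12067804,
    m_m=0.9730693,
    m_v=0.41279468,
)

print(bn1_manual)
# Largest deviation from the values TensorFlow printed.
print(np.abs(bn1_manual - bn1_printed).max())
```

Passing `eps=0.00001` to the same helper gives values that differ from the printed ones in the third decimal place, which is how the epsilon can be pinned down.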

# Open issue
It would be worth confirming that moving_mean is indeed a moving average of the per-batch means of hidden_tmp.
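One way to approach this check is to replay the update by hand. The sketch below assumes the usual exponential-moving-average rule, moving_mean = momentum * moving_mean + (1 - momentum) * batch_mean, with momentum=0.9 as in the script and an initial moving_mean of 0. The helper `update_moving_mean` and the sample batches are hypothetical and are not taken from an actual run.

```python
import numpy as np

def update_moving_mean(moving_mean, batch, momentum=0.9):
    """One exponential-moving-average step over the batch mean."""
    batch_mean = np.mean(batch, axis=0)
    return momentum * moving_mean + (1.0 - momentum) * batch_mean

# Hypothetical hidden_tmp batches (batch size 3, one feature each).
batches = [
    np.array([[1.8], [1.7], [1.4]]),
    np.array([[1.2], [1.5], [1.9]]),
]

moving_mean = np.zeros(1)  # assumed initial value of moving_mean
for batch in batches:
    moving_mean = update_moving_mean(moving_mean, batch)
    print(moving_mean)
```

To compare against TensorFlow, the values of hidden_tmp printed just before each training step could be fed into this helper, and the result compared with batch_normalization/moving_mean:0 printed at the start of the next epoch.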

Wednesday, August 1, 2018
