Question: for a set of samples whose labels are strictly 0 or 1, the values a neural network predicts will almost never be exactly 0 or 1. How do we extract useful information from such predictions?
Experiment:
    Background:
        Samples: randomly generate a set of (x, y) pairs drawn from a standard normal distribution, so almost all points fall roughly between -3 and 3.
        Labels: the label is 1 if x^2 + y^2 < 2, and 0 otherwise.
        Visualization: treat label 1 as red and label 0 as blue, use each sample's (x, y) as its coordinates, and plot the points on the axes.
    Training: read in the samples and labels and train the network's parameters.
    Prediction: generate (x, y) pairs from -3 to 3 with a step of 0.01, feed them all into the network, and obtain predictions by forward propagation.
    Results and conclusion: in the predicted values probs, each value corresponds to one grid coordinate. The values tend toward 0 or 1 and cluster around them, with relatively few values far from both. Using plt's contour function, we draw the level curve at a chosen threshold (0.5 in the first run of the code below, 0.6 in the regularized run); in the resulting figure, almost all of the red points are enclosed by that contour line. In other words, points whose predicted value is above the threshold are treated as label 1, and points below it as label 0 (a minimal standalone sketch of this thresholding step follows this outline).
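Before the full program, here is a minimal sketch of the idea the question asks about: pick a threshold and map every continuous prediction to a hard 0/1 label. The sample values, the 0.5 threshold, and the names preds/labels/hard are illustrative assumptions, not outputs of the experiment.

import numpy as np

preds = np.array([0.03, 0.92, 0.61, 0.18, 0.97])   # hypothetical network outputs
labels = np.array([0, 1, 1, 0, 1])                  # the true 0/1 labels
hard = (preds > 0.5).astype(int)                    # threshold: above 0.5 -> 1, otherwise -> 0
accuracy = np.mean(hard == labels)                  # fraction classified correctly
print(hard, accuracy)                               # [0 1 1 0 1] 1.0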
The full code follows:
import matplotlib.pyplot as plt
import tensorflow as tf
import numpy as np
BATCH_SIZE = 30
seed = 2
rdm = np.random.RandomState(seed)
#rdm.randn returns an array of values drawn from the standard normal
#(Gaussian) distribution with mean 0 and variance 1;
#rdm.random would instead return uniform values in [0, 1), not normally distributed
#Build the samples and their labels
X = rdm.randn(300,2)
Y_ = [int(x0 * x0 + x1 * x1 < 2) for (x0,x1) in X]
Y_c = [['red' if y else 'blue'] for y in Y_]
#np.vstack stacks the elements into one array;
#reshape then fixes the shape (n rows x 2 columns for X, n rows x 1 column for Y_)
X = np.vstack(X).reshape(-1,2)
Y_ = np.vstack(Y_).reshape(-1,1)
#scatter(x,y,c) plots a point of color c at each (x,y)
plt.scatter(X[:,0],X[:,1],c = np.squeeze(Y_c))
plt.show()
def get_weight(shape,regularizer):
    #Weights are drawn from a normal distribution; the L2 penalty on each weight
    #matrix is added to the 'losses' collection so it can be summed into the loss later
    w = tf.Variable(tf.random_normal(shape,seed=111),dtype = tf.float32)
    tf.add_to_collection('losses',tf.contrib.layers.l2_regularizer(regularizer)(w))
    return w
def get_bias(shape):
    #Biases are initialized to the constant 0.01
    b = tf.Variable(tf.constant(0.01,shape = shape))
    return b
x = tf.placeholder(tf.float32,shape=(None,2))
y_ = tf.placeholder(tf.float32,shape=(None,1))
w1 = get_weight([2,11],0.01)
b1 = get_bias([11])
#tf.nn.relu() activation function, giving the model its nonlinearity
y1 = tf.nn.relu(tf.matmul(x,w1) + b1)
w2 = get_weight([11,1],0.01)
b2 = get_bias([1])
y = tf.matmul(y1,w2) + b2
#Define the loss functions: plain MSE, and MSE plus the collected L2 regularization terms
loss_mse = tf.reduce_mean(tf.square(y-y_))
loss_total = loss_mse + tf.add_n(tf.get_collection('losses'))
#Define the training step without regularization (minimizes loss_mse only)
train_step = tf.train.AdamOptimizer(0.0001).minimize(loss_mse)
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    STEPS = 40000
    for i in range(STEPS):
        #BATCH_SIZE must be smaller than the total number of samples (300 here);
        #each step feeds the next slice of 30 samples
        start = (i * BATCH_SIZE) % 300
        end = start + BATCH_SIZE
        sess.run(train_step,feed_dict={x:X[start:end],y_:Y_[start:end]})
        if i % 2000 == 0:
            loss_mse_v = sess.run(loss_mse,feed_dict={x:X,y_:Y_})
            print("After %d steps, loss is: %f"%(i,loss_mse_v))
    #Build a grid of (x, y) points from -3 to 3 with a step of 0.01
    xx,yy = np.mgrid[-3:3:.01,-3:3:.01]
    grid = np.c_[np.ravel(xx),np.ravel(yy)]
    #y is the trained model; run it over the whole grid
    probs = sess.run(y,feed_dict={x:grid})
    print("probs:\n", probs)
    probs = probs.reshape(xx.shape)
    print("probs:\n",probs)
    print("w1:\n",sess.run(w1))
    print("b1:\n",sess.run(b1))
    print("w2:\n",sess.run(w2))
    print("b2:\n",sess.run(b2))
plt.scatter(X[:,0],X[:,1],c = np.squeeze(Y_c))
plt.contour(xx,yy,probs,levels=[.5])
plt.show()
#Define the training step with regularization (minimizes loss_total)
train_step = tf.train.AdamOptimizer(0.0001).minimize(loss_total)
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    STEPS = 40000
    for i in range(STEPS):
        start = (i * BATCH_SIZE) % 300
        end = start + BATCH_SIZE
        sess.run(train_step, feed_dict={x: X[start:end], y_: Y_[start:end]})
        if i % 2000 == 0:
            loss_v = sess.run(loss_total,feed_dict={x:X,y_:Y_})
            print("After %d steps, loss is: %f"%(i,loss_v))
    xx, yy = np.mgrid[-3:3:.01, -3:3:.01]
    grid = np.c_[np.ravel(xx), np.ravel(yy)]
    probs = sess.run(y, feed_dict={x: grid})
    probs = probs.reshape(xx.shape)
    print(probs)
    print("w1:\n", sess.run(w1))
    print("b1:\n", sess.run(b1))
    print("w2:\n", sess.run(w2))
    print("b2:\n", sess.run(b2))
plt.scatter(X[:, 0], X[:, 1], c=np.squeeze(Y_c))
plt.contour(xx, yy, probs, levels=[.6])
plt.show()
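As a follow-up (not part of the original code), the same threshold can be used to shade the entire region the network classifies as 1, rather than drawing only the boundary curve. The snippet below is a sketch that assumes it is appended after either training run above, so xx, yy, probs, X, and Y_c are still in scope; the 0.5 threshold and the fill color are assumptions.

#Shade everything the network predicts as label 1 (value above the threshold)
region = (probs > 0.5).astype(float)   # 1 inside the predicted "red" region, 0 outside
plt.contourf(xx, yy, region, levels=[0.5, 1.5], colors=['#ffd5d5'])
plt.scatter(X[:, 0], X[:, 1], c=np.squeeze(Y_c))
plt.show()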