I recently took part in the Meituan algorithm competition, which has now wrapped up; my final ranking was 93. Honestly, my only goal was to finish the competition: as an Android developer I'm still a beginner at deep learning and algorithms, and this was my first time entering a contest like this. Toward the end I barely submitted anymore because I had hit a bottleneck and the score simply wouldn't improve. Still, I learned an enormous amount through the competition, so I'm writing this down as a record.

Data Cleaning

Data cleaning is extremely important, which I didn't realize at first. In the beginning I wrote Python code from scratch to clean the data; later I switched to pandas, which made the whole process far more efficient.
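As a minimal sketch of this kind of pandas cleaning (the file name, column names, and filtering rules here are purely hypothetical, not the actual competition data):

import pandas as pd

# Load the raw data (hypothetical file and columns).
df = pd.read_csv('orders.csv')

# Drop rows missing key fields and remove duplicate orders.
df = df.dropna(subset=['order_id', 'delivery_time'])
df = df.drop_duplicates(subset=['order_id'])

# Filter out obviously invalid records, e.g. non-positive delivery times.
df = df[df['delivery_time'] > 0]

# Clip extreme outliers to the 1st/99th percentiles.
low, high = df['delivery_time'].quantile([0.01, 0.99])
df['delivery_time'] = df['delivery_time'].clip(low, high)

df.to_csv('orders_clean.csv', index=False)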

Feature Engineering

When a model performs poorly, it is usually not the algorithm or the neural network that is at fault; most of the time the features simply weren't chosen well. This is an area I still need to study. See: 选择好特征 (Good Features)

Data Normalization

This simply means compressing the data into the range 0 to 1 or -1 to 1. See: 特征标准化 (Feature Normalization)
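A minimal sketch of the two common variants with NumPy (the sample array is only illustrative): min-max scaling maps each feature column into [0, 1], while z-score standardization gives each column zero mean and unit variance.

import numpy as np

x = np.array([[10., 200.],
              [20., 400.],
              [30., 800.]])

# Min-max scaling: each column compressed into [0, 1].
x_minmax = (x - x.min(axis=0)) / (x.max(axis=0) - x.min(axis=0))

# Z-score standardization: zero mean, unit variance per column.
x_zscore = (x - x.mean(axis=0)) / x.std(axis=0)

print(x_minmax)
print(x_zscore)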

批标准化 (Batch Normalization)

Before batch normalization came along, you could basically only train very shallow networks; add a few more layers and the whole network would blow up and stop learning altogether. With batch normalization you can build genuinely deep networks. Morvan's article on this is very good: 批标准化 (Batch Normalization)

TensorFlow code using batch normalization:

# -*- coding: utf-8 -*-
import tensorflow as tf
import data_provider
import countdown
import sys
import base_model
import base_model_train
import matplotlib.pyplot as plt

global_step = tf.Variable(0)
keep_prob = tf.placeholder(tf.float32)
xs = tf.placeholder(tf.float32, [None, 4])
ys = tf.placeholder(tf.float32, [None, 1])
training = tf.placeholder_with_default(False, shape=(), name='training')

# Three fully connected hidden layers, each followed by batch normalization and an ELU activation.
with tf.name_scope('dnn'):
    hidden1 = base_model_train.batch_normalization_layer(xs, 256, activation=tf.nn.elu, training=training)
    hidden2 = base_model_train.batch_normalization_layer(hidden1, 256, activation=tf.nn.elu, training=training)
    hidden3 = base_model_train.batch_normalization_layer(hidden2, 256, activation=tf.nn.elu, training=training)
    prediction = tf.layers.dense(hidden3, 1, name='prediction')

with tf.name_scope('loss'):
    # loss = base_model.huber_loss(ys, prediction)
    loss = tf.reduce_mean(tf.abs(ys - prediction))
    mae = loss

with tf.name_scope('train'):
    learning_rate = tf.train.exponential_decay(0.01, global_step, 1000, 0.977, staircase=True)
    train_optimizer = tf.train.AdamOptimizer(learning_rate).minimize(loss, global_step=global_step)

init = tf.global_variables_initializer()
# Batch normalization registers its moving-average updates in UPDATE_OPS; they must be run together with the train step.
extra_update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)

provider = data_provider.TrainDataProvider(use_feature=False)
countdown = countdown.Countdown(hours=12)
batch = 1000
batch_size = 1000
model_saver = base_model.ModelSaver(target_train_loss=90, target_test_loss=90)
x_train, y_train = provider.random_train_batch(provider.test_size)
# x_train, y_train = provider.random_train_batch(batch)
x_test, y_test = provider.get_total_test_batch()
# x_test, y_test = provider.random_test_batch(batch)
with tf.Session() as sess:
    sess.run(init)
    for i in xrange(sys.maxint):
        print('batch %s start...' % i)
        print('learning_rate: %s' % (sess.run(learning_rate)))
        for j in xrange(batch):
            x, y = provider.random_train_batch(batch_size)
            # Feed training=True so batch norm uses batch statistics and its moving averages get updated.
            sess.run([train_optimizer, extra_update_ops], feed_dict={xs: x, ys: y, keep_prob: 1, training: True})

        train_mae = sess.run(mae, feed_dict={xs: x_train, ys: y_train, keep_prob: 1})
        print('train_mae: %s' % train_mae)
        validate_mae = sess.run(mae, feed_dict={xs: x_test, ys: y_test, keep_prob: 1})
        print('validate_mae: %s' % validate_mae)
        model_saver.check_save(sess, train_mae, validate_mae)
        if i > 0 and i % 50 == 0:
            model_saver.save_model(sess)
        expire, left_time = countdown.check_expire()
        if expire:
            break
        else:
            print('left_time: %s' % left_time)

    model_saver.final_save(sess)

    x_t, y_t = provider.random_test_batch(10)
    print('----------label----------')
    print(y_t)
    print('----------prediction----------')
    ret = sess.run(prediction, feed_dict={xs: x_t, ys: y_t, keep_prob: 1})
    print(ret)
    x_t, y_t = provider.random_test_batch(100)
    y_c = sess.run(prediction, feed_dict={xs: x_t, ys: y_t, keep_prob: 1})
    fig = plt.figure()
    for i in xrange(x_t.shape[1]):
        x_i = x_t[:, i]
        ax = fig.add_subplot(2, 5, i + 1)
        ax.scatter(x_i, y_t)
        ax.plot(x_i, y_c, 'r-')
        ax.grid()
    plt.show()
# -*- coding: utf-8 -*-
# Helper module (imported above as base_model_train): a dense layer wrapped with batch normalization.
import tensorflow as tf


def batch_normalization_layer(inputs, units, activation, training, momentum=0.9, name=None):
    hidden = tf.layers.dense(inputs, units, name=name)
    bn = tf.layers.batch_normalization(hidden, training=training, momentum=momentum)
    bn_act = activation(bn)
    return bn_act

This way of implementing batch normalization is a bit better than the one Morvan provides. The key point is to define a training placeholder that controls batch normalization: training is set to True only while training, and left at False during validation.
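To make that switch concrete, here is a minimal sketch of the pattern used above (the tensor names are the ones defined in the training script): because training is created with tf.placeholder_with_default(False, ...), you feed training: True only for the train step and simply omit it at validation time, so batch norm falls back to its moving averages.

# Training step: use batch statistics and refresh the moving averages via UPDATE_OPS.
sess.run([train_optimizer, extra_update_ops],
         feed_dict={xs: x, ys: y, keep_prob: 1, training: True})

# Validation / inference: `training` is omitted and defaults to False,
# so batch norm uses its accumulated moving mean and variance.
validate_mae = sess.run(mae, feed_dict={xs: x_test, ys: y_test, keep_prob: 1})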

Other tricks for optimizing neural networks

Morvan's blog has a whole series of articles on optimizing neural networks, all very well written:

检验神经网络 (Evaluation)

特征标准化 (Feature Normalization)

选择好特征 (Good Features)

激励函数 (Activation Function)

过拟合 (Overfitting)

加速神经网络训练 (Speed Up Training)

处理不均衡数据 (Imbalanced data)

批标准化 (Batch Normalization)

L1 / L2 正规化 (Regularization)