Safe Mutations for Deep and Recurrent Neural Networks through Output Gradients | Uber Engineering Blog

Safe Mutations for Deep and Recurrent Neural Networks through Output Gradients

Joel Lehman, Jay Chen, Jeff Clune, and Kenneth O. Stanley

December 1, 2017

Abstract

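While neuroevolution (evolving neural networks) has a successful track record across a variety of domains, from reinforcement learning to artificial life, it is rarely applied to large, deep neural networks. A central reason is that while random mutation generally works in low dimensions, a random perturbation of thousands or millions of weights is likely to break existing functionality, providing no learning signal even if some individual weight changes were beneficial. This paper proposes a solution by introducing a family of safe mutation (SM) operators that aim, within the mutation operator itself, to find a degree of change that does not alter network behavior too much but still facilitates exploration. Importantly, these SM operators do not require any additional interactions with the environment. The most effective SM variant capitalizes on the intriguing opportunity to scale the degree of mutation of each individual weight according to the sensitivity of the network’s outputs to that weight, which requires computing the gradient of outputs with respect to the weights (instead of the gradient of error, as in conventional deep learning). This safe mutation through gradients (SM-G) operator dramatically increases the ability of a simple genetic algorithm-based neuroevolution method to find solutions in high-dimensional domains that require deep and/or recurrent neural networks (which tend to be particularly brittle to mutation), including domains that require processing raw pixels. By improving our ability to evolve deep neural networks, this new safer approach to mutation expands the scope of domains amenable to neuroevolution.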
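
To make the idea of scaling each weight's mutation by output sensitivity concrete, the following is a minimal sketch and not the authors' exact formulation. It assumes a tiny hand-written tanh network, estimates the gradient of the outputs with respect to each weight by central finite differences (a real implementation would more likely use backpropagation), and divides Gaussian noise by the resulting per-weight sensitivity. The names forward, output_sensitivity, safe_mutate, and the parameters sigma, floor, and eps are all hypothetical choices made for illustration.

```python
import math
import random


def forward(weights, x):
    """Hypothetical 2-input, 2-hidden-unit, 1-output tanh network."""
    w11, w12, w21, w22, v1, v2 = weights
    h1 = math.tanh(w11 * x[0] + w12 * x[1])
    h2 = math.tanh(w21 * x[0] + w22 * x[1])
    return [math.tanh(v1 * h1 + v2 * h2)]


def output_sensitivity(weights, inputs, eps=1e-5):
    """Root-mean-square of d(output)/d(weight) over all inputs and outputs.

    Assumption: gradients are approximated with central finite differences
    so the sketch stays dependency-free.
    """
    sens = []
    for i in range(len(weights)):
        up = list(weights)
        up[i] += eps
        down = list(weights)
        down[i] -= eps
        total, count = 0.0, 0
        for x in inputs:
            for y_up, y_down in zip(forward(up, x), forward(down, x)):
                total += ((y_up - y_down) / (2 * eps)) ** 2
                count += 1
        sens.append(math.sqrt(total / count))
    return sens


def safe_mutate(weights, inputs, sigma=0.1, floor=1e-3, rng=random):
    """Add Gaussian noise to each weight, shrunk where outputs are sensitive.

    The floor (an assumed safeguard) keeps the step finite for weights
    whose measured sensitivity is close to zero.
    """
    sens = output_sensitivity(weights, inputs)
    return [
        w + sigma * rng.gauss(0.0, 1.0) / max(s, floor)
        for w, s in zip(weights, sens)
    ]


if __name__ == "__main__":
    parent = [0.5, -0.3, 0.8, 0.2, 1.0, 0.0]
    # Inputs the parent has already seen; no new environment interaction.
    seen_inputs = [[1.0, 0.5], [-0.2, 0.7]]
    child = safe_mutate(parent, seen_inputs, rng=random.Random(0))
    print(output_sensitivity(parent, seen_inputs))
    print(child)
```

Note that the sensitivity is measured only on inputs the parent network has already encountered, which mirrors the abstract's point that the operator needs no additional interaction with the environment. Dividing by the sensitivity is one simple way to realize the scaling; other normalizations are possible.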