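<title>DNC, RNN, and LSTM Neural Networks Implemented in NumPy - <%-__DOC_NAME__ %></title>
<meta name="keywords" content="DNC in NumPy,RNN in NumPy,LSTM in NumPy,neural networks in NumPy" />
<meta name="description" content="DNC, RNN, and LSTM neural networks implemented in NumPy. An implementation of the Differentiable Neural Computer (https://www.nature.com/articles/nature20101), kept as close as possible to the description in the paper. Task: char-level prediction. The repository also includes a simple RNN (rnn-numpy.py) and an LSTM (lstm-numpy.py). Some external data (ptb, wiki) must be downloaded separately.">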

# Translator's note

The author of this project is [krocki](https://github.com/krocki). The source code for the DNC, RNN, and LSTM implementations is in his GitHub repository: [https://github.com/krocki/dnc](https://github.com/krocki/dnc).

# Differentiable Neural Computer in NumPy

An implementation of the Differentiable Neural Computer (https://www.nature.com/articles/nature20101), kept as close as possible to the description in the paper. Task: char-level prediction. The repository also includes a simple RNN (rnn-numpy.py) and an LSTM (lstm-numpy.py). Some external data (ptb, wiki) must be downloaded separately.

### Quick start

```bash
python dnc-debug.py
```

These versions are complete:

```bash
python rnn-numpy.py
python lstm-numpy.py
python dnc-numpy.py
```

#### Credits

The RNN code is based on the original work by A. Karpathy (min-char-rnn.py).

gist: https://gist.github.com/karpathy/d4dee566867f8291f086

post: http://karpathy.github.io/2015/05/21/rnn-effectiveness/

### Features

- The RNN version still depends on NumPy
- Added batching
- Modified the RNN into an LSTM
- Includes gradient checking

### DNC

**Implemented**

- LSTM controller
- 2D memory array
- Content-addressable read/write

**Issues**

The softmax over the key similarities can cause a crash (division by zero). If this happens, the run needs to be restarted.
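
This kind of crash is typical of content-based addressing, where a key is compared with every memory row by cosine similarity and the similarities are passed through a softmax. The sketch below is not the repository's code. It shows one common way to guard both steps, assuming a hypothetical `content_weights` helper, a small `eps` added to the norms, and the usual max-subtraction trick inside the softmax.

```python
import numpy as np

def stable_softmax(x):
    """Softmax along the last axis, with max-subtraction to avoid overflow."""
    shifted = x - np.max(x, axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / np.sum(e, axis=-1, keepdims=True)

def content_weights(memory, key, strength=1.0, eps=1e-8):
    """Hypothetical helper: content-based addressing weights.

    memory:   (N, W) array, one row per memory slot
    key:      (W,) lookup key
    strength: scalar sharpening factor applied before the softmax
    eps:      assumed guard that keeps the norms away from zero
    """
    mem_norm = np.linalg.norm(memory, axis=1) + eps
    key_norm = np.linalg.norm(key) + eps
    similarity = memory @ key / (mem_norm * key_norm)  # cosine similarity per slot
    return stable_softmax(strength * similarity)

# An all-zero memory (e.g. right after initialization) no longer divides by zero.
memory = np.zeros((4, 3))
key = np.array([1.0, 0.0, 0.0])
weights = content_weights(memory, key)
```

Whether an `eps` guard like this is enough for the failure seen in this project is an assumption; it only addresses the zero-norm and overflow cases.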

**TODO**

- Dynamic memory allocation/freeing
- Faster implementation (using PyTorch?)
- Saving the model
- Examples

### Sample output:

Time, iteration, BPC (prediction error -> bits per character, lower is better), processing speed

```text
0: 4163.009 s, iter 104800, 1.2808 BPC, 1488.38 char/s
```
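
For reference, BPC is the average negative base-2 log probability that the model assigns to the correct next character (a cross-entropy in nats converts to bits by dividing by ln 2). A minimal sketch, assuming a hypothetical `bits_per_char` helper that receives per-step predicted distributions and target indices; the repository's own bookkeeping may differ:

```python
import numpy as np

def bits_per_char(probs, targets):
    """Mean negative log2 probability of the correct characters.

    probs:   (T, V) array, each row a predicted distribution over V characters
    targets: (T,) integer array with the index of the true next character
    """
    correct = probs[np.arange(len(targets)), targets]
    return float(np.mean(-np.log2(correct)))

# A uniform guess over 4 characters costs 2 bits per character.
uniform = np.full((5, 4), 0.25)
bpc = bits_per_char(uniform, np.array([0, 1, 2, 3, 0]))
```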

### Samples from the model (alice29.txt):


```text
 e garden as she very dunced.

 Alice fighting be it. The breats?
 here on likegs voice withoup.

 `You minced more hal disheze, and I done hippertyou-sage, who say it's a look down whales that
his meckling moruste!' said Alice's can younderen, in they puzzled to them!'

 `Of betinkling reple bade to, punthery pormoved the piose himble, of to he see foudhed
just rounds, seef wance side pigs, it addeal sumprked.

 `As or the Gryphon,' Alice said,
Fith didn't begun, and she garden as in a who tew.'

 Hat hed think after as marman as much the pirly
startares to dreaps
was one poon it
out him were brived they
proce?


 CHAT, I fary,' said the Hat,' said the Divery tionly to himpos.'

 `Com, planere?"'

 `Ica--'
 Onlice IN's tread! Wonderieving again, `but her rist,' said Alice.


 She
sea do voice.

 `I'mm the Panthing alece of the when beaning must anquerrouted not reclow, sobs to

 `In of queer behind her houn't seemed
```
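
### Numerical gradient check of the backward pass (values in the rightmost column should be < 1e-4):

The middle column shows the ranges of the computed analytical and numerical gradients (these should more or less match).
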
```text
----
GRAD CHECK

Wxh: n = [-1.828500e-02, 5.292866e-03] min 3.005175e-09, max 3.505012e-07
     a = [-1.828500e-02, 5.292865e-03] mean 5.158434e-08 # 10/4
Whh: n = [-3.614049e-01, 6.580141e-01] min 1.549311e-10, max 4.349188e-08
     a = [-3.614049e-01, 6.580141e-01] mean 9.340821e-09 # 10/10
Why: n = [-9.868277e-02, 7.518284e-02] min 2.378911e-09, max 1.901067e-05
     a = [-9.868276e-02, 7.518284e-02] mean 1.978080e-06 # 10/10
Whr: n = [-3.652128e-02, 1.372321e-01] min 5.520914e-09, max 6.750276e-07
     a = [-3.652128e-02, 1.372321e-01] mean 1.299713e-07 # 10/10
Whv: n = [-1.065475e+00, 4.634808e-01] min 6.701966e-11, max 1.462031e-08
     a = [-1.065475e+00, 4.634808e-01] mean 4.161271e-09 # 10/10
Whw: n = [-1.677826e-01, 1.803906e-01] min 5.559963e-10, max 1.096433e-07
     a = [-1.677826e-01, 1.803906e-01] mean 2.434751e-08 # 10/10
Whe: n = [-2.791997e-02, 1.487244e-02] min 3.806438e-08, max 8.633199e-06
     a = [-2.791997e-02, 1.487244e-02] mean 1.085696e-06 # 10/10
Wrh: n = [-7.319636e-02, 9.466716e-02] min 4.183225e-09, max 1.369062e-07
     a = [-7.319636e-02, 9.466716e-02] mean 3.677372e-08 # 10/10
Wry: n = [-1.191088e-01, 5.271329e-01] min 1.168224e-09, max 1.568242e-04
     a = [-1.191088e-01, 5.271329e-01] mean 2.827306e-05 # 10/10
bh: n = [-1.363950e+00, 9.144058e-01] min 2.473756e-10, max 5.217119e-08
    a = [-1.363950e+00, 9.144058e-01] mean 7.066159e-09 # 10/10
by: n = [-5.594528e-02, 5.814085e-01] min 1.604237e-09, max 1.017124e-05
    a = [-5.594528e-02, 5.814085e-01] mean 1.026833e-06 # 10/10
```
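
A gradient check of this kind compares the analytical gradients from backpropagation with numerical estimates obtained by perturbing each parameter. The sketch below illustrates the idea with central differences on a toy loss. The function names, the step size `delta`, and the relative-error formula are assumptions made for illustration and are not taken from the repository.

```python
import numpy as np

def numerical_grad(f, w, delta=1e-5):
    """Central-difference estimate of df/dw for a scalar function f."""
    grad = np.zeros_like(w)
    for idx in np.ndindex(*w.shape):
        old = w[idx]
        w[idx] = old + delta
        f_plus = f(w)
        w[idx] = old - delta
        f_minus = f(w)
        w[idx] = old  # restore the original value
        grad[idx] = (f_plus - f_minus) / (2 * delta)
    return grad

def relative_error(a, n):
    """Element-wise relative error between analytical and numerical gradients."""
    return np.abs(a - n) / np.maximum(np.abs(a) + np.abs(n), 1e-12)

# Toy loss: L(W) = 0.5 * ||tanh(W x)||^2, with a hand-derived gradient.
rng = np.random.default_rng(0)
W = rng.standard_normal((3, 4)) * 0.1
x = rng.standard_normal(4)

def loss(W):
    h = np.tanh(W @ x)
    return 0.5 * np.sum(h ** 2)

h = np.tanh(W @ x)
analytical = np.outer(h * (1 - h ** 2), x)  # dL/dW via the chain rule
numerical = numerical_grad(loss, W)
errors = relative_error(analytical, numerical)
print("min %e, max %e, mean %e" % (errors.min(), errors.max(), errors.mean()))
```

If the backward pass is correct, the printed errors should be small, in the same spirit as the < 1e-4 threshold mentioned above.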


## Source

Translated by the NumPy Chinese documentation project. The original author is krocki. Translated from: [https://github.com/krocki/dnc](https://github.com/krocki/dnc)

The source code for the DNC, RNN, and LSTM implementations is at [https://github.com/krocki/dnc](https://github.com/krocki/dnc).