
Python v2 API to new operator framework #3129

@wangkuiyi

Description


Design Doc: RNNOp

A Plain Network

```python
predict = paddle.layer.fc(
    paddle.layer.data(name="x"),
    output_size=100)
cost = paddle.layer.mse(
    predict,
    paddle.layer.data(name="y"))
parameters = paddle.train(cost)
paddle.save_model(cost, paddle.datasets.mnist.train(), parameters, "filename")
p = paddle.load_model(predict, "filename")
paddle.infer(predict, ...)
```

Layers, Variables, and Default Scope

```python
# in package paddle.layer

def data(name):
    return paddle.cpp.variable(paddle.cpp.default_scope(), name)

def fc(input, output_size):
    output = paddle.cpp.variable(paddle.cpp.default_scope())
    W = paddle.cpp.variable(paddle.cpp.default_scope(), label="parameter")
    b = paddle.cpp.variable(paddle.cpp.default_scope(), label="parameter")
    paddle.cpp.operator("FC", read={input, W, b}, output_size=output_size,
                        write={output})
    return output

def mse(input1, input2):
    output = paddle.cpp.variable(paddle.cpp.default_scope())
    paddle.cpp.operator("MSE", read={input1, input2}, write={output})
    return output
```

where

  • paddle.cpp.variable is a Python binding of the C++ method Scope::NewVar().
  • paddle.cpp.operator creates an operator and marks it as a reader of some variables and a writer of some others. We will cover this in more detail later; a minimal sketch follows this list.
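
To make the read/write bookkeeping concrete, here is a minimal Python sketch of what the wiring could look like. Only the names Scope, Variable, NewVar, readers_, and writers_ come from this doc; everything else (method spellings, the op_type argument) is an assumption for illustration.

```python
class Variable:
    def __init__(self, name):
        self.name = name
        self.readers = []  # operators that read this variable
        self.writers = []  # operators that write this variable

class Scope:
    def __init__(self):
        self.vars = {}

    def new_var(self, name):
        # Analogue of C++ Scope::NewVar(): create-or-return a named Variable.
        return self.vars.setdefault(name, Variable(name))

    def find_var(self, name):
        return self.vars.get(name)

class Operator:
    def __init__(self, scope, op_type, inputs, outputs):
        self.type = op_type
        self.inputs = inputs    # variable *names*, as in OperatorBase below
        self.outputs = outputs
        # Register this operator as a reader/writer of its variables.
        for n in inputs:
            scope.new_var(n).readers.append(self)
        for n in outputs:
            scope.new_var(n).writers.append(self)

scope = Scope()
fc = Operator(scope, "FC", ["x", "W", "b"], ["fc_out"])
mse = Operator(scope, "MSE", ["fc_out", "y"], ["cost"])
assert scope.find_var("cost").writers == [mse]
```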

paddle::operator::Net

paddle.train receives a variable created by paddle.layer.mse and needs to trace all related operators and sort them in topological order.

Please be aware that all operators are derived from class OperatorBase, which refers to Variables by their names:

```cpp
class OperatorBase {
  std::vector<std::string> inputs_;
  std::vector<std::string> outputs_;
};
```

and a Variable doesn't have a name unless it is in a Scope.

Also, each Variable maintains:

```cpp
class Variable {
  std::list<Operator*> readers_;
  std::list<Operator*> writers_;
};
```

Please be aware that the trace from an operator to its input variables depends on the default scope. The tracing is done in C++ space, so paddle.cpp.default_scope is a binding to C++ code.

```cpp
class Net {
 public:
  // Traces backward from `output` via writers_, recording each operator's
  // distance to the output, then sorts operators by descending distance
  // so that upstream operators run first.
  static Net* TraceAndBuild(Variable* output, Scope* scope) {
    std::vector<std::pair<Operator*, int /*distance to output*/>> dists;
    std::list<std::pair<Variable*, int /*distance to output*/>> frontier;
    frontier.push_back(std::make_pair(output, 0));
    while (!frontier.empty()) {
      Variable* v = frontier.front().first;
      int dist = frontier.front().second;
      frontier.pop_front();
      for (Operator* o : v->writers_) {
        dists.push_back(std::make_pair(o, dist));
        for (const std::string& s : o->inputs_) {
          frontier.push_back(std::make_pair(scope->FindVar(s), dist + 1));
        }
      }
    }
    std::sort(dists.begin(), dists.end(),
              [](const auto& a, const auto& b) { return a.second > b.second; });
    return new Net(dists);
  }
};
```

We can call

```cpp
Net::TraceAndBuild(output_variable, DefaultScope())->Run(DefaultScope());
```

to extract the network using the default scope and run it.
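
For intuition, here is a self-contained Python rendering of the same backward trace (the classes and names here are illustrative, not the actual C++ API). Tracing the plain network from cost finds MSE at distance 0 and FC at distance 1; sorting by descending distance yields the execution order FC, then MSE.

```python
class Variable:
    def __init__(self, name):
        self.name = name
        self.writers = []

class Scope:
    def __init__(self):
        self.vars = {}
    def new_var(self, name):
        return self.vars.setdefault(name, Variable(name))
    def find_var(self, name):
        return self.vars.get(name)

class Operator:
    def __init__(self, scope, op_type, inputs, outputs):
        self.type, self.inputs, self.outputs = op_type, inputs, outputs
        for n in outputs:
            scope.new_var(n).writers.append(self)
        for n in inputs:
            scope.new_var(n)

def trace_and_build(output, scope):
    """Python analogue of Net::TraceAndBuild: BFS backward over writers."""
    dists, frontier = [], [(output, 0)]
    while frontier:
        v, dist = frontier.pop(0)
        for op in v.writers:
            dists.append((op, dist))
            frontier.extend((scope.find_var(s), dist + 1) for s in op.inputs)
    # Descending distance-to-output gives the topological (execution) order.
    dists.sort(key=lambda p: -p[1])
    return [op for op, _ in dists]

scope = Scope()
Operator(scope, "FC", ["x", "W", "b"], ["fc_out"])
Operator(scope, "MSE", ["fc_out", "y"], ["cost"])
print([op.type for op in trace_and_build(scope.find_var("cost"), scope)])
# -> ['FC', 'MSE']
```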

Scope Hierarchy

An RNN operator may have three kinds of variables:

  1. global variable -- in outer scope
  2. memory variable -- in RNNOp-local scope
  3. local variable -- in step-local scope
```
           outer scope
               /|\
                |
   RNNOp scope (the memory over steps)
     /|\       /|\       /|\
      |         |         |
   step-0    step-1    step-2
   scope     scope     scope
```

Just like what a programming language compiler/interpreter would do: for each step there is a step-local scope, but there is only one copy of the compiled code (binary code), or, in our case, the step-net.

The three tiers above can be simplified to two tiers by moving memory variables into the outer scope, but this is not necessary:

```
   outer scope (including all memory variables of an RNNOp)
     /|\       /|\       /|\
      |         |         |
   step-0    step-1    step-2
   scope     scope     scope
```
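
The lookup rule this hierarchy implies is the usual lexical-scoping one: a step-local scope resolves a name locally first and falls back to its parent. Here is a minimal sketch; the parent-pointer design and method names are assumptions that mirror Scope::NewVar/FindVar above.

```python
class Scope:
    def __init__(self, parent=None):
        self.parent = parent
        self.vars = {}

    def new_scope(self):
        # Child scope, e.g. one per RNN step.
        return Scope(parent=self)

    def new_var(self, name, value=None):
        self.vars[name] = value
        return value

    def find_var(self, name):
        # Look locally first, then fall back to enclosing scopes, so a
        # step scope can see memory variables kept in the outer scope.
        if name in self.vars:
            return self.vars[name]
        return self.parent.find_var(name) if self.parent else None

outer = Scope()
outer.new_var("memory", "accumulated state")
step0 = outer.new_scope()
step0.new_var("step_input", "x[0]")
assert step0.find_var("memory") == "accumulated state"  # found in outer scope
assert outer.find_var("step_input") is None             # not visible outside
```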

A Recurrent Network

```python
x = paddle.layer.data(name="features")
y = paddle.layer.data(name="labels")
accum = paddle.framework.tensor()
cost = paddle.layer.mse(
    paddle.layer.fc(
        paddle.layer.rnn(
            input=paddle.layer.fc(x),
            step_net=paddle.layer.fc(
                paddle.layer.add_to(accum, NULL),
                output_size=100),
            concat_output=True)),
    y)
paddle.train(cost, ...)
```

Here we use NULL as a placeholder for the step net's input.

Please notice that we don't have to consume the output of an RNNOp. For example, we can use the memory as the RNNOp's output:

```python
x = paddle.layer.data(name="features")
y = paddle.layer.data(name="labels")
memory = paddle.framework.tensor()
paddle.layer.rnn(
    input=paddle.layer.fc(x),
    step_net=paddle.layer.fc(
        paddle.layer.add_to(memory, NULL),
        output_size=100),
    concat_output=True)
cost = paddle.layer.mse(paddle.layer.fc(memory), y)
paddle.train(cost, ...)
```

Step-Net

The example above shows that the step_net parameter of paddle.layer.rnn accepts a variable returned by paddle.layer.fc. We need to trace the step-net from this variable, which can be done by calling the aforementioned paddle::operator::Net::TraceAndBuild:

```cpp
namespace paddle {
namespace operator {

class RNN {
 public:
  void Run(Scope* scope) {
    RNNInput* whole_input = inputs_[0]->Get<RNNInput>();
    int sequence_len = whole_input->Len(0);
    for (int i = 0; i < sequence_len; ++i) {
      // Each step gets its own step-local scope whose "step_input"
      // variable is the i-th slice of the whole input sequence.
      Scope* step_scope = scope->NewScope();
      step_scope->NewVar("step_input")
          ->GetMutable<Tensor>()
          ->Slice(whole_input, i);
      Net* net = Net::TraceAndBuild(GetAttr<Variable*>("step_net"), step_scope);
      net->Run(step_scope);
    }
  }
};

}  // namespace operator
}  // namespace paddle
```
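
Putting the pieces together, here is a hedged Python rendering of this run loop. It reuses the toy hierarchical Scope sketched earlier; run_rnn, the list slicing, and the arithmetic step_net are stand-ins for the real Net/Tensor machinery, not the actual API.

```python
class Scope:
    def __init__(self, parent=None):
        self.parent, self.vars = parent, {}
    def new_scope(self):
        return Scope(parent=self)
    def new_var(self, name, value=None):
        self.vars[name] = value
        return value
    def find_var(self, name):
        if name in self.vars:
            return self.vars[name]
        return self.parent.find_var(name) if self.parent else None

def run_rnn(scope, whole_input, step_net):
    """Mirror of RNN::Run: one step-local scope per time step."""
    for i in range(len(whole_input)):
        step_scope = scope.new_scope()
        # Bind the i-th slice of the sequence as this step's input.
        step_scope.new_var("step_input", whole_input[i])
        # Stand-in for Net::TraceAndBuild(step_net, step_scope)->Run(...).
        step_net(step_scope)

outer = Scope()
outer.new_var("memory", 0)

def step_net(step_scope):
    # add_to(memory, step_input): accumulate into the outer-scope memory.
    total = step_scope.find_var("memory") + step_scope.find_var("step_input")
    outer.vars["memory"] = total

run_rnn(outer, [1, 2, 3], step_net)
print(outer.find_var("memory"))  # -> 6
```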
