Start documenting autodiff activities #148201
Thanks, @ZuseZ4! I will have a look tomorrow.
Thanks @ZuseZ4 ! This already clarifies some things. Knowing which activity types are valid for forward/reverse mode is very helpful.
For me, it is still a bit unclear how the input/output annotations work in the reverse mode case where the function writes its output through a mutable reference argument.
For example, if we have a function f(x) = C * x^2 and we want to compute df/dx, is this how we would apply the macro?
    #[autodiff_reverse(d_foo, Const, Active, Duplicated)]
    fn foo(c: f32, x: f32, out: &mut f32) {
        *out = c * x * x;
    }

If so, is this how we call the function?

    let C: f32 = 4.0;
    let x: f32 = 3.0;
    // store the result of foo
    let mut foo_result: f32 = 0.0;
    // shadow variable to store the dF/dx value
    let mut dFoo_dx = 1.0;
    d_foo(C, x, &mut foo_result, &mut dFoo_dx);
    // I would expect dFoo_dx to be 2 * 4 * 3 = 24

/// if we are not interested in computing the derivatives with respect to this argument.
///
/// We often want to track how one or more input floats affect one output float. This output can
/// be a scalar return value, or a mutable reference or pointer argument. In this case, the
To me, it is unclear what "this case" refers to. Based on the following text, I think this refers to the case when the output is a mutable reference or pointer argument. Is that right? If so, maybe something like the following would be clearer:
    - /// be a scalar return value, or a mutable reference or pointer argument. In this case, the
    + /// be a scalar return value, or a mutable reference or pointer argument. In the latter case, the
or
    - /// be a scalar return value, or a mutable reference or pointer argument. In this case, the
    + /// be a scalar return value, or a mutable reference or pointer argument. If the output is stored in a mutable reference or pointer argument, the
/// the output should be marked as active or duplicated and initialized to `1.0`. After calling
/// the generated function, the shadow(s) of the input(s) will contain the derivatives. If the
/// function has more than one output float marked as active or duplicated, users might want to
/// set one of them to `1.0` and the others to `0.0` to compute partial derivatives.
It is a bit unclear to me how the `1.0` and `0.0` are used to specify the partial derivatives. I would assume the output seeded with `1.0` has its derivative evaluated and the one seeded with `0.0` does not, but I think it would be good to be explicit in the docs.
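To make the seeding question concrete, here is a hand-written sketch in plain Rust. It does not use `std::autodiff` (which needs a nightly toolchain built with Enzyme) and it does not reproduce the signature of the generated function; `foo`, `d_foo_by_hand`, `seed1` and `seed2` are hypothetical names, and the second output `c * x` is made up for illustration. The sketch assumes the usual reverse-mode reading of the seeds: the result for `x` is the seed-weighted sum of the per-output derivatives, so a seed of `1.0` selects an output and a seed of `0.0` drops it.

```rust
/// Hypothetical primal function with two outputs written through mutable references.
fn foo(c: f32, x: f32, out1: &mut f32, out2: &mut f32) {
    *out1 = c * x * x; // the f(x) = C * x^2 example from this thread
    *out2 = c * x; // a second, made-up output
}

/// Hand-written stand-in for a reverse-mode derivative of `foo` with respect to `x`.
/// `seed1` and `seed2` play the role of the output shadows.
/// Returns seed1 * d(out1)/dx + seed2 * d(out2)/dx.
fn d_foo_by_hand(
    c: f32,
    x: f32,
    out1: &mut f32,
    seed1: f32,
    out2: &mut f32,
    seed2: f32,
) -> f32 {
    // Run the primal computation so both outputs are filled in.
    foo(c, x, out1, out2);
    // Derivatives written out by hand: d(c*x*x)/dx = 2*c*x, d(c*x)/dx = c.
    seed1 * (2.0 * c * x) + seed2 * c
}

fn main() {
    let (c, x) = (4.0_f32, 3.0_f32);
    let (mut o1, mut o2) = (0.0_f32, 0.0_f32);

    // Seed (1.0, 0.0): only the derivative of out1 contributes.
    let d_out1_dx = d_foo_by_hand(c, x, &mut o1, 1.0, &mut o2, 0.0);
    // Seed (0.0, 1.0): only the derivative of out2 contributes.
    let d_out2_dx = d_foo_by_hand(c, x, &mut o1, 0.0, &mut o2, 1.0);

    println!("out1 = {o1}, out2 = {o2}");
    println!("d(out1)/dx = {d_out1_dx}, d(out2)/dx = {d_out2_dx}");
}
```

With `c = 4` and `x = 3`, the first call gives `2 * 4 * 3 = 24`, matching the value expected earlier in this thread, and the second gives `4`. If the generated function behaves this way, seeding both outputs with `1.0` would return the sum of the two derivatives rather than either partial derivative on its own.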
@kevinyamauchi Here's a first draft. Please lmk whether it answers your question, or where you'd like clarification or rewording to make it easier to understand.
r? ghost