
Commit fd2949a

Fix AdamW documentation (#1686)
1 parent 888788d · commit fd2949a

File tree

1 file changed: +6 −12 lines changed

bitsandbytes/optim/adamw.py

Lines changed: 6 additions & 12 deletions
@@ -26,7 +26,7 @@ def __init__(
         Base AdamW optimizer.

         Arguments:
-            params (`torch.tensor`):
+            params (`torch.Tensor`):
                 The input parameters to optimize.
             lr (`float`, defaults to 1e-3):
                 The learning rate.
@@ -87,7 +87,7 @@ def __init__(
         8-bit AdamW optimizer.

         Arguments:
-            params (`torch.tensor`):
+            params (`torch.Tensor`):
                 The input parameters to optimize.
             lr (`float`, defaults to 1e-3):
                 The learning rate.
@@ -159,7 +159,7 @@ def __init__(
         32-bit AdamW optimizer.

         Arguments:
-            params (`torch.tensor`):
+            params (`torch.Tensor`):
                 The input parameters to optimize.
             lr (`float`, defaults to 1e-3):
                 The learning rate.
@@ -219,7 +219,7 @@ def __init__(
         Paged AdamW optimizer.

         Arguments:
-            params (`torch.tensor`):
+            params (`torch.Tensor`):
                 The input parameters to optimize.
             lr (`float`, defaults to 1e-3):
                 The learning rate.
@@ -241,8 +241,6 @@ def __init__(
                 Adapts clipping threshold automatically by tracking the last 100 gradient norms and clipping the gradient at a certain percentile to improve stability.
             block_wise (`bool`, defaults to `True`):
                 Whether to independently quantize each block of tensors to reduce outlier effects and improve stability.
-            is_paged (`bool`, defaults to `False`):
-                Whether the optimizer is a paged optimizer or not.
         """
         super().__init__(
             "adam",
@@ -279,7 +277,7 @@ def __init__(
         Paged 8-bit AdamW optimizer.

         Arguments:
-            params (`torch.tensor`):
+            params (`torch.Tensor`):
                 The input parameters to optimize.
             lr (`float`, defaults to 1e-3):
                 The learning rate.
@@ -303,8 +301,6 @@ def __init__(
                 Adapts clipping threshold automatically by tracking the last 100 gradient norms and clipping the gradient at a certain percentile to improve stability.
             block_wise (`bool`, defaults to `True`):
                 Whether to independently quantize each block of tensors to reduce outlier effects and improve stability.
-            is_paged (`bool`, defaults to `False`):
-                Whether the optimizer is a paged optimizer or not.
         """
         # Validate unsupported parameters
         if amsgrad:
@@ -350,7 +346,7 @@ def __init__(
         Paged 32-bit AdamW optimizer.

         Arguments:
-            params (`torch.tensor`):
+            params (`torch.Tensor`):
                 The input parameters to optimize.
             lr (`float`, defaults to 1e-3):
                 The learning rate.
@@ -372,8 +368,6 @@ def __init__(
                 Adapts clipping threshold automatically by tracking the last 100 gradient norms and clipping the gradient at a certain percentile to improve stability.
             block_wise (`bool`, defaults to `True`):
                 Whether to independently quantize each block of tensors to reduce outlier effects and improve stability.
-            is_paged (`bool`, defaults to `False`):
-                Whether the optimizer is a paged optimizer or not.
         """
         super().__init__(
             "adam",
