Closed
Labels
bug, callback: model checkpoint, checkpointing, ver: 2.5.x
Description
Bug description
Running the following snippet generates a self-referencing symlink, last.ckpt -> last.ckpt.
What version are you seeing the problem on?
v2.5
Reproduced in studio
No response
How to reproduce the bug
import uuid

import torch
from pytorch_lightning import Trainer, LightningModule
from pytorch_lightning.callbacks import ModelCheckpoint
from torch.utils.data import DataLoader, Dataset


class RandomDataset(Dataset):
    def __init__(self, size, length):
        self.len = length
        self.data = torch.randn(length, size)

    def __getitem__(self, index):
        return self.data[index]

    def __len__(self):
        return self.len


class BoringModel(LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)

    def forward(self, x):
        return self.layer(x)

    def training_step(self, batch, batch_idx):
        loss = self(batch).sum()
        self.log("train_loss", loss)
        return {"loss": loss}

    def validation_step(self, batch, batch_idx):
        loss = self(batch).sum()
        self.log("valid_loss", loss)

    def test_step(self, batch, batch_idx):
        loss = self(batch).sum()
        self.log("test_loss", loss)

    def configure_optimizers(self):
        return torch.optim.SGD(self.layer.parameters(), lr=0.1)


if __name__ == "__main__":
    tmpdir = f"/tmp/{uuid.uuid4()}"
    print(tmpdir)
    trainer = Trainer(
        default_root_dir=tmpdir,
        max_epochs=2,
        callbacks=[ModelCheckpoint(dirpath=tmpdir, every_n_epochs=10, save_last='link', save_top_k=-1)],
        enable_checkpointing=True,
    )
    model = BoringModel()
    trainer.fit(model, train_dataloaders=DataLoader(RandomDataset(32, 64), batch_size=2))

Error messages and logs
ls -lh /tmp/0087f725-e4fa-42bc-a60d-51dfbcd57b41
9 Aug 22 13:53 last.ckpt -> last.ckpt
Environment
pytorch-lightning==2.5.2
torch==2.7.1
More info
Possibly related to the code introduced in #12391.
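To check an existing checkpoint directory for this symptom, a small standalone helper along the following lines may be useful. This is only a sketch using the Python standard library: the names `is_self_referencing_symlink` and `make_demo_dir` are hypothetical and not part of Lightning, and the demo assumes a platform where unprivileged symlink creation works (e.g. Linux or macOS).

```python
import os
import tempfile


def is_self_referencing_symlink(path: str) -> bool:
    """Return True if `path` is a symlink whose target is the link itself."""
    path = os.path.abspath(path)
    if not os.path.islink(path):
        return False
    target = os.readlink(path)
    # A relative target is interpreted relative to the link's own directory.
    if not os.path.isabs(target):
        target = os.path.join(os.path.dirname(path), target)
    return os.path.normpath(target) == os.path.normpath(path)


def make_demo_dir() -> str:
    """Create a temp dir containing a `last.ckpt -> last.ckpt` link (hypothetical demo)."""
    demo_dir = tempfile.mkdtemp()
    # Mimics the reported symptom: the link's target is its own file name.
    os.symlink("last.ckpt", os.path.join(demo_dir, "last.ckpt"))
    return demo_dir


if __name__ == "__main__":
    demo_dir = make_demo_dir()
    print(is_self_referencing_symlink(os.path.join(demo_dir, "last.ckpt")))
```

Pointing the helper at `os.path.join(tmpdir, "last.ckpt")` after the reproduction script finishes should indicate whether the link refers to itself rather than to a real checkpoint file.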
cc @lantiga