Replies: 2 comments 3 replies
@CoCoNuTeK let's use discussions for such open-ended questions. You can check the implementation of the T5 model in transformers and modify it as needed.
The output from the model (the logits tensor) has shape [batch_size, pred_len, 4096], so to get the predicted token I can just commit to the highest logit and get shape [batch_size, pred_len]. But then I am still stuck with token values, plus the tokenizer I used here to create the tokens from float values: does it have a backwards operation? And I don't see how that would work, since each token encodes a range of values, not just one value.
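One common way to invert a quantizing tokenizer is to map each token id back to the center of its bin. The sketch below assumes a hypothetical uniform-bin tokenizer (values quantized into `n_bins` equal-width bins over `[low, high]`, roughly Chronos-style); the function name, the range, and the bin count are assumptions, not the actual tokenizer's API:

```python
import numpy as np

def detokenize(logits, low=-15.0, high=15.0, n_bins=4094):
    """Map logits to real values via argmax token ids and bin centers.

    Hypothetical sketch: assumes values were quantized into n_bins
    equal-width bins over [low, high]; adjust to the real tokenizer.
    """
    # Bin centers: midpoint of each quantization interval.
    edges = np.linspace(low, high, n_bins + 1)
    centers = (edges[:-1] + edges[1:]) / 2

    # Commit to the highest logit per position -> token ids [batch, pred_len].
    token_ids = logits.argmax(axis=-1)

    # Clip ids in case the vocabulary contains special tokens
    # outside the value-bin range (e.g. 4096 logits vs. 4094 bins).
    token_ids = np.clip(token_ids, 0, n_bins - 1)
    return centers[token_ids]  # [batch, pred_len]
```

The bin-center choice loses the within-bin resolution by construction, which is the "each token encodes a range" issue: the inverse can only be approximate, up to half a bin width.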
Hello there,
I would like to ask about the loss function. I want to create my own loss function, say MASE, for the model. Everything is in place except that the model outputs contain loss, logits, and other fields, but no predicted values directly.
So is there a way to use the tokenizer that created the input_ids, labels, and attention_mask to somehow turn the logits back into predicted values, i.e. the reverse operation?
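For a custom loss you generally cannot use argmax, because it is not differentiable. One workaround is to take the probability-weighted mean of the bin centers, which gives real-valued predictions that gradients can flow through. A minimal numpy sketch, where `centers`, `expected_values`, and `mase_loss` are hypothetical names and the in-sample naive-forecast scaling is one standard MASE formulation (in training you would write the same thing with torch ops):

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def expected_values(logits, centers):
    """Probability-weighted mean of bin centers: differentiable,
    unlike argmax, so a custom loss can backpropagate through it."""
    probs = softmax(logits, axis=-1)  # [batch, pred_len, n_bins]
    return probs @ centers            # [batch, pred_len]

def mase_loss(pred, target, insample):
    """Mean Absolute Scaled Error: forecast MAE scaled by the MAE of a
    naive one-step forecast on the in-sample history."""
    scale = np.mean(np.abs(insample[:, 1:] - insample[:, :-1]), axis=1)
    return np.mean(np.abs(pred - target) / scale[:, None])
```

This replaces the model's built-in cross-entropy loss with a value-space loss; whether that trains well for a tokenized model is a separate question, but it answers the mechanical "logits to predicted values" part.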