|
22 | 22 | }, |
23 | 23 | "outputs": [], |
24 | 24 | "source": [ |
25 | | - "from catboost import CatBoost, Pool, MetricVisualizer\n", |
| 25 | + "from catboost import CatBoostRanker, Pool, MetricVisualizer\n", |
26 | 26 | "from copy import deepcopy\n", |
27 | 27 | "import numpy as np\n", |
28 | 28 | "import os\n", |
|
365 | 365 | " if additional_params is not None:\n", |
366 | 366 | " parameters.update(additional_params)\n", |
367 | 367 | " \n", |
368 | | - " model = CatBoost(parameters)\n", |
| 368 | + " model = CatBoostRanker(**parameters)\n", |
369 | 369 | " model.fit(train_pool, eval_set=test_pool, plot=True)\n", |
370 | 370 | " \n", |
371 | 371 | " return model" |
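As a minimal sketch of the pattern used in the cell above (a parameter dict unpacked into `CatBoostRanker`, then `fit` with an evaluation set), here is a self-contained example on synthetic data; the data, parameter values, and variable names are assumptions for illustration and are not taken from the notebook.

```python
# Minimal sketch on synthetic data (assumed values, not from the notebook):
# build a parameter dict, unpack it into CatBoostRanker, and fit with eval_set.
import numpy as np
from catboost import CatBoostRanker, Pool

rng = np.random.default_rng(0)

def make_pool(n_docs, n_queries, n_features=6):
    X = rng.normal(size=(n_docs, n_features))
    y = rng.random(n_docs)
    group_id = np.sort(rng.integers(0, n_queries, size=n_docs))  # documents grouped by query
    return Pool(data=X, label=y, group_id=group_id)

train_pool = make_pool(300, 30)
test_pool = make_pool(100, 10)

parameters = {
    'loss_function': 'YetiRank',  # any ranking objective can be passed through the same dict
    'iterations': 100,
    'verbose': False,
    'random_seed': 0,
}

model = CatBoostRanker(**parameters)
model.fit(train_pool, eval_set=test_pool)
print(model.best_score_)
```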
|
395 | 395 | "source": [ |
396 | 396 | "### Group weights parameter\n", |
397 | 397 | "Suppose we know that some queries are more important to us than others.<br/>\n", |
398 | | - "The word \"importance\" used here in terms of accuracy or quality of CatBoost prediction for given queries.<br/>\n", |
| 398 | + "The word \"importance\" is used here in terms of the accuracy or quality of CatBoostRanker predictions for the given queries.<br/>\n", |
399 | 399 | "You can pass this additional information to the learner using the ``group_weights`` parameter.<br/>\n", |
400 | | - "Under the hood, CatBoost uses this weights in loss function simply multiplying it on a group summand.<br/>\n", |
| 400 | + "Under the hood, CatBoostRanker uses these weights in the loss function by simply multiplying each group summand by its weight.<br/>\n", |
401 | 401 | "So the bigger the weight $\rightarrow$ the more attention the query gets.<br/>\n", |
402 | 402 | "Let's show an example of the training procedure with random query weights." |
403 | 403 | ] |
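A short sketch of how per-query weights can be attached in practice, using synthetic data: the keyword on the `Pool` constructor is `group_weight`, and every document belonging to the same query must carry the same weight value. All names and values below are illustrative assumptions.

```python
# Sketch (synthetic data, assumed values): attach a per-query weight to each
# document through the Pool, then train a ranker as in the cells above.
import numpy as np
from catboost import CatBoostRanker, Pool

rng = np.random.default_rng(42)
n_docs, n_queries = 200, 20
X = rng.normal(size=(n_docs, 6))
y = rng.random(n_docs)
group_id = np.sort(rng.integers(0, n_queries, size=n_docs))

# One random weight per query, broadcast to every document of that query.
query_weight = rng.uniform(0.5, 2.0, size=n_queries)
group_weight = query_weight[group_id]

weighted_pool = Pool(data=X, label=y, group_id=group_id, group_weight=group_weight)

model = CatBoostRanker(loss_function='QueryRMSE', iterations=50, verbose=False)
model.fit(weighted_pool)
```

With a query-wise loss such as `QueryRMSE`, each query's summand is multiplied by its weight, which is the behaviour described above.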
|
450 | 450 | "### A special case: top-1 prediction\n", |
451 | 451 | "\n", |
452 | 452 | "Someday you may face a problem $-$ you will need to predict the single most relevant object for a given query.<br/>\n", |
453 | | - "For this purpose CatBoost has a mode called __QuerySoftMax__.\n", |
| 453 | + "For this purpose CatBoostRanker has a mode called __QuerySoftMax__.\n", |
454 | 454 | "\n", |
455 | 455 | "Suppose our dataset contains a binary target: 1 $-$ means the best document for a query, 0 $-$ all the others.<br/>\n", |
456 | 456 | "We will maximize the probability of being the best document for a given query.<br/>\n", |
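To make the setup concrete, here is a small sketch on synthetic data with exactly one positive document per query, trained with the `QuerySoftMax` objective; the data and parameter values are assumptions for illustration.

```python
# Sketch (synthetic data, assumed values): a binary target marking the single
# best document per query, trained with the QuerySoftMax objective.
import numpy as np
from catboost import CatBoostRanker, Pool

rng = np.random.default_rng(7)
n_queries, docs_per_query = 30, 8
X = rng.normal(size=(n_queries * docs_per_query, 5))
group_id = np.repeat(np.arange(n_queries), docs_per_query)

# Exactly one "best" document (label 1) per query, 0 for the rest.
y = np.zeros(n_queries * docs_per_query)
best = np.arange(n_queries) * docs_per_query + rng.integers(0, docs_per_query, size=n_queries)
y[best] = 1.0

pool = Pool(data=X, label=y, group_id=group_id)
model = CatBoostRanker(loss_function='QuerySoftMax', iterations=50, verbose=False)
model.fit(pool)

# Pick the predicted top-1 document for the first query.
scores = model.predict(X[:docs_per_query])
print('predicted best doc index in query 0:', int(np.argmax(scores)))
```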
|
572 | 572 | "\n", |
573 | 573 | "$$ - \sum_{(i,j) \in \text{Pairs}} \log \left( \frac{1}{1 + \exp\left( -(f(d_i) - f(d_j)) \right)} \right) $$\n", |
574 | 574 | "\n", |
575 | | - "Methods based on pair comparisons called __pairwise__ in CatBoost this objective called __PairLogit__.\n", |
| 575 | + "Methods based on pair comparisons are called __pairwise__; in CatBoostRanker this objective is called __PairLogit__.\n", |
576 | 576 | "\n", |
577 | 577 | "There's no need to change the dataset $-$ CatBoost generates the pairs for us. The number of generated pairs is managed via the max_size parameter." |
578 | 578 | ] |
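A minimal sketch of training with the `PairLogit` objective on synthetic graded labels: the pairs are generated internally from the labels within each query group, so only `group_id` is supplied. The data and values are illustrative assumptions.

```python
# Sketch (synthetic data, assumed values): PairLogit generates the pairs from
# graded labels inside each query group; no explicit pairs are passed.
import numpy as np
from catboost import CatBoostRanker, Pool

rng = np.random.default_rng(3)
n_docs = 160
X = rng.normal(size=(n_docs, 4))
y = rng.integers(0, 5, size=n_docs).astype(float)   # graded relevance labels
group_id = np.sort(rng.integers(0, 20, size=n_docs))

pool = Pool(data=X, label=y, group_id=group_id)
model = CatBoostRanker(loss_function='PairLogit', iterations=50, verbose=False)
model.fit(pool)
```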
|
615 | 615 | " for doc_id, line in enumerate(f):\n", |
616 | 616 | " line = line.split(',')[:2]\n", |
617 | 617 | " \n", |
618 | | - " label, query_id = tuple(map(float, line))\n", |
| 618 | + " label, query_id = float(line[0]), int(line[1])\n", |
619 | 619 | " if query_id not in groups:\n", |
620 | 620 | " groups[query_id] = []\n", |
621 | 621 | " groups[query_id].append((doc_id, label))\n", |
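For reference, here is the same grouping logic as in the cell above applied to a tiny in-memory example, so the resulting `groups` structure is easy to see; the two leading columns are assumed to be the label and the query id, matching the parsing above, and the sample lines are invented for illustration.

```python
# Sketch: group (doc_id, label) pairs by query id, as in the cell above,
# but on a small in-memory list instead of a file.
lines = [
    '1.0,101,feature_a,feature_b',
    '0.0,101,feature_a,feature_b',
    '2.0,102,feature_a,feature_b',
]

groups = {}
for doc_id, line in enumerate(lines):
    label_str, query_str = line.split(',')[:2]
    label, query_id = float(label_str), int(query_str)
    groups.setdefault(query_id, []).append((doc_id, label))

print(groups)   # {101: [(0, 1.0), (1, 0.0)], 102: [(2, 2.0)]}
```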
|