If you need to group the dataset by continent, sum the population, and count the countries (stored in the index), you don't need to group by the index. You need only one grouping (by continent) but two aggregations: sum and count. And if your index (countries) contains only unique values, counting countries is the same as counting the values of any column in the dataframe that has no missing entries.
You can do it in two steps, computing the sum and the count separately and then merging the results, or in one pass with the .agg() function.
Spoiler examples:
Example dataframe:
Output:
    In [1]: import pandas as pd

    In [2]: data = {"continent":["Europe", "Europe", "North America"], "pop":[12313, 2341, 43312]}

    In [3]: df = pd.DataFrame(data, index=["Germany", "France", "Canada"])

    In [4]: df
    Out[4]:
                 continent    pop
    Germany         Europe  12313
    France          Europe   2341
    Canada   North America  43312
Using .agg():
With .agg() you can pass a dictionary that defines which functions to apply to which columns. Only one column is used here:
Output:
    In [5]: df.groupby("continent").agg({"pop":{"country_count":"count", "pop_sum":"sum"}})
    Out[5]:
                            pop
                  country_count  pop_sum
    continent
    Europe                    2    14654
    North America             1    43312

The resulting dataframe has a column MultiIndex based on the dictionary that defines the aggregation.
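A note for newer pandas versions: the nested-dictionary form shown above was deprecated in pandas 0.20 and removed in 1.0, where it raises an error. A minimal sketch of the same aggregation using named aggregation (available since pandas 0.25) follows. It rebuilds the example dataframe so that it runs on its own, and it yields flat column names rather than a MultiIndex:

```python
import pandas as pd

# Same example dataframe as above.
data = {"continent": ["Europe", "Europe", "North America"],
        "pop": [12313, 2341, 43312]}
df = pd.DataFrame(data, index=["Germany", "France", "Canada"])

# Named aggregation: output_column=(input_column, function).
result = df.groupby("continent").agg(
    country_count=("pop", "count"),
    pop_sum=("pop", "sum"),
)
print(result)
```

Both aggregations still read the single "pop" column; only the way the output names are declared differs.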
"Simpler" approach with separate steps:
Output:
    In [6]: country_count = df.groupby("continent").count()

    In [7]: country_count
    Out[7]:
                   pop
    continent
    Europe           2
    North America    1

    In [8]: pop_sum = df.groupby("continent").sum()

    In [9]: pop_sum
    Out[9]:
                     pop
    continent
    Europe         14654
    North America  43312

    In [10]: country_count.columns=["country_count"]

    In [11]: country_count.join(pop_sum)
    Out[11]:
                   country_count    pop
    continent
    Europe                     2  14654
    North America              1  43312
You need to rename the column to avoid a name clash (or pass a suffix through the lsuffix / rsuffix parameters of .join()).
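The suffix alternative could look roughly like this. It is only a sketch: the suffix strings "_count" and "_sum" are arbitrary choices made for illustration, and the example dataframe is rebuilt so the snippet is self-contained:

```python
import pandas as pd

data = {"continent": ["Europe", "Europe", "North America"],
        "pop": [12313, 2341, 43312]}
df = pd.DataFrame(data, index=["Germany", "France", "Canada"])

grouped = df.groupby("continent")
country_count = grouped.count()  # single column named "pop"
pop_sum = grouped.sum()          # also a single column named "pop"

# Both frames share the column name "pop", so a plain join would fail;
# the suffixes disambiguate the overlapping names instead of a rename.
joined = country_count.join(pop_sum, lsuffix="_count", rsuffix="_sum")
print(joined)
```

The joined frame ends up with the columns pop_count and pop_sum, so no manual rename is needed.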
If your index is not unique, probably the simplest solution is to add the index as another column (country) to the dataframe and use nunique() instead of count() on the countries.
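A sketch of that non-unique case, assuming made-up data in which "Germany" appears twice (the extra row and its population value are invented for illustration), and using named aggregation from pandas 0.25 or later:

```python
import pandas as pd

# Hypothetical data: "Germany" occurs twice in the index.
data = {"continent": ["Europe", "Europe", "Europe", "North America"],
        "pop": [12313, 500, 2341, 43312]}
df = pd.DataFrame(data, index=["Germany", "Germany", "France", "Canada"])

# Turn the index into a regular column called "country".
df = df.rename_axis("country").reset_index()

# nunique() counts each distinct country once; sum() still adds every row.
result = df.groupby("continent").agg(
    country_count=("country", "nunique"),
    pop_sum=("pop", "sum"),
)
print(result)
```

With count() in place of nunique(), Europe would be reported as three countries because the duplicated row is counted twice.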
And although .agg() is not a very well-known function, 10 Minutes to pandas contains more than enough information to work out the separate summing and counting followed by a merge.