May-30-2024, 08:06 PM
I have data that I have sorted. Below is a sample of the data:
Output:
missing_values  count
0               51
3               1
12              12
13              1
15              1
16              1
21              1
35              2
36              3
40              1

I have the following code:

# Get the count of each missing value
missing_value_count = missing_values.iloc[:, 0:1].value_counts().to_frame()
missing_value_count.sort_index(inplace=True)
missing_value_count.to_csv('question.csv')
missing_value_count.agg(lambda s: pd.Series([*s.nlargest().index, *s.nsmallest().index], ['missing_values']), axis='columns')

When I run the code I get the following error:

Output:
missing_value_count.agg(lambda s: pd.Series([*s.nlargest().index, *s.nsmallest().index], ['missing_values']), axis='columns')
Traceback (most recent call last):

  Cell In[29], line 1
    missing_value_count.agg(lambda s: pd.Series([*s.nlargest().index, *s.nsmallest().index],

  File D:\Users\Mahmoud\anaconda3\Lib\site-packages\pandas\core\frame.py:9196 in aggregate
    result = op.agg()

  File D:\Users\Mahmoud\anaconda3\Lib\site-packages\pandas\core\apply.py:699 in agg
    result = self.obj.apply(self.orig_f, axis, args=self.args, **self.kwargs)

  File D:\Users\Mahmoud\anaconda3\Lib\site-packages\pandas\core\frame.py:9423 in apply
    return op.apply().__finalize__(self, method="apply")

  File D:\Users\Mahmoud\anaconda3\Lib\site-packages\pandas\core\apply.py:678 in apply
    return self.apply_standard()

  File D:\Users\Mahmoud\anaconda3\Lib\site-packages\pandas\core\apply.py:798 in apply_standard
    results, res_index = self.apply_series_generator()

  File D:\Users\Mahmoud\anaconda3\Lib\site-packages\pandas\core\apply.py:814 in apply_series_generator
    results[i] = self.f(v)

  Cell In[29], line 1 in <lambda>
    missing_value_count.agg(lambda s: pd.Series([*s.nlargest().index, *s.nsmallest().index],

  File D:\Users\Mahmoud\anaconda3\Lib\site-packages\pandas\core\series.py:500 in __init__
    com.require_length_match(data, index)

  File D:\Users\Mahmoud\anaconda3\Lib\site-packages\pandas\core\common.py:576 in require_length_match
    raise ValueError(

ValueError: Length of values (2) does not match length of index (1)

I want to return the lowest value in missing_values that has the highest value in count. For the data above, the result should be:

Output:
missing_values  count
0               51

How can I modify this part of the code to get the result I want?

missing_value_count.agg(lambda s: pd.Series([*s.nlargest().index, *s.nsmallest().index], ['missing_values']), axis='columns')
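
For reference, the ValueError arises because the lambda builds a Series from two values (one label from nlargest and one from nsmallest) while supplying only one index label. A minimal sketch of one possible way to get the desired row is shown below. It does not use agg at all: it sorts by count descending and by missing_values ascending, then keeps the first row. The frame name `counts` and its two-column layout are assumptions made so the example is self-contained; the numbers are copied from the sample above.

```python
import pandas as pd

# Hypothetical stand-in for the counts in the post, with missing_values
# as an ordinary column instead of the index (values from the sample above).
counts = pd.DataFrame({
    "missing_values": [0, 3, 12, 13, 15, 16, 21, 35, 36, 40],
    "count": [51, 1, 12, 1, 1, 1, 1, 2, 3, 1],
})

# Highest count first; ties on count are broken by the lowest missing_values.
result = counts.sort_values(
    ["count", "missing_values"], ascending=[False, True]
).head(1)

print(result)
```

With the frame from the post, calling `missing_value_count.reset_index()` first should give the same two-column layout, assuming the index level is named missing_values and the counts column is named count, as the sample output suggests.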
