I have flat csv data loaded into Data frame and trying to build nested json. I was able to build a nested json with orders as list but not able to add few columns like (geo, attributes) as dict within dict (json) and need some pointers on it.
Sample Data -
clientid requestid geo currency date orderid amount quantity attribute1 attribute2
X1 req1 AMER USD 16427166560 1111 600 1 Product A SKU-a
X1 req1 AMER USD 16427166500 1112 1200 1 Product B SKU-b
X1 req1 AMER CAD 16427167500 1113 1500 1 Product A SKU-a
X2 req2 EMEA EUR 16427163500 1114 1000 1 Product A SKU-a
Json O/P expected format -
Python Data Frame
Sample Data -
clientid requestid geo currency date orderid amount quantity attribute1 attribute2
X1 req1 AMER USD 16427166560 1111 600 1 Product A SKU-a
X1 req1 AMER USD 16427166500 1112 1200 1 Product B SKU-b
X1 req1 AMER CAD 16427167500 1113 1500 1 Product A SKU-a
X2 req2 EMEA EUR 16427163500 1114 1000 1 Product A SKU-a
Json O/P expected format -
Output:{ "request": { "clientid": "X1", "requestid": "req1", "businessunit": { "geo": "AMER" } "currency": "USD", "orders": [{ "date": 16427166560, "order_id": "1111", "amount": "600", "attributes": { "attribute1": "Product A", "attribute2": "SKU-a" } }, { "date": 16427166500, "order_id": "1112", "amount": "1200", "attributes": { "attribute1": "Product B", "attribute2": "SKU-b" } } ] } }Steps to reproduce Python Data Frame
df = pd.DataFrame([['X1', 'req1', 'AMER', 'USD', 16427166560, '1111', 600, '1', 'Product A', 'SKU-a'], ['X1', 'req1', 'AMER', 'USD', 16427166500, '1112', 1200, '1', 'Product B', 'SKU-b'], ['X1', 'req1', 'AMER', 'CAD', 16427167500, '1113', 1500, '1', 'Product A', 'SKU-a'], ['X2', 'req2', 'EMEA', 'EUR', 16427163500, '1114', 1000, '1', 'Product A', 'SKU-a'] ], columns = ['clientid', 'requestid', 'geo', 'currency', 'date', 'order_id', 'amount', 'quantity', 'attribute1', 'attribute2'])Initial draft which builds nested json -
j = (df.groupby(['clientid', 'requestid', 'geo', 'currency']) .apply(lambda x: x[['date', 'order_id', 'amount']].to_dict('records')) .reset_index() .rename(columns={0: 'orders'}) .to_json(orient='records'))Good to have - If we can get separate json file as O/P for each geo/currency in single shot I would appreciate pointers on it. 