Thursday, January 23, 2025

Python (pandas) groupby

 There are 2 ways to get a groupby result with normal (flat, single-level) column names:

  • Use the agg function, then flatten the multi-level columns it produces

grp_df = df.groupby(['BookLevel4','Symbol','Direction','OrderId'], as_index=False).agg({'TxTime':['min','max','count'], 'FillQuantity':['sum']})

grp_df.columns = [''.join(col).strip() for col in grp_df.columns.values]
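To see what the column flattening does, here is a small sketch on made-up data. The sample values and the reduced set of grouping keys (only Symbol and OrderId) are assumptions for illustration, not the real data set.

```python
import pandas as pd

# Hypothetical sample fills; values are invented for illustration.
df = pd.DataFrame({
    "Symbol": ["AAA", "AAA", "BBB"],
    "OrderId": [1, 1, 2],
    "TxTime": pd.to_datetime([
        "2025-01-23 09:00:00",
        "2025-01-23 09:05:00",
        "2025-01-23 09:10:00",
    ]),
    "FillQuantity": [100, 50, 200],
})

grp_df = df.groupby(["Symbol", "OrderId"], as_index=False).agg(
    {"TxTime": ["min", "max", "count"], "FillQuantity": ["sum"]}
)

# At this point the columns are tuples such as ('Symbol', '') and ('TxTime', 'min').
# Joining each tuple gives plain names: 'Symbol', 'TxTimemin', ...
grp_df.columns = ["".join(col).strip() for col in grp_df.columns.values]

print(list(grp_df.columns))
```

The key columns keep their original names because the second level of their tuple is an empty string.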


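  • Use the apply function
    • Needed when one result depends on more than 1 column (e.g. a VWAP from FillPrice and FillQuantity)
    • Won't have to normalize multi-level columns, because you name each output column yourself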


grp_df = df.groupby(['BookLevel4','Symbol','Direction','OrderId'], as_index=False).apply(
        lambda s: pd.Series({
            "TxTimemin": s["TxTime"].min(),
            "TxTimemax": s["TxTime"].max(),
            "TxTimecount": s["TxTime"].count(),
            "FillQuantitysum": s["FillQuantity"].sum(),
            "FillVWAP": (s['FillPrice'] * s['FillQuantity']).sum() / s['FillQuantity'].sum()
        })
    )
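A minimal runnable sketch of the apply approach on made-up data. The sample values are assumptions, and this variation selects the value columns before apply and calls reset_index() afterwards instead of using as_index=False, since how apply treats the grouping columns has changed between pandas versions.

```python
import pandas as pd

# Hypothetical sample fills; values are invented for illustration.
df = pd.DataFrame({
    "Symbol": ["AAA", "AAA", "BBB"],
    "OrderId": [1, 1, 2],
    "FillPrice": [10.0, 11.0, 20.0],
    "FillQuantity": [100, 50, 200],
})

def summarize(s: pd.DataFrame) -> pd.Series:
    # One output row per group; the dict keys become flat column names.
    qty = s["FillQuantity"].sum()
    return pd.Series({
        "FillQuantitysum": qty,
        # VWAP needs two columns at once, which is why apply is used here.
        "FillVWAP": (s["FillPrice"] * s["FillQuantity"]).sum() / qty,
    })

grp_df = (
    df.groupby(["Symbol", "OrderId"])[["FillPrice", "FillQuantity"]]
      .apply(summarize)
      .reset_index()  # move the group keys from the index back into columns
)

print(grp_df)
```

Note that apply with a Python function is usually slower than agg on large frames, so it is worth using only for the columns that really need more than one input.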
