Working with pivot tables
Pivot tables in pandas
Notes from a DataCamp lesson. I have found that taking notes in a markdown file and turning them into a post while working through the lesson really helps with learning and recall.
Subsetting and calculations with pivot tables
how to call pivot tables:
df.pivot_table("column name containing numerical values to aggregate", index="columns to group by and display in rows", columns="columns to display as columns")
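A small sketch of that call on made-up data, just to see the shape of the result. The DataFrame and its column names ("city", "year", "temp_c") are invented for illustration and are not from the lesson. When several rows land in the same cell, pivot_table aggregates them with the mean by default.

```python
import pandas as pd

# Hypothetical data, invented for illustration only.
df = pd.DataFrame({
    "city": ["Lima", "Lima", "Lima", "Oslo", "Oslo"],
    "year": [2020, 2020, 2021, 2020, 2021],
    "temp_c": [18.0, 20.0, 21.0, 5.0, 7.0],
})

# values to aggregate -> "temp_c", rows -> "city", columns -> "year".
# The two Lima/2020 rows are combined with the default aggregation (mean).
temp_by_city_year = df.pivot_table("temp_c", index="city", columns="year")

print(temp_by_city_year)
```

The result is an ordinary DataFrame with one row per city and one column per year.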
Pivot tables are just DataFrames with sorted indexes,
so you can use loc and iloc on a saved pivot table.
e.g.
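# "value", "group_col" and "spread_col" are placeholder column names
df_sorted = df.pivot_table("value", index="group_col", columns="spread_col")
df_sorted.loc["row", "column"]
# slicing is particularly good for subsetting pivot tables; label slices include both endpoints
df_sorted.loc["first row":"last row"]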
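Here is a rough, runnable version of the subsetting idea. The data and column names are again invented for illustration. The point to notice is that loc slices by label and includes both endpoints, while iloc slices by position and excludes the end.

```python
import pandas as pd

# Hypothetical data, invented for illustration only.
df = pd.DataFrame({
    "city": ["Cairo", "Lima", "Oslo", "Perth"] * 2,
    "year": [2020] * 4 + [2021] * 4,
    "temp_c": [22.0, 19.0, 5.0, 18.0, 23.0, 20.0, 6.0, 19.0],
})
pivot = df.pivot_table("temp_c", index="city", columns="year")

# A single cell: row label, then column label.
single = pivot.loc["Lima", 2021]

# Label slice over rows: both "Lima" and "Oslo" are included.
rows = pivot.loc["Lima":"Oslo"]

# Slice rows and columns at the same time.
block = pivot.loc["Cairo":"Lima", 2020:2021]

# Position-based slice: the end position is excluded.
first_two = pivot.iloc[0:2]

print(rows)
```

Label slicing works reliably here because the pivot table's index is sorted.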
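methods for summary statistics…
e.g. .mean()
take an axis argument: df_sorted.mean(axis="index"). The default value is "index", which calculates across the rows and so returns one value per column (if axis isn't specified it defaults to "index"). Passing axis="columns" calculates across the columns instead and returns one value per row.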
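To check which way round the axis argument works, a small sketch on made-up data (the column names and numbers are invented for illustration):

```python
import pandas as pd

# Hypothetical data, invented for illustration only.
df = pd.DataFrame({
    "city": ["Lima", "Lima", "Oslo", "Oslo"],
    "year": [2020, 2021, 2020, 2021],
    "temp_c": [19.0, 21.0, 5.0, 7.0],
})
pivot = df.pivot_table("temp_c", index="city", columns="year")

# Default axis="index": average down the rows, one value per year (column).
mean_per_year = pivot.mean()

# axis="columns": average across the columns, one value per city (row).
mean_per_city = pivot.mean(axis="columns")

print(mean_per_year)
print(mean_per_city)
```

So the axis named is the one that gets collapsed, not the one that is kept.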
Time to play with pivot tables apparently,
Eddie(WTR)