Excelython — Part 7: Pivot Tables in Python

Yiannis Servos
5 min readMay 26, 2021

When you are working with datasets that have a lot of information in detail level, sometimes you need to see or to show the bigger picture. This is important for you or your “clients” in order to be easy for everyone to extract data that is easy to be understood and digested.

Project Step 4

In our project we have in File_3 the column with name = “Store Code Need”, that holds the information of the replenishment need, per Store Code and SKU.

The requested information is the aggregated need per SKU and the Store Code is indifferent.

Excel Step 1

In File_3, we navigate to Insert Tab, above the ribbon and we click Pivot Table.

Pivot Table dialog box appears, with selected array and predefined options of where to place the Pivot Table that we will create.

Confirm that all of your columns first row have names and are not empty, confirm that the suggested range has all of your data and choose whether you want to add the Pivot Table in a new worksheet or in the existing one, by pointing the starting cell position.

Press OK.

Excel Step 2

After pressing the OK button you will see the Pivot Table design bar on the right of the screen.

Now we have to design our Pivot Table, in a way that we will be able to get the summarized Store Code Need per SKU.

We will pick the “SKU” field and we will drop it in the ROWS box area.

Then we will take in the same way, the “Current Inventory”, “Min Stock” and “Store Code Need” and place them in VALUES box area.

After this, the box area will look like this

And the Pivot Table that we just created it will look like this.

What do we have now?

  1. Summarized Current Inventory in Stores.
  2. Summarized Min Stock that we have defined in our systems.
  3. Summarized Store Code Need.

All the above in SKU level.

Excel Step 3

The final Step now is to use again VLOOKUP in order to get these three new columns in File_1 which we use it as main panel from the begining of our Project.

We will go back to File_1 and we will create three new columns:

  1. name = “Stores Current Inventory”
  2. name = “Stores Need”

In each of them we will write a VLOOKUP formula in order to get the relevant information from File_3.

“Stores Current Inventory”=VLOOKUP(F2;Pivot_Stores!A:D;2;0)

“Stores Need”=VLOOKUP(F2;Pivot_Stores!A:D;4;0)

The File_1 up after the final step will look like this.

Python Step 1

In Python, we will create a new dataframe and we will call it “File_3_pivot”.

If you want to read more things about pivot_tables in Python and how you can use different parameters, you can visit the below link.

link: Pandas Pivot Tables explained

  1. The structure of the code is similar to excel. You define the name of the new dataframe = File_3_pivot
  2. pd.pivot_table( = the function of pandas that will help us create the Pivot Table.
  3. File_3 = The data source that we will summarize.
  4. index = As ROWS in excel. Place the columns that you need to summarize the data.
  5. values = As VALUES in excel. Place the columns that you will get summarized.
  6. aggfunc = Define what type of calculation you need for the values columns(sum, min, max, mean,count)
  7. fill_value = How you want to fill in the empty cells.

So the code will look like this:

File_3_pivot = pd.pivot_table(File_3,index=["SKU"],
values = ["Current Inventory","Store Code Need"],
aggfunc="sum",
fill_value=0)

If you open the dataframe we just created it will look like this.

You can notice that the “SKU” column is as index. I prefer to reset index in the dataframes i create with pd.pivot_table, in order to be more easy in other stages to merge them with other dataframes.

You can reset the index by just writing.

File_3_pivot = File_3_pivot.reset_index()

Now the File_3_pivot looks like this.

Python Step 2

What we have to do now is to get the “Current Inventory” and “Store Code Need” columns in File_1 dataframe.

As we learned the previous part, we will do it with pd.merge.

The code will be the following.

File_1 = pd.merge(File_1, File_3_pivot, how="left", on = "SKU")

File_1 will look now like this.

There are other options also, which could help us do the above steps, but are not something to learn it now, so I am not writing them in order to stay focused.

The next part we will lower a little bit the pace, as we need to do some extra preprocessing in order to format the File_1 dataframe as we need for the project, plus we need to digest everything we learned so far.

Regards

--

--

Yiannis Servos

Experienced in Retail Category Management and enthusiast about technology and programming.