Pandas for SQL experts

Mohan Dorairaj
Mar 20, 2018

As a SQL expert, selecting, filtering, aggregating and even advanced OLAP operations come naturally. However it is hard to wrap our minds around the Pandas way of accomplishing the very same tasks.

This is learning by analogy exercise. I have picked 3 SQL examples and converted them into Pandas.

Toy Data:

Here is the toy data I used for this exercise:

and here is the code to create this toy data:

import pandas as pdMarks_data=[ [101,80,99,100,'A'],[102,87,76,79,'B'],[103,80,80,81,'B'],[104,65,60,70,'C']]Marks=pd.DataFrame(data=Marks_data, columns=['StudentID','Mark1','Mark2','Mark3','FinalGrade'])Student_data=[[101,1],[102,1],[103,2],[104,1]]Student=pd.DataFrame(data=Student_data, columns=['StudentID','Class'])

3 Examples:

We pick 3 SQL examples and replicate them in Pandas

(the dot operator in pandas works like pipe operator in unix. The output of previous operation (on the left) is passed to the next operation (on the right). Eg: Marks data frame is the input to “.assign” operation, which adds new columns. The resulting data frame with new columns is the input to “.loc” operation which filters rows & columns and so on..

Note: Pandas has multiple ways of achieving the same operations. I have picked the above methods to closely mimic SQL way for ease of understanding.

For a complete reference of Pandas Transformations for SQL Experts, please see here: https://github.com/mohanganeesh/Pandas4SQLexperts

--

--