Pandas for SQL Lovers – SELECT * FROM table

pandaSQL5Earlier in the series, I covered how to load data from a csv file into a pandas DataFrame. In this post I’ll focus on the building block for all other DataFrame queries: SELECT * FROM DataFrame

Also in this series:

So you have your data loaded into a Pandas DataFrame and now you want to run a query against the DataFrame.

There are different methods we can use depending on what we want to do. Let’s start with loc and iloc

  • loc allows you to specify which columns you want by providing the column name
  • iloc allows you to specify which columns you want be providing the column position

Imagine you have a DataFrame called airports with the following values:

index airport_code city state capacity
0 SEA Seattle WA 45000000
1 BOS Boston MA 36000000
2 HOU Houston TX 55000000
3 YOW Ottawa ON 25000000
4 YYZ Toronto ON 65000000

The first query you need is the same one you learn when you first learn SQL

SELECT * FROM airports

All the commands below will return the same result

Just want all the rows and columns? If you are using a Jupyter notebook just type the DataFrame name

airports

To display the values as output from any Python code

print(airports)

loc, and iloc accept parameters to specify which rows and columns to return. A value of : indicates all rows, or all columns.

airports.loc[:,:]
airports.iloc[:,:]

If you do not specify a value for the columns parameter, loc and iloc return all columns.

airports.loc[:]
airports.iloc[:]

If you know how many rows are in the DataFrame, you could specify you want rows 0 through to the last row in the DataFrame.

airports.loc[0:4,:]
airports.loc[0:4]
airports.iloc[0:4,:]
airports.iloc[0:4]

If you don’t know the number of rows in the DataFrame or the number of rows might change, use len to return the number of rows and make your code more robust.

airports.loc[0:len(airports),:]
airports.loc[0:len(airports)]
airports.iloc[0:len(airports),:]
airports.iloc[0:len(airports)

 

Every command above returns the exact same output:

SELECTALLROWSALLCOLUMNS

As is so often the case when writing code, there are lots of different ways to accomplish the same task. Getting familiar with these different techniques will give you a good base to build on as me to more advanced queries.

Looking for similar topics and technical posts? See the full list

6 responses to this post.

  1. […] Select all rows and columns (SELECT * FROM table) […]

    Reply

  2. […] Select all rows and columns (SELECT * FROM table) […]

    Reply

  3. […] Select all rows and columns (SELECT * FROM table) […]

    Reply

  4. […] Select all rows and columns (SELECT * FROM table) […]

    Reply

  5. […] Select all rows and columns (SELECT * FROM table) […]

    Reply

  6. […] Select all rows and columns (SELECT * FROM table) […]

    Reply

Leave a Reply to Pandas for SQL Lovers INSERT / Populating a DataFrame | HockeyGeekGirl Cancel reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: