- 1. This is a Jupyter Notebook!
- 2. Put any code in code cells
- 3. Jupyter Notebooks ♡ SQL (part i)
- 4. Jupyter Notebooks ♡ SQL (part ii)
- 5. Jupyter Notebooks ♡ SQL (part iii)
- 6. Jupyter Notebooks ♡ SQLAlchemy
- 7. Jupyter Notebooks ♡ plots
- 8. Goodbye for now!
1. This is a Jupyter Notebook!
A Jupyter Notebook is a document that contains text cells (what you're reading right now) and code cells. What is special with a notebook is that it's interactive: You can change or add code cells, and then run a cell by first selecting it and then clicking the run cell button above ( ▶| Run ) or hitting Ctrl + Enter
.
The result will be displayed directly in the notebook. You could use a notebook as a simple calculator. For example, it's estimated that on average 256 children were born every minute in 2016. The code cell below calculates how many children were born on average on a day.
256 * 60 * 24 # Children × minutes × hours
2. Put any code in code cells
But a code cell can contain much more than a simple one-liner! This is a notebook running Python and you can put any Python code in a code cell (but notebooks can run other languages too, like R). Below is a code cell where we define a whole new function (greet
). To show the output of greet
we run it last in the code cell as the last value is always printed out.
def greet(first_name, last_name):
greeting = 'My name is ' + last_name + ', ' + first_name + ' ' + last_name + '!'
return greeting
# Replace with your first and last name.
# That is, unless your name is already Jane Bond.
greet('Anurag', 'Peddi')
3. Jupyter Notebooks ♡ SQL (part i)
We've seen that notebooks can display basic objects such as numbers and strings. But notebooks also support and display the outputs of SQL commands! Using an open source Jupyter extension called ipython-sql, we can connect to a database and issue SQL commands within our notebook. For example, we can connect to a PostgreSQL database that has a table that contains country data, then inspect the first three rows of the table by putting %%sql
ahead of the SQL commands (more on the meaning of %%
later).
%%sql postgresql:///countries
SELECT * FROM countries LIMIT 3;
%sql SELECT * FROM countries where name = 'Belgium';
result = %sql SELECT * FROM countries;
# To pandas DataFrame
df = result.DataFrame()
df.info()
6. Jupyter Notebooks ♡ SQLAlchemy
If SQLAlchemy is your thing, you can do that in this notebook, too! Jupyter Notebooks love everything, apparently...
What's SQLAlchemy, you ask? SQLAlchemy is the Python SQL toolkit and Object Relational Mapper that gives application developers the full power and flexibility of SQL. Next, we'll run the last query we just ran except after connecting to and querying the database using SQLAlchemy.
from sqlalchemy import create_engine
engine = create_engine("postgresql:///countries");
# Query database
result = engine.execute("SELECT * FROM countries;")
# Display column names
result.keys()
7. Jupyter Notebooks ♡ plots
Tables are nice but — as the saying goes — "a plot can show a thousand data points." Notebooks handle plots as well, but it requires some more magic. Here magic does not refer to any arcane rituals but to so-called "magic commands" that affect how the Jupyter Notebook works. Magic commands start with either %
or %%
(just like we saw with %sql
and %%sql
) and the command we need to nicely display plots inline is %matplotlib inline
. With this magic in place, all plots created in code cells will automatically be displayed inline.
Using the previously created pandas DataFrame that we named df
, let's plot the number of countries in each continent as a bar chart using the plot()
method of pandas DataFrames.
Now, for the difference between %%sql
and %sql
: ordinary assignment works for single-line %sql
queries while %%sql
is for multi-line queries. See the Assignment ipython-sql documentation section for more info.
%matplotlib inline
# Plotting number of countries in each continent
df.continent.value_counts().plot(kind='bar')
8. Goodbye for now!
Tables and plots are the most common outputs when doing data analysis, but Jupyter Notebooks can render many more types of outputs such as sound, animation, video, etc. Yes, almost anything that can be shown in a modern web browser. This also makes it possible to include interactive widgets directly in the notebook! Everything in this collection of Jupyter Widgets can be displayed in this notebook.
But that's enough for now! This was just a short introduction to Jupyter Notebooks, an open source technology that is increasingly used for data science and analysis. We hope you enjoyed it! :)
I_am_ready = True
# P.S. Feel free to try out any other stuff in this notebook.
# It's all yours!