Using Jupyter
Based on documentation from Project Jupyter.
What is a Jupyter Notebook?
A Jupyter Notebook is an incredibly powerful tool for interactively developing and presenting Physics projects.
A Notebook integrates code and its output into a single document that combines visualizations, narrative text, mathematical equations, and other rich media. In other words, it's a single document where you can run code, display the output, and add explanations, formulas, and charts, making your work more transparent, understandable, repeatable, and shareable.
If your goal is to work with data, using a Notebook will speed up your workflow and make it easier to communicate and share your results.
Best of all, as part of the open source Project Jupyter, Jupyter Notebooks are completely free.
Your First Notebook
Running Jupyter
After you start Jupyter, a new tab should open in your web browser that looks something like the following.
This is the Notebook Dashboard, specifically designed for managing your Jupyter Notebooks. Think of it as the launchpad for exploring, editing and creating your notebooks.
Be aware that the dashboard will give you access only to the files and sub-folders contained within Jupyter’s start-up directory.
The dashboard’s interface is mostly self-explanatory. Browse to the folder in which you would like to create your first notebook, click the “New” drop-down button in the top-right, and select “Python 3” under “Notebook”:
The new notebook opens in its own browser tab. If you switch back to the dashboard, you will see the new file Untitled.ipynb, along with some green text that tells you your notebook is running.
What is an ipynb File?
Each *.ipynb file is a text file that describes the contents of your notebook in a format called JSON. Each cell and its contents, including image attachments that have been converted into strings of text, is listed therein along with some metadata.
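To make the JSON layout more concrete, here is a minimal sketch that builds a tiny notebook structure as a Python dictionary and serializes it with the standard json module. It assumes version 4 of the notebook format; the cell contents are hypothetical, and real files written by Jupyter usually carry more metadata than shown here.

```python
import json

# Hypothetical minimal notebook: one Markdown cell and one code cell.
# Field names assume version 4 of the notebook format.
notebook = {
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# My first notebook"],
        },
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "outputs": [
                {
                    "name": "stdout",
                    "output_type": "stream",
                    "text": ["Hello World!\n"],
                }
            ],
            "source": ["print('Hello World!')"],
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

# Serialize to the kind of text that is stored in an .ipynb file.
text = json.dumps(notebook, indent=1)
print(text)

# Reading the text back gives the same structure.
reloaded = json.loads(text)
cell_types = [cell["cell_type"] for cell in reloaded["cells"]]
print(cell_types)
```

Because the file is plain JSON, you can open any *.ipynb file in a text editor and see the same kind of structure.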
The Notebook Interface
Now that you have an open notebook in front of you, its interface will hopefully not look entirely foreign. Jupyter is essentially just an advanced word processor.
Check out the menus to get a feel for it, and take a few moments to scroll down the list of commands in the command palette, which opens from the small button with the keyboard icon. (A note of caution: some advanced keyboard shortcuts are not guaranteed to work, because operating system and browser key bindings can override them.)
There are two fairly prominent terms that you should notice, which are probably new to you: cells and kernels. Both are key to understanding Jupyter and what makes it more than just a word processor. Fortunately, these concepts are not difficult to understand.
- A cell is a container for text to be displayed in the notebook or code to be executed by the notebook’s kernel.
- A kernel is a “computational engine” that executes the code contained in a notebook document.
Cells
Cells form the body of a notebook. In the screenshot of a new notebook in the section above, that box with the green outline is an empty cell. There are two main cell types:
- A code cell contains code to be executed in the kernel. When the code is run, the notebook displays the output below the code cell that generated it.
- A Markdown cell contains text formatted using Markdown, a text formatting language, and displays its output in-place when the Markdown cell is run.
The first cell in a new notebook is a code cell.
Test it with a classic hello world example: type print('Hello World!') into the cell and click the run button in the toolbar above or press Ctrl + Enter. You may also use Shift + Enter, which runs the cell and then moves to the cell below, creating a new one if none exists.
The result should look like this:
When we run the cell, its output is displayed below and the label to its left will have changed from In [ ] to In [1].
The output of a code cell also forms part of the document. You can tell the difference between code and Markdown cells because code cells have that label on the left and Markdown cells do not.
The “In” part of the label is simply short for “Input,” while the label number indicates when the cell was executed on the kernel; in this case, the cell was executed first. If you run the cell again, the label will change to In [2] because the cell was now the second to be run on the kernel.
Jupyter signifies when the cell is currently running by changing its label to In [*].
In general, the output of a cell comes from any text explicitly printed during the cell's execution, as well as the value of the last line in the cell, whether it is a lone variable, a function call, or something else. For example:
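A minimal sketch of a cell that produces both kinds of output is shown below. The function say_hello and the name 'Tim' are hypothetical and chosen only for illustration. Note that the value of the last line is displayed automatically only inside a notebook cell; run as an ordinary script, only the print call produces output.

```python
def say_hello(recipient):
    """Return a greeting for the given recipient."""
    return 'Hello, {}!'.format(recipient)

# Explicitly printed text always appears in the cell's output.
print('This line is printed explicitly.')

greeting = say_hello('Tim')

# As the last line of a notebook cell, this bare expression is also
# displayed as the cell's output (here, the string 'Hello, Tim!').
greeting
```

If the last line were an assignment instead, such as greeting = say_hello('Tim'), the notebook would display nothing for it, because an assignment has no value to show.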
Keyboard Shortcuts
After a cell has been executed, its border turns blue, whereas it was green while you were editing. In a Jupyter Notebook, the cell is highlighted with a border whose color denotes its current mode:
- Green outline — cell is in "edit mode"
- Blue outline — cell is in "command mode"
What can we do to a cell when it is in command mode? So far, we have seen how to run a cell with Ctrl + Enter, but there are plenty of other commands we can use. The best way to use them is with keyboard shortcuts.
Keyboard shortcuts are a quick way to support a cell-based workflow. Many of these are actions you can carry out on the active cell when it’s in command mode.
Below, you’ll find a list of some of Jupyter’s keyboard shortcuts.
- Switch to command mode with Esc and to edit mode with Enter.
- In command mode:
  - Scroll up and down your cells with your Up↑ and Down↓ keys.
  - Press A or B to insert a new cell above or below the active cell.
  - M will transform the active cell to a Markdown cell.
  - Y will set the active cell to a code cell.
  - D + D (D twice) will delete the active cell.
  - Z will undo cell deletion.
  - Hold Shift and press Up or Down to select multiple cells at once. With multiple cells selected, Shift + M will merge your selection.
- Ctrl + Shift + -, in edit mode, will split the active cell at the cursor.
- Click and Shift + Click in the margin to the left of your cells to select them.
Markdown
Markdown is a lightweight markup language for formatting plain text. Its syntax has a one-to-one correspondence with HTML tags, so some prior knowledge here will be helpful but is not a prerequisite.
Let’s cover the basics with a quick example:
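As a minimal sketch of common Markdown syntax, the Python string below holds text that could be pasted into a Markdown cell. The wording and the link target are placeholders chosen for illustration, and the string is kept in Python only so that it can be printed and inspected.

```python
# Illustrative Markdown source; the wording and URL are placeholders.
markdown_source = """\
# This is a level 1 heading

## This is a level 2 heading

This is some plain text that forms a paragraph. Add emphasis with **bold** or *italic*.

* An unordered list item
* Another unordered list item

1. A numbered list item
2. Another numbered list item

[A link](https://www.example.com)
"""

# Print the raw source, exactly as it would be typed into a Markdown cell.
print(markdown_source)
```

Typed into a Markdown cell, the lines starting with # become headings, the starred and numbered lines become lists, and the bracketed text becomes a hyperlink.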
Here's how that Markdown would look once you run the cell to render it:
Kernels
Behind every Notebook runs a kernel. When you run a code cell, that code is executed within the kernel. Any output is returned to the cell to be displayed. The kernel’s state persists over time and between cells: it pertains to the document as a whole, not to individual cells.
For example, if you import libraries or declare variables in one cell, they will be available in another.
First, we’ll import a Python package and define a function:
import numpy as np
def square(x):
    return x * x
Once we’ve executed the cell above, we can reference np and square in any other cell.
x = np.random.randint(1, 10)
y = square(x)
print('%d squared is %d' % (x, y))
This will work regardless of the order of the cells in your notebook. As long as a cell has been run, any variables you declared or libraries you imported in it will be available in other cells.
print('Is %d squared %d?' % (x, y))
But what happens if we change the value of y?
y = 10
print('Is %d squared %d?' % (x, y))
We will get an output like:
Is 4 squared 10?
This is because once we've run the y = 10 code cell, y is no longer equal to the square of x in the kernel.
Most of the time when you create a notebook, the flow will be top-to-bottom. But it is common to go back and make changes. When you do need to change an earlier cell, the execution order shown in each cell's label, such as In [4], can help you diagnose problems by showing the order in which the cells were run.
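The effect of execution order can be illustrated with a toy model: one shared namespace stands in for the kernel's state, and a counter stands in for the In [n] labels. This is only a sketch under those assumptions, not how Jupyter is actually implemented; run_cell and the cell names are hypothetical.

```python
# Toy model of a kernel: a shared namespace plus an execution counter.
namespace = {}
execution_count = 0

def run_cell(source):
    """Execute a cell's source in the shared namespace and return its label number."""
    global execution_count
    execution_count += 1
    exec(source, namespace)
    return execution_count

cell_a = "x = 4"
cell_b = "y = x * x"
cell_c = "y = 10"

# Run the cells top-to-bottom: labels In [1], In [2], In [3].
labels = {
    "cell_a": run_cell(cell_a),
    "cell_b": run_cell(cell_b),
    "cell_c": run_cell(cell_c),
}
print(namespace["y"])  # 10, because cell_c ran last

# Go back and re-run the earlier cell: its label becomes In [4].
labels["cell_b"] = run_cell(cell_b)
print(namespace["y"])  # 16, because cell_b has now run most recently
print(labels)
```

The value of y depends on which cell ran most recently, not on where the cell sits in the notebook, and the label numbers record that order.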
If you ever wish to reset things, there are several useful options from the Kernel menu:
- Restart: restarts the kernel, clearing all variables and other definitions.
- Restart & Clear Output: same as above but will also wipe the output displayed below your code cells.
- Restart & Run All: same as above but will also run all your cells in order from first to last.
If your kernel is ever stuck on a computation and you wish to stop it, you can choose the "Interrupt" option.