What is Jupyter Notebook? Data analysis made easier

At some point, we all need to show our work. Most programming work is shared either as raw source code or as a compiled executable. The source code provides complete information, but in a way that’s more “tell” than “show.” The executable shows us what the software does, but even when shipped with the source code it can be difficult to grasp exactly how it works.

Imagine being able to view the code and execute it in the same UI, so that you could change the code and see the results of those changes instantly. That’s just what Jupyter Notebook offers.

Jupyter Notebook was created to make it easier to show one’s programming work, and to let others join in. Jupyter Notebook allows you to combine code, comments, multimedia, and visualizations in an interactive document — called a notebook, naturally — that can be shared, re-used, and re-worked. 

And because Jupyter Notebook runs via a web browser, the notebook itself could be hosted on your local machine or on a remote server. 

A Jupyter Notebook with text, code, and a visualization. Altering the code and re-running it live in the browser will change the output, and the new results can be saved to a new notebook.

Jupyter Notebook benefits

Originally developed for data science applications written in Python, R, and Julia, Jupyter Notebook is useful in all kinds of ways for all kinds of projects:

  • Data visualizations. Most people have their first exposure to Jupyter Notebook by way of a data visualization, a shared notebook that includes a rendering of some data set as a graphic. Jupyter Notebook lets you author visualizations, but also share them and allow interactive changes to the shared code and data set.
  • Code sharing. Cloud services like GitHub and Pastebin provide ways to share code, but they’re largely non-interactive. With a Jupyter Notebook, you can view code, execute it, and display the results directly in your web browser.
  • Live interactions with code. Jupyter Notebook code isn’t static; it can be edited and re-run incrementally in real time, with feedback provided directly in the browser. Notebooks can also embed user controls (e.g., sliders or text input fields) that can be used as input sources for code.
  • Documenting code samples. If you have a piece of code and you want to explain line-by-line how it works, with live feedback all along the way, you could embed it in a Jupyter Notebook. Best of all, the code will remain fully functional—you can add interactivity along with the explanation, showing and telling at the same time.
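
The user controls mentioned above can be sketched in a few lines. This is a minimal illustration, not part of the original article: it assumes the separately installed ipywidgets package, and the function names wave_samples and show_with_slider are hypothetical. The slider is only created when show_with_slider() is called from a notebook cell; the rest of the block is plain Python.

```python
import math


def wave_samples(frequency, points=8):
    """Return `points` samples of sin(frequency * x) for x in [0, 2*pi)."""
    step = 2 * math.pi / points
    return [round(math.sin(frequency * i * step), 3) for i in range(points)]


def show_with_slider():
    """Call this inside a notebook cell to drive wave_samples from a slider."""
    # Imported lazily so the rest of this block runs without ipywidgets.
    from ipywidgets import fixed, interact

    # A (min, max) tuple of ints asks interact() for an integer slider;
    # fixed() keeps `points` constant instead of creating a second control.
    return interact(wave_samples, frequency=(1, 10), points=fixed(8))


# Outside a notebook, the function can still be called directly.
print(wave_samples(1, points=4))  # [0.0, 1.0, 0.0, -1.0]
```

In a notebook, running show_with_slider() in a cell should display a slider, and moving it re-runs wave_samples with the new frequency and refreshes the output beneath the cell.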

Jupyter Notebook components

Jupyter Notebooks can include several kinds of ingredients, each organized into discrete blocks:

  • Text and HTML. Plain text, or text annotated in the Markdown syntax to generate HTML, can be inserted into the document at any point. CSS styling can also be included inline or added to the template used to generate the notebook.
  • Code and output. The code in Jupyter notebooks is typically Python code, although you can add support for other languages, such as R or Julia, by installing additional kernels in your Jupyter environment. The results of executed code appear immediately after the code blocks, and the code blocks can be executed and re-executed in any order you like, as often as you like.
  • Visualizations. Graphics and charts can be generated from code, by way of modules like Matplotlib, Plotly, or Bokeh. Like output, these visualizations appear inline next to the code that generates them. However, code can also be configured to write them out to external files if needed.
  • Multimedia. Because Jupyter Notebook is built on web technology, it can display all the types of multimedia supported in a web page. You can include them in a notebook as HTML elements, or you can generate them programmatically by way of the IPython.display module.
  • Data. Data can be provided in a separate file alongside the .ipynb file that constitutes a Jupyter notebook, or it can be imported programmatically, for instance by including code in the notebook to download the data from a public Internet repository or to access it via a database connection.
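
To make the "discrete blocks" idea concrete, the sketch below builds a tiny notebook by hand. It relies on the fact that an .ipynb file is a JSON document whose cells are tagged as Markdown or code; the cell contents and the helper name cell_kinds are made up for illustration, and real notebooks written by Jupyter carry more metadata than shown here.

```python
import json

# A hand-written notebook following the version 4 JSON layout.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            # Text block: Markdown source that the browser renders as HTML.
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# Demo\n", "Some *Markdown* text."],
        },
        {
            # Code block: the outputs list stays empty until the cell is run.
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": ["total = sum(range(10))\n", "print(total)"],
        },
    ],
}


def cell_kinds(nb):
    """List the type of each cell, in document order."""
    return [cell["cell_type"] for cell in nb["cells"]]


# Serializing and parsing again mimics saving and reopening the file.
text = json.dumps(notebook, indent=1)
print(cell_kinds(json.loads(text)))  # ['markdown', 'code']
```

Writing the text variable to a file with an .ipynb extension should give a file that Jupyter can open, with one text cell followed by one unexecuted code cell.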

Jupyter Notebook use cases

The most common use cases for Jupyter Notebook are data science, mathematics, and other research projects that involve visualizations of data or formulas. Apart from those, though, there are plenty of other use cases:

  • Sharing a visualization, with or without interactivity. People often share the results of a data visualization as a static image, but that’s useful only up to a point. By sharing a Jupyter notebook, you allow your target audience to dive in and play around. They can gain a thorough understanding of the data, interactively.
  • Documenting a process with code. Many programmers who blog about their programming experiences write up their posts in a Jupyter notebook. Others can download their notebook and recreate the exercise.
  • Live documentation for a library or module. Most documentation for Python modules is static; a Jupyter notebook can be used as an interactive sandbox for learning how a module works. Any Python module that runs well in a notebook interface (essentially, anything that writes to stdout as part of its behavior) is a good candidate for this.
  • Sharing code and data generally. All you need to do to share a Jupyter notebook and its associated data files is pack it up into an archive.
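
Packing a notebook and its data files into an archive, as described in the last item, needs only the Python standard library. The following is a minimal sketch; the function name bundle_notebook and the file names are hypothetical, and the demo creates throwaway files in a temporary directory rather than touching real ones.

```python
import tempfile
import zipfile
from pathlib import Path


def bundle_notebook(notebook_path, data_paths, archive_path):
    """Zip a notebook and its data files, storing only the file names."""
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in [notebook_path, *data_paths]:
            path = Path(path)
            # arcname drops the directory part so the archive unpacks flat.
            archive.write(path, arcname=path.name)
    return archive_path


# Demo with placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    nb = tmp / "analysis.ipynb"
    nb.write_text('{"cells": [], "metadata": {}, "nbformat": 4, "nbformat_minor": 4}')
    data = tmp / "measurements.csv"
    data.write_text("x,y\n1,2\n")

    archive = bundle_notebook(nb, [data], tmp / "analysis_bundle.zip")
    with zipfile.ZipFile(archive) as zf:
        bundled_names = sorted(zf.namelist())

print(bundled_names)  # ['analysis.ipynb', 'measurements.csv']
```

Because the archive is flat, a notebook that reads its data from a subdirectory would need the relative paths preserved instead; that is a design choice to adjust per project.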

Jupyter Notebook limitations

As powerful and useful as Jupyter Notebook can be, it does have some limitations that need to be taken into account.

  • Notebooks aren’t self-contained. This is the single biggest drawback of using Jupyter Notebook: Notebooks require the Jupyter runtime, along with any libraries you plan on using. A few strategies exist for creating self-contained Jupyter Notebooks, but none of them is officially supported. You’re best off distributing notebooks to people who already have infrastructure in place to run them, or don’t mind the setup (by way of Anaconda, for instance).
  • Session state cannot be saved easily. The state of any code running in a Jupyter notebook cannot be preserved and restored with Jupyter Notebook’s default toolset. Every time you load the notebook, you will need to re-run the code in it to restore its state.
  • Limited debugging and other IDE features. Jupyter Notebook is not a full-blown development environment for Python. Many of the features you would expect to find in an IDE, such as integrated interactive debugging, advanced code completion, and module management, are missing or far more basic there.
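
One possible workaround for the session-state limitation above is to save selected variables yourself and load them back in a later session. The sketch below uses the standard pickle module; it is not a built-in Jupyter feature, the helper names save_state and load_state are hypothetical, and it only works for objects that pickle can serialize.

```python
import pickle
import tempfile
from pathlib import Path


def save_state(path, **variables):
    """Write the given named variables to `path` as one pickled dict."""
    with open(path, "wb") as fh:
        pickle.dump(variables, fh)


def load_state(path):
    """Read back the dict written by save_state()."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


# Demo: a temporary directory stands in for the notebook's working folder.
with tempfile.TemporaryDirectory() as tmp:
    state_file = Path(tmp) / "notebook_state.pkl"
    # Pretend these values were expensive to compute in earlier cells.
    save_state(state_file, results={"mean": 4.5, "rows": 10}, threshold=0.8)
    restored = load_state(state_file)

print(restored["results"]["mean"])  # 4.5
```

Only load pickle files you created or trust, since unpickling can execute arbitrary code. Open connections, running threads, and similar live objects cannot be saved this way, so re-running the relevant cells remains necessary for those.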
