Jupyter Notebook

R or Jupyter Notebook?

Although I am a fan of R, RStudio and R Notebook, some researchers are more familiar with Jupyter Notebook, particularly those who work in industry. Jupyter Notebook is a web application in which you can create and share documents that contain live code, equations, text and graphical visualizations. We can therefore use Jupyter Notebook to perform data analysis in real time. The name “Jupyter” refers to the three programming languages Julia, Python, and R, which were the first target languages. However, the Jupyter notebook technology currently supports many other languages.
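
To make the idea of a document that mixes text, equations and live code more concrete, here is a minimal sketch of how such a notebook is stored on disk. A notebook file (.ipynb) is a JSON document made of cells. The sketch builds one by hand with Python's standard library, assuming the version 4 notebook format; the function name build_notebook and the cell contents are made up for illustration.

```python
import json


def build_notebook():
    """Return a small notebook as a plain dict in the nbformat 4 layout."""
    markdown_cell = {
        "cell_type": "markdown",
        "metadata": {},
        # Text plus an equation written in LaTeX.
        "source": "# Sample mean\n$\\bar{x} = \\frac{1}{n}\\sum_{i=1}^{n} x_i$",
    }
    code_cell = {
        "cell_type": "code",
        "execution_count": None,  # the cell has not been run yet
        "metadata": {},
        "outputs": [],  # filled in by the kernel when the cell runs
        "source": "x = [2, 4, 6]\nsum(x) / len(x)",
    }
    return {
        "nbformat": 4,
        "nbformat_minor": 4,
        "metadata": {},
        "cells": [markdown_cell, code_cell],
    }


# Serialize the notebook to JSON text and show it.
notebook_text = json.dumps(build_notebook(), indent=1)
print(notebook_text)
```

If this text is saved in a file whose name ends in .ipynb, Jupyter Notebook should be able to open it and run the code cell. In practice notebooks are created from the web interface rather than written by hand.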

I have installed and tried Jupyter Notebook. Everything that we can do with Jupyter Notebook, we can also do with R Notebook; that is, using R Notebook we can also create and share documents as in Jupyter Notebook. Further, we can work with Python, C++, SQL and Stan when using R Notebook. However, R Notebook has several further advantages. For example, it provides a platform to create presentation slides, websites and dashboards, and even to write and publish books. In Jupyter Notebook, we edit and run our notebooks via a web browser. In R Notebook, we prepare the notebook by running it in RStudio, and the result can be viewed as a Word document, as a PDF, or in a web browser as an HTML document. Both applications can be run on a PC without Internet access.

Certainly, R and Python are two of the most popular programming languages used by data analysts and data scientists. Very good comparisons of the two languages are given in the following articles:

  1. Choosing R or Python for Data Analysis?
  2. Should I learn Python or R or both?

Working with Jupyter Notebook

Jupyter Notebook has two main components: the kernels and a dashboard. The default kernel runs Python code, but kernels are also available for other programming languages. The dashboard lists the notebooks, which can be reopened from there, and it can also be used to manage the kernels. You can learn more from the Jupyter Notebook Tutorial on DataCamp.

Before installing the Jupyter Notebook App on the computer, we can try it in a web browser. Go to https://try.jupyter.org and select one of the examples given there. This gives us a temporary Jupyter server running on mybinder.org. Then we can decide whether we really want to install it.

Installing Jupyter Notebook

The Jupyter notebook technology group strongly recommends installing Python and Jupyter using the Anaconda Distribution. The advantage of Anaconda is that it includes both Python and the Jupyter Notebook, and it gives us access to over 720 packages which can be easily installed with Anaconda’s conda, a package, dependency, and environment manager.

Anaconda is a package manager, an environment manager, and a Python distribution that contains a collection of many open source packages. Some of the packages needed for data science projects (e.g. numpy, scikit-learn, scipy, pandas) are preinstalled with Anaconda. Additional packages can be installed by using Anaconda’s package manager, conda.

  1. Download the latest Python version of Anaconda.
  2. Then, install it following the instructions on the download page.
  3. To run the notebook, run the following command at the Terminal (Mac/Linux) or Command Prompt (Windows):
    jupyter notebook
    This will start the notebook server and show some information about it in the terminal, including the URL of the web application (by default, http://localhost:8888).
  4. Now, you can see the notebook open in your default web browser.
  5. When the notebook opens in your browser, you will see the Notebook Dashboard having a list of the notebooks, files, and subdirectories in the directory where the notebook server was started.
  6. To get help from the notebook server, use:
    jupyter notebook --help
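
If the browser does not open by itself, it can help to check whether the notebook server is reachable at all. The following is a minimal sketch using only Python's standard library. It assumes the default address http://localhost:8888 mentioned above, and the function name server_is_listening is made up for illustration. Note that it only tells us that some program accepts connections on that port, not that the program is Jupyter.

```python
import socket


def server_is_listening(host="localhost", port=8888, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError if nothing answers on the port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if server_is_listening():
        print("A server is listening on localhost:8888")
    else:
        print("Nothing is listening on localhost:8888")
```

If the notebook server reports a different port in the terminal, pass that port to the function instead.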

Installing the Jupyter Notebook will also install the IPython kernel, which allows you to work on notebooks using the Python programming language. We can also run notebooks in other languages such as R or Julia, but for that we have to install additional kernels. For more information, see the full list of available kernels.
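
Each installed kernel is described to Jupyter by a small JSON file named kernel.json, which gives the command that starts the kernel, the name shown in the interface, and the language. The following sketch reads such a description with Python's standard library. The JSON text is an illustrative example of what an R kernel's file may look like, not a copy from a particular installation, and the function name describe_kernel is made up for illustration.

```python
import json

# Illustrative kernel.json content for an R kernel (an assumption;
# the file on your own machine may differ).
R_KERNEL_JSON = """
{
  "argv": ["R", "--slave", "-e", "IRkernel::main()", "--args", "{connection_file}"],
  "display_name": "R",
  "language": "R"
}
"""


def describe_kernel(kernel_json_text):
    """Summarize the main fields of a kernel.json document."""
    spec = json.loads(kernel_json_text)
    return {
        "name": spec["display_name"],  # label shown when creating a notebook
        "language": spec["language"],  # language the kernel executes
        "command": " ".join(spec["argv"]),  # command Jupyter runs to start it
    }


print(describe_kernel(R_KERNEL_JSON))
```

The placeholder {connection_file} in the command is replaced by Jupyter with the path of a file that tells the kernel how to communicate with the notebook server.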

Installing a kernel to run R

  1. In Anaconda Navigator, go to the Environments side tab, then click the Create button at the bottom left. The Create new environment dialog box appears.
  2. Give the R environment a name, such as R essentials, and check the “R” box so that the R language is installed.
  3. Now, click the Create button again to create the environment, and then highlight it to activate it.
  4. Both the R language and R essentials are available in the channel named MRO. In the Anaconda console, type the following command to install the most popular R packages together with all of their dependencies:
    conda install -c r r-essentials
  5. After installing R, we can see the result in Anaconda Navigator.
  6. Later, if we want to update all installed packages and their dependencies, type the following command in the console:
    conda update -c r r-essentials
    We can also update packages using Anaconda Navigator.
  7. To update a single package in R-Essentials (if a new version of the package is available in the R channel), use:
    conda update r-XXXX
  8. To search conda for an R package whose exact name you know, use:
    conda search -f EXACTNAME
  9. Now, open Jupyter Notebook, and create a new notebook by selecting R.