Connecting to DB2 from Jupyter Notebook

When I first saw Jupyter Notebook at IBM Insight (at World of Watson) in 2016, I knew it was something I had to try. I have been excited about the possibility of making greater use of it. I’ve finally found the time to work with it a bit more and wanted to share the basics on how to install and use Jupyter Notebook with DB2.

What is Jupyter Notebook?

Jupyter Notebook is a Python based tool that is often used in data science. It is open source and web-based. The real power with Jupyter Notebook is that it allows you to combine cells of formatted text with cells of code that can be executed right inline. The data can then be easily displayed in powerful ways including graphs and charts with very little effort.

For the DB2 DBA, this can offer several advantages. For one thing, being familiar with a data science tool is never a bad thing for a data professional. For another thing, this format offers a powerful way to explain things in DB2 in a way that others can easily try things out or visualize the data. Just a few things I can see using Jupyter Notebook for:

  • Team documentation
  • Common troubleshooting procedures or cookbook
  • Database health check
  • Reorg/runstats script with details teaching what is being done and why
  • Documentation of repetitive database task – report generation, etc
  • Presentation or blog entry with SQL built in for an awesome takeaway

Installing Jupyter Notebook on Windows

This is not a one-step install, but it is also not impossible. I am listing the steps for Windows here because it is sometimes my first sandbox for trying new things and is a very accessible place (because it is the host OS for my laptop) for trying new things. I expect to add a blog entry with details for Linux in the future.

The steps I took using Windows 10:

  1. Download Anaconda
  2. Install Anaconda, accepting the defaults
  3. If you don’t have it already, install a DB2 client or server on the same machine
  4. Download and install visual C++ tools
  5. Open a DB2 command window (as administrator) and launch Jupyter Notebook:
    JupyterNotebook1
  6. A browser window will open. Select New, and Python 3:
    JupyterNotebook2

Congratulations, you have created your first Jupyter Notebook. Proceed to the next section to learn how to configure and use it with DB2.

Configuring Jupyter Notebook to Run with the SQL Magic

Enter the following in a cell, and then execute it (by clicking play at the top, or by hitting shift+enter). You will need to change the path to your SQLLIB directory for the DB2 installation, if it is different.

import sys,os,os.path
os.environ['IBM_DB_HOME']='C:\Program Files\IBM\SQLLIB'
!pip install ipython-sql
!pip install ibm_db 
!pip install ibm_db_sa

At this point, you need to restart the Jupyter Notebook Kernel – the following steps will fail if you do not. This can be accomplished by clicking the little refresh like button at the top:JupyterNotebook3

Once the Jupyter Notebook Kernel has been restarted, enter the following in a cell and execute it.

import ibm_db
import ibm_db_sa
import sqlalchemy
%load_ext sql

You now have the basics you need to connect to any local or cataloged DB2 database.

Jupyter Notebooks are often shared using github. Check out my basic connection Jupyter Notebook. It has all the details above plus a connection statement, and can be used as a great starting point.

Summary

This blog entry doesn’t even touch on the power of Jupyter Notebook and why you should consider using it. That will come. Play with it and let me know what you think and what ideas you have.

You may also like...

Leave a Reply

Your email address will not be published. Required fields are marked *