Introduction to NoSQL Databases

Caitlin C. Johnson
Last Updated: 1 February 2020

In this tutorial, you will learn how to use Python to build a data pipeline into an Apache Cassandra database running in a Docker container. The primary purpose of this project is to build a better understanding of NoSQL databases and of the situations in which a NoSQL database may be more appropriate than a relational one. By the end of this tutorial, you should have a basic understanding of how to (1) set up a Docker container running Apache Cassandra and (2) use Python to establish a data pipeline to Apache Cassandra.
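To give a rough sense of what such a pipeline can look like, here is a minimal sketch. It is not the tutorial's actual code. It assumes Cassandra has been started from the official Docker image (for example with `docker run --name cassandra -p 9042:9042 -d cassandra`) and that the DataStax `cassandra-driver` package is installed. The keyspace, table, and column names are hypothetical and chosen only for illustration.

```python
import csv
import io

# Hypothetical names used only for this illustration.
KEYSPACE = "tutorial"
TABLE = "users"

CREATE_KEYSPACE = (
    f"CREATE KEYSPACE IF NOT EXISTS {KEYSPACE} "
    "WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}"
)
CREATE_TABLE = (
    f"CREATE TABLE IF NOT EXISTS {KEYSPACE}.{TABLE} "
    "(user_id int PRIMARY KEY, name text)"
)
INSERT_ROW = f"INSERT INTO {KEYSPACE}.{TABLE} (user_id, name) VALUES (%s, %s)"


def extract(csv_text):
    """Extract: parse CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(record):
    """Transform: cast the id to int and trim whitespace from the name."""
    return (int(record["user_id"]), record["name"].strip())


def load(session, records):
    """Load: create the schema if needed, then insert one row per record."""
    session.execute(CREATE_KEYSPACE)
    session.execute(CREATE_TABLE)
    inserted = 0
    for record in records:
        session.execute(INSERT_ROW, transform(record))
        inserted += 1
    return inserted


def connect(host="127.0.0.1", port=9042):
    """Open a session to Cassandra (assumes `pip install cassandra-driver`)."""
    # Imported here so the rest of the sketch runs without the driver installed.
    from cassandra.cluster import Cluster

    cluster = Cluster([host], port=port)
    return cluster, cluster.connect()
```

With a container running, one way to use this would be to call `cluster, session = connect()`, then `load(session, extract(csv_text))` with your own CSV text, and finally `cluster.shutdown()`. Keeping `load` dependent only on an object with an `execute` method also makes the pipeline easy to test without a live database.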

Target Audience

If you have beginner-level experience with Python and want to move into data engineering, or if you are a data engineer looking to expand your skill set, this tutorial will be particularly useful for you.

Tech Stack

Language: Python

Database: Apache Cassandra

Tool(s): Docker, TablePlus


Completing the tutorial will allow you to put the following on your resume:

Developed a data pipeline using Python to insert a large data set into an Apache Cassandra NoSQL database on a Docker container and incorporated best practices for the extraction, transformation, and loading of the data.


You should have a basic understanding of programming in Python.


