Streamline dbt Model Development with Notebook-Style Workspace | by Khuyen Tran | Jun, 2023

A free alternative to dbt Cloud is Mage, an open-source data pipeline tool for data transformation and integration tasks.

Mage complements dbt with several benefits:

  1. Integrated Web-based IDE: Mage provides a web-based IDE where you can develop and explore data models within a single interface.
  2. Language Flexibility: With Mage, you can use other tools and languages alongside dbt for data processing.
  3. Visualizing dbt Model Output: Mage has built-in visualization, so you can view the output of dbt models in a few clicks.
  4. Data Extraction and Loading: Beyond data transformation, Mage also handles data extraction and loading, which makes an end-to-end data pipeline possible in one tool.
  5. Pipeline Scheduling and Retry Mechanism: Mage lets you schedule data pipelines and automatically retry failed components, which makes pipeline runs more reliable.

Let’s dive deeper into each of these features.

Feel free to explore and experiment with the source code by cloning the accompanying GitHub repository.

Install Mage

You can install Mage using Docker, pip, or conda. This article will use Docker to install Mage and initialize the project.

docker run -it -p 6789:6789 -v $(pwd):/home/src mageai/mageai /app/run_app.sh mage start [project_name]

For example, let’s name our project “dbt_mage,” so the command becomes:

docker run -it -p 6789:6789 -v $(pwd):/home/src mageai/mageai /app/run_app.sh mage start dbt_mage
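
To make the flags in this command easier to read, here is a minimal sketch that wraps it in a shell function. The function name start_mage and the DRY_RUN variable are hypothetical helpers for this article, not part of Mage or Docker; the docker command itself is the one shown above.

```shell
# start_mage is a hypothetical wrapper for this article, not a Mage command.
start_mage() {
  project_name="${1:-dbt_mage}"
  # -it                      -i keeps STDIN open, -t allocates a terminal
  # -p 6789:6789             publish container port 6789 on host port 6789
  # -v "$(pwd)":/home/src    mount the current directory into the container
  # When DRY_RUN is set, the command is printed instead of executed.
  ${DRY_RUN:+echo} docker run -it -p 6789:6789 -v "$(pwd)":/home/src \
    mageai/mageai /app/run_app.sh mage start "$project_name"
}

# Print the command without starting a container.
DRY_RUN=1 start_mage dbt_mage
```

Calling start_mage dbt_mage without DRY_RUN runs the real command. Because the current directory is mounted at /home/src, the project files are written to your host machine rather than staying inside the container.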

Find other ways to install Mage in the Mage documentation.

Create a pipeline

Open http://localhost:6789/ in your browser to view the Mage UI.

Click on “New” and select “Standard (batch)” to create a new batch pipeline. Rename it to “dbt_pipeline.”

Install dependencies

Since we will use BigQuery as the data warehouse for dbt, we need to install dbt-bigquery by adding it to the “requirements.txt” file and clicking on “Install packages.”
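
The same edit can also be made from a terminal. This is a minimal sketch, assuming that requirements.txt sits at the root of the dbt_mage project folder (an assumption about the project layout). The commented-out pip command is only a guess at a manual equivalent of the “Install packages” button.

```shell
# Assumption: requirements.txt lives at the root of the Mage project folder.
REQ_FILE="dbt_mage/requirements.txt"
mkdir -p dbt_mage
touch "$REQ_FILE"

# Append the BigQuery adapter only if it is not already listed,
# so running this twice does not duplicate the line.
grep -qx 'dbt-bigquery' "$REQ_FILE" || echo 'dbt-bigquery' >> "$REQ_FILE"

cat "$REQ_FILE"

# Assumed manual equivalent of clicking "Install packages":
# pip install -r "$REQ_FILE"
```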

Create a dbt project

To create a dbt project, navigate to the right panel and click on the terminal button.

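Move to the “dbt” folder under your project and run dbt init. The -s flag (short for --skip-profile-setup) skips the interactive profile prompts, since we will write profiles.yml by hand in the next step:

cd dbt_mage/dbt
dbt init demo -s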

This command adds the “demo” folder to the dbt directory.

Right-click on the “demo” folder and create a new file named “profiles.yml.” Specify your BigQuery credentials in this file.
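
As a rough guide to what this file can contain, the sketch below writes a profiles.yml that uses dbt-bigquery's service-account method. The project ID, dataset name, and key file path are placeholders, not values from this article. The profile name “demo” is assumed to match the “profile” key that dbt init wrote to demo/dbt_project.yml.

```shell
# Run from the directory that contains the Mage project (dbt_mage in this article).
# Every value marked "placeholder" is hypothetical; substitute your own.
PROFILE_DIR="dbt_mage/dbt/demo"
mkdir -p "$PROFILE_DIR"

# The quoted 'EOF' keeps the shell from expanding anything inside the file body.
cat > "$PROFILE_DIR/profiles.yml" <<'EOF'
demo:                 # assumed to match "profile" in demo/dbt_project.yml
  target: dev
  outputs:
    dev:
      type: bigquery
      method: service-account
      project: your-gcp-project-id            # placeholder: GCP project ID
      dataset: your_dataset                   # placeholder: dataset for dbt models
      keyfile: /path/to/service-account.json  # placeholder: service account key
      threads: 1
EOF

cat "$PROFILE_DIR/profiles.yml"
```

Keep the service account key file out of version control, since it grants access to your BigQuery project.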
