Part 3: How to Analyze a Database File with GPT-3.5

In part 3 of this blog series, we'll explore the proper usage of data analysis with ChatGPT-3.5 and how you can analyze and visualize data from a SQLite database to help you make the most of your data.  

Suppose we have a database file for a gold broker company that sells gold and we want to analyze the total monthly sales for each product. What do we do and what is the logical question to understand that gold.db?
 

Tools You Will Need

The first step

Open the gold.db file in the database browser (SQLite DB Browser) and copy the database schema information:

  • CREATE TABLE Customer ( CustomerID INTEGER PRIMARY KEY AUTOINCREMENT, FirstName TEXT, LastName TEXT, Email TEXT, Phone TEXT, Address TEXT, RegistrationDate DATE );

  • CREATE TABLE OrderDetails ( OrderDetailID INTEGER PRIMARY KEY AUTOINCREMENT, OrderID INTEGER, ProductID INTEGER, Quantity INTEGER, PriceAtTimeOfPurchase REAL, FOREIGN KEY (OrderID) REFERENCES Orders(OrderID);

  • Updated reference FOREIGN KEY (ProductID) REFERENCES Product(ProductID) ); CREATE TABLE Orders ( OrderID INTEGER PRIMARY KEY AUTOINCREMENT, CustomerID INTEGER, OrderDate DATE, Status TEXT, FOREIGN KEY (CustomerID);

  • REFERENCES Customer(CustomerID) ); CREATE TABLE Product ( ProductID INTEGER PRIMARY KEY AUTOINCREMENT, ProductName TEXT, MetalType TEXT, Form TEXT, Weight REAL, Purity REAL, StockQuantity INTEGER, PricePerUnit REAL );

  • CREATE TABLE Supplier ( SupplierID INTEGER PRIMARY KEY AUTOINCREMENT, SupplierName TEXT, Contact TEXT, Email TEXT, Phone TEXT ); CREATE TABLE Supply ( SupplyID INTEGER PRIMARY KEY AUTOINCREMENT, SupplierID INTEGER, ProductID INTEGER, SupplyDate DATE, Quantity INTEGER, PricePerUnit REAL, FOREIGN KEY (SupplierID) REFERENCES Supplier(SupplierID), FOREIGN KEY (ProductID) REFERENCES Product(ProductID);
     

The second step

Write the following question in ChatGPT!

You are a data engineer and have a sqlite3 database with the following schema. You should write a python script that you can run in a Jupyter Notebook that should draw a diagram. The data in the database comes from sales of a gold broker company. Build a chart that shows total sales of products by month as well as group sales by product types.

Database schema:

CREATE TABLE Customer ( CustomerID INTEGER PRIMARY KEY AUTOINCREMENT, FirstName TEXT, LastName TEXT, Email TEXT, Phone TEXT, Address TEXT, RegistrationDate DATE ); CREATE TABLE OrderDetails ( OrderDetailID INTEGER PRIMARY KEY AUTOINCREMENT, OrderID INTEGER, ProductID INTEGER, Quantity INTEGER, PriceAtTimeOfPurchase REAL, FOREIGN KEY (OrderID) REFERENCES Orders(OrderID), -- Updated reference FOREIGN KEY (ProductID) REFERENCES Product(ProductID) ); CREATE TABLE Orders ( OrderID INTEGER PRIMARY KEY AUTOINCREMENT, CustomerID INTEGER, OrderDate DATE, Status TEXT, FOREIGN KEY (CustomerID) REFERENCES Customer(CustomerID) ); CREATE TABLE Product ( ProductID INTEGER PRIMARY KEY AUTOINCREMENT, ProductName TEXT, MetalType TEXT, Form TEXT, Weight REAL, Purity REAL, StockQuantity INTEGER, PricePerUnit REAL ); CREATE TABLE Supplier ( SupplierID INTEGER PRIMARY KEY AUTOINCREMENT, SupplierName TEXT, Contact TEXT, Email TEXT, Phone TEXT ); CREATE TABLE Supply ( SupplyID INTEGER PRIMARY KEY AUTOINCREMENT, SupplierID INTEGER, ProductID INTEGER, SupplyDate DATE, Quantity INTEGER, PricePerUnit REAL, FOREIGN KEY (SupplierID) REFERENCES Supplier(SupplierID), FOREIGN KEY (ProductID) REFERENCES Product(ProductID).

The answer will be as follows in ChatGPT (code)

import sqlite3
import pandas as pd
import matplotlib.pyplot as plt

# Step 1: Connect to the SQLite database
db_connection = sqlite3.connect('your_database.db')  # Replace 'your_database.db' with the actual database file path

# Step 2: Query the database to retrieve the required data
query = """
    SELECT
        strftime('%Y-%m', o.OrderDate) AS Month,
        p.MetalType AS ProductType,
        SUM(od.Quantity * od.PriceAtTimeOfPurchase) AS TotalSales
    FROM Orders o
    JOIN OrderDetails od ON o.OrderID = od.OrderID
    JOIN Product p ON od.ProductID = p.ProductID
    GROUP BY Month, ProductType
    ORDER BY Month
"""

# Execute the query and fetch the data into a Pandas DataFrame
df = pd.read_sql_query(query, db_connection)

# Step 3: Pivot the data for plotting
pivot_df = df.pivot(index='Month', columns='ProductType', values='TotalSales')
pivot_df.fillna(0, inplace=True)

# Step 4: Create the line chart using Matplotlib
plt.figure(figsize=(12, 6))
for product_type in pivot_df.columns:
    plt.plot(pivot_df.index, pivot_df[product_type], marker='o', linestyle='-', label=product_type)
    

plt.xlabel('sales Month and year')
plt.ylabel('Total Sales')
plt.title('Total Product Sales by Month')
plt.legend(loc='upper left')
plt.grid(True)

# Rotate x-axis labels for better readability
plt.xticks(rotation=90)

plt.show()

# Close the database connection
db_connection.close()

The third step

Copy the code as it is, but you have to change the path to the gold.db file inside the brackets.

('your_database.db')
 

The fourth step

Open the Jupiter Notebook, paste the code and run it … .

 

The result will be as it follows:

Conclusion

ChatGPT can be a game-changer in the field of data analysis, offering a valuable resource for generating insights, explanations, and visualizations. By leveraging this advanced AI model, you can streamline your data analysis process, enhance your decision-making, and unlock the potential of your data.

Remember that while ChatGPT can assist with data analysis, it's most effective when used in conjunction with human expertise and a comprehensive understanding of your data. With the right approach, ChatGPT can be a powerful ally in your quest for data-driven insights.

Blog 11/27/23

Part 4: Save Time and Analyze the Database File

ChatGPT-4 enables you to analyze database contents with just two simple steps (copy and paste), facilitating well-informed decision-making.

Kompetenz 5/14/24

Graph Technology

We help you harness the power of graphs to transform your business. Our expertise spans from graph database modelling and graph data science to generative AI.

Blog 1/29/24

Database Analysis Report

This report comprehensively analyzes the auto parts sales database. The primary focus is understanding sales trends, identifying high-performing products, Analyzing the most profitable products for the upcoming quarter, and evaluating inventory management efficiency.

Headerbild zu Datenbanken mit Open Source
Technologie 11/12/20

Databases with Open Source

Every dynamic application needs some form of database to store its data logically and sorted. However, there is no one-size-fits-all solution, but it should always be looked at the use case to make the appropriate choice.

Blog 11/14/23

Part 2: Data Analysis with powerful Python

Analyzing and visualizing data from a SQLite database in Python can be a powerful way to gain insights and present your findings. In Part 2 of this blog series, we will walk you through the steps to retrieve data from a SQLite database file named gold.db and display it in the form of a chart using Python. We'll use some essential tools and libraries for this task.

Blog 2/21/22

The Power of Event Sourcing

This is how we used Event Sourcing to maintain flexibility, handle changes, and ensure efficient error resolution in application development.

Blog 3/12/21

Introduction to Web Programming in F# with Giraffe – Part 3

In this series we are investigating web programming with Giraffe and the Giraffe View Engine plus a few other useful F# libraries.

Blog 7/16/21

Building A Shell Application for Micro Frontends | Part 4

We already have a design system, several micro frontends consuming this design system, and now we need a shell application that imports micro frontends and displays them.

Blog 9/15/22

Introduction to Functional Programming in F# – Part 3

Dive into F# data structures and pattern matching. Simplify code and enhance functionality with these powerful features.

Blog 10/1/22

Introduction to Functional Programming in F# – Part 4

Unlock F# collections and pipelines. Manage data efficiently and streamline your functional programming workflow with these powerful tools.

Blog

Crisis management & building a sustainable future with AI

Non-profit organizations develop AI models to tackle global challenges - and draw lessons for businesses worldwide

Headerbild zu IBM DB2
Technologie

IBM Db2

The IBM Db2database has been established on the market for many years as the leading data warehouse database in addition to its classic use in operations.

Blog 12/22/22

Introduction to Functional Programming in F# – Part 6

Learn error handling in F# with option types. Improve code reliability using F#'s powerful error-handling techniques.

Blog 9/17/21

How to gather data from Miro

Learn how to gather data from Miro boards with this step-by-step guide. Streamline your data collection for deeper insights.

Headerbild zur automatischen Handschrifterkennung bei Versicherern
Branche

Automatic handwriting recognition for insurers

The recognition of handwriting works "out of the box". In addition, we support our customers in further document classification and extraction of specialized data so that it can be processed automatically.

Header zu Fullstack Development
Service

Fullstack Development

The trend in Software Development is towards Full-Stack Development. Full-stack developers are programmers who work in both frontend and backend development and thus have competencies in the areas of databases, servers, systems and clients.

Referenz

The digital customer file with IBM Content Manager

The prefabricated house specialist SchwörerHaus KG has relied on IBM technology for many years to set up a digital customer file.

Service 4/13/23

Training courses on how to deploy target software

Do you want to maintain the software for your idea and innovation management yourself? We will be happy to provide you with the necessary know-how in our training courses.

Blog 6/16/23

CSS :has() & Responsive Design

In my journey to tackle a responsive layout problem, I stumbled upon the remarkable benefits of the :has() pseudo-class. Initially, I attempted various other methods to resolve the issue, but ultimately, embracing the power of :has() proved to be the optimal solution. This blog explores my experience and highlights the advantages of utilizing the :has() pseudo-class in achieving flexible layouts.

Referenz 8/8/22

Interdisciplinary collaboration at C&A with Atlassian

To homogenize the toolchain TIMETOACT replaced two independent ticketing systems for C&A Services GmbH &Co. with the Atlassian product range. With this step into the enterprise cloud, the fashion retailer is putting is putting an exclamation mark on cross-departmental and cross-location digital collaboration.

Bleiben Sie mit dem TIMETOACT GROUP Newsletter auf dem Laufenden!