Visualize & Interpret Values from Supermarket Datasets

Data Science

I’m currently building a data science project. What is data science? Data science combines mathematics and statistics, custom programming, advanced analytics, artificial intelligence (AI), and machine learning with specific subject matter expertise to uncover actionable insights hidden in organizational data. These insights can be used to guide decision making and strategic planning.

In this post I will briefly explain what the project needs, starting with the tools and the programming language I use.

1. Jupyter Notebook

Get Up and Running With the Jupyter Notebook

Jupyter Notebook is not included with Python, so if you want to try it, you’ll need to install Jupyter yourself.

There are many distributions of the Python language. The most popular is CPython, the reference implementation of Python, which you can get from the official Python website. This guide also assumes that you are using Python 3.
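If you are not sure which Python you have, a quick check like the one below can help before installing anything. This is only a minimal sketch; the function name is_python3 is made up for illustration.

```python
import sys


def is_python3(version_info=sys.version_info):
    """Return True when the given version tuple belongs to Python 3."""
    return version_info[0] == 3


if __name__ == "__main__":
    # Print the interpreter version so you can see which Python is running.
    print("Running Python", sys.version.split()[0])
    if not is_python3():
        print("This walkthrough assumes Python 3; please install it first.")
```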

Jupyter can also export notebooks to PDF, HTML, and Python scripts. This time, I host my notebook through gist.
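To give a feel for what exporting to a Python script involves: a notebook file is JSON, and its code cells can be joined into one script. The sketch below is an illustration only, not how Jupyter itself does the export (in practice you would use the notebook's export menu or the jupyter nbconvert command). The function name notebook_to_script and the tiny example notebook are made up.

```python
import json


def notebook_to_script(notebook_json: str) -> str:
    """Join the code cells of a notebook (given as JSON text) into one script."""
    notebook = json.loads(notebook_json)
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        # "source" may be a single string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        chunks.append(source.rstrip())
    return "\n\n".join(chunks) + "\n"


# A tiny made-up notebook, used only to demonstrate the function.
example_notebook = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": ["# Title\n"]},
        {"cell_type": "code", "source": ["total = 1 + 2\n", "print(total)"]},
        {"cell_type": "code", "source": "x = 1"},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
})

script = notebook_to_script(example_notebook)
print(script)
```

The markdown cell is dropped and the two code cells end up separated by a blank line.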

Installation

If you already have Python installed, you can use a handy tool that comes with it, called pip, to install Jupyter Notebook like this:

$ pip install jupyter

Starting the Jupyter Notebook Server

Now that you have Jupyter installed, let’s learn how to use it. To get started, all you need to do is open up your terminal application and go to a folder of your choice. I recommend starting in something like your Documents folder and creating a subfolder there called Notebooks, or something else that is easy to remember.

Then just go to that location in your terminal and run the following command:

$ jupyter notebook

2. Visual Studio Code

Visual Studio Code is a free, lightweight but powerful source code editor that runs on your desktop and on the web and is available for Windows, macOS, Linux, and Raspberry Pi OS.

It comes with built-in support for JavaScript, TypeScript, and Node.js and has a rich ecosystem of extensions for other programming languages (such as C++, C#, Java, Python, PHP, and Go), runtimes (such as .NET and Unity), environments (such as Docker and Kubernetes), and clouds (such as Amazon Web Services, Microsoft Azure, and Google Cloud Platform).

3. Python

Python is the programming language of choice for data scientists. Although it was not the first language widely used in the field, its popularity has grown over the years:

  1. In 2016, it overtook R on Kaggle, the premier platform for data science competitions.
  2. In 2017, it overtook R on KDnuggets’ annual poll of data scientists’ most-used tools.
  3. In 2018, 66% of data scientists reported using Python daily, making it the number one language for analytics professionals.
  4. In 2021, it overtook both C and Java to reach the top of the TIOBE index, making it the most popular programming language by that measure.

This project uses Python libraries such as Matplotlib and pandas.
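As a rough idea of how the two libraries work together, here is a minimal sketch that summarizes a table with pandas and draws the result with Matplotlib. The rows and column names (product_line, total) are made up for illustration and are not taken from the actual dataset.

```python
import matplotlib

matplotlib.use("Agg")  # render to a file so the sketch also runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows for illustration only; the real dataset's columns may differ.
sales = pd.DataFrame({
    "product_line": ["Food", "Fashion", "Food", "Electronics", "Fashion"],
    "total": [20.0, 35.5, 14.5, 80.0, 10.0],
})

# Sum the sales per product line, largest first.
totals = sales.groupby("product_line")["total"].sum().sort_values(ascending=False)

# Draw the totals as a bar chart and save it in the current folder.
fig, ax = plt.subplots(figsize=(6, 4))
totals.plot(kind="bar", ax=ax)
ax.set_xlabel("Product line")
ax.set_ylabel("Total sales")
ax.set_title("Sales by product line (toy data)")
fig.tight_layout()
fig.savefig("sales_by_product_line.png")
```

Inside a Jupyter notebook you would normally leave out the matplotlib.use("Agg") line and let the chart display below the cell instead of saving it.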

4. Dataset

The dataset I’m using is from Kaggle, a community platform whose members are mostly data scientists and data analysts. It hosts tons of datasets, tutorials, and more. The download link for my dataset is attached below.
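Once a dataset is downloaded as a CSV file, a first look with pandas might go like the sketch below. To keep the example self-contained it reads a small made-up CSV from memory; the column names and the file name in the comment are placeholders, not the real dataset.

```python
import io

import pandas as pd

# Stand-in for a downloaded file; with a real download you would pass its
# path instead, e.g. pd.read_csv("supermarket.csv") (hypothetical file name).
csv_text = io.StringIO(
    "invoice_id,city,total\n"
    "A-1,North,20.5\n"
    "A-2,South,\n"
    "A-3,North,14.0\n"
)

df = pd.read_csv(csv_text)

print(df.shape)         # number of rows and columns
print(df.dtypes)        # column types inferred by pandas
print(df.isna().sum())  # missing values per column
print(df.head())        # first few rows
```

Checking the shape, types, and missing values first makes it easier to decide what needs cleaning before any plotting.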

Get Started with the Project

Conclusion

That’s the project I’m working on. If something is wrong, please comment below so that there are no misunderstandings, and I will fix it as soon as possible. Thank you, and have a nice day.
