Arunita Sarkar

MS in Information Systems - Big Data Analytics | Data Analyst | Ex-Tata Consultancy Services LTD

Atlanta, Georgia, USA

About Me

👤 Data Analysis, Data Engineering, Data Modeling, Statistical Analysis, Data Visualization
💻 SQL | MS Excel | Python | R | Tableau | Power BI
🤍 Dog, Food, Books, Travel, Nail Art
🌐 India
🗩 Bengali, Hindi, English, Spanish

Skill Level

SQL
90%

Python
85%

R
50%

Tableau
70%

Microsoft Power BI
40%

MS Excel
75%

Azure
25%

Agile
99%

Projects

TV SHOW DATA CLEANING AND SEGMENTATION


Tools : Python - Pandas, TextBlob, Scikit-learn, Segmentation - K-Means

Description: A dataset of movies and series is cleaned to make it more readable, then segmented with K-means, and the resulting clusters are assessed with evaluation metrics.

Key Features:

  • Import the necessary packages and import the file.
  • Use the shape attribute to get the number of rows and columns of the dataset.
  • Show the column names of the dataframe.
  • Drop duplicate records using the drop_duplicates() function.
  • Format each column that needed cleaning. Formatting included removing unnecessary characters, handling NULL values, stripping whitespace or other characters, and converting column datatypes.
  • Unwanted columns are dropped using drop().
  • Sort the dataset by Rating and Gross amount in descending order.
  • Segmentation is performed using K-means and evaluated with the silhouette score, Calinski-Harabasz score, and Davies-Bouldin score.
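
The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook: the toy data, the column names (Title, Rating, Gross, Notes), the median fill for missing values, and the helper names clean and evaluate_kmeans are all assumptions made for the example.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data standing in for the real movies/series file.
raw = pd.DataFrame({
    "Title": [" Show A ", "Show B", "Show B", "Show C", "Show D", "Show E", "Show F"],
    "Rating": ["8.1", "7.4", "7.4", None, "6.2", "9.0", "5.5"],
    "Gross": ["$1.2M", "$0.4M", "$0.4M", "$3.1M", None, "$2.8M", "$0.1M"],
    "Notes": ["x", "y", "y", "z", "w", "v", "u"],
})


def clean(df):
    """Deduplicate, format, and sort the frame (assumed cleaning rules)."""
    out = df.drop_duplicates().copy()
    # Strip surrounding whitespace from text columns.
    out["Title"] = out["Title"].str.strip()
    # Remove unwanted characters, then convert to numeric dtypes.
    out["Rating"] = pd.to_numeric(out["Rating"], errors="coerce")
    gross_text = out["Gross"].str.replace(r"[$M]", "", regex=True)
    out["Gross"] = pd.to_numeric(gross_text, errors="coerce")
    # Drop a column that is not needed for the analysis.
    out = out.drop(columns=["Notes"])
    # Handle missing values; filling with the median is an assumption here.
    out = out.fillna({"Rating": out["Rating"].median(),
                      "Gross": out["Gross"].median()})
    # Sort by Rating, then Gross, both in descending order.
    return out.sort_values(["Rating", "Gross"], ascending=False).reset_index(drop=True)


def evaluate_kmeans(df, k):
    """Cluster on the numeric columns and report the three metrics."""
    features = StandardScaler().fit_transform(df[["Rating", "Gross"]])
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(features)
    return {
        "silhouette": silhouette_score(features, labels),
        "calinski_harabasz": calinski_harabasz_score(features, labels),
        "davies_bouldin": davies_bouldin_score(features, labels),
    }


cleaned = clean(raw)
print(cleaned.shape)  # (rows, columns) after cleaning
scores = evaluate_kmeans(cleaned, k=2)
print(scores)
```

A higher silhouette or Calinski-Harabasz score and a lower Davies-Bouldin score generally indicate better-separated clusters, so running evaluate_kmeans for several values of k is one way to compare candidate segmentations.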

INSTACART MARKET BASKET ANALYSIS


Tools : SQL, SSMS, Tableau Public (Desktop Version)

Description: Inspired by the Instacart Market Basket Analysis challenge from DataLemur 🐒 (Ace the SQL & Data Interview). This analysis dives into Instacart's performance across Q2 and Q3, examining sales at both product and department levels.

Key Features:

  • Quarterly order counts.
  • Performance metrics for departments exceeding a specified minimum growth percentage.
  • Dynamic lists of top and bottom products based on user input.
  • Visual breakdown of order count by weekday in both quarters.
  • Visual breakdown of order count by hour of the day in both quarters.
  • Click the dashboard logo to open Instacart's official site and place an order.
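
As a rough illustration of the first two features, the sketch below computes quarterly order counts and filters departments by a minimum growth percentage. It runs SQL through Python's built-in sqlite3 module rather than SSMS, so the dialect differs from the original project; the orders table, its columns, the sample rows, and the function name departments_with_growth are hypothetical.

```python
import sqlite3

# In-memory database with a hypothetical, simplified orders table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, department TEXT, quarter TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "produce", "Q2"), (2, "produce", "Q2"),
        (3, "produce", "Q3"), (4, "produce", "Q3"), (5, "produce", "Q3"),
        (6, "bakery", "Q2"), (7, "bakery", "Q2"),
        (8, "bakery", "Q3"), (9, "bakery", "Q3"),
        (10, "frozen", "Q2"), (11, "frozen", "Q2"), (12, "frozen", "Q3"),
    ],
)

# Quarterly order counts.
quarterly_counts = dict(
    conn.execute("SELECT quarter, COUNT(*) FROM orders GROUP BY quarter")
)

GROWTH_QUERY = """
WITH per_dept AS (
    SELECT department,
           SUM(CASE WHEN quarter = 'Q2' THEN 1 ELSE 0 END) AS q2_orders,
           SUM(CASE WHEN quarter = 'Q3' THEN 1 ELSE 0 END) AS q3_orders
    FROM orders
    GROUP BY department
)
SELECT department,
       100.0 * (q3_orders - q2_orders) / q2_orders AS growth_pct
FROM per_dept
WHERE q2_orders > 0
  AND 100.0 * (q3_orders - q2_orders) / q2_orders >= ?
ORDER BY growth_pct DESC
"""


def departments_with_growth(connection, min_growth_pct):
    """Return (department, growth %) rows at or above the given threshold."""
    return connection.execute(GROWTH_QUERY, (min_growth_pct,)).fetchall()


print(quarterly_counts)
print(departments_with_growth(conn, 10.0))
```

Passing the threshold as a query parameter mirrors the user-driven filter on the dashboard: the same query serves any minimum growth percentage without being rewritten.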

DATA SCIENCE SALARY TREND


Tools : Python (Pandas, Matplotlib), Jupyter Notebook, Tableau Public (Desktop Version)

Description: The purpose of this project is to understand how the average salary of Data Science job titles varies with respect to different factors. Using bar charts and line charts, I show how salaries change. For my EDA, I focused primarily on the following columns: Job Title, Employment Type, Experience Level, Expertise Level, Company Location, Salary in USD, Company Size, and Year.

Key Features:

  • Dynamically choose the role you want to check.
  • Number of opportunities and the average salary throughout.
  • Percentage of the employment type.
  • Salary range for the top 10 countries.
  • Yearly count of job openings for the selected role.
  • Average salary with respect to experience level, with a visual representation of position.
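
The kind of aggregation behind these views can be sketched in Pandas as below. The sample rows, the column names (job_title, experience_level, work_year, salary_in_usd), and the helper summarize_role are assumptions for illustration and may not match the dataset or notebook used in the project.

```python
import pandas as pd

# Hypothetical sample rows; the real dataset has more columns and records.
salaries = pd.DataFrame({
    "job_title": ["Data Analyst", "Data Analyst", "Data Analyst",
                  "Data Engineer", "Data Analyst"],
    "experience_level": ["Entry", "Senior", "Entry", "Senior", "Senior"],
    "work_year": [2022, 2022, 2023, 2023, 2023],
    "salary_in_usd": [60000, 110000, 70000, 130000, 120000],
})


def summarize_role(df, role):
    """Summarize openings and average salary for one selected role."""
    subset = df[df["job_title"] == role]
    return {
        # Number of opportunities for the role.
        "openings": len(subset),
        # Average salary across all records for the role.
        "average_salary": subset["salary_in_usd"].mean(),
        # Average salary per experience level.
        "by_experience": subset.groupby("experience_level")["salary_in_usd"]
                               .mean().to_dict(),
        # Yearly count of openings.
        "openings_by_year": subset.groupby("work_year").size().to_dict(),
    }


summary = summarize_role(salaries, "Data Analyst")
print(summary)
```

Selecting the role first and aggregating afterwards matches the dashboard's role filter: every figure is recomputed for whichever job title is chosen.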

Contact