SnehaTanwar006/Python_EDA

Python_EDA

  • Exploratory Data Analysis

EXPLORATORY DATA ANALYSIS IN PYTHON

What is Exploratory Data Analysis?

Exploratory Data Analysis (EDA) is the process of examining and summarizing a dataset to understand its characteristics, identify patterns, and make informed decisions. It involves calculating summary statistics, visualizing data through plots and charts, identifying missing or inconsistent data, and exploring relationships between variables. EDA provides insights that guide further analysis and decision-making.

EDA helps determine how best to manipulate data sources to get the answers you need, making it easier for data scientists to discover patterns, spot anomalies, test a hypothesis, or check assumptions.

How to perform Exploratory Data Analysis?

Performing EDA involves the following concise steps:

1. Understand the Data: Get familiar with the dataset's structure, variables, and potential missing values or inconsistencies.

2. Clean the Data: Handle missing values, erroneous data, and duplicates appropriately.

3. Data Transformation: Rename confusing columns for better readability and drop irrelevant columns.

4. Calculate Summary Statistics: Compute basic statistics like mean, median, and standard deviation for numeric variables, and frequency counts for categorical variables.

5. Visualize the Data: Create plots such as histograms, box plots, and scatter plots to visualize the data distribution, outliers, and relationships between variables.

6. Analyze Relationships: Identify correlations between numeric variables and visualize them using correlation matrices or scatter plots.

7. Identify Outliers and Anomalies: Spot unusual observations that deviate significantly from the norm.

8. Handle Categorical Variables: Analyze categorical variables using bar plots or pie charts to understand category distributions.

9. Iterate and Explore: Continuously explore the data, generate hypotheses, and delve deeper into specific aspects for further analysis.
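
As a rough illustration of how several of these steps might look in code, here is a minimal sketch using pandas. The tiny DataFrame, its values, the short column names (`hp`, `mpg_h`, `make`) and the 1.5 × IQR outlier rule are all assumptions made for the example; they are not taken from the actual dataset or from this repository's notebook. Plotting (steps 5 and 8) is left out to keep the sketch short.

```python
import pandas as pd

# Invented sample data standing in for a real dataset (values are illustrative only).
raw = pd.DataFrame({
    "Make": ["A", "B", "B", "C", "C", "D"],
    "Engine HP": [150.0, 200.0, 200.0, None, 300.0, 1000.0],
    "highway MPG": [35, 30, 30, 28, 22, 12],
    "Market Category": ["x", "y", "y", "z", "x", "y"],
})

# Step 1. Understand the data: size and missing values per column.
shape = raw.shape
missing_per_column = raw.isna().sum()

# Step 2. Clean the data: drop exact duplicate rows and rows with no horsepower value.
clean = raw.drop_duplicates().dropna(subset=["Engine HP"])

# Step 3. Transform: drop a column assumed to be irrelevant and rename the rest.
clean = clean.drop(columns=["Market Category"]).rename(
    columns={"Make": "make", "Engine HP": "hp", "highway MPG": "mpg_h"}
)

# Step 4. Summary statistics for numeric columns, frequency counts for a categorical one.
numeric_summary = clean[["hp", "mpg_h"]].agg(["mean", "median", "std"])
make_counts = clean["make"].value_counts()

# Step 6. Relationships: pairwise correlation between the numeric columns.
corr = clean[["hp", "mpg_h"]].corr()

# Step 7. Outliers: flag values more than 1.5 * IQR outside the quartiles (one common rule of thumb).
q1, q3 = clean["hp"].quantile([0.25, 0.75])
iqr = q3 - q1
outliers = clean[(clean["hp"] < q1 - 1.5 * iqr) | (clean["hp"] > q3 + 1.5 * iqr)]

print(numeric_summary)
print(corr)
print(outliers)
```

The frequency table in `make_counts` is the same data a bar plot of a categorical column would show, so it can serve as a quick text substitute before any charts are drawn.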

What data are we exploring today?

We have a dataset of cars with more than 10,000 rows and more than 10 columns. The columns describe features of each car, such as Engine Fuel Type, Engine HP, Transmission Type, highway MPG, city MPG and many more.
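
A first look at a dataset like this could be sketched as follows. The file name and location of the cars data are not given here, so the example reads a few invented rows from an in-memory string instead; only the column names come from the description above, and every value is made up for illustration.

```python
import io

import pandas as pd

# Stand-in for the real file: invented rows that reuse the column names listed above.
# With the actual dataset, pass the path of the CSV file to pd.read_csv instead.
sample_csv = io.StringIO(
    "Engine Fuel Type,Engine HP,Transmission Type,highway MPG,city MPG\n"
    "regular unleaded,170,AUTOMATIC,33,24\n"
    "premium unleaded,300,MANUAL,26,18\n"
    "diesel,,AUTOMATIC,40,30\n"
)
cars = pd.read_csv(sample_csv)

# Basic structure: number of rows and columns, column types, missing values.
n_rows, n_cols = cars.shape
column_types = cars.dtypes
missing = cars.isna().sum()

# Frequency counts for one categorical feature.
transmission_counts = cars["Transmission Type"].value_counts()

print(cars.head())
print(column_types)
print(missing)
print(transmission_counts)
```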
