Definition of Statistics:

The science of collecting, describing & interpreting data.

Types of Statistics:

01. Descriptive Statistics: Methods of organizing, summarizing, and presenting data in an informative way.

     Topics under Descriptive Statistics:
     – Measures of central tendency (Mean, Median, Mode)
     – Measures of dispersion (Variance, Standard deviation)
     – Different types of distribution of data (Histograms, Probability density function, Probability mass function)

02. Inferential Statistics: Methods used to reach a conclusion about the population on the basis of a sample.

     Topics under Inferential Statistics:
     – Hypothesis testing (p-value, significance level, etc.)
       – z-test
       – t-test
       – Chi-square test
       – ANOVA (F-test)
  Example:
  Data: Ages of students in one class: {18, 17, 20, 21, 19, 18, 17, 22}
  Descriptive statistics – What is the average age of the class?
  Inferential statistics – Is the average age of the whole college the same as that of this particular class?
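
As a rough sketch, the two questions above can be worked through in Python using only the standard library. The ages come from the example; the college-wide mean of 20 is a made-up value used purely for illustration, and treating the class as a random sample of the college is also an assumption.

```python
import math
import statistics

# Ages of the students in one class (data from the example above).
ages = [18, 17, 20, 21, 19, 18, 17, 22]

# Descriptive statistics: summarize the class itself.
mean_age = statistics.mean(ages)
median_age = statistics.median(ages)
modes = statistics.multimode(ages)      # every value tied for most frequent
sample_sd = statistics.stdev(ages)      # sample standard deviation (divides by n - 1)

print(f"mean={mean_age}, median={median_age}, modes={modes}, sd={sample_sd:.3f}")

# Inferential statistics: one-sample t statistic comparing the class mean
# with an ASSUMED college-wide mean (hypothetical, not from the article).
college_mean = 20
n = len(ages)
t_stat = (mean_age - college_mean) / (sample_sd / math.sqrt(n))

print(f"t = {t_stat:.3f} with {n - 1} degrees of freedom")
```

The t statistic would then be compared with a critical value from a t table (about 2.365 for 7 degrees of freedom at the 5% two-sided level) to decide whether the difference is statistically significant.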

Applications of Statistical Concepts:

  • Finance – Correlation, regression, time series analysis
  • Marketing – Hypothesis testing, chi-square test, non-parametric testing
  • Academic Research – Hypothesis testing, chi-square test, non-parametric testing
  • Operations management – Hypothesis testing, ANOVA, time series analysis
  • Retailing – Sales data, distribution analysis, in-store promotion, new product development

Basic Terms of Statistics:

Population: A collection or group of individuals, objects, or events whose properties are to be analyzed. Its size is represented by N.
Example: All voters of Bangladesh
Sample: A subset of the population. Its size is represented by n.
Example: 1000 voters from different voting areas, surveyed to estimate who will win the election
Variable: A characteristic of each individual element of a population or sample. It can take on many values.
Example:
“Age” is a variable with values such as 19, 14, 23, 17, etc.
“Type of flower” is a variable with values such as Rose, Lily, Sunflower, etc.
Data: The observed values of a variable. Data may be singular or plural.
Example: male, 34, 30-09-22, etc.
Singular data: The value of the variable associated with one element of a population or sample. This may be a number, a symbol, or a word.
Example: Age: 23
Plural data: The set of values collected for the variable from each of the elements belonging to the sample.
Example: Ages: [23, 21, 30, 34, 23]
Parameter: A parameter is a number describing a whole population.
Example: population mean
Statistic: A statistic is a number describing a sample.
Example: sample mean
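
To make the parameter vs. statistic distinction concrete, here is a minimal sketch. The population is randomly generated, made-up data (10,000 voter ages), not real figures; the sizes and the seed are arbitrary choices for illustration.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: ages of N = 10,000 voters (made-up data).
population = [random.randint(18, 90) for _ in range(10_000)]
N = len(population)

# Parameter: a number computed from the whole population.
population_mean = statistics.mean(population)

# Sample: n = 1,000 voters drawn at random, without replacement.
sample = random.sample(population, k=1000)
n = len(sample)

# Statistic: a number computed from the sample only.
sample_mean = statistics.mean(sample)

print(f"N={N}, parameter (population mean) = {population_mean:.2f}")
print(f"n={n}, statistic (sample mean)     = {sample_mean:.2f}")
```

In practice the population mean is usually unknown, and the sample mean is used as an estimate of it.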

Cochran’s Formula:

Cochran’s formula lets you calculate an ideal sample size given a desired level of precision, a desired confidence level, and the estimated proportion of the attribute present in the population.

The Cochran formula is: n_0 = Z^2 pq / e^2

Where:

  • e is the desired level of precision (i.e. the margin of error),
  • p is the (estimated) proportion of the population that has the attribute in question,
  • q is 1 – p.
  • Z is the z-value for the desired confidence level, found in a Z table.

Example:

Suppose we are doing a study on the inhabitants of a large town, and want to find out what proportion of households serve breakfast in the mornings.

We don’t have much information on the subject to begin with, so we’re going to assume that half of the families serve breakfast: this gives us maximum variability.

So p = 0.5. Now let’s say we want 95% confidence and a precision of plus or minus 5 percent. A 95% confidence level gives us a Z value of 1.96, per the normal tables, so we get

n_0 = ((1.96)^2 (0.5) (0.5)) / (0.05)^2 = 384.16, which we round up to 385

So a random sample of 385 households in our target population should be enough to give us the confidence levels we need.
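
The calculation above can be sketched as a small Python function. The function name and parameter names are illustrative choices, not part of any library; the Z value is derived from the confidence level with the standard library's NormalDist instead of a printed Z table.

```python
import math
from statistics import NormalDist


def cochran_sample_size(confidence: float, p: float, e: float) -> int:
    """Cochran's sample size n_0 = Z^2 * p * q / e^2, rounded up.

    confidence: desired confidence level, e.g. 0.95
    p:          estimated proportion with the attribute, e.g. 0.5
    e:          desired margin of error, e.g. 0.05
    """
    # Two-sided z-value for the confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    q = 1 - p
    n0 = (z ** 2) * p * q / (e ** 2)
    # Round up: a fractional household cannot be sampled.
    return math.ceil(n0)


# The breakfast example: 95% confidence, p = 0.5, precision of +/- 5%.
print(cochran_sample_size(confidence=0.95, p=0.5, e=0.05))
```

Tightening the margin of error raises the required sample size quickly, since e is squared in the denominator.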

Types of variable:

  1. Qualitative (attribute or categorical) variable – Non-numeric characteristic.

Example: Gender, eye color, hair color, country name, types of flowers, etc.

  2. Quantitative variable – Numeric characteristic.

Example: Height, weight, number of children, etc.

Quantitative variables are classified as below:

    1. Discrete variable: Takes only countable values, typically whole numbers.

Example: Number of children – 2, 3, 5, etc.

    2. Continuous variable: Can take any value within a range, including whole numbers and decimals.

Example: Height – 175.25 cm, 180.5 cm, etc.


Data Types in Statistics:

While doing Exploratory Data Analysis in a data science project, we should have a good understanding of the different data types, since certain statistical measurements are only valid for specific data types. Data types are also known as measurement scales.

Also, we need to know which visualization method fits the particular data type.

Data types are divided into the two categories below:

01. Quantitative Data:

  • Expressed as numbers & measured by numerical variables.
  • Represented by line graphs, bar graphs, scatter plots, etc.
  • Examples: Exam scores – 74, 76, 98, etc. Weight – 85.2 kg, 56 kg, etc.

Quantitative data is of two types, as below:

    • Discrete data: Only whole (integer) numbers. Cannot be divided into smaller parts.

Example: Number of students – 25

    • Continuous data: Whole numbers & decimal numbers. Can take any value between two whole numbers.

Example: Weight of a person – 67.4 kg

Continuous data is divided into two types, as below:

      • Interval data: No true (absolute) zero; negative values are possible.

Example: Temperature (in °C or °F, but not Kelvin), dates, time of day, etc.

      • Ratio data: Has an absolute zero; negative values are not possible.

Example: Age, height, weight, length, temperature (in Kelvin, but not °C or °F), etc.
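
A short sketch of why the missing true zero matters, using temperature. The two readings are made-up values; the only fact relied on is the Celsius-to-Kelvin offset of 273.15.

```python
# Two temperatures measured in degrees Celsius (an interval scale).
t1_c, t2_c = 10.0, 20.0

# Differences are meaningful on an interval scale.
diff_c = t2_c - t1_c

# Ratios are not: 20 °C is not "twice as hot" as 10 °C,
# because 0 °C is an arbitrary reference point, not a true zero.
naive_ratio = t2_c / t1_c

# Kelvin has an absolute zero (a ratio scale), so ratios are meaningful there.
t1_k, t2_k = t1_c + 273.15, t2_c + 273.15
diff_k = t2_k - t1_k
true_ratio = t2_k / t1_k

print(f"difference: {diff_c} °C, {diff_k:.2f} K")
print(f"ratio in °C (misleading): {naive_ratio}")
print(f"ratio in K (meaningful):  {true_ratio:.4f}")
```

The difference is the same on both scales, but only the Kelvin ratio describes a real physical relationship.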

02. Qualitative Data:

  • Can’t be expressed in numbers.
  • Consists of words, pictures, or symbols.
  • Also known as categorical data, as it is sorted by category, not by number.
  • Commonly represented by pie charts.
  • Examples: Colors – Blue, green, red, etc. Country – USA, UK, Italy, etc.

Qualitative data is of two types, as below:

    • Nominal data: Labels or names for categories, with no particular order.

Example: Gender – Men, Women, etc. Eye color – Brown, black, blue, etc.

    • Ordinal data: Categorical data with a meaningful order. Based on relative position we can assign numbers, but arithmetic on those numbers is not meaningful.

Example: Rank in a competition – First, second, third, etc. Rating of a product – 1, 2, 3, 4, 5, 6, 7, 8, etc.
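
The practical consequence is that different summaries suit different data types. As a minimal sketch with made-up observations: the mode works for nominal data, while ordinal data also supports a median, because its categories can be ranked.

```python
import statistics

# Nominal data: labels with no order, so the mode is the natural summary.
eye_colors = ["Brown", "Black", "Blue", "Brown", "Brown", "Black"]
most_common_color = statistics.mode(eye_colors)

# Ordinal data: categories with an order, listed here from best to worst.
rank_order = ["First", "Second", "Third"]
results = ["Second", "First", "Third", "Second", "Third"]

# Map each category to its position in the order, then take the median position.
# median_low always returns an actual data point, so it maps back to a category.
positions = [rank_order.index(r) for r in results]
median_rank = rank_order[statistics.median_low(positions)]

print(f"most common eye color: {most_common_color}")
print(f"median rank: {median_rank}")
```

Averaging the positions would treat the gaps between ranks as equal, which ordinal data does not guarantee.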




I am an enthusiastic advocate for the transformative power of data in the fashion realm. Armed with a strong background in data science, I am committed to revolutionizing the industry by unlocking valuable insights, optimizing processes, and fostering a data-centric culture that propels fashion businesses into a successful and forward-thinking future. - Masud Rana, Certified Data Scientist, IABAC

© Data4Fashion 2023-2024

