Statistics can refer either to numerical facts or to a field of study: a group of methods used to collect, analyse, present, and interpret data and to make decisions. Decisions made by using statistical methods are called educated guesses. Like other fields of study, statistics has two aspects: theoretical and applied. Theoretical statistics deals with the development, derivation, and proof of theorems, formulas, rules and laws. Applied statistics involves the application of those theorems, formulas, rules and laws to solve real-world problems.
Let's take the test scores of students enrolled in a class as an example to illustrate the statistical terminology: the whole set of numbers that represent the scores of the students is called a data set, each student (identified by name) is called an element, and the score of each student is an observation. “A data set is a collection of observations on one or more variables.” (Mann, 2012, p.9)
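The terminology above can be sketched in a few lines of Python. The student names and scores below are made up purely for illustration; the point is only to show which part of the data plays which role.

```python
# Hypothetical class data: names and scores are invented for illustration.
data_set = {
    "Ana": 72,
    "Ben": 85,
    "Chloe": 91,
    "Dev": 64,
}

variable = "test score"                    # the characteristic under study
elements = list(data_set.keys())           # the students are the elements
observations = list(data_set.values())     # one observation per element

for student, score in data_set.items():
    print(f"element: {student}, observation of '{variable}': {score}")
```

Here the dictionary as a whole plays the role of the data set, each key identifies an element, and each value is the observation recorded for that element.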
Applied statistics, the type of statistics covered here, can also be divided into two areas: descriptive statistics and inferential statistics. A data set in its raw form is usually very large, making it hard to draw conclusions from. It is therefore easier to use descriptive statistics, which summarise the data using summary tables, graphs, averages and diagrams. “Descriptive statistics consists of methods for organizing, displaying, and describing data by using tables, graphs, and summary measures.” (Mann, 2012, p.3)
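As a minimal sketch of descriptive statistics, the following Python example summarises a small, hypothetical list of test scores with a few summary measures and a simple frequency table. The scores and the choice of ten-point classes are assumptions made for illustration only.

```python
import statistics
from collections import Counter

# Hypothetical test scores (invented for illustration).
scores = [72, 85, 91, 64, 85, 78, 69, 85, 91, 73]

# Summary measures
summary = {
    "n": len(scores),
    "mean": statistics.mean(scores),
    "median": statistics.median(scores),
    "mode": statistics.mode(scores),
    "min": min(scores),
    "max": max(scores),
}
for name, value in summary.items():
    print(f"{name}: {value}")

# Frequency table with ten-point classes (60-69, 70-79, ...),
# displayed as a crude text bar chart.
freq = Counter((s // 10) * 10 for s in scores)
for lower in sorted(freq):
    print(f"{lower}-{lower + 9}: {'#' * freq[lower]}")
```

Even for ten numbers, the summary and the frequency table are easier to read than the raw list; the benefit grows with the size of the data set.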
The collection of all elements (individuals, items or objects) of interest is called a population (or target population). The selection of a few elements from this population is called a sample. “A major portion of statistics deals with making decisions, inferences, predictions, and forecasts about populations based on results obtained from samples.” (Mann, 2012, p.3) For instance, suppose we wanted to find the average starting salary of university graduates. We might take 1000 recent university graduates, find their starting salaries, and make a decision based on the results; this would be referred to as inferential statistics. Also known as inductive reasoning or inductive statistics, inferential statistics consists of methods that use sample results to help make decisions or predictions about a population. Probability acts as a link between descriptive and inferential statistics.
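The starting-salary example can be sketched as a small simulation. Everything numeric here is an assumption: the population of 50,000 graduates, the normal distribution of salaries, and its mean and spread are invented so that the sample estimate can be compared with a population value that, in practice, would be unknown.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: starting salaries of 50,000 graduates.
population = [rng.gauss(30_000, 5_000) for _ in range(50_000)]

# Sample of 1000 graduates, as in the example in the text.
sample = rng.sample(population, 1000)

sample_mean = statistics.mean(sample)          # what we can actually compute
population_mean = statistics.mean(population)  # normally unknown

print(f"sample mean:     {sample_mean:.2f}")
print(f"population mean: {population_mean:.2f}")
```

The sample mean is used as an educated guess for the population mean. It will generally be close to, but not exactly equal to, the population value.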
The collection of information from the elements of a population or a sample is called a survey. A survey that includes every element of the target population is called a census. A census is rarely undertaken because including every member of a large population can be costly and time consuming. “Usually, to conduct a survey, we select a sample and collect the required information from the elements included in that sample.” (Mann, 2012, p.6) Such a survey conducted on a sample is called a sample survey, and decisions can then be made from the sample. Inferences are more reliable when they are drawn from a representative sample: a sample that represents the characteristics of the population as closely as possible.
A sample can be random or non-random. “A sample drawn in such a way that each element of the population has a chance of being selected is called a random sample.” (Mann, 2012, p.6) “If all samples of the same size selected from a population have the same chance of being selected, we call it simple random sampling.” (Mann, 2012, p.6) An example of a random sample is a lottery draw. By contrast, arranging names alphabetically and selecting the top 5 or 50 would give a non-random sample, as those further down the list have no chance of being selected.
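The difference between the two selection methods can be illustrated with a short Python sketch. The list of 50 names and the sample size of 5 are assumptions chosen for illustration; `random.sample` from the standard library is used here as one way of drawing a simple random sample.

```python
import random

# Hypothetical alphabetical list of 50 names (invented for illustration).
names = [f"Student{i:02d}" for i in range(1, 51)]

def top_of_list(names, k):
    """Non-random: always takes the first k names on the list."""
    return names[:k]

def simple_random(names, k, rng):
    """Random: every group of k names is equally likely to be drawn."""
    return rng.sample(names, k)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
non_random_sample = top_of_list(names, 5)
random_sample = simple_random(names, 5, rng)

print("non-random:", non_random_sample)
print("random:    ", random_sample)

# Over many repeated draws, random selection reaches every name,
# whereas the top-of-list method only ever returns the same five.
ever_selected = set()
for _ in range(2000):
    ever_selected.update(simple_random(names, 5, rng))
print("names reached by random selection:", len(ever_selected), "of", len(names))
```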
“In sampling with replacement, each time we select an element from the population, we put it back in the population before we select the next element.” (Mann, 2012, pp.6-8) As a result, we may select the same element more than once in the same sample. “Sampling without replacement occurs when the selected element is not replaced in the population.” (Mann, 2012, p.8) Most samples are taken without replacement.
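In Python's standard library, the two schemes correspond to two different functions, which makes for a compact sketch. The population of ten numbered items and the sample size of five are assumptions for illustration.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical population of ten numbered items.
population = list(range(1, 11))

# With replacement: each pick is returned before the next draw,
# so the same item may appear more than once.
with_replacement = rng.choices(population, k=5)

# Without replacement: a picked item is removed from further draws,
# so every item in the sample is distinct.
without_replacement = rng.sample(population, k=5)

print("with replacement:   ", with_replacement)
print("without replacement:", without_replacement)
```

A sample drawn without replacement can never be larger than the population, whereas a sample drawn with replacement can be of any size.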
“An element or member of a sample or population is a specific subject or object (for example, a person, firm, item, state, or country) about which the information is collected.” (Mann, 2012, p.8) “A variable is a characteristic under study that assumes different values for different elements.” (Mann, 2012, p.9) In contrast, a constant has a fixed value. A variable may sometimes take the same value for several elements, and it is usually denoted by x, y or z. “The value of a variable for an element is called an observation or measurement.” (Mann, 2012, p.9)
A variable can be classified as either quantitative or qualitative. A quantitative variable could be the price of something, and a qualitative variable could be the colour of something. Quantitative variables can be either discrete or continuous. “A variable whose values are countable is called a discrete variable.” (Mann, 2012, p.11) For example, the number of cars sold at a car dealership on a given day could be 3 or 4, but not 3.5. “A variable that can assume any numerical value over a certain interval or intervals is called a continuous variable.” (Mann, 2012, p.11) For example, the time taken to complete an examination can be any value between 20 and 30 minutes.
“Variables that cannot be measured numerically but can be divided into different categories are called qualitative or categorical variables.” (Mann, 2012, p.12) These variables do not assume any numerical value but can be classified into two or more nonnumerical categories. For instance, a student can fall into such categories as undergraduate, graduate, postgraduate, etc.
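The classification of variables can be loosely mirrored in code. The sketch below is only a rough heuristic that uses the Python type of the recorded values as a stand-in for the kind of variable: text for qualitative, whole numbers for discrete, and decimal numbers for continuous. The function name `classify` and the example data are hypothetical, and real data may not follow this convention (for instance, counts stored as decimals).

```python
def classify(values):
    """Guess the kind of variable from the Python type of its values."""
    if all(isinstance(v, str) for v in values):
        return "qualitative"
    if all(isinstance(v, int) and not isinstance(v, bool) for v in values):
        return "quantitative, discrete"
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        return "quantitative, continuous"
    raise ValueError("mixed or unsupported value types")

# Hypothetical observations for three variables mentioned in the text.
data = {
    "cars_sold_per_day": [3, 4, 0, 2],
    "exam_time_minutes": [20.5, 27.25, 30.0, 23.8],
    "student_level": ["undergraduate", "graduate", "postgraduate"],
}

kinds = {name: classify(values) for name, values in data.items()}
for name, kind in kinds.items():
    print(f"{name}: {kind}")
```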
Mann, P. 2012. Introductory Statistics. Seventh Edition. Delhi, India: Wiley India.