
Q1) Solve any Five

a) What is the difference between inferential and descriptive statistics?

Descriptive statistics describe a sample. You take a group that you
are interested in, record data about the group members, and then use
summary statistics and graphs to present the group's properties. With
descriptive statistics there is no sampling uncertainty, because you
are describing only the people or items that you actually measured.
You are not trying to infer properties of a larger population.

Inferential statistics takes data from a sample and makes inferences
about the larger population from which the sample was drawn. Because
the goal of inferential statistics is to draw conclusions from a
sample and generalize them to a population, we need to have confidence
that our sample accurately reflects the population. This requirement
shapes the process, which has three broad steps:

1. Define the population we are studying.
2. Draw a representative sample from that population.
3. Use analyses that incorporate the sampling error.
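
The contrast can be sketched in R as follows. This is a minimal illustration, assuming a made-up sample of ten heights; the variable names are hypothetical. The mean and standard deviation describe only the measured group, while the confidence interval from t.test() is an inference about the population mean that accounts for sampling error.

```r
# Hypothetical sample: heights (cm) of 10 people we actually measured.
heights <- c(162, 170, 158, 175, 168, 181, 165, 172, 160, 177)

# Descriptive statistics: summarise only the measured group.
sample_mean <- mean(heights)
sample_sd   <- sd(heights)

# Inferential statistics: estimate the population mean, expressing the
# sampling error as a 95% confidence interval.
ci <- t.test(heights)$conf.int

print(sample_mean)
print(sample_sd)
print(ci)
```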

b) What is the main difference between an Array and a matrix?

In R, a matrix is always two-dimensional: its elements are arranged
in rows and columns. An array can have any number of dimensions (one,
two, three or more). A matrix is therefore the special case of an
array with exactly two dimensions, and an array of n dimensions can be
viewed as a stack of arrays of dimension n-1. Both are homogeneous, so
all elements share one data type.
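
A short sketch of the difference, using only base R functions; the object names m and a are illustrative.

```r
# A matrix: exactly two dimensions (2 rows, 3 columns).
m <- matrix(1:6, nrow = 2, ncol = 3)

# An array: any number of dimensions (here 2 x 3 x 4).
a <- array(1:24, dim = c(2, 3, 4))

print(dim(m))        # dimensions of the matrix
print(dim(a))        # dimensions of the array

# A matrix is also an array, but a 3-dimensional array is not a matrix.
print(is.array(m))
print(is.matrix(a))
```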

c) How can you load and use a CSV file in R?

Locate the file on your drive, open its folder, and copy the file's
location (path). In RStudio, call read.csv() with that path in quotes
inside the brackets. Separate folders with forward slashes (/), not
single backslashes, and end the path with the file name and its .csv
extension. R is case-sensitive, so the function must be written
read.csv, not Read.csv.

Example: to read a file named matrix from the piyush folder on the D
drive:

read.csv("D:/piyush/matrix.csv")

To analyse the data, assign the result to a variable. Here the
variable is named Matrix:

Matrix <- read.csv("D:/piyush/matrix.csv")
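
The same steps can be tried without any file on the D drive. The sketch below is only an illustration: it first writes a small, made-up CSV file into R's temporary folder and then reads it back, so the path and column names are assumptions rather than part of the original example.

```r
# Build a path in the session's temporary folder (hypothetical file).
csv_path <- file.path(tempdir(), "matrix.csv")

# Write a small example table so there is something to read.
example <- data.frame(id = 1:3, score = c(10.5, 20.0, 30.25))
write.csv(example, csv_path, row.names = FALSE)

# Load the CSV into a variable and inspect it.
Matrix <- read.csv(csv_path)
str(Matrix)
print(head(Matrix))
```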

e) What is t-tests() in R?

One of the most common tests in statistics is the t-test, used to
determine whether the means of two groups are equal to each other.
In R it is run with the t.test() function.
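
A minimal sketch of a two-sample t-test, assuming two small made-up groups of measurements. By default t.test() runs Welch's version, which does not assume equal variances.

```r
# Hypothetical measurements for two groups.
group_a <- c(5.1, 4.9, 5.6, 5.8, 6.0, 5.3)
group_b <- c(6.2, 6.8, 6.5, 7.1, 6.9, 6.4)

# Null hypothesis: the two group means are equal.
result <- t.test(group_a, group_b)

print(result$statistic)  # t value
print(result$p.value)    # small p-value suggests the means differ
```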

f) How do you list the preloaded datasets in R?

The command library(MASS) loads the package MASS (for Modern Applied
Statistics with S) into memory. The command data() then lists all the
datasets in the loaded packages. The command data(phones) loads the
data set phones into memory.
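
As a sketch, the listing can also be captured as an object rather than shown in a viewer. The example below uses the base datasets package, which ships with R, and loads phones only if MASS happens to be installed; that guard is an added assumption, not part of the original answer.

```r
# List the datasets that ship with the base "datasets" package.
available <- data(package = "datasets")
print(head(available$results[, c("Item", "Title")]))

# Load one of them into memory and look at the first rows.
data(mtcars)
print(head(mtcars))

# Load phones from MASS, but only when MASS is installed.
if (requireNamespace("MASS", quietly = TRUE)) {
  data(phones, package = "MASS")
  str(phones)
}
```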

Q.2 Answer any two from the following


a) What are the different data structures in R? Briefly explain them.

Data structures are used to store data in an organized fashion in
order to make data manipulation and other data operations more
efficient.
There are five types of Data Structures in R Programming which are
mentioned below:

- Vector
- List
- Matrix
- Data Frame
- Factor

Vector is one of the basic data structures in R programming. It is
homogeneous in nature, which means that it only contains elements of
the same data type. Data types can be numeric, integer, character,
complex or logical.
The vector in R programming is created using the c() function. If the
elements passed are of different data types, coercion takes place from
the lower type to the higher one, in the order logical, integer,
double, character.
The typeof() function is used to check the data type of the vector, and
the class() function is used to check the class of a vector.
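
The coercion order can be seen directly with c() and typeof(). This is a small sketch; the vector names are illustrative.

```r
# Each vector mixes two types; R coerces to the higher one.
v_logical   <- c(TRUE, FALSE)   # stays logical
v_integer   <- c(TRUE, 2L)      # logical + integer   -> integer
v_double    <- c(2L, 3.5)       # integer + double    -> double
v_character <- c(3.5, "a")      # double  + character -> character

print(typeof(v_logical))
print(typeof(v_integer))
print(typeof(v_double))
print(typeof(v_character))

# class() reports the class; for a double vector this is "numeric".
print(class(v_double))
```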

List in R Programming

A list in R programming is a non-homogeneous data structure, which
implies that it can contain elements of different data types. It accepts
numbers, characters, lists, and even matrices and functions inside it. It is
created using the list() function.
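
A minimal sketch of a list holding several different kinds of element; the element names and values are made up for illustration.

```r
# One list containing a string, a numeric vector, another list,
# a matrix and a function.
record <- list(
  name   = "Asha",
  scores = c(90, 85),
  nested = list(passed = TRUE),
  grid   = matrix(1:4, nrow = 2),
  f      = mean
)

str(record)

# Elements are accessed with $ or [[ ]].
print(record$name)
print(record[["scores"]])
print(record$f(record$scores))  # calls the stored function
```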

Matrix in R Programming

The matrix in R programming is a 2-dimensional data structure that is
homogeneous in nature, which means that it only accepts elements of the
same data type. Coercion takes place if elements of different data types
are passed. It is created using the matrix() function.
The basic syntax to create a matrix is given below:
matrix(data, nrow, ncol, byrow, dimnames)
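
A short sketch of matrix() with its main arguments; the row and column names are illustrative.

```r
# Fill a 2 x 3 matrix row by row and label rows and columns.
m <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE,
            dimnames = list(c("r1", "r2"), c("a", "b", "c")))
print(m)

# Index by row and column name.
print(m["r2", "b"])

# Mixing numbers and strings coerces every element to character.
mixed <- matrix(c(1, "x", 3, 4), nrow = 2)
print(typeof(mixed))
```
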

Data Frame

A data frame in R programming is a 2-dimensional array-like structure
that also resembles a table, in which each column contains values of
one variable and each row contains one set of values from each column.
A data frame has the following characteristics:

- The column names of a data frame should not be empty.
- Row names should be unique.
- Data stored in a data frame can be numeric, factor or character
  type.
- Each column should contain the same number of data items.
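
A minimal sketch of a data frame that follows the characteristics above; the column names and values are made up.

```r
# Three named columns, each with the same number of items.
students <- data.frame(
  name   = c("Asha", "Ravi", "Meera"),
  age    = c(21L, 23L, 22L),
  passed = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

str(students)          # one type per column
print(dim(students))   # rows, then columns

# Select a column, and filter rows with a logical condition.
print(students$age)
print(students[students$passed, ])
```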

Factor

Factors in R programming are used in data analysis for statistical
modeling. They are used to categorize unique values in columns, like
"Male", "Female", "TRUE", "FALSE", etc., and store them as levels. They
can store both strings and integers. They are useful in columns that have
a limited number of unique values.
Factors can be created using the factor() function and they take vectors
as inputs.
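
A small sketch of creating a factor from a character vector; the values are illustrative.

```r
# A character vector with only two distinct values.
gender <- factor(c("Male", "Female", "Female", "Male", "Female"))

print(levels(gender))      # the distinct categories, sorted
print(table(gender))       # count of observations per level
print(as.integer(gender))  # the integer codes stored internally
```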

2) Write about all summary commands in R?

max() - shows the maximum value
min() - shows the minimum value
mean() - shows the average
median() - shows the median (middle value)
nrow() - shows the total number of rows
ncol() - shows the total number of columns
head() - shows the first 6 entries by default
tail() - shows the last 6 entries by default
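
These commands can be tried on the built-in mtcars data set. The choice of data set and of the mpg column is an assumption made for illustration only.

```r
data(mtcars)

print(max(mtcars$mpg))     # maximum value of one column
print(min(mtcars$mpg))     # minimum value
print(mean(mtcars$mpg))    # average
print(median(mtcars$mpg))  # middle value

print(nrow(mtcars))        # number of rows
print(ncol(mtcars))        # number of columns

print(head(mtcars))        # first 6 rows by default
print(tail(mtcars, 3))     # last 3 rows; the count is adjustable
```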

Q.3 What are the six steps of hypothesis testing?

Answer:

1. HYPOTHESES

State in order:

Research Hypothesis
Null Hypothesis
Alternate Hypothesis

Recall the difference between a general research hypothesis, which will
not be overturned by a single investigation, and a simple null and
alternate hypothesis.

2. ASSUMPTIONS

Include:

1. measurement level of data,
2. distributions underlying the data,
3. knowledge, or lack of knowledge, about population characteristics,
4. sample size and method,
5. sample characteristics necessary for applying the test statistic,
6. level of significance for testing

3. TEST STATISTIC (or Confidence Interval Structure)

1. structure to be used to test significance levels or set of confidence
   intervals (be sure to include the equations & notation)
2. special conditions to be met by statistic

4. REJECTION REGION (or Probability Statement)

Expected measure of the test statistic as generated from tables, or
critical value for a confidence interval.

5. CALCULATIONS
Actual test statistic measure or confidence interval generated including
specification of all additional equations used plus notation. May also
include sample calculations.

6. CONCLUSIONS

Statement of results: rejection of the null hypothesis or failure to
reject it, and the future direction of research.

Should include a summary of results in tabular, graphical, or mapped
form, plus a discussion of where this research has led you.
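
The six steps can be walked through with a one-sample t-test. This is only a sketch under stated assumptions: the sample values, the hypothesised mean of 50 and the 5% significance level are all made up, and the data are assumed to be roughly normal.

```r
# Hypothetical sample of 8 measurements.
x <- c(52.1, 49.8, 53.4, 51.2, 50.9, 54.0, 52.7, 51.5)

# Step 1: hypotheses. H0: population mean = 50; H1: mean != 50.
mu0 <- 50

# Step 2: assumptions. Interval-level data, roughly normal,
# random sample, significance level of 5%.
alpha <- 0.05

# Step 3: test statistic. t = (mean - mu0) / (sd / sqrt(n)).
n <- length(x)

# Step 4: rejection region. Reject H0 when |t| exceeds the
# two-sided critical value with n - 1 degrees of freedom.
t_crit <- qt(1 - alpha / 2, df = n - 1)

# Step 5: calculations.
t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n))

# Step 6: conclusion.
reject <- abs(t_stat) > t_crit
print(c(t_stat = t_stat, t_crit = t_crit))
print(if (reject) "Reject H0" else "Fail to reject H0")

# Cross-check against R's built-in test.
check <- t.test(x, mu = mu0)
print(check$p.value)
```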
