# StatCrunch®

www.statcrunch.com
StatCrunch is a Web-based statistical software package for analyzing data. This study card is a brief introduction to StatCrunch,
covering the procedures that most students will encounter in an introductory statistics course. Visit www.statcrunch.com to access
StatCrunch. See the online version of this study card at www.statcrunch.com/studycard for high-quality screen shots. Follow the
help links at the statcrunch.com site for more extensive help documentation.

To Begin
1. Go to www.statcrunch.com.
2. Click Open StatCrunch.
3. Click Data, Stat, or Graph, followed by any submenus, as
instructed in the heading for each section.

Graph > Scatter Plot

1. Select the X column and Y column for the plot.
2. Enter an optional Where statement to specify the data
rows to be included.
3. Color-code points with an optional Group by column.
4. Click Compute! to produce the plot.

Example: Scatter plot of Price vs. Sqft, color-coded by location

Graph > Bar Plot

1. Choose the With data option to use data consisting of
individual outcomes in the data table.
a. Select the column(s) to be displayed.
b. Enter an optional Where statement to specify the data
rows to be included.
c. Select an optional Group by column to do a side-by-side
bar plot.
2. Choose the With summary option to use summary
information consisting of categories and counts.
a. Select the column containing the categories.
b. Select the column containing the counts.
3. Click Compute! to construct the bar plot(s).

Example: A bar plot showing the number of homes in each location with three
or more baths

Interacting with Graphics

1. Click and drag the mouse to draw a rectangle around
graph objects to highlight them.
2. The corresponding rows will be highlighted in the data
table and in all other graphics.
3. Toggle highlighting on and off by clicking on the row
number in the data table.
4. To clear all highlighted rows, click the Clear button in
the lower left corner of the data table.
5. To highlight rows based on categories or numeric
ranges, use the Data > Row Selection > Interactive Tools
menu option.

Example: When the boxplot outlier is highlighted, the corresponding row in the data
table is automatically highlighted, as is the corresponding portion of the histogram

Data > Load > From file

1. Choose On my computer to load a data file from the local
system or On the Web to load a data file from the Web.
2. Specify the location of a text file (.txt, .csv, etc.) or Microsoft
Excel® file.
3. If the first line in your file does not contain column names,
uncheck the Use first line as column names option.
4. For text files (not Excel files), specify the delimiter for the data
values. For example, the delimiter for a .csv file is a comma.

The examples on this card use a data set of 30 four-bedroom homes listed for sale
in the Bryan-College Station, Texas, area. For each home, the data set contains the
list price in thousands of dollars (Price), square footage (Sqft), number of bathrooms
(Baths) and location (Bryan, TX or College Station, TX). It is currently being shared
on the StatCrunch site at www.statcrunch.com/app/?dataid=1046844

Graph > Histogram

1. Select the column(s) to be displayed.
2. Enter an optional Where statement to specify the data
rows to be included.
3. Click Compute! to construct the histogram(s).
Example: A histogram of the Sqft column

Data > Compute Expression

1. Enter a mathematical or Boolean expression in the Expression
input box.
2. Alternatively, click the Build button to build the expression.
3. Click Compute! and the results of the expression will be added
as a new column to the StatCrunch data table.

Note: A Boolean (true/false) expression can be used as a Where statement in many
StatCrunch procedures to exclude outliers or to focus an analysis on a subset of the
data. Some of the following examples illustrate this feature.

Example: Computing the mathematical expression, 1000*(Price/Sqft)

Stat > Summary Stats > Columns

1. Select the column(s) for which summary statistics are
to be computed.
2. Enter an optional Where statement to specify the data
rows to be included.
3. Compare statistics across groups using an optional
Group by column.
4. Click Compute! to view the summary statistics.

Example: Comparing prices of homes listed in Bryan to those listed in College
Station with the potential outlier removed

Graph > Boxplot

1. Select the column(s) to be displayed. By default, a
boxplot for each column will be included on a single
graph.
2. Enter an optional Where statement to specify the data
rows to be included.
3. Select an optional Group by column to compare
boxplots across groups on a single graph.
4. By default, a boxplot of the five-number summary
is produced. Check Use fences to identify outliers to
obtain a modified boxplot.
5. Click Compute! to construct the boxplot(s).

Example: Boxplots comparing price across locations

Stat > Tables > Frequency

1. Select the column(s) for which a frequency table is to be
computed.
2. Enter an optional Where statement to specify the data rows to be
included.
3. Click Compute! to view the frequency table(s).

Example: A frequency table for the number of bathrooms
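For readers who want to check a frequency table by hand, the sketch below mirrors the idea in plain Python: an optional Boolean condition plays the role of the Where statement, and the table lists each value with its count and relative frequency. The rows, the `frequency_table` helper, and the choice of output columns are assumptions made for illustration; they are not StatCrunch's data or its implementation.

```python
from collections import Counter

# Hypothetical rows standing in for the home listings data set;
# these values are invented for illustration only.
homes = [
    {"Price": 150.0, "Sqft": 2000, "Baths": 2, "Location": "Bryan, TX"},
    {"Price": 210.0, "Sqft": 2800, "Baths": 3, "Location": "College Station, TX"},
    {"Price": 189.0, "Sqft": 2100, "Baths": 3, "Location": "College Station, TX"},
    {"Price": 120.0, "Sqft": 1600, "Baths": 2, "Location": "Bryan, TX"},
    {"Price": 320.0, "Sqft": 3200, "Baths": 4, "Location": "Bryan, TX"},
]


def frequency_table(rows, column, where=None):
    """Map each distinct value in `column` to (count, relative frequency).

    `where` is an optional function returning True for rows to keep,
    playing the role of a Boolean Where statement.
    """
    kept = [r for r in rows if where is None or where(r)]
    counts = Counter(r[column] for r in kept)
    total = sum(counts.values())
    return {value: (count, count / total) for value, count in sorted(counts.items())}


# Frequency table of Baths over all rows.
all_baths = frequency_table(homes, "Baths")

# Same table restricted to one location, like a Where statement on location.
cs_baths = frequency_table(
    homes, "Baths", where=lambda r: r["Location"] == "College Station, TX"
)

for value, (count, rel) in all_baths.items():
    print(f"{value}\t{count}\t{rel:.2f}")
```

Passing the condition as a function keeps the filter separate from the counting, much as the Where statement is entered separately from the column selection.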

Stat > T Stats > One Sample

1. Choose the With Data option to use sample data from the StatCrunch data table.
a. Select the column containing the sample data values.
b. Enter an optional Where statement to specify the data rows to be included.
Alternative: Choose the With Summary option to enter the Sample mean,
Sample standard deviation and Sample size.
2. Select the Hypothesis test or Confidence interval option.
a. For a hypothesis test, enter the null mean and choose ≠, <, or > for the alternative.
b. For a confidence interval, enter a value between 0 and 1 for Level
(0.95 provides a 95% confidence interval).
3. Click Compute! to view the results.

Example: Testing to see if the average home listed in College Station is larger than
2000 square feet

Stat > T Stats > Two Sample

1. Choose the With Data option to use sample data from the StatCrunch data table.
a. Select the columns containing the first and second samples.
b. Enter optional Where statements to specify the data rows to be included in
both samples. If the two samples are in separate columns, this step is
typically not required.
Alternative: Choose the With Summary option to enter the Sample mean,
Sample standard deviation, and Sample size for both samples.
2. Uncheck the Pool variances option if desired.
3. Select the Hypothesis test or Confidence interval option.
a. For a hypothesis test, enter the difference in means for the null hypothesis
and choose ≠, <, or > for the alternative.
b. For a confidence interval, enter a value between 0 and 1 for Level
(0.95 provides a 95% confidence interval).
4. Click Compute! to view the results.

Example: Hypothesis test comparing the prices of homes listed in Bryan to those in
College Station with potential outlier included

Stat > T Stats > Paired

1. Select the columns containing the first and second samples.
2. Enter an optional Where statement to specify the data rows to be included.
3. Select the Hypothesis test or Confidence interval option.
a. For a hypothesis test, enter the difference in means for the null hypothesis
and choose ≠, <, or > for the alternative.
b. For a confidence interval, enter a value between 0 and 1 for Level
(0.95 provides a 95% confidence interval).
4. Click Compute! to view the results.

Example: A 95% confidence interval for the average difference between the graduation
rate of full and part-time students across all colleges based on a random sample of
four colleges

Stat > Proportion Stats > One Sample

1. Choose the With Data option to use sample data from the StatCrunch data table.
a. Select the column containing the sample values.
b. Specify the outcome that denotes a Success.
c. Enter an optional Where statement to specify the data rows to be included.
Alternative: Choose the With Summary option to enter the number of successes
and number of observations.
2. Select the Hypothesis test or Confidence interval option.
a. For a hypothesis test, enter the null proportion and choose ≠, <, or > for
the alternative.
b. For a confidence interval, enter a value between 0 and 1 for Level
(0.95 provides a 95% confidence interval). For Method, choose Standard-Wald
or Agresti-Coull.
3. Click Compute! to view the results.

Example: For each of 998 North Carolina births, this data set indicates whether the
birth was premature. A 95% confidence interval for the proportion of all North
Carolina births that are not premature is shown.

Stat > Proportion Stats > Two Sample

1. Choose the With Data option to use data from the data table.
a. Select the columns containing the first and second samples.
b. Specify the sample outcomes that denote a Success for both samples.
c. Enter optional Where statements to specify the data rows to be included in
both samples. If the two samples are in separate columns, this step is
typically not required.
Alternative: Choose the With Summary option to enter the number of successes
and number of observations for both samples.
2. Select the Hypothesis test or Confidence interval option.
a. For a hypothesis test, enter the difference in proportions for the null
hypothesis and choose ≠, <, or > for the alternative.
b. For a confidence interval, enter a value between 0 and 1 for Level
(0.95 provides a 95% confidence interval).
3. Click Compute! to view the results.

Example: In a study, children solved problems and were praised for either their
intelligence or effort. Afterwards, each child wrote a report stating, among other
things, how many problems they got right. Of those children who were praised for
their intelligence, 11 of 29 lied on their report. Of those who were praised for
their effort, only 4 of 30 lied. Using this summary information, a two-sided
hypothesis test is performed to compare the proportion of children lying for these
two conditions.

Stat > ANOVA > One Way

1. Select one of the following options:
a. If the samples are in separate columns, select the Selected columns option
and then select the columns containing the samples.
b. If the samples are in a single column, select the Values in a single column
option. Then specify the column containing the samples (Responses in) and
the column containing the population labels (Factors in).
2. Use an optional Where statement to specify the data included.
3. Check the Tukey HSD option and specify a confidence level to perform a post hoc
means analysis. The default value of 0.95 will produce 95% confidence intervals
for all pairwise mean differences.
4. Click Compute! to view the results.

Example: ANOVA for the time, in minutes, a caller stays on hold before hanging up
under three different treatments

Stat > Tables > Contingency

1. Select one of the following options:
a. Choose the With Data option to cross-tabulate two columns of raw data from
the data table.
A. Select the column to be tabulated across the rows.
B. Select the column to be tabulated across the columns.
C. Enter an optional Where statement to specify the data rows to be included.
D. Select an optional Group by column to compute separate tables across groups.
b. Choose the With Summary option to use a two-way cross classification already
entered in the data table.
A. Select the columns that contain the summary counts.
B. Select the column that contains the row labels.
C. Enter a name for the column variable.
2. Click Compute! to view the results.

Example: Using summary counts cross-tabulating gender and back problems, a test of
independence between the two factors is shown. Note that the back problems variable
is represented by the second and third columns in the data table.

Stat > Regression > Simple Linear

1. Select the X variable (independent variable) and Y variable (dependent variable)
for the regression.
2. Enter an optional Where statement to specify the data rows to be included.
3. Compare results across groups by selecting an optional Group by column.
4. Click Compute! to view the results.

Example: Regression of Price on Sqft for College Station

Stat > Calculators

1. Select the name of the desired distribution from the menu listing
(e.g., Binomial, Normal, etc.).
2. In the first line below the plot in the calculator window, specify the
distribution parameters. As examples, with the normal distribution, specify the
mean and standard deviation; with the binomial distribution, specify n and p.
3. In the second line below the plot, specify the direction of the desired
probability.
a. To compute a probability, enter a value to the right of the direction selector
and leave the remaining field empty (e.g., P(X < 3) = ____).
b. To determine the point that will provide a specified probability, enter the
probability to the right of the equal sign and leave the other field empty
(e.g., P(X < ____) = 0.25). This option is available only for continuous
distributions.
4. Click Compute! to fill in the empty fields and to update the graph of the
distribution.

Examples: Finding a binomial probability; finding a standard normal probability;
finding a t(8) quantile. (Screenshot annotations: "Enter a value here and press
Compute" and "Change direction to >".)
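The two uses of Stat > Calculators described on this card (computing a probability from a value, and finding the point that gives a specified probability) can be checked by hand with a short sketch. The code below uses only the Python standard library; the parameter values n = 10, p = 0.5 and the cutoff 1.96 are hypothetical choices for illustration, and the helper name `binomial_prob_below` is not a StatCrunch function. A t distribution is not covered because the standard library does not provide one.

```python
from math import comb
from statistics import NormalDist


def binomial_prob_below(k, n, p):
    """Return P(X < k) for X ~ Binomial(n, p) by summing the pmf over 0..k-1."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))


# Computing a probability: P(X < 3) for a binomial with hypothetical n = 10, p = 0.5.
prob_binom = binomial_prob_below(3, 10, 0.5)

# Computing a probability: P(X < 1.96) for the standard normal distribution.
std_normal = NormalDist(mu=0.0, sigma=1.0)
prob_norm = std_normal.cdf(1.96)

# Finding the point for a specified probability: solve P(X < q) = 0.25 for q.
quantile = std_normal.inv_cdf(0.25)

print(f"P(X < 3)    = {prob_binom:.7f}")
print(f"P(X < 1.96) = {prob_norm:.4f}")
print(f"q with P(X < q) = 0.25: {quantile:.4f}")
```

The inverse lookup is shown only for the normal distribution, in line with the card's note that this option is available only for continuous distributions.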