• Purpose of Statistics Package Exercises: The Probability & Statistics course focuses on the processes you use to convert data into useful information. This involves

    1. Collecting data,

    2. Summarizing data, and

    3. Interpreting data.

  • In addition to applying these processes, you can learn to use statistical software packages to help manage, summarize, and interpret data. The statistics package exercises included throughout the course give you the opportunity to explore a dataset and answer questions based on the output from R, StatCrunch, a TI calculator, Minitab, or Excel. In each exercise, you can view instructions for whichever of these packages you choose to use.

  • The statistics package exercises are an extension of activities already embedded in the course and require you to use a statistics package to generate output and answer a different set of questions.

  • To Download R

    1. To download R, a free software environment for statistical computing and graphics, go to https://www.r-project.org/ and follow the instructions provided.

  • Using R

    Throughout the statistics package exercises, you will be given commands to execute in R. You can use the following steps to avoid typing these commands by hand:

    1. Highlight the command with your mouse.

    2. On the browser menu, click "Edit," then "Copy."

    3. Click on the R command window, then at the top of the R window, click "Edit," then "Paste."

    4. You may have to press Enter to execute the command.

  • R Version

    1. The R instructions are current through version 3.2.5 released on April 14, 2016. Instructions in these statistics package exercises may not work with newer releases of R.

    2. For help with installing R for Mac OS X or Windows, click here.
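
  • To see which version of R you are running before starting the exercises, a quick check along these lines may help. This is only a sketch: the comparison against 3.2.5 comes from the note above, and the variable name is illustrative.

```r
# Print the full version string of the running R session.
print(R.version.string)

# Compare the installed version with 3.2.5, the release these instructions target.
# `newer_than_course` is an illustrative name, not part of the course materials.
newer_than_course <- getRversion() > "3.2.5"
print(newer_than_course)
```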

  • The purpose of this activity is to give guided practice at solving problems involving binomial random variables and to teach how the same probabilities can be found using a statistical software package.

  • A multiple choice test has 10 questions, each with 5 possible answers, only one of which is correct. A student who did not study is absolutely clueless, and therefore uses an independent random guess to answer each of the 10 questions.

  • Let X be the number of questions the student gets right.
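
  • To get a feel for X before doing any calculations, the guessing process can be simulated. The sketch below assumes that answer 1 is the correct choice on every question and uses 100,000 simulated students; both are arbitrary choices made only for illustration.

```r
set.seed(1)  # arbitrary seed so the run is reproducible

n_questions <- 10      # questions on the test
n_choices   <- 5       # possible answers per question
n_students  <- 100000  # number of simulated students (assumed value)

# Each simulated student guesses one of the 5 answers on each question.
# Answer 1 is treated as correct, so each guess is right with probability 1/5.
guesses <- matrix(sample(n_choices, n_students * n_questions, replace = TRUE),
                  nrow = n_students)
x_sim <- rowSums(guesses == 1)   # X = number of correct answers per student

# Relative frequency of each score 0..10; these approximate P(X = k).
print(round(table(factor(x_sim, levels = 0:n_questions)) / n_students, 3))
```

    The relative frequencies should land close to the exact binomial probabilities computed later in this activity, though they will vary slightly with the seed.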

  • Applying the binomial formula is a good way for "first-timers" to understand the mechanics of binomial probabilities. Once you have mastered the technique, however, it may still be tedious to perform the necessary calculations.
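
  • As a reminder of the mechanics, the binomial formula can be typed directly into R. The helper name binom_by_hand below is hypothetical (it is not a built-in function); the sketch simply evaluates the formula for k = 4, n = 10, p = 0.2.

```r
# Binomial formula: P(X = k) = choose(n, k) * p^k * (1 - p)^(n - k)
# `binom_by_hand` is a hypothetical helper name, not a built-in R function.
binom_by_hand <- function(k, n, p) {
  choose(n, k) * p^k * (1 - p)^(n - k)
}

# Probability of exactly 4 correct guesses out of 10 with p = 0.2
p_exactly_4 <- binom_by_hand(4, 10, 0.2)
print(p_exactly_4)   # about 0.088
```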

  • For example, suppose we ask: what is the probability that the student will get at most 4 questions right? In other words, to find P(X ≤ 4), we would need to add P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4).

  • We would need to apply the formula to each of these 5 probabilities and then add the results, which is tedious. The work grows with n: for large n, a cumulative probability can involve dozens or hundreds of terms. Luckily, any statistical software will do binomial calculations for us.
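
  • The sketch below contrasts the two routes for P(X ≤ 4) in the test example: adding the five individual probabilities versus a single cumulative call. The large-n values at the end (n = 1000, k = 210) are made up purely to show that the software call stays the same size.

```r
n <- 10; p <- 0.2

# The long way: add the five individual probabilities P(X = 0), ..., P(X = 4).
p_sum <- sum(dbinom(0:4, n, p))

# The short way: one call to the cumulative function.
p_cdf <- pbinom(4, n, p)

print(c(sum = p_sum, cumulative = p_cdf))   # both about 0.967

# With a large n the sum has many terms, but the call is no harder.
# n = 1000 and k = 210 are made-up values for illustration only.
p_large <- pbinom(210, 1000, p)
print(p_large)
```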

  • R Instructions: To find probabilities of the type P(X = k) or P(X ≤ k) in R, we'll use the following functions:

      dbinom(k, n, p) = P(X = k)
      pbinom(k, n, p) = P(X ≤ k)

  • where:

    1. k is the number of successes in n trials

    2. n is the number of independent trials

    3. p is the probability of success in each trial
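
  • In R itself, the arguments k, n, and p described above are named x (or q for pbinom), size, and prob. The sketch below shows that the named and positional forms agree, and that both functions accept a vector of k values, which makes it easy to tabulate the whole distribution for the test example.

```r
n <- 10   # number of independent trials
p <- 0.2  # probability of success on each trial

# R's own argument names are x (or q), size, and prob.
# dbinom(x = k, size = n, prob = p) is the same call as dbinom(k, n, p).
same_call <- dbinom(x = 3, size = n, prob = p) == dbinom(3, n, p)

# Both functions accept a vector of k values, so the full distribution fits in one table.
k <- 0:n
dist_table <- data.frame(k = k,
                         exact      = dbinom(k, n, p),   # P(X = k)
                         cumulative = pbinom(k, n, p))   # P(X <= k)
print(round(dist_table, 4))
```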

  • As practice, use these functions to find P(X = 4) for our example (where n = 10 and p = 0.2), and verify that you get the same answer as you would "by hand" with the binomial formula.

  • Now use R to find the probability that the student gets no more than 4 questions right: P(X ≤ 4).

  • R Instructions: Use R to find the probability that the student gets more than 2 questions right, P(X > 2).

  • Guidance: R will calculate for us only probabilities of the type P(X = k) or P(X ≤ k). To find P(X > 2) think about the complement of this event.

  1. Explanation:
    Here k = 4, n = 10, and p = 0.2, so dbinom(4, 10, 0.2) gives P(X = 4) ≈ 0.088.
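
    The check requested in this exercise can be sketched as follows: compute P(X = 4) with the software, compute it again from the binomial formula, and confirm that the two agree. The variable names are illustrative.

```r
# P(X = 4) for n = 10 trials with success probability p = 0.2
p_software <- dbinom(4, 10, 0.2)

# The same quantity from the binomial formula, for comparison
p_by_hand <- choose(10, 4) * 0.2^4 * 0.8^6

print(p_software)                        # about 0.088
print(all.equal(p_software, p_by_hand))  # checks agreement up to rounding error
```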

  1. Explanation:
    To find this probability in Excel, we use =binom.dist(x, n, p, cumulative). In this case, x = 4, n = 10, p = 0.2, and cumulative = TRUE, so entering =binom.dist(4, 10, 0.2, TRUE) returns a probability of 0.967. The equivalent R call is pbinom(4, 10, 0.2).

  1. Explanation:
    To find P(X > 2) we notice that the complement of the event "X > 2" is "X ≤ 2", and therefore, by the Complement Rule, P(X > 2) = 1 - P(X ≤ 2). R gives us approximately 0.32. Thus, there is approximately a 32% chance that using the "guessing strategy" the student will get more than 2 questions right on the quiz.
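
    A minimal sketch of the Complement Rule calculation in R follows. The cross-check, which adds P(X = 3) through P(X = 10) directly, is included only to illustrate that both routes give the same value; the variable names are illustrative.

```r
n <- 10; p <- 0.2

# Complement Rule: P(X > 2) = 1 - P(X <= 2)
p_more_than_2 <- 1 - pbinom(2, n, p)
print(round(p_more_than_2, 4))   # 0.3222

# Cross-check by adding P(X = 3), ..., P(X = 10) directly.
p_direct <- sum(dbinom(3:n, n, p))
print(all.equal(p_more_than_2, p_direct))
```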