R Practice 1
2019/01/07
For this homework, you may type up or handwrite the answers to each question. When you get to the R questions at the end, please attach a copy of your .R file (with comments!) of commands for each relevant question. For future homeworks, I will demonstrate how you can complete everything using R Markdown, but you will not be obligated to do so.
Getting Set Up
Before we begin, start a new file with → → . As you work through this sheet in the console in R, also add (copy/paste) the commands that work to this new file. At the end, save it and run it to execute all of your commands at once.
Creating Objects
1.
Create a vector called “me” with two objects: your first name and your last name. Then call the vector to inspect it. Confirm that it is a character-class vector.
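A minimal sketch of the pattern, using a placeholder name rather than your own (the object `someone` and its values are made up for illustration):

```r
# Hypothetical example: a character vector with two elements
someone <- c("Ada", "Lovelace")

# Typing the object's name by itself prints ("calls") it
someone

# class() reports what kind of data the vector holds
class(someone)
```

Text values go in quotation marks; without them, R would look for objects with those names instead.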
2.
Create a vector called “x” with all the even integers from 2 to 10.
3.
Find the mean of x with mean().
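As a rough illustration of questions 2 and 3 on different numbers, here is one way to build an evenly spaced vector and average it. The object name `odds` is made up for this sketch; `c(1, 3, 5, 7, 9)` would build the same vector by hand.

```r
# Hypothetical example: the odd integers from 1 to 9
odds <- seq(from = 1, to = 9, by = 2)

# Inspect the vector
odds

# mean() adds up the values and divides by how many there are
mean(odds)
```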
4.
Now take the following pdf of random variable Y:
y | p(y) |
---|---|
2 | 0.50 |
4 | 0.25 |
6 | 0.25 |
Calculate the standard deviation “manually” using our table method. You can look at the source code of Lecture 4 for my example.
- Create two vectors, one called `y.i` and one called `p.i`, with the data above.
- Merge them into a data frame called `rv` with `data.frame(y.i, p.i)`. Call `rv` to inspect it.
- Find the expected value of Y by multiplying each value of `y.i` by its `p.i` and adding up the products with the `sum()` command.
- Create a new column in `rv` called `deviations`, where you subtract the mean from each `y.i` value. Call `rv` again to make sure it’s now there.
- Create another column in `rv` called `devsq`, where you square the deviations from the previous step. Call `rv` again to make sure it’s now there.
- Now add another column in `rv` called `weighteddevsq`, where you multiply the squared deviations from the previous step by the associated probability `p.i`. Call `rv` again to make sure it’s now there.
- Finally, take the sum of `weighteddevsq` to get the variance. Take the square root of this to get the standard deviation.
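The steps above can be sketched on a different, made-up distribution. The names `w.i`, `q.i`, `tbl`, `mu`, `variance`, and `std.dev` are assumptions for this illustration only, and this is not necessarily how the Lecture 4 source does it.

```r
# Hypothetical distribution (not the one in the question)
w.i <- c(1, 2, 3)        # values the random variable can take
q.i <- c(0.2, 0.5, 0.3)  # probability of each value; these must sum to 1

# Combine the two vectors into a data frame, one row per value
tbl <- data.frame(w.i, q.i)

# Expected value: each value times its probability, added up
mu <- sum(tbl$w.i * tbl$q.i)

# Build the table one column at a time
tbl$deviations    <- tbl$w.i - mu              # distance of each value from the mean
tbl$devsq         <- tbl$deviations^2          # squared distance
tbl$weighteddevsq <- tbl$devsq * tbl$q.i       # squared distance weighted by probability
tbl

# Variance is the sum of the weighted squared deviations;
# the standard deviation is its square root
variance <- sum(tbl$weighteddevsq)
std.dev  <- sqrt(variance)
std.dev
```

For this made-up distribution the mean works out to 2.1, the variance to 0.49, and the standard deviation to 0.7, which you can check by hand against the printed table.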
5.
The mean height of adults is 65 inches, with a standard deviation of 4 inches. Use the normal distribution to find the probabilities of the following scenarios:
- Find the probability of someone being at least 60 inches tall using `pnorm()`.
- Find the probability of someone being at most 60 inches tall.
- Find the probability of someone being between 61 and 69 inches tall. Why is this number familiar?
- Find the probability of someone being between 57 and 73 inches tall. Why is this number familiar?
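The three kinds of probability asked for here ("at most", "at least", and "between") can be sketched as follows on a made-up example. The mean of 100, the standard deviation of 15, and all object names are assumptions for illustration only.

```r
# Hypothetical example: scores with mean 100 and standard deviation 15
mu    <- 100
sigma <- 15

# "At most 85": by default pnorm() returns the area to the LEFT of the cutoff
p.at.most <- pnorm(85, mean = mu, sd = sigma)

# "At least 85": the area to the RIGHT, as 1 minus the left area...
p.at.least <- 1 - pnorm(85, mean = mu, sd = sigma)

# ...or by asking for the upper tail directly
p.at.least.alt <- pnorm(85, mean = mu, sd = sigma, lower.tail = FALSE)

# "Between 70 and 130": left area up to the upper cutoff
# minus left area up to the lower cutoff
p.between <- pnorm(130, mean = mu, sd = sigma) - pnorm(70, mean = mu, sd = sigma)

p.at.most
p.at.least
p.between
```

Because the normal distribution is continuous, "at least" and "more than" give the same probability, so the left and right areas at any cutoff always add up to 1.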
Playing with a Data Set
For the following questions, use the `diamonds` dataset, included as part of `ggplot2`.
6.
Install `ggplot2`.
7.
Load `ggplot2` with the `library()` command.
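One possible way to handle questions 6 and 7 together is sketched below. Installing only needs to happen once per computer, while `library()` must be run in every new R session. The `requireNamespace()` check and the `repos` mirror URL are choices made for this sketch, not requirements of the assignment; a plain `install.packages("ggplot2")` typed into the console works too.

```r
# Install ggplot2 only if it is not already installed (needs an internet connection)
if (!requireNamespace("ggplot2", quietly = TRUE)) {
  install.packages("ggplot2", repos = "https://cloud.r-project.org")
}

# Load the package so its functions and data sets are available this session
library(ggplot2)
```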
8.
Get the structure of the diamonds data.frame. What are the different variables and what kind of data does each contain?
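As an illustration of what "structure" means, the sketch below uses `iris`, a data set built into R, purely as a stand-in for `diamonds`.

```r
# iris ships with R; it is used here only as a stand-in data set
# str() lists every variable, its type, and its first few values
str(iris)

# sapply() with class() gives just the type of each column
sapply(iris, class)
```

In the `str()` output, `num` marks numeric columns and `Factor` marks categorical ones, along with their levels.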
9.
Get summary statistics for `carat`, `depth`, `table`, and `price`.
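A short sketch of summarizing individual columns, again using the built-in `iris` data as a stand-in (the object name `one.col` is made up for this example):

```r
# The $ operator pulls a single column out of a data frame
one.col <- summary(iris$Sepal.Length)
one.col

# Several columns can be summarized at once by selecting them by name
summary(iris[, c("Sepal.Length", "Petal.Length")])
```

For a numeric column, `summary()` reports the minimum, first quartile, median, mean, third quartile, and maximum.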
10.
`color`, `cut`, and `clarity` are categorical variables (factors). Use the `table()` command to generate frequency tables for each.
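For a sense of what a frequency table looks like, here is a sketch on the factor column of the built-in `iris` data, used as a stand-in (the object name `species.counts` is made up):

```r
# table() counts how many rows fall into each category of a factor
species.counts <- table(iris$Species)
species.counts
```

Each of the three species appears 50 times in `iris`, so the table shows three counts of 50.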
11.
Now rerun the `summary()` command on the entire data frame.
12.
Plot a histogram of price.
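One way to draw a histogram is base R's `hist()`, sketched here on the built-in `iris` data as a stand-in. The assignment does not say which plotting function to use, so treat this as one option among several; the title, axis label, and object name `h` are made up.

```r
# hist() bins a numeric variable and draws the count in each bin
h <- hist(iris$Sepal.Length,
          main = "Sepal length (iris)",  # plot title
          xlab = "Sepal length (cm)")    # x-axis label
```

Saving the result in `h` is optional; it keeps the bin boundaries and counts in case you want to look at them.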
13.
Plot a boxplot of price by diamond color.
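A grouped boxplot can be sketched with base R's `boxplot()` and its formula interface, shown here on the built-in `iris` data as a stand-in. As with the histogram, this is only one of several ways to make the plot, and the labels and object name `b` are made up.

```r
# The formula  numeric ~ factor  draws one box per category of the factor
b <- boxplot(Sepal.Length ~ Species, data = iris,
             main = "Sepal length by species (iris)",
             xlab = "Species",
             ylab = "Sepal length (cm)")
```

Read the formula as "the numeric variable, split by the categorical variable".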
Execute your R Script
Save the R script you created at the beginning and (hopefully) have been pasting all of your valid commands into. This creates a .R file wherever you choose to save it. Now, with the file open in the upper-left pane of RStudio, look for the button in the upper-right corner of that pane that says Run. Run executes the current line or selection, so select all of your commands first. Then sit back and watch R redo everything you’ve carefully worked on, all at once.