STAT 39000: Project 2 — Fall 2020

Motivation: The ability to quickly reproduce an analysis is important. It is often necessary that other individuals will need to be able to understand and reproduce an analysis. This concept is so important there are classes solely on reproducible research! In fact, there are papers that investigate and highlight the lack of reproducibility in various fields. If you are interested in reading about this topic, a good place to start is the paper titled "Why Most Published Research Findings Are False" by John Ioannidis (2005).

Context: Making your work reproducible is extremely important. We will focus on the computational part of reproducibility. We will learn RMarkdown to document your analyses so others can easily understand and reproduce the computations that led to your conclusions. Pay close attention as future project templates will be RMarkdown templates.

Scope: Understand Markdown, RMarkdown, and how to use it to make your data analysis reproducible.

Learning objectives
  • Use Markdown syntax within an Rmarkdown document to achieve various text transformations.

  • Use RMarkdown code chunks to display and/or run snippets of code.

Questions

Question 1

Make the following text (including the asterisks) bold: This needs to be very bold. Make the following text (including the underscores) italicized: This needs to be very italicized.

Surround your answer in 4 backticks. This will allow you to display the markdown without having the markdown "take effect". For example:

markdown ` Some marked up text. `

Be sure to check out the Rmarkdown Cheatsheet and our section on Rmarkdown in the book.

Rmarkdown is essentially Markdown + the ability to run and display code chunks. In this question, we are actually using Markdown within Rmarkdown!

Items to submit
  • 2 lines of markdown text, surrounded by 4 backticks. Note that when compiled, this text will be unmodified, regular text.

Question 2

Create an unordered list of your top 3 favorite academic interests (some examples could include: machine learning, operating systems, forensic accounting, etc.). Create another ordered list that ranks your academic interests in order of most interested to least interested.

You can learn what ordered and unordered lists are [here](rstudio.com/wp-content/uploads/2016/03/rmarkdown-cheatsheet-2.0.pdf).

Similar to (1), in this question we are dealing with Markdown. If we were to copy and paste the solution to this problem in a Markdown editor, it would be the same result as when we Knit it here.

Items to submit
  • Create the lists, this time don’t surround your code in backticks. Note that when compiled, this text will appear as nice, formatted lists.

Question 3

Browse www.linkedin.com/ and read some profiles. Pay special attention to accounts with an "About" section. Write your own personal "About" section using Markdown. Include the following:

  • A header for this section (your choice of size) that says "About".

  • The text of your personal "About" section that you would feel comfortable uploading to linkedin, including at least 1 link.

Items to submit
  • Create the described profile, don’t surround your code in backticks.

Question 4

LaTeX is a powerful editing tool where you can create beautifully formatted equations and formulas. Replicate the equation found here as closely as possible.

Lookup "latex mid" and "latex frac".

Items to submit
  • Replicate the equation using LaTeX under the Question 4 header in your template.

Question 5

Your co-worker wrote a report, and has asked you to beautify it. Knowing Rmarkdown, you agreed. Make improvements to this section. At a minimum:

  • Make the title pronounced.

  • Make all links appear as a word or words, rather than the long-form URL.

  • Organize all code into code chunks where code and output are displayed. If the output is really long, just display the code.

  • Make the calls to the library function be evaluated but not displayed.

  • Make sure all warnings and errors that may eventually occur, do not appear in the final document.

Feel free to make any other changes that make the report more visually pleasing.

`markdown `r ''`{r my-load-packages} library(ggplot2)

`r ''````{r declare-variable-390, eval=FALSE}
my_variable <- c(1,2,3)

All About the Iris Dataset

This paper goes into detail about the iris dataset that is built into r. You can find a list of built-in datasets by visiting stat.ethz.ch/R-manual/R-devel/library/datasets/html/00Index.html or by running the following code:

data()

The iris dataset has 5 columns. You can get the names of the columns by running the following code:

names(iris)

Alternatively, you could just run the following code:

iris

The second option provides more detail about the dataset.

According to stat.ethz.ch/R-manual/R-devel/library/datasets/html/iris.html there is another dataset built-in to r called iris3. This dataset is 3 dimensional instead of 2 dimensional.

An iris is a really pretty flower. You can see a picture of one here:

In summary. I really like irises, and there is a dataset in r called iris. ``

Items to submit
  • Make improvements to this section, and place it all under the Question 5 header in your template.

Question 6

Create a plot using a built-in dataset like iris, mtcars, or Titanic, and display the plot using a code chunk. Make sure the code used to generate the plot is hidden. Include a descriptive caption for the image. Make sure to use an RMarkdown chunk option to create the caption.

Items to submit
  • Code chunk under that creates and displays a plot using a built-in dataset like iris, mtcars, or Titanic.

Question 7

Insert the following code chunk under the Question 7 header in your template. Try knitting the document. Two things will go wrong. What is the first problem? What is the second problem?

````markdown

plot(my_variable)

``

Take a close look at the name we give our code chunk.

Take a look at the code chunk where my_variable is declared.

Items to submit
  • The modified version of the inserted code that fixes both problems.

  • A sentence explaining what the first problem was.

  • A sentence explaining what the second problem was.

For Project 2, please submit your .Rmd file and the resulting .pdf file. (For this project, you do not need to submit a .R file.)

OPTIONAL QUESTION

RMarkdown is also an excellent tool to create a slide deck. Use the information here or here to convert your solutions into a slide deck rather than the regular PDF. You may experiment with slidy, ioslides or beamer, however, make your final set of solutions use beamer as the output is a PDF. Make any needed modifications to make the solutions knit into a well-organized slide deck (For example, include slide breaks and make sure the contents are shown completely.). Modify (2) so the bullets are incrementally presented as the slides progress.

You do not need to submit the original PDF for this project, just the beamer slide version of the PDF.

Items to submit
  • The modified version of the solutions in beamer slide form.