3  Overview of R and RStudio

What is R?

  • R is a powerful language and environment for statistical computing and graphics.
  • R is an open-source programming language, widely used among statisticians, data analysts, and researchers for data manipulation, calculation, and graphical display.
  • R is not just a programming language, but also an environment for interactive statistical analysis.
  • It was developed by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand, and is currently maintained by the R Core Team.
  • It is a GNU project and is freely available under the GNU General Public License.
  • Packages: The R community is known for its active contributions in terms of packages. There are thousands of packages available in the Comprehensive R Archive Network (CRAN), covering various functions and applications.
  • Platform Independent: R is available for various platforms such as Windows, macOS, and Unix-like systems.

3.1 Installation and Setup

3.1.1 Install R

Download R from the Comprehensive R Archive Network (CRAN), choosing the installer for your operating system (Windows, macOS, or Linux), and install it.

3.1.2 Install RStudio

RStudio is a recommended integrated development environment (IDE) for R. Download RStudio from Posit, choosing the installer for your operating system (Windows, macOS, or Linux), and install it.

3.2 Overview of RStudio Panels

  • RStudio is a widely-used Integrated Development Environment (IDE) for R programming.
  • RStudio’s design enhances the efficiency and user-friendliness of coding, testing, and data analysis in R.
  • Its panels and features provide a comprehensive environment that caters to the needs of both novice and experienced R programmers.
  • The interface is divided into several panels, each designed for specific tasks. Here’s a detailed overview of these panels.
RStudio Panel Layout

Source Panel (Top-Left by Default)

Function

This panel is where you write and edit your R scripts and R Markdown documents.

Features
  • Syntax highlighting for R code.
  • Code completion and hinting.
  • Ability to run code directly from the script.

Console Panel (Bottom-Left by Default)

Function

This is where R code is executed interactively.

Features
  • Direct execution of R commands.
  • Displays results of script execution.
  • Keeps a history of your commands.

Environment/History Panel (Top-Right by Default)

Environment Tab
  • Shows the objects currently in memory, such as datasets, variables, and functions.
  • Allows for inspection and management of data structures and variables.
History Tab
  • Records all commands run in the Console.
  • Enables re-running and insertion of previous commands into scripts.

Output/Files/Plots/Packages/Help/Viewer Panel (Bottom-Right by Default)

Files Tab
  • Manages project files and directories.
  • Sets the working directory.
Plots Tab
  • Displays graphs and charts.
  • Allows for the export of plots.
Packages Tab
  • Lists and manages R packages.
  • Provides access to package documentation.
Help Tab
  • Offers R documentation and help files.
  • Useful for learning about R functions and packages.
Viewer Tab
  • Displays local web content such as HTML files from R Markdown or Shiny apps.
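
Several of these tabs have console equivalents, which can help when working outside RStudio. The following is a minimal sketch using base R functions; the object name x and the path in the commented setwd() call are placeholders chosen for illustration.

```r
# Environment tab: list the objects currently in memory
x <- c(1, 2, 3)
ls()                         # names of objects in the workspace (includes "x")

# Files tab: inspect the working directory
getwd()                      # path of the current working directory
list.files()                 # files in that directory
# setwd("path/to/project")   # hypothetical path; changes the working directory

# Packages tab: names of some installed packages
head(rownames(installed.packages()))

# Help tab: open documentation (best run in an interactive session)
# help("mean")               # or the shorthand ?mean
```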

Additional Features

  • Toolbar: Quick access to common tasks like saving, loading, and running scripts.
  • Customization: Ability to rearrange the layout of tabs and panes.
  • Version Control: Integrated support for Git and SVN.

3.3 R Syntax and R Script

3.3.1 R Syntax

R is a powerful programming language used extensively for statistical computing and graphics. It provides a wide array of techniques for data analysis, including linear and nonlinear modeling, classical statistical tests, time-series analysis, classification, clustering, and more. Its syntax allows users to easily manipulate data, perform calculations, and create graphical displays. Here’s a breakdown of some fundamental aspects of R syntax and an example to illustrate how it works.

Basic Syntax Components

  • Variables: In R, you can create variables without declaring their data type. You simply assign values directly with the assignment operator <- or =.

  • Comments: Comments start with the # symbol. Everything to the right of the # in a line is ignored by the interpreter.

  • Vectors: The vector is one of R’s basic data structures; you create one using the c() function. A vector is a sequence of elements of the same type.

  • Functions: Functions are defined using the function keyword. They can take inputs (arguments), perform actions, and return a result.

  • Conditional Statements: R supports the usual if-else conditional constructs.

  • Loops: For iterating over sequences, R provides for, while, and repeat loops.

  • Packages: R’s functionality is extended through packages, which are collections of functions, data, and compiled code. You can install packages using the install.packages() function and load them with library().
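
The components listed above can be sketched in one short script. All names and values here (x, y, scores, average, and so on) are invented for illustration, and the package step uses stats only because it ships with R.

```r
# Variables: no type declaration; <- is the conventional assignment operator
x <- 10
y = 2.5                      # = also assigns at the top level

# Vectors: c() combines values of the same type
scores <- c(72, 85, 90, 64)

# Functions: defined with the function keyword
average <- function(values) {
  sum(values) / length(values)   # the last evaluated expression is returned
}

# Conditional statement: if-else
mean_score <- average(scores)
if (mean_score >= 75) {
  grade <- "pass"
} else {
  grade <- "review"
}

# Loops: for iterates over a sequence, while repeats until a condition fails
total <- 0
for (s in scores) {
  total <- total + s
}

count <- 0
while (count < 3) {
  count <- count + 1
}

# Packages: install once, then load in each session
# install.packages("ggplot2")   # needs an internet connection, so left commented out
library(stats)                  # stats ships with R and is used here only as an example
```

Running the script line by line in the Console and checking the Environment tab after each step is a convenient way to see what every statement creates.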

3.3.2 R Script

  • Rscript is a tool for executing R scripts directly from the command line, making it easier to integrate R into automated processes or workflows.
  • It’s part of the R software environment, which is widely used for statistical computing and graphics. Rscript enables you to run R code saved in script files (typically with the .R extension) without opening an interactive R session.
  • This is particularly useful for batch processing, automated analyses, or running scripts on servers where a graphical user interface is not available.
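
As a rough sketch of how a script might be written for use with Rscript, the example below reads its inputs from the command line with commandArgs(). The file name summarise.R and the helper summarise_args are hypothetical, chosen only for this illustration.

```r
# summarise.R (hypothetical file name)
# Run from a terminal with:  Rscript summarise.R 4 8 15

# Arguments that follow the script name, as a character vector
args <- commandArgs(trailingOnly = TRUE)

# Convert the arguments to numbers and report how many there are and their mean
summarise_args <- function(args) {
  values <- as.numeric(args)
  c(n = length(values), mean = mean(values))
}

# Only print a summary when arguments were actually supplied
if (length(args) > 0) {
  print(summarise_args(args))
}
```

Keeping the work in a function and the command-line handling at the bottom means the same file can also be sourced and tested in an interactive session.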

Creating an R Script in RStudio

Creating and using R scripts in RStudio is a fundamental skill for anyone working with data in R. As an IDE for R, RStudio streamlines the process of writing, running, and managing R scripts. Here’s a concise guide:

  1. Start a New Script: To begin, navigate to File -> New File -> R Script. This opens a new script tab in the top-left pane where you can write your code.

  2. Writing Code: You can type your R code directly into this script pane. Common tasks include importing data, data manipulation, statistical analysis, and plotting. For instance, to create and print a variable, simply type something like result <- 3 followed by print(result) to see the output in the Console pane.

  3. Running Code: To execute your code, click the Run button at the top of the script pane, or use the keyboard shortcut (Ctrl+Enter on Windows and Linux, Cmd+Return on macOS). The output appears in the Console pane at the bottom.
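
The two-line script mentioned in step 2 would look like this when typed into the script pane:

```r
# Create a variable and print its value
result <- 3
print(result)   # the Console shows: [1] 3
```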

Basic R Scripts Examples

Below are a few examples of basic R scripts that demonstrate common tasks in R.

Example 1: Hello World

A simple script that prints “Hello, World!” to the console.
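
A minimal version of such a script might be the following; the variable name message_text is chosen here for illustration.

```r
# Store the greeting and print it to the console
message_text <- "Hello, World!"
print(message_text)
```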

Example 2: Basic Arithmetic

This script performs basic arithmetic operations and prints the results.
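
One possible version of such a script is sketched below; the values 15 and 4 are arbitrary and chosen only for illustration.

```r
# Two numbers to work with
a <- 15
b <- 4

print(a + b)     # addition
print(a - b)     # subtraction
print(a * b)     # multiplication
print(a / b)     # division
print(a %/% b)   # integer division
print(a %% b)    # remainder (modulo)
print(a ^ b)     # exponentiation
```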

Example 3: Creating and Plotting a Vector

This example demonstrates how to create two numerical vectors, perform an operation on them, and then plot the result.
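
A sketch of such a script follows. The names vector1, vector2, and result and the plot() arguments match the description in this section; the numeric values are assumptions made for illustration.

```r
# Create two numeric vectors of equal length (values are illustrative)
vector1 <- c(1, 2, 3, 4, 5)
vector2 <- c(2, 4, 6, 8, 10)

# Element-wise addition
result <- vector1 + vector2

# Plot the result as points joined by lines, in red
plot(result, type = "o", col = "red")
```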

In this example:

  • Two vectors, vector1 and vector2, are created using the c() function.
  • These vectors are then added together, resulting in the result vector. The addition is done element-wise: the first element of vector1 is added to the first element of vector2, and so on.
  • Finally, the plot() function is used to visualize result. The type = "o" argument specifies that both the points and the lines connecting them should be plotted, and col = "red" changes the color of the plot to red.

These examples introduce the basics of writing and running R scripts. As you become more familiar with R’s syntax and features, you can write more complex scripts to automate a wide range of data analysis and statistical tasks.