Showing posts with label R. Show all posts

Monday 1 October 2018

Introduction, Command Line, Data Editor and RStudio in R Language

Command Line versus Scripts



Execution of commands in R is not menu driven (we do not click buttons to get a result).

We need to type the commands.

Both single-line and multi-line commands can be written.

When writing multi-line programs, it is useful to use a text editor rather than executing everything directly at the command line.

Option 1:
  • One way is to use R's own built-in editor.
  • It is accessible from the RGui menu bar.
  • Click File and then click on New script
 
At this point R will open a window entitled Untitled-R Editor.

We may type and edit in this.

If we want to execute a line or a group of lines, we just highlight them and press Ctrl+R.


Option 2:

Use the RStudio software.


Suppose we want to use the following three functions.

Type them in the script editor:
library (MASS)
attach (bacteria)
fix (bacteria)

Suppose we want to run only one of them: library (MASS)

Highlight it and click on Run.



Data Editor
  • There is a data editor within R that can be accessed from the menu bar by selecting Edit/Data editor.
  • Provide the name of the matrix or data frame that we want to edit and a Data Editor window appears.
  • Alternatively we can do this from the command line using the fix function. 
Example:
  library (MASS)
  attach (bacteria)
  fix (bacteria)


We can do it in R Studio as follows :




Cleaning up the Windows
  • We assign names to variables when analyzing any data.
  • It is good practice to remove the variable names given to any data frame at the end of each session in R.
  • This way, variables with the same names but different properties will not get in each other's way in subsequent work.
  • The rm ( ) command removes variable names.
  • For example,
           rm (x, y, z) removes the variables x, y and z.
  • The detach ( ) command detaches an object from the search path.
  • It removes the object from the search ( ) path of available R objects.
  • Usually this is either a data frame that has been attached or a package that was attached by library.
  • To get rid of everything, including data frames, type rm (list = ls ( )).
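
The clean-up commands above can be tried on throwaway objects. The following is a minimal sketch: the variables x, y, z and the small data frame scores are made up for illustration and stand in for whatever was created during a real session.

```r
# Hypothetical objects created only for this illustration
x <- 1:3
y <- c("a", "b")
z <- TRUE

rm(x, y, z)                                  # remove the three names from the workspace
x_exists <- exists("x", inherits = FALSE)    # FALSE once x has been removed

# A small made-up data frame stands in for a real data set such as bacteria
scores <- data.frame(id = 1:3, mark = c(70, 85, 90))
attach(scores)                               # columns become visible by name
attached <- "scores" %in% search()           # TRUE while the data frame is attached
mean_mark <- mean(mark)                      # 'mark' is found through the search path
detach(scores)                               # take the data frame off the search path
still_attached <- "scores" %in% search()     # FALSE after detach

# rm(list = ls())   # uncomment to clear every object in the workspace
```

The last line is left as a comment because it removes every object in the workspace, which is rarely wanted in the middle of a session.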

Saturday 29 September 2018

Array, Strings and Functions in R Languages

Array 
  • An array is used to store an ordered collection of values of the same type.
We will see how to:
  • Create Array
  • Access Array
  • Modify Array
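
The three steps listed above might look as follows. This is only a sketch: the dimensions and values are chosen arbitrarily for illustration.

```r
# Create: a 2 x 3 x 2 array filled column by column with the values 1 to 12
a <- array(1:12, dim = c(2, 3, 2))

# Access: the element in row 1, column 2 of the first layer
first_value <- a[1, 2, 1]        # 3

# Access: the whole second layer, which is a 2 x 3 matrix
second_layer <- a[, , 2]

# Modify: overwrite a single element
a[1, 2, 1] <- 100L
```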





Functions :-
  • A function is a set of statements combined together to perform a specific task.
  • Syntax:
                 functionName <- function(Arguments_optional)
                   {
                       # Statements
                   }
  • We will see how to:
                 - Create a Function
                 - Call a Function
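
As a small illustration of the syntax above, the sketch below creates a function and then calls it. The name rectangle_area and its arguments are hypothetical, chosen only for this example.

```r
# Create a function: 'height' has a default value, so it is optional
rectangle_area <- function(width, height = 1) {
  width * height               # the last evaluated expression is returned
}

# Call the function with both arguments, then with the default height
area_full <- rectangle_area(3, 4)     # 12
area_default <- rectangle_area(5)     # 5
```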




String :-  
  • Values written inside single or double quotes are called strings, e.g. "Hello", 'hello'.
  • Quotes can't be mixed: if a string begins with a double quote, it must also end with a double quote.
  • Examples of valid and invalid strings:
  • "Hey", 'Hey', "Teacher's", 'Name" is' are valid strings.
  • 'Hey", "Hello" there", 'hey"there" are invalid strings.
  • We will see how to:
             - Create and manipulate strings using predefined functions.
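
A few of the predefined string functions in base R are sketched below. The sample text is made up; paste, nchar, toupper and substr are standard base R functions.

```r
# Create strings with double or single quotes
s1 <- "Hello"
s2 <- 'world'
owner <- "Teacher's"                    # a single quote inside double quotes is allowed

greeting <- paste(s1, s2)               # join with a space: "Hello world"
n_chars <- nchar(greeting)              # number of characters: 11
shout <- toupper(greeting)              # "HELLO WORLD"
first_word <- substr(greeting, 1, 5)    # characters 1 to 5: "Hello"
```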

Tuesday 25 September 2018

Importing Data Files from Other Software in R Language

Importing Data Files

Spreadsheet (Excel) file data

The xlsx package has the function read.xlsx ( ) for reading Excel files.

To read Excel files, we first need to install the package.

The following will read the first sheet of an Excel spreadsheet:

install.packages ("xlsx")
library (xlsx)
data  <-  read.xlsx ("datafile.xlsx", sheetIndex = 1)   # a sheetIndex or a sheetName must be given

 To load other sheets with read.xlsx( ), we specify a number for sheetIndex or a name for sheetName:

data  <- read.xlsx("datafile.xlsx", sheetIndex=2)

data  <-  read.xlsx("datafile.xlsx",sheetName="marks")

For reading older Excel files in .xls format, use the gdata package and the function read.xls ( ).

By default this will read the first sheet of an Excel spreadsheet.
To read such files, we first need to install the package:

install.packages ("gdata")
library (gdata)
data  <- read.xls ("datafile.xls", sheet = 1)   # sheet may be a number or a name

SPSS data file

For reading SPSS data files, use the foreign package and the function read.spss ( ).

To read SPSS files, we first need to install the package:

install.packages ("foreign")
library (foreign)
data  <-  read.spss ("datafile.sav") 


Other data files

The foreign package also includes functions to load other formats, including:
  • read.octave ("<Path to file>") : Octave and MATLAB
  • read.systat ("<Path to file>") : SYSTAT
  • read.xport ("<Path to file>") : SAS XPORT
  • read.dta ("<Path to file>") : Stata
  
Contents of working directory

The list.files function shows the contents of your working directory:
> list.files ( )   # List of all available files in the working directory.

> setwd  ("C:/RCourse/")
> list.files ( )
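
Because C:/RCourse/ may not exist on every machine, the same idea can be tried in a temporary folder. In this sketch the folder name RCourse_demo and the two file names are hypothetical.

```r
old_dir <- getwd()                                   # remember the current working directory
demo_dir <- file.path(tempdir(), "RCourse_demo")     # hypothetical folder for the demo
dir.create(demo_dir, showWarnings = FALSE)
invisible(file.create(file.path(demo_dir, c("marks.csv", "script.R"))))

setwd(demo_dir)                                      # change the working directory
files_here <- list.files()                           # every file in the working directory
csv_only <- list.files(pattern = "\\.csv$")          # only names ending in .csv
setwd(old_dir)                                       # restore the original directory
```

Restoring the original directory at the end keeps the rest of the session unaffected.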

 Redirecting Output to a File

Issue:
We want to redirect the output from R into a file instead of the console.

Solution:
Redirect the output of the cat function by using its file argument:

> ans  <-  6 + 8
> cat ("The answer of 6 + 8 is", ans, "\n", file="filename")

The output will be saved in the working directory under the given filename.

Use the sink function to redirect all the output from both print and cat.

Call sink with a filename argument to begin redirecting console output to that file.

When we are done, we call sink with no argument to close the file and resume output to the console.

> sink ("filename")  # Begin writing output to file
.....other session work .....
> sink ( )           # Resume writing output to console

The print and cat functions normally write the output to console.

The cat function redirects the output to a file if we supply a file argument.

The print function cannot redirect its output.

The sink function can force all output to a file. 


Redirecting Output to a File: Three steps

1. 
> sink ("output.txt")  # Redirect output to file

2.
 > source ("script.R")  # Run the script, capture its output

3.
> sink ( )   # Resume writing output to console

Other options like append=TRUE/FALSE,  split=TRUE/FALSE are available. 

Example :-
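
A minimal sketch of both approaches, cat with a file argument and sink, is given here. It writes to temporary files (an assumption made so that nothing in the working directory is overwritten) and reads them back to show what was captured.

```r
# 1. cat with a file argument writes a single message to a file
answer_file <- tempfile(fileext = ".txt")
ans <- 6 + 8
cat("The answer of 6 + 8 is", ans, "\n", file = answer_file)
answer_lines <- readLines(answer_file)

# 2. sink redirects everything that print and cat would show on the console
out_file <- tempfile(fileext = ".txt")
sink(out_file)            # begin writing output to the file
print(1:3)
cat("Done\n")
sink()                    # resume writing output to the console
captured <- readLines(out_file)   # "[1] 1 2 3" "Done"
```

With append = TRUE, sink adds to an existing file instead of overwriting it, and with split = TRUE the output goes to both the file and the console.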

 

Monday 24 September 2018

Data Reshaping in R Language

Data reshaping means changing how data is represented in rows and columns.

 Most data processing in R is done on data frames.

 R has many functions that deal with reshaping of data by splitting, merging or interchanging the rows and columns.

Some data reshaping functions are:
            

             - cbind( )
             - rbind( )
             - merge( ), etc.
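
The three functions listed above can be sketched on small made-up data frames. All names and values below are hypothetical.

```r
students <- data.frame(id = 1:3, name = c("Asha", "Ben", "Chen"))
marks <- c(70, 85, 90)

# cbind: add a new column to the data frame
combined <- cbind(students, mark = marks)

# rbind: add a new row with the same column names
extra <- data.frame(id = 4, name = "Dev", mark = 65)
all_rows <- rbind(combined, extra)

# merge: join two data frames on a common key column
cities <- data.frame(id = c(1, 2, 4), city = c("Pune", "Delhi", "Goa"))
merged <- merge(all_rows, cities, by = "id")   # keeps only ids present in both
```

By default merge keeps only the rows whose key appears in both data frames, so id 3 is dropped here; all = TRUE would keep it with a missing city.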


Another Data Frames :-



Monday 17 September 2018

Data management: Sequence in R Language

Sequences

A sequence is a set of related numbers, events, movements, or items that follow each other in a particular order.

Regular sequences can be generated in R.

Syntax :-
seq ( )

seq (from = 1, to = 1, by = ( (to - from) / (length.out - 1) ), length.out = NULL, along.with = NULL, ...)

Help for seq

> help ("seq")

The default increment is +1 or -1

> seq (from = 2, to = 4)
     [1]  2  3  4

> seq (from = 4, to = 2)
    [1]  4  3  2

> seq (from = -4, to = 4)
   [1]  -4  -3  -2  -1  0  1  2  3  4

 
 Sequence with constant increment:


  Generate a sequence from 10 to 20 with an increment of 2 units

> seq (from = 10, to = 20 , by = 2)
   [1]  10  12  14  16  18  20


Generate a sequence from 20 to 10 with a decrement of 2 units

> seq (from = 20, to = 10, by = -2)
  [1]  20  18  16  14  12  10



Decreasing sequence with constant decrement:

Generate a sequence from 3 to -2 with a decrement of 0.5 units

> seq (from = 3, to = -2, by = -0.5)
  [1]  3.0  2.5  2.0  1.5  1.0  0.5  0.0  -0.5  -1.0  -1.5  -2.0



Sequence with a predefined length with default increment +1

> seq (to = 10, length = 10)
  [1]  1  2  3  4  5  6  7  8  9  10



Sequence with predefined length with constant fractional increment.

> seq (from = 10, length = 10, by = 0.1)
  [1]  10.0  10.1  10.2  10.3  10.4  10.5  10.6  10.7  10.8  10.9




Sequence with a predefined length with constant decrement

> seq (from = 10, length = 10, by = -2)
  [1]  10  8  6  4  2  0  -2  -4  -6  -8



Sequences with a predefined length with constant fractional decrement

> seq (from = 10, length = 5, by = -.2)
  [1]  10.0  9.8  9.6  9.4  9.2



Sequence with a predefined variable and constant increment

> x <-2
> seq (1, x, x/10)
  [1]  1.0  1.2  1.4  1.6  1.8  2.0

> x<-50
> seq (0, x, x/10)
  [1]  0  5  10  15  20  25  30  35  40  45  50
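
The syntax also allows both end points and a length to be given, in which case the increment is worked out as (to - from) / (length.out - 1). A short sketch, with a made-up vector marks used for along.with:

```r
# Five equally spaced values from 0 to 1
by_length <- seq(from = 0, to = 1, length.out = 5)   # 0.00 0.25 0.50 0.75 1.00

# The increment that seq used, computed by hand
step <- (1 - 0) / (5 - 1)                            # 0.25

# along.with produces one index per element of another vector
marks <- c(70, 85, 90)                               # hypothetical data
idx <- seq(along.with = marks)                       # 1 2 3
```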




Saturday 15 September 2018

Basic calculations: Truth table and conditional executions

Example of standard logical operations

Truth table


 
> x = TRUE
> y = FALSE

> x & y       # x AND y
[1]   FALSE

> x | y        # x OR y 
 [1]  TRUE

> !x          # negation of x
 [1] FALSE 
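
The complete truth table for these operators can also be generated in R itself. This is a sketch using expand.grid to list every combination of x and y; the column names are chosen for illustration.

```r
# All four combinations of two logical values
truth <- expand.grid(x = c(TRUE, FALSE), y = c(TRUE, FALSE))

truth$and   <- truth$x & truth$y    # TRUE only when both are TRUE
truth$or    <- truth$x | truth$y    # TRUE when at least one is TRUE
truth$not_x <- !truth$x             # negation of x

print(truth)
```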


Example

> x <- 5
> Logical1  <- (x > 2)
> is.logical (Logical1)
 [1] TRUE

> Logical2 <- (x < 10)
> is.logical (Logical2)
 [1] TRUE

Example

> x <- 5
> Logical3 <-  (2*x > 11)
> is.logical (Logical3)
[1] TRUE

> Logical4  <-  (3*x <20)
> is.logical (Logical4)
 [1] TRUE


Control structures in R:

  • control statements,
  • loops,
  • functions.
Conditional execution

1. Conditional execution

Syntax

if (condition) {executes commands if condition is TRUE}

if (condition) {executes commands if condition is TRUE} else {executes commands if condition is FALSE}

Please note:
  • The condition in this control statement must be a single logical value. Older versions of R used only the first element of a vector-valued condition (with a warning); R 4.2.0 and later stop with an error.
  • The condition may be a complex expression in which the logical operators "and" (&&) and "or" (||) can be used.
Example

> x <- 5
> if ( x == 3)  { x <- x-1} else { x <- 2*x}
Interpretation:
  • If x = 3, then execute  x = x - 1.
  • If x ≠ 3, then execute x = 2*x.
In this case, x = 5 so x ≠ 3. Thus x = 2*5.

> x
 [1]  10

Now choose x = 3 and repeat this example
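
Repeating the example with x = 3 might look like this. The statement is the same as above, only spread over several lines for readability.

```r
x <- 3
if (x == 3) {
  x <- x - 1      # executed because the condition is TRUE
} else {
  x <- 2 * x      # skipped this time
}
x                 # 2
```

Note that else is written on the same line as the closing brace; at the top level, R would otherwise treat the if statement as already complete.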


Friday 14 September 2018

Basic calculations: Logical operators in R Languages

Logical Operators and Comparisons

The following table shows the operations and functions for logical comparisons (True or False).




Examples :

> x = 1 : 6       # Generates x=1,2,3,4,5,6
> (x > 2) & (x < 5)  # Checks whether the values are greater than 2 and less than 5

[1]  FALSE  FALSE  TRUE  TRUE  FALSE  FALSE

> x [(x > 2) & (x < 5)]  # Finds which values are greater than 2 and smaller than 5.
[1]   3  4



Logical Operators and Comparisons 



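  • The shorter form (&, |) performs element-wise comparisons in almost the same way as arithmetic operators.
  • The longer form (&&, ||) evaluates left to right and is meant for single values. Older versions of R examined only the first element of each vector; R 4.3.0 and later stop with an error if an operand has more than one element. Evaluation proceeds only until the result is determined.
Example of "The longer form evaluates left to right examining only the first element of each vector" (output as shown by older versions of R):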
> x = 1 : 6           # Generates x = 1,2,3,4,5,6

> (x > 2)  && (x < 5) 
 [1]  FALSE

is equivalent to:

> (x[1] > 2) & (x[1] < 5)
  [1]  FALSE


Note that x[1] is only the first element in x.

> x[ (x > 2) && (x < 5) ]      # The index is the single value FALSE, so no values are selected
   integer(0)
This statement is equivalent to

> x [ (x[1] > 2) & (x[1] < 5) ]
integer(0)
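
A sketch that avoids applying the longer form to whole vectors, and so should behave the same in old and new versions of R: use & for element-wise work, and reserve && for single values, where its left-to-right evaluation is useful.

```r
x <- 1:6

# Element-wise: one result per element of x
elementwise <- (x > 2) & (x < 5)          # FALSE FALSE TRUE TRUE FALSE FALSE

# Single values only: compare the first element explicitly
first_only <- (x[1] > 2) && (x[1] < 5)    # FALSE

# Reduce a logical vector to one value before using it in if ( )
any_match <- any(elementwise)             # TRUE
all_match <- all(elementwise)             # FALSE

# Left-to-right evaluation: the right side is skipped when the left is FALSE
empty <- numeric(0)
safe <- length(empty) > 0 && empty[1] > 0 # FALSE, empty[1] is never examined
```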


Tuesday 11 September 2018

Basic calculations: Missing data and logical operators in R Language

Missing data

R represents missing observations through the data value NA
We can detect missing values using is.na

> x  <-  NA             # assign NA to variable x
> is.na (x)               # is it missing ?
   [1]    TRUE

Now try a vector to see which values are missing:

> x <-  c(11, NA, 13)
> is.na (x)
  [1] FALSE TRUE FALSE














Example : How to work with missing data

> x  <-  c(11, NA, 13)  # vector
> mean (x)     # (11 + NA + 13)/3 cannot be computed
  [1]   NA
> mean (x, na.rm = TRUE )  # NAs can be removed: (11 + 13)/2 = 12
 [1]  12
The null object, called NULL, is returned by some functions and expressions.

Note that NA and NULL are not the same.

NA is a placeholder for something that exists but is missing.

NULL stands for something that never existed at all.
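
The difference between NA and NULL shows up when they are placed inside a vector. A brief sketch with made-up values:

```r
with_na   <- c(11, NA, 13)      # NA keeps its place in the vector
with_null <- c(11, NULL, 13)    # NULL contributes nothing

len_na   <- length(with_na)     # 3
len_null <- length(with_null)   # 2

na_check   <- is.na(NA)         # TRUE: a missing value
null_check <- is.null(NULL)     # TRUE: the null object
null_len   <- length(NULL)      # 0
```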





Logical Operators and Comparisons

The following table shows the operations and functions for logical comparisons (True or False)

TRUE and FALSE are reserved words denoting logical constants.


Logical Operators and Comparisons



  • The shorter form (&, |) performs element-wise comparisons in almost the same way as arithmetic operators.
  • The longer form (&&, ||) evaluates left to right and is meant for single values. Older versions of R examined only the first element of each vector; R 4.3.0 and later stop with an error if an operand has more than one element. Evaluation proceeds only until the result is determined.
  • The longer form is appropriate for programming control-flow and typically preferred in if clauses (conditional).


Example

 > x  <- 5
Is x less than 10 or is x greater than 5?
 > (x < 10) || (x > 5)   # || means OR
 [1]  TRUE

Is x greater than 10 or is x greater than 5?
> (x > 10) || (x > 5)
[1] FALSE


Monday 10 September 2018

Statistical Functions - Correlation and Example in R Language

Descriptive Statistics :

First-hand tools which give first-hand information.
  • Central tendency of data
  • Variation in data
  • Structure and shape of data
  • Relationship study (correlation coefficient, rank correlation, correlation ratio, regression etc.)
Bivariate Data

Quantitative measures provide a numerical measure of the relationship.

Graphical plots provide first hand visual information about the nature and degree of relationship between two variables.

Relationship can be linear or nonlinear.



x, y : Two data vectors

Data    x = (x1,x2,....,xn)                       y = (y1,y2,...,yn)

cov (x,y) :    covariance between x and y
var (x) :    variance of x


Correlation coefficient

Measures the degree of linear relationship between the two variables.
cor (x,y) : correlation between x and y
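
To see what cov and cor compute, the sketch below works both out by hand and compares them with the built-in functions. The two data vectors are made up for illustration; cov, var, sd and cor are base R functions and use the n - 1 divisor.

```r
x <- c(1, 2, 3, 4, 5)     # hypothetical data
y <- c(2, 4, 5, 4, 5)

# Covariance: average product of deviations from the means (divisor n - 1)
cov_manual <- sum((x - mean(x)) * (y - mean(y))) / (length(x) - 1)

# Correlation: covariance scaled by the two standard deviations
cor_manual <- cov_manual / (sd(x) * sd(y))

cov_matches <- isTRUE(all.equal(cov_manual, cov(x, y)))
cor_matches <- isTRUE(all.equal(cor_manual, cor(x, y)))
var_matches <- isTRUE(all.equal(cov(x, x), var(x)))   # variance is covariance of x with itself
```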




Example :-

Covariance:

Example :-

Correlation coefficient:
Exact positive linear dependence

> cor ( c(1,2,3,4) , c(1,2,3,4)  )
 [1]  1



Data on Daily water Demand



