Principal Component Analysis (PCA) from Scratch

How to perform PCA step by step using R and basic linear algebra functions and operations.  What is PCA? PCA is an exploratory data analysis technique based on dimensionality reduction. The general idea is to reduce the dataset to fewer dimensions while preserving as much information as possible. PCA allows us to make visual representations in two dimensions and check for groups or differences in the data related to different states, treatments, etc....

October 14, 2021

Heatmap Maker Shiny App

A brief tutorial on how to use Heatmap Maker, a web application written in R with Shiny.  What is Heatmap Maker? Heatmap Maker is a Shiny app that allows you to make and save cluster heatmaps. At its core, Heatmap Maker uses the heatmap.2() function from the gplots package. Install and use If you are an R and RStudio user, you have two options to use the app....

September 23, 2021

Nested Designs and their Analysis

How to analyze data from a nested design using R.  If you want to download the code for this post, you can click the following link: Nested Designs and their Analysis. 1 Problem You have designed an experiment to compare the effect of three kinds of drugs on the expression of a gene. You performed the following experiment using mice; note that your design follows a hierarchy....

August 18, 2021

Linear Regression from Scratch

A tutorial explaining how to perform simple linear regression from scratch using linear algebra, calculus, and of course R.  Regression analysis aims to model mathematically the behavior of a response variable as a function of one or more independent variables (factors). As a practical example, think about a process of alcohol production using the microscopic fungus Saccharomyces cerevisiae (yeast). You know that alcohol yield is a linear function of the concentration of sucrose over some interval of concentrations, and you are also interested in making good predictions of alcohol yield for a given sucrose concentration....

July 29, 2021

Basic Linear Algebra with R

Basic linear algebra operations with R.  1 Introduction Linear algebra is the study of vectors and linear functions. In this post I’m going to show you how to perform some basic linear algebra operations with R. 2 Vector operations 2.1 Define vectors You can define vectors simply as follows: # Define two vectors x <- c(30, 20, 40, 10) y <- c(20, 15, 18, 40) 2.2 Sum You can add vectors of the same length:...

March 28, 2021

Exploratory Data Analysis

Exploratory data analysis using R and tidyverse. Always there is the possibility of failure  1 Introduction Statistical analyses such as the t test or the analysis of variance are not designed to detect experimental errors. To overcome this problem, an exploratory data analysis (graphing your data) can be a good approach. 2 Quantile-Quantile Plots A quantile is a measure of position indicating where a specified proportion of the distribution of data lies....

March 5, 2021

Statistical Inference with R

Statistical inference concepts explained using R.  Perfection is always impossible; always it’s an approximation 1 Introduction Formally, statistical inference can be defined as the process through which inferences about a population are made based on certain statistics calculated from a sample of data drawn from that population. What does this mean? While consulting the scientific literature, you may have read statements like “these results showed significant differences (P < 0....

February 26, 2021

Principal Component Analysis to Many Responses

How to perform a principal component analysis on metabolomic data using R. This includes visual representations like a scree plot and a scatter plot for PC1 and PC2.  1 Problem You have obtained the relative quantities of 43 metabolites in Arabidopsis thaliana under salt stress conditions at different times. Salt stress is an important factor limiting plant growth and you want to study how the levels of primary metabolites change under these conditions....

February 18, 2021

More than Two Treatments Comparison (One-Way ANOVA)

A brief R tutorial on how to perform a data analysis comparing data from four treatments. This is achieved by fitting an analysis of variance model, summarizing the data, and making some informative graphs.  1 Problem You want to investigate the effect of light on plant growth. For this you designed an experiment that includes four different light sources. For each source you placed six plants (you replicated each treatment six times) in chambers with a specific light bulb....

February 15, 2021

Simple Linear Regression Applied to Enzyme Kinetics

An R tutorial on how to perform a simple linear regression analysis applied to an enzyme kinetics experiment. It also explores an alternative approach using nonlinear regression.  1 Problem You have performed an enzyme kinetics experiment. You assume Michaelis-Menten kinetics and your aim is to determine the constants Vmax, Km and some others. The first ten observations in your data would look like this: S (M) v (M/min) 8....

February 8, 2021