Using Bioconductor To Analyse Beadarray Data
From Bridges Lab Protocols
Revision as of 17:30, 21 August 2009
Software Requirements
- R, get from [CRAN]
- Bioconductor, get from [Bioconductor]
- Bioconductor packages. Install as needed:
  - beadarray
  - limma
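# Replace PACKAGE with the name of the package to install, e.g. beadarray
# (current Bioconductor releases use BiocManager::install("PACKAGE") instead of biocLite)
source("http://www.bioconductor.org/biocLite.R")
biocLite("PACKAGE")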
Loading Data
- At a minimum you need the Probe Profile data (normally a txt file).
- For all R procedures, first change to your working directory, then create a new script and save every executed line in that script file.
- Load the beadarray library, then indicate the dataFile (required), the sampleSheet (normally an xls or csv file) and the control set (Control Probe, normally a txt file):
library(beadarray)
data = "FinalReport_SampleProbe.txt"
controls = "ControlProbe.txt"
samplesheet = "Proj_54_12Aug09_WGGEX_SS_name.csv"
BSData = readBeadSummaryData(dataFile = data, qcFile = controls, sampleSheet = samplesheet)
- You may need to change the ProbeID or ControlID setting so that it matches the name of the Illumina probe column in the sample probe or control probe file.
- This reads the data into the BSData object (an ExpressionSetIllumina object, not a plain data frame). Phenotype data can be accessed with pData(BSData) and expression data with exprs(BSData).
Data Normalisation
- Microarray data is typically quantile normalised and log2 transformed:
BSData.quantile = normaliseIllumina(BSData, method="quantile", transform="log2")
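To make the quantile step above less of a black box, here is a minimal base-R sketch of the idea on a made-up intensity matrix. The function name quantile_normalise and the toy values are assumptions for illustration only; this is not the code beadarray runs internally, and its handling of ties or missing values may differ.

```r
# Toy intensity matrix: 5 probes (rows) x 3 arrays (columns); values are made up.
x <- matrix(c(5, 2, 3, 4, 1,
              4, 1, 4, 2, 3,
              3, 4, 6, 8, 5),
            nrow = 5,
            dimnames = list(paste0("probe", 1:5), paste0("array", 1:3)))

# Hypothetical helper, not part of beadarray
quantile_normalise <- function(m) {
  # Rank of each value within its own array (ties broken by order of appearance)
  ranks <- apply(m, 2, rank, ties.method = "first")
  # Reference distribution: mean of the sorted values across arrays
  reference <- unname(rowMeans(apply(m, 2, sort)))
  # Replace each value by the reference value at its rank
  normalised <- apply(ranks, 2, function(r) reference[r])
  dimnames(normalised) <- dimnames(m)
  normalised
}

# log2 transform first, then normalise (order chosen for this sketch)
x.norm <- quantile_normalise(log2(x))

# After normalisation every array has the same set of sorted values
apply(x.norm, 2, sort)
```

Because every array ends up with an identical distribution, the boxplots of the normalised data should line up with each other.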
- To examine the effects of normalisation on the dataset use boxplots:
boxplot(as.data.frame(log2(exprs(BSData))), las=2, outline=FALSE, ylab="Intensity (Log2 Scale)")
boxplot(as.data.frame(exprs(BSData.quantile)), las=2, outline=FALSE, ylab="Intensity (Log2 Scale)")
- Save these boxplots as postscript files.
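One way to save a boxplot as a postscript file is to open R's postscript() graphics device, draw the plot, and close the device again. The sketch below uses simulated intensities in place of exprs(BSData.quantile), and the output file name and page size are assumptions to adapt as needed.

```r
# Simulated log2 intensities standing in for exprs(BSData.quantile)
set.seed(1)
expr <- matrix(rnorm(600, mean = 8, sd = 1.5), ncol = 6,
               dimnames = list(NULL, paste0("sample", 1:6)))

# Hypothetical output location; use a path in your working directory instead
out_file <- file.path(tempdir(), "boxplot_normalised.ps")

# Open the postscript device; everything plotted until dev.off() goes to the file
postscript(out_file, width = 8, height = 6, horizontal = FALSE, paper = "special")
boxplot(as.data.frame(expr), las = 2, outline = FALSE,
        ylab = "Intensity (Log2 Scale)")
dev.off()  # close the device so the file is written to disk
```

Repeat the same three steps (open device, plot, dev.off()) with a different file name for the pre-normalisation boxplot.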
Clustering Analysis
- This analysis computes a Euclidean distance matrix between samples, then performs hierarchical clustering on that matrix to show how the replicates group. Ideally, similar treatments will cluster together.
d = dist(t(exprs(BSData.quantile)))
plot(hclust(d))
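To see what "similar treatments cluster together" looks like, the clustering steps can be tried on simulated data with a known group structure. Everything in this sketch (the sample names, the number of probes and the size of the treatment shift) is made up for illustration; cutree() is added here only to read the group assignment off the tree.

```r
set.seed(42)
# Two simulated treatment groups, three replicates each, 200 probes
control <- matrix(rnorm(600, mean = 8), ncol = 3)
treated <- matrix(rnorm(600, mean = 8), ncol = 3)
treated[1:50, ] <- treated[1:50, ] + 3   # shift 50 probes in the treated group
expr <- cbind(control, treated)
colnames(expr) <- c(paste0("control", 1:3), paste0("treated", 1:3))

d <- dist(t(expr))           # Euclidean distance between samples (the default method)
hc <- hclust(d)              # hierarchical clustering (complete linkage by default)
groups <- cutree(hc, k = 2)  # cut the tree into two clusters
print(groups)                # cluster label for each sample

plot(hc, main = "Sample clustering", xlab = "", sub = "")
```

If replicates of one treatment end up scattered across clusters in real data, that is a cue to check for batch effects or failed arrays before moving on.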