Figure out appropriate stats: look at papers and ask Dr. Pelky
Meeting with Dr. Lamendella the last week before finals
Thursday, Dec 6th: Lamendella lab night meeting. Paired presentation
Ideas for crowd sourcing
Turn in form and cc Dr. L
Send abstract to Dr. L
Results from heatmapping and NMDS
NMDS: not seeing clustering, either because the analysis is not robust enough or because we just need a different type of visualization
Procrustes analysis through QIIME
Look at literature for other ideas besides NMDS
Analyze like Crohn's: profiles of the 20 most abundant
PC-ORD Non-metric Multidimensional Scaling and Correlations
- Imported Main Matrix “MM” sheet from “PCORD_only_Base.xlsx”
- Imported Environmental “EM” sheet from “PCORD_only_Base.xlsx”
- Ran analysis: Ordination|NMS|autopilot/Slow & thorough
- Took about 20 min. No option to “Graph Ordination”.
- Graphed NMS Scree Plot. Not exactly sure what this represents.
- Ran another analysis: Ordination|NMS|autopilot/Medium
- Graphed NMS Scree Plot again. No visible changes.
- Graphed 2D ordination: Graph|2D. Does not appear to be the same as the graph in the demo with Dr. Lamendella, which had an issue with outliers. (There is a way to delete outliers in PC-ORD. Look at the manual.)
- Ran Pearson & Kendall Correlations with Ordination Axes: (in Graph window) Statistics|Correlations with Second Matrix. Nothing seems to be strongly correlated.
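A scree plot for NMS shows the final stress against the number of ordination axes; the point where extra axes stop lowering stress by much suggests how many axes are worth keeping. The sketch below reproduces the three steps above (scree plot, 2D ordination, axis correlations) in R as a rough cross-check. It is not the PC-ORD procedure: the data are made up, Bray-Curtis is an assumed distance measure, and MASS::isoMDS runs once from a classical-scaling start rather than using PC-ORD's autopilot with repeated runs.

```r
library(MASS)  # isoMDS(); MASS ships with standard R installations

# Made-up stand-in for the Main Matrix: 12 samples (rows) x 8 OTUs (columns)
set.seed(1)
otu <- matrix(rpois(12 * 8, lambda = 20), nrow = 12,
              dimnames = list(paste0("S", 1:12), paste0("OTU", 1:8)))
# Made-up stand-in for one column of the Environmental Matrix
glucose <- rnorm(12, mean = 95, sd = 10)

# Bray-Curtis dissimilarity between every pair of samples (assumed measure)
n <- nrow(otu)
bc <- matrix(0, n, n, dimnames = list(rownames(otu), rownames(otu)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    bc[i, j] <- sum(abs(otu[i, ] - otu[j, ])) / sum(otu[i, ] + otu[j, ])
  }
}
bc <- as.dist(bc)

# Scree plot: stress of the isoMDS solution for each number of axes
dims <- 1:4
stress <- sapply(dims, function(k) isoMDS(bc, k = k, trace = FALSE)$stress)
plot(dims, stress, type = "b", xlab = "Number of axes", ylab = "Stress (%)")

# 2D ordination, labelled by sample
nms <- isoMDS(bc, k = 2, trace = FALSE)
plot(nms$points, xlab = "Axis 1", ylab = "Axis 2")
text(nms$points, labels = rownames(otu), pos = 3)

# Pearson and Kendall correlations of each axis with the metadata column
axis_cor <- data.frame(
  axis    = c("Axis 1", "Axis 2"),
  pearson = apply(nms$points, 2, cor, y = glucose, method = "pearson"),
  kendall = apply(nms$points, 2, cor, y = glucose, method = "kendall")
)
print(axis_cor)
```

With random data like this, no clustering and no strong correlations are expected, so the output is only useful for checking that the workflow runs end to end.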
Heatmaps using R
Install specific packages to plot enhanced heatmaps (heatmap.2 and redgreen come from the gplots package).
Save .xls as .csv. Read .csv into R.
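library(gplots)  # provides heatmap.2() and redgreen()
# Read counts as numbers: with colClasses="character", data.matrix() returns factor codes instead of the counts
dietdata = read.csv("c:/moni_files/Dropbox/Lamendella/analysis/OTUforR.csv",
header=TRUE, row.names=1, comment.char="", sep=",")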
Created a heatmap using the first 50 patients' reads and the first 50 OTUs
d3 = dietdata[1:50,1:50]
d3matrix = data.matrix(d3)
d3_heatmap = heatmap.2(d3matrix,Rowv=NA,
Colv=NA, scale="column", trace="none", col=redgreen(75), xlab="patient", ylab="OTU")
Created a heatmap with a dendrogram and color key.
# Rowv=NA and Colv=NA suppress the dendrograms, so they are left out here
heatmap.2(d3matrix, scale="column", trace="none", col=redgreen(75), xlab="patient", ylab="OTU")
Things to figure out
- How to graph the output from a “Slow & Thorough” Ordination Analysis
- What exactly is the dendrogram representing? Edit. Apply to OTU.
- Group patients by visit.
- Change patient codes.
- Add bar to represent the different visits similar to the bar for clustering the patients in (http://genomebiology.com/2004/5/10/r80/figure/F2)
- Change settings so that it can show all or most of the OTUs
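A possible starting point for the dendrogram, visit bar, and label-size items, sketched with base R's heatmap() on made-up data. By default the dendrogram in heatmap() and heatmap.2() is a hierarchical clustering (hclust, complete linkage) of Euclidean distances between the row or column profiles. The sample IDs, visit coding, and colors below are all hypothetical; heatmap.2() accepts the same RowSideColors, cexRow, and cexCol arguments.

```r
# Made-up data: 6 patients x 2 visits (rows) by 10 OTUs (columns)
set.seed(2)
ids <- paste0("P", rep(1:6, each = 2), ".V", rep(1:2, times = 6))
x <- matrix(rpois(12 * 10, lambda = 30), nrow = 12,
            dimnames = list(ids, paste0("OTU", 1:10)))

# The row dendrogram is a hierarchical clustering of the rows:
# Euclidean distance between abundance profiles, complete linkage
row_tree <- hclust(dist(x))

# One color per row, chosen by the visit encoded in the (hypothetical) row name
visit <- sub(".*\\.V", "", rownames(x))
visit_cols <- unname(c("1" = "steelblue", "2" = "orange")[visit])

hm <- heatmap(x,
              Rowv = as.dendrogram(row_tree),  # cluster the samples
              Colv = NA,                       # keep OTU order; omit to cluster OTUs too
              scale = "column",
              RowSideColors = visit_cols,      # side bar: blue = visit 1, orange = visit 2
              cexRow = 0.6, cexCol = 0.6,      # smaller labels so more rows/columns fit
              xlab = "OTU", ylab = "sample")
```

To group samples by visit instead of clustering them, one option is to sort the rows by visit first and pass Rowv = NA.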
Later down the road…
- Indicator Species Analysis
- Network Analysis & Cytoscape
- Procrustes Analysis
1. What is the problem?
Rates of obesity and related health issues continue to rise. Understanding the contribution of the gut microbiota is a relatively new field that may offer novel ways to manage the disease.
2. 1-2 sentences. What your research proposes to do/to study the problem.
Previously, fecal samples were collected from 39 patients on resistant and non-resistant starch diets, and the 16S rRNA genes were sequenced through HiSeq high-throughput sequencing.
3. How/methods (2 sentences)
These sequences will be used to identify changes in microbial diversity in response to diet through non-parametric analysis.
4. 2 sentences on preliminary results
Because of the short length of the diet, it is expected that only specific individuals who are high-responders will show a shift in microbial diversity.
5. Conclusion- what does this data mean and why is it important?
This study can provide novel biomarkers for obesity and insight into potential gut microbes for pre- and probiotic therapies.
Stats wish list
1. Vector analysis, Pearson correlation
2. Correlation between NMDS and bacterial profiles
3. ISA: indicator species analysis
which species are characteristic of baseline and which correlate with glucose
4. Network analysis
robust method, not just a pretty picture
PC-ORD makes the data, Cytoscape makes the graphic
5. Procrustes analysis
Things we figured out
- metadata ID names are some combination of RS number, visit etc. separated by dots.
- metadata after baseline is on computer at Lamendella’s house. Should receive by end of the day.
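Since the IDs are dot-separated, the patient and visit can probably be pulled out by splitting on the dots. A minimal sketch, assuming a hypothetical layout in which the first field is the RS number and the second is the visit; the example IDs are invented and the real field order needs checking against the metadata sheet.

```r
# Hypothetical IDs; the real field order and labels need checking
ids <- c("RS01.V1", "RS01.V2", "RS02.V1", "RS02.V2", "RS10.V1")

# Split each ID on literal dots (fixed = TRUE, since "." is a regex wildcard)
parts <- strsplit(ids, ".", fixed = TRUE)

# Assumes field 1 is the RS number and field 2 is the visit
meta <- data.frame(
  id      = ids,
  patient = sapply(parts, `[`, 1),
  visit   = sapply(parts, `[`, 2),
  stringsAsFactors = FALSE
)

# Order samples by visit, then patient, e.g. to reorder heatmap rows
meta <- meta[order(meta$visit, meta$patient), ]
print(meta)
print(table(meta$visit))
```

The resulting visit column could then be used to group samples or to color a side bar on the heatmap.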
Things we need to do for report
- put together spreadsheet of metadata of change: insulin, glucose, body fat, age, gender, ethnicity
- ANOVAs from chori (Dr. Lamendella)
- summary stats of sequencing: sequences per sample vs. total QC’ed quality
Tests that we can perform in the future:
- non-metric multidimensional scaling: PC-ORD
- Pearson correlation: correlate human metadata and bacterial profile
- ANOVA: which metadata characters are significantly different between groups
- ISA (indicator species analysis): which species correlate with which diet and group
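The Pearson correlation and ANOVA items can be tried in base R before the real metadata arrives. A minimal sketch on made-up data; the column names, group labels, and values are illustrative assumptions, not the study's variables.

```r
set.seed(3)
# Made-up metadata for 20 samples; names and values are illustrative only
meta <- data.frame(
  diet    = factor(rep(c("resistant", "non_resistant"), each = 10)),
  glucose = rnorm(20, mean = 95, sd = 10)
)
# Made-up abundance of one taxon, constructed to track glucose loosely
meta$taxon_abund <- 0.02 * meta$glucose + rnorm(20, sd = 0.3)

# Pearson correlation between one metadata variable and one taxon
ct <- cor.test(meta$glucose, meta$taxon_abund, method = "pearson")
print(ct$estimate)
print(ct$p.value)

# One-way ANOVA: does glucose differ between the diet groups?
fit <- aov(glucose ~ diet, data = meta)
print(summary(fit))
```

When the same test is repeated over many OTUs or metadata columns, the p-values should be corrected for multiple testing, for example with p.adjust(p, method = "BH").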