Hi all,
Sorry, I am quite new to R and I am having trouble setting up the correct structure for my input data. I am trying to generate a list of data frames, but I run into problems in the downstream analyses, and I think my data may not be organized correctly.
Could you please provide an example of how to build this data structure from scratch? I have seen some of your examples (e.g. the nutrimouse data) but, since that dataset is already assembled, I do not understand how to arrive at its final structure.
Thank you very much in advance.
Hi @agallego,
I am not sure which method you are planning to use, but say you want to start first with PCA or PLS:
library(mixOmics)
data(nutrimouse)
?nutrimouse
# There are 40 mouse samples (liver cells) on which lipids and gene expression levels were measured.
# There are 2 data sets: gene expression of size 40 x 120 (genes) and lipids of size 40 x 21 (lipids),
# as well as metadata that give other information, such as the diet and genotype of the mice.
head(nutrimouse$gene) # this is the gene expression data frame; you can store it as the object 'X'
head(nutrimouse$lipid) # this is the lipid data frame; you can store it as the object 'Y'
nutrimouse$genotype # this is what your metadata should look like; you can store it as the object 'meta.data'
# store the data to get started with the analysis:
X = nutrimouse$gene
Y = nutrimouse$lipid
meta.data = nutrimouse$genotype
dim(X) # check the dimensions
dim(Y)
res.pca = pca(X)
# in your case, you would read your own data from files first, e.g.
X = read.csv('mygenes.csv', header = TRUE)
Y = read.csv('mylipids.csv', header = TRUE)
meta.data = read.csv('mygenotype.csv', header = TRUE)
dim(X) # check the dimensions
dim(Y)
res.pca = pca(X)
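Since you mentioned a list of data frames, here is a minimal sketch of how such a list could be put together by hand, using base R only. The sample names, variable names, genotype labels and simulated values are all made up for illustration; the only point is the layout: one row per sample in each data frame, the same samples in the same order everywhere, and everything collected in one named list.

```r
# Hypothetical toy data: 5 samples, 4 genes, 3 lipids (values are simulated)
set.seed(1)
samples = paste0("mouse", 1:5)

# Each data frame has samples in rows and variables in columns
X = as.data.frame(matrix(rnorm(5 * 4), nrow = 5,
                         dimnames = list(samples, paste0("gene", 1:4))))
Y = as.data.frame(matrix(rnorm(5 * 3), nrow = 5,
                         dimnames = list(samples, paste0("lipid", 1:3))))

# Metadata: one entry per sample, in the same order as the rows of X and Y
# (the labels "wt" and "ppar" are placeholders)
genotype = factor(c("wt", "wt", "ppar", "ppar", "wt"))

# Check that both data frames describe the same samples in the same order
stopifnot(identical(rownames(X), rownames(Y)))
stopifnot(length(genotype) == nrow(X))

# Assemble everything into one named list, similar in layout to nutrimouse
my.data = list(gene = X, lipid = Y, genotype = genotype)

# Show the top-level structure of the list
str(my.data, max.level = 1)
```

With your own files, something like read.csv('mygenes.csv', header = TRUE, row.names = 1) would use the first column as sample names, assuming that is where your file stores the sample IDs.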
If someone in your lab or institute could guide you through these first steps, that would be easier for you, as it is difficult to explain everything here.
Kim-Anh