New to multi-omics, struggling for days

Hi everyone, I hope everyone is doing okay.
My “Ovary.TCGA” list structure contains 1) mRNA 2) miRNA 3) Protein 4) coVar (as factor)

  1. mRNA 10 x 49818
  2. miRNA 10 x 1122
  3. protein 10 x 464

coVar contains the aged and young groups; I derived the age from the clinical data.
Below are the last few lines of my code:

# Creation of a list of 4
Ovary.TCGA <- list(mRNA = mRNA, miRNA = miRNA, protein = protein, coVar = coVar)
# mixOmics DIABLO begins…
x1 <- Ovary.TCGA$mRNA
x2 <- Ovary.TCGA$miRNA
x3 <- Ovary.TCGA$protein
x <- list(mRNA = x1, miRNA = x2, protein = x3)
y <- Ovary.TCGA$coVar
result.diablo.tcga <- block.plsda(x, y)

Below is the error I am facing:

Error in Check.entry.single(X[[q]], ncomp[q], q = q) :
Unique indentifier is needed for the columns of X

packageVersion('mixOmics')
[1] ‘6.24.0’

All of my row names match across the blocks: they are 10 samples from TCGA, identified by submitter IDs such as “TCGA-09-2051”. Can anyone please help me?
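For anyone who wants to confirm the row-name claim on their own blocks, here is a minimal base R sketch. The toy matrices, sample IDs and feature names are made up for illustration and stand in for the real mRNA, miRNA and protein blocks.

```r
# Toy stand-ins for the three blocks (samples in rows, features in columns).
# IDs, dimensions and feature names are invented for this example.
ids <- sprintf("TCGA-%02d-%04d", 1:10, 2001:2010)
set.seed(1)
x <- list(
  mRNA    = matrix(rnorm(10 * 6), nrow = 10,
                   dimnames = list(ids, paste0("gene", 1:6))),
  miRNA   = matrix(rnorm(10 * 4), nrow = 10,
                   dimnames = list(ids, paste0("mir", 1:4))),
  protein = matrix(rnorm(10 * 3), nrow = 10,
                   dimnames = list(ids, paste0("prot", 1:3)))
)

# TRUE only if every block has the same row names, in the same order,
# as the first block
rows_aligned <- all(vapply(
  x,
  function(b) identical(rownames(b), rownames(x[[1]])),
  logical(1)
))
rows_aligned
```

Note that this only checks the rows (samples). The error above complains about the columns, so the row check passing does not rule out a problem with the feature names.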

@MaxBladen @kimanh.lecao Please anyone?

hi @Saeedjaanz,
I can’t be sure but I think the columns of x1 (or x2 or x3) are not unique.
Also consider first filtering the mRNA data set down to at most 5,000 genes (choose the mRNAs with the highest variance across all samples).
10 samples is not many for such an analysis; I assume this is a trial?

Kim-Anh
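The two suggestions in the reply above could be checked roughly as follows, using base R only. `find_dup_cols` and `top_var` are hypothetical helper names, not mixOmics functions, and the toy matrix is invented; the cutoff of 5,000 is the figure from the reply.

```r
# Hypothetical helper: list the column names that occur more than once
# in each block of a named list of matrices
find_dup_cols <- function(blocks) {
  lapply(blocks, function(b) unique(colnames(b)[duplicated(colnames(b))]))
}

# Hypothetical helper: keep the n columns with the highest variance
# across samples, preserving the original column order
top_var <- function(mat, n = 5000) {
  v <- apply(mat, 2, var)
  keep <- order(v, decreasing = TRUE)[seq_len(min(n, ncol(mat)))]
  mat[, sort(keep), drop = FALSE]
}

# Toy data: 10 samples, 5 features, with the name "A" used twice
set.seed(42)
toy <- matrix(rnorm(10 * 5), nrow = 10,
              dimnames = list(paste0("s", 1:10), c("A", "B", "A", "C", "D")))
toy[, 5] <- toy[, 5] * 10   # inflate the variance of column "D"

dups <- find_dup_cols(list(mRNA = toy))   # duplicated names per block
filtered <- top_var(toy, n = 2)           # two most variable columns
dups
colnames(filtered)
```

On real data the same helpers would be applied to `x` (the list of three blocks) and to `x1`. Variance filtering by itself does not remove duplicated names, so the duplicate check is worth running both before and after it.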

Actually, there was duplication in the mRNA columns. The mRNA dataset had 60k+ columns (genes). After filtering and normalization it came down to roughly 27k. I didn't apply a log transformation. But when I removed the duplicated columns, only about 4k unique columns remained, and I proceeded with the N-integration.
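For later readers, here is a minimal sketch of two common ways to deal with repeated column names in base R. The exact commands used in this thread were not posted, so this is an assumption about the general approach, and the gene names and values are toy data.

```r
# Toy mRNA block with two repeated gene names (invented values)
set.seed(7)
mRNA_toy <- matrix(
  rnorm(10 * 5), nrow = 10,
  dimnames = list(paste0("s", 1:10),
                  c("TP53", "BRCA1", "TP53", "MYC", "BRCA1"))
)

# Option 1: keep only the first column for each repeated name
mRNA_first <- mRNA_toy[, !duplicated(colnames(mRNA_toy)), drop = FALSE]

# Option 2: keep every column but make the names unique
# (make.unique appends .1, .2, ... to the repeats)
mRNA_renamed <- mRNA_toy
colnames(mRNA_renamed) <- make.unique(colnames(mRNA_toy))

colnames(mRNA_first)
colnames(mRNA_renamed)
```

Option 1 discards data, so which copy to keep (or whether to average the copies instead) is a judgment call that depends on why the names repeat. Either way, the cleaned matrix would take the place of `x1` in the list passed to `block.plsda`.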
