Can MINT be used for categorical survey data?

Conceptual question.

I had planned to use MINT to integrate patient survey responses from different countries, where demographics, family history, disease status, comorbidities, disease triggers, symptoms and the like are recorded longitudinally. The registry data is collected by sister organizations, which means the questions (features) are the same but the patients vary. I thought this would be a great dataset to apply MINT to.
However, going through the course notes now, I am unsure, as MINT is a sparse multi-group PLS model. From my reading of the material and a previous question, the way I’ve dealt with categorical variables as input is to create dummy variables, with each possible answer as its own column (feature), and use the resulting dummy matrix as input to the analysis. From my understanding, this rules out the sparse analysis options, because those methods treat each column of the dummy matrix as a separate variable and are likely to remove some categories of a variable while keeping others.
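To make the dummy-matrix setup concrete, here is a minimal sketch in plain Python (it is not mixOmics code). The survey variables and answers are invented for illustration, and `dummy_encode` is a hypothetical helper, not a function from any package.

```python
# Hypothetical survey responses: one dict per patient.
# Variable names and answers are made up for illustration only.
responses = [
    {"status": "active", "trigger": "stress"},
    {"status": "remission", "trigger": "diet"},
    {"status": "active", "trigger": "diet"},
]


def dummy_encode(rows):
    """One-hot encode categorical answers.

    Returns (columns, matrix), where each column is named
    "<variable>_<answer>" and each matrix row holds 0/1 indicators.
    """
    # Collect every (variable, answer) pair seen in the data.
    columns = sorted({f"{var}_{ans}" for row in rows for var, ans in row.items()})
    matrix = []
    for row in rows:
        present = {f"{var}_{ans}" for var, ans in row.items()}
        matrix.append([1 if col in present else 0 for col in columns])
    return columns, matrix


columns, matrix = dummy_encode(responses)
for name, column in zip(columns, zip(*matrix)):
    print(name, list(column))
```

Each survey question becomes several columns, and a method that selects columns one at a time has no knowledge that `status_active` and `status_remission` come from the same question.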

Does this mean MINT isn’t something I can apply to survey data integration? Or have I misunderstood something along the way? I hope the explanation makes sense too.


Hi @lizak,

We have not tested this scenario yet. When you transform your variables into dummy variables in the X data set, you may end up selecting, say, X_status_category1 but not X_status_category2. It is worth trying, but I think interpreting the results will take a bit of work.
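One possible way to ease that interpretation step is to map the selected dummy columns back to their parent survey questions and flag questions that were only partly selected. This is a rough sketch in plain Python under assumptions: the `parent_of` mapping, the `selected` set, and the `selection_by_variable` helper are all hypothetical, and the selected columns would in practice come from whatever sparse model you fit.

```python
from collections import defaultdict

# Hypothetical mapping from each dummy column to its parent survey question.
parent_of = {
    "X_status_category1": "status",
    "X_status_category2": "status",
    "X_trigger_category1": "trigger",
    "X_trigger_category2": "trigger",
}

# Hypothetical set of columns kept by a sparse model (assumed, not real output).
selected = {"X_status_category1", "X_trigger_category1", "X_trigger_category2"}


def selection_by_variable(parent_of, selected):
    """Summarise, per survey question, whether all, some, or none of its
    dummy columns were selected."""
    levels = defaultdict(list)
    for col, var in parent_of.items():
        levels[var].append(col)

    report = {}
    for var, cols in levels.items():
        kept = [c for c in cols if c in selected]
        if not kept:
            status = "none"
        elif len(kept) == len(cols):
            status = "all"
        else:
            status = "partial"
        report[var] = (status, kept)
    return report


report = selection_by_variable(parent_of, selected)
for var, (status, kept) in report.items():
    print(var, status, kept)
```

A question flagged as "partial" is the case described above: the model kept one category but dropped another, so the result speaks to that specific answer rather than to the question as a whole.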