Within the FrailSafe project, researchers from the University of Patras have developed a frailty detection model that takes text written by older people into account. This work is part of developing new metrics for frailty that consider not only the physical domain, but also the cognitive, behavioural, social and psychological domains. Even text analysis by itself shows very promising results. A frailty text analysis model is a statistical model built with Artificial Intelligence (AI) and Machine Learning (ML) methods, which give computers the ability to learn automatically from text. The model aims to detect a person’s frailty status (i.e. non-frail, pre-frail or frail) from written text and to help doctors in their diagnosis and treatment of frailty.
An important component of this frailty model comes from the analysis and processing of the participants’ typed text in order to classify their behaviour according to the levels of frailty. The construction of this statistical model is made possible by a series of standard analysis steps that are generally performed in machine learning.
First, a number of features are automatically extracted from the text. Features are individually measurable properties or characteristics of the data being analyzed. This step relies on various natural language processing filters, ranging from simple spell checkers to complex components that measure text entropy. A feature can describe the input (an older person’s written text) at a low level, for example the presence or absence of a single word, or at a high level, for example a readability measurement (how easy the text is to read according to well-known standards). The feature extraction process in FrailSafe led to the creation of 160 distinct features.
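To make the idea of low-level and high-level features concrete, here is a minimal Python sketch. It is not the FrailSafe feature extractor: the function name `extract_features`, the example keywords and the three feature types shown are assumptions chosen purely for illustration, and average sentence length stands in only as a crude proxy for a real readability score.

```python
import math
import re
from collections import Counter


def extract_features(text, keywords=("tired", "alone")):
    """Return a few illustrative text features (hypothetical, not FrailSafe's 160)."""
    words = re.findall(r"[a-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]

    features = {}

    # Low-level features: presence (1.0) or absence (0.0) of a single word.
    for kw in keywords:
        features[f"has_{kw}"] = 1.0 if kw in words else 0.0

    # Higher-level feature: average sentence length in words,
    # used here as a rough stand-in for a readability measure.
    features["words_per_sentence"] = (
        len(words) / len(sentences) if sentences else 0.0
    )

    # Shannon entropy (in bits) of the word distribution of the text.
    total = len(words)
    counts = Counter(words)
    features["word_entropy"] = (
        -sum((c / total) * math.log2(c / total) for c in counts.values())
        if total
        else 0.0
    )
    return features


example = extract_features("I feel tired. I feel fine.")
```

Each text is thereby turned into a fixed set of numbers, which is the form that the later selection and classification steps need.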
In the following step, feature selection is performed: the best subset of the available features is chosen, since not all of them are equally informative and using the full feature space would be neither useful nor convenient. In FrailSafe, the 160 initial features were ranked using a well-established statistical measure, Pearson’s correlation coefficient. Briefly, these features include various readability measures, the presence of keywords, sentiment value (sad to happy) and others, which capture statistical properties of the older person’s text. Finally, the best performing statistical model is selected to fit the data. Since the FrailSafe data are highly complex, no single model was satisfactory on its own. For this reason, a technique known as Majority Voting was used, combining the best performing models into an ensemble classifier.
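The two techniques named above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not say how the correlation was applied, so the sketch assumes features are ranked by the absolute value of their Pearson correlation with an ordinal frailty label (0 = non-frail, 1 = pre-frail, 2 = frail). The feature names, the toy values and the helper names `pearson`, `rank_features` and `majority_vote` are all hypothetical.

```python
import math
from collections import Counter


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    # A constant column has no defined correlation; treat it as uninformative.
    return cov / (sx * sy) if sx and sy else 0.0


def rank_features(columns, labels):
    """Sort feature names by |Pearson r| with the label, strongest first."""
    return sorted(
        columns,
        key=lambda name: abs(pearson(columns[name], labels)),
        reverse=True,
    )


def majority_vote(predictions):
    """Return the most common predicted label; ties go to the label seen first."""
    return Counter(predictions).most_common(1)[0][0]


# Toy data (invented for illustration): six texts, three candidate features.
labels = [0, 0, 1, 1, 2, 2]
columns = {
    "word_entropy": [4.1, 3.9, 3.5, 3.6, 3.0, 2.9],
    "has_tired": [0, 1, 0, 1, 0, 1],
    "words_per_sentence": [12, 11, 10, 9, 9, 8],
}
ranking = rank_features(columns, labels)

# Three hypothetical classifiers disagree; the ensemble takes the majority.
ensemble_label = majority_vote(["frail", "pre-frail", "frail"])
```

In practice only the top-ranked features would be kept, and each model in the ensemble would be trained on them before its vote is counted.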
The test results showed that our frailty text analysis model can predict the three frailty conditions (non-frail, pre-frail, frail) with an average accuracy of 64%. This means that when we use an older person’s written text to predict that person’s frailty status, the prediction is correct about 64% of the time. That is far better than a random guess among the three frailty conditions, which would be correct only 33.33% of the time. To further increase prediction accuracy, we investigated a simplified model that reduces the possible predictions to only non-frail and frail. The accuracy achieved in this case was 84%, an increase of 20 percentage points. These results are quite encouraging given that these models are based only on text. Higher levels of prediction accuracy are achieved when text is combined with data from other domains, including the physical, cognitive, behavioural, social and psychological.
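The comparison with random guessing can be checked with a short calculation. The sketch below assumes that a random guess picks each class with equal probability, which is what the 33.33% figure implies; the function names are hypothetical. Under the same assumption, the chance level for the two-class model is 50%, so it is worth looking at the gain over chance as well as at the raw accuracy.

```python
def chance_accuracy(n_classes):
    """Expected accuracy of guessing uniformly at random among n_classes."""
    return 1.0 / n_classes


def summarize(accuracy, n_classes):
    """Return (chance level, gain over chance in percentage points)."""
    chance = chance_accuracy(n_classes)
    return chance, (accuracy - chance) * 100


# Accuracies reported in the text: 64% (three classes) and 84% (two classes).
three_class = summarize(0.64, 3)
two_class = summarize(0.84, 2)

# Difference between the two reported accuracies, in percentage points
# and as a relative change with respect to the three-class accuracy.
point_gain = (0.84 - 0.64) * 100
relative_gain = (0.84 - 0.64) / 0.64 * 100

print(f"three-class gain over chance: {three_class[1]:.1f} points")
print(f"two-class gain over chance:   {two_class[1]:.1f} points")
print(f"accuracy difference: {point_gain:.0f} points ({relative_gain:.2f}% relative)")
```

Both models are therefore clearly above chance, with the two-class model slightly further above its own baseline.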