Trends in Healthcare IT: Big Data and Deep Learning [Commentary]


What opportunities and risks do big data and artificial intelligence hold for medicine, and for radiology in particular? This commentary by Rainer Kasan is based on a lecture and was translated into English.

Rainer Kasan, shareholder of Telepaxx Medical Archiving GmbH, explains how big data and deep learning can be used in the healthcare industry.

What is Big Data?

Big data is about storing and processing large amounts of medical data and searching it for patterns that are useful for detecting disease. Data today is essentially characterized by the so-called “four Vs”: Volume, Velocity, Variety and Veracity. Volume: as a result of progressive digitalization, the amount of data doubles approximately every two years. Velocity: because electronic communication is networked, incoming information has to be recorded and analysed ever faster, or even in “real time”. Variety: data occurs in a wide variety of forms, e.g. as medical images, text documents such as findings or medication plans, HD videos, ECGs and audio files. Veracity: in big data analyses, data quality must be ensured.
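To make the doubling rate concrete, the short sketch below computes the growth factor implied by a two-year doubling period. The starting archive size of 100 terabytes is a hypothetical figure chosen only for illustration, and `growth_factor` is a helper name introduced here, not something from the lecture.

```python
def growth_factor(years: float, doubling_period_years: float = 2.0) -> float:
    """Factor by which a data volume grows after `years`
    if it doubles once every `doubling_period_years`."""
    return 2.0 ** (years / doubling_period_years)


# Hypothetical starting volume, chosen only for illustration.
start_terabytes = 100.0

for years in (2, 4, 10):
    projected = start_terabytes * growth_factor(years)
    print(f"after {years} years: {projected:.0f} TB")
```

Under this assumption, an archive grows 32-fold within ten years, which is why storage and processing capacity have to be planned well ahead.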

Data Protection and Causality

Data may not be collected secretly, used commercially or passed on to third parties without the users’ knowledge. Big data analyses must therefore work with anonymous information. In practice, however, pseudonymous data is often used, and combining many related data records may allow conclusions to be drawn about the person. Data analysts are also aware that two events are not necessarily cause and effect just because they often occur together. More data therefore does not automatically mean more knowledge. For this reason, the trend in medicine is once again to evaluate data using smaller samples and other methods in order to understand the target group. Big data must therefore first become small data and eventually smart data.
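The risk of combining pseudonymous records can be illustrated with a small sketch. It counts how many records share the same combination of attributes; a record whose combination occurs only once could, in principle, be singled out by anyone who knows those attributes. All records, column choices and function names below are made up for illustration and do not come from any real data set.

```python
from collections import Counter

# Entirely made-up pseudonymised records:
# (pseudonym, birth year, postcode prefix, diagnosis code)
records = [
    ("p01", 1954, "90", "I61"),
    ("p02", 1954, "90", "I61"),
    ("p03", 1987, "91", "C50"),
    ("p04", 1987, "91", "J45"),
    ("p05", 1932, "90", "I61"),
]


def group_sizes(rows, key_columns):
    """Count how many records share each combination of the given columns."""
    return Counter(tuple(row[i] for i in key_columns) for row in rows)


def unique_records(rows, key_columns):
    """Return the pseudonyms whose combination of values occurs exactly once."""
    sizes = group_sizes(rows, key_columns)
    return [
        row[0]
        for row in rows
        if sizes[tuple(row[i] for i in key_columns)] == 1
    ]


# Birth year alone singles out one record ...
print(unique_records(records, [1]))
# ... but birth year, postcode and diagnosis together single out three.
print(unique_records(records, [1, 2, 3]))
```

The more attributes are combined, the smaller the groups become, which is the reason pseudonymous data cannot simply be treated as anonymous.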

How Does Deep Learning Work?

The immense increase in unstructured data, such as images, text, videos or speech, makes automatic pattern recognition necessary. Deep learning, a sub-area of machine learning, helps here. It can be done with two methods: supervised and unsupervised learning. In supervised learning, the available data already carries a “marker” (a label), so the “correct answers” are known in advance. Problems can arise in categorizing this data correctly, for example brain haemorrhage “yes or no”. In unsupervised learning, you search for hidden structures in unmarked data, which you group according to common features (clustering). In image processing, the structures of brain haemorrhages can be detected in this way, and these patterns can then be applied. In genome research, unsupervised learning is used to compress data, reducing the number of attributes by forming clusters of genetic traits.
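The difference between the two methods can be sketched on toy data. The example below is deliberately not deep learning: it uses a nearest-centroid classifier as a stand-in for supervised learning and a plain k-means loop as a stand-in for clustering. Each “image” is reduced to two hypothetical numeric features, and all values, labels and function names are assumptions made for illustration.

```python
import math


def mean_point(points):
    """Component-wise mean of a non-empty list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def nearest(point, centres):
    """Index of the centre closest to `point` (Euclidean distance)."""
    return min(range(len(centres)), key=lambda i: math.dist(point, centres[i]))


# Supervised: every training example carries a label (the "marker").
def fit_nearest_centroid(examples):
    """examples: list of ((x, y), label). Returns {label: centroid}."""
    by_label = {}
    for point, label in examples:
        by_label.setdefault(label, []).append(point)
    return {label: mean_point(points) for label, points in by_label.items()}


def predict(model, point):
    """Return the label whose centroid lies closest to `point`."""
    labels = list(model)
    return labels[nearest(point, [model[label] for label in labels])]


# Unsupervised: no labels; k-means groups points that lie close together.
def k_means(points, centres, iterations=10):
    """Return, for each point, the index of the cluster it ends up in."""
    for _ in range(iterations):
        groups = [[] for _ in centres]
        for p in points:
            groups[nearest(p, centres)].append(p)
        # Move each centre to the mean of its group; keep it if the group is empty.
        centres = [mean_point(g) if g else c for g, c in zip(groups, centres)]
    return [nearest(p, centres) for p in points]


# Made-up features; the label answers "haemorrhage: yes or no".
labelled = [
    ((1.0, 1.0), "no"),
    ((1.5, 0.5), "no"),
    ((8.0, 9.0), "yes"),
    ((9.0, 8.0), "yes"),
]
model = fit_nearest_centroid(labelled)
print(predict(model, (7.5, 8.0)))

# The same points without labels, plus two new ones.
unlabelled = [p for p, _ in labelled] + [(7.5, 8.0), (2.0, 1.0)]
print(k_means(unlabelled, centres=[unlabelled[0], unlabelled[2]]))
```

The supervised model can name its answer (“yes” or “no”) because it was given labels, whereas the clustering step only reports which points belong together; what each group means still has to be interpreted by a person.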

Big Data in Radiology

To bring deep learning to clinics, small autonomous training systems are currently being set up on preconfigured hardware and software; these systems learn to answer specific diagnostic questions. The result of such a training procedure is a classifier that complies with data protection requirements, because it does not allow any conclusions to be drawn about the data from which it was derived. If these deep learning classifiers are used in everyday clinical practice to diagnose diseases, they must be certified as medical devices. For that, however, the medical results of the deep learning process would have to be completely comprehensible, which is currently not the case. The radiologist of the future will play the role of an “information broker”. Artificial intelligence will relieve radiologists of routine tasks and, thanks to automated second opinions, will contribute to quality assurance, but it will also generate incidental findings. In developing and emerging countries, a small number of physicians could examine a significantly larger number of patients thanks to artificial intelligence.


However, we are still a long way from big data and artificial intelligence finding rare diseases and proposing tailor-made therapy solutions. Until then, we still have some problems to resolve: we must not confuse correlation with causation, must not put computing power and processing speed above evidence-based approaches, must protect personal data, and must discuss the ethical and moral aspects. But big data already makes data resources transparent and enables professional interpretations. Comparative data can be queried and used for medical analysis, prophylaxis, diagnosis, therapy or aftercare. Big data and artificial intelligence are decisive factors on the way to personalized and qualitatively better medicine. In genome research, with its huge data volume, big data analyses have long been indispensable. They offer new opportunities in many areas of our society, be it in tracking voter behaviour, understanding consumer behaviour, autonomous driving and, of course, medicine.