Informatik/Technik
Dauerhafte URI für die Sektionhttps://epub.uni-luebeck.de/handle/zhb_hl/4
Listen
Auflistung Informatik/Technik nach Schlagwort "CNNs"
Gerade angezeigt 1 - 1 von 1
- Treffer pro Seite
- Sortieroptionen
Item Novel Machine Learning Methods for Video Understanding and Medical Analysis(2025-06-26) Hu, YaxinArtificial intelligence has developed rapidly over the past decade and has penetrated into nearly every aspect of life. New applications in areas such as human-computer interaction, virtual reality, autonomous driving and intelligent medical systems have emerged in large numbers. Video is a kind of high-dimensional data, which has one more dimension than images, requiring more computing resources. As more and more high-quality large-scale video datasets are released, video understanding has become a cutting-edge research direction in the computer vision community. Action recognition is one of the most important tasks in video understanding. There are many successful network architectures for video action recognition. In our work, we focus on proposing new designs and architectures for video understanding and investigating their applications in medicine. We introduce a novel RGBt sampling strategy to fuse temporal information into single frames without increasing the computational load and explore different color sampling strategies to further improve network performance. We find that frames with temporal information obtained by fusing the green channels from different frames achieve the best results. We use tubes of different sizes to embed richer temporal information into tokens without increasing the computational load. We also introduce a novel bio-inspired neuron model, the MinBlock, to make the network more information selective. Furthermore, we propose a spatiotemporal architecture that slices videos in space-time and thus enables 2D-CNNs to directly extract temporal information. All the above methods are evaluated on at least two benchmark datasets and all perform better than the baselines. We also focus on applying our networks in medicine. We use our slicing 2D-CNN architecture for glaucoma and visual impairments analysis. And we find that visual impairments may affect walking patterns of humans thus making the video analysis relevant for diagnosis. We also design a machine learning model to diagnose psychosis and show that it is possible to predict whether clinical high-risk patients would actually develop a psychosis.