In the past decade, deep learning (DL) has taken the world by storm. It has produced significant results in a wide variety of applications ranging from self-driving cars to natural language processing (NLP). Modern deep learning is built from a number of different algorithms including artificial neural networks (ANN), optimisation algorithms, back-propagation (BP), and varying levels of supervision. Recent advances in GPU hardware, improved availability of large, high-quality datasets, and the development of modern training algorithms have all played a pivotal role in the emergence of modern deep learning. These advances have made it easier to train and deploy deeper neural networks that exhibit great generalisation and state-of-the-art (SOTA) results.
Scene understanding is a critical topic in computer vision. In recent years, semantic segmentation and monocular depth estimation have emerged as two key methods for achieving this goal. The combination of these two tasks enables a sys
Accurate detection of pedestrian lanes is a crucial requirement for vision-impaired people to navigate freely and safely. Current deep learning methods have achieved reasonable accuracy at this task. However, they are impractical for real-time pedestrian lane detection because of a suboptimal trade-off between accuracy, speed, and model size. Hence, an optimized deep neural network (DNN) for pedestrian lane detection is required. Designing a DNN from scratch is a laborious task that requires significant experience and time. This paper proposes a novel neural architecture search (NAS) algorithm, named MSD-NAS, to automate this laborious task. The proposed method designs an optimized deep network with multi-scale input branches, allowing the derived network to utilize local and global contexts for predictions. The search is also performed in a large and generic space that includes many existing hand-designed network architectures as candidates. To further boost performance, we propose a Short-ter
Pedestrian lane detection is a crucial task in assistive navigation for vision-impaired people. It can provide information on walkable regions, help blind people stay on the pedestrian lane, and assist with obstacle detection. An accurate and real-time lane detection algorithm can improve travel safety and efficiency for the visually impaired. Despite its importance, pedestrian lane detection in unstructured scenes for assistive navigation has not attracted sufficient attention in the research community. This paper aims to provide a comprehensive review and an experimental evaluation of methods that can be applied for pedestrian lane detection, thereby laying a foundation for future research in this area. Our study covers traditional and deep learning methods for pedestrian lane detection, general road detection, and general semantic segmentation. We also perform an experimental evaluation of the representative methods on a large benchmark dataset that is specifically created for pedes
Depth estimation is an essential component in computer vision systems for achieving 3D scene understanding. Efficient and accurate depth map estimation has numerous applications including self-driving vehicles and virtual reality tools. This paper presents a new deep network, called D-Net, for depth estimation from a single RGB image. The proposed network can be trained end-to-end, and its structure can be customised to meet different requirements in model size, speed, and prediction accuracy. Our approach gathers strong global and local contextual features at multiple resolutions, and then transfers these to high resolutions for clearer depth maps. For the encoder backbone, D-Net can utilise many state-of-the-art models including EfficientNet, HRNet and Swin Transformer to obtain dense depth maps. The proposed D-Net is designed to have minimal parameters and reduced computational complexity. Extensive evaluations on the NYUv2 and KITTI benchmark datasets show that our model is highly
A new robotic cane featuring inertial sensors has been developed with the potential to help people with visual impairments navigate indoor environments.