In this university project, a Prius vehicle was programmed for autonomous driving in a simulated environment using the Robot Operating System (ROS). The vehicle was equipped with a front-facing camera and a 360-degree LIDAR sensor for detecting obstacles and pedestrians. The project involved developing three ROS packages: "opencv_person_detector" for detecting individuals in camera images, "pcl_obstacle_detector" for identifying obstacles in LIDAR point clouds, and "control_barrel_world" for vehicle control based on sensor inputs. The goal was to autonomously navigate a path marked by cones, avoiding obstacles and stopping for pedestrians. The project is showcased in a simulation video.
This project presents an implementation and comparison of two motion planning algorithms, RRT and RRT*, for autonomously navigating a robot in a parking lot scenario. Using a kinematic bicycle model and a PD controller, the algorithms were developed to guide the robot around static obstacles. The findings indicate that while RRT* generates shorter paths, it requires more computation time than RRT. The study concludes with suggestions for future enhancements using advanced variants of these algorithms.
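As a rough illustration of the vehicle side of that setup, the sketch below pairs a rear-axle kinematic bicycle model with a PD steering law that tracks a target heading. It is not the project's code and does not include the RRT or RRT* planners. The wheelbase, gains, steering limit, time step, and all function names are assumptions chosen for the example.

```python
import math


def wrap_angle(angle):
    """Map an angle to the interval [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def bicycle_step(state, v, steer, wheelbase=2.5, dt=0.05):
    """Advance a kinematic bicycle model (rear-axle reference) by one Euler step.

    state is (x, y, yaw); wheelbase and dt are assumed values.
    """
    x, y, yaw = state
    x_new = x + v * math.cos(yaw) * dt
    y_new = y + v * math.sin(yaw) * dt
    yaw_new = wrap_angle(yaw + v / wheelbase * math.tan(steer) * dt)
    return (x_new, y_new, yaw_new)


def pd_steer(err, prev_err, dt, kp=1.0, kd=0.2, max_steer=0.6):
    """PD law on the heading error, clipped to an assumed steering limit."""
    steer = kp * err + kd * (err - prev_err) / dt
    return max(-max_steer, min(max_steer, steer))


def track_heading(target_yaw, state=(0.0, 0.0, 0.0), v=1.0, dt=0.05, steps=400):
    """Drive at constant speed while steering toward target_yaw."""
    # Start with prev_err equal to the first error to avoid a derivative kick.
    prev_err = wrap_angle(target_yaw - state[2])
    for _ in range(steps):
        err = wrap_angle(target_yaw - state[2])
        steer = pd_steer(err, prev_err, dt)
        prev_err = err
        state = bicycle_step(state, v, steer, dt=dt)
    return state


if __name__ == "__main__":
    x, y, yaw = track_heading(0.5)
    print(f"final pose: x={x:.2f} m, y={y:.2f} m, yaw={yaw:.3f} rad")
```

In a planner such as RRT, a step function of this kind would typically be used to extend the tree with motions the vehicle can actually follow, while the controller tracks the resulting path.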
This project introduces an adaptive AI phone application developed for KLM to train sales representatives. Built within a systems-engineering framework, the application simulates realistic customer calls for training purposes. The project began with the conceptualization and enactment of an abstract usage scenario, which was then broken down into detailed deployment and operational phases. Key features include VoIP integration, speech-to-text and text-to-speech conversion, and AI-driven response generation, all specified in a functional hierarchy tree and activity diagrams. The iterative development process prioritized real-time interaction, low latency, and high-quality voice output, culminating in an integrated system combining Vosk, ElevenLabs, ChatGPT 4, and Twilio’s Voice API. The system is designed to prepare sales representatives to handle unpredictable real-life customer interactions with confidence.
This study evaluates the effect of time delay on the performance of a teleoperated robot arm in simulated deep-sea welding tasks. Given the hazardous nature of deep-sea welding, the research explores a safer alternative using teleoperation. The experiment involved 10 participants controlling a robot arm to follow a set trajectory under varying time delays. Performance was measured using mean absolute error and task completion time. Results show a significant increase in both metrics with increased time delays, indicating a decline in operational efficiency. The study highlights the challenges of latency in teleoperated systems and suggests avenues for future research, including the impact of training and trajectory learning on performance.
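To make the error metric concrete, the sketch below computes a mean absolute error as the average Euclidean distance between matching samples of a reference trajectory and a followed trajectory, and imitates a delay by shifting the samples. The study's exact definition and sampling are not given here, so the function names, the sample-wise pairing, and the toy data are assumptions.

```python
import math


def apply_delay(samples, delay_steps):
    """Shift a sequence by delay_steps samples, holding the first value."""
    if delay_steps <= 0:
        return list(samples)
    padded = [samples[0]] * delay_steps + list(samples)
    return padded[:len(samples)]


def mean_absolute_error(reference, actual):
    """Average Euclidean distance between paired trajectory samples."""
    if not reference or len(reference) != len(actual):
        raise ValueError("trajectories must be non-empty and equally long")
    total = sum(math.dist(r, a) for r, a in zip(reference, actual))
    return total / len(reference)


if __name__ == "__main__":
    # Toy straight-line reference, one point per time step (hypothetical data).
    reference = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
    for delay in (0, 1, 2):
        followed = apply_delay(reference, delay)
        print(delay, mean_absolute_error(reference, followed))
```

In this toy setup the error grows with the delay simply because the followed trajectory lags the reference, which is the kind of trend the metric is meant to capture.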
This project, led by Henk Jekel, presents an automated solution for retail store restocking using PDDL (Planning Domain Definition Language) and ROSPlan. It aims to address labor shortages in the retail sector due to an aging population. The system enables a robot to perform restocking tasks in a simulated store environment, determining the appropriate placement of items based on predefined store rules. The solution uses a Python-based ontology for product classification and a PDDL knowledge base for initial environment setup. Despite facing technical challenges in simulation, the project demonstrates the feasibility of using knowledge representation and reasoning for efficient and adaptable automated restocking in retail settings.
In this project, an innovative obstacle avoidance system for drones was developed, focusing on maximizing distance in a 10-minute flying competition. Two approaches, optical flow and color filtering, were investigated for their effectiveness in obstacle detection. Optical flow involved calculating distances to obstacles during circular flight, while color filtering recognized specific object colors. Despite the efficiency of optical flow, color filtering was chosen for the contest, leading to a successful 67-meter flight. The project underscores the potential of advanced navigation systems in enhancing drone safety and efficiency in competitive and practical scenarios.
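The color-filtering idea can be sketched as counting how many pixels in an image region fall inside a target color range. The version below uses HSV thresholds from the Python standard library. The thresholds, the orange target color, and the function names are assumptions for illustration, not the values or code used in the competition.

```python
import colorsys


def matches_color(rgb, hue_range=(0.03, 0.12), min_sat=0.5, min_val=0.3):
    """Return True if an 8-bit RGB pixel lies in the assumed HSV range."""
    r, g, b = (channel / 255.0 for channel in rgb)
    hue, sat, val = colorsys.rgb_to_hsv(r, g, b)
    return hue_range[0] <= hue <= hue_range[1] and sat >= min_sat and val >= min_val


def obstacle_fraction(pixels):
    """Fraction of pixels that match the target color."""
    if not pixels:
        return 0.0
    return sum(matches_color(p) for p in pixels) / len(pixels)


if __name__ == "__main__":
    # Hypothetical image patch: two orange pixels, one green, one grey.
    patch = [(255, 128, 0), (0, 255, 0), (128, 128, 128), (250, 120, 10)]
    print(obstacle_fraction(patch))
```

A controller could compare this fraction against a threshold to decide whether to turn away from the region.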
April 28, 2023
DARTS FOR NEURAL ARCHITECTURE SEARCH
In this project, I developed software to utilize differentiable architecture search (DARTS) for
determining the best building block for a cell. Specifically, I compared three types of blocks:
Fused-MBConv, MBConv, and Depthwise Separable Convolution. The motivation behind the project was
to investigate the developmental process of the Fused-MBConv block, which is a superior
architecture building block used in the state-of-the-art image recognizer, EfficientNetV2,
developed by the Google Brain team. I conducted a differentiable architecture search to evaluate
the performance of these three blocks on the Fashion-MNIST dataset. My research aimed to test
whether the DARTS algorithm would choose the best block among the three types that were evaluated.
However, the findings indicated that for the reduce cell, the algorithm found a mixture of
blocks, and for the normal cell, it only used the weakest block, which is the Depthwise
Separable Convolution, for unknown reasons.
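For readers unfamiliar with DARTS, its core idea is a continuous relaxation: each edge computes a softmax-weighted sum of all candidate operations, and after the search the candidate with the largest architecture weight is kept. The toy sketch below shows that mechanism only. The scalar functions stand in for the three real blocks, and every name and value is hypothetical; it is not the software described above.

```python
import math


def softmax(alphas):
    """Numerically stable softmax over architecture parameters."""
    peak = max(alphas)
    exps = [math.exp(a - peak) for a in alphas]
    total = sum(exps)
    return [e / total for e in exps]


def mixed_op(x, candidate_ops, alphas):
    """Continuous relaxation: softmax-weighted sum of every candidate's output."""
    weights = softmax(alphas)
    return sum(w * op(x) for w, op in zip(weights, candidate_ops.values()))


def discretize(candidate_ops, alphas):
    """After the search, keep the candidate with the largest alpha."""
    names = list(candidate_ops)
    best = max(range(len(alphas)), key=lambda i: alphas[i])
    return names[best]


# Toy scalar stand-ins for the three blocks (hypothetical, not real layers).
CANDIDATES = {
    "fused_mbconv": lambda x: 1.0 * x,
    "mbconv": lambda x: 2.0 * x,
    "dw_sep_conv": lambda x: 3.0 * x,
}

if __name__ == "__main__":
    alphas = [0.1, 0.5, -0.2]  # would be learned by gradient descent in DARTS
    print(mixed_op(2.0, CANDIDATES, alphas))
    print(discretize(CANDIDATES, alphas))
```

Because the final choice depends only on the ranking of the learned weights, a block can be selected even when its weight is only marginally larger than the others.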
July 1, 2021
The use of deep learning for person detection and gender classification
using RGB images to support the visually impaired.
This paper presents a deep learning approach to help the visually impaired with an object detection
task: recognising the gender of people in their immediate surroundings. Using images from an RPI
WWCAM2 monocular camera, the person is first detected, i.e. localized in the image, and then classified as
one of the two genders. To perform gender detection in real time, the use of transfer learning together
with a single-stage object detection algorithm was investigated. Based on the number of processed frames per
second (FPS) and the mean average precision (mAP), it was concluded that fine-tuning a pre-trained YOLOv4
algorithm on customized versions of the Pascal VOC 2007 dataset and the CelebA dataset is best suited for this
task.
Abstract: In the field of computer vision, pre-trained models have gained renewed attention,
including ImageNet supervised pre-training. Recent studies have highlighted the enduring
significance of the Lottery Ticket Hypothesis (LTH) in the context of classification,
detection, and segmentation tasks. Inspired by this, we set out to explore the potential of LTH
in the pre-training paradigm of depth estimation. Our aim is to investigate whether we can
significantly reduce the complexity of pre-trained models without compromising their downstream
transferability in the depth estimation task. We fine-tune the sparse pre-trained networks
obtained through iterative magnitude pruning and test whether they transfer to the depth
estimation task with performance comparable to that of fine-tuning the full pre-trained
model. Our findings are still inconclusive.
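Iterative magnitude pruning, the procedure named in the abstract, repeatedly removes a fraction of the smallest-magnitude weights that are still active, usually retraining or rewinding in between. The sketch below shows the masking logic on a flat list of weights. The pruning fraction, the number of rounds, and the optional `retrain` hook are assumptions, and the training step itself is left out.

```python
def prune_round(weights, mask, fraction):
    """Zero the mask for the smallest-magnitude weights that are still active."""
    alive = sorted((abs(w), i) for i, (w, m) in enumerate(zip(weights, mask)) if m)
    n_prune = int(len(alive) * fraction)
    new_mask = list(mask)
    for _, index in alive[:n_prune]:
        new_mask[index] = 0
    return new_mask


def iterative_magnitude_pruning(weights, rounds, fraction=0.2, retrain=None):
    """Run several pruning rounds and return the final binary mask.

    retrain is a hypothetical hook called as retrain(weights, mask) before each
    round; it stands in for the training or rewinding step, which is omitted.
    """
    mask = [1] * len(weights)
    for _ in range(rounds):
        if retrain is not None:
            weights = retrain(weights, mask)
        mask = prune_round(weights, mask, fraction)
    return mask


if __name__ == "__main__":
    weights = [0.9, -0.1, 0.5, 0.05, -0.7, 0.3, -0.02, 0.8, 0.2, -0.6]
    mask = iterative_magnitude_pruning(weights, rounds=2)
    print(mask, f"sparsity={1 - sum(mask) / len(mask):.0%}")
```

The resulting mask defines the sparse sub-network that would then be fine-tuned on the downstream task.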
The Albert project developed a versatile robotic system for supermarkets, designed to handle both online and in-store customer orders. Featuring an advanced ChatGPT-powered voice interaction system, Albert efficiently processes and responds to customer requests. Its autonomous capabilities include identifying, picking, and placing products, managed by the FlexBE state machine. Equipped with sensors like lidar and stereo cameras, it navigates safely around obstacles and customers. While testing has shown promising results in both simulations and real-world scenarios, further refinement is necessary to address remaining challenges before widespread implementation in supermarkets.
Future Projects