- A Review of Forest Fire Combating Efforts, Challenges and Future Directions in Peninsular Malaysia, Sabah, and Sarawak (2022-09-01)
  Yee Jian Chew; Shih Yin Ooi; Ying Han Pang; Kok-Seng Wong
  Abstract: Forest covers most of Malaysia's land surface. For decades, forest fires have been one of the nation’s most concerning environmental issues. With the advent of machine learning, many studies have been conducted to resolve forest fire issues. However, the findings and results have been very case-specific. Most experiments have focused on particular regions with independent methodology settings, which has hindered the ability of others to reproduce the work. Another major challenge is the lack of benchmark datasets in this domain, which has made benchmark comparisons almost impossible to conduct. To the best of our knowledge, no comprehensive review and analysis has been performed to streamline the research direction for forest fires in Malaysia. Hence, this paper aims to review all works on combating forest fire issues in Malaysia from 1989 to 2021. With the proliferation of publicly accessible satellite data in recent years, a new direction of utilising big data platforms has been postulated. The merit of this approach is that the methodology and experiments can be reproduced. Thus, it is strongly believed that the findings and analysis shown in this paper will be useful as a baseline to propagate research in this domain.
- Benchmarking full version of GureKDDCup, UNSW-NB15, and CIDDS-001 NIDS datasets using rolling-origin resampling (2021)
  Yee Jian Chew; Nicholas Lee; Shih Yin Ooi; Kok-Seng Wong; Ying Han Pang
  A network intrusion detection system (NIDS) is a system that analyses network traffic to flag malicious traffic or suspicious activities. Several recent NIDS datasets have been published; however, the lack of baseline experimental results on the full versions of these datasets has made it difficult for researchers to perform benchmarking. As the train-test distribution of the datasets has yet to be predefined by the creators, it is even harder for researchers to compare performance across machine classifiers without bias. Moreover, cross-validation resampling schemes have been reported in the literature to be inappropriate in the NIDS domain. Thus, rolling origin, a standard resampling technique that is also known as a common cross-validation scheme in the forecasting domain, is employed to allocate the training and testing distributions. In this paper, rigorous experiments are conducted on the full versions of three recent NIDS datasets: GureKDDCup, UNSW-NB15, and CIDDS-001. While the datasets chosen might not be the latest available, we have selected them because they include the essential IP address fields, which are usually missing or removed due to privacy concerns. To deliver the baseline empirical results, 10 well-known classifiers from Weka are utilized.
- Improving multi-label text classification using weighted information gain and co-trained Multinomial Naïve Bayes classifier (2022)
  Kok-Seng Wong; Wandeep Kaur; Vimala Balakrishnan
  Over recent years, the emergence of electronic text processing systems has generated a vast amount of structured and unstructured data, leaving users to rummage through irrelevant information. Therefore, studies are continually looking to improve the classification process to produce more accurate results that would benefit users. This paper looks into the weighted information gain method, which re-assigns new weights to wrongly classified features to provide better classification. The method focuses on the weights of the frequency bins, assuming that every time a certain word frequency bin is iterated, it provides information on the target word feature. Therefore, the more iterations and weight re-assignments occur within the bin, the more important the bin becomes, eventually providing better classification. The proposed algorithm was trained and tested using a corpus extracted from dedicated Facebook pages related to diabetes. The output of the weighted information gain feature selection technique is then fed into a co-trained Multinomial Naïve Bayes classification algorithm that captures the labels' dependencies. The algorithm incorporates class value dependencies since the dataset used multi-label data before converting string vectors that allow the sparse distribution between features to be minimized, thus producing more accurate results. The results of this study show an improvement in classification to 61%.
- Emerging Privacy and Trust Issues for Autonomous Vehicle Systems (2022)
  Kok-Seng Wong; Hung Nguyen; Giang Vu; Lan Tran
  With the rise of cutting-edge technology, companies such as Apple, Waymo, and Tesla are racing to launch the industry’s first fully autonomous car. Besides technical challenges such as safety and infrastructure, privacy and data protection have attracted the attention of the autonomous vehicle industry and researchers. In particular, it is hard for autonomous vehicle manufacturers to impose substantive privacy and security protections when different vendors and suppliers are involved in vehicle production. Although we know how much data autonomous vehicles will generate per day, there is a lack of knowledge of how the collected data will be used (e.g., real-time broadcasting and offline analytics). The privacy risks associated with data collection raise individual concerns in autonomous vehicle systems. For instance, when location information is combined with personal information, a person’s details such as wealth status, profession, sexual association, and religion can be deduced. The misuse of present and historical travel patterns also leaves someone susceptible to physical harm or stalking. Driven by mutual benefits or regulations, specific data must be shared in real time or published for analysis or research purposes. This paper discusses the emerging privacy and trust issues that are essential to motivate the acceptance of autonomous vehicles operating on public roads.
- A Privacy-Preserving Framework for Surveillance Systems (2020)
  Kok-Seng Wong; Tu Nguyen; Anuar Maratkhan; M. Fatih Demirci
  The ability to visually track people present in the scene is essential for any surveillance system. However, the widespread deployment and increased advancement of video surveillance systems have raised public awareness of privacy, i.e., of human identity in the videos. Existing indoor surveillance systems allow people to be watched remotely and recorded continuously but do not prevent any party from viewing activities and collecting personal visual information of people in the videos. To address this problem, we propose a privacy-preserving framework to provide each user (e.g., parents) with a personalized video where the user sees only selected target subjects (e.g., child, teacher, and intruder) while other faces are dynamically masked. The primary services in our framework consist of a video streaming service and a personalized service. The video streaming service is responsible for detecting, segmenting, recognizing, and masking face images of the human subjects in the video. Notably, it classifies human subjects into insider and outsider classes and then applies the de-identification (i.e., masking) to those in the insider class, including the target subjects. Subsequently, the personalized service receives the visual information (i.e., masked and unmasked faces) from the streaming service and processes it at the user’s mobile device. The output is then a personalized video for each user. For security reasons, we require surveillance videos to be stored in the cloud in encrypted form. To ensure an individual remains anonymous in a group, we propose a dynamic masking approach to mask the human subjects in the video. Our framework can deliver both reliable visual privacy protection and video utility. For instance, users can have confidence that their target subjects are anonymized in other views. To utilize the personalized video, users can use analytics software installed on their mobile devices to analyze the activities of their target subjects.
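
The rolling-origin resampling named in the NIDS benchmarking abstract above can be illustrated with a minimal sketch. It assumes the records are already sorted by time and uses an expanding training window. The function name `rolling_origin_splits` and the parameters `initial_train`, `horizon`, and `step` are hypothetical and are not taken from the paper, whose exact configuration the abstract does not state.

```python
from typing import Iterator, List, Tuple


def rolling_origin_splits(
    n_samples: int, initial_train: int, horizon: int, step: int = 1
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) pairs over time-ordered records.

    Hypothetical helper: each split trains on everything before the
    origin and tests on the next `horizon` records, then moves the
    origin forward by `step`.
    """
    if initial_train < 1 or horizon < 1 or step < 1:
        raise ValueError("initial_train, horizon and step must be >= 1")
    origin = initial_train
    while origin + horizon <= n_samples:
        train = list(range(origin))                    # all records before the origin
        test = list(range(origin, origin + horizon))   # the next `horizon` records
        yield train, test
        origin += step                                 # roll the origin forward


# Illustrative values only: 10 records, 4 for the first training window.
splits = list(rolling_origin_splits(n_samples=10, initial_train=4, horizon=2, step=2))
for train_idx, test_idx in splits:
    print(f"train 0..{train_idx[-1]}  test {test_idx}")
```

Unlike shuffled k-fold cross-validation, every test record here comes strictly after every training record, so a classifier is never evaluated on traffic that precedes its training data.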