Streaming Data Regression
- Publication Type: Thesis
- Issue Date: 2020
This item is open access.
Regression analysis is one of the most important tasks in machine learning. It is a predictive modeling technique that investigates the relationship between dependent and independent variables. However, most regression algorithms, whether for linear or nonlinear regression, were designed for batch datasets. Technological advances now make it possible to access fast and potentially infinite data, known as streaming data. Streaming data arrive as a sequence and can be read only once, in a predetermined order, so batch regression algorithms cannot be used to process them. Streaming algorithms are a newer class of machine learning techniques in which data are processed sequentially and examined in only a few passes (typically just one). Streaming algorithms therefore offer a dynamic approach to both supervised and unsupervised learning.
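As a rough illustration of the single-pass constraint described above, the following minimal sketch fits a linear model by updating it once per sample and then discarding the sample. It uses plain stochastic gradient descent on squared error, which is a generic textbook technique and not one of the algorithms proposed in this thesis. The names `stream` and `fit_one_pass`, the synthetic target y = 2x + 1, and the learning rate are assumptions made only for this example.

```python
import random

def stream(n, seed=0):
    """Hypothetical data source: yields (x, y) one pair at a time,
    with y = 2*x + 1 plus small Gaussian noise."""
    rng = random.Random(seed)
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        yield x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.01)

def fit_one_pass(samples, lr=0.1):
    """Fit y ~ w*x + b, seeing each sample exactly once and storing none."""
    w, b = 0.0, 0.0
    for x, y in samples:
        err = (w * x + b) - y   # prediction error on the current sample
        w -= lr * err * x       # gradient step for the slope
        b -= lr * err           # gradient step for the intercept
    return w, b

w, b = fit_one_pass(stream(5000))
print(w, b)  # estimates of the slope and intercept
```

Memory use is constant regardless of stream length, which is the property that batch algorithms lack.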
However, as a novel learning technique, the streaming algorithm is still immature for the regression problem. Firstly, most existing streaming regression algorithms can handle only precise data. Secondly, a growing number of studies on streaming data show that the data distribution is subject to concept drift. Finally, in many real-world applications the regression problem for streaming data is more complicated, because two or more outputs, rather than a single output, need to be predicted. To solve these problems, we propose several new streaming regression algorithms. Specifically, to address streaming data regression in a noisy environment, we propose a novel online regression algorithm called online robust support vector regression (ORSVR). We also propose an online topology learning algorithm, the Gaussian membership-based self-organizing incremental neural network (Gm-SOINN), to filter noisy data in the data preprocessing stage. To address streaming data regression in evolving environments, we propose continuous support vector regression (C-SVR) for nonstationary streaming data. Because evolving streaming data regression has been a consistent research topic in the fuzzy systems community, we further propose a novel evolving fuzzy-neuro system, the topology learning-based fuzzy random neural network (TLFRNN). Finally, to address multiple-output regression on streaming data, we present an online multi-output regression system called MORStreaming.
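To make the noise and drift challenges above concrete, the following sketch runs online subgradient descent on the epsilon-insensitive loss used in support vector regression. It is a generic illustration and should not be read as ORSVR, C-SVR, or any other algorithm from this thesis. The function names, the synthetic stream (a relationship that changes halfway through, with 5% large outliers), and all parameter values are assumptions chosen for the example.

```python
import random

def drifting_stream(n, seed=0):
    """Hypothetical stream: y = 2x + 1 for the first half, then y = -x + 3
    (an abrupt concept drift), with occasional large outliers."""
    rng = random.Random(seed)
    for t in range(n):
        x = rng.uniform(-1.0, 1.0)
        slope, intercept = (2.0, 1.0) if t < n // 2 else (-1.0, 3.0)
        y = slope * x + intercept + rng.gauss(0.0, 0.01)
        if rng.random() < 0.05:            # 5% of samples are corrupted
            y += rng.choice([-50.0, 50.0])
        yield x, y

def eps_insensitive_sgd(samples, eps=0.1, lr=0.01, lam=1e-4):
    """One-pass linear regression with the epsilon-insensitive loss
    max(0, |residual| - eps) plus a small L2 penalty on the slope."""
    w, b = 0.0, 0.0
    for x, y in samples:
        r = (w * x + b) - y
        # Subgradient is sign(r) outside the eps-tube and 0 inside it,
        # so a single outlier moves the model by at most one bounded step.
        g = 0.0 if abs(r) <= eps else (1.0 if r > 0 else -1.0)
        w -= lr * (g * x + lam * w)
        b -= lr * g
    return w, b

w, b = eps_insensitive_sgd(drifting_stream(8000))
print(w, b)  # should track the post-drift relationship y = -x + 3
```

Because the step size is constant and each update is bounded, the model keeps adapting after the change point and is only mildly disturbed by outliers. A squared-error update would instead take a step proportional to each outlier's magnitude.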