Authors: Rohit Trivedi; Sandipan Patra; Shafi Khadem
The data-driven (DD) approach systematically improves both the data and the model by deriving or adding features that address problems identified during the iterative loop of forecasting model development. This article proposes a DD framework for forecasting short-term PV generation and load demand. The framework has three stages, each with a distinct contribution: generalized data preprocessing (stage-1), multivariate feature generation and selection (stage-2), and model hyperparameter tuning (stage-3) for further improvement in forecasting. It addresses the data as well as the forecasting models. The whole process is analyzed using measured time-series data collected from a real-life demonstration project in Ireland. Data preprocessing is generalized so that generation and demand forecasting share the same framework. Relevant features are selected with the proposed random forest sequential forward feature selection algorithm, and hyperparameters are tuned with the tree-structured Parzen estimator algorithm. In addition, the performance of the classical autoregressive integrated moving average (ARIMA) model is compared with the machine learning-based gated recurrent unit, long short-term memory, recurrent neural network, and convolutional neural network models. Results show that the data-driven forecasting framework systematically improves model performance. Seasonal variation also has a strong impact on model performance.
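To make the stage-2 selection step concrete, the following is a minimal sketch of greedy sequential forward feature selection, not the authors' implementation. The function name `sequential_forward_selection`, the `score_fn` callable, the `min_gain` threshold, and the feature names are all assumptions introduced for illustration. In the proposed framework the score would presumably come from a random forest evaluated on validation data; here a hypothetical toy scoring function stands in so that the example stays self-contained.

```python
from typing import Callable, List, Sequence, Tuple


def sequential_forward_selection(
    candidates: Sequence[str],
    score_fn: Callable[[List[str]], float],
    min_gain: float = 0.0,
) -> Tuple[List[str], float]:
    """Greedily add the feature that most improves score_fn (higher is better)."""
    selected: List[str] = []
    best_score = float("-inf")
    remaining = list(candidates)
    while remaining:
        # Score each remaining candidate when added to the current subset.
        trials = [(score_fn(selected + [f]), f) for f in remaining]
        trial_score, trial_feature = max(trials, key=lambda t: t[0])
        # Stop once the best addition no longer improves the score enough.
        if trial_score - best_score <= min_gain:
            break
        selected.append(trial_feature)
        remaining.remove(trial_feature)
        best_score = trial_score
    return selected, best_score


def toy_score(subset: List[str]) -> float:
    # Hypothetical stand-in for a validation score from a fitted model:
    # each useful feature adds value, and every feature carries a small cost.
    useful = {"irradiance": 0.6, "temperature": 0.2, "hour_of_day": 0.1}
    return sum(useful.get(f, 0.0) for f in subset) - 0.05 * len(subset)


features = ["wind_speed", "temperature", "irradiance", "hour_of_day"]
selected, score = sequential_forward_selection(features, toy_score)
print(selected)
```

With this toy score, the uninformative `wind_speed` feature is never added because including it lowers the score. Replacing `toy_score` with a function that fits a model on the chosen columns and returns a negated validation error would turn the sketch into a wrapper-style selector of the kind the abstract describes.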