Promoting Wind Energy by Robust Wind Speed Forecasting Using Machine Learning Algorithms Optimization

Abstract: Accurate, efficient, and stable wind prediction systems for wind turbines are critical to ensuring the operational safety and optimum design of power systems. This study carried out hyperparameter fine-tuning of ten Machine Learning (ML) models to identify the best short-term wind speed forecasting model, evaluating Root-Mean-Square Error (RMSE), Mean Absolute Error (MAE), correlation, and runtime. The Random Forest (RF) and Gradient-Boosted Tree (GBT) models had the best overall performance; however, RF has a much longer training time than GBT. This paper's findings can assist researchers and practitioners in developing the most effective data-driven methods for forecasting wind speed and generated power.


Introduction
Significant concerns about the depletion of oil and gas supplies and about global warming have made it imperative to seek energy from wind 1) and other renewable sources 2,3). In addition to location 4,5) and wind turbine selection 6,7), wind speed forecasting has a substantial impact on energy development and on optimizing production. However, the intermittency, fluctuation, and nonlinearity of wind series make it challenging to estimate wind speed accurately 8). ML algorithms are an option for forecasting wind speed data. ML is an area of computer science concerned with creating methods and algorithms for data-driven learning and prediction. This subset of artificial intelligence has enabled engineers to develop and implement solutions for complex challenges in various fields 9-11).
ML algorithms have been used in numerous studies to predict wind at various time scales, including the very short term (a few seconds to 30 minutes ahead), the short term (30 minutes to 6 hours ahead), the medium term (6 hours to 1 day ahead), and the long term (1 day to 1 week ahead) 12). Moreover, many forecasting studies have been conducted in various fields using ML algorithms. One of the most popular ML techniques is the Artificial Neural Network (ANN), which has been applied in several scientific domains, e.g., accurate estimation and continuous forecasting of pollution levels 13), the waste prediction problem 14), fabrication process optimization 15), logistic regression and risk tolerance behavior forecasting 16), an inquiry into the process factors of a biomass gasifier 17), prediction of total and average monthly night traffic on Serbian state roads 18), and mechanical property prediction from structure information 19).
In terms of wind speed and power forecasting, numerous concepts have been adopted. Fuzzy logic 20), neural networks 21), and statistical models 22) can be applied in ML. Deep neural networks have been combined with transfer learning to smooth out rapid transients in wind power prediction behavior 23). At the same time, regression models have engaged neural networks and methods such as the wavelet transform, particle swarm optimization 20), and the k-Nearest Neighbors (kNN) algorithm 24). Moreover, wind speed predictions have also been constructed using the Support Vector Machine (SVM) and its variant, the Least-Squares SVM (LSSVM) 25). The effectiveness of a Support Vector Regression (SVR) model based on heteroscedastic Gaussian noise in predicting wind production has been established 26). Several models, such as SVR, RF, Extreme Gradient Boosting, kNN, and the least absolute shrinkage and selection operator, have been compared for predicting wind power values and show decent accuracy across several sites 27).
Furthermore, many researchers utilize ensemble and hybrid techniques to enhance the accuracy of their models. A hybrid system for monthly wind speed forecasting was introduced, combining linear and nonlinear models through a nonlinear combination approach 8). This method employs a data-driven intelligent model to determine the most effective combination, thereby improving the precision of wind speed predictions. Similarly, Nascimento proposed a hybrid model for precise wind speed prediction, incorporating time series and artificial intelligence while considering external factors such as pressure, temperature, and precipitation to enhance accuracy 28). In another study, six variations of unorganized machine models, including the Extreme Learning Machine (ELM) and Echo State Network (ESN) with or without regularization coefficients, were examined 29). The results indicated that the unorganized machines and ELM-based ensembles consistently outperformed linear models in all simulations, demonstrating superior overall performance.
It is essential to set the hyperparameters of an algorithm correctly to build effective ML prediction models 30). Because of their effect on model performance, and because the best set of values is unknown in advance, hyperparameter setting has emerged as an important and challenging problem in applying ML algorithms 31). Therefore, choosing and fine-tuning the ML model is crucial, much as in physical methods, where model selection significantly impacts forecast accuracy 32). However, there are currently no reliable recommendations for choosing a model, predictor, or set of hyperparameters.
In some research studies, intelligent systems leverage swarm intelligence algorithms to find the most suitable ML hyperparameters. A combined forecasting system utilizes four ANNs with optimal weighting coefficients determined by the MSSO 33). The strategy boosts model performance by concurrently optimizing accuracy and stability. Similarly, a hybrid approach that merges swarm intelligence optimization with artificial intelligence has been adopted for predicting wind speeds 34). This method employs wavelet packet decomposition to compress and denoise the data, resulting in a faster and more precise model. Lastly, a versatile forecasting framework that combines the empirical wavelet transform and neural-network-based quantile regression to enhance the robustness and generalization of wind speed predictions has also been developed 35).
A brief review of numerous approaches to forecasting wind speed and power is provided in Table 1. Most current wind speed forecasting studies employ restricted versions of statistical approaches. The most utilized strategies are the SVR and ANN algorithms. As for evaluation, there is no universal agreement on the technique, as authors utilize different error measures. RMSE and its normalized versions stand out from the rest due to their extensive use. Regarding temporal horizons, most research has been undertaken for the next day, and only a few studies address shorter periods.
The lead time of the applicable forecast is rarely provided 36). Hence, only a few attempts have been made to assess wind speed at brief intervals. It has been demonstrated that short-term tactics increase resource coordination in real-time or near-real-time energy dispatching systems 37). In contrast, our approach implements short-term (15-minute) wind speed forecasting using multiple ML models. This work intends to close a gap in the literature by evaluating ten well-known ML models for short-term wind speed forecasting. This study's findings can guide the selection of models and hyperparameters for wind speed forecasting in various applications and studies.
The remainder of the article is structured as follows. The second section summarizes pertinent related works. The third section presents data and model development; the materials and procedures are detailed accordingly, along with exploratory data analysis, cleansing, and hyperparameter selection and tuning. The following section describes and contrasts the optimal models created through the various ML approaches. Finally, the fifth section closes the paper with conclusions and future research directions.

Materials and methods
The research methods in this paper followed the stages shown in Fig. 1. In general, the research was carried out in four stages, namely: 1) data collection, 2) data preparation, 3) model development using RapidMiner, and 4) analysis of experimental results.

Dataset input
This study used data from a meteorology mast in Baron Technopark, Daerah Istimewa Yogyakarta, Indonesia. The site has wind speeds with potential for power generation 38,39). The measurements were conducted from June 2017 to March 2019 40). The weather station measured and recorded wind speed, wind direction, solar radiation, ambient temperature, ambient humidity, and atmospheric pressure every fifteen minutes at an altitude of 30 meters.

Data preparation and science tool
Data preparation is intended to clean, transform, and filter data to prepare it for modeling. Nevertheless, a complete series must consider the impacts of seasonality and trends. Consequently, it is necessary to approximate the missing values. Therefore, the data were imported into Windographer to calculate the monthly average values. Then, the missing and outlier values were replaced with these averages, considering the seasonal patterns of the period in which they occur. Furthermore, the data for the experiment were grouped into two classes: yearly and quarterly. In addition, one month's data was extracted to test the fitness of the model on the dataset source. Qualified data series can then be processed using RapidMiner, a well-established end-to-end Java-based analytic tool for text mining, data mining, predictive analytics, and business analytics 56), to achieve an alternative rapid way of prediction 57). This solution has been implemented in multiple contexts and is today one of the most popular standalone open-source solutions; it is a market leader in its industry 58). Moreover, according to a performance study, RapidMiner outperforms rival data mining software while consuming less RAM 59).
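The monthly-average imputation described above can be sketched as follows. This is an illustrative pandas analogue of the Windographer step used in the study, not the actual tool; the series values and timestamps are hypothetical:

```python
import numpy as np
import pandas as pd

def impute_with_monthly_means(series: pd.Series) -> pd.Series:
    """Replace missing values with the mean of the same calendar month,
    preserving the seasonal pattern of the period in which they occur."""
    monthly_means = series.groupby(series.index.month).transform("mean")
    return series.fillna(monthly_means)

# Hypothetical 15-minute wind speed series with two gaps
idx = pd.date_range("2017-06-01", periods=8, freq="15min")
ws = pd.Series([5.1, 4.8, np.nan, 5.3, 5.0, np.nan, 4.9, 5.2], index=idx)
filled = impute_with_monthly_means(ws)
```

Because the gaps here fall in June, they are filled with the June mean of the observed values, so the seasonal level of that month is respected.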

Models development
The following part discusses how RapidMiner was used in this paper, as shown in the enclosed dashed line in Fig. 1. The first step in exploiting RapidMiner is developing the model configurations, as shown in Fig. 2, by dragging them from the selected menu. The following steps are attaching data and experimenting.

Data pre-processing
The read-data step in RapidMiner imports spreadsheet data by following the import configuration wizard instructions. The targeted data were chosen and formatted into the "real" type, and the wind speed role was changed into a label. The next step is windowing, which transforms the data series into smaller, manageable subsets for processing and analysis.
The windowing operator works by sliding "windows" of a specified size through the series, advancing by a step size each time, to produce the anticipated label, i.e., the attribute value found a horizon of steps after the window end 60). The term "horizon" refers to the number of steps ahead that must be forecasted; a model is trained to predict this future value from the window. The step size affects the proportion of the input set fed into the ML process. For this experiment, the window size was 5, the step size was 1, and the horizon was 1.
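The windowing transform can be illustrated with a minimal plain-Python sketch (an analogue of RapidMiner's Windowing operator, not its actual implementation; the sample speeds are hypothetical):

```python
def window_series(series, window_size=5, step_size=1, horizon=1):
    """Slide a window over the series; each example pairs `window_size`
    consecutive values with the label found `horizon` steps after the
    window ends."""
    examples = []
    i = 0
    while i + window_size + horizon - 1 < len(series):
        features = series[i:i + window_size]
        label = series[i + window_size + horizon - 1]
        examples.append((features, label))
        i += step_size
    return examples

speeds = [5.1, 4.8, 5.3, 5.0, 4.9, 5.2, 5.4]
pairs = window_series(speeds)  # window=5, step=1, horizon=1, as in the study
```

With seven values, a window of 5, and a horizon of 1, this yields two training examples, each labeled by the value immediately after its window.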
Moreover, the dataset was separated into training, testing, and validation groups to derive the ML performance. Data separation was performed using the data splitting menu.

Model training and testing
Following completion of the data engineering and preprocessing stages, the dataset was subjected to cross-validation, as shown in the enclosed dotted line in Fig. 1. This study employed a 10-fold cross-validation technique, with the data split into 80% training and 20% testing, to assess the accuracy of the model. This technique tests the datasets against the model in ten iterations, thereby obtaining the average accuracy percentage. This approach allows evaluating the model's performance across specific segments of the dataset as well as the complete dataset. The ML modules used for training in this experiment incorporate the cross-validation operator, which furnishes a collection of performance metrics based on the stated performance configurations for each ML technique.
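The fold construction behind 10-fold cross-validation can be sketched as follows (a minimal NumPy analogue of RapidMiner's cross-validation operator, not the operator itself; the sample count of 800 is hypothetical and stands for the 80% training portion):

```python
import numpy as np

def kfold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) index pairs for k-fold cross-validation:
    each sample appears in exactly one held-out fold."""
    idx = np.arange(n_samples)
    np.random.default_rng(seed).shuffle(idx)
    folds = np.array_split(idx, k)
    for i in range(k):
        # Train on the other k-1 folds, validate on fold i
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, folds[i]

# 80% of the data goes to training; 10-fold CV then runs on that portion
n_train = 800
splits = list(kfold_indices(n_train, k=10))
```

Each of the ten iterations fits the model on nine folds and evaluates on the remaining one; averaging the ten scores gives the reported accuracy.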

Tested ML techniques
Ten ML models were used in RapidMiner to forecast wind speed, namely LM, GLM, DL, RF, GBT, DT, ANN, kNN, SVM, and SVR. The model development and procedure using RapidMiner Studio, along with their embodied operators, are shown in Fig. 2. Conversely, Table 2 shows the hyperparameter tuning used by each algorithm. Hyperparameter values were determined following the babysitting method 61).

Metrics for model performance evaluation
This study evaluates the performance of the deployed prediction models utilizing a set of assessment metrics. In comparing the efficacy of the models for projecting energy use and generation, the following indicators and accompanying formulas are employed: (Eq. 1) RMSE, to assess the precision of model prediction performance; (Eq. 2) MAE, to characterize the mean of the absolute differences between projected and actual values; and (Eq. 3) correlation, to depict how well trends in the anticipated values track trends in the actual values.

RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}   (Eq. 1)

MAE = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|   (Eq. 2)

Correlation = \frac{\sum_{i=1}^{n}(y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})}{\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2 \sum_{i=1}^{n}(\hat{y}_i - \bar{\hat{y}})^2}}   (Eq. 3)

where y_i is the actual value, \hat{y}_i the forecasted value, \bar{y} and \bar{\hat{y}} the means of the actual and forecasted values, and n the number of data samples.

The fourth indicator is runtime, which describes the model's computational requirements; it is calculated as the sum of the training and prediction times. The procedure used a standard laptop with 8 GB of RAM and an Intel® Core™ i7-3537U CPU. In addition, the experiments applied various sampling types in the Split Data operator, namely automatic, linear, shuffled, and stratified sampling. That approach ensures that the models' performances are robust, reliable, and capable of generalizing to different data scenarios.
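The three error metrics defined in Eqs. 1-3 can be computed directly; a minimal NumPy sketch (the sample actual/forecast values are hypothetical):

```python
import numpy as np

def evaluate(actual, forecast):
    """Return RMSE (Eq. 1), MAE (Eq. 2), and Pearson correlation (Eq. 3)."""
    a = np.asarray(actual, dtype=float)
    f = np.asarray(forecast, dtype=float)
    rmse = np.sqrt(np.mean((a - f) ** 2))       # root of mean squared error
    mae = np.mean(np.abs(a - f))                # mean absolute error
    corr = np.corrcoef(a, f)[0, 1]              # Pearson correlation
    return rmse, mae, corr

rmse, mae, corr = evaluate([5.0, 4.8, 5.3, 5.1], [5.1, 4.7, 5.4, 5.0])
```

Here each forecast deviates by 0.1 m/s, so RMSE and MAE both equal 0.1, while the correlation stays close to 1 because the forecast tracks the actual trend.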

Model performance evaluation
In ML, it is essential to evaluate a model's performance on data it has not seen before. That ensures the model is doing more than memorizing the training data and will generalize well to new data. Splitting data into training and testing sets is common practice. This study revealed the effect of splitting the dataset by considering two splitting scenarios. First, in S1, the data were split into 80% training, 20% testing, and 0% validation. Second, in S2, the data were split into 70% training, 10% testing, and 20% validation.
In addition, by using different data quantities to fit the model, one can better understand how the forecasting model performs under different conditions.We tested the model with one month of data, seasonal data spanning three months, and annual data, ensuring its suitability for both short-term and long-term forecasting scenarios.

Results and discussion
The hyperparameters for each algorithm have a significant impact on the model performance. Therefore, it is crucial to carefully tune the hyperparameters for each algorithm to achieve the best results. Here, the experimental results are presented and explained.

Dataset description and features
The collected data characteristics are presented in Table 3 and depicted as correlations in Fig. 3 and Fig. 4. Upon examination of the monthly boxplots depicted in Fig. 3, it is evident that there exists a discernible annual seasonal pattern. June presented a higher mean wind speed than other months, especially February and March. The highest mean value was 10.09 m/s in June, and the lowest was 3.81 m/s in February. In addition, Fig. 4 shows a scatter plot of the data, i.e., the correlation between wind speed and the other weather parameters. A negative correlation between wind speed and ambient humidity is exposed, while the correlations between wind speed and the other parameters are positive.

Hyperparameter tuning
This study uses ten ML algorithms, including regression, classification, and ensemble methods. The hyperparameter values listed in Table 2 were tuned by following the experimental steps shown in Fig. 1, using the yearly dataset with a split of 80% training and 20% testing. Each value was run for every algorithm, and Table 4 tabulates the performance of the different ML algorithms at their best corresponding hyperparameter values. The table compares the algorithms on four metrics: RMSE, MAE, correlation, and runtime. RMSE and MAE measure the magnitude of the error between the predicted and actual values, correlation measures the strength of the relationship between the predicted and actual values, and runtime measures the time each algorithm takes to run. The RMSE, MAE, and runtime vary across algorithms at their best hyperparameters. However, the correlation values are all very high, indicating that all algorithms can learn the relationship between the input and output variables. The best-performing algorithms in terms of RMSE are GBT and RF, with a value of 0.90, while GLM exhibits the worst performance with an RMSE of 1.02. This means that GBT and RF have the lowest average error between their predictions and the actual values, and GLM the highest. Compared to the ANN, learning methods based on DTs, such as GBT and RF, delivered better wind speed forecasts. This is expected because RF systematically avoids correlation and enhances model performance by selecting a random subset of features; this results in different subsets for the various trees, which in turn causes less correlation between the DTs and a reduction in variance in the final prediction 62). Previous wind power forecasting studies have likewise shown that the GBT model performed best 63,64).
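The variance-reduction effect of RF's random feature subsets, discussed above, can be sketched on synthetic data. This is an illustrative scikit-learn comparison under assumed data, not the study's RapidMiner experiment:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

# Synthetic noisy regression data standing in for windowed wind speeds
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 6))
y = X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.3, size=300)

# A single decision tree overfits the noise...
tree_rmse = -cross_val_score(
    DecisionTreeRegressor(random_state=0), X, y,
    cv=5, scoring="neg_root_mean_squared_error")

# ...while a forest of decorrelated trees (each seeing a random feature
# subset via max_features) averages away much of the variance.
rf_rmse = -cross_val_score(
    RandomForestRegressor(n_estimators=100, max_features="sqrt", random_state=0),
    X, y, cv=5, scoring="neg_root_mean_squared_error")
```

On data like this, the forest's cross-validated RMSE is typically well below the single tree's, mirroring the variance reduction described in the text.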
Moreover, the best-performing algorithm for MAE is SVR, with a value of 0.54, while SVM exhibits the worst performance with an MAE of 0.67. This means SVR has the lowest absolute error between its predictions and the actual values, and SVM the highest. However, SVR exhibits the longest runtime, at 1040 s, which rules it out in practice.
The three quickest algorithms for the provided dataset are DT, LR, and GLM, with corresponding runtimes of 2 s, 4 s, and 5 s, respectively. Although responsive, they exhibit a lower correlation than RF, GBT, and DL. Hence, the selection of an algorithm is contingent on the user's priorities. The table shows that the RF and GBT algorithms are the best choices for regression tasks, as they offer a good balance between accuracy and efficiency. GBT and RF have an RMSE of 0.90 and an MAE of 0.55. They also have a high correlation of 0.97, indicating that they can accurately predict the target variable. However, GBT is superior to RF at 35 s vs. 118 s runtime. GBT is a robust ensemble learning algorithm well-suited for various tasks, including classification, regression, and ranking 65).
Selecting the best ML algorithm involves a trade-off between accuracy and training time; no single algorithm can be universally optimal, since the choice always depends on weighing the advantage of lower error rates against the expense of computational resources. However, it is typical in automated ML to apply a time budget for model tuning and training during benchmarking 66). It will not be worthwhile to forecast 10 minutes ahead if the model requires 12 minutes of runtime. In practical forecasting applications, minimal errors are often seen as far more crucial than training time. Nevertheless, if numerous models have nearly comparable performance, it is advisable to choose the faster model. Moreover, it is essential to note that the best-performing algorithm for a given task will depend on the specific characteristics of the dataset. For example, if the dataset is small or noisy, a simpler algorithm such as LR may perform better than a more complex algorithm such as GBT 50).

Effect of sampling selection
The splitting operator facilitates customizing the sampling methods for the experiment. Since the auto setting was used in the previous results shown in Table 4, the question arises as to whether the sampling method influences model performance. Therefore, the models were subjected to four sampling selections: automatic, linear, shuffled, and stratified. Table 5 shows the sampling-type selection test results for splitting the data. Linear sampling produces the worst performance for all models compared to the other sampling types. The data splitting methods with the lowest RMSE and MAE values for most algorithms are shuffled and automatic. Unlike linear splitting, shuffled splitting randomizes the order of the samples, so each split better reflects the overall distribution of the data. Furthermore, the utilization of automatic selection sampling is deemed sufficient.
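The difference between linear and shuffled splitting can be sketched in a few lines (a plain-NumPy analogue of RapidMiner's Split Data sampling types, not its implementation; the toy arrays are hypothetical):

```python
import numpy as np

def split_data(X, y, ratio=0.8, sampling="shuffled", seed=0):
    """Linear sampling takes the first `ratio` of rows in their original
    order; shuffled sampling permutes the rows first."""
    n = len(X)
    idx = np.arange(n)
    if sampling == "shuffled":
        np.random.default_rng(seed).shuffle(idx)
    cut = int(n * ratio)
    tr, te = idx[:cut], idx[cut:]
    return X[tr], X[te], y[tr], y[te]

X = np.arange(20).reshape(10, 2)
y = np.arange(10)
X_tr, X_te, y_tr, y_te = split_data(X, y, sampling="linear")
```

With linear sampling on a seasonal series, the test set is always the tail of the record, so it may come from a single season; shuffling avoids that bias, which is consistent with linear sampling performing worst in Table 5.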

Effect of data splitting and data quantity
The effect of splitting the dataset was revealed by considering the two splitting scenarios, S1 and S2, and the effect of data quantity was studied using yearly, seasonal, and August data. The experimental results are presented in Table 6.
For yearly data, RF, DL, kNN, and DT exhibit a better fit than GBT, although DT fits less well than RF, DL, and kNN. GBTs are an ensemble learning algorithm combining multiple weak learners to produce a strong learner. GBTs are known for their excellent performance on various tasks, but they can be more prone to overfitting than other ML algorithms such as RF, DL, kNN, and DT. When the data is split into three sets, the training set is smaller than when the data is split into two sets, which can lead to overfitting for GBTs.
In addition, the effect of data quantity can be studied by comparing performance results from yearly, seasonal, and August data in the S2 scenario. Generally, the RMSE worsens for all algorithms as the amount of data gets smaller. Furthermore, the MAE follows the same trend as the RMSE, except for GBT, which shows the opposite trend.
Moreover, Table 6 shows that all the ML algorithms correlate highly with each other, indicating that they all make similar predictions on the dataset. However, there are some notable trends in the table. First, the fewer data available, the worse the correlation value. The training sets are smaller for the more granular splitting scenarios; as a result, the algorithms are more likely to overfit the training data, which can decrease the correlation with the other algorithms. Second, the RF algorithm has the highest correlation with all the other algorithms, suggesting that RF is the most stable and consistent algorithm. Third, the DT algorithm has the lowest correlation with all the other algorithms, suggesting that DT is more sensitive to changes in the dataset and may be more prone to overfitting.

Wind speed prediction
RapidMiner evaluates performance using the 20% unseen testing dataset. Fig. 5 depicts the discrepancy between each model's forecast and the actual wind speed using S1. Overall, the ML models accurately predicted the wind speed, and there is a significant correlation between the anticipated and actual values. However, the chart shows that DT performance is low compared to the others, because the variance in tree structures emerges as the negative side of the method 69); DTs generally tend to have low bias and high variance 70). Nevertheless, DT has a much shorter training time than RF: as the number of trees in an RF increases, so does the time required to train them. This is frequently essential when working on an ML project with a tight deadline.

Conclusion
Ten well-known ML models (LM, GLM, DL, RF, GBT, DT, ANN, kNN, SVM, and SVR) were generated for short-term wind speed forecasting utilizing a dataset from June 2017 to March 2019 from a meteorology mast in Baron Technopark, Daerah Istimewa Yogyakarta, Indonesia. The RF and GBT models performed best on the test dataset, with the GBT model doing considerably better overall. In addition, both models had a better RMSE error rate. However, RF training requires more time and memory than GBT; hence, GBT is recommended for real-world applications. Hyperparameter adjustment is required to realize the full potential of ML models. The more robust models (e.g., GBT, LR) remain accurate even without tuning, but others (e.g., ANN, SVM, SVR) produce significant errors with their default parameters and require tuning.
The application-ready ML models from a high-level program library described in this article could assist researchers and practitioners in offering relatively accurate solutions for operational wind speed predictions to attain maximum power. Nonetheless, there is room for improvement in fine-tuning other combinations of hyperparameters to estimate wind speed more precisely. Also, it would be intriguing to apply the models to additional datasets from different regions; this would enable us to determine the extent to which site location influences model performance. In addition, it is challenging to investigate integrating physical process model data as features for the machine learning model, or adopting dynamic weighting depending on forecast uncertainty. To improve the accuracy of wind forecasting, further research into the hybridization of physical and ML models is another option.

Supplement Table.
Summary of the performance metrics of the ML models.

Fig. 3 Monthly box plot of the wind speed value
Fig. 4 Scatter plot of wind speed against the other weather parameters

Fig. 5 Wind speed prediction charts (the predictions vs. the actual values) of ML models

Table 1 .
A summary comparison between recent wind speed and power forecasting


Table 5 .
Comparison of sampling type selection test

Table 4 .
The best performance of the ML algorithm

Table 6 .
The performance of the ML algorithms for different splitting scenarios