Abstract
Machine learning (ML) applications have demonstrated the potential to generate precise models in a wide variety of fields, including marine applications. The main limitation of ML-based methods is their need for large amounts of data, which may be impractical to collect. To ensure the quality of the models and their robustness to different inputs, synthetic data may be generated using other ML-based methods, such as a Tabular Variational Autoencoder (TVAE), copulas, or a Conditional Tabular Generative Adversarial Network (CTGAN). Models such as a Multilayer Perceptron (MLP) or Extreme Gradient Boosting (XGB) can then be trained on the resulting dataset to improve overall performance. The methods are applied to a dataset containing mass flow, temperature, and pressure measurements at seven points of a marine steam turbine as inputs, along with the exergy efficiency ($\eta$) and exergy destruction ($Ex$) of the whole turbine (WT), low-pressure cylinder (LPC), and high-pressure cylinder (HPC) as outputs. The results show that models trained on synthetic data perform slightly worse than the models trained on the original data in previous research, but reach these results using as little as two-thirds of the dataset. Using $R^2$ as the main evaluation metric, the best results achieved are 0.99 for $\eta_{WT}$ using 100 data points and an MLP, 0.93 for $\eta_{LPC}$ using 100 data points and an MLP-based model, 0.91 for $\eta_{HPC}$ with the same method, and 0.97 for $Ex_{WT}$, 0.96 for $Ex_{LPC}$, and 0.98 for $Ex_{HPC}$ using the XGB-trained model with 100 data points.
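As a reading aid for the scores quoted above, the following is a minimal sketch of the coefficient of determination in its standard form, $R^2 = 1 - SS_{res}/SS_{tot}$. The abstract does not specify how $R^2$ was computed, so the function name `r2_score` and the sample values are hypothetical and serve only as illustration; they are not taken from the turbine dataset.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


# Hypothetical values, for illustration only (not from the dataset).
measured = [0.80, 0.82, 0.85, 0.90]
predicted = [0.81, 0.82, 0.84, 0.91]
score = r2_score(measured, predicted)
```

A score of 1 means the predictions match the measurements exactly, while a score of 0 means the model does no better than always predicting the mean of the measured values.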