The growing demand for electricity in daily life highlights the need for Smart Cities (SC) to use energy efficiently. Both technical losses and Non-Technical Losses (NTL), particularly those resulting from electricity theft, present formidable obstacles; NTL alone can amount to billions of dollars. Although numerous utilities have adopted Machine Learning (ML) based approaches for NTL detection, these methods have not yet been analysed thoroughly. In particular, little research exists on evaluation criteria for NTL identification or on the handling of imbalanced data in the context of SC. This research compares ML algorithms and data balancing methods to improve the detection of NTL from electricity consumption data. We applied 15 ML techniques: logistic regression, Bernoulli naive Bayes, Gaussian naive Bayes, k-nearest neighbours, perceptron, passive-aggressive classifier, quadratic discriminant analysis, SGD classifier, ridge classifier, linear discriminant analysis, decision tree, nearest centroid classifier, multinomial naive Bayes, complement naive Bayes, and dummy classifier. SMOTE, ADASYN, NRAS, and CCR were considered for data balancing. AUC, F1-score, and seven relevant performance metrics were used for comparison. We also implemented SHapley Additive exPlanations (SHAP) for feature importance and model interpretation. The results show that classifier performance varies with the balancing method, emphasizing the role of data preprocessing in NTL detection for smart grid security.
- smart power grids
- learning (artificial intelligence)
- smart meters
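
To illustrate the data balancing step named in the abstract, the following is a minimal sketch of SMOTE-style oversampling in plain Python. It is not the authors' implementation, and a study like this would more likely rely on a library version. The function name `smote_oversample`, its parameters, and the toy two-feature consumption values are all hypothetical and chosen only for illustration.

```python
import random
from math import dist


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples by interpolating between
    a randomly chosen minority sample and one of its k nearest minority
    neighbours (the core idea of SMOTE)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Hypothetical toy data: two features per consumer (illustrative values only).
normal = [(5.1, 4.9), (5.3, 5.0), (4.8, 5.2), (5.0, 5.1), (5.2, 4.7), (4.9, 5.0)]
theft = [(1.0, 0.8), (1.2, 1.1)]

new_points = smote_oversample(theft, n_new=len(normal) - len(theft))
balanced_theft = theft + new_points
```

After this step both classes contain the same number of samples, and every synthetic point lies on a line segment between two real minority samples. ADASYN, NRAS, and CCR differ in how they choose where to generate or clean samples, so this sketch should not be read as representative of those methods.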