2020 TheTextClassificationofTheftCri

From GM-RKB

Subject Headings: Theft Crime Prediction, TF-IDF Feature Generation, Supervised Text Classification Algorithm.

Notes

Cited By

Quotes

Abstract

This study classifies a city's theft crime data from 2009 to 2019 using text classification technology. First, theft crimes are manually classified and defined from both a legal perspective and a criminal-practice perspective, and 2,621 records are selected at random from the full data set. Features are extracted from the pre-processed sample data with a TF-IDF model, and a text classification model is then trained and tested with the XGBoost algorithm; its test results are compared with those of the KNN, Naïve Bayes, SVM, and GBDT algorithms. The results show that the XGBoost algorithm performs better than KNN, Naïve Bayes, SVM, and GBDT. Slightly adjusting the categories to improve classification accuracy raises the accuracy of each algorithm by 2-5 percentage points, with XGBoost remaining the most accurate. The results therefore show that (1) XGBoost is the best algorithm for classifying the whole data set, and (2) data quality has an obvious influence on classification accuracy, and improving it can raise the accuracy of the algorithms quickly. The theft crime data for 2009-2019 classified with the XGBoost algorithm can serve as base data for predicting various types of crime.
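
The TF-IDF feature step described in the abstract can be illustrated with a minimal, standard-library-only sketch. Everything concrete in it is an assumption made for illustration: the toy reports, the category names `vehicle_theft` and `pickpocketing`, the whitespace tokenizer (real Chinese case text would need word segmentation), and the smoothed IDF formula. A cosine nearest-neighbour lookup stands in for the classifier so the example stays self-contained; it is not the XGBoost model the paper trains.

```python
import math
from collections import Counter

# Hypothetical toy reports and labels; the paper's data and categories are not shown here.
TRAIN = [
    ("bicycle stolen from outside the station", "vehicle_theft"),
    ("electric bicycle stolen near the market", "vehicle_theft"),
    ("wallet stolen from pocket on the bus", "pickpocketing"),
    ("phone stolen from pocket on the subway", "pickpocketing"),
]

def tokenize(text):
    # Assumption: whitespace tokens; Chinese text would need a word segmenter.
    return text.lower().split()

def fit_idf(docs):
    # Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1.
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}

def tfidf(doc, idf):
    # Term count times IDF, L2-normalised; terms unseen in training are dropped.
    counts = Counter(term for term in doc if term in idf)
    vec = {term: count * idf[term] for term, count in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {term: w / norm for term, w in vec.items()} if norm else {}

def cosine(a, b):
    # Both vectors are unit length, so the dot product is the cosine similarity.
    return sum(w * b.get(term, 0.0) for term, w in a.items())

def predict(text, train_vecs, idf):
    # Stand-in classifier: label of the most similar training report (1-NN).
    query = tfidf(tokenize(text), idf)
    best_vec, best_label = max(train_vecs, key=lambda pair: cosine(query, pair[0]))
    return best_label

train_docs = [tokenize(text) for text, _ in TRAIN]
IDF = fit_idf(train_docs)
TRAIN_VECS = [(tfidf(doc, IDF), label) for doc, (_, label) in zip(train_docs, TRAIN)]

print(predict("wallet taken from pocket on a bus", TRAIN_VECS, IDF))
```

In practice the same pipeline would more likely be built with scikit-learn's `TfidfVectorizer` and xgboost's `XGBClassifier`, with the other algorithms (KNN, Naïve Bayes, SVM, GBDT) fitted on the same feature matrix for comparison; the paper's actual settings are not given in this entry.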

References


2020 TheTextClassificationofTheftCri
Author: Zhang Qi
Title: The Text Classification of Theft Crime based on TF-IDF and XGBoost Model
Year: 2020