Improved feature weight algorithm and its application to text classification (Q1793568)
From MaRDI portal
scientific article; zbMATH DE number 6953568
| Language | Label | Description | Also known as |
|---|---|---|---|
| default for all languages | No label defined | | |
| English | Improved feature weight algorithm and its application to text classification | scientific article; zbMATH DE number 6953568 | |
Statements
Improved feature weight algorithm and its application to text classification (English)
12 October 2018
Summary: Text preprocessing is a key problem in pattern recognition and plays an important role in text classification. It has two pivotal steps: feature selection and feature weighting. The preprocessing results directly affect a classifier's accuracy and performance, so choosing appropriate algorithms for feature selection and feature weighting can greatly improve classifier performance. Based on Gini Index theory, this paper proposes an Improved Gini Index algorithm, which constructs a new feature selection and feature weighting function. The experimental results show that the algorithm effectively improves classifier performance. The algorithm has also been applied to a sensitive information identification system, where it achieved good results. Its precision and recall are higher than those of traditional algorithms, and it can effectively identify sensitive information on the Internet.
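As a rough illustration of the two preprocessing steps named in the summary, the sketch below scores terms with a plain Gini-style purity measure, the sum over classes of P(class | term) squared, then uses that score both to select terms and to weight them. This is not the paper's Improved Gini Index, whose formula is not given in this record. The purity formula, the term-frequency weighting, and the function names `gini_scores`, `select_top_k`, and `weight_document` are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def gini_scores(docs, labels):
    """Score each term by sum_c P(c | term)^2, estimated from document counts.

    Assumption: this is a basic Gini purity, not the paper's improved variant.
    A score of 1.0 means the term occurs in documents of a single class only.
    """
    # term -> class label -> number of documents that contain the term
    term_class_df = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        for term in set(tokens):  # count each term once per document
            term_class_df[term][label] += 1

    scores = {}
    for term, per_class in term_class_df.items():
        total = sum(per_class.values())
        scores[term] = sum((n / total) ** 2 for n in per_class.values())
    return scores


def select_top_k(scores, k):
    """Feature selection: keep the k highest-scoring terms (ties broken alphabetically)."""
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


def weight_document(tokens, scores, selected):
    """Feature weighting: term frequency times the term's Gini score (hypothetical scheme)."""
    tf = Counter(t for t in tokens if t in selected)
    return {t: tf[t] * scores[t] for t in tf}
```

With a labeled corpus of tokenized documents, `gini_scores(docs, labels)` would feed `select_top_k` to build the vocabulary, and `weight_document` would then turn each document into a sparse weight vector for a classifier. Note that raw purity favors rare terms, which is one of the weaknesses an improved weighting function would presumably need to address.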
0.718527615070343
0.7048733234405518
0.7022448778152466