Improved feature weight algorithm and its application to text classification (Q1793568)

From MaRDI portal





scientific article; zbMATH DE number 6953568

      Statements

      Improved feature weight algorithm and its application to text classification (English)
      12 October 2018
      Summary: Text preprocessing is a key problem in pattern recognition and plays an important role in text classification. It has two pivotal steps: feature selection and feature weighting. Because the preprocessing results directly affect a classifier's accuracy and performance, choosing appropriate algorithms for feature selection and feature weighting can greatly improve classifier performance. Based on Gini Index theory, this paper proposes an Improved Gini Index algorithm, which constructs a new feature selection and feature weighting function. The experimental results show that this algorithm effectively improves classifier performance. The algorithm has also been applied to a sensitive information identification system with good results. Its precision and recall are higher than those of traditional algorithms, and it can effectively identify sensitive information on the Internet.
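
The summary does not state the improved function, so the following is only a rough illustration of what Gini-Index-style feature selection for text can look like. It assumes one commonly used purity-style score, sum over classes c of P(t|c)^2 * P(c|t)^2, estimated from document frequencies. The names `gini_scores` and `select_top_k` are hypothetical, and this is not the paper's Improved Gini Index algorithm.

```python
from collections import Counter, defaultdict


def gini_scores(docs, labels):
    """Score each term t by sum_c P(t|c)^2 * P(c|t)^2.

    Assumption: probabilities are estimated from document frequencies;
    this is a generic Gini-style score, not the paper's improved function.
    """
    class_docs = Counter(labels)        # number of documents per class
    term_class = defaultdict(Counter)   # term -> class -> document frequency
    for tokens, label in zip(docs, labels):
        for term in set(tokens):        # count each term once per document
            term_class[term][label] += 1

    scores = {}
    for term, per_class in term_class.items():
        df = sum(per_class.values())    # documents containing the term
        scores[term] = sum(
            (n / class_docs[c]) ** 2 * (n / df) ** 2
            for c, n in per_class.items()
        )
    return scores


def select_top_k(scores, k):
    """Return the k highest-scoring terms, breaking ties alphabetically."""
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]
```

Under this score, a term that occurs in every document of one class and in no other class receives the maximum value of 1, while a term spread evenly across classes scores low. The same score could in principle be reused as a feature weight, but how the paper combines selection and weighting is not described in the summary.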

      Identifiers