refinr (Q110667)
From MaRDI portal
| This is the item page for this Wikibase entity, intended for internal use and editing purposes. Please use this page instead for the normal view: refinr |
Cluster and Merge Similar Values Within a Character Vector
| Language | Label | Description | Also known as |
|---|---|---|---|
| default for all languages | No label defined |
||
| English | refinr |
Cluster and Merge Similar Values Within a Character Vector |
Statements
12 November 2023
0 references
These functions take a character vector as input, identify and cluster similar values, and then merge clusters together so their values become identical. The functions are an implementation of the key collision and ngram fingerprint algorithms from the open source tool Open Refine <https://openrefine.org/>. More info on key collision and ngram fingerprint can be found here <https://openrefine.org/docs/technical-reference/clustering-in-depth>.
0 references