Stem a set of terms using one of the algorithms provided by the Snowball stemming library.

stem_snowball(x, algorithm = "en")

| Argument | Description |
|---|---|
| `x` | character vector of terms to stem. |
| `algorithm` | stemming algorithm; see ‘Details’ for the valid choices. |

Apply a Snowball stemming algorithm to a vector of input terms, `x`, returning the result in a character vector of the same length with the same names.

The `algorithm` argument specifies the stemming algorithm. Valid choices include the following: `"ar"` (`"arabic"`), `"da"` (`"danish"`), `"de"` (`"german"`), `"en"` (`"english"`), `"es"` (`"spanish"`), `"fi"` (`"finnish"`), `"fr"` (`"french"`), `"hu"` (`"hungarian"`), `"it"` (`"italian"`), `"nl"` (`"dutch"`), `"no"` (`"norwegian"`), `"pt"` (`"portuguese"`), `"ro"` (`"romanian"`), `"ru"` (`"russian"`), `"sv"` (`"swedish"`), `"ta"` (`"tamil"`), `"tr"` (`"turkish"`), and `"porter"`. Setting `algorithm = NULL` gives a stemmer that returns its input unchanged.
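As a brief sketch of how the `algorithm` argument might be used: the snippet below assumes that `stem_snowball` comes from the corpus package and that the package is installed. The sample terms and their names are made up for illustration.

```r
library(corpus)  # assumption: stem_snowball() is provided by the corpus package

# hypothetical named input; names should carry over to the result
terms <- c(first = "running", second = "runs")

# the two-letter code and the full language name select the same algorithm
by_code <- stem_snowball(terms, algorithm = "en")
by_name <- stem_snowball(terms, algorithm = "english")
identical(by_code, by_name)

# algorithm = NULL gives a stemmer that returns its input unchanged
unstemmed <- stem_snowball(terms, algorithm = NULL)
all(unstemmed == terms)
```

Passing `NULL` can be handy when stemming should be switchable from a single configuration value without changing the calling code.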

The function only stems single-word terms of kind "letter"; it leaves other inputs (multi-word terms, and terms of kind "number", "punct", and "symbol") unchanged.
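The rule about term kinds can be illustrated with a small sketch. It again assumes the corpus package supplies `stem_snowball`. The inputs are illustrative: one single-word letter term, one multi-word term, one number, and one punctuation mark.

```r
library(corpus)  # assumption: stem_snowball() is provided by the corpus package

# illustrative inputs: letter term, multi-word term, number, punctuation
x <- c("winning", "winning streak", "2018", "!")

stems <- stem_snowball(x)

# only the single-word letter term is a candidate for stemming;
# the remaining entries should come back exactly as given
data.frame(term = x, stem = stems, changed = stems != x)
```

The `changed` column makes it easy to see which inputs the stemmer actually touched.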

The Snowball stemming library provides the underlying implementation. The `wordStem` function from the SnowballC package provides a similar interface, but that function applies the algorithm to all input terms, regardless of the kind of the term.

A character vector with the same length and names as the input, `x`, with entries containing the corresponding stems.

    # apply English stemming algorithm; don't stem non-letter terms
    stem_snowball(c("win", "winning", "winner", "#winning"))
    #> [1] "win" "win" "winner" "#winning"

    # compare with SnowballC, which stems all kinds, not just letter
    # NOT RUN {
    SnowballC::wordStem(c("win", "winning", "winner", "#winning"), "en")
    # }