Lists of common function words (‘stop’ words).
stopwords_da stopwords_de stopwords_en stopwords_es stopwords_fi stopwords_fr stopwords_hu stopwords_it stopwords_nl stopwords_no stopwords_pt stopwords_ru stopwords_sv
The stopwords_ objects are character vectors of case-folded ‘stop’ words. These are common function words that often get discarded before performing other text analysis tasks.
There are lists available for the following languages:
Danish (stopwords_da), Dutch (stopwords_nl), English (stopwords_en), Finnish (stopwords_fi), French (stopwords_fr), German (stopwords_de), Hungarian (stopwords_hu), Italian (stopwords_it), Norwegian (stopwords_no), Portuguese (stopwords_pt), Russian (stopwords_ru), Spanish (stopwords_es), and Swedish (stopwords_sv).
These built-in word lists are reasonable defaults, but they may require further tailoring to suit your particular task. The original lists were compiled by the Snowball stemming project. Following the Quanteda text analysis software, we have tailored the original lists by adding the word "will" to the English list.
A character vector of unique stop words.
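The tailoring mentioned above can be sketched roughly as follows. This is a minimal illustration, not the package's prescribed workflow: it assumes the lists can be reached as `corpus::stopwords_en` when the corpus package is installed, and otherwise falls back to a tiny stand-in vector. The names `stop_en`, `custom_stop`, `tokens`, and `kept`, and the extra terms "fig", "et", and "al", are hypothetical and chosen only for the example.

```r
# Use the packaged English list if the corpus package is available
# (assumption: it is accessible as corpus::stopwords_en); otherwise use
# a small stand-in vector that exists only for illustration.
stop_en <- if (requireNamespace("corpus", quietly = TRUE)) {
  corpus::stopwords_en
} else {
  c("a", "the", "of", "not", "will", "is")
}

# Tailor the list: keep "not" as a content word (useful when negation
# matters) and add some hypothetical domain-specific terms.
custom_stop <- union(setdiff(stop_en, "not"), c("fig", "et", "al"))

# The lists are case-folded, so fold the tokens before matching.
tokens <- tolower(c("The", "result", "will", "not", "change", "Fig"))

# Discard every token that appears in the tailored list.
kept <- tokens[!tokens %in% custom_stop]
print(kept)
```

Because the lists are plain character vectors, base set operations such as `union` and `setdiff` are enough to adapt them; no special API is needed.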