| Title: | 'Stanza' - A 'R' NLP Package for Many Human Languages |
|---|---|
| Description: | An interface to the 'Python' package 'stanza' <https://stanfordnlp.github.io/stanza/index.html>. 'stanza' is a 'Python' 'NLP' library for many human languages. It contains support for running various accurate natural language processing tools on 60+ languages. |
| Authors: | Kurt Hornik [aut], Florian Schwendinger [aut, cre], Julian Amon [aut] |
| Maintainer: | Florian Schwendinger <[email protected]> |
| License: | GPL-3 |
| Version: | 1.0-3 |
| Built: | 2026-05-29 11:17:49 UTC |
| Source: | https://github.com/cran/stanza |
Conda Install Stanza
conda_install_stanza( envname = "stanza", packages = c("python", "stanza"), forge = FALSE, channel = c("stanfordnlp"), conda = "auto", ... )
envname |
a character string giving the name or path of the conda environment to be used or created for the installation. |
packages |
a character vector giving the packages to be installed. |
forge |
a logical indicating whether conda-forge should be used for the installation. |
channel |
a character vector giving the conda channels to be used. |
conda |
a character string giving the path to the conda executable. |
... |
additional arguments passed to |
NULL
## Not run:
conda_install_stanza()
## End(Not run)
Entities
entities(x, ...)
x |
an object inheriting from |
... |
optional additional arguments, currently not used. |
a data.frame with the entities.
Checks if Stanza is initialized.
is_stanza_initialized()
TRUE if Stanza is initialized, otherwise FALSE
is_stanza_initialized()
Multi-Word Token
multi_word_token(x, ...)
x |
an object of |
... |
optional additional arguments, currently not used. |
a data.frame with the multi-word tokens.
Download pretrained NLP models. For more information about the parameters see https://stanfordnlp.github.io/stanza/download_models.html.
stanza_download( language = "en", model_dir = stanza_options("model_dir"), package = "default", processors = list(), logging_level = "INFO", resources_url = stanza_options("resources_url"), resources_version = stanza_options("resources_version"), model_url = stanza_options("model_url") )
language |
a character string giving the language (default is |
model_dir |
path to the directory for storing the for |
package |
a character string giving the package to be used (default is |
processors |
a character string or named list giving the processors to download models for.
If a string is provided, it should give the names of the desired processors as a comma-separated
string, e.g., |
logging_level |
a character string giving the logging level (default is |
resources_url |
a character string giving the url to the |
resources_version |
a character string giving the version of the resources.
The default value is obtained from Python during the initialization and can be obtained
and changed by using |
model_url |
a character string giving the model url.
The default value is obtained from Python during the initialization and can be obtained
and changed by using |
NULL
if (stanza_options("testing_level") >= 3L) {
  stanza_initialize()
  stanza_download("en")
}
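The string form of `processors` described above can be built from an ordinary character vector. The following is a minimal sketch, not part of the package's own examples: the processor names `tokenize` and `pos` are assumptions taken from the Python 'stanza' documentation, and the download is only attempted when the package is installed and already initialized.

```r
# Processor names are assumed for illustration (see the Python 'stanza' docs).
wanted <- c("tokenize", "pos")

# Collapse the character vector into the comma-separated string form.
processors <- paste(wanted, collapse = ",")
print(processors)

# Only attempt the download when 'stanza' is installed and initialized.
if (requireNamespace("stanza", quietly = TRUE) &&
    stanza::is_stanza_initialized()) {
  stanza::stanza_download("en", processors = processors)
}
```

Keeping the names in a vector and collapsing them at the call site avoids hand-editing a long comma-separated string.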
Function to obtain the download method code or list all allowed download methods.
stanza_download_method_code(method = NULL)
method |
a character string giving the name of the download method.
The case of the download method name is ignored.
If |
an integer giving the download method code.
if (is_stanza_initialized()) {
  stanza_download_method_code()
  stanza_download_method_code("none")
  stanza_download_method_code("reuse_resources")
  stanza_download_method_code("download_resources")
}
Initialize the Python binding to stanza.
stanza_initialize( python = NULL, virtualenv = NULL, condaenv = NULL, model_dir = NULL, resources_url = NULL, model_url = NULL )
python |
a character string giving the path to the |
virtualenv |
a character string giving the name of the virtual environment,
or the path to the virtual environment, to be used.
The variable |
condaenv |
a character string giving the name of the |
model_dir |
a character string giving the path to the directory storing the |
resources_url |
a character string giving the url to the |
model_url |
a character string giving the model url. |
NULL
if (stanza_options("testing_level") >= 3L) {
  stanza_initialize()
}
Allow the user to set and examine options like
stanza_options(option, value, update_python_defaults = FALSE)
option |
any option can be defined using 'key, value' pairs. If 'value' is missing, the currently set value is returned for the given 'option'. If both are missing, all set options are returned. |
value |
the corresponding value to set for the given option. |
update_python_defaults |
a logical (default is |
NULL if both arguments option and value are provided.
The currently set value if the argument value is missing.
All set options if the argument option is missing.
stanza_options("conda_environment", "stanza")
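The three calling patterns listed under the return value (set, get one, get all) can be sketched as follows. This is an illustrative example under the assumption that the package is installed; the option name `conda_environment` is taken from the example above, and the value is an arbitrary environment name.

```r
# Option name taken from the example above; the value is arbitrary.
option_name <- "conda_environment"
option_value <- "stanza"

if (requireNamespace("stanza", quietly = TRUE)) {
  # Set: both 'option' and 'value' are given, so NULL is returned.
  stanza::stanza_options(option_name, option_value)

  # Get one: 'value' is missing, so the currently set value is returned.
  print(stanza::stanza_options(option_name))

  # Get all: both arguments are missing, so all set options are returned.
  str(stanza::stanza_options())
}
```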
NLP Pipeline
stanza_pipeline( language = "en", model_dir = stanza_options("model_dir"), package = "default", processors = list(), logging_level = "INFO", use_gpu = FALSE, download_method = "reuse_resources", ... )
language |
a character string giving the language (default is |
model_dir |
path to the directory for storing the for |
package |
(default is |
processors |
FIXME: we should define whether to use a comma-separated string or a character vector. |
logging_level |
a character string giving the logging level (default is |
use_gpu |
a logical giving if |
download_method |
an integer or character string giving the download method code.
If a character string is provided, it is passed to |
... |
additional named arguments passed to the stanza pipeline. |
a function that can be used to process text.
## Not run:
p <- stanza_pipeline()
doc <- p('R is a programming language for statistical computing.')
## End(Not run)
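Because the pipeline returns a function, a typical session chains initialization, model download, pipeline construction, and the accessor functions documented on this page. The sketch below assumes that a working Python installation with 'stanza' is available and that the document returned by the pipeline is the kind of object `tokens()`, `entities()`, and `multi_word_token()` expect. It is guarded with `interactive()` so it never triggers a download in batch runs.

```r
text <- "R is a programming language for statistical computing."

# Run only in an interactive session with the package installed,
# because this step downloads models and needs a Python backend.
if (interactive() && requireNamespace("stanza", quietly = TRUE)) {
  library(stanza)

  stanza_initialize()      # bind to the Python 'stanza' package
  stanza_download("en")    # fetch the default English models

  p <- stanza_pipeline("en")  # p is a function that processes text
  doc <- p(text)

  # Each accessor is documented to return a data.frame.
  print(head(tokens(doc)))
  print(entities(doc))
  print(multi_word_token(doc))
}
```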
Obtain the version of the stanza Python package.
stanza_version()
a character string giving the version of the stanza Python package.
stanza_version()
Tokens
tokens(x, ...)
x |
an object inheriting from |
... |
optional additional arguments, currently not used. |
a data.frame with the tokens.
Install Stanza via Virtual Environment
virtualenv_install_stanza( envname = "stanza", packages = "stanza", python = NULL, ... )
envname |
a character string giving the name or path of the virtual environment to be used or created for the installation. |
packages |
a character vector giving the packages to be installed. |
python |
a string giving the name or path of the python version to be used
(e.g., |
... |
additional arguments passed to |
NULL
## Not run:
virtualenv_install_stanza()
## End(Not run)
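After installing into a virtual environment, the same environment name can be handed to `stanza_initialize()`. The following sketch shows one plausible sequence using only functions documented on this page. It assumes a usable Python interpreter is available on the system, and it is guarded with `interactive()` because it creates an environment and installs packages.

```r
# Name of the virtual environment (the documented default).
env_name <- "stanza"

if (interactive() && requireNamespace("stanza", quietly = TRUE)) {
  library(stanza)

  # Create or reuse the virtual environment and install 'stanza' into it.
  virtualenv_install_stanza(envname = env_name)

  # Point the R session at that environment.
  stanza_initialize(virtualenv = env_name)

  # Confirm the binding and report the Python package version.
  print(is_stanza_initialized())
  print(stanza_version())
}
```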