Returns the cosine similarity between each word in a data frame and a given vector representation.
Arguments
- word_df
A data.frame containing a column with words or multiword expressions.
- word_field
A character string indicating the name of the column in word_df that contains the words.
- rep
A given word-vector representation, stored as a numerical vector, matrix, or sparse matrix object. The length (for a vector) or ncol (for a matrix) of rep must equal the number of dimensions of the fastText model used.
- model
A fastText model, loaded by fastrtext::load_model().
Examples
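library(dictvectoR)
library(dplyr)  # provides the %>% pipe used below

# Load the small demo fastText model shipped with dictvectoR
model <- fastrtext::load_model(system.file("extdata",
                                           "tw_demo_model_sml.bin",
                                           package = "dictvectoR"))
# Build a representation from the rows of tw_annot where pop == 1
pop_rep <- tw_annot %>%
  dplyr::filter(pop == 1) %>%
  clean_text(text_field = "full_text") %>%
  get_corpus_representation(model = model)
# Score each word by its cosine similarity to that representation
words_df <- data.frame(words = c("coronadeutschland", "skandal"))
words_df$popsimil <- simil_words2rep(words_df,
                                     word_field = "words",
                                     rep = pop_rep,
                                     model = model)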
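
For intuition about the returned values, the following base R sketch computes a cosine similarity between each row of a small matrix of word vectors and a single representation vector. It is only an illustration under stated assumptions, not the package's internal implementation: cosine_sim, word_vecs, and rep_vec are hypothetical names, and the 3-dimensional vectors are made up rather than taken from a fastText model.

```r
# Cosine similarity between two numeric vectors (base R only).
# Hypothetical helper, not part of dictvectoR.
cosine_sim <- function(a, b) {
  sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

# Made-up 3-dimensional word vectors, one row per word. Real fastText
# vectors would have as many columns as the model has dimensions.
word_vecs <- rbind(
  alpha = c(1, 0, 0),
  beta  = c(1, 1, 0),
  gamma = c(0, 0, 2)
)

# Made-up representation vector with the same number of dimensions.
rep_vec <- c(2, 2, 0)

# Mirrors the requirement on `rep`: its length must match the
# dimensionality of the word vectors.
stopifnot(length(rep_vec) == ncol(word_vecs))

# One similarity score per word (row of word_vecs).
simil <- apply(word_vecs, 1, cosine_sim, b = rep_vec)
print(simil)
```

A vector pointing in the same direction as the representation scores 1 and an orthogonal vector scores 0, regardless of vector length. Higher values in a column such as popsimil can be read the same way, as closer alignment with the representation.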