Package: mclm 0.2.7.9000

Mariana Montes

mclm: Mastering Corpus Linguistics Methods

Read, inspect and process corpus files for quantitative corpus linguistics. Obtain concordances via regular expressions, tokenize texts, and compute frequencies and association measures. Useful for collocation analysis, keyword analysis and variationist studies (comparison of linguistic variants and of linguistic varieties).

Authors: Dirk Speelman [aut], Mariana Montes [aut, cre]

mclm_0.2.7.9000.tar.gz
mclm_0.2.7.9000.zip (r-4.5), mclm_0.2.7.9000.zip (r-4.4), mclm_0.2.7.9000.zip (r-4.3)
mclm_0.2.7.9000.tgz (r-4.4-x86_64), mclm_0.2.7.9000.tgz (r-4.4-arm64), mclm_0.2.7.9000.tgz (r-4.3-x86_64), mclm_0.2.7.9000.tgz (r-4.3-arm64)
mclm_0.2.7.9000.tar.gz (r-4.5-noble), mclm_0.2.7.9000.tar.gz (r-4.4-noble)
mclm_0.2.7.9000.tgz (r-4.4-emscripten), mclm_0.2.7.9000.tgz (r-4.3-emscripten)
mclm.pdf | mclm.html
mclm/json (API)
NEWS

# Install 'mclm' in R:
install.packages('mclm', repos = c('https://masterclm.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/masterclm/mclm/issues

Uses libs:
  • c++ – GNU Standard C++ Library v3

corpuslinguistics

100 exports, 1 star, 0.74 score, 37 dependencies, 35 scripts, 205 downloads

Last updated 2 years ago from: 773eecbe1f. Checks: OK: 9. Indexed: yes.

Target               Result  Date
Doc / Vignettes      OK      Sep 13 2024
R-4.5-win-x86_64     OK      Sep 13 2024
R-4.5-linux-x86_64   OK      Sep 13 2024
R-4.4-win-x86_64     OK      Sep 13 2024
R-4.4-mac-x86_64     OK      Sep 13 2024
R-4.4-mac-aarch64    OK      Sep 13 2024
R-4.3-win-x86_64     OK      Sep 13 2024
R-4.3-mac-x86_64     OK      Sep 13 2024
R-4.3-mac-aarch64    OK      Sep 13 2024

Exports: as_character, as_conc, as_data_frame, as_fnames, as_freqlist, as_numeric, as_re, as_tokens, as_types, as.re, assoc_abcd, assoc_scores, cat_re, chisq1_to_p, cleanup_spaces, col_pcoord, conc, details, drop_bool, drop_empty_rc, drop_extension, drop_fnames, drop_path, drop_pos, drop_re, drop_tags, drop_types, explore, find_xpath, fnames_merge, fnames_merge_all, freqlist, freqlist_diff, freqlist_merge, freqlist_merge_all, get_fnames, import_conc, keep_bool, keep_fnames, keep_pos, keep_re, keep_types, mclm_xml_text, merge_conc, n_fnames, n_tokens, n_types, orig_ranks, orig_ranks<-, p_to_chisq1, perl_flavor, perl_flavor<-, print_kwic, ranks, re, re_has_matches, re_replace_all, re_replace_first, re_retrieve_all, re_retrieve_first, re_retrieve_last, re_which, read_assoc, read_conc, read_fnames, read_freqlist, read_tokens, read_txt, read_types, row_pcoord, scan_re, scan_re2, scan_txt, scan_txt2, short_names, slma, surf_cooc, text_cooc, tokenize, tokens_merge, tokens_merge_all, tot_n_tokens, tot_n_tokens<-, trunc_at, type_freq, type_freqs, type_names, types, types_merge, types_merge_all, write_assoc, write_conc, write_fnames, write_freqlist, write_tokens, write_txt, write_types, xlim4ca, ylim4ca, zero_plus

Dependencies: BH, bit, bit64, ca, cli, clipr, cpp11, crayon, dplyr, fansi, generics, glue, hms, lifecycle, magrittr, NLP, pillar, pkgconfig, prettyunits, progress, R6, Rcpp, readr, rlang, slam, stringi, stringr, tibble, tidyselect, tm, tzdb, utf8, vctrs, vroom, withr, xml2, yaml

Readme and manuals

Help Manual

Help page: Topics
Coerce object to character: as_character, as_character.default, as_character.re, as_character.tokens
Coerce data frame to a concordance object: as_conc
Coerce object to a data frame: as.data.frame.assoc_scores, as.data.frame.conc, as.data.frame.details.slma, as.data.frame.fnames, as.data.frame.freqlist, as.data.frame.slma, as.data.frame.tokens, as.data.frame.types, as_data_frame, as_data_frame.default
Coerce object to 'fnames': as_fnames
Coerce table to a frequency list: as_freqlist
Coerce object to a numeric vector: as_numeric, as_numeric.default
Coerce object to class 'tokens': as_tokens
Coerce object to a vector of types: as_types
Association scores used in collocation analysis and keyword analysis: assoc_abcd, assoc_scores
Subset an object by different criteria: brackets, [.fnames, [.freqlist, [.tokens, [.types, [<-.fnames, [<-.tokens, [<-.types
Helpers for plotting 'ca' objects: ca_help, col_pcoord, row_pcoord, xlim4ca, ylim4ca
Print a regular expression to the console: cat_re
Proportion of chi-squared distribution with one degree of freedom that sits to the right of x: chisq1_to_p
Clean up the use of whitespace in a character vector: cleanup_spaces
Build a concordance for the matches of a regex: conc
Build collocation frequencies: create_cooc, surf_cooc, text_cooc
Details on a specific item: details, details.slma
Drop empty rows and columns from a matrix: drop_empty_rc
Drop XML tags from character string: drop_tags
Interactively navigate through an object: explore, explore.assoc_scores, explore.conc, explore.fnames, explore.freqlist, explore.tokens, explore.types
Run XPath query: find_xpath
Retrieve the names of files in a given path: fnames, get_fnames
Build the frequency list of a corpus: freqlist
Subtract frequency lists: freqlist_diff
Import a concordance: import_conc
Subset an object based on logical criteria: drop_bool, drop_bool.fnames, drop_bool.freqlist, drop_bool.tokens, drop_bool.types, keep_bool, keep_bool.fnames, keep_bool.freqlist, keep_bool.tokens, keep_bool.types
Filter collection of filenames by name: drop_fnames, keep_fnames
Subset an object by index: drop_pos, drop_pos.fnames, drop_pos.freqlist, drop_pos.tokens, drop_pos.types, keep_pos, keep_pos.fnames, keep_pos.freqlist, keep_pos.tokens, keep_pos.types
Subset an object based on regular expressions: drop_re, drop_re.fnames, drop_re.freqlist, drop_re.tokens, drop_re.types, keep_re, keep_re.fnames, keep_re.freqlist, keep_re.tokens, keep_re.types
Subset an object based on a selection of types: drop_types, drop_types.fnames, drop_types.freqlist, drop_types.tokens, drop_types.types, keep_types, keep_types.fnames, keep_types.freqlist, keep_types.tokens, keep_types.types
Get text from xml node: mclm_xml_text
Merge concordances: merge_conc
Merge filenames collections: fnames_merge, fnames_merge_all, merge_fnames
Merge frequency lists: freqlist_merge, freqlist_merge_all, merge_freqlist
Merge 'tokens' objects: merge_tokens, tokens_merge, tokens_merge_all
Merge 'types' objects: merge_types, types_merge, types_merge_all
Count number of items in an 'fnames' object: n_fnames
Count tokens: n_tokens, n_tokens.freqlist, n_tokens.tokens
Count types: n_types, n_types.assoc_scores, n_types.freqlist, n_types.tokens, n_types.types
Retrieve or set original ranks: orig_ranks, orig_ranks.freqlist, orig_ranks<-, orig_ranks<-.default, orig_ranks<-.freqlist
P right quantile in chi-squared distribution with 1 degree of freedom: p_to_chisq1
Retrieve or set the flavor of a regular expression: perl_flavor, perl_flavor<-
Print a concordance in KWIC format: print_kwic
Print an object: mclm_print, print.assoc_scores, print.conc, print.fnames, print.freqlist, print.slma, print.tokens, print.types
Retrieve the current ranks for frequency counts: ranks, ranks.freqlist
Build a regular expression: as.re, as_re, re
Convenience functions in support of regular expressions: re_convenience, re_has_matches, re_replace_all, re_replace_first, re_retrieve_all, re_retrieve_first, re_retrieve_last, re_which
Read association scores from file: read_assoc
Read a concordance from a file: read_conc
Read a collection of filenames from a text file: read_fnames
Read a frequency list from a csv file: read_freqlist
Read a 'tokens' object from a text file: read_tokens
Read a text file into a character vector: read_txt
Read a vector of types from a text file: read_types
Scan a regular expression from console: scan_re, scan_re2
Scan a character string from console: scan_txt, scan_txt2
Shorten filenames: drop_extension, drop_path, short_names
Stable lexical marker analysis: slma
Sort an 'assoc_scores' object: sort.assoc_scores
Sort a frequency list: sort.freqlist
Create or coerce an object into class 'tokens': tokenize, tokens
Retrieve or set the total number of tokens: tot_n_tokens, tot_n_tokens.freqlist, tot_n_tokens<-, tot_n_tokens<-.freqlist
Truncate a sequence of character data: trunc_at, trunc_at.tokens
Retrieve frequencies from 'freqlist' object: type_freq, type_freqs
Return the names of the types in an object: type_names, type_names.assoc_scores, type_names.freqlist
Build a 'types' object: types
Write association scores to file: write_assoc
Write a concordance to file: write_conc
Write a collection of filenames to a text file: write_fnames
Write a frequency list to a csv file: write_freqlist
Write a 'tokens' object to a text file: write_tokens
Write a character vector to a text file: write_txt
Write a vector of types to a text file: write_types
Make all values strictly higher than zero: zero_plus
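
The help entries for chisq1_to_p and p_to_chisq1 describe two inverse conversions on the chi-squared distribution with one degree of freedom: from a statistic x to the proportion of the distribution to its right, and from such a proportion back to the statistic. The sketch below illustrates those two quantities with base R only. It does not call mclm, and the helper names upper_tail_p and upper_tail_quantile are hypothetical stand-ins chosen for this illustration; whether the package computes its values in exactly this way is an assumption.

```r
# Base-R sketch of the two conversions described in the help index.
# Hypothetical helper names; these are NOT mclm functions.

# Proportion of the chi-squared distribution (df = 1) to the right of x.
upper_tail_p <- function(x) {
  pchisq(x, df = 1, lower.tail = FALSE)
}

# Value whose right-tail proportion under chi-squared (df = 1) equals p.
upper_tail_quantile <- function(p) {
  qchisq(p, df = 1, lower.tail = FALSE)
}

crit   <- upper_tail_quantile(0.05)  # critical value at the 5% level
p_back <- upper_tail_p(crit)         # converting back recovers 0.05

print(round(crit, 3))
print(round(p_back, 3))
```

Because the two helpers are inverses of each other, converting a p-value to a statistic and back should return the original p-value up to floating-point error.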