| Title: | Fast Regex Matching with Rust and Parallel Processing |
|---|---|
| Description: | High-performance regex matching for R using Rust via extendr. Supports both standard regex (fast) and fancy-regex (backrefs/lookaheads). Includes smart defaults for parallel processing and multiple output formats. |
| Authors: | Brancen Gregory [aut, cre] |
| Maintainer: | Brancen Gregory <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.0.0.9000 |
| Built: | 2026-05-27 06:06:20 UTC |
| Source: | https://github.com/brancengregory/stringrs |
Choose regex engine based on pattern complexity
choose_engine(patterns)
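The manual does not say how complexity is judged. As a rough illustration only, the sketch below picks the fancy engine when any pattern contains a lookaround group or a numeric backreference (the features the package description attributes to fancy-regex) and the standard engine otherwise. The names `Engine`, `needs_fancy`, and `choose_engine_sketch` are hypothetical and are not the package's internals.

```rust
/// Engine names mirror the `engine` argument values documented in this manual.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Engine {
    Regex,
    FancyRegex,
}

/// Hypothetical heuristic: true if the pattern appears to use a lookaround
/// group or a numeric backreference. It does not parse character classes,
/// so `[(?=]` is a false positive (that costs speed, not correctness).
fn needs_fancy(pattern: &str) -> bool {
    let bytes = pattern.as_bytes();
    let lookarounds: [&[u8]; 4] = [b"(?=", b"(?!", b"(?<=", b"(?<!"];
    let mut i = 0;
    while i < bytes.len() {
        match bytes[i] {
            b'\\' => {
                // `\1`..`\9` is a backreference; any other escape,
                // including `\\`, is skipped as a two-byte unit.
                if let Some(next) = bytes.get(i + 1) {
                    if (b'1'..=b'9').contains(next) {
                        return true;
                    }
                }
                i += 2;
            }
            b'(' => {
                if lookarounds.iter().any(|la| bytes[i..].starts_with(la)) {
                    return true;
                }
                i += 1;
            }
            _ => i += 1,
        }
    }
    false
}

/// Assumed policy: one pattern that needs the fancy engine switches the
/// whole batch to it.
fn choose_engine_sketch(patterns: &[&str]) -> Engine {
    if patterns.iter().any(|p| needs_fancy(p)) {
        Engine::FancyRegex
    } else {
        Engine::Regex
    }
}
```

Under this heuristic, `foo\d+` stays on the standard engine, while `(a)\1` and `foo(?=bar)` select the fancy engine.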
Detect multiple patterns using optimized flat array output
detect_multi(strings, patterns, output, engine, parallel, chunk_size)
Detect single pattern match (uses cached compilation)
detect_single(strings, pattern, engine, parallel)
Fancy-regex single pattern with caching
r_string_detect_fancy_cached(strings, pattern, parallel)
Multi-pattern fancy-regex with caching
r_string_detect_multi_fancy_optimized(strings, patterns, parallel_strategy, chunk_size)
Optimized multi-pattern detection using a flat boolean array. Returns a flat array of n_strings × n_patterns values as an R integer vector (0/1).
r_string_detect_multi_regex_optimized(strings, patterns, parallel_strategy, chunk_size)
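To make the flat n_strings × n_patterns result concrete, here is a minimal sketch of how such a vector can be filled and indexed. The manual does not state the element order, so the string-major layout below is an assumption, and a plain substring test stands in for regex matching. `flat_index` and `detect_multi_flat` are hypothetical names.

```rust
/// Position of the (string, pattern) cell in the flat result.
/// ASSUMPTION: string-major order, i.e. all patterns for string 0,
/// then all patterns for string 1, and so on.
fn flat_index(string_idx: usize, pattern_idx: usize, n_patterns: usize) -> usize {
    string_idx * n_patterns + pattern_idx
}

/// Build a flat 0/1 vector of length n_strings * n_patterns.
/// `str::contains` is a stand-in for a compiled regex match.
fn detect_multi_flat(strings: &[&str], patterns: &[&str]) -> Vec<i32> {
    let n_patterns = patterns.len();
    let mut out = vec![0i32; strings.len() * n_patterns];
    for (s, text) in strings.iter().enumerate() {
        for (p, pat) in patterns.iter().enumerate() {
            if text.contains(*pat) {
                out[flat_index(s, p, n_patterns)] = 1;
            }
        }
    }
    out
}
```

With strings `["apple pie", "banana"]` and patterns `["an", "pie", "a"]`, this yields `[0, 1, 1, 1, 0, 1]`. A single allocation of this kind avoids building one vector per pattern, which is presumably why the package describes the flat layout as optimized.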
Optimized single-pattern detection
r_string_detect_regex_cached(strings, pattern, parallel)
Uses a global regex cache, thread-local instances for zero contention, flat array output, and optimized parallel chunking.
string_detect(strings, patterns, output = c("wide", "long"), engine = c("auto", "regex", "fancy_regex"), parallel = c("auto", "sequential", "string_parallel", "pattern_parallel"), chunk_size = NULL)
| Argument | Description |
|---|---|
| strings | Character vector of strings to search |
| patterns | Character vector of regex patterns (single or multiple) |
| output | Format of output: "wide" (default) or "long" |
| engine | Regex engine: "auto" (default), "regex", or "fancy_regex" |
| parallel | Parallel strategy: "auto" (default), "sequential", "string_parallel", "pattern_parallel" |
| chunk_size | Number of strings per chunk for parallel processing. NULL for auto. |
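The effect of `chunk_size` under a string-parallel strategy can be pictured with the sketch below: the input is split into chunks of that many strings, each chunk is matched on its own thread, and the results are joined in input order. This is an illustration under assumptions, not the package's code. `detect_chunked` is a hypothetical name, a substring test stands in for the regex match, and one scoped thread per chunk replaces whatever thread pool a real implementation would normally use.

```rust
use std::thread;

/// Match `needle` against every string, processing `chunk_size` strings
/// per worker thread. Output order matches input order.
fn detect_chunked(strings: &[&str], needle: &str, chunk_size: usize) -> Vec<bool> {
    // `chunks(0)` panics, so treat 0 as 1.
    let chunk_size = chunk_size.max(1);
    thread::scope(|scope| {
        // Spawn every worker first so the chunks run concurrently.
        let handles: Vec<_> = strings
            .chunks(chunk_size)
            .map(|chunk| {
                scope.spawn(move || {
                    chunk
                        .iter()
                        .map(|s| s.contains(needle))
                        .collect::<Vec<bool>>()
                })
            })
            .collect();
        // Join in spawn order, which keeps results aligned with the input.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker thread panicked"))
            .collect::<Vec<bool>>()
    })
}
```

Small chunks balance work across threads better but add scheduling overhead, while large chunks do the opposite. That trade-off is the usual reason to offer an automatic default alongside a manual setting.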
Returns a logical vector for a single pattern. For multiple patterns, returns a tibble in wide or long format.
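The manual does not spell out the columns of the two formats. Assuming the conventional meanings (wide: one row per string with one column per pattern; long: one row per string/pattern pair), the relationship between them can be sketched as follows. `LongRow` and `wide_to_long` are hypothetical names, and the indices are 0-based here, whereas R would use 1-based positions or the values themselves.

```rust
/// One row of the assumed long format.
#[derive(Debug, PartialEq)]
struct LongRow {
    string_idx: usize,
    pattern_idx: usize,
    matched: bool,
}

/// Reshape a wide result (one inner Vec per string, one entry per pattern)
/// into one row per (string, pattern) pair.
fn wide_to_long(wide: &[Vec<bool>]) -> Vec<LongRow> {
    wide.iter()
        .enumerate()
        .flat_map(|(s, row)| {
            row.iter().enumerate().map(move |(p, &m)| LongRow {
                string_idx: s,
                pattern_idx: p,
                matched: m,
            })
        })
        .collect()
}
```

A wide result with 2 strings and 2 patterns becomes 4 long rows. The long form is usually easier to filter and group, while the wide form is more compact to read.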