Problem Statement
Explain basic and extended regular expressions in Linux. What are the differences and how do they affect grep, sed, and awk?
Explanation
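In Basic Regular Expressions (BRE), the characters +, ?, |, {, }, ( and ) are literal unless preceded by a backslash. POSIX BRE defines \{ \} for intervals and \( \) for grouping; \+, \?, and \| are widely supported extensions (GNU grep and GNU sed accept them) rather than required BRE syntax. So in GNU grep, grep 'a\+' matches one or more 'a', while grep 'a+' matches an 'a' followed by a literal plus sign. Extended Regular Expressions (ERE) reverse the convention: these characters are metacharacters by default, and a backslash makes them literal. Use grep -E for ERE (egrep is an older alias that GNU grep now marks as deprecated): grep -E 'a+' matches one or more 'a' without escaping.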
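The escaping rule is easiest to see by running the same pattern in both modes. The following is a minimal sketch that assumes GNU grep (for the \+ extension); the sample lines and variable names are made up for illustration.

```shell
# Hypothetical sample input: one candidate string per line.
input=$(printf '%s\n' 'b' 'ab' 'aab' 'a+b')

# BRE (default): an unescaped + is a literal plus sign.
bre_literal=$(printf '%s\n' "$input" | grep 'a+')

# BRE with the \+ extension (assumes GNU grep): one or more 'a'.
bre_plus=$(printf '%s\n' "$input" | grep 'a\+')

# ERE: + is a quantifier without escaping.
ere_plus=$(printf '%s\n' "$input" | grep -E 'a+')

# ERE: a backslash turns + back into a literal plus sign.
ere_literal=$(printf '%s\n' "$input" | grep -E 'a\+')

printf 'BRE a+  matches:\n%s\n\n' "$bre_literal"
printf 'BRE a\\+ matches:\n%s\n\n' "$bre_plus"
printf 'ERE a+  matches:\n%s\n\n' "$ere_plus"
printf 'ERE a\\+ matches:\n%s\n' "$ere_literal"
```

The two "literal" variants select only the line a+b, and the two "quantifier" variants select every line that contains at least one 'a'.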
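Common metacharacters: . (any single character), * (zero or more of the preceding item), + (one or more, ERE), ? (zero or one, ERE), ^ (line start), $ (line end), [...] (character class), [^...] (negated class), () (grouping; unescaped in ERE, written \( \) in BRE), | (alternation; unescaped in ERE, written \| in BRE where supported). Word anchors: \b matches a word boundary and \< \> match the start and end of a word. These word anchors are GNU extensions, not part of POSIX, so they are available only in some implementations.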
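As a rough illustration of the anchors listed above, the sketch below compares an unanchored match, a whole-line match with ^ and $, and a whole-word match with \< \>. It assumes GNU grep for the word anchors; the sample words and variable names are invented for the example.

```shell
# Hypothetical sample input.
words=$(printf '%s\n' 'cat' 'concatenate' 'the cat sat' 'cats')

# Unanchored: counts every line that contains 'cat' anywhere.
anywhere=$(printf '%s\n' "$words" | grep -c 'cat')

# ^ and $ anchor the pattern to the whole line.
whole_line=$(printf '%s\n' "$words" | grep '^cat$')

# \< and \> (assumes GNU grep) anchor the pattern to a whole word.
whole_word=$(printf '%s\n' "$words" | grep '\<cat\>')

printf 'lines containing cat: %s\n' "$anywhere"
printf 'whole-line matches:\n%s\n' "$whole_line"
printf 'whole-word matches:\n%s\n' "$whole_word"
```

Where \< \> are unavailable, grep -w is a common way to get a similar whole-word match.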
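By default, grep uses BRE and switches to ERE with the -E flag. By default, sed also uses BRE and switches to ERE with -E (GNU and BSD sed) or -r (the older GNU sed spelling). Patterns in awk are always ERE, so no flag is needed. Examples: grep 'cat\|dog' (BRE, using the \| extension) vs grep -E 'cat|dog' (ERE) for alternation; sed 's/\(word\)/\1s/' (BRE) vs sed -E 's/(word)/\1s/' (ERE) for capture groups. Note that the back-reference \1 in the replacement is written the same way in both modes; only the group delimiters in the pattern change.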
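The commands above can be tried side by side. This is a small sketch assuming GNU grep and a sed that accepts -E; the sample text and variable names are illustrative only.

```shell
# Hypothetical sample input.
text=$(printf '%s\n' 'one cat' 'two dogs' 'three birds')

# grep: alternation in BRE (\| extension, assumes GNU grep) and in ERE.
grep_bre=$(printf '%s\n' "$text" | grep 'cat\|dog')
grep_ere=$(printf '%s\n' "$text" | grep -E 'cat|dog')

# sed: capture group plus back-reference, BRE vs ERE.
sed_bre=$(printf '%s\n' 'word' | sed 's/\(word\)/\1s/')
sed_ere=$(printf '%s\n' 'word' | sed -E 's/(word)/\1s/')

# awk: patterns are ERE, so | needs no escaping and no flag.
awk_ere=$(printf '%s\n' "$text" | awk '/cat|dog/ { print $2 }')

printf 'grep BRE:\n%s\n' "$grep_bre"
printf 'grep ERE:\n%s\n' "$grep_ere"
printf 'sed BRE: %s\n' "$sed_bre"
printf 'sed ERE: %s\n' "$sed_ere"
printf 'awk second fields:\n%s\n' "$awk_ere"
```

Both grep calls select the same two lines and both sed calls print "words", which shows that the two syntaxes differ in escaping rather than in matching power for these cases.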
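POSIX character classes: [:alnum:] (alphanumeric), [:alpha:] (alphabetic), [:digit:] (digits), [:space:] (whitespace), [:upper:]/[:lower:] (case). These names are valid only inside a bracket expression, so a digit is matched with [[:digit:]], not [:digit:]; they can be combined, as in [[:alpha:]_]. Interval quantifiers: {n} (exactly n), {n,} (n or more), {n,m} (n to m) in ERE; in BRE the braces are escaped, as in \{n\}, \{n,\}, \{n,m\}. Knowing which dialect a tool expects prevents patterns that silently match literal characters instead of acting as operators, and sticking to POSIX-defined syntax (avoiding \+, \?, \|, \b, \< \> in BRE) keeps scripts portable across Unix systems.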
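A short sketch of character classes combined with interval quantifiers in both dialects follows. It uses only POSIX-defined syntax; the sample lines and variable names are invented for illustration.

```shell
# Hypothetical sample input.
lines=$(printf '%s\n' 'id 42' 'id 2024' 'id 123456' 'no digits')

# BRE: interval braces are escaped; the class name sits inside [ ].
bre_four=$(printf '%s\n' "$lines" | grep '^id [[:digit:]]\{4\}$')

# ERE: the same pattern with unescaped braces.
ere_four=$(printf '%s\n' "$lines" | grep -E '^id [[:digit:]]{4}$')

# ERE range: between two and four digits.
ere_range=$(printf '%s\n' "$lines" | grep -E '^id [[:digit:]]{2,4}$')

printf 'exactly four digits (BRE): %s\n' "$bre_four"
printf 'exactly four digits (ERE): %s\n' "$ere_four"
printf 'two to four digits (ERE):\n%s\n' "$ere_range"
```

The ^ and $ anchors matter here: without them, {4} would also match inside the six-digit line, because any four consecutive digits satisfy the pattern.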
