Regular Expressions
by Michael Paoli
:se tabstop=4
//notes to self to accompany slides (+ optional demos)
//mostly just to keep myself in sync with slides and remind myself of
//points I may wish to cover that I might not immediately think of based
//upon just the slide
//water :-)
--1---------------------------------------------------------------------
Regular Expressions
//title slide
//Introduction of Michael Paoli
//questions/timeline/presentation, e.g.:
Okay, much to cover in comparatively little time - I'm going to go
pretty fast. Hopefully we'll have some time for questions at the end;
if you feel a burning question before that, please raise your hand, and
time permitting I may pick up a few questions along the way.
--2---------------------------------------------------------------------
What is a Regular Expression (RE)?
--3---------------------------------------------------------------------
Flavors (covering ...)
    Shells
        is it an RE, or not?
        covering for completeness
        not covering more full RE stuff in some shells
        mostly just covering common wildcard/globbing among shells
    BRE & ERE - from POSIX (standardization, etc.)
        BRE
            common family found in many places
        ERE
            variant of BRE, found in many places
    + BSDisms
        some useful extensions found in many BSD derived REs
    Perl, etc. - among the most powerful - and complex
        perlre manpage 1032 non-blank lines
        ed RE 28 lines on ed(1) manpage - 1979
    commonalities, differences, exceptions
        cover key "families"/flavors, differences of note
        other differences/exceptions of note
        won't cover every difference/exception
--4---------------------------------------------------------------------
Shells
    Bourne, etc. & POSIX-like
        character class - more details later
        \ - quote, escape, make special/meta - or not
    C-shells
        a{b,c,d}e - looks RE-like, but it's not; commonly confused
    (time permitting) examples
--5---------------------------------------------------------------------
BRE - beginnings ... to grep, expr, sed, awk, vi, ...
    still mostly forward compatible from very old/ancient definition,
    many older bugs/limitations long since gone
    quantifiers, e.g. *, refer to the smallest indivisible preceding thing
    (time permitting) examples
    (time permitting) example of derivation of grep name
--6---------------------------------------------------------------------
BREs (continued from ed(1))
    Character class
    (time permitting) examples
--7---------------------------------------------------------------------
BREs (continued from ed(1))
    \
    \(\), (), (?:)
    \N
    &, $&
--8---------------------------------------------------------------------
BREs - modern, adds ...
    \{m,n\}, {m,n}
--9---------------------------------------------------------------------
EREs (e.g. egrep, etc.)
    | - takes the maximal alternatives
    ?
    +
-10---------------------------------------------------------------------
BSDisms
    \<, \>
-11---------------------------------------------------------------------
Perl, etc.
-12---------------------------------------------------------------------
Perl - covered and not covered here
    not covering double quote (", qq) interpolation
    m'RE' - \n still matches newline, etc.
-13---------------------------------------------------------------------
Perl adds:
    not all covered here - just what perl adds and is relevant to RE match
    portion, and not " interpolation bits
-14---------------------------------------------------------------------
/x, (?x) continued
    Make REs human readable! :-)
-15---------------------------------------------------------------------
Perl (continued)
    \w ...
    \b ...
    \y
-16---------------------------------------------------------------------
Perl (continued)
    \B ...
    \Q
-17---------------------------------------------------------------------
Perl (continued)
    positive and negative look ahead/behind
        behind pattern usually limited to literal characters, character
        classes, and | (alternation)
-18---------------------------------------------------------------------
Worthy of note
    fgrep
    [ef]?grep -v, -l - efficiency - also -q (but -q not as portable)
-19---------------------------------------------------------------------
Worthy of note (continued)
    expr - just matching a string to an RE?
-20---------------------------------------------------------------------
Worthy of note (continued)
    sed - surprisingly powerful for "just" a streaming editor
        pattern space
        hold space
        embedded newlines
        conditional and unconditional branching and labels
    awk - more of a programming language - sort of like interpreted C + REs
-21---------------------------------------------------------------------
Worthy of note (continued)
    -e
    vi/nvi vs. vim
        vi/nvi - mostly BRE + BSDisms
        vim - perl-like + BSDisms + lots of extensions and exceptions
            could write a (small) book on vim annoyances
-22---------------------------------------------------------------------
select Examples
    5 letter palindromes - from an old article
        echo and command substitution just to get output more consolidated
        for slide
    Apache - examples taken from BALUG main production site on virtual
    hosting provider
        Apache extensions - ! at start to negate, adjacent AND, [NC], [OR]
        perl-like, with some limitations, exceptions, extensions
-23---------------------------------------------------------------------
"Footnotes"
-24---------------------------------------------------------------------
References
    RE metasyntax vs. tools/languages/libraries
-25---------------------------------------------------------------------
Questions?
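//possible fallback demo for the slide 22 "5 letter palindromes" example
//a minimal sketch only, not the example from the old article: the inline
//word list is a made-up stand-in for a real dictionary file, and the
//variable names (words, palindromes) are just for illustration.
//It uses BRE grouping \(\) and back-references \1, \2 (slide 7), then
//unquoted echo to consolidate the output onto one line for the slide.

```shell
# Hypothetical word list; a real demo would more likely read a dictionary file.
words='level
radar
hello
civic
world
racecar
refer'

# BRE: capture 1st and 2nd characters, allow any middle character,
# then require the 2nd and 1st characters again, in that order.
# ^ and $ anchor the match so only 5-character lines qualify.
palindromes=$(printf '%s\n' "$words" | grep '^\(.\)\(.\).\2\1$')

# Unquoted expansion lets the shell collapse the newlines to spaces.
echo $palindromes
# prints: level radar civic refer
```

//racecar is a palindrome but has 7 letters, so the anchors exclude it.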