Awk regular expressions and regular expression operators in detail

  

When you use awk as a text processing tool, regular expressions are indispensable, so you need to master the regular expressions this tool uses. In fact, there is no need to learn awk's regular expressions in isolation. Regular expressions are like a programming language: they have their own grammar rules and symbols with fixed meanings, and most of those meanings are the same across different tools. Regular expressions are used by many Linux text processing tools (awk, sed, grep, perl), but there are really only three types. For details, please refer to: linux shell regular expressions (BREs, EREs, PREs) difference comparison. Tools that belong to the same type share basically the same grammar rules. From that article we know that awk's regular expressions are Extended Regular Expressions (EREs).


First, an introduction to the basic symbols of awk Extended Regular Expressions (EREs)

Character, followed by its function:
+ Matches if one or more occurrences of the character or extended regular expression immediately before the + (plus sign) are in the string. The command line:
awk '/smith+ern/' testfile
prints to standard output every record that contains the characters smit, followed by one or more h characters, followed by the characters ern. The output in this example is:
smithern, harry
smithhern, anne
? Matches if zero or one occurrence of the character or extended regular expression immediately before the ? (question mark) is in the string. The command line:
awk '/smith?/' testfile
prints to standard output every record that contains the characters smit, followed by zero or one h character. The output in this example is:
smith, alan
smithern, harry
smithhern, anne
smitters, alexis
| Matches if any one of the strings separated by the | (vertical bar) is in the string. The command line:
awk '/allen|alan/' testfile
prints to standard output every record that contains the string allen or alan. The output in this example is:
smiley, allen
smith, alan
( ) Groups strings together in a regular expression. The command line:
awk '/a(ll)?(nn)?e/' testfile
prints to standard output every record that contains the string ae, alle, anne, or allnne. The output in this example is:
smiley, allen
smithhern, anne
{m} Matches if exactly m occurrences of the preceding pattern are in the string. The command line:
awk '/l{2}/' testfile
prints to standard output:
smiley, allen
{m,} Matches if at least m occurrences of the preceding pattern are in the string. The command line:
awk '/t{2,}/' testfile
prints to standard output:
smitters, alexis
{m,n} Matches if between m and n occurrences (inclusive, where m <= n) of the preceding pattern are in the string. The command line:
awk '/er{1,2}/' testfile
prints to standard output:
smithern, harry
smithhern, anne
smitters, alexis
[String] Matches any single character listed in String inside the square brackets. The command line:
awk '/sm[a-h]/' testfile
prints to standard output every record that contains the characters sm followed by any one character from a to h in alphabetical order. The output in this example is:
smawley, andy
[^String] A ^ (caret) as the first character inside the [ ] (square brackets) negates the set: the expression matches any single character that is not listed in String. Thus, the command line:
awk '/sm[^a-h]/' testfile
prints to standard output:
smiley, allen
smith, alan
smithern, harry
smithhern, anne
smitters, alexis
~, !~ Used in a conditional statement to test whether a variable matches (tilde) or does not match (exclamation point followed by a tilde) a regular expression. The command line:
awk '$1 ~ /n/' testfile
prints to standard output every record whose first field contains the character n. The output in this example is:
smithern, harry
smithhern, anne
^ Anchors the match to the beginning of a field or record. The command line:
awk '$2 ~ /^h/' testfile
prints to standard output every record whose second field begins with the character h. The output in this example is:
smithern, harry
$ Anchors the match to the end of a field or record. The command line:
awk '$2 ~ /y$/' testfile
prints to standard output every record whose second field ends with the character y. The output in this example is:
smawley, andy
smithern, harry
. (period) Matches any single character. With the default record separator, the terminating newline is not part of the record, so it is never matched. The command line:
awk '/a..e/' testfile
prints to standard output every record in which the characters a and e are separated by exactly two characters. The output in this example is:
smawley, andy
smiley, allen
smithhern, anne
* (asterisk) Matches zero or more occurrences of the preceding character or expression; combined with a period, .* matches zero or more of any characters. The command line:
awk '/a.*e/' testfile
prints to standard output every record in which the characters a and e are separated by zero or more characters. The output in this example is:
smawley, andy
smiley, allen
smithhern, anne
smitters, alexis
\ (backslash) The escape character. When it precedes a character that has a special meaning in extended regular expressions, it removes that special meaning. For example, the pattern:
/a\/\//
matches the string a//, because each backslash cancels the usual meaning of the slash as the regular expression delimiter. To specify the backslash itself as a character, use a double backslash (\\). For more information on the backslash and its uses, see the documentation on escape sequences.
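
The table never shows the contents of testfile. The sketch below assumes the six records that can be inferred from the outputs quoted above (an assumption, not something the original states), recreates the file, and reruns a few of the commands. The variable names are illustrative only.

```shell
# Recreate the sample input. These six records are an assumption
# inferred from the outputs quoted in the table.
cat > testfile <<'EOF'
smawley, andy
smiley, allen
smith, alan
smithern, harry
smithhern, anne
smitters, alexis
EOF

# + : "smit", then one or more "h", then "ern"
plus_out=$(awk '/smith+ern/' testfile)

# | : either "allen" or "alan"
alt_out=$(awk '/allen|alan/' testfile)

# ( ) with ? : "ae", "alle", "anne" or "allnne"
group_out=$(awk '/a(ll)?(nn)?e/' testfile)

# ^ applied to a field: the second field starts with "h"
anchor_out=$(awk '$2 ~ /^h/' testfile)

printf '%s\n' "$plus_out" "$alt_out" "$group_out" "$anchor_out"
```

The interval forms ({m}, {m,}, {m,n}) are left out of this sketch on purpose, because some older awk implementations do not enable them by default.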


Compared with PREs, EREs mainly lack some shorthand class escapes, including \d, \D, \s, and \S; escapes for control characters such as \t, \v, and \f are still accepted by awk. Other features are basically the same. The regular expressions supported by common software such as JavaScript, .NET, and Java are basically of the PRE type.
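
Because EREs have no \d or \D, a bracket expression has to stand in for them. A minimal sketch with made-up input lines (the sample strings and variable names are illustrative only):

```shell
# PRE-style \d is not part of EREs; write the class out as [0-9].
digits_out=$(printf 'abc\nroom 42\nx9\n' | awk '/[0-9]/')

# PRE-style \D becomes the negated class [^0-9];
# here it selects records made up entirely of non-digits.
nondigits_out=$(printf 'abc\nroom 42\nx9\n' | awk '/^[^0-9]+$/')

printf '%s\n' "$digits_out" "$nondigits_out"
```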


Second, common ways to use regular expressions in awk

  • awk statement:

    awk '/REG/{ Action }'

    /REG/ is a regular expression; every record ($0) that matches it is passed to Action for processing.

  • awk regular expression matching operators (~ and !~)

    [chengmo@centos5 ~]$ awk 'BEGIN{info="this is a test";if( info ~ /test/){print "ok" }}'
    ok

  • awk built-in functions that take regular expressions

    gsub( Ere, Repl, [ In ] )

    sub( Ere, Repl , [ In ] )

    match( String, Ere )

    split( String, A, [Ere] )

    For detailed function usage, see: linux awk built-in function Detailed introduction (example)
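
The three calling styles listed above can be tried together. This is a minimal sketch with made-up input strings, not the examples from the referenced article. RSTART and RLENGTH are the standard awk variables set by match(); the other variable names are illustrative.

```shell
# Pattern-action form: the action runs only for records ($0) matching /REG/.
pattern_out=$(printf 'alpha\nbeta\ngamma\n' | awk '/^[bg]/ { print NR ": " $0 }')

# ~ and !~ test a string (or field) against a regular expression.
match_out=$(awk 'BEGIN {
    info = "this is a test"
    if (info ~ /test/)  print "ok"
    if (info !~ /demo/) print "no demo"
}')

# Built-in functions that take an extended regular expression.
func_out=$(awk 'BEGIN {
    s = "this is a test"
    n = gsub(/is/, "IS", s)              # replace every match, return the count
    print n, s
    sub(/t/, "T", s)                     # replace only the first match
    print s
    pos = match(s, /a +test/)            # sets RSTART and RLENGTH as a side effect
    print pos, RSTART, RLENGTH
    k = split("a1b22c", parts, /[0-9]+/) # split on a regex separator
    print k, parts[1], parts[2], parts[3]
}')

printf '%s\n' "$pattern_out" "$match_out" "$func_out"
```

gsub() and sub() modify the target variable in place and return the number of replacements, which is why s is printed after each call.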


    I hope the above gives you a clearer understanding of awk regular expressions. If you have any questions, feel free to get in touch with me!
