Awk in Detail

  

AWK is an excellent text-processing tool and one of the most powerful data-processing engines available, on Linux or in any other environment. How much you can get out of this programming and data-manipulation language (named after the initials of the surnames of its creators, Alfred Aho, Peter Weinberger, and Brian Kernighan) depends mainly on how well you know it. AWK provides powerful features: pattern matching, flow control, mathematical operators, process control statements, and built-in variables and functions. It has almost all the features a complete language should have. In fact, AWK does have its own language, the AWK programming language, which its three creators formally defined as a "pattern scanning and processing language". It lets you write short programs that read input files, sort data, process data, perform calculations on the input, and generate reports, among many other tasks.

You may be familiar with UNIX and still be unfamiliar with awk, which is not surprising: awk is far less well known than its features deserve. What is awk? Unlike most UNIX commands, its name gives no hint of its function: it is neither an English word with a meaning of its own nor an abbreviation of related words. The name is formed from the initials of three surnames: (Alfred) Aho, (Peter) Weinberger, and (Brian) Kernighan. These three people created awk, an excellent pattern scanning and processing tool.

At its simplest, AWK is a programming language tool for working with text. AWK resembles the shell programming language in many ways, although it has its own syntax. Its design draws on SNOBOL4, sed, a validation language designed by Marc Rochkind, the language tools yacc and lex, and of course some excellent ideas from the C language. AWK was originally created for text processing, and the basis of the language is to execute a series of instructions whenever a pattern matches in the input data. The utility scans each line of the file for the patterns given on the command line. If a line matches, awk carries out the associated action. If it does not match, awk moves on to the next line.

Although the operation can be complicated, the syntax of the command is always:

awk 'pattern { action }' filenames

Here, pattern is what AWK looks for in the data, and action is a series of commands executed when a match is found. Braces ({}) do not have to appear everywhere in a program, but they are used to group a series of instructions under a particular pattern.
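As a minimal sketch of the pattern and action pairing described above, the following runs awk on a few invented input lines (the sample text and the variable names `input`, `matched`, and `whole` are illustrative assumptions, not part of the original article). The pattern `/error/` selects lines, and the action in braces runs once for each selected line.

```shell
# Sample input, invented for illustration.
input='error disk full
info all good
error network down'

# The pattern /error/ selects matching lines; the action in braces
# prints the line number (NR) followed by the second and third fields.
matched=$(printf '%s\n' "$input" | awk '/error/ { print NR ": " $2, $3 }')
printf '%s\n' "$matched"

# A pattern with no action prints each matching line unchanged.
whole=$(printf '%s\n' "$input" | awk '/info/')
printf '%s\n' "$whole"
```

The second command shows why the braces are optional: when only a pattern is given, awk falls back to printing the matching line.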

gawk is the GNU version of AWK.

First, what does awk do?

Like sed and grep, awk is a pattern scanning and processing tool, but it is considerably more capable than either. Awk can do almost everything grep and sed can do, and it also offers pattern matching, flow control, mathematical operators, process control statements, and built-in variables and functions. It has almost all the features a complete language should have. In fact, awk does have its own language, the awk programming language, which its three creators formally defined as a pattern scanning and processing language.
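The comparison with grep and sed can be sketched roughly as follows. The sample data and the variable names are invented for illustration; the awk features used (regular-expression patterns, `sub`, the `END` block, and the built-in variable `NR`) are standard.

```shell
# Hypothetical sample data: a name and a score on each line.
data='ann 90
bob 72
cat 84'

# grep-like: print the lines that match a regular expression.
grep_like=$(printf '%s\n' "$data" | awk '/^b/')

# sed-like: substitute text on each line, then print the line.
sed_like=$(printf '%s\n' "$data" | awk '{ sub(/^ann/, "anna"); print }')

# Beyond grep and sed: arithmetic, flow control, and the built-in
# variable NR (number of records read so far).
average=$(printf '%s\n' "$data" |
  awk '{ total += $2 } END { if (NR > 0) print total / NR }')

printf '%s\n' "$grep_like" "$sed_like" "$average"
```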

Second, why use awk?

Even so, you may still ask, why should I use awk?

The first reason to use awk is that text-based pattern scanning and processing is something we do all the time. Awk does work similar to a database, but unlike a database it handles plain text files. These files have no special storage format, so ordinary people can edit, read, understand, and process them. Database files tend to have special storage formats, which means they must be processed with a database program. Since this kind of database-like processing comes up so often, we should have a simple and easy way to handle it. UNIX has many tools for this, such as sed, grep, sort, and find, and awk is a very good one among them.
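To make the database comparison concrete, here is a small sketch that treats a plain text "table" as records and fields. The colon-separated layout, the data, and the variable names are assumptions chosen for illustration; the SQL in the comments is only a rough analogy.

```shell
# Hypothetical colon-separated records: name:department:salary
records='ann:sales:3000
bob:dev:4000
cat:dev:5000'

# Roughly: SELECT name, salary WHERE department = 'dev'
devs=$(printf '%s\n' "$records" |
  awk -F: '$2 == "dev" { print $1, $3 }')

# Roughly: SELECT department, SUM(salary) GROUP BY department.
# The order of "for (d in sum)" is unspecified, so the result is sorted.
totals=$(printf '%s\n' "$records" |
  awk -F: '{ sum[$2] += $3 } END { for (d in sum) print d, sum[d] }' |
  sort)

printf '%s\n' "$devs" "$totals"
```

The same file stays readable and editable in any text editor, which is the point the paragraph above makes.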

The second reason to use awk is that it is a simple tool, at least relative to how powerful it is. UNIX certainly has many excellent tools, such as C, the native UNIX development language, and its successor C++. Compared with them, however, awk is far more convenient and concise for accomplishing the same task. First, awk scales to the need at hand, from a one-line awk command for simple problems to the full awk programming language for complex ones, so you never have to solve a simple problem with a complicated method. For example, you can solve a simple problem with a single command line, which C cannot do: even the simplest C program has to be written and then compiled. Second, awk is interpreted, so awk programs need no compilation step, which also makes awk fit well into shell scripts. Finally, awk itself is simpler than C. Awk borrows many of C's best ideas, and knowing C is a great help in learning awk, but awk does not require you to know C, a powerful development tool that takes a lot of time to learn and master.

The third reason to use awk is that it is easy to obtain. Unlike C and C++, awk consists of a single file (/bin/awk), and almost every version of UNIX ships its own version of awk, so you do not have to worry about how to get it. C is a different matter. Although C is the natural development tool for UNIX, it is distributed separately. In other words, you must pay for the C development tools for your version of UNIX (unless you are using a pirated copy), then obtain and install them before you can use them.

For these reasons, together with awk's powerful features, it is fair to say that awk should be your first choice for work involving text pattern scanning. Here is a general rule to follow: if ordinary shell tools or shell scripts are not up to the job, try awk. If awk still cannot solve the problem, use C. If C still fails, move to C++.

Third, how to invoke awk

As mentioned above, awk offers different ways of working to suit different needs. They are:

1, awk command line. You can use awk just like any ordinary UNIX command, and you can also use the awk programming language on the command line. Although awk accepts multi-line input, typing a long command line and making sure it is correct is a headache, so this method is generally used only for simple problems. Of course, you can also call an awk command line, or even an awk script, from within a shell script.
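As a rough sketch of calling an awk command line from inside a shell script, the fragment below passes a shell variable to awk with the standard `-v` option. The threshold value, the sample input, and the variable names are invented for illustration.

```shell
# Hypothetical shell script fragment: the threshold and input are made up.
threshold=80

# Pass the shell variable into awk with -v rather than splicing it
# into the program text. "count + 0" prints 0 if nothing matched.
passed=$(printf 'ann 90\nbob 72\ncat 84\n' |
  awk -v min="$threshold" '$2 >= min { count++ } END { print count + 0 }')

echo "scores at or above $threshold: $passed"
```

Keeping the awk program in single quotes and passing values with `-v` avoids quoting problems when the shell variable contains spaces or special characters.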
