stevepedwards.com/DebianAdmin linux mint IT admin tips info

Printing (LIVE!) ASCII Histograms from Numeric Data Output + Files

 

Live histogram in a terminal??!! So how did I get to this point after 3 days of hard slog? Read on..!

------------------------------------------------------------------------------------------------------------------------------------------------- 

2/9/16 OK, I have checked my suspicion: the number of hashes goes out of proportion for live data once the first reference maximum has been read (without the sort function in the first part of the original command), if that first value is smaller than later numbers. You can change the multiplier from 60 to 1 for exact proportion, but then your range is limited to about 1-143 max before bars spill onto new lines in a full-screen terminal width. Programmers out there will have solutions no doubt...

143max.png

Another limitation is that the data cannot contain zero or negative numbers, due to divide-by-zero errors.
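One workaround sketch for both problems (my own tweak, not from the original command): use awk's two-file NR==FNR idiom to scan the file once just to find the true maximum, then draw the bars on the second pass, skipping non-positive values. The file name data.txt here is hypothetical - substitute your own one-number-per-line file.

```shell
# Two-pass sketch: pass 1 finds the true maximum, pass 2 draws the bars.
# Non-positive values are skipped, so 60*$1/max never divides by zero.
# "data.txt" is a hypothetical one-number-per-line file.
awk 'NR==FNR { if ($1>max) max=$1; next }   # pass 1: find max
     $1>0   { r=""; i=60*$1/max
              while (i-->0) r=r"#"
              printf "%5d %s\n", $1, r }' data.txt data.txt
```

This only works for finished files of course, not live streams, since you cannot pre-scan data that has not arrived yet.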

------------------------------------------------------------------------------------------------------------------------------------------------- 

Boy! This topic had me going bonkers! Over 2 days I could not find simple examples of terminal ASCII histograms on the web! Making sense of Perl nearly got me sectioned... and I still failed to grasp its methods... Awk gave me hope, though I did not think these histos were possible in awk without perl, as I never found an example until this seemingly complex nightmare. This is the command that the statements and limitations above refer to, when altered for live data output without the sort section:

history | awk '{h[$2]++}END{for(i in h){print h[i],i|"sort -rn|head -20"}}' |awk '!max{max=$1;}{r="";i=s=60*$1/max;while(i-->0)r=r"#";printf "%15s %5d %s %s",$2,$1,r,"\n";}'

histhisto.png

Many people on the web want this too, as it's very useful without installing large job-specific apps, so why such a lack of simple examples or good tutorials out there? I guess it's because awk and perl are complex to understand, and if you just want a simple diagram from some numbers, who is going to learn these languages from scratch when other tools like Gnuplot are simpler to use? You can see the journey I took on the awk/perl notepad page - though I'm trying to learn awk anyway.

It turns out that after fighting with various web examples, then simplifying them as best I could - mostly via trial and error, trying to work out the arcane dark art that is perl - I had an answer from day 1 staring me in the face, with just one tweak required - "print $1" - for it to process a simple numeric data file and spit out bars:

perl -pe 's/(\d+)$/"="x$1/e; print "$1 "' random.txt

randomhisto.png

Easy eh? Yeah, once shown..!
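For comparison, the same bar-per-line idea can be done in pure awk too (my sketch, not one of the web examples), again assuming a file of plain positive integers like random.txt above:

```shell
# Hypothetical awk equivalent of the perl one-liner: one bar of "=" per line,
# bar length equal to the number itself (no scaling).
awk '{ r=""; for (i=0; i<$1; i++) r=r"="; print $1, r }' random.txt
```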

Now there is fine tuning for the pedantic, as padding is required for single digits, accounting for decimals etc., but it is a start, and the output can be tidied via sort if required:

perl -pe 's/(\d+)$/"="x$1/e; print "$1 "' random.txt | sort -nr

randsorted.png

The main point is that it READS EACH LINE of the file without explicit loops etc. - default behaviour for awk as far as I've seen - but this had me stumped with the main example I was working with below, which has a "multiplier" to add the bar characters in proportion to the data, scaling the decimal to a better view size:

 perl -e 'while(1) {`uptime` =~ /average: ([\d.]+)/; printf("% 5s %s\n", $1, "#" x ($1 * 10)); sleep 3 }'

perlloadavg.png

You can see why I had problems trying to reduce this example to something simpler that takes file input rather than a command, keeps the bar output, AND removes the while loop and sleep timer. It would not work whatever I tried, as perl only reads the first line from the input and stops, even when awk was feeding it something like:

awk '{print}' 1_5.txt
1
2
3
4
5

awk '{print}' 1_5.txt | perl -e '{/^/ =~ /([\d.]+)/; printf("% 5s %s\n", $1, "#" x ($1 * 10)) }'
1 ##########

One other major mistake I kept making was confusing "backticks" for my usual single quotes:

`command` is required - NOT 'command'

On my keyboard these are accessed via the Fn key (near the Spacebar) together with the "¬ marked" key (next to the 1 key)... I've not used them before, as most stuff is pasted from elsewhere - so if you are using single quotes and your commands are not working as they should, you know why, as in the example:

{`uptime` =~ /average:

I did get cat 1_5.txt to work in place of uptime, but perl still only reads one line in multi-line examples:

perl -e '{`cat load.txt` =~ /average: ([\d.]+)/; printf("% 5s %s\n", $1, "#" x ($1 * 10))}'
1.15 ###########

No good for:

cat 6_1.txt
6
5
4
3
2

perl -e '{`cat 6_1.txt` =~ /([\d.]+)/; printf("% 5s %s\n", $1, "#" x ($1 * 10))}'
6 ############################################################

As only the first line (6) is read, not the rest. That was a frustrating dead end for ages.

With scripting, research, trial, error and persistence are often the only way to get to a solution, short of spending weeks learning a language from scratch in the hope it will cover exactly what you are looking for - but you still need some basics of any language to have an idea of its behaviour and the character options for specific functions... not easy.

Now with these examples, uptime particularly, you MAY be able to amend them sufficiently to apply to any or all of the system tools I have listed in the Performance Tools Post.

31/8/16: OK, 2 days later I have learned a lot by studying the history histogram function above and worked out some (not all!!) of how it works. A smart programmer wrote that line!:

------------------------------------------------------------------------------------------------------

The function is actually two awk commands that can be treated separately - good!

history | awk '{h[$2]++}END{for(i in h){print h[i],i|"sort -rn|head -20"}}'
419 awk
216 perl
122 cat
39 vi...

The first part above is easy-ish - it counts how many occurrences of each command were used, sorts high to low numerically, then heads the list to 20. The full history record varies by setup and Linux distro; the Mint default is:

history | wc -l
1000

The history awk function reads $2 for the command names and puts them in h[$2] :

history | tail -3
2208 wc -w < TheRavenV1_3.txt
2209 history | head -3
2210 history | tail -3

history | awk '{h[$2]++}END{for(i in h){print h[i],i|"sort -rn|head -20"}}' |awk '!max{max=$1;}{r="";i=s=60*$1/max;while(i-->0)r=r"#";printf "%15s %5d %s %s",$2,$1,r,"\n";}' | head -2
awk 419 ############################################################
perl 216 ###############################

I think it puts the contents of $2 from the history list into an array h indexed by the command name, so h[$2] holds that command's occurrence count, incremented each time the name appears; in the END for loop, i iterates over the command names and h[i] gives each count. This can be shown by printing the contents of each section:

Array command list contents (each occurrence counted by uniq here for clarity):

history | awk '{h[$2]++; print $2}' | uniq -c | head
5 cat
3 perl
1 uptime
1 vi
22 cat
1 sudo
2 perl
1 uptime
10 cat
10 uptime

Array Command Occurrence Counter:

history | awk '{h[$2]++; print h[$2]}'

72
1
73
33
420
421
422

For loop counter contents:

history | awk '{h[$2]++}END{for(i in h){print h[i],i }}'
1 rm
419 awk
1 sudo
2 cd
3 clear
39 vi...

This also shows I have used only 45 different commands in the whole 1000-entry list:

history | awk '{h[$2]++}END{for(i in h){print h[i],i | "sort -rn"}}' | wc -l
45

Summary Part 1 (I think??) - column $2 of the history output indexes an array that counts occurrences of each command; a FOR loop then prints each count with its command label, which is numerically sorted by occurrence and headed to 20.
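Spelled out as a commented script (my reading of it, so treat the comments as interpretation rather than gospel):

```shell
# Part 1, expanded. h is an associative array keyed by the command name
# ($2 of each history line); its value is the running occurrence count.
history | awk '
  { h[$2]++ }                      # count each command name as it appears
  END {
    for (cmd in h)                 # cmd iterates over the KEYS (the names)
      print h[cmd], cmd | "sort -rn | head -20"   # count first, then name
  }'
```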

Part 2

awk '!max{max=$1;}{r="";i=s=60*$1/max;while(i-->0)r=r"#";printf "%15s %5d %s %s",$2,$1,r,"\n";}'

The next awk command can't be shown directly as it needs the prior awk output, but knowing that the first part feeds it only 2 columns, a different 2-column input can be fed in to learn its behaviour. Using cat -n on a name file from the prior awk notepad page examples gives a numbered text list, much as the history command would:

cat -n acronyms.txt | awk '{print $1,$2}' | awk '!max{max=$1;}{r="";i=s=60*$1/max;while(i-->0)r=r"#";printf "%15s %5d %s %s",$2,$1,r,"\n";}'
BASIC 1 ############################################################
CICS 2 ########################################################################################################################
COBOL 3 ####################################################################################################################################################################################

Now it is obvious that the multiplier is 60*$1 - and since the first line's value (1) becomes max here, each value n gets 60*n hashes. If the 60 is reduced to 1, it shows:

cat -n acronyms.txt | awk '{print $1,$2}' | awk '!max{max=$1;}{r="";i=s=1*$1/max;while(i-->0)r=r"#";printf "%15s %5d %s %s",$2,$1,r,"\n";}'
BASIC 1 #
CICS 2 ##
COBOL 3 ###
DBMS 4 ####
GIGO 5 #####
GIRL 6 ######
AWK 7 #######
PERL 8 ########
FORTRAN 9 ######### 

Summary Part 2 

An experienced programmer would have seen immediately that Part 2 is the generically useful part for histograms, as it contains the multiplier to suit various number ranges: it reads the first value as the reference maximum (via the NOT MAX test) and divides each value by it to scale the bars to the range (I wasn't sure about that at first, or how it did it).

Point is, Part 2 can be used in isolation for many basic number lists - experimenting now...
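Here is my commented reading of Part 2 (interpretation, not the original author's notes): `!max` is true only while max is still unset, i.e. on the first line, so the first value seen becomes the reference maximum; every bar is then 60*$1/max characters, which is why the biggest value (when it comes first) gets exactly 60 hashes. The file counts.txt is a hypothetical "count name" two-column file standing in for Part 1's output.

```shell
# Part 2, expanded, reading a hypothetical "count name" file.
awk '!max { max = $1 }             # first line only: its value becomes the reference max
     { r = ""
       i = 60 * $1 / max           # bar length: 60 hashes for the max value
       while (i-- > 0) r = r "#"   # build the bar one hash at a time
       printf "%15s %5d %s\n", $2, $1, r }' counts.txt
```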

It works correctly on the random.txt file, scaling automatically AND padding single digits properly in a terminal, with the 60 multiplier! I didn't know how it worked yet... (why 60?) With the max of 90 read first, everything below is scaled by 60/90, i.e. 2/3... very clever...

cat random.txt | awk '{print $1,$2}' | awk '!max{max=$1;}{r="";i=s=60*$1/max;while(i-->0)r=r"#";printf "%15s %5d %s %s",$2,$1,r,"\n";}'
90 ############################################################
87 ##########################################################
74 ##################################################
40 ###########################
90 ############################################################
70 ###############################################
46 ###############################
10 #######
35 ########################
  8 ######
  1 #
  0

Generating some random numbers, I hit the single-digit large-multiplier issue:

for x in {1..5}; do echo $((1 + RANDOM % 20)); done >> Random.txt

6
19
8
2
11

cat Random.txt | awk '{print $1,$2}' | awk '!max{max=$1;}{r="";i=s=60*$1/max;while(i-->0)r=r"#";printf "%15s %5d %s %s",$2,$1,r,"\n";}'
2 ############################################################
8 ################################################################################################################################################################################################################################################
7 ##################################################################################################################################################################################################################
6 ####################################################################################################################################################################################
0
5 ######################################################################################################################################################
0
4 ########################################################################################################################
0
6 ####################################################################################################################################################################################

BUT, as soon as a double digit leads the list - 20 here, giving 3 hashes per unit (60/20) - it auto-formats all lines in proportion, which for most purposes should be fine if the data list has at least a 10 or so read first!

rm Random.txt

for x in {1..5}; do echo $((1 + RANDOM % 20)); done >> Random.txt

cat Random.txt | awk '{print $1,$2}' | awk '!max{max=$1;}{r="";i=s=60*$1/max;while(i-->0)r=r"#";printf "%15s %5d %s %s",$2,$1,r,"\n";}'
20 ############################################################
5 ###############
12 ####################################
11 #################################
20 ############################################################

Also, I just found out that the 60 is the max number of hashes shown for the biggest number in any range - vi told me...

60hashes.png

An interesting thing to KNOW re awk, found thanks to that hash line above: a field-separator change written in the main rule only takes effect after the first line has been read. Say you want the number of chars in a line, by setting FS (Field Separator) to null, as there is no white space between the chars:

cat hashes.txt
############################################################
############################################################
The number of hashes won't be read correctly for the first line, only the second! That could be a problem in a program, eh? A headscratcher to debug!

awk 'FS = ""; {print NF}' hashes.txt
0
60

You get round this by using a BEGIN block before the main awk loop body, so FS is set before the first line is read:

awk 'BEGIN {FS=""} {print NF }' hashes.txt
60
60
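In gawk and mawk at least, the same thing can be done straight from the command line with -F and an empty separator - hedged, since an empty FS is not guaranteed by POSIX awk:

```shell
# -F '' sets FS before any line is read, so even line 1 is split per character
# (gawk/mawk behaviour; an empty FS is undefined in strict POSIX awk).
awk -F '' '{ print NF }' hashes.txt
```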

I suppose the main thing that drove all this was wanting to see performance stats on the command line, so as a simple example from logged/recorded data, amend the command to suit. This is some data from the live Gnuplot Posts, where $3 is the CPU usr load:

cat dstatlive.txt
26-08 18:20:12| 13 3 82 2 0 00
26-08 18:20:13| 14 4 81 0 0 10
26-08 18:20:14| 12 4 81 3 0 00

Running this through the command gives the wrong column order, as the 26 picked up is from the file's $1 date:

cat dstatlive.txt | awk '{print $1,$2}' | awk '!max{max=$1;}{r="";i=s=60*$1/max;while(i-->0)r=r"#";printf "%15s %5d %s %s",$2,$1,r,"\n";}'
18:20:12| 26 ############################################################
18:20:13| 26 ############################################################
18:20:14| 26 ############################################################

Change $1 to $3:

cat dstatlive.txt | awk '{print $3,$2}' | awk '!max{max=$1;}{r="";i=s=60*$1/max;while(i-->0)r=r"#";printf "%15s %5d %s %s",$2,$1,r,"\n";}'
18:20:12| 13 ############################################################
18:20:13| 14 #################################################################
18:20:14| 12 ########################################################

Here's an answer to help - replacing cat and one awk, with altered spacers between the fields, now that we know what they do...

awk '{print $1,$2,$3}' dstatlive.txt | awk '!max{max=$3;}{r="";i=s=60*$3/max;while(i-->0)r=r"#";printf " %1s %1s %2d %s %s",$1,$2,$3,r,"\n";}'
26-08 18:20:12| 13 ############################################################
26-08 18:20:13| 14 #################################################################
26-08 18:20:14| 12 ########################################################

Try to work out how to include the full date/time columns for the data yourself, then apply it to any logs you have of the same or similar format, like sadf generates:

AMDA8 600 2016-08-31 20:15:01 UTC all %system 3.63
AMDA8 600 2016-08-31 20:15:01 UTC all %iowait 0.41
AMDA8 600 2016-08-31 20:15:01 UTC all %steal 0.00
AMDA8 600 2016-08-31 20:15:01 UTC all %idle 86.99

You have a useful tool there if you can adapt it. Here's an immediate practical example running dstat for 15 secs:

 dstat -tc | awk '{if(NR>3) print $1,$2,$3,$4 fflush()}' | tee dstat15secs.txt

awk '{print $1,$2,$3}' dstat15secs.txt | awk '!max{max=$3;}{r="";i=s=60*$3/max;while(i-->0)r=r"#";printf " %1s %1s %2d %s %s",$1,$2,$3,r,"\n";}'
31-08 23:04:34| 14 ############################################################
31-08 23:04:35| 11 ################################################
31-08 23:04:36| 14 ############################################################
31-08 23:04:37| 14 ############################################################
31-08 23:04:38| 13 ########################################################
31-08 23:04:39| 12 ####################################################
31-08 23:04:40| 14 ############################################################
31-08 23:04:41| 13 ########################################################
31-08 23:04:42| 13 ########################################################
31-08 23:04:43| 13 ########################################################
31-08 23:04:44| 15 #################################################################
31-08 23:04:45| 11 ################################################
31-08 23:04:46| 15 #################################################################
31-08 23:04:47| 12 ####################################################
31-08 23:04:48| 17 #########################################################################

AND FINALLY - for the last trick - Do It Live!! Bypass the files and pipe dstat directly into the function for a live, per-second, updating histo!! Remember from the Live Data Posts to add the extra $4, so the weird extra 0 that awk adds does not affect the data you want in $3.

dstat -tc | awk '{if(NR>3) print $1,$2,$3,$4 fflush()}' | awk '!max{max=$3;}{r="";i=s=60*$3/max;while(i-->0)r=r"#";printf " %1s %1s %2d %s %s",$1,$2,$3,r,"\n";}'
31-08 23:13:37| 15 ############################################################
31-08 23:13:38| 12 ################################################
31-08 23:13:39| 17 ####################################################################
31-08 23:13:40| 13 ####################################################
31-08 23:13:41| 13 ####################################################
31-08 23:13:42| 15 ############################################################
31-08 23:13:43| 13 ####################################################
31-08 23:13:44| 13 ####################################################
31-08 23:13:45| 14 ########################################################
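To save retyping the whole Part 2 each time, it can be wrapped in a small shell function (my own hypothetical helper - the name, arguments and defaults are all mine, not from the original) that takes the value column and bar width as arguments, so any "label value" stream can be piped in:

```shell
# Hypothetical wrapper: histo <value-column> [width, default 60].
# Reads whitespace-separated lines on stdin and bars the chosen column
# (assumes positive values, first value seen becomes the reference max).
histo () {
  awk -v col="$1" -v mult="${2:-60}" '
    !max { max = $col }                      # first value seen = reference max
    { r = ""; i = mult * $col / max
      while (i-- > 0) r = r "#"
      printf "%s %s\n", $0, r }'
}

# e.g.  dstat -tc | awk '{if(NR>3) print $1,$2,$3,$4 fflush()}' | histo 3
```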

 

 

A static example over 2.5 hrs of sadf %user load 10min sample logs since PC start:

sadf | grep user | awk '{print $3,$4,$8 }' | awk '!max{max=$3;}{r="";i=s=60*$3/max;while(i-->0)r=r"#";printf " %1s %1s %2d %s %s",$1,$2,$3,r,"\n";}'
2016-09-02 09:05:02 11 ############################################################
2016-09-02 09:15:01 7 #########################################
2016-09-02 09:25:01 8 ################################################
2016-09-02 09:35:01 6 ####################################
2016-09-02 09:45:01 10 ######################################################
2016-09-02 09:55:01 13 ##########################################################################
2016-09-02 10:05:01 13 #######################################################################
2016-09-02 10:15:01 16 #####################################################################################
2016-09-02 10:25:01 14 ################################################################################
2016-09-02 10:35:01 14 ############################################################################
2016-09-02 10:45:01 11 #############################################################
2016-09-02 10:55:01 11 ###############################################################
2016-09-02 11:05:01 15 ################################################################################
2016-09-02 11:15:01 17 ############################################################################################
2016-09-02 11:25:01 18 ###################################################################################################
2016-09-02 11:35:01 14 ##############################################################################

OK, I have confirmed my suspicion that the number of hashes is wrong after the first reference maximum - which is why the original function sorts the numbers first, so the maximum is read first and becomes the reference. So you would have to put a numeric sort on the value column (sort -k) in there before the awk function. Unfortunately, that puts the dates out of order AND rules out correct relative values for accurate live data... shame...
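A compromise sketch for the live case (my tweak, not the original command): let max grow whenever a bigger value arrives. Bars printed before the true peak shows up come out over-long, but everything after the running maximum settles is in proportion, and the time order is preserved:

```shell
# Running maximum: $1>max updates the reference whenever a bigger value
# arrives, so later bars stay proportional without pre-sorting.
# Assumes positive values on stdin, one per line.
awk '$1 > max { max = $1 }
     { r = ""; i = 60 * $1 / max
       while (i-- > 0) r = r "#"
       printf "%5d %s\n", $1, r }'
```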

This one, for free memory changes over 40 secs, may be useful, as it fills a full-width term - enough to see wild changes anyway..? This is "live" but buffered, and shown once complete.

free -c 40 -s1 | grep Mem | awk '{print $4 }' | awk '!max{max=$1;}{r="";i=s=140*$1/max;while(i-->0)r=r"#";printf " %1s %1s %s",$1,r,"\n";}'

freememhisto.png

This next one below is a comparison of free mem from vmstat against free above; the readings differ by a factor of ten in bytes, but show near equivalence in proportion, so accurate enough. It's a lot easier to see changes in a diagram than by reading numbers, eh?!

sudo vmstat 1 | awk '{if(NR>4) print $4 fflush()}' | awk '!max{max=$1;}{r="";i=s=140*$1/max;while(i-->0)r=r"#";printf " %1s %1s %s",$1,r,"\n";}'

vmstatfree.png

If you want 4 systems monitored in 4 small terms, drop the 140 to 70:

sudo vmstat 1 | awk '{if(NR>4) print $4 fflush()}' | awk '!max{max=$1;}{r="";i=s=70*$1/max;while(i-->0)r=r"#";printf " %1s %1s %s",$1,r,"\n";}'

4terms.png
