How can I add a filter to my grep script to NOT include a string?

Question

I'm working on a script that will separate the Registrar information from a domain's whois output. So far it works well enough, but there are a few things I want to remove to make the output cleaner. It works on the majority of domains. Here's my code:

#!/bin/bash
reg=$(whois "stackoverflow.com" | egrep -i 'Registrar|Sponsoring Registrar|Registrant|!internic')
printf "Below is my best attempt at finding the Registrar info:\n"
printf "$reg\n"

And here's what it outputs:

Below is my best attempt at finding the Registrar info:
with many different competing registrars. Go to http://www.internic.net
   Registrar: NAME.COM, INC.
   Sponsoring Registrar IANA ID: 625
registrar's sponsorship of the domain name registration in the registry is
date of the domain name registrant's agreement with the sponsoring
registrar.  Users may consult the sponsoring registrar's Whois database to
view the registrar's reported date of expiration for this registration.
Registrars.

I added some pseudo-code to my grep to try to exclude the string "internic", in order to snip off that first line. I'd also like to find a way to remove the secondary "registrar's sponsorship..." lines.

Is it possible to detect a string and not include that line? Thanks

True, but if I don't have it it misses a lot that I do want. Good idea though I'll look into it. — Egrodo, May 20 at 15:01

cas · Accepted Answer · 2016-05-21 00:23:15Z

Another option is to be more specific about what you are grepping for. For example:

whois stackoverflow.com | grep -E '^[[:space:]]*(Registr(ar|ant|y)|Sponsoring).*: '

This extracts only lines that begin with optional white space before either 'Registrar', 'Registrant', 'Registry', or 'Sponsoring', followed by any number (zero or more) of any character, followed by a colon and a space.

(BTW, this uses grep -E rather than the obsolete and deprecated egrep. They do the same thing.)

Output:

   Registrar: NAME.COM, INC.
   Sponsoring Registrar IANA ID: 625
Registry Domain ID: 108907621_DOMAIN_COM-VRSN 
Registrar WHOIS Server: whois.name.com 
Registrar URL: http://www.name.com 
Registrar Registration Expiration Date: 2016-12-26T19:18:07Z 
Registrar: Name.com, Inc. 
Registrar IANA ID: 625 
Registry Registrant ID:  
Registrant Name: Sysadmin Team 
Registrant Organization: Stack Exchange, Inc. 
Registrant Street: 110 William St , Floor 28 
Registrant City: New York 
Registrant State/Province: NY 
Registrant Postal Code: 10038 
Registrant Country: US 
Registrant Phone: +1.2122328280 
Registrant Email: [email protected] 
Registry Admin ID:  
Registry Tech ID:  
Registrar Abuse Contact Email: [email protected] 
Registrar Abuse Contact Phone: +1.1 7203101849
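To see why the anchored pattern drops the prose lines from the question, it can be tried against a few canned lines instead of live whois output. This is only a minimal sketch: the function names filter_fields and sample_lines are made up for illustration, and the sample text is copied from the outputs shown above.

```shell
#!/bin/bash
# filter_fields is a hypothetical helper name; it applies the pattern
# from this answer to whatever arrives on stdin.
filter_fields() {
    grep -E '^[[:space:]]*(Registr(ar|ant|y)|Sponsoring).*: '
}

# sample_lines (also hypothetical) prints a mix of field lines and prose lines.
sample_lines() {
    cat <<'EOF'
with many different competing registrars. Go to http://www.internic.net
   Registrar: NAME.COM, INC.
   Sponsoring Registrar IANA ID: 625
registrar's sponsorship of the domain name registration in the registry is
Registrant City: New York
Registrars.
EOF
}

# Only the three "Field: value" lines should survive the filter.
sample_lines | filter_fields
```

The prose lines fail to match either because they do not start with one of the capitalized keywords (the pattern is case-sensitive) or because they lack a colon followed by a space.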

BTW, while testing any form of text processing (incl. regular expressions) on text from slow sources (like a database query, or a remote source like whois or an HTTP server), it's useful to run the slow command once and redirect its output to a file, then test against the file. When you have what you want, make sure it works the same with directly-piped (fresh) data.

e.g.

whois stackoverflow.com > so.txt
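The run-once-then-test workflow can also be wrapped in a small helper, so the slow command only runs when the cache file is missing or empty. A sketch only: run_cached is a hypothetical function name, not part of whois or any other tool.

```shell
#!/bin/bash
# run_cached FILE COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND only if FILE is missing or empty,
# saves its output to FILE, then prints the contents of FILE.
run_cached() {
    local cache_file=$1
    shift
    if [ ! -s "$cache_file" ]; then
        # Remove a partial cache file if the command fails.
        "$@" > "$cache_file" || { rm -f "$cache_file"; return 1; }
    fi
    cat "$cache_file"
}
```

Usage would look like: run_cached so.txt whois stackoverflow.com | grep -E '^[[:space:]]*Registrar.*: '. Deleting so.txt forces a fresh lookup on the next run.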

Other useful things to do with whois output:

Extract the Domain block at the beginning of the whois output (field lines are indented, and the field name ends with a colon):

grep -Ei '^[[:blank:]]+.*:[[:blank:]]' so.txt

Output:

   Domain Name: STACKOVERFLOW.COM
   Registrar: NAME.COM, INC.
   Sponsoring Registrar IANA ID: 625
   Whois Server: whois.name.com
   Referral URL: http://www.name.com
   Name Server: CF-DNS01.STACKOVERFLOW.COM
   Name Server: CF-DNS02.STACKOVERFLOW.COM
   Status: clientTransferProhibited https://icann.org/epp#clientTransferProhibited
   Updated Date: 26-nov-2015
   Creation Date: 26-dec-2003
   Expiration Date: 26-dec-2016

Extract the Registrant block, beginning with the 'Domain Name' field and ending with the 'Registrar Abuse Contact Phone' field:

sed -n -e '/^Domain Name:/,/^Registrar Abuse Contact Phone:/p' so.txt

Both of the above together (-E enables the extended regular expressions that + requires):

sed -n -E -e '/^Domain Name:/,/^Registrar Abuse Contact Phone:/p' -e '/^[[:blank:]]+.*:[[:blank:]]/p' so.txt

Output from all of the above can easily be further processed with awk or any other text-processing tool that can be made to use a colon (:) character as field-separator.
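As one example of that further processing, awk can split each line at the first colon and print the value of a single field. A sketch, assuming a hypothetical helper name whois_field that reads whois-style "Label: value" lines on stdin:

```shell
#!/bin/bash
# whois_field LABEL
# Hypothetical helper: prints the value of every line whose label
# (the text before the first colon, minus indentation) equals LABEL.
whois_field() {
    awk -F':' -v want="$1" '
        {
            label = $1
            sub(/^[ \t]+/, "", label)                   # drop leading indentation
            if (label == want) {
                value = substr($0, index($0, ":") + 1)  # everything after the first colon
                sub(/^[ \t]+/, "", value)               # trim leading blanks
                sub(/[ \t]+$/, "", value)               # trim trailing blanks
                print value
            }
        }'
}
```

For instance, whois_field 'Registrant Name' < so.txt would print the registrant's name from the cached file. Taking everything after the first colon, rather than just the second field, keeps values that contain colons themselves (such as URLs) intact.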

Wow, this is extremely informative and helpful to what I'm trying to do. I am going to look into this further when I get a chance to work on my script. I really appreciate your explanations for everything, just reading the post I've learned a ton. — Egrodo, May 22 at 4:02

stdunbar · Answer 2 · 2016-05-20 14:48:05Z

Use the -v flag:

reg=`whois stackoverflow.com | egrep -i 'Registrar|Sponsoring Registrar|Registrant' | grep -v internic`
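
The same second stage can drop more than one unwanted string by giving grep -v several -e patterns. A sketch, assuming the prose lines to exclude are the ones containing "internic", "registrar's", or "registrant's", as in the question's output; the function name registrar_lines is only for illustration:

```shell
#!/bin/bash
# registrar_lines is a hypothetical helper name; it filters stdin.
registrar_lines() {
    # The first grep keeps candidate lines; the second grep (-v) removes
    # every line that matches any of the -e patterns, ignoring case.
    grep -Ei 'Registrar|Registrant' |
        grep -iv -e 'internic' -e "registrar's" -e "registrant's"
}
```

It could be used as reg=$(whois stackoverflow.com | registrar_lines). This is a blocklist, so prose that matches none of the excluded strings (for example a bare "Registrars." line) would still get through.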

This is what I was looking for, thanks! – Egrodo May 20 at 15:02
