193. Valid Phone Numbers
Description
Given a text file file.txt that contains list of phone numbers (one per line), write a one liner bash script to print all valid phone numbers.
You may assume that a valid phone number must appear in one of the following two formats: (xxx) xxx-xxxx or xxx-xxx-xxxx. (x means a digit)
You may also assume each line in the text file must not contain leading or trailing white spaces.
Example:
Assume that file.txt has the following content:
987-123-4567 123 456 7890 (123) 456-7890
Your script should output the following valid phone numbers:
987-123-4567 (123) 456-7890
My Solution
Source Code
1
2
# Read from the file file.txt and output all valid phone numbers to stdout.
grep -E "^(\([0-9]{3}\) |[0-9]{3}-)[0-9]{3}-[0-9]{4}$" file.txt
Analysis
So this is probably going to sound terrible but I didn't actually know there
were different implementations of POSIX regular expressions. I learned today
there are Basic Regular Expressions and Extended Regular Expressions. ERE are
what I've been using my entire programming-related life. Some of the
differences between BRE and ERE include {} and
() needing to be escaped in BRE and ERE having +,
?, and |. When you use grep, by default
it takes a BRE. I've used grep before but only with absolutely
basic regular expressions, like just matching a specific word. To use
grep with a ERE, you would use grep -E.
Anyway, the regular expression I came up is not that complicated. Basically I
started with the simplest test case, a number formatted
xxx-xxx-xxxx and worked my way up.
^[0-9]{3}-[0-9]{3}-[0-9]{4}$ matches xxx-xxx-xxxx
/(?[0-9]{3}/)?-[0-9]{3}-[0-9]{4} will also match
(xxx)-xxx-xxxx which isn't actually valid but it's a start.
Area codes that are wrapped in () must be followed by a space.
Likewise area codes not wrapped in () must be followed by hyphen.
To match this pattern, I used:
(\([0-9]{3}\) |[0-9]{3}-)
Putting it all together:
^(\([0-9]{3}\) |[0-9]{3}-)[0-9]{3}-[0-9]{4}$
And that's about it.