193. Valid Phone Numbers
Description
Given a text file file.txt that contains a list of phone numbers (one per line), write a one-liner bash script to print all valid phone numbers.
You may assume that a valid phone number must appear in one of the following two formats: (xxx) xxx-xxxx or xxx-xxx-xxxx. (x means a digit)
You may also assume each line in the text file must not contain leading or trailing white spaces.
Example:
Assume that file.txt has the following content:
987-123-4567
123 456 7890
(123) 456-7890
Your script should output the following valid phone numbers:
987-123-4567
(123) 456-7890
My Solution
Source Code
# Read from the file file.txt and output all valid phone numbers to stdout.
grep -E "^(\([0-9]{3}\) |[0-9]{3}-)[0-9]{3}-[0-9]{4}$" file.txt
Analysis
So this is probably going to sound terrible, but I didn't actually know there
were different flavors of POSIX regular expressions. I learned today that
there are Basic Regular Expressions (BRE) and Extended Regular Expressions
(ERE). ERE is what I've been using my entire programming-related life. Some of
the differences: in BRE, {} and () have to be backslash-escaped to act as
operators, while ERE treats them as operators unescaped and also has +, ?,
and |. When you use grep, by default it takes a BRE. I've used grep before,
but only with absolutely basic regular expressions, like matching a specific
word. To use grep with an ERE, you would use grep -E.
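To make the BRE/ERE difference concrete, here is a small sketch that matches the same xxx-xxx-xxxx shape both ways. The sample lines and the variable names (sample, bre_out, ere_out) are made up for the demo and are not part of the original solution.

```bash
# Sample lines, invented for this demo.
sample=$(printf '%s\n' '987-123-4567' '123 456 7890')

# BRE (grep's default): braces need backslashes to act as a repetition count.
bre_out=$(printf '%s\n' "$sample" | grep '^[0-9]\{3\}-[0-9]\{3\}-[0-9]\{4\}$')

# ERE (grep -E): braces are repetition operators without any escaping.
ere_out=$(printf '%s\n' "$sample" | grep -E '^[0-9]{3}-[0-9]{3}-[0-9]{4}$')

echo "BRE: $bre_out"
echo "ERE: $ere_out"
```

Both commands should select the same line; only the amount of escaping differs.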
Anyway, the regular expression I came up with is not that complicated.
Basically, I started with the simplest test case, a number formatted
xxx-xxx-xxxx, and worked my way up.
^[0-9]{3}-[0-9]{3}-[0-9]{4}$
matches xxx-xxx-xxxx
\(?[0-9]{3}\)?-[0-9]{3}-[0-9]{4}
will also match (xxx)-xxx-xxxx, which isn't actually valid, but it's a start.
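A quick way to see why the optional-parenthesis version is too loose is to feed it a few lines. This is only a sketch: the test lines and the names lines and loose_out are hypothetical, and the pattern is written with backslash-escaped parentheses as grep -E expects.

```bash
# Hypothetical test lines, not taken from the problem statement.
lines=$(printf '%s\n' '987-123-4567' '(123)-456-7890' '(123) 456-7890')

# Optional "(" and ")" around the area code, always followed by a hyphen.
loose_out=$(printf '%s\n' "$lines" | grep -E '\(?[0-9]{3}\)?-[0-9]{3}-[0-9]{4}')

echo "$loose_out"
```

With these lines, the invalid (123)-456-7890 is accepted while the valid (123) 456-7890 is rejected, which is what motivates tying the separator to the style of the area code.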
Area codes that are wrapped in () must be followed by a space.
Likewise, area codes not wrapped in () must be followed by a hyphen.
To match this pattern, I used:
(\([0-9]{3}\) |[0-9]{3}-)
Putting it all together:
^(\([0-9]{3}\) |[0-9]{3}-)[0-9]{3}-[0-9]{4}$
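As a sanity check, the final pattern can be run against the example input from the problem description. The sketch below recreates that input in a scratch directory (the workdir and valid names are my own), so it doesn't touch any real file.txt.

```bash
# Recreate the example input in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' '987-123-4567' '123 456 7890' '(123) 456-7890' > "$workdir/file.txt"

# Either "(xxx) " or "xxx-" as the prefix, then "xxx-xxxx", anchored at both ends.
valid=$(grep -E '^(\([0-9]{3}\) |[0-9]{3}-)[0-9]{3}-[0-9]{4}$' "$workdir/file.txt")

echo "$valid"
rm -r "$workdir"
```

This should print the two valid numbers from the example and skip 123 456 7890.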
And that's about it.