Nothing To Lose

If you don’t have it, how can you lose it!

Catching email IDs from file(s) using grep and other utils

April 18, 2009 By: Dexter Category: BASH, Linux Commands, Regular Expressions, Shell Scripting, Tutorial

Here is a simple way to collect all the email IDs from one or more files into a single file. We will use the following commands: cat, grep, sort and uniq.

This one-liner should do the work:

cat file | grep -io '\<[^-.][0-9A-Za-z._-]\+@[0-9A-Za-z.]\+\>' | sort | uniq

To save all the IDs to a file, redirect the output of the command above:

cat file | grep -io '\<[^-.][0-9A-Za-z._-]\+@[0-9A-Za-z.]\+\>' | sort | uniq > mailid.txt
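
To see what the pipeline produces, here is a small sketch that runs it on a throwaway sample file. The file contents and the names `sample` and `extract_emails` are made up for illustration, and the sketch assumes GNU grep (for `\<`, `\>` and `\+`). The bracket expression is written as `[0-9A-Za-z._-]` so that the hyphen is matched literally.

```shell
# Wrap the one-liner in a function that takes the file name as $1.
# -i ignores case, -o prints only the matching part of each line.
extract_emails() {
    cat "$1" | grep -io '\<[^-.][0-9A-Za-z._-]\+@[0-9A-Za-z.]\+\>' | sort | uniq
}

# Create a throwaway sample file (contents are invented for this example).
sample=$(mktemp)
cat > "$sample" <<'EOF'
Contact alice@example.com or bob.smith@mail.example.org for details.
Duplicate entry: alice@example.com
No address on this line.
EOF

# Prints each address once:
#   alice@example.com
#   bob.smith@mail.example.org
extract_emails "$sample"

rm -f "$sample"
```

The duplicate address on the second line is collapsed by `sort | uniq`, and the line without an address contributes nothing.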

Now let's convert this into a shell script that accepts a directory name from the user. This is the directory containing the files that hold the email IDs.
You can download the script here: Script to retrieve email IDs from files in a directory
I have noticed that copying and pasting the code below may not work because of formatting characters; if so, use the downloadable script instead.

#!/bin/bash
clear
echo -n "Enter the name of a DIRECTORY from where you want to pick up email IDs: "
read -r dirname

# check if the entered name is a directory
if [ -d "$dirname" ]; then
    cd "$dirname" || exit 1  # if it exists, change to the directory
else
    echo "+============================+"
    echo "| Check your directory name! |"
    echo "+============================+"
    exit 1
fi

# loop through all files in the given directory
for files in *
do
    if [ ! -d "$files" ]; then
        # process each file and append the matches to a temporary file in the user's home dir
        echo "Processing file $files"
        cat "$files" | grep -Eio '\<[^-.][0-9A-Za-z._-]+@[0-9A-Za-z.]+\>' >> ~/$$
        echo "Processed"
    fi
done

cd - > /dev/null  # get back to the previous working dir

# sort the email IDs in the temporary file, remove duplicates and store them in a final file in the home dir
sort ~/$$ | uniq >> ~/emailids.$$

# remove the temporary file
rm ~/$$

# tell the user where the email IDs are stored
echo
echo "+=================================================+"
echo " Your email IDs are available in ~/emailids.$$ "
echo "+=================================================+"
exit 0
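
For use from other scripts, the same loop can be sketched as a non-interactive function. This is only a variant under stated assumptions, not the downloadable script: the function name `collect_emails` is hypothetical, the directory is passed as an argument instead of being read from the user, the results go to standard output instead of a file in the home directory, and GNU grep is assumed.

```shell
# collect_emails DIR: print the unique email-like strings found in the
# regular files directly inside DIR (subdirectories are skipped).
collect_emails() {
    if [ ! -d "$1" ]; then
        echo "Not a directory: $1" >&2
        return 1
    fi
    for f in "$1"/*; do
        # skip anything that is not a regular file
        [ -f "$f" ] || continue
        grep -Eio '\<[^-.][0-9A-Za-z._-]+@[0-9A-Za-z.]+\>' "$f"
    done | sort -u   # sort -u does the work of "sort | uniq"
}

# Example call (the directory name is made up):
#   collect_emails ./mail > emailids.txt
```

Piping the whole loop into `sort -u` avoids the temporary file, at the cost of not reporting progress per file.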

A word of warning: the regular expression catches anything that looks like an email ID, so the results may include strings that are not valid addresses.
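
To illustrate, the sketch below feeds the pattern a few invented strings, two of which are clearly not real addresses. It then applies a second, stricter filter that requires a dot followed by at least two letters at the end of the domain. That filter and the names `loose_match`, `strict_filter` and `candidates` are assumptions added here for illustration; the filter reduces noise but still does not validate addresses. GNU grep is assumed.

```shell
# The pattern from the one-liner, reading from standard input.
loose_match() {
    grep -io '\<[^-.][0-9A-Za-z._-]\+@[0-9A-Za-z.]\+\>'
}

# Hypothetical extra filter: keep only matches whose domain ends in
# a dot followed by two or more letters (for example ".com").
strict_filter() {
    grep -E '@[0-9A-Za-z.-]+\.[A-Za-z]{2,}$'
}

# Strings invented for this example, one per line.
candidates() {
    printf '%s\n' 'build@localhost' 'x11@3.2' 'user@example.com'
}

candidates | loose_match                  # prints all three strings
candidates | loose_match | strict_filter  # prints only user@example.com
```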