Friday, July 10, 2009

Searching for Files in UNIX and Linux

Introduction
This article will demonstrate how to use the 'find' and 'locate' commands in UNIX and Linux to search for files.

Searching for files with 'find' and 'locate'
To search for files in UNIX and Linux you can use the 'find' command. The 'find' command does a real-time search of the filesystem, looking for files that match the criteria you specify. On Linux and some other systems you can also use the 'locate' command to search for files by filename. Usually a scheduled job runs the 'updatedb' command, which records the name of every file on the filesystem in a 'database' or 'index' file. When you search with 'locate', it looks through this index of file names instead of scanning the entire filesystem. This makes 'locate' very fast, while 'find' is slower. However, the index that 'locate' uses is usually rebuilt just once a day, so it can become outdated quickly: files created since the last update will not show up, and deleted files may still be listed. The main drawbacks of 'find' are that it is slower, since it searches in real time, and that it can be very disk intensive.

Doing a Basic Search

[troygeek@localhost troygeek]$ find . -name "*.log" -print 2>/dev/null
./logs/error.log
./logs/access.log
In the above example, I call the 'find' command and pass it a few parameters. First, the . (dot) means search my current directory and everything below it. The -name option means I want to search by filename pattern. The "*.log" pattern matches all filenames that end in .log; the quotes keep the shell from expanding the * before 'find' sees it. This piece is what you are searching for. You can change it to look for myfile.txt, "*someword*", or any other pattern you can think of. The -print option prints the path of each file that is found. The 2>/dev/null takes any error messages (such as "Permission denied") and throws them away into /dev/null (think of /dev/null as a trash bin).
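
To make the pattern variations above concrete, here is a small sketch you can run safely. It builds a throwaway directory with mktemp, so every file and directory name in it is made up for illustration and not taken from a real system.

```shell
# Build a scratch tree so the searches have something to find.
# All names below are hypothetical examples.
demo=$(mktemp -d)
mkdir -p "$demo/logs" "$demo/docs"
touch "$demo/logs/error.log" "$demo/logs/access.log" \
      "$demo/docs/myfile.txt" "$demo/docs/notes_someword_old.txt"

# Exact file name.
exact=$(cd "$demo" && find . -name "myfile.txt" -print)

# Any file name that contains "someword".
partial=$(cd "$demo" && find . -name "*someword*" -print)

# Only regular files ending in .log; sort makes the order predictable.
logs=$(cd "$demo" && find . -type f -name "*.log" -print | sort)

printf '%s\n' "$exact" "$partial" "$logs"

# Clean up the scratch tree.
rm -rf "$demo"
```

Adding -type f is optional, but it keeps directories whose names happen to match the pattern out of the results.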

Searching for all the .log files on all of the filesystems
To search for all of the .log files on the entire system, change the . (dot) to a / (forward slash) to tell 'find' to look at everything under the root directory. The result would look like this:
[troygeek@localhost troygeek]$ find / -name "*.log" -print 2>/dev/null
/var/log/mysqld.log
/var/log/boot.log
/var/log/prelink.log
/var/log/yum.log
/etc/logrotate.d/vsftpd.log
/usr/share/doc/tux-3.2.12/sample.log
...
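
A search from / can take a while, so it often helps to skip directories you don't care about. The sketch below shows one way to do that with the -path and -prune options. It runs against a scratch tree rather than the real root directory, and the directory names (var/log, backup) are only stand-ins.

```shell
# Scratch tree standing in for a real system; the names are illustrative.
root=$(mktemp -d)
mkdir -p "$root/var/log" "$root/backup/var/log"
touch "$root/var/log/boot.log" "$root/backup/var/log/boot.log"

# Do not descend into $root/backup; print .log files found anywhere else.
found=$(find "$root" -path "$root/backup" -prune -o -name "*.log" -print)

printf '%s\n' "$found"

# Clean up the scratch tree.
rm -rf "$root"
```

On a real system the same idea looks like find / -path /proc -prune -o -name "*.log" -print 2>/dev/null. The -xdev option is another way to narrow things down: it keeps 'find' on the filesystem it started on.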

Searching through files that contain a specific string
Here's how you can search a directory and everything below it for files that contain a certain string. In this example, I'll look in my current directory and below for all the files that contain the word 'bookmarks'. The output shows the name of each matching file followed by the line that contains the string. The -type f option limits the search to regular files, and -print0 together with xargs -0 keeps filenames that contain spaces intact (both are supported by the GNU and BSD tools).
[troygeek@localhost troygeek]$ find . -type f -print0 | xargs -0 grep -i "bookmarks"
./www/index.html:Please update your bookmarks.
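
If you only want the names of the matching files and not the matching lines, grep's -l option does that. Here is a rough sketch that uses find's own -exec ... {} + form instead of a pipe; the two sample files are invented for the demonstration.

```shell
# Scratch directory with two hypothetical files, one of which has a space in its name.
demo=$(mktemp -d)
mkdir -p "$demo/www"
printf 'Please update your bookmarks.\n' > "$demo/www/index.html"
printf 'Nothing to see here.\n' > "$demo/www/about page.html"

# -type f skips directories.
# -exec ... {} + hands batches of file names straight to grep, spaces and all.
# grep -i ignores case; -l prints only the names of files that match.
matches=$(cd "$demo" && find . -type f -exec grep -il "BOOKMARKS" {} +)

printf '%s\n' "$matches"

# Clean up the scratch directory.
rm -rf "$demo"
```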

Searching for large files

To search for large files using the 'find' command, use the -size option and tell it the size you're looking for. Be careful with the units: a plain number is counted in 512-byte blocks, so add a c suffix when you mean bytes. For example, here's how to search for files that are larger than 10,000 bytes (about 10 KB):
[troygeek@localhost troygeek]$ find . -size +10000c -print
./troygeek_main_final.sql
./troygeek_main_final_noweblog.sql
./logs/access.log
./troygeekLocal2.sql
./mysql-connector-java-3.1.8a.tar.gz
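
The difference between bytes and blocks is easy to check for yourself. This sketch creates two files of known size in a scratch directory (the file names are arbitrary) and runs the same search with and without the c suffix.

```shell
# Two files of known size in a scratch directory.
demo=$(mktemp -d)
dd if=/dev/zero of="$demo/big.bin" bs=1000 count=20 2>/dev/null    # 20,000 bytes
dd if=/dev/zero of="$demo/small.bin" bs=100 count=1 2>/dev/null    # 100 bytes

# With the c suffix the number is in bytes: files larger than 10,000 bytes.
over_bytes=$(cd "$demo" && find . -type f -size +10000c -print)

# Without a suffix the number is in 512-byte blocks, so +10000 means
# "larger than roughly 5 MB" and neither file qualifies.
over_blocks=$(cd "$demo" && find . -type f -size +10000 -print)

printf 'bytes:  %s\nblocks: %s\n' "$over_bytes" "$over_blocks"

# Clean up the scratch directory.
rm -rf "$demo"
```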

Searching for files that have changed in the last day

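To search for all the files that have changed in the last day (or any number of days, really), use the -mtime option and pass it the number of days. A minus sign in front of the number means "less than", so -mtime -1 matches files modified less than 24 hours ago: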
[troygeek@localhost troygeek]$ find . -mtime -1 -print
./.bash_history
./logs/error.log
./logs/access.log
./webapps/TroyGeek
./webapps/TroyGeek/Theme/ts_header.jsp
./webapps/TroyGeek/Pictures
./webapps/TroyGeek/Pictures/images
./webapps/TroyGeek/Pictures/images/1002839-R2-017-7_jpg.jpg
./webapps/TroyGeek/Pictures/images/1002839-R2-033-15_jpg.jpg
./webapps/TroyGeek/Pictures/images/100_0417_JPG.jpg
./webapps/TroyGeek/Pictures/images/100_0434_JPG.jpg
./webapps/TroyGeek/Pictures/images/100_0444_JPG.jpg
./webapps/TroyGeek/Pictures/images/100_0509_JPG.jpg
./webapps/TroyGeek/Pictures/images/100_0510_JPG.jpg
./webapps/TroyGeek/Pictures/images/100_0511_JPG.jpg
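
Notice that the results above include directories as well as files. The sketch below adds -type f to list regular files only, and also shows the opposite search with a plus sign. It uses touch -t to give a made-up file an old timestamp, so nothing here depends on your real files.

```shell
# Scratch directory with one fresh file and one file with an old timestamp.
demo=$(mktemp -d)
touch "$demo/new.txt"
touch -t 200001010000 "$demo/old.txt"   # pretend it was last modified on 2000-01-01

# Regular files modified less than one day ago.
recent=$(cd "$demo" && find . -type f -mtime -1 -print)

# Regular files last modified more than seven days ago.
stale=$(cd "$demo" && find . -type f -mtime +7 -print)

printf 'recent: %s\nstale:  %s\n' "$recent" "$stale"

# Clean up the scratch directory.
rm -rf "$demo"
```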

Using 'locate' to search for files

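As mentioned above, the 'locate' command is usually installed on Linux systems but may also be available on other flavors of UNIX. Here's how to do a simple search for all the .log files. Put the pattern in quotes so the shell passes it to 'locate' unchanged instead of expanding it against the files in your current directory.
[troygeek@localhost troygeek]$ locate "*.log"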
/var/log/prelink.log
/var/log/yum.log
/etc/logrotate.d/vsftpd.log
/usr/share/doc/tux-3.2.12/sample.log
/usr/lib/rpm/rpm.log
/usr/local/mysql/src/mysql-4.1.12/innobase/config.log
/usr/local/mysql/src/mysql-4.1.12/config.log
What you can't see from the text above is that those search results were returned instantly, whereas the 'find' command has to examine every file one by one, which can take a long time.

Doing a case-insensitive search with locate
You may want to ignore case when you're doing a search. To do that, just use the -i option.
[troygeek@localhost troygeek]$ locate -i PASSWD
/etc/passwd
[troygeek@localhost troygeek]$ locate PASSWD
[troygeek@localhost troygeek]$
And that's the basics of searching for files in UNIX and Linux.

If you have any trouble, please post your comments on this web page.
