Note: Before starting, we would like to explain what we mean by “files” in the title of this post. By files, we mean media files (such as images, videos, and MP3s) and documents (such as PDFs and Word, Excel, and PowerPoint documents).
Another note: This post assumes that no file is linked to directly from outside sources. In other words, every link to a file in your filesystem is assumed to come from your website itself, and not from elsewhere.
A great way to keep your website clean is to periodically remove unused media and document files. Of course, this cleanup can be done manually, but that may take a very long time. A better way is to develop a script that does the task automatically. Here’s what such a script should do:
- It will first create an index of all the media/document files available on the website. For example, it will loop through the images directory (and all the sub-directories) and it will create a record of every single image (including its path). This information can be stored in a text file or a database table.
- Once the index of all the media/document files is created, the script must loop through all the template files and create an index of the media/document files referenced in them (let’s call this index the filesystem index). When that task is done, the same script must loop through all content tables (such as the #__content table) and create another index of the media/document files referenced in the database (let’s call this index the database index).
- Once all the indexes are done, the script must loop through the index of available files and check whether each file appears in the filesystem index or the database index. If a file is not found in either index, then the script will automatically delete that file from the filesystem (or move it to a folder outside the website directory).
- Once the above task is done, the script will create a report of all the deleted (or moved) files.
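The steps above can be sketched roughly as follows. This is a minimal illustration in Python, not a finished tool: the function names, the extension lists, and the `templates` folder name are all assumptions made for the example, and the database content is assumed to be passed in as plain strings (how you read them from tables such as #__content is up to you). References are detected by simple substring matching on the site-relative path, so URL-encoded or absolute links would not be recognized. The sketch moves unused files to a folder outside the website directory instead of deleting them.

```python
import shutil
from pathlib import Path
from typing import Iterable, Iterator

# Hypothetical lists for illustration; adjust them to your own website.
MEDIA_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".mp3", ".mp4", ".pdf",
                    ".doc", ".docx", ".xls", ".xlsx", ".ppt", ".pptx"}
TEMPLATE_EXTENSIONS = {".php", ".html", ".css", ".js", ".xml"}


def index_available_files(site_root: Path, media_dirs: Iterable[str]) -> set:
    """Step 1: site-relative paths of every media/document file on disk."""
    available = set()
    for media_dir in media_dirs:
        base = site_root / media_dir
        if not base.is_dir():
            continue
        for path in base.rglob("*"):
            if path.is_file() and path.suffix.lower() in MEDIA_EXTENSIONS:
                available.add(path.relative_to(site_root).as_posix())
    return available


def read_template_texts(site_root: Path, template_dir: str = "templates") -> Iterator[str]:
    """Yield the contents of every template file (folder name is an assumption)."""
    base = site_root / template_dir
    if not base.is_dir():
        return
    for path in base.rglob("*"):
        if path.is_file() and path.suffix.lower() in TEMPLATE_EXTENSIONS:
            yield path.read_text(encoding="utf-8", errors="ignore")


def find_used_files(texts: Iterable[str], available: set) -> set:
    """Step 2: the available paths that are mentioned in at least one text."""
    used = set()
    for text in texts:
        used.update(rel for rel in available if rel in text)
    return used


def move_unused(site_root: Path, quarantine: Path, unused: Iterable[str]) -> list:
    """Step 3: move unused files out of the site, keeping their folder layout."""
    moved = []
    for rel in sorted(unused):
        target = quarantine / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(site_root / rel), str(target))
        moved.append(rel)
    return moved


def cleanup(site_root: Path, quarantine: Path,
            media_dirs: Iterable[str], db_texts: Iterable[str]) -> list:
    """Run all the steps and return the list of moved files (step 4, the report)."""
    available = index_available_files(site_root, media_dirs)
    filesystem_index = find_used_files(read_template_texts(site_root), available)
    database_index = find_used_files(db_texts, available)
    unused = available - (filesystem_index | database_index)
    return move_unused(site_root, quarantine, unused)
```

A call might look like `cleanup(Path("/path/to/site"), Path("/path/to/quarantine"), ["images"], db_texts)`, where both paths are placeholders and `db_texts` is the list of strings read from your content tables. The returned list can be printed or written to a text file as the report. Moving instead of deleting is a deliberate choice here: if the substring matching misses a reference, the file can simply be moved back.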
As you can see from the above, this is a relatively easy script, and it shouldn’t take a long time to develop. If you need help developing it, then please contact us. We can create this script for you in record time and at a very affordable cost!