diskover

File system crawler and disk space analyzer using Elasticsearch and Kibana.


Project maintained by shirosaidev. Hosted on GitHub Pages.

diskover is a file system crawler that indexes your file metadata in Elasticsearch and visualizes your disk usage in Kibana. It crawls and indexes files on a local computer or on a remote server mounted over NFS or CIFS.

File metadata is bulk added and streamed into Elasticsearch, allowing you to search and visualize your files in Kibana without having to wait until the crawl is finished. diskover is written in Python and runs on Linux, OS X/macOS and Windows.
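As a rough illustration of the bulk-and-stream idea, the sketch below walks a directory, groups file metadata into fixed-size batches, and builds a newline-delimited request body in the shape the Elasticsearch bulk API expects. It is not diskover's actual code: the function names, the document field names, and the batch size are all assumptions made for this example, and nothing is sent to Elasticsearch. Depending on the Elasticsearch version, the action line may also need a `_type`.

```python
import json
import os


def iter_file_docs(root):
    """Yield one metadata dict per file found under root.

    The field names here are hypothetical, not diskover's schema.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            yield {
                "filename": name,
                "path_parent": dirpath,
                "filesize": st.st_size,
                "last_modified": st.st_mtime,
            }


def iter_batches(docs, batch_size):
    """Group an iterable of docs into lists of at most batch_size items."""
    batch = []
    for doc in docs:
        batch.append(doc)
        if len(batch) >= batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly smaller, batch


def bulk_body(batch, index_name):
    """Build a newline-delimited bulk body: one action line per document."""
    lines = []
    for doc in batch:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # the bulk body must end with a newline
```

Because both helpers are generators, each batch can be sent as soon as it fills up, which is why results can become searchable before the crawl ends. A caller might loop over `iter_batches(iter_file_docs("."), 500)` and POST each `bulk_body(...)` to the `_bulk` endpoint.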

diskover aims to help you manage your storage by identifying old and unused files and by giving better insight into file duplication and wasted space. It was originally designed for the VFX community to help manage rapidly growing amounts of data.

Screenshots

Kibana dashboards, saved searches and visualizations (included in the diskover download)

diskover-web (diskover’s web file manager and file system search engine)

Gource visualization support (see videos below)

diskover Gource videos

Installation Guide

Requirements

Windows Additional Requirements

Optional Installs

Download

$ git clone https://github.com/shirosaidev/diskover.git
$ cd diskover

You need at least Python 2.7 or Python 3.5, with the required Python dependencies installed using pip.

$ sudo pip install -r requirements.txt
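If you are unsure whether your interpreter meets the version requirement, a small check like the following can help. This is only a sketch: it assumes the requirement means 2.7 or later on the Python 2 line and 3.5 or later on the Python 3 line, and the function name `is_supported` is made up for this example.

```python
import sys


def is_supported(version_info):
    """Return True for Python 2.7+ on the 2.x line or 3.5+ on the 3.x line."""
    major, minor = version_info[0], version_info[1]
    if major == 2:
        return minor >= 7
    if major == 3:
        return minor >= 5
    # Assumption: any later major version is treated as new enough.
    return major > 3


if __name__ == "__main__":
    if not is_supported(sys.version_info):
        sys.exit("Unsupported Python version: %s" % sys.version.split()[0])
    print("Python version OK")
```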

Getting Started

Start diskover as root user with:

$ cd /path/you/want/to/crawl
$ sudo python /path/to/diskover.py

On Windows, run the Cygwin terminal as administrator and then run diskover.

With no flags, a crawl by default indexes only files that are at least 5 MB in size and were last modified 30 or more days ago. Use -h to see the CLI options.
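To make the default filter concrete, here is a minimal sketch of the same rule in plain Python. It is not taken from diskover's source: the names `matches_defaults` and `find_matching_files` are hypothetical, and it assumes that 1 MB means 1024 * 1024 bytes and that a day is 86400 seconds, which may not match how diskover itself computes the thresholds.

```python
import os
import time

MIN_SIZE_BYTES = 5 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes
MIN_AGE_DAYS = 30


def matches_defaults(size_bytes, mtime, now=None):
    """Return True if a file is at least 5 MB and modified 30+ days ago."""
    if now is None:
        now = time.time()
    age_days = (now - mtime) / 86400.0
    return size_bytes >= MIN_SIZE_BYTES and age_days >= MIN_AGE_DAYS


def find_matching_files(root):
    """Yield paths under root whose size and mtime pass the default filter."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # skip files that cannot be read
            if matches_defaults(st.st_size, st.st_mtime):
                yield path
```

Running `list(find_matching_files("."))` gives a quick preview of which files under the current directory would pass a filter like this one.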

A successful crawl should look like this:

   ___       ___       ___       ___       ___       ___       ___       ___
  /\  \     /\  \     /\  \     /\__\     /\  \     /\__\     /\  \     /\  \
 /::\  \   _\:\  \   /::\  \   /:/ _/_   /::\  \   /:/ _/_   /::\  \   /::\  \
/:/\:\__\ /\/::\__\ /\:\:\__\ /::-"\__\ /:/\:\__\ |::L/\__\ /::\:\__\ /::\:\__\
\:\/:/  / \::/\/__/ \:\:\/__/ \;:;-",-" \:\/:/  / |::::/  / \:\:\/  / \;:::/  /
 \::/  /   \:\__\    \::/  /   |:|  |    \::/  /   L;;/__/   \:\/  /   |:\/__/
  \/__/     \/__/     \/__/     \|__|     \/__/    v1.0.12    \/__/     \|__|
                                      https://github.com/shirosaidev/diskover

2017-05-17 21:17:09,254 [INFO][diskover] Connecting to Elasticsearch
2017-05-17 21:17:09,260 [INFO][diskover] Checking for ES index: diskover-2017.04.22
2017-05-17 21:17:09,262 [WARNING][diskover] ES index exists, deleting
2017-05-17 21:17:09,340 [INFO][diskover] Creating ES index
Crawling: [100%] |########################################| 8570/8570
2017-05-17 21:17:16,972 [INFO][diskover] Finished crawling
2017-05-17 21:17:16,973 [INFO][diskover] Directories Crawled: 8570
2017-05-17 21:17:16,973 [INFO][diskover] Files Indexed: 322
2017-05-17 21:17:16,973 [INFO][diskover] Elapsed time: 7.72081303596
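If you want to check crawl results from a script, the summary lines in the output above can be picked out with a regular expression. The sketch below assumes the log format shown in this example stays the same; the function name `parse_summary` is made up for illustration and is not part of diskover.

```python
import re

# Matches the three summary lines printed at the end of a crawl.
SUMMARY_RE = re.compile(
    r"\[INFO\]\[diskover\] "
    r"(Directories Crawled|Files Indexed|Elapsed time): (\S+)"
)


def parse_summary(log_text):
    """Return a dict of the summary values found in diskover's log output."""
    summary = {}
    for label, value in SUMMARY_RE.findall(log_text):
        # Elapsed time is fractional seconds; the other values are counts.
        summary[label] = float(value) if label == "Elapsed time" else int(value)
    return summary


if __name__ == "__main__":
    sample = (
        "2017-05-17 21:17:16,973 [INFO][diskover] Directories Crawled: 8570\n"
        "2017-05-17 21:17:16,973 [INFO][diskover] Files Indexed: 322\n"
        "2017-05-17 21:17:16,973 [INFO][diskover] Elapsed time: 7.72081303596\n"
    )
    print(parse_summary(sample))
```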

User Guide

Read the wiki for more documentation on how to use diskover.

Discussions/Questions

For discussions or questions about diskover, please ask on the Google Group.

Bugs

To report bugs in diskover, please use the issues page.

License

See the license file.