Clones MkIV

amp_god 27.07.06 20:59

Searches the given folders for identical files by comparing file hashes, and saves a list of the clones at the end. Supports every hashing algorithm that PHP supports.

LiNUX:
[code]
<?php
### Configuration ##
$hashing = "md5";
$savepath = "/tmp/clones.list";
###/Configuration ##
set_time_limit(9000);
ini_set("memory_limit", "256M");
$combined = 0;
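// Pass 1: add up the size of every file under $dir so process() can report overall progress.
function calculate_space($dir) {
  global $combined;
  $sock = @opendir($dir);
  if(!$sock) return; // unreadable folder, skip it
  print("Going through folder $dir...\n");
  // Compare with FALSE explicitly; a file named "0" would otherwise end the loop early.
  while(($file = readdir($sock)) !== FALSE) {
    if($file === "." || $file === "..") continue;
    $fullpath = "$dir/$file";
    // Symlinked folders are not followed, so a link cycle cannot recurse forever.
    if(is_dir($fullpath) && !is_link($fullpath)) { calculate_space($fullpath); }
    if(is_file($fullpath)) { $combined += filesize($fullpath); }
  }
  closedir($sock);
}

// Pass 2: walk the same tree again and hash every file.
function go_through($dir) {
  $sock = @opendir($dir);
  if(!$sock) return; // unreadable folder, skip it
  while(($file = readdir($sock)) !== FALSE) {
    if($file === "." || $file === "..") continue;
    $fullpath = "$dir/$file";
    if(is_dir($fullpath) && !is_link($fullpath)) { go_through($fullpath); }
    if(is_file($fullpath)) { process($fullpath); }
  }
  closedir($sock);
}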

$filehash = array();
$clonelist = array();
$processed = 0;
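// Hash one file, report it as a clone if the hash was already seen, and print overall progress.
function process($path) {
  global $filehash, $clonelist, $processed, $combined, $hashing;
  $size = filesize($path);
  $processed += $size;
  $size = number_format($size);
  print("Now processing:\n  $path\n    Size: $size bytes\n");
  print("    Hash: ");
  // hash_target() shows a progress display; it returns FALSE for small files or on error.
  $hash = hash_target($path);
  if($hash === FALSE) {
    $hash = @hash_file($hashing, $path);
  }
  if($hash === FALSE) {
    // Skip unreadable files; otherwise they would all "match" each other under an empty key.
    print("unreadable, skipping\n\n");
    return;
  }
  print("$hash\n");
  print("  Checking for clones...");
  if(isset($filehash[$hash])) {
    print("*FOUND*\n    Matching file: {$filehash[$hash]}\n");
    $clonelist[] = "{$filehash[$hash]} <-MATCHES-> $path";
  } else {
    print(" done\n");
    $filehash[$hash] = $path;
  }
  print("\n");
  // Guard against division by zero when every scanned file is empty.
  $perc = $combined > 0 ? round($processed / $combined * 100, 2) : 100;
  $size1 = number_format($combined);
  $size2 = number_format($processed);
  print("$perc% done, ($size2 of $size1 bytes)\n\n");
}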

// Hash $target in chunks of about 1% of its size, printing a percentage that is overwritten in place.
// Returns FALSE for files under 1024 bytes or on error, so the caller can fall back to hash_file().
function hash_target($target) {
  global $hashing;
  $size = filesize($target);
  if($size === FALSE || $size < 1024) return FALSE;
  $comp = 0;
  $buflen = (int) ceil($size / 100); // fread() expects an integer length
  $sock = @fopen($target, "rb");
  if(!$sock) return FALSE;
  $hashlink = @hash_init($hashing);
  if(!$hashlink) { fclose($sock); return FALSE; }
  $perc = "";
  while($size > $comp) {
    // Move the cursor back over the previously printed percentage.
    print(str_repeat(chr(8), strlen($perc)));
    $buf = fread($sock, min($buflen, $size - $comp));
    // Stop on a read error, or if the file shrank while being read (avoids an endless loop).
    if($buf === FALSE || $buf === "") { fclose($sock); return FALSE; }
    $comp += strlen($buf);
    hash_update($hashlink, $buf);
    $perc = round($comp / $size * 100) . "%";
    print($perc);
  }
  print(str_repeat(chr(8), strlen($perc)));
  $hash = hash_final($hashlink);
  fclose($sock);
  return $hash;
}

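// Write one "original <-MATCHES-> clone" line per detected clone to $savepath.
function save_clonelist() {
  global $savepath, $clonelist;
  $sock = @fopen($savepath, "w");
  if(!$sock) {
    // Do not report success if the list file cannot be opened for writing.
    fwrite(STDERR, "failed, cannot write to $savepath\n");
    exit(1);
  }
  foreach($clonelist as $clone) {
    fwrite($sock, "$clone\n");
  }
  fclose($sock);
}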

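// Skip $argv[0] (the script name); keep only the arguments that are existing folders.
$folders = array_filter(array_slice($argv, 1), "is_dir");
if(empty($folders)) {
  exit("Usage: php clones.php /folder1 [/folder2 ...]\n");
}
// Pass 1: total size, needed for the overall percentage.
foreach($folders as $folder) {
  calculate_space($folder);
}
// Pass 2: hash every file and collect the clones.
foreach($folders as $folder) {
  go_through($folder);
}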
print("Saving the clonelist to $savepath...");
save_clonelist();
print("done\n");
?>
[/code]

The program shows hashing progress as a percentage for larger files (1024 bytes and up) :)
The script is still under development, so more features are on the way.
If the default MD5 hashing does not suit you, set $hashing to one of the names listed at http://fi.php.net/manual/en/function.hash-algos.php
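
To see which names $hashing will accept on your own machine, you can ask PHP directly with hash_algos(). This is only a minimal sketch: pick_algorithm() is a hypothetical helper that is not part of the script above, and falling back to "md5" is an assumption made for the example.

```php
<?php
// Hypothetical helper, not part of the original script:
// returns $wanted if this PHP build supports it, otherwise $fallback.
function pick_algorithm($wanted, $fallback = "md5") {
  // hash_algos() returns the names of all algorithms the hash extension knows.
  return in_array($wanted, hash_algos(), TRUE) ? $wanted : $fallback;
}

$hashing = pick_algorithm("sha256");
print("Using $hashing\n");
print(hash($hashing, "hello") . "\n");
```

Checking the name up front is worthwhile because newer PHP versions throw an error from hash_init() for an unknown algorithm instead of returning FALSE.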

Usage in the Linux console:
php clones.php /folders /to /go /through
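
After a run, the saved list can be read back for further processing. The sketch below assumes the default $savepath of /tmp/clones.list and the "<-MATCHES->" line format that save_clonelist() writes; read_clonelist() is a hypothetical helper, not part of the script, and it will misparse paths that themselves contain a newline or the separator text.

```php
<?php
// Hypothetical helper: split the lines written by save_clonelist() back into pairs.
function read_clonelist($path) {
  $pairs = array();
  $lines = @file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
  if($lines === FALSE) return $pairs; // missing or unreadable list: no pairs
  foreach($lines as $line) {
    // Left side is the first file seen with that hash, right side is the clone.
    $parts = explode(" <-MATCHES-> ", $line, 2);
    if(count($parts) == 2) {
      $pairs[] = array("original" => $parts[0], "clone" => $parts[1]);
    }
  }
  return $pairs;
}

foreach(read_clonelist("/tmp/clones.list") as $pair) {
  print($pair["clone"] . " duplicates " . $pair["original"] . "\n");
}
```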