imagecatcher

weicco 15.10.04 10:31

Ruby - Fetches the images from a web page

# Copyright (c) 2004 Marko Parkkola

#Permission is hereby granted, free of charge, to any person
#obtaining a copy of this software and associated documentation files
#(the "Software"), to deal in the Software without restriction, including
#without limitation the rights to use, copy, modify, merge,
#publish, distribute, sublicense, and/or sell copies of the Software,
#and to permit persons to whom the Software is furnished to do
#so, subject to the following conditions:

#Redistributions of source code must retain the above copyright notice
#and this list of conditions
#The name of the contributor may not be used to endorse or promote products



# USAGE:
# script.rb osoite.com
# script.rb osoite.com/index.html
# script.rb osoite.com/index.html 8080

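require 'net/http'

# First argument is "host" or "host/path"; the optional second one is the port.
abort "Usage: #{$0} host[/path] [port]" if ARGV.empty?

host, fn = ARGV[0].split('/', 2)
port = ARGV[1].nil? ? 80 : ARGV[1].to_i

fn = "index.html" if fn.nil? || fn.empty?

puts "Fetching #{host} #{fn} : #{port}"

# Current Net::HTTP#get returns only the response; the body is read from it.
h = Net::HTTP.new(host, port)
resp = h.get('/' + fn)
exit unless resp.code == "200"
data = resp.body

puts "Images --"
# x[0] is the value of the src attribute of each <img> tag.
data.scan(/<img[^>]+src="([^"]+)"/i) { |x|
    src = x[0]
    # Absolute URLs are used as-is; anything else is treated as relative to the host root.
    url = src =~ /^http/ ? src : "http://#{host}:#{port}/" + src.sub(/^\//, '')
    puts url + " fetching..."

    img = Net::HTTP.get_response(URI(url))
    if img.code != "200"
        puts "Failed!"
    else
        # Save under the last path component, in binary mode so image data is not mangled.
        name = File.basename(URI(url).path)
        File.open(name, "wb") { |f| f.write img.body }
    end
}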
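
The script builds image addresses from the host name alone, so a src that is relative to the page's own directory is not resolved correctly. A minimal sketch of one alternative, assuming the page's full address is known: Ruby's standard URI.join resolves a src against it. The helper name resolve_src and the example paths are made up for illustration and are not part of the original script.

```ruby
require 'uri'

# Hypothetical helper: resolve an <img> src value against the page address.
def resolve_src(page_url, src)
  URI.join(page_url, src).to_s
end

page = 'http://osoite.com/dir/index.html'

puts resolve_src(page, 'pics/a.png')                 # relative to the page's directory
puts resolve_src(page, '/pics/a.png')                # relative to the host root
puts resolve_src(page, 'http://other.example/a.png') # already absolute, returned unchanged
```
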

Edited: 15:56 16.10.04
Akiro 15:56 16.10.04 
That index.html default is a bit weak: the index file could just as well be index.php or even foo.bar, depending on how the web server is configured and what the user has put there.

PS. Ruby is quite a nice language :-)
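
One way around guessing the index file name, as a rough sketch only: when no path is given, request "/" and let the server pick whatever it has configured as its index document. The helper name split_target is hypothetical and not part of the script above.

```ruby
# Hypothetical helper: split "host/path" like the script does, but fall back
# to "/" instead of assuming the index file is called index.html.
def split_target(arg)
  host, path = arg.split('/', 2)
  path = path.nil? ? '/' : '/' + path
  [host, path]
end

p split_target('osoite.com')            # prints ["osoite.com", "/"]
p split_target('osoite.com/index.html') # prints ["osoite.com", "/index.html"]
```

The returned path can then be passed to Net::HTTP#get unchanged.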
Ztane 10:40 26.10.04 
ARGV[0].split '/', 2

This already seems a bit silly; why can't the whole URL just be given in its proper form?
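
A minimal sketch of what that could look like, assuming the script took a full http:// URL as its argument: Ruby's standard URI.parse then supplies the host, port and path, so the manual split and the separate port argument are no longer needed. The helper name parse_target is hypothetical.

```ruby
require 'uri'

# Hypothetical helper: take a complete URL and return the pieces that
# Net::HTTP needs. URI::HTTPS is a subclass of URI::HTTP, so https:// URLs
# pass this check too.
def parse_target(url)
  uri = URI.parse(url)
  unless uri.is_a?(URI::HTTP)
    raise ArgumentError, "expected an http:// URL, got #{url.inspect}"
  end
  # request_uri is the path plus any query string, and "/" for an empty path.
  [uri.host, uri.port, uri.request_uri]
end

p parse_target('http://osoite.com')                 # prints ["osoite.com", 80, "/"]
p parse_target('http://osoite.com:8080/index.html') # prints ["osoite.com", 8080, "/index.html"]
```
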