Hey! That’s mine!
Most web professionals are all too aware of the problems caused by hotlinkers. Leechers. Bandwidth thieves. People who use images hosted on your web server on their own pages.
For some lucky people who don’t pay by the gigabyte for the amount of data they transfer, that’s not too big a deal. Who cares if some little-trafficked weblog uses your photograph of snow falling in New York?
For other sites, however, it’s a much bigger problem. If a 100K JPEG is hotlinked on a site that gets, say, 1,000 hits a day, that’s 100MB of data transferred from your site without a single person actually visiting your site. If you have only a few gigabytes of transfer available per month — or worse, pay money per gigabyte — this can add up. And if someone were to leech an entire gallery from your site …
The trouble is that the usual approaches for preventing hotlinking have a couple of side effects.
Quick fixes aren’t perfect
The usual approach is to instruct the server to deny all requests for images where the HTTP referer header [1] is neither from your own site nor blank. Thus, only people actually browsing your web site (or those whose browsers are not passing referrer headers for whatever reason) will be able to see the image.
A second approach is to redirect off-site traffic to an alternate image — either a general “hotlinking denied” image, or (in the case of some mischievous webmasters) something more shocking.
The trouble with these techniques is that regular linking is also prevented. Since browsers also send referrer headers when someone clicks a link to one of your images, the only way for people to go directly to your pictures would be to copy and paste a link into a new browser window. Granted, some webmasters might like this — it ensures that people link to the pages that photos appear on — but others may want links to succeed. Plus, if you have a gallery page with lots of images, this method makes it difficult for someone to point directly to a particular piece of your fantastic artwork.
The solution I’m about to suggest solves this problem while giving credit to you when people link to your pictures.
Where do we go from here?
With PHP and mod_rewrite, you can disallow embedding and allow linking while automatically creating gallery pages for those direct linkers. It’s the best of all worlds, and here’s how to do it.
You’ll need an Apache server capable of running PHP, with mod_rewrite enabled. If you don’t know what you have, ask your hosting company, or give it a try — if it fails, you’ll know you don’t have them.
First, create a new file called showpic.php and put this code in it:
<?php
header("Content-type: text/html");
header("Expires: Mon, 26 Jul 1997 05:00:00 GMT");
header("Cache-Control: no-store, no-cache, must-revalidate");
header("Cache-Control: post-check=0, pre-check=0", false);
header("Pragma: no-cache");
$pic = strip_tags( $_GET['pic'] );
if ( ! $pic ) {
  die("No picture specified.");
}
?>
<html>
<head>
<title><?php echo($pic); ?></title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<body>
<p><img src="/<?php echo($pic); ?>" alt="Image"></p>
<p>Image from your web site.</p>
</body>
</html>
Needless to say, you should change the HTML to match your own web site.
Let’s take a look at the PHP in there. The first header() call makes sure the Content-Type sent to the browser identifies the document as HTML. We’ll see why this is important in a moment. The remaining header() calls ask browsers and caches not to store the page. The script then reads the pic parameter from the query string and strips any tags from it (to prevent cross-site scripting exploits). If no picture was specified, it exits quite abruptly. However, since this script should never be called without that parameter (again, we’ll see why later), that’s not too much of an issue.
Assuming that the parameter is there, the rest of the PHP outputs it in the right place to create a valid <img> tag and adds the file name to the page <title>.
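One caveat worth sketching: strip_tags() removes markup but leaves quotation marks alone, so a value echoed inside a src attribute could still close that attribute early. The following is a minimal sketch of one way to tighten this, assuming PHP 7 or later; the helper name clean_pic_name() is hypothetical and not part of the script above.

```php
<?php
// Hypothetical helper (not part of the article's script): clean the
// requested file name before it is echoed into the page.
function clean_pic_name(string $raw): string
{
    // strip_tags() removes markup such as <script>, as the article's script does.
    $pic = strip_tags($raw);
    // htmlspecialchars() with ENT_QUOTES also encodes quotation marks, so the
    // value cannot close the src="..." attribute it is echoed into.
    return htmlspecialchars($pic, ENT_QUOTES, 'UTF-8');
}

// isset() avoids a notice when the script is called without ?pic=...
$pic = isset($_GET['pic']) ? clean_pic_name($_GET['pic']) : '';
```

With this in place, the existing echo($pic) calls could stay as they are, since $pic would already be encoded for HTML output.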
So far, this is just a simple script. Go to www.yoursite.com/showpic.php?pic=yourimage.gif and it will output a simple page showing yourimage.gif and a credit.
Now it gets interesting
If you’re an .htaccess neophyte, take a look at this introduction, which will take you through the basics.
The next step is to add the following code to your .htaccess file:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} .*jpg$|.*gif$|.*png$ [NC]
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !yoursite.com [NC]
RewriteCond %{HTTP_REFERER} !friendlysite.com [NC]
RewriteCond %{HTTP_REFERER} !google. [NC]
RewriteCond %{HTTP_REFERER} !search?q=cache [NC]
RewriteRule (.*) /showpic.php?pic=$1
Let’s go through this one line at a time. RewriteEngine On gets mod_rewrite ready to do its stuff. First come the conditions:
RewriteCond %{REQUEST_FILENAME} .*jpg$|.*gif$|.*png$ [NC]
Okay. First condition: the file name must end in .jpg, .gif, or .png. This makes sure our hotlink prevention only triggers on images. You might want to change this to include .swf, .mp3, or other similar files.
RewriteCond %{HTTP_REFERER} !^$
Second condition: the referrer must not be blank. This means that people who aren’t passing referrer headers, for whatever reason, will still be able to see your images.
RewriteCond %{HTTP_REFERER} !yoursite.com [NC]
RewriteCond %{HTTP_REFERER} !friendlysite.com [NC]
These next conditions allow linking from your own site, and any other friendly sites that you want to allow linking from. Change the sites to your own, of course. Apache isn’t psychic.
(Don’t know what the ! .*$ is all about? It’s a regular expression. If you keep the format the same, you don’t need to worry about it.)
RewriteCond %{HTTP_REFERER} !google. [NC]
RewriteCond %{HTTP_REFERER} !search?q=cache [NC]
Okay. Finally, let’s let Google get through. These last conditions allow people using the Google cache and Google Image Search to see your pictures. (You might want to remove this if you don’t want people to find your pictures this way, but I don’t recommend it.)
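Because each condition is a regular expression, characters such as . and ? carry special meaning unless they are escaped. The sketch below uses PHP’s preg_match(), whose pattern syntax is close to mod_rewrite’s, to show the difference. The referrer strings are made-up values for illustration only.

```php
<?php
// Illustrative referrers (made-up values, not taken from the article).
$cacheReferer = 'http://www.google.com/search?q=cache:abc';
$lookalike    = 'http://notyoursiteXcom.example/page.html';

// "search?q=cache" as written: "?" makes the preceding "h" optional, so the
// pattern looks for "searcq=cache" or "searchq=cache", never a literal "?".
$unescaped = preg_match('/search?q=cache/i', $cacheReferer); // 0

// Escaping the "?" matches the literal query string.
$escaped = preg_match('/search\?q=cache/i', $cacheReferer); // 1

// An unescaped "." matches any character, so "yoursite.com" also matches
// "yoursiteXcom"; "yoursite\.com" does not.
$dotLoose  = preg_match('/yoursite.com/i', $lookalike);  // 1
$dotStrict = preg_match('/yoursite\.com/i', $lookalike); // 0
```

If you want the stricter behaviour in .htaccess, the equivalent patterns would be !yoursite\.com and !search\?q=cache. A Google cache referrer is already let through by the google. condition, so the looser pattern does no harm in practice.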
All together now
Now let’s hook the two together. On to the last line of the .htaccess file, which is:
RewriteRule (.*) /showpic.php?pic=$1
This last rule silently redirects the request to /showpic.php?pic=[the requested file]. Thanks to the wonder of Apache, this will automatically include all necessary slashes and path information, and not be visible to the end user.
So what happens?
Now, the only way a request will have got this far is if:
- It’s for an image file,
- it has a referrer (the header isn’t blank), and
- it’s not coming from a domain that you own or are friends with.
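To make the decision chain concrete, here is a rough PHP model of it. This is only a sketch: the real work is done by Apache, the function name would_rewrite() is hypothetical, the host names are the article’s placeholders, and the image pattern is slightly stricter than the one in the .htaccess file because it requires a dot before the extension. The search?q=cache condition is left out for brevity.

```php
<?php
// Returns true when all of the article's conditions would hold, i.e. when
// Apache would hand the request to showpic.php instead of serving the image.
function would_rewrite(string $filename, string $referer): bool
{
    // Condition 1: the request must be for an image file.
    if (!preg_match('/\.(jpg|gif|png)$/i', $filename)) {
        return false;
    }
    // Condition 2: a blank referrer is always let through.
    if ($referer === '') {
        return false;
    }
    // Remaining conditions: referrers from allowed sites are let through.
    foreach (['yoursite\.com', 'friendlysite\.com', 'google\.'] as $allowed) {
        if (preg_match('/' . $allowed . '/i', $referer)) {
            return false;
        }
    }
    return true;
}
```

For example, would_rewrite('/pics/snow.jpg', 'http://blog.example/post') returns true, while the same file requested from a page on www.yoursite.com, or with no referrer at all, returns false.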
So firstly, and most importantly, if someone tries to hotlink one of your images, it’ll fail: the browser, instead of receiving an image file, will receive the result of showpic.php, which is sent as text/html. It’ll realise it can’t display it, and produce a broken image placeholder. Bandwidth saved.
On the other hand, if someone tries to link directly to your images, they’ll get silently redirected to an HTML page with your credit on it! No red X, no silly “denied” image — just a handy page that shows them the image they want to see, and gives you credit for your work.
See it in action
First of all, let’s check that the script still allows images to load for people visiting your own web site. Yes, that looks fine. Now, let’s see if A List Apart can hotlink my images. Nope, guess not. And what happens if you just link straight to the image file? Well, there’s a nicely formatted page.
Taking it further
If you’re using some kind of content management system like Gallery, there might be a way to tie a script like this into a database of pictures, and automatically generate ALT tags and more information about the picture.
Of course, I’ll leave that as an exercise for the reader.
[1] For some reason, the HTTP specifications misspell “referrer” as “referer.”
Editor’s Note: The PHP code example in this article has been edited to address a small potential cross-site scripting vulnerability, to work with register_globals and short_tag off, and to work with caching. Thanks to everyone who helped make it better.
When I click the final link to the “nicely formatted image” it only showed the credit line the first time – on all subsequent visits it only showed the image as it normally would. This was in Firefox .9.
In Opera 7.52 with referrer logging turned on, the last page only shows the credit line (even after reload). With referrer logging turned on, it only shows the picture (as in _only_ the picture – not a html page).
Regardless, the article gets the idea across 🙂
I meant to say:
With referrer logging turned OFF, it only shows the picture (as in _only_ the picture – not a html page).
Testing in IE 6 and Opera 7.52, the “Nope, guess not” page does have a broken image at the end.
Svein, I didn’t know Opera could disable refer logging. Good to know. Thanks.
Forgot to mention: while Mark is correct in regards to the way Firefox is displaying the page, Mozilla 1.7 and the latest K-Meleon (which is essentially a modified Mozilla 1.5) display the last page as intended (or as I believe was intended).
I had been researching the finer points of this over the last day or so (casually) when your post appeared on my RssReader.
All of the posts worked well for me in Firefox .9, IE6 and the RssReader preview pane.
I will be implementing this for my clients in the next version of my CMS. Thanks!
The link at the end of the section titled “Where do we go from here?” should be http://www.yoursite.com/showpic.php?pic=yourimage.gif.
I’d also like to point out that since PHP 4.2.0 register_globals is OFF by default and thus using $pic alone will not get you the results you expect; try using $_GET["pic"] instead.
Hmm, after reading some of the above comments I tried refreshing the final HTML page and everything seems fine. Then I closed the tab (in Firefox) and clicked on the link in the article again. This time I got the image only (no HTML). I then closed that tab, cleared the browser’s cache and clicked on the link in the article again. Aha, I see the HTML page again.
So apparently, when a user first clicks on the link, it works as intended, but on any subsequent visits, the browser first checks the cache for an image by that name (remember, with mod_rewrite we never change the URL as we would with a redirect, so the browser doesn’t know the difference) and if it finds it, the image is displayed from cache without the HTML. Only when no such image is found is a request made, which then triggers this script.
Personally, I really don’t see this as a big problem, as the hotlink has already been interrupted, but I wonder if this could be avoided by telling the browser not to cache the image (with additional headers).
I don’t see why the file extension of the PHP-generated page has to be “.jpg”: it is neither a JPEG nor is the content type “image/jpeg”. If you present the user an HTML page, don’t confuse him by obfuscating the original file type.
Anyway, I like the idea behind your article though. Thanks!
Ed,
No, the link as it appears in the article is correct. That is the point of the rewrite rule. When someone links directly to your image from their site, they are redirected to http://www.yoursite.com/showpic.php?pic=yourimage.gif. This redirect is transparent to the user and the browser as it is all handled on the server. I would suggest rereading the article.
However, you do have a point with the register globals thing. That could present a problem to some.
Waylman,
Thanks, but I’m familiar with the intricacies of mod_rewrite. The author writes:
……
So far, this is just a simple script. Go to http://www.yoursite.com/showpic.php?yourimage.gif and it will output a simple page showing yourname.gif and a credit.
……
At that point in the article the mod_rewrite code has not been introduced and the author is merely suggesting that going to the URL will show the picture and a credit. However, the script will die immediately since $pic is not defined in the query string. Like I said, the link should be http://www.yoursite.com/showpic.php?pic=yourimage.gif.
It’s an excellent article, but I have to point this out to the not-so-strong PHP people out there. It is bad practice to use the $_GET[] global variable to directly call a file on your server. Check out the php.net site and read the article about “best practice”.
Basically, if you do this, you are telling me to hijack your files. Consider this: http://www.yoursite.php?pic=image.gif. What if you change it to this: http://www.yoursite.php?pic=secret_file.php
A better way to approach this _could_ be to write the above-mentioned file as a module, without the HTML header. Instead, generate a header as image/jpeg (or whatever you want), then call this script from another script, hiding all the actual images from the whole world. (I have been doing this for one year now, and the best part is I can turn it on and off anytime I want.) Hope this helps.
good point joel, and run your solution by me again a little slower please?
Taking it further…
You could also allow hot-linking but use PHP with an image editing module to add a watermark, your home page URL, copyright info and/or your name onto the image itself.
I have always been trying to find a great way to be able to protect my images but still enable being able to pull up an image without it having to be on a page on my site.
Thank you so much. Though, I must mention – It didn’t work until I removed the “/” in front of the image and in front of the file in my .htaccess. *shrugs*
I spent a while pondering this, as I’ve seen a few sites fail when using referrer based blocking, and came up with an alternative, but it’s a lot more work.
When an image is displayed, the SRC for the IMG tag is a PHP file, img.php, which gets xxxxxxxx.jpg and reformats it for display. I was already doing this bit, so the images could be resized on the fly – call it with ‘&width=640’ for a 640px wide image, for example. The bit I added was that each page that displays images calculates a hash from the current time, to the nearest hour, and passes this on to the PHP script as ‘&auth=xxxxxxxxxx’. When img.php runs, it checks this against the hash for the current time.
If they match, it outputs the image as requested. If they fail, it limits the size and quality, and adds extra text with the site address across the top and bottom of the image.
Because the image is still displayed, it doesn’t break pages for people who do nick images (they’re not usually *meaning* to be evil ;). Because it adds the site address, it can still advertise my site, but because it limits the jpeg quality, and refuses to output in the largest sizes, it cuts down the bandwidth. Hopefully prevents excessive leeching, whilst still allowing people to post the odd image on their blog.
Oh, and once the image has been generated once at a specified size / quality, it caches it for use next time, so it doesn’t have to resize every time.
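A time-based token like the one described above could be sketched roughly as follows, assuming PHP 7 or later. The secret value, the function names, and the one-hour grace period are all assumptions made for illustration; the comment itself only describes hashing the current hour.

```php
<?php
// Hypothetical secret; in practice this would live outside the web root.
const IMG_SECRET = 'change-me';

// Build a token for the hour that contains $time (a Unix timestamp).
function hour_token(int $time): string
{
    $hour = intdiv($time, 3600); // same value for every second of that hour
    return hash_hmac('sha256', (string) $hour, IMG_SECRET);
}

// Check a token passed as &auth=... against the current hour. The previous
// hour is accepted too (an addition to the scheme described above) so that
// pages rendered just before the hour changed keep working.
function token_is_valid(string $auth, int $now): bool
{
    return hash_equals(hour_token($now), $auth)
        || hash_equals(hour_token($now - 3600), $auth);
}
```

A page would append '&auth=' . hour_token(time()) to each image URL, and the image script would call token_is_valid($_GET['auth'], time()) before deciding whether to serve the full-quality image or the reduced one.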
I forgot to mention – I’m using Firefox, and it works fine – though it does appear to have some sort of glitch where it will sometimes suddenly show the image and then upon refresh, it won’t. Odd.
Joel, I don’t see a security problem here. He’s not returning the contents of the file. He’s just using the file name to construct an <img> element.
joel – the article’s focus was people linking directly to the files. If they’re linking then they know the file name already, and this is meant to be a graceful way to add a layer in between the transaction.
I agree with your post, however in the scope of this article embedding something else as a $pic var isn’t going to do much since it is only constructing an img element and Apache is checking the mime type before sending it to the script.
The generated page contains the <img> element with a src of the current URI. This is a problem. It has already been retrieved, namely as the current page, with a type of text/html. You need to explicitly make sure the generated page sends headers to disable any cache of it.
Directly after reading this article two things came to mind:
Firstly, this would force the people who want to display our images on their sites to directly copy our image to their servers, which by all accounts could be breaking copyright law even if it was purely for reference purposes and they gave full credit and linkage to our site. These same actions would also result in duplicates of the image cropping up all over the web.
This goes against the idea of such technologies as BitTorrent, whereby when the demand for a file drops, one kind individual keeps a torrent open so as to ensure every user can gain quick and easy access; instead it is being suggested that we expect other people to host our content.
Secondly, there is the problem of people who want to make reference to an image that we host but are unable to ever host it or embed it, such as in a bulletin board that does not allow HTML or images.
Hello. I’m the guy who wrote this article.
Caching: you’re right, that’s a small issue – if someone 1) visits an offsite page and their browser, or a webcache, caches the HTML file as the picture and 2) they then visit your own site and their cache fails to check for an updated version, then the picture will still fail.
Try adding the headers described in http://php.weblogs.com/stories/storyReader$550 to the script.
Security: PHP is treating $pic as a string, similar to a “What’s your name?” script that echoes it on the following page – it doesn’t even touch the file on the server. I don’t see how this is a security issue.
register_globals: if register_globals is off, er, yes, this won’t work. That’s true. In all honesty, I didn’t consider it; I’ve never yet encountered a commercial host who’s turned off register_globals, and I must be getting a little sloppy with my code. Apologies.
This php code has some problems…
It works only if
register_globals and short_tag are turned on.
[code]
--- showpic.php.orig 2004-07-14 08:54:49.000000000 +0300
+++ showpic.php 2004-07-14 08:57:29.000000000 +0300
@@ -1,12 +1,12 @@
-
+
-
" alt="Image">
Image from
[/code]
Hope this helps.
Thank you Thomas for writing this article and presenting a novel technique to a common problem. I especially liked how you handled different linking situations.
why are you suggesting using php, i dont get it. traditionally this is achieved purely through .htaccess – thus
RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www.)?mydomain.com/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://(www.)?otheralloweddomain.com(/)?.*$ [NC]
RewriteRule .(gif|jpg)$ http://www.mydomain.com/nasty.gif [R,L]
works for me.
Well, there is a tiny issue with this code IMHO. It echos the $pic variable without checking what’s actually in the variable. So it’s wide open for any kind of Cross Site Scripting (XSS) attack. I can output any HTML and JavaScript code I want on your website, e.g. by calling /showpic.php?pic=%22%3E%3Ch1%3E%3Cblink%3EXSS%3C%22
In this particular case it’s not a big security hole, but if the script were part of a larger website (maybe even with session/cookie based authentication like “Gallery”) someone could hijack accounts by providing a carefully prepared link to other users. Check http://www.cgisecurity.com/articles/xss-faq.shtml for details.
At least call strip_tags($pic) or htmlentities($pic) before outputting the string. Just **never** trust user data.
Jim – did you even read the article before replying? That question’s answered in the first couple of sections!
Sascha – hmm, okay, I see what you mean – combined with the JavaScript cookie-stealing trick, that could be an issue. In that case, I agree with you that strip_tags() might need to be called.
many have already pointed out that the PHP could be better. ok, it’s not the main focus of this article, but yes: use $_GET, do some sanity checking and, if necessary, stripping and replacing characters to make user data safe. I’d also add not using short open tags and the <?= shorthand, as again this will not work on some server configurations. another point that may be worth mentioning is that some software firewalls, like norton internet security, routinely strip out http-referer information from any web traffic going through, so expect some users who have this type of software installed to experience slight difficulties.
I think this could be a pretty effective way to deal with hotlinking, with a little tweaking. For some reason, though, using Opera, I didn’t get the image on the third page. Referrer logging is enabled.
Patrick: if the referrer information’s stripped, then it allows the image by default – although if your referrer information *is* being stripped, then you’re probably going to have more problems than this, particularly if you’re trying to download files from some web sites that require on-site links.
ILJD: I’m using the latest version of Opera, and I see the image without a problem – it’s possible it’s a cache issue, which may be fixed by the updated version of the script I just uploaded.
“if the referrer information’s stripped, then it allows the image by default”
fair enough
“- although if your referrer information *is* being stripped, then you’re probably going to have more problems than this, particularly if you’re trying to download files from some web sites that require on-site links.”
that was my original point, in a roundabout way (although, as noted above, it doesn’t affect this solution, granted): don’t rely on http-referer, as there might be very legitimate situations in which it comes back empty
My concern about the accessibility of this solution arose immediately when I saw alt=”Image” in the code. Now, I understand that the alt attribute (it’s not a tag) information is passed from the page and so would be very difficult to control without a database, but I would think there would be something better to put here than “Image.” Some validators will fail simply by finding the word “image” in an alt attribute.
I considered directing the user to the site’s main page, or using a short statement about why the attribute cannot be filled. I don’t have a perfect solution for this, perhaps someone with more experience may have a better idea. It would be good to see some discussion of this.
I see your kitty on the nope guess not link….(?)
cheers, bill
In IE6, I get “No picture specified” when I follow the third link (the direct one to belle.jpg). It works in Firefox.
Clever idea, though.
Chris: sorry about that, fixed now. Stupid script error on my site.
Michael: yep, you’re right that the accessibility isn’t perfect – but consider that, without the script, all the user would get is a picture with the same zero accessible information.
I didn’t really want to get into the intricacies of adding custom ALT tags for every picture – and just echoing the filename seemed rather silly – so I left it at “image”. That’s free to be changed, of course.
bill n: What browser are you using? Do you have a proxy? Is anything stripping out referrer information?
Depending on browser and settings, there might be issues of seeing the image (or not seeing it) connected with caching as you view the three examples. But viewing the three examples is not what will happen in actual use.
The same thing can happen with .htaccess, or with old-school “counters.”
For instance, a site contains a “hit counter.” The site is mirrored by a translation service. On the translation service’s page, the hit counter returns a GIF stating that the counter could not be viewed. If the user were to visit the original site (not the mirrored translation), the counter might still fail to appear – at least until the browser was refreshed. But few people, other than the site’s developer, are likely to visit the original page immediately after viewing the translated mirror.
I did some reading through 508 in order to find out what related to this issue. I believe adding descriptive alt attribute information may classify as an undue burden, so following 1194.2(a)(2) would apply.
http://www.section508.gov/index.cfm?FuseAction=Content&ID=12
I would conclude that a short description of why no data is available would be the proper response. Additionally, a link to the main page of the site would be helpful.
If I am missing something here, please inform me.
If you want an alt attribute, name your image with a particular pattern and use a PHP function like preg_replace() to strip/replace the filename into something resembling plain language. For example, my-new-car.jpg can have the extension stripped and the hyphens turned to spaces, and boom, you have an alt attribute of “my new car”. It’s a bit of a “trick”, but it gets the job done if you don’t have an alternate method of data storage like a DB or text files.
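As a rough sketch of that naming trick, assuming PHP 7 or later: the helper name alt_from_filename() is hypothetical, and treating underscores like hyphens is an extra assumption not mentioned in the comment.

```php
<?php
// Hypothetical helper: turn "photos/my-new-car.jpg" into "my new car".
function alt_from_filename(string $path): string
{
    // Keep only the file name, without directories or the extension.
    $name = pathinfo($path, PATHINFO_FILENAME);
    // Replace runs of hyphens or underscores with a single space.
    $words = preg_replace('/[-_]+/', ' ', $name);
    return trim($words);
}
```

The result could then be passed through htmlspecialchars() and echoed into the alt attribute in place of the fixed word “Image”.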
Hi Guys
Got to say I am the kind of person who locks up my house even when I am in there. Get it 😉
Yes, correct: using the <?= shorthand doesn’t create a security problem ... on some servers. Some other setups or older versions of PHP do not work with this, and you end up having to make a print or echo call. That’s where the problem comes in. And I believe there are quite a few PHP newbies not aware of this. Just-get-it-working-now-and-think-later, anyone? I just want to make people aware: no matter how small a thing you do (especially when you deal with server-side issues), you have to think of the other issues that may arise. Cheers and later
There was a discussion about using silly made-up URLs on Raymond Chen’s (excellent) blog recently:
http://blogs.msdn.com/oldnewthing/archive/2004/07/13/181733.aspx
Basically, if you make up a URL like, for example, yoursite.com, you’d better have control over it, or someone else will and you’ll be less than thrilled with the results.
(Incidentally, has anyone complained in the last ten seconds about the lack of a Comment Preview button on this forum? Crappy UI design: even Zeldman isn’t immune, and that’s scary.)
Hey, great article. I use a simlar hotlinking prevention on my website for private or restricted images.
Another nice feature, for anyone who has a little experience with PHP, is to collect and report information every time the hotlinking fails.
If you have a high traffic site you may want to store it into a database and have a webpage make daily reports of who and when and of what image.
Personally i get very little , so i prefer to have it send an email to my account at work warning me that a page is trying to access things they shouldnt be.(you could combine them both)
You could check your tracking analysis, but i prefer the instant feedback.
2cents.
I cleared my cache and I still see belle in the nope guess not window
Browser is IE 6.0.2800.1106CO
Alas I can’t tell you much about proxy etc – it’s my corporate LAN
Cheers, Bill
If you write articles in which you use URLs that you do not own, you can use a number of fake TLDs that have been set aside especially for this purpose: .test, .example, .invalid, .localhost.
This is especially useful when citing fake e-mail addresses, as the owner of the domain will get spam on the addresses you just made up.
Example of a valid example address: branko@yoursite.example.
Thomas, thank you for a nice and useful idea.
(Pity about the execution, but that is what these forums are here for.) 🙂
Branko: despite the panicked tone of a few of the comments, I don’t think the original execution is bad. The code on the main page has now been fixed (with a suitable note attached), which should fix caching issues, and the (fairly minor, it must be said) XSS vulnerability. It’ll also work with register_globals off now, too.
I really don’t think linking to “yoursite.com” is a problem in the middle of PHP example code. If you’re including it as a clickable example, fair enough, but as an obvious placeholder in code?
bill n: the problem is almost definitely with the corporate LAN, which is presumably screwing up the example somewhere!
I will use it at http://www.absolutengine.com – adding it to the system.
I keep running into log file instances of people leeching my images as signature images on message boards. One of which recently used up a half-gig of bandwidth in less than one month, simply because the animated gif in question was 3 or 4 hundred K.
The solution: Write over the leeched file with an extremely insulting yet low file-size replacement:
http://thisisdrew.com/SCREAMER.GIF
Revenge is sometimes more fun than prevention.
It is important to remember that even if you can see the image in the ‘nope, guess not’ example, you had to have opened the image in a proper page first. If you see the image on the ‘nope, guess not’ page, you’re viewing it from a cached resource somewhere in between you and the host. This means, of course, that no bandwidth is being used from the host. If your primary goal is to conserve bandwidth, the solution is still valid.
Looks a good solution.
I did try using a rather cruder ASP referer check on one site, and had complaints that it didn’t work for people with ad blockers installed.
Has anyone tested this solution – I don’t know how ad blockers work so cannot guess whether it will trigger them.
(My eventual solution was to use a visible watermark with my URL and tell people they were welcome to take copies. KISS.)
For standards compliance, you should send the 403 status code in the response. Otherwise you prevent spiders and external apps from figuring out what the problem is.
I tried in IE 6.0 and Firefox .9, and the images loaded on each test page.
Thought you would like to know.
Justin
At “Nope, guess not” some browsers, such as Amaya or Off by One, ignore it and display the image.
Regarding the use of fake domains in sample code, there is RFC 2606 (http://www.rfc-editor.org/rfc/rfc2606.txt) about making use of example.com, which IANA has set apart exactly for that.
To read a small article about it head to http://www.oreillynet.com/pub/wlg/4051
Great article by the way.
Sorry, but a clarification of my previous post. In the previous post when I say that this is a great article I’m referring to this ALA article 😉
I’d love to see an update to the code to include how to exclude certain directories and such. You know, a lot of sites use banners for linking back to your site and really don’t mind the leeched bandwidth in that case.
The technique has great promise for sending files (zips, mp3s, etc) through a file download manager too, in the case that they are hotlinked or directly accessed. For those of us that pay our hosting by advertising, this is essential.
Once I replaced the HTML in showpic.php and uploaded it and the changes to my htaccess file, my site failed to load its stylesheets.
What in the world?
I came across this article by the time the comments were 6 pages deep and the cross-site scripting vulnerability was addressed.
Anyone?
The thing tripping me up on adding a directory exclusion to the rewrite was Apache’s variable names that don’t necessarily mirror the CGI variables I’m used to. I got it working with this additional line, which allows hotlinking from a banners directory:
RewriteCond %{REQUEST_FILENAME} !images/banners/
Very nice method Thomas.
Just a quick note on this:
—–snip—–
RewriteCond %{HTTP_REFERER} !^$
Second condition: the referrer must not be blank. This means that people who aren’t passing referrer headers, for whatever reason, will still be able to see your images.
—–snip—–
The intentions (as mentioned in the article) are good, but the consequences might be less attractive as Internet users’ security awareness grows.
Bundled security software like Norton Internet Security (with Privacy Control) and McAfee Security Center (with Privacy Service) can be found on more and more systems every day. This isn’t a bad thing, really – Internet users protecting themselves is, in general, quite sweet.
However, once a user has this software installed, it usually stays with its initial settings. There’s little chance that the user will fiddle around with the software and configure it to suit specific needs and requirements. If a user actually does fiddle around with it, he/she will most likely not disable a security feature, but rather enable more of them if possible.
The effect is that the hotlinking protection, like anything else that accepts blank referrers, has no effect for these users.
I figured this could be worth mentioning as there’s a slight tone of “not passing referrers” = “something’s wrong” in the discussion.
cheers
/j.
Great article, it was refreshing to see a creative way of dealing with hotlinking. I don’t think there can be a perfect solution, the blank referrer is becoming more popular in browsers – I disable it for my own browsing in Opera, I believe it is disabled by default now in Opera, also – and security solutions, thus bypassing hotlink protection. If you take out the blank referrer line in .htaccess, it just causes a mess of other problems with legitimate links to files/images – mostly for those with disabled referrers, or those behind some firewalls, etc, but also for browsers with referrers enabled, especially when it comes to plugins, and some browsers not always passing the referrer correctly! That’s been my own experience, anyway, trying to find the best solution for my needs to prevent hotlinking. Nothing has ever been perfect.
Just my thoughts on the hotlinking problem in general ;~}
The no referer pass-through is fine and the fact that some user agents will bypass the Apache rewrite has little bearing at all on the usefulness of the script. The purpose of the script is to discourage webmasters from linking to your images and save your bandwidth, which this technique does more or less. What is the use of hotlinking an image when 75-90%+ of users will see a broken image in its place? It’s much more effective to just steal the image (which is even less preventable).
I meant web users, not just webmasters, as hotlinking forum avatars is fairly common these days.
I just thought I’d mention that I’ve implemented this technique on my website, and it works perfectly! Thanks very much for this extraordinarily useful technique!
http://www.st-minutiae.com/graphics/view.php
I see Zeldman is still hanging onto this craptacular flat, non-threaded, discussion format. Oh. Joy. No threading topics on a single page. No preview of postings. No info blurb about permitted markup. Nice job, Zeldman.
This article is awful. Not everybody uses PHP, and as has been mentioned, the code used in the article is flawed.
Stick with cross-platform standards, such as…I dunno… .htaccess, maybe?
The very concept of bandwidth theft is funny as hell to me anyway. The Internet is an open network. If you put it in a publicly-accessible area of your server, you’re serving it up for public consumption. To call such public use of a publicly-accessible file on a publicly-accessible directory on a publicly-accessible server theft is asinine. There are existing technologies that work on all web servers, namely .htaccess, that can be used to prevent those files, such as images, from being served to any domain other than your own. But, failure to do so IS YOUR FAULT. Blaming the user is stupid and ignorant.
I had abandoned using the mod_rewrite method to block hotlinking because of the large number of referrers that are not just stripped, but replaced. By a simple dash, “-“, by “XXXX:++++++++++++”, or “Referrer blocked by Add Subtract”, to name but a few. None of those are an attempt at hotlinking, but the ReWrite Conditions published in the article will deny them images when visiting my site. And in the past, it appeared to be about 4% of visitors (could be more today, with the proliferation of privacy software). This can be avoided by adding one line, after the RewriteCond %{REQUEST_FILENAME} line:
RewriteCond %{HTTP_REFERER} ^[http|nttp].*$
This line looks only for referrer strings that begin with “http” or “nttp”, and the examples listed above will be passed images. I’ve been testing it today, and my access logs have yet to show a “403 – Denied” that wasn’t valid.
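One caveat about the pattern above: square brackets form a character class, so ^[http|nttp] matches any referrer whose first character is h, t, p, n or |, rather than the literal words. It still lets the replaced referrers listed above through, because none of them starts with one of those characters. A grouped alternation states the intent more directly. This is only a sketch, and reading "nttp" as the nntp scheme is an assumption:

```apache
# Apply the hotlink rule only when the referrer starts with a real scheme.
# Parentheses group whole words; square brackets would match single characters.
RewriteCond %{HTTP_REFERER} ^(https?|nntp):// [NC]
```

With this condition in place, a referrer such as "-" or "Referrer blocked by Add Subtract" fails the match, so the blocking rule is skipped and the image is served.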
===snip===
Once I replaced the HTML in showpic.php and uploaded it and the changes to my htaccess file, my site failed to load its stylesheets.
What in the world?
—snip—
I’ve had similar problems with mod_rewrite. Usually the style sheets without a title load correctly, but the ones with a title do not.
Interesting. Thanks for seconding, Mickey, I thought I was off-topic. I wish it had worked, I’m always on the lookout for a better way, and I suspect my mod rewrite script is not as well coded as it could be.
Could be worse though. I could be Sparky and just be angry for anger’s sake.
Can’t we all just get along?
Thanks Thomas,
This will come in useful.
..and Sparky, don’t be a twat all your life
One important note that I don’t believe was mentioned but merely addressed as ‘if it doesn’t work then it isn’t supported’ is that .htaccess files only work on Unix servers. So all of us .asp people don’t have that privilege. And even on Unix servers, some are restricted so that you can’t use them.
Great tutorial by the way, .htaccess is very useful. Damn ASP and its ‘I’m too kewl to be on Unix’-ness.
Thanks for the great article Thomas. It’ll be very useful, and I’ll try the code after some tea 🙂
Sparky – just as kasper mentions; .htaccess is not available on Windows.
saj
Sparky:
.htaccess is not a cross-platform standard.
The fact that “not everybody uses PHP” doesn’t make the article “awful.”
Bandwidth theft is a problem, and if you believe it’s your job as a web developer to prevent it, then you need access to tools or techniques that let you do so. This article offers one set of tools, and it results in something a bit more multi-dimensional than what .htaccess allows. If you use Apache and simply want to prevent people from stealing your bandwidth, then .htaccess may be enough. If you don’t use Apache, or if you want to respond to requests for images in a more complex manner than .htaccess allows, this article offers a possible solution.
The claim that the code is “flawed” is rather silly. All code is flawed. Basically, every time ALA runs an article showing how PHP or another server-side language can be used to achieve a desired result, advanced coders who read ALA will point out weaknesses or hidden dangers in the approach or suggest alternatives. That’s programming. There’s always another way to shake the bag of marbles, and there’s always a downside or tradeoff to any approach, especially when it’s a simple approach showing non-experts how to do something fairly straightforward.
Put another way, if ALA published articles on writing, for an audience of business professionals who occasionally need to write clear business letters and proposals, those articles wouldn’t necessarily please brilliant prose stylists like Martin Amis.
You’ve also complained about ALA’s formatting of the discussion forum, which is off-topic so won’t be addressed except to say that we originally ran a threaded forum, which many ALA readers disliked; we switched to the current format in response to reader requests, and most people seem to prefer it. As to your comment about “no info blurb about permitted markup,” I guess you didn’t notice the info blurb about permitted markup at the top of every page of the forum. It says:
—
HTML tags and entities display as source; they do not render. To create a live link, simply type the URL (including http://).
—
I’m not sure how much more information you’d need than that.
We don’t have a preview feature, yet; you’re right about that.
I’ve just tried doing this, but have not been too successful. So far, I’ve created the necessary showpic.php, and the .htaccess files.
http://www.thekhans.me.uk/showpic.php
The problem arises because I’m not entirely sure where to place the .htaccess file. I’ve placed a copy in my images folder
http://www.thekhans.me.uk/assets/images/
and one at root level. The root level .htaccess also contains information on error redirects. The hotlinking code appears below those statements.
However, when I directly request an image (after clearing my cache and starting a new browser session), I can still see the image without the credit;
http://www.thekhans.me.uk/assets/images/bullet_orange.png
Also, an image requested via a URL will not show because .htaccess/PHP fails to send the right path information. I’ve had to change the PHP code in showpic.php from
Which makes sense slightly because I have a slightly odd place to store images.
Aaargh! Can anyone help with where the .htaccess needs to go, and actually try accessing an image from my site directly to see if the error message pops up?
Thanks!
Ah figured it out. I was missing the crucial Rewrite line!
saj
—quote—
Interesting. Thanks for seconding, Mickey, I thought I was off-topic. I wish it had worked, I’m always on the lookout for a better way, and I suspect my mod rewrite script is not as well coded as it could be.
—quote—
I solved my problem. I figured out that I was not using absolute URLs for my style sheets, images and favicons. This meant that when I used a virtual folder path the style sheet was not found. Example:
Real page: /index.php
Virtual path: /debug/
Linked material: style.css
So on the real page, /style.css was called, however on the virtual pathed page, the browser tried to find /debug/style.css, which of course did not exist. I solved my problem by linking to /style.css. I hope I clearly explained myself.
Not quite. .htaccess is, AFAIK, a feature of the Apache webserver, which runs on Unix, Linux, MS Windows and probably more.
How do ya figure?
Great article though
Hi,
I’m a leecher. I’m aware of this practice, and therefore I fake the HTTP referer per request. Not really of course, but it’s easy to fake this. If I were to create a leech script, this would be the first trick I’d try to get around. `curl --referer http://yourdomain.com/ 'http://yourdomain.com/yourscript.php?image=noleech'` would do the trick.
I’m not sure if I understand the third example. If I go to:
http://www.thomasscott.net/images/belle.jpg
It shows the formatted page. But if I go to:
http://www.thomasscott.net/images/belle2.jpg
The page doesn’t get formatted as it should.
Why?
Unfortunately this does nothing about people running norton internet security, computer associates ez firewall, proxies and whatever that strips the referrer, it will still show the image to them because of the “blank referrer”
On Keenspace, the blank referrer is not allowed, which breaks the browsers of people using such broken software.
This trick, would only work on those that are sending the referrer. It solves the problem of hotlinking only if everyone is sending the referrer.
That should be keenspace.com not keenspaced
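The trade-off described in these posts comes down to a single condition. Here is a sketch of both variants, with example.com as a placeholder domain and a plain 403 standing in for whatever response a site prefers; none of this is the article's exact code.

```apache
RewriteEngine On
RewriteCond %{REQUEST_URI} \.(jpe?g|gif|png)$ [NC]
# Lenient variant: keep the next line so requests with no referrer still get images.
# Strict variant: delete it, and visitors whose software strips the referrer
# will be refused along with the hotlinkers.
RewriteCond %{HTTP_REFERER} !^$
# Our own pages are always allowed (example.com is a placeholder)
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com [NC]
# Anything that survives every condition above is refused with 403 Forbidden
RewriteRule .* - [F]
```

Neither variant can tell a privacy filter from a deliberate leecher; the lenient one errs toward showing images, the strict one toward blocking them.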
Well, to be honest, I don’t feel like reading through eight pages of posts, to find out if it has been mentioned, so ignore this post if it has.
The regexes are slightly flawed. They only check whether the extension is at the end of the GET request. I could easily write mysite.com/image.png?.html or mysite.com/image.png#.html to get around the .htaccess file’s checks.
I had a regex to get around that, but I cannot recall it at this current time, but I am sure somebody can come up with one.
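Whether these tricks get past the rules depends on which variable a condition tests. In mod_rewrite, the RewriteRule pattern and %{REQUEST_URI} see only the URL path, the query string is kept separately in %{QUERY_STRING}, and a #fragment is not sent to the server by browsers at all. A condition written against %{THE_REQUEST}, the raw request line, would see the ?.html suffix. A sketch that anchors the extension check on the path only:

```apache
# The path is matched without its query string, so a request for
# /image.png?.html is still seen here as /image.png
RewriteCond %{REQUEST_URI} \.(jpe?g|gif|png)$ [NC]
```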
How about users with a firewall? For example, Outpost (by default) blocks HTTP_REFERER.
Bless you. This fixes my situation nicely.
(I too am on FF 0.9.2, and I get the “end result” just fine.)
Hi, folks!
I enjoyed the article and am impressed at the amount of collaborative critique and revision that has happened since the first version!
I have revised and tested scenarios surrounding the ideas in this article (and subsequent comments) and have a version which may be of interest to someone.
Without further ado:
– – – A (Refined?) Version – – –
I have removed, from the .htaccess example in the article, allowances for google. These may be added in again easily. I simply did not need them in my case. I have also incorporated an allowance for referrer filtering software that obfuscates the referrer string (the second RewriteCond line, a variation of an earlier post except without the allowance for newsreaders). Naturally, change example.com in the fourth RewriteCond line to your own domain.
Here is the .htaccess snippet:
########## BEGIN .HTACCESS DIRECTIVES
# Set per-dir options
Options +FollowSymLinks
RewriteEngine On
RewriteBase /
# Hotlink protection
# This stops robot indexing, offsite caching, and embedding of our images in other webpages by
# throwing a (403) Forbidden status (takes care of most robots) followed by a text/html MIME-type
# as the explanation/error document (takes care of requests expecting an image MIME-type, such as
# in the case of an embedding request from another website)
# The explanation produced is a normal HTML document which displays:
# 1. a copyright notice
# 2. the originally requested image (embedded in the response page)
# 3. a link to our homepage
# This approach still allows linking to images, as in the case of pointing to an image in a gallery
# Exceptions: (causes a pass-thru)
# Referrer is invalid or empty (takes care of direct client requests and some referrer filters)
# Requests from our own domain (so we can embed the images in our own pages)
RewriteCond %{REQUEST_URI} \.(jp(e?)g|gif|png)$ [NC]
RewriteCond %{HTTP_REFERER} ^http(s?) [NC]
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http(s)?://(www\.)?example\.com [NC]
RewriteRule .+\..{3,4}$ /showpic.php [T=application/x-httpd-php,L]
########## END .HTACCESS DIRECTIVES
Place this in your document root directory’s .htaccess file and your whole site is covered.
Now, for the interesting part. You may have noticed that we did not put a query string at the end of the RewriteRule. No, it is not a bug, it is *really* a feature! No, I’m not crazy (at least nobody has been able to prove it, yet!)
Apache is wonderful in how much information is made available to you via environment variables — without you doing anything special! You need not rely on userspace GET variables that can be easily tampered with. Apache remembers the original location of the image requested in a special environment variable called REDIRECT_URI. The wonderful thing about this is that Apache has already stripped out any goofy XSS (Cross-Site Scripting) problems by validating the URI syntax internally. Wasn’t that a nice daemon? Good daemon.
So now that we know our original image request data is nice and safe in Apache environment variables, we move to the revised PHP code:
########## BEGIN PHP SCRIPT
Image file:
########## END PHP SCRIPT
First, we need not mess with GET variables anymore. Second, as long as you put showpic.php in the document root, this will cover your whole site, since the full image path is constructed at the top of the script. Third, robots will not be thrown, because we send a correct 403 Forbidden response.
I used a (sly?) trick with the 403 thing. The HTML following it is actually an explanatory error document, but the user will never know it unless they check the headers! So long as the document is large enough, this should be displayed in place of even MSIE’s error documents! This error document, being of the text/html MIME-type, will still break embedding of your images into other sites, regardless of whether their server ignores the 403.
The script also updates the copyright date to the current year automatically and refers to $_SERVER[‘SERVER_NAME’] to get your site name for the link to your homepage and the link text. In the interest of accessibility, the image ALT text provides a generic explanation as to why the image has no real description.
Obviously, you would want to adjust the CSS and copyright notice to your own needs and attributions.
Hopefully, this provides a robust and elegant solution that someone will find useful!
Cheers!
BTW: My site is not up yet, but it will be sometime in September. Come visit then, won’t you? 😉
Though this may be obvious to some, this site word-wrapped the code in my prior post, so be careful with the comments starting with ‘#’ and all of the lines in the .htaccess example!
Cheers!
I made a typo in the explanation of environment variables:
“Apache remembers the original location of the image requested in a special environment variable called REDIRECT_URI.”
“REDIRECT_URI” should be “REDIRECT_URL”.
Thankfully, the typo was not in the all-important code itself.
Carry on. 😉
How about guys with firewalls installed ???
The “Nope, guess not” page does have an image, so it didn’t work, I guess.
While getting some help from Webmaster World regarding my .htaccess file, I inadvertently got a bit of information regarding tightening up the hotlink prevention code. Some of you may be interested.
http://www.webmasterworld.com/forum92/2101.htm
Just loved it!
is it possible to redirect from somefile.jpg to somefile.jpg.html just using mod rewrite?
so if someone comes in from google images they are sent to the html file without writing a rule for each file.
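One way this could be done with mod_rewrite alone is a single pattern rule rather than one rule per file. The sketch below assumes the rules live in the document root's .htaccess, that each somefile.jpg has a companion somefile.jpg.html next to it, and that the referrer pattern for Google Images is a guess that would need checking against real access logs.

```apache
RewriteEngine On
# Only act when the visitor arrives from a Google Images page (assumed pattern)
RewriteCond %{HTTP_REFERER} ^https?://images\.google\. [NC]
# Only redirect when the companion .html file actually exists on disk
RewriteCond %{REQUEST_FILENAME}.html -f
# Send somefile.jpg to somefile.jpg.html with an external redirect
RewriteRule ^(.+\.jpg)$ /$1.html [R=302,L]
```

The -f test keeps images without a companion page working as before, since the rule is skipped for them.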
Paul
Where it states that the images should fail to load I can still see it, same kitty cat but without any text on both “blocked” ones.
IE 6 is the browser and it appears the script doesn’t work very well.
I put an .htaccess file on my site server and it works for some and not for others, more often than not it doesn’t work.