Fun With HTTP Headers

12/10/2006 - 21:09 by Adrian Di Ruggiero
Introduction
Like any good web developer, I have a tendency to poke around at people's
web sites to see if I can figure out how they're implemented. After poking
at enough sites, I started noticing that people were putting some weird and
interesting stuff in their HTTP headers. So, a couple of weeks ago, I
decided to actually go out and see what I could find by scrounging around
in HTTP headers in the wild. A header safari, if you will. These are the
results of my hunt.

Headers?
HTTP is the protocol used to transmit data on what we know as "the web". At
the beginning of every server response on the web, there's a bit of text
like:
HTTP/1.1 200 OK
Connection: close

The top line specifies the protocol version of HTTP and a response code
(200 in this case) used to indicate the outcome of a request. Following
that are a bunch of lines that should consist of a field name (like
"Connection"), followed by a colon, and then followed by a value (like
"close" or "keep-alive"). These lines of text are the HTTP response
headers. Immediately after the headers is a blank line, followed by the
content of the message, such as the text of a web page or the data of an
image file.
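
To make the layout concrete, here is a minimal Python sketch that splits a raw
response into its status line, headers, and body. The function name and the
sample body are made up for illustration, and real responses have wrinkles
(folded header lines, for one) that this ignores.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP response into (status line, header list, body)."""
    # The blank line (CRLF CRLF) separates the headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    status_line = lines[0]
    headers = []
    for line in lines[1:]:
        # Field name, a colon, then the value; keep duplicates and order.
        name, _, value = line.partition(":")
        headers.append((name.strip(), value.strip()))
    return status_line, headers, body


raw = b"HTTP/1.1 200 OK\r\nConnection: close\r\n\r\n<html>hello</html>"
status_line, headers, body = parse_response(raw)
version, code, reason = status_line.split(" ", 2)
print(version, code, reason)
print(headers)
print(body)
```

Headers are kept as a list of pairs rather than a dictionary because the same
field name can legitimately show up more than once.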

Technical Mumbo Jumbo
Want to examine the headers of a site for yourself? Try curl:
curl -i http://www.nextthing.org/

In the output of the above, the first few lines are the headers, then there
is a blank line, and then the body. If you just want to see the
headers, and not the body, use the -I option instead of -i. Be forewarned,
however, that some servers return different headers in this case, as curl
will be requesting the data using a HEAD request rather than a GET request.
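
If you'd rather check the HEAD versus GET difference from a script, something
like the following Python sketch should do, using only the standard library's
http.client. The helper names are made up for illustration, and the comparison
looks only at header names, not values.

```python
import http.client


def fetch_headers(host, method="GET", path="/", port=80, timeout=10):
    """Return (status, reason, header list) for one request."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request(method, path)
        resp = conn.getresponse()
        resp.read()  # drain the body (empty for HEAD)
        return resp.status, resp.reason, resp.getheaders()
    finally:
        conn.close()


def header_names_differing(host, path="/", port=80):
    """Header names present in only one of the GET and HEAD responses."""
    get_names = {n.lower() for n, _ in fetch_headers(host, "GET", path, port)[2]}
    head_names = {n.lower() for n, _ in fetch_headers(host, "HEAD", path, port)[2]}
    return sorted(get_names ^ head_names)
```

Calling header_names_differing("www.nextthing.org") would then list any header
names a server sends for one method but not the other.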

What I did to gather all of these headers was very similar. First, I
downloaded an RDF dump of the Open Directory Project's directory, and
pulled out every URL from that file. Then, I stuck all of the domain names
of these URLs in a big database. A simple multithreaded Python script was
used to download all of the index pages of these URLs using PycURL and
stick the headers and page contents in a database. When that was done, I
had a database with 2,686,155 page responses and 23,699,737 response
headers. The actual downloading of all of this took about a week.
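
The shape of that pipeline can be sketched roughly as below. This is not the
actual script: it swaps PycURL and the database for the standard library's
thread pool and a plain dictionary, and fetch is a hypothetical callable that
you would supply to do the real download for one host.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlsplit


def unique_hosts(urls):
    """Reduce a list of URLs to their distinct host names, in first-seen order."""
    seen = {}
    for url in urls:
        host = urlsplit(url).hostname
        if host:
            seen.setdefault(host.lower(), None)
    return list(seen)


def crawl(hosts, fetch, workers=20):
    """Call fetch(host) for every host on a thread pool.

    Returns {host: result}; a host whose fetch raises maps to the exception.
    """
    def safe_fetch(host):
        try:
            return fetch(host)
        except Exception as exc:  # one dead site should not stop the run
            return exc

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(zip(hosts, pool.map(safe_fetch, hosts)))


urls = [
    "http://www.nextthing.org/",
    "http://www.nextthing.org/archives/",
    "http://www.kfki.hu/index.html",
]
hosts = unique_hosts(urls)
print(hosts)
```

Threads fit here because the work is almost entirely waiting on the network.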

This is, of course, not anywhere near a comprehensive survey of the web.
Netcraft received responses from 70,392,567 sites in its August 2005 web
survey, so I hit around 3.8% of them. Not bad, but I'm sure there's a lot
of interesting stuff I'm missing.

Obligatory Mention of Long Tail
First of all, yes, HTTP headers form something like a long tail.

In particular, hapax legomena (one-offs) make up over half of the headers
found. I expected this. Unfortunately for me, however, a lot of the really
interesting stuff is over on that long flat section of the long tail. Which
means I spent a lot of time poring over one-offs looking for interesting
stuff. Weee.
It's a good thing I'm easily amused.
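
For the curious, counting one-offs takes only a few lines of Python. This
sketch assumes "over half of the headers" means over half of the distinct
header names, and the sample list is made up rather than taken from the
survey data.

```python
from collections import Counter


def hapax_share(header_names):
    """Fraction of distinct header names that occur exactly once."""
    counts = Counter(name.lower() for name in header_names)
    if not counts:
        return 0.0
    hapaxes = sum(1 for n in counts.values() if n == 1)
    return hapaxes / len(counts)


sample = ["Server", "Date", "Server", "X-Got-Fish", "Date", "X-Answer", "X-Bundy"]
counts = Counter(name.lower() for name in sample)
print(counts.most_common())  # the head of the distribution comes first
print(hapax_share(sample))
```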

Off with Her Headers

I found 891 instances of:
X-Pad-For-Netscrape-Bug: 0123456789
Which brought back memories of the days when Netscape was reviled by
developers the world 'round, and had not yet achieved its ultimate (albeit
posthumous) glory with Firefox. It's nice to know comments by frustrated
engineers have such a long half-life on the Internet. There are a few
variants on this header:
X-Pad: avoid browser bug
XX-Pad: Padding
aheader: WOULDN'T YOU LIKE TO KNOW!
X-BrowserAlignment: problem

Similarly, people are still blocking Microsoft's Dumb Tags:
X-MS-Smart-Tags: We have nothing to do with them.
X-Meta-MSSmartTagsPreventParsing: TRUE

Speaking of Microsoft, apparently the IIS team felt the need to advertise
the domain of the site the user was accessing in every response:
Server: Microsoft-IIS/5.0
jvc.org: jvc.org

How completely and utterly unnecessary.
They're not the only ones, though. WebObjects-powered sites spit out:
HTTP/1.1 200 Apple

Go team!
This cute header is courtesy of Caudium, a webserver written partially in
Pike:
X-Got-Fish: Yes
The webmaster of www.kfki.hu should be commended for being on the bleeding
edge, both using Caudium and including lots of Dublin Core metadata in the
headers. Although, 32 headers seems a bit much, which is why I'm not going
to show them all:
DC.Subject: physics
DC.Type: organizational homepage
SCHEMA.DCTERMS: http://purl.org/dc/terms/
X-Got-Fish: Yes

Contrary to popular belief, there are people out there using Smalltalk on
the web. Two of them. One Smalltalk software company running a web server
written in Smalltalk, and another:
Server: Swazoo 0.9 (Columbus)
X-WIKI-ENGINE: SmallWiki 1.0
CACHE-CONTROL: no-cache
X-WIKI-COPYRIGHT: Software Composition Group, University of Berne, 2003

running a Smalltalk user's group web site with a wiki written on Smalltalk
on a web server written in, you guessed it: Smalltalk. Cool.
And, of course, it wouldn't be the Internet without an appearance by a
BOFH:
X-BOFH: http://www.xxxxx.de/bofh/xxxxxx.html

The actual URL it points to has been obscured to protect the guilty, and a
local mirror provided in its stead.
Missed Cneonctions
This header:
Cneonction: close
and its variant:
nnCoection: close
were two of the headers which first spurred my interest in HTTP headers.
imdb.com, amazon.com, gamespy.com, and google.com have all at various times
used these or similar misspellings of connection, and I'm not by any means
the first to have noticed. My first thought was that this was just a typo.
After more consideration, however, I now believe this is something done by
a hackish hardware load balancer trying to "remove" the connection close
header when proxying for an internal server. That way, the connection can
be held open and images can be transmitted through the same TCP connection,
while the backend web server doesn't need to be modified at all. It just
closes the connection and moves on to the next request. Ex-coworker and
Mudd alumnus jra has a similar analysis.
Another data point which would back this up is the Oracle9iAS Web Cache
rewriting:
Connection: close
as
yyyyyyyyyy: close
Connection: Keep-Alive

Headers with "X-Cnection: close" appear to be the result of a similar
trick.
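
To illustrate the guess, here is a small Python sketch of what such a rewrite
could look like. It is not the behavior of any particular load balancer; it
only shows that a same-length misspelling lets the header be patched in place
without moving any other byte of the response, which is an assumption about
why the trick is attractive.

```python
def mask_connection_close(response_head: bytes) -> bytes:
    """Hide a backend's 'Connection: close' header by misspelling its name.

    The replacement is the same length as the original, so no other byte in
    the response has to move.
    """
    original = b"Connection: close"
    scrambled = b"Cneonction: close"
    assert len(original) == len(scrambled)
    return response_head.replace(original, scrambled, 1)


backend = b"HTTP/1.1 200 OK\r\nConnection: close\r\nContent-Length: 2\r\n\r\n"
patched = mask_connection_close(backend)
print(patched.decode("ascii"))
# Check whether the misspelling uses exactly the same letters, reordered.
print(sorted(b"Connection") == sorted(b"Cneonction"))
```

A client that doesn't recognize the scrambled name ignores it, so the
connection to the client can stay open.
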
One ISP/web host is kind enough to include their web address and phone
number in the response to every request to any of their hosted servers:
Phone: (888) 817-8323
Web: www.wgn.net

This is just super-awesome. I once spent a good hour trying to find a
technical contact for a certain monstrous job site to tell them their
servers had been compromised and were displaying the following message to
visitors:
You are being sniffed by Carnivore.
Your nation is secure.
..OCR IS WATCHING YOU..
The message, funnily enough, was being relayed by modifying the HTTP
headers.
C is for Cookie
Cookies 2 were defined in RFC 2965, way back in October of 2000. As far as
I know, Opera is the only browser in widespread use to support them. It's
sad, really, as the original cookie spec that Netscape came up with is kind
of lame. Specifically, Netscape's spec defines the expiration as a date,
which is vulnerable to clock skew on the user's system making the cookie
expire early. The Cookies-2 spec, on the other hand, uses a max-age
attribute, specifying the lifetime of the cookie in seconds:
Set-Cookie2: Meyer_Sound_777=68.126.233.177.1122451925660461; path=/;
max-age=1209600; domain=.meyersound.com; version=1
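
The difference between the two expiry schemes is easy to see in a small Python
sketch. The times and the two-hour skew are made-up values, and the helper
functions are illustrative rather than any browser's actual logic.

```python
from datetime import datetime, timedelta, timezone


def expired_by_date(expires: datetime, client_now: datetime) -> bool:
    """Netscape-style: compare an absolute date against the client's clock."""
    return client_now >= expires


def expired_by_max_age(max_age: int, received_at: datetime, client_now: datetime) -> bool:
    """max-age style: only elapsed time on the client's own clock matters."""
    return client_now - received_at >= timedelta(seconds=max_age)


server_now = datetime(2005, 7, 27, 8, 0, 0, tzinfo=timezone.utc)
client_now = server_now + timedelta(hours=2)  # client clock runs two hours fast

one_hour = 3600
expires = server_now + timedelta(seconds=one_hour)  # date computed on the server

print(expired_by_date(expires, client_now))                  # True
print(expired_by_max_age(one_hour, client_now, client_now))  # False
```

With the date-based scheme the cookie is dead on arrival. With max-age it
lives for its full hour, however wrong the client's clock is.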

There are also Comment and CommentURL fields which explain what the cookie
is for, but I have yet to find a header which uses them. *sigh* On the
other hand, I did find 518 Set-Cookie2 headers, which, while minuscule
compared to the 764,976 Set-Cookie headers I received, is more than I
expected. It looks like software written by Sun is responsible for most of
these.
A bunch of servers spit out:
shmget() failed: No space left on device
Doh! Time to cycle some log files.
Pingback discovery headers like this show up a lot (2370 times):
X-Pingback: http://www.nextthing.org/wordpress/xmlrpc.php
I don't even know what to say to this, found at ebrain.ecnext.com:
HTTP/1.1 200 OK
Date: Sun, 24 Jul 2005 18:39:54 GMT
Server: Apache
<META HTTP-EQUIV=\"Set-Cookie\" CONTENT=\"cust_id=59955231; path=/;
domain=ebrain.ecnext.com; expires=Mon, 01-Jan-2011 00: 00:00 GMT\">
<META HTTP-EQUIV=\"Set-Cookie\" CONTENT=\"cust_session=7/24/2005 14:
39:54; path=/; domain=ebrain.ecnext.com; expires=Sun 24-Jul-2005
19:9:54\">
Transfer-Encoding: chunked

At least they're not alone, as www.charlottesweb.hungerford.org will keep
them company:
Turn off Pictures Popup Toolbar in IE 6.0: <meta
http-equiv=\"imagetoolbar\" content=\"no\" />

And www.station.lu:
XHTML: <!DOCTYPE html PUBLIC\"-//W3C//DTD XHTML 1.0 Transitional//EN\"
\"http://www.w3.org/TR/xhtml1/DTD/xht...tional.dtd\">

The list goes on...
The Coral Content Distribution Network has been getting some buzz lately,
so I was interested to see some
X-Coral-Control: redirect-home
headers show up. This header is used to tell Coral that if Coral can't
handle the load of requests for cached copies of your page, it should
redirect these requests back to your site.
Why anyone would think to themselves, "Gee, if a massively scalable caching
service running on hundreds of geographically distributed computers can't
handle the load of people wanting to look at my site, I'll just have them
bounce people back at me", I don't know. Masochism perhaps?
Speaking of P2P technologies, I was interested to run across a KaZaA
server:
HTTP/1.0 404 Not Found
X-Kazaa-Username: anonymous_user
X-Kazaa-Network: KaZaA
X-Kazaa-IP: xx.xx.xx.xx:1348
X-Kazaa-SupernodeIP: xx.xx.xx.x:3699

It looked like it was running on someone's DVR. Anyone have any pointers as
to what software does that?
Along the same lines, haha:
X-Kaza-Username: hrosen
X-Kaza-Network: RIAA
X-Kaza-IP: 146.82.174.12:80
X-Kaza-SupernodeIP: 68.163.90.12:80
X-Disclaimer: All Your Base Are Belong To Us
X-Pizza-Phone: 961.1.351904

They're not even the only ones using "X-Disclaimer"; a bunch of other sites
do too:
X-Disclaimer: The local sysadmins have *nothing* to do with the content of
this server.

It looks like Tux Games is trying to extend the venerable RFC 1097 to the
web:
X-Subliminal: You want to buy as many games as you can afford

Personally, I would've gone for: "X-Superliminal: Hey you, buy some
games!".
I'm sure these kind folks would be first adopters:
X-Cotton: The Fabric of Our Lives

This person wants to make their opinion known, so here it is:
Veto: Usage of server response for statistics and advertising is disagreed!

To which I say: Take off every 'zig'!! You know what you doing.
Robot Rock
I'd never really paid much attention to the Robots header:
ROBOTS: index,follow,cache

as it's mostly used to disable indexing of a page and is intended to be
used in a meta tag in the HTML itself, not in the HTTP headers.
However, it seems Google has added a new NOARCHIVE attribute, so let's see
who's using it in their headers rather than in the meta tags like Google
specifies.
It looks like the Singapore-based "Ministry of Pets" doesn't want to be
cached, as does the Civil Engineering department at São Paulo Polytechnic
University, the real-time 3D software company MultiGen Paradigm, Swiss
handicraft company Schweizer Heimatwerk, a Swiss kitesailing site, the
Ragin' Cajun Cafe in Hermosa Beach, CA, the London-based BouncingFish web
consultancy, and the French financial paper La Tribune. That's it.
BouncingFish even goes so far as to use an additional GOOGLEBOT header:
GOOGLEBOT: NOCACHE

How many of these sites are not being cached by Google? Zero. Which just
goes to show that one shouldn't just expect mix-and-matching of specs to
work.
In the same vein, I don't think the first two headers below will work as
expected:
X-Meta-ROBOTS: ALL
X-Meta-Revisit-After: 1 days
Robots: INDEX, FOLLOW

Except, possibly, in spiders using Perl's HTML::HeadParser module. And, of
course, we've already seen that the third header probably won't work,
either.
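
For a rough idea of why a parser like that would care, here is a Python
analogue of the meta-tag-to-header mapping as I understand it. It is a sketch
built on the standard library's html.parser, not a port of HTML::HeadParser,
and the exact header capitalization is an assumption.

```python
from html.parser import HTMLParser


class MetaHeaders(HTMLParser):
    """Collect <meta> elements as (header name, value) pairs."""

    def __init__(self):
        super().__init__()
        self.headers = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        content = attrs.get("content")
        if content is None:
            return
        if attrs.get("http-equiv"):
            # http-equiv names stand in for a real header of that name.
            self.headers.append((attrs["http-equiv"], content))
        elif attrs.get("name"):
            # Other named metadata is reported under an X-Meta- prefix.
            self.headers.append(("X-Meta-" + attrs["name"].title(), content))


page = """<html><head>
<meta name="robots" content="noarchive">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head><body></body></html>"""

parser = MetaHeaders()
parser.feed(page)
print(parser.headers)
```

A spider built this way reads meta tags and real headers through the same
interface, which is presumably where names like X-Meta-ROBOTS come from.
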
While I'm on the subject of Google, all Blogspot sites spit out:
test: %{HOSTNAME}e

So Blogger folks, whatcha doin'?
It's Funny, Laugh
The fine folks at www.basement.com.au want to make it clear that:

Mickey-Mouse: Does_Not_Live_Here
Some people have a lot of fun with headers, as seen here:
Limerick: There was a young fellow named Fisher
Limerick: Who was fishing for fish in a fissure,
Limerick: When a cod, with a grin,
Limerick: Pulled the fisherman in
Limerick: Now they're fishing the fissure for Fisher.

This is the only ASCII art I found:
<!--
*************************************************************************
* Welcome to schMOOze University *
* *
* ==> To connect to an existing player type: CONNECT NAME PASSWORD *
* ==> To connect as a guest type: CONNECT GUEST *
* *
*************************************************************************
* all text is copyrighted by the various authors *
* TIME FLIES LIKE AN ARROW FRUIT FLIES LIKE A BANANA *
* *** *
* * * *
* * * *
* * * *
* * * *
* * (__) * *
* * (OO) * *
* * ____________ / * *
* * /| / * *
* * / | | | | * *
* * * | |^^ | | * *
* * ^ ^ ^ ^ * *
************ ************

Nobody is connected.
<HTML>
<HEAD>
<TITLE>Welcome to schMOOze!</TITLE>
<meta http-equiv="Content-Type" content="text/html;
<meta http-equiv="refresh"
content="0;URL=http://schmooze.hunter.cuny.edu/">
</BODY></HTML>

and it had me puzzled, until I realized it's a telnet server, and the above
is a really clever hack to redirect browsers towards HTML-land.
This made me laugh:
X-Powered-By: Intravenous Caffeine Drips
X-kluged-by: Nick, Mic, Ash, Andy
X-munged-by: The powers that be
X-Sanity-Provided-By: Ashleigh

Apparently the site has an alter-ego, as well.
www.wrestlingdb.com has some interesting headers. A few requests get:
X-Stone-Cold-Steve-Austin: And that's the bottom line, cause Stone Cold
says so.
X-Mick-Foley: Have a nice day!
X-Ric-Flair: To be the man, WHOOO!, you've got to beat the man.
X-Rock: If you smell what The Rock is cooking.
X-Booker-T: Can you dig it, SUCKAAAA?
X-Kurt-Angle: It's true, it's DAMN true.
X-Hurricane: Stand Back! There's a Hurricane Coming Through!
X-Kane: FREAKS RULE!

which is about as entertaining as watching a real wrestling match.
Totally Ellet
Just so everyone knows, Frostburg students are so totally leet, they don't
even need to spell it correctly:
Owned And Operted By FSU Computer Club: 31137

Speaking of which, apparently some guy named morris would like his visitors
to know that he 0wnzor$ them:
X-You-Are-Owned-By: morris

Not sure where that box you rooted and are browsing the web from is
located? Never fear, mobileparty.net will tell you:
X-Detected-Country: US

And, for those who were wondering, the Texarkana Police are the world's
finest, at least in the HTTP headers department:
TEXPOLICE: LAW_ENFORCEMENTS_FINEST

These nederlanders are representin' for the westside:
X-Side: : WESTSIDE-FOR-LIFE

Western Europe, that is. Jaaa.
Speaking of furriners, anyone care to translate:
X-Sarrazin-Says: Ciccio, lascia perdere, e' un blowfish a 448 bit.

Similarly:
X-beliebig: Dieser Header dient der allgemeinen Verwirrung =:)
X-Gleitschirmfliegen: macht Spaaaasss!

Going back to my discussion on standards, localizing headers that are used
to actually do stuff is a bad idea:
Ultima Modificação: Thu, 28 Jul 2005 15:12:07 GMT

ObRef
The Democrats called, they want you to know they found their sense of
humor:
X-Dubya: You teach a child to read and he or her will be able to pass a
literacy test.

Make sure to hit it a few times for optimum goodness:
X-Dubya: We're in for a long struggle, and I think Texans understand that.
And so do Americans.
X-Dubya: Africa is a nation that suffers from incredible disease.
X-Dubya: We're making the right decisions to bring the solution to an end.
X-Dubya: Families is where our nation finds hope, where wings take dream.

In the politics vein:
X-Powered-By: MonkeyMag 0.02.01, (c) Niel Bornstein and Kendall Clark
X-Shout-Out: No Power Without Accountability
X-Mos-Defology: Speech is my hammer//Bang the world into shape//Now let it
fall
X-American-Leftist-Salute: Doing Woody's Work!
X-Billy-Braggage: Sun, Sea, Socialism!

Yes! Someone just made my day. I love Al Bundy quotes:
X-Bundy: Here we have 3 of the seven dwarfs, puffy, crabby and horny.
X-Bundy: You know I never danced unless it was gonna get some sex for me.
X-Bundy: I blame it on TV myself.
X-Bundy: To know me is to love me.

I was disappointed in the lack of mention of mules, donkeys, or garden
gnomes, but at least llamas, mice, and loons are well represented:
X-Llamas-Version: 2.0

From: www.teevee.org
X-Favourite-Animal: Mouse

From: www.kingssing.de
X-Loons-Version: 1.5.1

From: www.eod.com
Speaking of strange characters, apparently the Wicked Witch of the West and
Spongebob Squarepants cohabitate at www.harbor-club.com:
X-Wicked-Witch: West
X-Spongebob: Squarepants!

Who knew?!
As if we needed further proof that the soft underbelly of the Internet is
full of cults, slowly corrupting the moral fabric of society, I present:
X-SAVIOUR: BOB_DOBBS

From the looks of things, Living Slack Master Bob Dobbs is giving Jesus a
run for his money among Oregonian carpenters and their web designers. They
join such luminaries as R. Crumb, Devo, and Bruce Campbell.
And if you thought that was an obscure meme, try this on for size:
X-Lerfjhax: Extra yummy

When I first saw an X-Han header, I thought for sure the contents would be
"Shot first!", but instead I found something more obscure:
X-Han: 'I look forward to a tournament of truly epic proportions.'

While we're on pop culture allusions:
X-Powered-By: Twine
X-Towelie: You wanna get high?

And it would be difficult to be more obscure than this:
X-Sven: look out for the fruits of life

Finally, old school Mac-diehards should appreciate:
X-Blam: Frog blast the vent core!

Connection: close
Back when I was interviewing for an internship at Tellme Networks, they had
a comment buried on their homepage that said:
<!-- (c) Copyright 2000 Tellme Networks. -->
<!--
If you're looking at our HTML source, you're exactly the person
who should send us your resume. We recently redesigned our site;
Tell us all about how you would make it better and better yet,
if you have an illustrious career of web-hacking, drop us an email
at jobs@tellme.com.

I thought this was just way awesome. However, if I was disappointed when
they removed that comment, I'm even more disappointed to report that I have
yet to find a single HTTP header offering me a cool job. What's wrong with
you people?! I'm supposed to be able to find anything on the Internet!
I was, at least, thanked for my efforts, and I found the answer to life,
the universe, and everything!
X-Thank-You: for bothering to look at my HTTP headers
X-Answer: 42

You're welcome! And thank you all, for making the Internet so interesting!
This entry was posted on Sunday, August 7th, 2005 at 11:26 PM and is filed
under General, Programming.


http://www.nextthing.org/archives/2...tp-headers
 


#1 Demóstenes
13/10/2006 - 00:24
Shameless copy of:
http://www.nextthing.org/archives/2...tp-headers

Demóstenes

"Adrian Di Ruggiero" escribió en el mensaje
news:
Introduction
Like any good web developer, I have a tendency to poke around at people's
web sites to see if I can figure out how they're implemented. After poking
at enough sites, I started noticing that people were putting some weird
and
interesting stuff in their HTTP headers. So, a couple of weeks ago, I
decided to actually go out and see what I could find by scrounging around
in HTTP headers in the wild. A header safari, if you will. These are the
results of my hunt.

Headers?
HTTP is the protocol used to transmit data on what we know as "the web".
At
the beginning of every server response on the web, there's a bit of text
like:
HTTP/1.1 200 OK
Connection: close

The top line specifies the protocol version of HTTP and a response code
(200 in this case) used to indicate the outcome of a request. Following
that are a bunch of lines that should consist of a field name (like
"Connection"), followed by a colon, and then followed by a value (like
"close" or "keep-alive"). These lines of text are the HTTP response
headers. Immediately after the headers is a blank line, followed by the
content of the message, such as the text of a web page or the data of an
image file.

Technical Mumbo Jumbo
Want to examine the headers of a site for yourself? Try curl:
curl -i http://www.nextthing.org/

In the output of the above the first few lines are the headers, then there
are a couple of line breaks, and then the body. If you just want to see
the
headers, and not the body, use the -I option instead of -i. Be forewarned,
however, that some servers return different headers in this case, as curl
will be requesting the data using a HEAD request rather than a GET
request.

What I did to gather all of these headers was very similar. First, I
downloaded an RDF dump of the Open Directory Project's directory, and
pulled out every URL from that file. Then, I stuck all of the domain names
of these URL's in a big database. A simple multithreaded Python script was
used to download all of the index pages of these URL's using PycURL and
stick the headers and page contents in a database. When that was done, I
had a database with 2,686,155 page responses and 23,699,737 response
headers. The actual downloading of all of this took about a week.

This is, of course, not anywhere near a comprehensive survey of the web.
Netcraft received responses from 70,392,567 sites in its August 2005 web
survey, so I hit around 3.8% of them. Not bad, but I'm sure there's a lot
of interesting stuff I'm missing.

Obligatory Mention of Long Tail
First of all, yes, HTTP headers form something like a long tail:

In particular, hapax legomena (one-offs) make up over half of the headers
found. I expected this. Unfortunately for me, however, a lot of the really
interesting stuff is over on that long flat section of the long tail.
Which
means I spent a lot of time poring over one-offs looking for interesting
stuff. Weee.
It's a good thing I'm easily amused.

Off with Her Headers

I found 891 instances of:
X-Pad-For-Netscrape-Bug: 0123456789
Which brought back memories of the days when Netscape was reviled by
developers the world 'round, and had not yet achieved its ultimate (albeit
posthumous) glory with Firefox. It's nice to know comments by frustrated
engineers have such a long half-life on the Internet. There are a few
variants on this header:
X-Pad: avoid browser bug
XX-Pad: Padding
aheader: WOULDN'T YOU LIKE TO KNOW!
X-BrowserAlignment: problem

Similarly, people are still blocking Microsoft's Dumb Tags:
X-MS-Smart-Tags: We have nothing to do with them.
X-Meta-MSSmartTagsPreventParsing: TRUE

Speaking of Microsoft, apparently the IIS team felt the need to advertise
the domain of the site the user was accessing in every page request:
Server: Microsoft-IIS/5.0
jvc.org: jvc.org

How completely and utterly unnecessary.
They're not the only ones, though. WebObjects powered sites spit out:
HTTP/1.1 200 Apple

Go team!
This cute header is courtesy of Caudium, a webserver written partially in
Pike:
X-Got-Fish: Yes
The webmaster of www.kfki.hu should be commended for being on the bleeding
edge, both using Caudium and including lots of Dublin Core metadata in the
headers. Although, 32 headers seems a bit much, which is why I'm not going
to show them all:
DC.Subject: physics
DC.Type: organizational homepage
SCHEMA.DCTERMS: http://purl.org/dc/terms/
X-Got-Fish: Yes

Contrary to popular belief, there are people out there using Smalltalk on
the web. Two of them. One Smalltalk software company running a web server
written in Smalltalk, and another:
Server: Swazoo 0.9 (Columbus)
X-WIKI-ENGINE: SmallWiki 1.0
CACHE-CONTROL: no-cache
X-WIKI-COPYRIGHT: Software Composition Group, University of Berne, 2003

running a Smalltalk user's group web site with a wiki written on Smalltalk
on a web server written in, you guessed it: Smalltalk. Cool.
And, of course, it wouldn't be the Internet without an appearance by a
BOFH:
X-BOFH: http://www.xxxxx.de/bofh/xxxxxx.html

The actual URL it points to has been obscured to protect the guilty, and a
local mirror provided in its stead.
Missed Cneonctions
This header:
Cneonction: close
and its variant:
nnCoection: close
were two of the headers which first spurred my interest in HTTP headers.
imdb.com, amazon.com, gamespy.com, and google.com have all at various
times
used these or similar misspellings of connection, and I'm not by any means
the first to have noticed. My first thought was that this was just a typo.
After more consideration, however, I now believe this is something done by
a hackish hardware load balancer trying to "remove" the connection close
header when proxying for an internal server. That way, the connection can
be held open and images can be transmitted through the same TCP
connection,
while the backend web server doesn't need to be modified at all. It just
closes the connection and moves on to the next request. Ex-coworker and
Mudd alumus jra has a similar analysis.
Another data point which would back this up is the Oracle9iAS Web Cache
rewriting:
Connection: close
as
yyyyyyyyyy: close
Connection: Keep-Alive

Headers with "X-Cnection: close" appear to be the result of a similar
trick.
One ISP/web host is kind enough to include their web address and phone
number in every request to any of their hosted servers:
Phone: (888) 817-8323
Web: www.wgn.net

This is just super-awesome. I once spent a good hour trying to find a
technical contact for a certain monstrous job site to tell them their
servers had been compromised and were displaying the following message to
visitors:
You are being sniffed by Carnivore.
Your nation is secure.
..OCR IS WATCHING YOU..
The message, funnily enough, was being relayed by modifying the HTTP
headers.
C is for Cookie
Cookies 2 were defined in RFC 2965, way back in October of 2000. As far as
I know, Opera is the only browser in widespread use to support them. It's
sad, really, as the original cookie spec that Netscape came up with is
kind
of lame. Specifically, Netscape's spec defines the expiration as a date,
which is vulnerable to clock skew on the user's system making the cookie
expire early. The Cookies-2 spec, on the other hand, uses a max-age
attribute, specifying the lifetime of the cookie in seconds:
Set-Cookie2: Meyer_Sound_777h.126.233.177.1122451925660461; path=/;
max-age09600; domain=.meyersound.com; version=1

There are also Comment and CommentURL fields which explain what the cookie
is for, but I have yet to find a header which uses them. *sigh* On the
other hand, I did find 518 Set-Cookie2 headers, which, while miniscule
compared to the 764,976 SetCookie headers I received, is more than I
expected. It looks like software written by Sun is responsible for most of
these.
A bunch of servers spit out:
shmget() failed: No space left on device
Doh! Time to cycle some log files.
Pingback discovery headers like this show up a lot (2370 times):
X-Pingback: http://www.nextthing.org/wordpress/xmlrpc.php
I don't even know what to say to this, found at ebrain.ecnext.com:
HTTP/1.1 200 OK
Date: Sun, 24 Jul 2005 18:39:54 GMT
Server: Apache
<META HTTP-EQUIV=\"Set-Cookie\" CONTENT=\"cust_idY955231; path=/;
domain=ebrain.ecnext.com; expires=Mon, 01-Jan-2011 00: 00:00 GMT\">
<META HTTP-EQUIV=\"Set-Cookie\" CONTENT=\"cust_session=7/24/2005 14:
39:54; path=/; domain=ebrain.ecnext.com; expires=Sun 24-Jul-2005
19:9:54\">
Transfer-Encoding: chunked

At least they're not alone, as www.charlottesweb.hungerford.org will keep
them company:
Turn off Pictures Popup Toolbar in IE 6.0: <meta
http-equiv=\"imagetoolbar\" content=\"no\" />

And www.station.lu:
XHTML: <!DOCTYPE html PUBLIC\"-//W3C//DTD XHTML 1.0 Transitional//EN\"
\"http://www.w3.org/TR/xhtml1/DTD/xht...tional.dtd\">

The list goes on?
The Coral Content Distribution Network has been getting some buzz lately,
so I was interested to see some
X-Coral-Control: redirect-home
headers show up. This header is used to tell Coral that if Coral can't
handle the load of requests for cached copies of your page, it should
redirect these requests back to your site.
Why anyone would think to themselves, "Gee, if a massively scalable
caching
service running on hundreds of geographically distributed computers can't
handle the load of people wanting to look at my site, I'll just have them
bounce people back at me", I don't know. Masochism perhaps?
Speaking of P2P technologies, I was interested to run across a KaZaA
server:
HTTP/1.0 404 Not Found
X-Kazaa-Username: anonymous_user
X-Kazaa-Network: KaZaA
X-Kazaa-IP: xx.xx.xx.xx:1348
X-Kazaa-SupernodeIP: xx.xx.xx.x:3699

It looked like it was running on someone's DVR. Anyone have any pointers
as
to what software does that?
Along the same lines, haha:
X-Kaza-Username: hrosen
X-Kaza-Network: RIAA
X-Kaza-IP: 146.82.174.12:80
X-Kaza-SupernodeIP: 68.163.90.12:80
X-Disclaimer: All Your Base Are Belong To Us
X-Pizza-Phone: 961.1.351904

They're not even the only ones using "X-Disclaimer", a bunch of other
sites
do too:
X-Disclaimer: The local sysadmins have *nothing* to do with the content of
this server.

It looks like Tux Games is trying to extend the venerable RFC 1097 to the
web:
X-Subliminal: You want to buy as many games as you can afford

Personally, I would've gone for: "X-Superliminal: Hey you, buy some
games!".
I'm sure these kind folks would be first adopters:
X-Cotton: The Fabric of Our Lives

This person wants to make their opinion known, so here it is:
Veto: Usage of server response for statistics and advertising is
disagreed!

To which I say: Take off every 'zig'!! You know what you doing.
Robot Rock
I'd never really paid much attention to the Robots header:
ROBOTS: index,follow,cache

as it's mostly used to disable indexing of a page and is intended to be
used in a meta tag in the HTML itself, not in the HTTP headers.
However, it seems Google has added a new NOARCHIVE attribute, so let's see
who's using it in their headers rather than in the meta tags like Google
specifies.
It looks like the Singapore-based "Ministry of Pets" doesn't want to be
cached, as does the Civil Engineering department at S?o Paulo Polytechnic
University, the realtime-3d software company MultiGen Paradigm, Swiss
handicraft company Schweizer Heimatwerk, a Swiss kitesailing site, the
Ragin' Cajun Cafe in Hermosa Beach, CA, the London-based BouncingFish web
consultancy, and the French financial paper La Tribune. That's it.
BouncingFish even goes so far as to use an additional GOOGLEBOT header:
GOOGLEBOT: NOCACHE

How many of these sites are not being cached by Google? Zero. Which just
goes to show that one shouldn't just expect mix-and-matching of specs to
work.
Along the same vein, I don't think the first two headers below will work
as
expected:
X-Meta-ROBOTS: ALL
X-Meta-Revisit-After: 1 days
Robots: INDEX, FOLLOW

Except, possibly, in spiders using Perl's HTML::HeadParser module. And, of
course, we've already seen that the third header probably won't work,
either.
While I'm on the subject of Google. all Blogspot sites spit out:
test: %{HOSTNAME}e

So Blogger folks, whatcha doin'?
It's Funny, Laugh
The fine folks at www.basement.com.au want to make it clear that:

Mickey-Mouse: Does_Not_Live_Here
Some people have a lot of fun with headers, as seen here:
Limerick: There was a young fellow named Fisher
Limerick: Who was fishing for fish in a fissure,
Limerick: When a cod, with a grin,
Limerick: Pulled the fisherman in
Limerick: Now they're fishing the fissure for Fisher.

This is the only ascii art I found:
<!--
*************************************************************************
* Welcome to schMOOze University *
* *
* ==> To connect to an existing player type: CONNECT NAME PASSWORD *
* ==> To connect as a guest type: CONNECT GUEST *
* *
*************************************************************************
* all text is copyrighted by the various authors *
* TIME FLIES LIKE AN ARROW FRUIT FLIES LIKE A BANANA *
* *** *
* * * *
* * * *
* * * *
* * * *
* * (__) * *
* * (OO) * *
* * ____________ / * *
* * /| / * *
* * / | | | | * *
* * * | |^^ | | * *
* * ^ ^ ^ ^ * *
************ ************

Nobody is connected.
<HTML>
<HEAD>
<TITLE>Welcome to schMOOze!</TITLE>
<meta http-equiv="Content-Type" content="text/html;
<meta http-equiv="refresh"
content="0;URL=http://schmooze.hunter.cuny.edu/">
</BODY></HTML>

and it had me puzzled, until I realized it's a telnet server, and the
above is a really clever hack to redirect browsers towards HTML-land.
This made me laugh:
X-Powered-By: Intravenous Caffeine Drips
X-kluged-by: Nick, Mic, Ash, Andy
X-munged-by: The powers that be
X-Sanity-Provided-By: Ashleigh

Apparently the site has an alter-ego, as well.
www.wrestlingdb.com has some interesting headers. A few requests get:
X-Stone-Cold-Steve-Austin: And that's the bottom line, cause Stone Cold
says so.
X-Mick-Foley: Have a nice day!
X-Ric-Flair: To be the man, WHOOO!, you've got to beat the man.
X-Rock: If you smell what The Rock is cooking.
X-Booker-T: Can you dig it, SUCKAAAA?
X-Kurt-Angle: It's true, it's DAMN true.
X-Hurricane: Stand Back! There's a Hurricane Coming Through!
X-Kane: FREAKS RULE!

which is about as entertaining as watching a real wrestling match.
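
I have no idea how that site rotates its quotes, but if you want the same
effect, a small piece of WSGI middleware would do it. This is a minimal
sketch under my own assumptions: the quote list, with_random_header, and
hello_app are all hypothetical names, not anything taken from the site.

```python
import random

# Hypothetical quote list; the real site's implementation is unknown.
QUOTES = [
    ("X-Mick-Foley", "Have a nice day!"),
    ("X-Kane", "FREAKS RULE!"),
    ("X-Kurt-Angle", "It's true, it's DAMN true."),
]


def with_random_header(app, quotes=QUOTES, rng=random):
    """WSGI middleware that appends one randomly chosen header per response."""

    def wrapped(environ, start_response):
        def patched_start(status, headers, exc_info=None):
            extra = rng.choice(quotes)
            return start_response(status, list(headers) + [extra], exc_info)

        return app(environ, patched_start)

    return wrapped


def hello_app(environ, start_response):
    """Minimal WSGI app used only to demonstrate the middleware."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello\n"]
```

To try it locally, something like
wsgiref.simple_server.make_server("", 8000, with_random_header(hello_app))
followed by a few curl -i requests should show a different header now and
then.
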
Totally Ellet
Just so everyone knows, Frostburg students are so totally leet, they don't
even need to spell it correctly:
Owned And Operted By FSU Computer Club: 31137

Speaking of which, apparently some guy named morris would like his
visitors to know that he 0wnzor$ them:
X-You-Are-Owned-By: morris

Not sure where that box you rooted and are browsing the web from is
located? Never fear, mobileparty.net will tell you:
X-Detected-Country: US

And, for those who were wondering, the Texarkana Police are the world's
finest, at least in the HTTP headers department:
TEXPOLICE: LAW_ENFORCEMENTS_FINEST

These nederlanders are representin' for the westside:
X-Side: : WESTSIDE-FOR-LIFE

Western Europe, that is. Jaaa.
Speaking of furriners, anyone care to translate:
X-Sarrazin-Says: Ciccio, lascia perdere, e' un blowfish a 448 bit.

Similarly:
X-beliebig: Dieser Header dient der allgemeinen Verwirrung =:)
X-Gleitschirmfliegen: macht Spaaaasss!

Going back to my discussion on standards, localizing headers that are used
to actually do stuff is a bad idea:
Ultima Modificação: Thu, 28 Jul 2005 15:12:07 GMT
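
To see why, consider how a client typically finds the modification time: it
looks the header up by its standard name, Last-Modified. The sketch below is
my own illustration (parse_headers and last_modified are hypothetical
helpers, and the parsing is deliberately simplistic); it finds nothing when
the field name has been translated.

```python
from email.utils import parsedate_to_datetime


def parse_headers(raw):
    """Parse 'Name: value' lines into a dict keyed by lowercased field name."""
    headers = {}
    for line in raw.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return headers


def last_modified(raw):
    """Return the Last-Modified time as a datetime, or None if absent."""
    value = parse_headers(raw).get("last-modified")
    return parsedate_to_datetime(value) if value else None


standard = "Last-Modified: Thu, 28 Jul 2005 15:12:07 GMT"
localized = "Ultima Modificação: Thu, 28 Jul 2005 15:12:07 GMT"
print(last_modified(standard))   # a datetime for 28 Jul 2005
print(last_modified(localized))  # None: the client never sees the date
```

The date value itself is perfectly valid; it is only the translated field
name that makes it invisible to caches and browsers.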

ObRef
The Democrats called, they want you to know they found their sense of
humor:
X-Dubya: You teach a child to read and he or her will be able to pass a
literacy test.

Make sure to hit it a few times for optimum goodness:
X-Dubya: We're in for a long struggle, and I think Texans understand that.
And so do Americans.
X-Dubya: Africa is a nation that suffers from incredible disease.
X-Dubya: We're making the right decisions to bring the solution to an end.
X-Dubya: Families is where our nation finds hope, where wings take dream.

In the politics vein:
X-Powered-By: MonkeyMag 0.02.01, (c) Niel Bornstein and Kendall Clark
X-Shout-Out: No Power Without Accountability
X-Mos-Defology: Speech is my hammer//Bang the world into shape//Now let it
fall
X-American-Leftist-Salute: Doing Woody's Work!
X-Billy-Braggage: Sun, Sea, Socialism!

Yes! Someone just made my day. I love Al Bundy quotes:
X-Bundy: Here we have 3 of the seven dwarfs, puffy, crabby and horny.
X-Bundy: You know I never danced unless it was gonna get some sex for me.
X-Bundy: I blame it on TV myself.
X-Bundy: To know me is to love me.

I was disappointed in the lack of mention of mules, donkeys, or garden
gnomes, but at least llamas, mice, and loons are well represented:
X-Llamas-Version: 2.0

From: www.teevee.org
X-Favourite-Animal: Mouse

From: www.kingssing.de
X-Loons-Version: 1.5.1

From: www.eod.com

Speaking of strange characters, apparently the Wicked Witch of the West and
Spongebob Squarepants cohabitate at www.harbor-club.com:
X-Wicked-Witch: West
X-Spongebob: Squarepants!

Who knew?!
As if we needed further proof that the soft underbelly of the Internet is
full of cults, slowly corrupting the moral fabric of society, I present:
X-SAVIOUR: BOB_DOBBS

From the looks of things, Living Slack Master Bob Dobbs is giving Jesus a
run for his money among Oregonian carpenters and their web designers. They
join such luminaries as R. Crumb, Devo, and Bruce Campbell.
And if you thought that was an obscure meme, try this on for size:
X-Lerfjhax: Extra yummy

When I first saw an X-Han header, I thought for sure the contents would be
"Shot first!", but instead I found something more obscure:
X-Han: 'I look forward to a tournament of truly epic proportions.'

While we're on pop culture allusions:
X-Powered-By: Twine
X-Towelie: You wanna get high?

And it would be difficult to be more obscure than this:
X-Sven: look out for the fruits of life

Finally, old school Mac-diehards should appreciate:
X-Blam: Frog blast the vent core!

Connection: close
Back when I was interviewing for an internship at Tellme Networks, they had
a comment buried on their homepage that said:
<!-- (c) Copyright 2000 Tellme Networks. -->
<!--
If you're looking at our HTML source, you're exactly the person
who should send us your resume. We recently redesigned our site;
Tell us all about how you would make it better and better yet,
if you have an illustrious career of web-hacking, drop us an email
at

I thought this was just way awesome. However, if I was disappointed when
they removed that comment, I'm even more disappointed to report that I have
yet to find a single HTTP header offering me a cool job. What's wrong with
you people?! I'm supposed to be able to find anything on the Internet!
I was, at least, thanked for my efforts, and I found the answer to life,
the universe, and everything!
X-Thank-You: for bothering to look at my HTTP headers
X-Answer: 42

You're welcome! And thank you all, for making the Internet so interesting!
This entry was posted on Sunday, August 7th, 2005 at 11:26 PM and is filed
under General, Programming.


http://www.nextthing.org/archives/2...tp-headers



