author Daniel Stenberg <daniel@haxx.se> 2008-05-29 21:48:15 +0000
committer Daniel Stenberg <daniel@haxx.se> 2008-05-29 21:48:15 +0000
commit 47925f3dd78a1cb46c6245d16af906c929cd4a84
tree e10c38b01d63797638be3f727f0cb358cd2af11f /docs/TheArtOfHttpScripting
parent 82c5950c7e8e354e7af3588a037c0c53f155e757
Added a new "13. Web Login" chapter
Diffstat (limited to 'docs/TheArtOfHttpScripting')
-rw-r--r-- docs/TheArtOfHttpScripting 58
1 files changed, 48 insertions, 10 deletions
diff --git a/docs/TheArtOfHttpScripting b/docs/TheArtOfHttpScripting
index f3357d474..3d237b489 100644
--- a/docs/TheArtOfHttpScripting
+++ b/docs/TheArtOfHttpScripting
@@ -1,5 +1,5 @@
Online: http://curl.haxx.se/docs/httpscripting.html
-Date: December 9, 2004
+Date: May 28, 2008
The Art Of Scripting HTTP Requests Using Curl
=============================================
@@ -137,6 +137,10 @@ Date: December 9, 2004
you need to replace that space with %20 etc. Failing to comply with this
will most likely cause your data to be received wrongly and messed up.
+ Recent curl versions can in fact url-encode POST data for you, like this:
+
+ curl --data-urlencode "name=I am Daniel" www.example.com
+
4.3 File Upload POST
Back in late 1995 they defined an additional way to post data over HTTP. It
@@ -202,14 +206,14 @@ Date: December 9, 2004
curl -T uploadfile www.uploadhttp.com/receive.cgi
-6. Authentication
+6. HTTP Authentication
- Authentication is the ability to tell the server your username and password
- so that it can verify that you're allowed to do the request you're doing. The
- Basic authentication used in HTTP (which is the type curl uses by default) is
- *plain* *text* based, which means it sends username and password only
- slightly obfuscated, but still fully readable by anyone that sniffs on the
- network between you and the remote server.
+ HTTP Authentication is the ability to tell the server your username and
+ password so that it can verify that you're allowed to do the request you're
+ doing. The Basic authentication used in HTTP (which is the type curl uses by
+ default) is *plain* *text* based, which means it sends username and password
+ only slightly obfuscated, but still fully readable by anyone that sniffs on
+ the network between you and the remote server.
To tell curl to use a user and password for authentication:
@@ -237,6 +241,10 @@ Date: December 9, 2004
able to watch your passwords if you pass them as plain command line
options. There are ways to circumvent this.
+ It is worth noting that while this is how HTTP Authentication works, very
+ many web sites will not use this concept when they provide logins etc. See
+ the Web Login chapter further below for more details on that.
+
7. Referer
A HTTP request may include a 'referer' field (yes it is misspelled), which
@@ -407,7 +415,37 @@ Date: December 9, 2004
curl -H "Destination: http://moo.com/nowhere" http://url.com
-13. Debug
+13. Web Login
+
+ While not strictly just HTTP related, it still causes a lot of people problems
+ so here's the executive run-down of how the vast majority of all login forms
+ work and how to log in to them using curl.
+
+ It can also be noted that to do this properly in an automated fashion, you
+ will almost certainly need to script things and invoke curl multiple times.
+
+ First, servers mostly use cookies to track the logged-in status of the
+ client, so you will need to capture the cookies you receive in the
+ responses. Then, many sites also set a special cookie on the login page (to
+ make sure you got there through their login page) so you should make a habit
+ of first getting the login-form page to capture the cookies set there.
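
The cookie handling described above can be sketched as a small shell function. This is only an illustration under stated assumptions: the /login and /members paths and the "user" and "password" field names are hypothetical and must be replaced with whatever the real login form uses. The -c option writes received cookies to a file and -b sends them back.

```shell
# Sketch only: the paths and form field names below are hypothetical.
login_with_cookies() {
  site="$1"; user="$2"; pass="$3"
  jar="${4:-cookies.txt}"

  # Step 1: get the login-form page first, storing any cookies set there.
  curl -s -c "$jar" -o /dev/null "$site/login" || return 1

  # Step 2: POST the credentials, sending the stored cookies back and
  # saving any new ones (such as the session cookie) to the same file.
  curl -s -b "$jar" -c "$jar" \
       --data-urlencode "user=$user" \
       --data-urlencode "password=$pass" \
       -o /dev/null "$site/login" || return 1

  # Step 3: later requests reuse the cookie file to stay logged in.
  curl -s -b "$jar" "$site/members"
}
```

It could be called like this, again with a made-up site and credentials:

 login_with_cookies http://www.example.com daniel secret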
+
+ Some web-based login systems feature various amounts of javascript, and
+ sometimes they use such code to set or modify cookie contents. Possibly they
+ do that to prevent programmed logins, like the ones this manual describes...
+ Anyway, if reading the code isn't enough to let you repeat the behavior
+ manually, capturing the HTTP requests done by your browser and analyzing the
+ sent cookies is usually a working method to work out how to shortcut the
+ need for javascript.
+
+ In the actual <form> tag for the login, lots of sites fill in random/session
+ or otherwise secretly generated hidden fields, and you may need to first
+ capture the HTML code for the login form and extract all the hidden fields to
+ be able to do a proper login POST. Remember that the contents need to be URL
+ encoded when sent in a normal POST.
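
As a rough sketch of the hidden-field step, the two helper functions below (both hypothetical, not part of curl) pull name=value pairs out of a saved login page and add them to the POST. It assumes each hidden <input> tag sits on its own line, uses double-quoted attributes, and lists name before value; real pages often break these assumptions and may need a proper HTML parser. HTML entities in the values are not decoded. The "user" and "password" field names and the cookies.txt file name are likewise assumptions.

```shell
# Print name=value for every hidden input read from stdin.
# Assumes one tag per line, double quotes, and name before value.
hidden_fields() {
  grep -i 'type="hidden"' |
  sed -n 's/.*name="\([^"]*\)".*value="\([^"]*\)".*/\1=\2/p'
}

# Hypothetical helper: POST user and password plus all hidden fields
# found in a previously saved copy of the login form.
post_login_form() {
  form_html="$1"; url="$2"; user="$3"; pass="$4"

  # Build the curl argument list; --data-urlencode encodes each value.
  set -- --data-urlencode "user=$user" --data-urlencode "password=$pass"
  while IFS= read -r field; do
    if [ -n "$field" ]; then
      set -- "$@" --data-urlencode "$field"
    fi
  done <<EOF
$(hidden_fields < "$form_html")
EOF

  # Send cookies from the earlier requests and store any new ones.
  curl -s -b cookies.txt -c cookies.txt "$@" "$url"
}
```

A possible use, with a made-up site, is to save the form first and then post it:

 curl -s -c cookies.txt -o login.html http://www.example.com/login
 post_login_form login.html http://www.example.com/login daniel secret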
+
+
+14. Debug
Many times when you run curl on a site, you'll notice that the site doesn't
seem to respond the same way to your curl requests as it does to your
@@ -437,7 +475,7 @@ Date: December 9, 2004
such as ethereal or tcpdump and check which headers were sent and
received by the browser. (HTTPS makes this technique ineffective.)
-14. References
+15. References
RFC 2616 is a must to read if you want in-depth understanding of the HTTP
protocol.