Can anyone explain how the communication takes place between the browser and web server? I want to learn how
- GET, POST verbs (among others)
- cookies
- sessions
- query strings
work behind the scene.开发者_运维问答
Hyper Text Transfer Protocol (HTTP) is a protocol used for transferring web pages (like the one you're reading right now). A protocol is really nothing but a standard way of doing things. If you were to meet the President of the United States, or the king of a country, there would be specific procedures that you'd have to follow. You couldn't just walk up and say "hey dude". There would be a specific way to walk, to talk, a standard greeting, and a standard way to end the conversation. Protocols in the TCP/IP stack serve the same purpose.
The TCP/IP stack has four layers: Application, Transport, Internet, and Network. At each layer there are different protocols that are used to standardize the flow of information, and each one is a computer program (running on your computer) that's used to format the information into a packet as it's moving down the TCP/IP stack. A packet is a combination of the Application Layer data, the Transport Layer header (TCP or UDP), and the IP layer header (the Network Layer takes the packet and turns it into a frame).
The Application Layer
...consists of all applications that use the network to transfer data. It does not care about how the data gets between two points and it knows very little about the status of the network. Applications pass data to the next layer in the TCP/IP stack and then continue to perform other functions until a reply is received. The Application Layer uses host names (like www.dalantech.com) for addressing. Examples of application layer protocols: Hyper Text Transfer Protocol (HTTP -web browsing), Simple Mail Transfer Protocol (SMTP -electronic mail), Domain Name Services (DNS -resolving a host name to an IP address), to name just a few.
The main purpose of the Application Layer is to provide a common command language and syntax between applications that are running on different operating systems -kind of like an interpreter. The data that is sent by an application that uses the network is formatted to conform to one of several set standards. The receiving computer can understand the data that is being sent even if it is running a different operating system than the sender due to the standards that all network applications conform to.
The Transport Layer
...is responsible for assigning source and destination port numbers to applications. Port numbers are used by the Transport Layer for addressing and they range from 1 to 65,535. Port numbers from 0 to 1023 are called "well known ports". The numbers below 256 are reserved for public (standard) services that run at the Application Layer. Here are a few: 25 for SMTP, 53 for DNS (udp for domain resolution and tcp for zone transfers) , and 80 for HTTP. The port numbers from 256 to 1023 are assigned by the IANA to companies for the applications that they sell.
Port numbers from 1024 to 65,535 are used for client side applications -the web browser you are using to read this page, for example. Windows will only assign port numbers up to 5000 -more than enough port numbers for a Windows based PC. Each application has a unique port number assigned to it by the transport layer so that as data is received by the Transport Layer it knows which application to give the data to. An example is when you have more than one browser window running. Each window is a separate instance of the program that you use to surf the web, and each one has a different port number assigned to it so you can go to www.dalantech.com in one browser window and this site does not load into another browser window. Applications like FireFox that use tabbed windows simply have a unique port number assigned to each tab
The Internet Layer
...is the "glue" that holds networking together. It permits the sending, receiving, and routing of data.
The Network Layer
...consists of your Network Interface Card (NIC) and the cable connected to it. It is the physical medium that is used to transmit and receive data. The Network Layer uses Media Access Control (MAC) addresses, discussed earlier, for addressing. The MAC address is fixed at the time an interface was manufactured and cannot be changed. There are a few exceptions, like DSL routers that allow you to clone the MAC address of the NIC in your PC.
For more info:
- Protocols
- TCP/IP
Your browser first resolves the servername via DNS to an IP. Then it opens a TCP connection to the webserver and tries to communicate via HTTP. Usually that is on TCP-port 80 but you can specify a different one (http://server:portnumber
).
HTTP looks like this:
Once it is connected, it sends the request, which looks like:
GET /site HTTP/1.0
Header1: bla
Header2: blub
{emptyline}
E.g., a header might be Authorization
or Range
. See here for more.
Then the server responds like this:
200 OK
Header3: foo
Header4: bar
content following here...
E.g., a header might be Date
or Content-Type
. See here for more.
Look at Wikipedia for HTTP for some more information about this protocol.
The links for specifications of each aspect of the question is as follows:
GET, POST verbs (among others) - The HTTP Specification exhaustively discusses all aspects of HTTP communication (the protocol for communication between the web server and the browser). It explains the Request message and Response message protocols.
Cookies - are set by attaching a
Set-Cookie
HTTP Header to the HTTP response.QueryStrings - are the part of the URL in the HTTP request that follow the first occurrence of a "?" character. The linked specification is for section 3.4 of the URI specification.
Sessions - HTTP is a synchronous, stateless protocol. Sessions, or the illusion of state, can be created by (1) using cookies to store state data as plain text on the client's computer, (2) passing data-values in the URL and querystring of the request, (3) submitting POST requests with a collection of values that may indicate state and, (4) storing state information by a server-side persistence mechanism that is retrieved by a session-key (the session key is resolved from either the cookie, URL/Querystring or POST value collection.
An explanation of HTTP can go on for days, but I have attempted to provide a concise yet conceptually complete answer, and include the appropriate links for further reading of official specifications.
Your browser is sitting on top of TCP/IP, as the web is based on standards, usually port 80, what happens is when you enter an address, such as google.com, your computer where the browser is running on, creates packets of data, encapsulated at each layer accordingly to the OSI standards, (think of envelopes of different sizes, packed into each envelope of next size), OSI defines 7 layers, in one of the envelopes contains the source address and destination address(that is the website) encoded in binary.
As it reaches the 1st layer, in OSI terms, it gets transmitted across the media transmitter (such as cable, DSL).
If you are connected via ISP, the layered pack of envelopes gets transmitted to the ISP, the ISP's network system, peeks through the layered pack of envelopes by decoding in reverse order to find out the address, then the ISP checks their Domain Name System database to find out if they have a route to that address (cached in memory, if it does, it forwards it across the internet network - again layered pack of envelopes).
If it doesn't, the ISP interrogates the top level DNS server to say 'Hey, get me the route for the address as supplied by you, ie. the browser', the top level DNS server then passes the route to the ISP which is then stored in the ISP's server memory.
The layered pack of envelopes are transmitted and received by the website server after successful routing of the packets (think of routing as signposts for directions to get to the server), which in turn, unpacks the layered pack of envelopes, extracts the source address and says 'Aha, that is for me, right, I know the destination address (that is you, the browser), then the server packetizes the webpages into a packed layered envelopes and sends it back (usually in reverse route, but not always the case).
Your browser than receives the packetized envelopes and unpacks each of them. Then your computer descrambles the data and your browser renders the pages on the screen.
I hope this answer is sufficient enough for your understanding.
this is from Browser — Server Communication
It depends on the web server, but if you're wondering what it looks like from the client side, just install Live Headers and Firebug for firefox. With the net tab in firebug and live headers open, it should be clear exactly how the two interact.
For a more in-depth look at the actual data going back and forth, use wireshark.
There is a commercial product with an interesting logo which lets you see all kind of traffic between server and client named charles.
Another open source tools include: Live HttpHeaders, Wireshark or Firebug.
Communication between a browser and a webserver takes place at so many levels that is close to impossible to answer this question. HTTP plays a role, but HTTP is meaningless without TCP which is meaningless without IP which is meaningless without a physical network on which it sent. Then, there are POST vs GET requests which are similar but enough different to warrant a special dicussion. Sometimes an HTTP request needs to be authenticated, sometimes, it needs not. Mime types should be mentioned. Then, a browser sends a different request if there is a proxy. And then also encodings play a role. So, I guess, the most concise answer to this kind of question is: the browser asks the server for data and the server gives the requested data to the browser.
精彩评论