So I was thinking about languages the other day, and it struck me that any program written in a compiled language that interacts with the Internet is then translated into assembly that has to interact with the Internet. I've just begun learning a bit of x86 assembly to help me understand C++ a bit better, and I'm baffled by how something so low-level could do something like access the Internet.
I'm sure the full answer to this question is much more than would fit in a SO answer, but could somebody give me maybe a basic summary?
User-space programs that "interact with the internet", in all modern systems, do so by issuing system calls to the underlying operating system, which supplies the API for a TCP/IP stack.
The system calls in question (such as socket, listen, accept, and so forth) are typically documented at a C level, but in each particular OS implementation they will translate to machine code, of course. But whether values go in particular registers, or locations in memory pointed to by particular registers, etc., is pretty minor and totally system-specific.
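To make the socket / listen / accept sequence concrete, here is a minimal sketch in Python, whose standard socket module is a thin wrapper over the same OS calls. The helper names serve_once and echo_roundtrip are made up for this illustration, and it assumes a loopback interface at 127.0.0.1 and a message small enough to arrive in a single recv.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Accept a single connection and echo back whatever arrives (hypothetical helper)."""
    conn, _addr = listener.accept()   # accept: blocks until a client connects
    with conn:
        data = conn.recv(1024)        # recv: read up to 1024 bytes from the connection
        conn.sendall(data)            # sendall: repeats send until every byte is written


def echo_roundtrip(message: bytes) -> bytes:
    """Send `message` to a throwaway local echo server and return the reply."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # socket: new TCP endpoint
    with listener:
        listener.bind(("127.0.0.1", 0))    # bind: port 0 lets the OS pick a free port
        listener.listen(1)                 # listen: mark the socket as passive
        port = listener.getsockname()[1]

        server = threading.Thread(target=serve_once, args=(listener,))
        server.start()

        # connect + send + recv from the client side
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(message)
            reply = client.recv(1024)

        server.join()
    return reply
```

Calling echo_roundtrip(b"hello") should hand back the same bytes. Every commented step ends up as a request to the kernel; the program itself never touches the network hardware.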
If you're wondering how the machine code (probably also compiled from C) in the kernel and device drivers "interacts with the internet" (in response to system calls), it does so both by building and maintaining in-memory data structures to track the state of various things, and by interacting with the underlying hardware (e.g. via interrupts, I/O ports, memory mapped device areas, or whatever that particular architecture uses) -- just like it interacts with (say) a video display, or a disk device.
It depends. When you read about a web script written in C, it's actually a CGI program. CGI is a protocol, not a language. CGI specifies that the web server puts "GET", "POST", etc. into REQUEST_METHOD, "foo=bar&baz=42" into QUERY_STRING, post data into stdin, and so on. To access these, the CGI program reads its environment variables and uses system calls to read stdin. The web server uses CGI to communicate with a web script. A program that communicates across the Internet by itself can use the system sockets API.
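As a rough sketch of the CGI side, the function below pulls the request out of an environment mapping and an input stream, which is roughly what a CGI program does on startup. The name read_cgi_request is hypothetical, and CONTENT_LENGTH is a standard CGI variable that the answer above does not mention; it is used here as an assumption about how much post data to read.

```python
from urllib.parse import parse_qs


def read_cgi_request(environ, stdin):
    """Collect method, query parameters, and POST body as a CGI program would.

    `environ` is a mapping such as os.environ; `stdin` is a binary stream
    such as sys.stdin.buffer. Both are parameters so the sketch is easy to try.
    """
    method = environ.get("REQUEST_METHOD", "GET")
    # QUERY_STRING holds the part of the URL after "?", e.g. "foo=bar&baz=42"
    query = parse_qs(environ.get("QUERY_STRING", ""))
    # Assumption: the server sets CONTENT_LENGTH when there is post data
    length = int(environ.get("CONTENT_LENGTH") or 0)
    body = stdin.read(length) if method == "POST" else b""
    return method, query, body
```

In a real CGI script this would be called as read_cgi_request(os.environ, sys.stdin.buffer), with the response written to stdout for the web server to relay.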
In summary, the operating system does all the communicating. The program just makes the right system calls.
If you wonder how the operating system communicates over the Internet, the answer is that the OS kernel uses a driver to interface with the network card over an I/O port, memory-mapped I/O, etc. The OS and network card implement Internet Protocol standards for everything to work together.
What you need to do is to look up some of those PIC web-server projects. Some of them are web-servers written in assembly and running on 8-bit hardware. It will give you a clear idea of how something as low-level as assembly can be used to interact with the rest of the world through the Internet.
It basically involves:
- Writing some low-level drivers (Layer 2) to interface with the networking hardware - this may be using Ethernet or even modems (with SLIP).
- Writing the next layers - IP and TCP - to process the TCP/IP packets. This will need some assembly magic as these processes are quite involved.
- Writing the application layer - whether it be a web-server or client or whatever - that exploits the underlying layers.
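To give a feel for what the IP layer in that list has to do, here is a small sketch that parses a 20-byte IPv4 header and computes the standard Internet checksum over it. It is written in Python rather than assembly for readability; the function names are invented for this example, it ignores IP options, and it is nowhere near a full stack.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit big-endian words, as used by IPv4."""
    if len(data) % 2:
        data += b"\x00"                       # pad odd-length input with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                        # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def parse_ipv4_header(header: bytes) -> dict:
    """Unpack the fixed 20-byte part of an IPv4 header (options are ignored)."""
    (ver_ihl, _tos, total_len, _ident, _flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", header[:20])
    return {
        "version": ver_ihl >> 4,
        "header_len": (ver_ihl & 0x0F) * 4,   # IHL counts 32-bit words
        "total_len": total_len,
        "ttl": ttl,
        "protocol": proto,                    # 6 = TCP, 17 = UDP
        "checksum": checksum,
        "src": ".".join(str(b) for b in src),
        "dst": ".".join(str(b) for b in dst),
    }
```

A receiver can validate a header by running internet_checksum over all 20 bytes, checksum field included; a result of 0 means the header is intact. An 8-bit web server does the same arithmetic, just with far fewer registers to spare.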
Hope this clears up some doubt.
Is it reasonable to say that at some point, regardless of the program, code gets transformed (for lack of the proper term) into some form of "assembly" language (I think there is more than one) which then has a "one to one" relationship to machine code? Not sure how .NET and ILASM/Java and its corresponding bytecode fit into this, but I thought all of it at some point turned into assembly and then machine code.