Can anyone point out the flaw in this code? I'm retrieving some HTML with TcpClient. NetworkStream.Read() never seems to finish when talking to an IIS server. If I go use the Fiddler proxy instead, it works fine, but when talking directly to the target server the .read() loop won't exit until the connection exceptions out with an error like "the remote server has closed the connection".
internal TcpClient Client { get; set; }
/// bunch of other code here...
try
{
NetworkStream ns = Client.GetStream();
StreamWriter sw = new StreamWriter(ns);
sw.Write(request);
sw.Flush();
byte[] buffer = new byte[1024];
int read=0;
try
{
while ((read = ns.Read(buffer, 0, buffer.Length)) > 0)
{
response.AppendFormat("{0}", Encoding.ASCII.GetString(buffer, 0, read));
}
}
catch //(SocketException se)
{
}
finally
{
Close();
}
Update
In the debugger, I can see the entire response coming through immediately and being appended to my StringBuilder (response). It just appears that the connection isn't being closed when the server is done sending the response, or my code isn't detecting it.
Conclusion As has been said here, it's best to take advantage of the offerings of the protocol (in the case of HTTP, the Content-Length header) to determine when a transaction is complete. However, I've found that not all pages have content-length set. So, I'm now using a hybrid solution:
For ALL transactions, set the request's
Connection
header to "close", so that the server is discouraged from keeping the socket open. This improves the chances that the server will close the connection when it is through responding to your request.If
Content开发者_StackOverflow中文版-Length
is set, use it to determine when a request is complete.Else, set the NetworkStream's RequestTimeout property to a large, but reasonable, value like 1 second. Then, loop on
NetworkStream.Read()
until either a) the timeout occurs, or b) you read fewer bytes than you asked for.
Thanks to everyone for their excellent and detailed responses.
Contrary to what the documentation for NetworkStream.Read implies, the stream obtained from a TcpClient
does not simply return 0 for the number of bytes read when there is no data available - it blocks.
If you look at the documentation for TcpClient
, you will see this line:
The TcpClient class provides simple methods for connecting, sending, and receiving stream data over a network in synchronous blocking mode.
Now my guess is that if your Read
call is blocking, it's because the server has decided not to send any data back. This is probably because the initial request is not getting through properly.
My first suggestion would be to eliminate the StreamWriter
as a possible cause (i.e. buffering/encoding nuances), and write directly to the stream using the NetworkStream.Write
method. If that works, make sure that you're using the correct parameters for the StreamWriter
.
My second suggestion would be not to depend on the result of a Read
call to break the loop. The NetworkStream
class has a DataAvailable
property that is designed for this. The correct way to write a receive loop is:
NetworkStream netStream = client.GetStream();
int read = 0;
byte[] buffer = new byte[1024];
StringBuilder response = new StringBuilder();
do
{
read = netStream.Read(buffer, 0, buffer.Length);
response.Append(Encoding.ASCII.GetString(buffer, 0, read));
}
while (netStream.DataAvailable);
Read the response until you reach a double CRLF. What you now have is the Response headers. Parse the headers to read the Content-Length header which will be the count of bytes left in the response.
Here is a regular expression that can catch the Content-Length header.
David's Updated Regex
Content-Length: (?<1>\d+)\r\n
Content-Length
Note
If the server does not properly set this header I would not use it.
Not sure if this is helpful or not but with HTTP 1.1 the underlying connection to the server might not be closed so maybe the stream doesn't get closed either? The idea being that you can reuse the connection to send a new request. I think you have to use the content-length. Alternatively use the WebClient or WebRequest classes instead.
I may be wrong, but it looks like your call to Write
is writing (under the hood) to the stream ns
(via StreamWriter
). Later, you're reading from the same stream (ns
). I don't quite understand why are you doing this?
Anyway, you may need to use Seek
on the stream, to move to the location where you want to start reading. I'd guess that it seeks to the end after writing. But as I said, I'm not really sure if this is a useful answer!
Two Suggestions...
- Have you tried using the DataAvailable property of NetworkStream? It should return true if there is data to be read from the stream.
while (ns.DataAvailable)
{
//Do stuff here
}
- Another option would be to change the ReadTimeOut to a low value so you don't end up blocking for a long time. It can be done like this:
ns.ReadTimeOut=100;
精彩评论