import Network.URI
import Network.HTTP
import Network.Browser

get :: URI -> IO String
get uri = do
    let req = Request uri GET [] ""
    resp <- browse $ do
        setAllowRedirects True -- handle HTTP redirects
        request req
    return $ rspBody $ snd resp

main = do
    case parseURI "http://cn.bing.com/search?q=hello" of
        Nothing -> putStrLn "Invalid search"
        Just uri -> do
            body <- get uri
            writeFile "output.txt" body

Here is the diff between the Haskell output and the curl output
It's probably not a good idea to use String as the intermediate data type here, as it will cause character conversions both when reading the HTTP response and when writing to the file. This can cause corruption if these conversions are not consistent, as appears to be the case here.

Since you just want to copy the bytes directly, it's better to use a ByteString. I've chosen to use a lazy ByteString here, so that it does not have to be loaded into memory all at once, but can be streamed lazily into the file, just like with String.
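To make the corruption concrete, here is a minimal sketch that needs no network access. It assumes the String body holds one Char per response byte (which is what Data.ByteString.Char8.unpack produces) and that the file handle encodes as UTF-8; writeFile actually uses the locale encoding, so the sketch pins it explicitly. The names writeStringUtf8, roundTripViaString and roundTripViaByteString are hypothetical helpers, not part of any library.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import System.IO

-- Write a String like writeFile does, but with the handle encoding
-- pinned to UTF-8 (assumption: the locale encoding is UTF-8).
writeStringUtf8 :: FilePath -> String -> IO ()
writeStringUtf8 path s = withFile path WriteMode $ \h -> do
    hSetEncoding h utf8
    hPutStr h s

-- Treat every byte as one Char, write the String, read the raw bytes back.
roundTripViaString :: FilePath -> B.ByteString -> IO B.ByteString
roundTripViaString path bytes = do
    writeStringUtf8 path (BC.unpack bytes)
    B.readFile path

-- Write the bytes untouched and read them back.
roundTripViaByteString :: FilePath -> B.ByteString -> IO B.ByteString
roundTripViaByteString path bytes = do
    B.writeFile path bytes
    B.readFile path
```

For the two bytes 0xC3 0xA9 (the UTF-8 encoding of é), roundTripViaString returns the four bytes 0xC3 0x83 0xC2 0xA9, because each byte is treated as a separate character and encoded again, while roundTripViaByteString returns the original two bytes.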
import Network.URI
import Network.HTTP
import Network.Browser
import qualified Data.ByteString.Lazy as L

get :: URI -> IO L.ByteString
get uri = do
    let req = Request uri GET [] L.empty
    resp <- browse $ do
        setAllowRedirects True -- handle HTTP redirects
        request req
    return $ rspBody $ snd resp

main = do
    case parseURI "http://cn.bing.com/search?q=hello" of
        Nothing -> putStrLn "Invalid search"
        Just uri -> do
            body <- get uri
            L.writeFile "output.txt" body

Fortunately, the functions in Network.Browser are overloaded, so the change to lazy bytestrings only involves changing the request body to L.empty, replacing writeFile with L.writeFile, and changing the type signature of the function.