Android: Two Ways to Download HTML Source, HttpClient and URLConnection

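The two methods below use HttpClient and URLConnection respectively, and both deal with the garbled-character (encoding) problem. In tests on a real device, HttpClient seemed to be the more stable of the two and could usually download the page, whereas URLConnection often failed to fetch any data on an EDGE network.

HttpClient approach: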

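    public String getHtml(String url) throws IOException, URISyntaxException {
        URI u = new URI(url);
        DefaultHttpClient httpclient = new DefaultHttpClient();
        HttpGet httpget = new HttpGet(u);
        // The type parameter is required; with a raw ResponseHandler, execute() returns Object
        ResponseHandler<String> responseHandler = new BasicResponseHandler();
        String content = httpclient.execute(httpget, responseHandler);
        // The target page is UTF-8; without this re-decoding step the text is garbled.
        // This assumes the body was decoded as ISO-8859-1, the default HttpClient
        // uses when the server declares no charset.
        content = new String(content.getBytes("ISO-8859-1"), "UTF-8");
        return content;
    }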
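The re-decoding line works because ISO-8859-1 maps every byte value to exactly one character, so the original bytes can be recovered and decoded again with the correct charset. The following is a minimal, standard-library-only sketch of that round trip; the class name `CharsetRoundTrip` and the helper `redecode` are made up for illustration and are not part of the original code.

```java
import java.nio.charset.StandardCharsets;

public class CharsetRoundTrip {
    // Hypothetical helper: takes text that was wrongly decoded as ISO-8859-1
    // and decodes the same underlying bytes as UTF-8 instead.
    public static String redecode(String misdecoded) {
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587"; // two Chinese characters
        // Bytes as a UTF-8 page would send them over the wire
        byte[] wire = original.getBytes(StandardCharsets.UTF_8);
        // What a client sees if it decodes those bytes as ISO-8859-1
        String garbled = new String(wire, StandardCharsets.ISO_8859_1);
        // Recover the bytes and decode them correctly
        String repaired = redecode(garbled);
        System.out.println(repaired.equals(original)); // prints true
    }
}
```

The trick only applies when the text really was decoded as ISO-8859-1. If the client already decoded the body with the right charset, re-decoding it would corrupt the text.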

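Alternatively, you can set the charset when sending the request with httpget.getParams().setParameter(HttpMethodParams.HTTP_CONTENT_CHARSET,"GB2312"); (HttpMethodParams belongs to Commons HttpClient 3.x; in HttpClient 4.x the corresponding constant appears to be CoreProtocolPNames.HTTP_CONTENT_CHARSET).

URLConnection approach: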

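    public String getHTML(String url) {
        BufferedReader in = null;
        try {
            URL newUrl = new URL(url);
            URLConnection connect = newUrl.openConnection();
            connect.setRequestProperty("User-Agent", "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt)");
            // The target page is UTF-8, so decode the stream as UTF-8
            in = new BufferedReader(new InputStreamReader(connect.getInputStream(), "UTF-8"));
            // StringBuilder avoids copying the whole string on every line
            StringBuilder html = new StringBuilder();
            String readLine;
            while ((readLine = in.readLine()) != null) {
                html.append(readLine); // note: readLine() strips the line terminator
            }
            return html.toString();
        } catch (MalformedURLException me) {
            // Invalid URL: fall through and return null
        } catch (IOException ioe) {
            // Network or read failure: fall through and return null
        } finally {
            if (in != null) {
                try { in.close(); } catch (IOException ignored) { }
            }
        }
        return null;
    }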
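Both methods hard-code the page encoding. As a hedged alternative, the charset can be taken from the response's Content-Type header and the body decoded accordingly. The sketch below uses only the standard library; the class `HtmlDecoding` and the helpers `charsetFromContentType` and `readAll` are hypothetical names introduced for illustration, and the try-with-resources syntax assumes Java 7 or later.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class HtmlDecoding {
    // Hypothetical helper: extracts the charset from a Content-Type value such as
    // "text/html; charset=UTF-8". Returns the fallback when the header is missing,
    // has no charset parameter, or names a charset this JVM does not support.
    public static Charset charsetFromContentType(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Covers illegal and unsupported charset names
                    return fallback;
                }
            }
        }
        return fallback;
    }

    // Hypothetical helper: reads the whole stream as text and closes it.
    // Reading by character buffer keeps line breaks, unlike readLine().
    public static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Simulated response: a header value and a UTF-8 encoded body
        String header = "text/html; charset=UTF-8";
        byte[] body = "<p>\u4e2d\u6587</p>\n".getBytes(StandardCharsets.UTF_8);
        Charset cs = charsetFromContentType(header, StandardCharsets.ISO_8859_1);
        String html = readAll(new ByteArrayInputStream(body), cs);
        System.out.println(html.equals("<p>\u4e2d\u6587</p>\n")); // prints true
    }
}
```

With a real connection, the header value would come from connect.getContentType() and the stream from connect.getInputStream(). Pages that declare their encoding only in an HTML meta tag are not handled by this sketch.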