Tuesday, October 1, 2013

dom4j encoding problem

This post was last edited by leidazhi on 2013-09-30 17:05:35
I wrote an HTTP client with HttpCore (running on Windows 7) that receives messages sent from another machine (Linux). When I try to parse the HttpEntity of the returned HttpResponse as XML, a parsing error occurs. The error message is: Invalid byte 1 of 1-byte UTF-8 sequence.

If the input stream is converted to a string, it displays correctly.

The same code also has no problem when run on a Linux machine: it correctly parses and displays an XML message containing Chinese.

When creating the XML document, using SAXReader's setEncoding to set the encoding to UTF-8, GBK, or GB2312 still does not parse the message correctly. The error message is: Invalid byte 3 of 3-byte UTF-8 sequence.

How can I solve this problem? Thanks.
------ Solution ------------------------------------- -------
Are you using the latest version of HttpClient? You don't need to copy the content into a byte array stream first.
Just return it directly: return new XMLContent("UTF-8", response.getEntity().getContent());
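
As a rough illustration of why handing the parser the raw byte stream can be enough: an XML parser that reads bytes picks the charset from the XML declaration (and assumes UTF-8 when there is none). The sketch below uses the JDK's built-in javax.xml.parsers instead of dom4j so that it runs without extra jars; the class name StreamParseDemo, the helper rootText, and the sample GBK document are all made up for the example, and it assumes the JDK ships the GBK charset.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.Charset;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class StreamParseDemo {
    // Parse straight from the byte stream. The parser decides how to decode
    // the bytes from the XML declaration, not from any setting in our code.
    static String rootText(InputStream in) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(in);
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical response body: declared as GBK and encoded as GBK.
        String xml = "<?xml version=\"1.0\" encoding=\"GBK\"?><msg>\u4e2d\u6587</msg>";
        byte[] body = xml.getBytes(Charset.forName("GBK"));
        String text = rootText(new ByteArrayInputStream(body));
        // True when the Chinese text survived the round trip.
        System.out.println("\u4e2d\u6587".equals(text));
    }
}
```

If the declaration and the actual byte encoding disagree, this kind of parse is where an "invalid UTF-8 sequence" style error would typically surface.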
------ For reference only --------- ------------------------------
OP, please post the key code that creates the XML.
----- - For reference only ---------------------------------------

public XMLContent(String encoding, InputStream inputStream) {
    try {
        SAXReader saxReader = getSaxReader();
        saxReader.setEncoding(encoding);
        document = saxReader.read(inputStream);
        inputStream.close();
    } catch (Exception e) {
        throw new RuntimeException("Failed to parse xml content:" + e.getMessage());
    }
}

public XMLContent(InputStream inputStream) {
    try {
        document = getSaxReader().read(inputStream);
        inputStream.close();
    } catch (Exception e) {
        throw new RuntimeException("Failed to parse xml content:" + e.getMessage());
    }
}

Invocation:
HttpPost httpPost = new HttpPost(urlString);
HttpProtocolParams.setContentCharset(httpPost.getParams(), "UTF-8");
HttpProtocolParams.setHttpElementCharset(httpPost.getParams(), "UTF-8");
....
HttpResponse response = httpClient.execute(httpPost);
return new XMLContent(new ByteArrayInputStream(EntityUtils.toByteArray(response.getEntity())));


------ For reference only ---------------------------------- -----
Don't use SAXReader.read(InputStream is); use the overload SAXReader.read(Reader reader) instead.
Change the XMLContent(String encoding, InputStream inputStream) constructor as follows:

public XMLContent(String encoding, InputStream inputStream) {
        try {
            SAXReader saxReader = getSaxReader();
            document = saxReader.read(new InputStreamReader(inputStream,encoding));
            inputStream.close();
        } catch (Exception e) {
            throw new RuntimeException("Failed to parse xml content:" + e.getMessage());
        }
    }
And change the call to:
return new XMLContent("UTF-8",new ByteArrayInputStream(EntityUtils.toByteArray(response.getEntity())));
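
For anyone wondering what the Reader overload changes: when a parser is given a character stream, the bytes have already been decoded by the Reader, so the parser disregards any encoding declaration in the document. A minimal sketch of that behavior follows, again using the JDK's own parser rather than dom4j so it runs standalone; ReaderParseDemo, rootText, and the sample data are hypothetical, and it assumes the JDK ships the GBK charset.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class ReaderParseDemo {
    // Decode the bytes ourselves with an explicit charset, then hand the
    // parser characters instead of bytes.
    static String rootText(InputStream in, String encoding) throws Exception {
        InputSource source = new InputSource(new InputStreamReader(in, encoding));
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(source);
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical body with no XML declaration, encoded as GBK.
        // A byte-stream parse would assume UTF-8 for these bytes.
        byte[] body = "<msg>\u4e2d\u6587</msg>".getBytes(Charset.forName("GBK"));
        String text = rootText(new ByteArrayInputStream(body), "GBK");
        // True when the explicit charset decoded the text correctly.
        System.out.println("\u4e2d\u6587".equals(text));
    }
}
```

The trade-off is that the caller must know the real encoding: a wrong charset passed to InputStreamReader tends to produce silently garbled text rather than a parse error.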

------ For reference only ----------------------------------- ----


I tried this method and it does not work. I also read the input stream into a byte array and printed it out; the result is the same on Linux and Windows, so the problem must occur when SAXReader parses the XML.
In addition, this is how the SAXReader is obtained:
private SAXReader getSaxReader() {
    SAXReader saxReader = new SAXReader();
    saxReader.setEncoding("UTF-8");
    saxReader.setIgnoreComments(true);
    return saxReader;
}
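
One way to make the byte comparison described in this reply more precise is to dump the bytes as hex and check strictly whether they are well-formed UTF-8, since the error text in this thread means the parser met bytes that are not. This is only a sketch using JDK classes; Utf8Check, isValidUtf8, and toHex are names invented for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class Utf8Check {
    // Strict decode: report malformed input instead of silently replacing it.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Hex dump, so two machines can be compared byte for byte.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        byte[] utf8 = "\u4e2d\u6587".getBytes(StandardCharsets.UTF_8);
        // The same two characters as GBK bytes, written out explicitly.
        byte[] gbk = {(byte) 0xD6, (byte) 0xD0, (byte) 0xCE, (byte) 0xC4};
        System.out.println(toHex(utf8) + " valid UTF-8: " + isValidUtf8(utf8));
        System.out.println(toHex(gbk) + " valid UTF-8: " + isValidUtf8(gbk));
    }
}
```

Printing a decoded String can look fine on both machines while the raw bytes still differ from what the parser expects, which is why the check works on bytes rather than text.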
------ For reference only ------------------------- --------------

The problem has been solved. The code above is all fine; I had located the error in the wrong place from the start. It was not here but in another part that converts the XML, which I won't write up here. Still, thank you all very much ^_^. If I run into more problems I'll come and ask you again.
