If I call byte[] bytes = text.getBytes(); the string is converted to the byte array
[-80, -94, -54, -57]
Suppose I want to go through char instead: iterate over each character of the String and
convert it to bytes myself. How do I get the same array that text.getBytes() returns?
The encoding is GBK.
Example:
String text = "阿是";
byte[] bytes = text.getBytes(); // [-80, -94, -54, -57] when the default charset is GBK
byte[] abytes = new byte[text.length() * 2];
for (int i = 0; i < text.length(); i++) {
    char c = text.charAt(i);
    // Using c here, what do I have to do to turn the String into a byte[]?
    //System.out.println((byte)((c + 0xA0)) );
    //System.out.println((byte)(0x96));
    //System.out.println((byte)(0xA0 + (c >> 6)));
    //System.out.println((byte)(0xa0 + (c & 0x3F)));
}
Experts, please advise.
------ Solution ---------------------------------------- ----
What is the problem here?
Waiting for the original poster to answer.
If you want to know right away, you can open a thread and offer me 40 points.
There is no problem... You are mistaken, aren't you?
Well, what I said was not accurate; writing it this way is actually possible.
------ Solution ---------------------------------------- ----
Java's character set conversion API supports GBK; study how getBytes() works.
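To make that concrete, here is a minimal sketch of going through char while still letting the charset API do the work. The class name GbkPerCharDemo and the method encodePerChar are made up for illustration, and it assumes the runtime ships the GBK charset. Each char is encoded on its own and the results are concatenated; for text that GBK can represent, this should match encoding the whole string at once.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.util.Arrays;

public class GbkPerCharDemo {
    // Assumption: the JDK in use provides the GBK charset.
    private static final Charset GBK = Charset.forName("GBK");

    /** Encodes the string one char at a time and concatenates the results. */
    public static byte[] encodePerChar(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(text.length() * 2);
        for (int i = 0; i < text.length(); i++) {
            // A char that GBK cannot represent is replaced with '?' by getBytes.
            byte[] encoded = String.valueOf(text.charAt(i)).getBytes(GBK);
            out.write(encoded, 0, encoded.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String text = "\u963F\u662F"; // the two characters from the question
        byte[] whole = text.getBytes(GBK);
        byte[] perChar = encodePerChar(text);
        System.out.println(Arrays.toString(whole));        // [-80, -94, -54, -57]
        System.out.println(Arrays.equals(whole, perChar)); // true
    }
}
```

Passing the charset explicitly also avoids depending on the platform default, which is what the no-argument getBytes() uses.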
------ Solution ---------------------------------------- ----
I don't know why you need to go through char, but if you must, see whether this approach meets your requirements:
package study.string.length;
import java.io.UnsupportedEncodingException;
// Note: sun.io.* is an internal Sun package, not part of the public API,
// and it is absent from newer JDKs.
import sun.io.CharToByteConverter;
import sun.io.MalformedInputException;
public class StrLenght {
    public static void main(String[] args) throws UnsupportedEncodingException, MalformedInputException {
        String str = "a中";
        // Reference result: uses the platform default charset (GBK on the poster's system).
        byte[] chars = str.getBytes();
        for (int x = 0; x < chars.length; x++) {
            System.out.println(chars[x]);
        }
        print(str);
    }
    public static void print(String str) throws UnsupportedEncodingException, MalformedInputException {
        byte[] result = new byte[str.getBytes().length];
        int p = 0;
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            byte l = (byte) c;        // low byte of the char
            byte h = (byte) (c >> 8); // high byte of the char
            if (h == 0) {
                // High byte is zero: treated as a single-byte character and copied as-is.
                result[p++] = l;
            } else {
                // Otherwise let the GBK converter produce the two bytes.
                char[] cs = new char[1];
                cs[0] = c;
                CharToByteConverter converter = CharToByteConverter.getConverter("GBK");
                byte[] br = converter.convertAll(cs);
                result[p++] = br[0];
                result[p++] = br[1];
            }
        }
        for (int x = 0; x < result.length; x++) {
            System.out.println(result[x]);
        }
    }
}
------ Solution ------------------------------------- -------
If you have to process the text one character at a time, you can use CharBuffer.
But using String is still the easiest and generally will not go wrong.
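A rough sketch of the CharBuffer route, in case it helps: one reusable CharsetEncoder, one single-char CharBuffer, and one output ByteBuffer. The class and method names are invented for this example, and it assumes the GBK charset is available in the runtime.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.util.Arrays;

public class GbkCharBufferDemo {
    /** Encodes text as GBK, feeding the encoder one char at a time. */
    public static byte[] encode(String text) {
        CharsetEncoder encoder = Charset.forName("GBK").newEncoder();
        // Worst-case output size, so the output buffer never overflows.
        int capacity = (int) Math.ceil(text.length() * encoder.maxBytesPerChar());
        ByteBuffer out = ByteBuffer.allocate(capacity);
        CharBuffer one = CharBuffer.allocate(1);
        for (int i = 0; i < text.length(); i++) {
            one.clear();
            one.put(text.charAt(i));
            one.flip();
            encoder.reset();
            CoderResult result = encoder.encode(one, out, true);
            if (result.isError()) {
                // Unmappable or malformed input (for example a lone surrogate).
                throw new IllegalArgumentException("Cannot encode char at index " + i + " as GBK");
            }
            encoder.flush(out);
        }
        // Copy only the bytes actually written.
        return Arrays.copyOf(out.array(), out.position());
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(encode("\u963F\u662F"))); // [-80, -94, -54, -57]
    }
}
```

Unlike String.getBytes, this version reports a character that GBK cannot represent instead of silently writing '?'.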
------ For reference only -------------------------------------- -
A char is two bytes wide, so the result will be different.
------ For reference only ----------------- ----------------------
for (int i = 0; i < text.length(); i++) {
    char c = text.charAt(i);
If the text contains Chinese characters, text.length() may be a problem.
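For what it is worth, a small check of what length() actually counts. String.length() returns the number of UTF-16 code units, so an ordinary Chinese character in the Basic Multilingual Plane counts as one; only characters outside it (stored as surrogate pairs) count as two. The class and method names below are made up for this illustration.

```java
public class StringLengthDemo {
    /** Number of UTF-16 code units, which is what String.length() reports. */
    public static int codeUnits(String s) {
        return s.length();
    }

    /** Number of Unicode code points in the string. */
    public static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String chinese = "\u963F\u662F"; // the two characters from the question
        System.out.println(codeUnits(chinese));  // 2
        System.out.println(codePoints(chinese)); // 2

        // U+20000 lies outside the BMP and is stored as a surrogate pair.
        String supplementary = new String(Character.toChars(0x20000));
        System.out.println(codeUnits(supplementary));  // 2
        System.out.println(codePoints(supplementary)); // 1
    }
}
```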
------ For reference only -------------------------------------- -
It does handle Chinese; a char can represent a Chinese character. A Chinese character is converted into byte[] b = new byte[2], and a letter is converted into a single byte. What is the problem? Please enlighten me.
------ For reference only -------------------------------------- -
There is no problem,
but you cannot simply use new byte[2]; a byte and a char are different things.
------ For reference only -------------------------------------- -
I want to use bit operations for better performance. Otherwise I would just do it with the nio package.
For conversion to UTF-8, the following operations work for a character that needs three bytes:
(byte)(0xE0 + (chr >> 12));
(byte)(0x80 + ((chr >> 6) & 0x3F));
(byte)(0x80 + (chr & 0x3F));
In the code above, chr is a character.
If it is a single-byte character, then simply
(byte)chr;
is enough.
What confuses me now is how to write the bitwise version for GBK.
Experts, please advise.
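For reference, the UTF-8 bit operations above can be put together as in the sketch below. It covers single chars in the Basic Multilingual Plane only and does not handle surrogate pairs; the class name Utf8BitsDemo and the method toUtf8 are hypothetical. The two-byte branch is not in the post above; it is added here from the standard UTF-8 layout.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Utf8BitsDemo {
    /** Encodes one non-surrogate BMP char as UTF-8 using only bit operations. */
    public static byte[] toUtf8(char c) {
        if (c < 0x80) {
            // 1 byte: 0xxxxxxx
            return new byte[] {(byte) c};
        } else if (c < 0x800) {
            // 2 bytes: 110xxxxx 10xxxxxx
            return new byte[] {
                (byte) (0xC0 | (c >> 6)),
                (byte) (0x80 | (c & 0x3F))
            };
        } else {
            // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
            return new byte[] {
                (byte) (0xE0 | (c >> 12)),
                (byte) (0x80 | ((c >> 6) & 0x3F)),
                (byte) (0x80 | (c & 0x3F))
            };
        }
    }

    public static void main(String[] args) {
        char c = '\u963F';
        byte[] viaBits = toUtf8(c);
        byte[] viaApi = String.valueOf(c).getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(viaBits, viaApi)); // true
    }
}
```

This works because UTF-8 is defined as a fixed bit layout of the code point. GBK has no such layout, which is why the same trick does not carry over.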
------ For reference only -------------------------------------- -
There is a difference, which is why I want to convert the char into a byte[] with bit operations.
Please advise.
------ For reference only -------------------------------------- -
Yes, I have looked at that; in the end it calls a native method. I only work in Java.
------ For reference only -------------------------------------- -
That is not the problem.
Please point me in the right direction, thanks.
------ For reference only ---------------------------------------
This uses Sun's private classes and methods. Still, it is one way to do it.
------ For reference only -------------------------------------- -
Yes, and the source code is quite complicated; have a look when you have time.
------ For reference only ------------------------------------ ---
I looked at the sun.nio.cs.ext.GB18030.java source from the JDK's charsets.jar, as well as the sun.nio.cs.UTF_8.java source from rt.jar. UTF-8 can be produced directly with bit arithmetic because Java uses Unicode internally and UTF-8 is a rule-based encoding that maps directly from Unicode code points. GBK has no such regular relationship with Unicode, so bit operations generally cannot do the conversion. GB18030.java does it entirely through hand-built mapping tables.
I extracted the code from GB18030.java into a utility, but its performance was no better than using nio directly.
This is the nio way:
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.util.Arrays;
public class GBKCharUtils {
    public static final Charset charset = Charset.forName("GBK");
    // Encodes a single char as GBK.
    public static byte[] getBytes(char c) {
        CharBuffer charBuffer = CharBuffer.allocate(1);
        charBuffer.put(c);
        charBuffer.flip();
        return toArray(charset.encode(charBuffer));
    }
    // Encodes a char array as GBK.
    public static byte[] getBytes(char[] chars) {
        return toArray(charset.encode(CharBuffer.wrap(chars)));
    }
    // Copies only the encoded bytes; the buffer's backing array may be
    // larger than its content (for example for ASCII input).
    private static byte[] toArray(ByteBuffer byteBuffer) {
        byte[] result = new byte[byteBuffer.remaining()];
        byteBuffer.get(result);
        return result;
    }
    public static void main(String[] args) {
        System.out.println(Arrays.toString(getBytes('雷')));
        System.out.println(Arrays.toString(getBytes(new char[]{'雷'})));
    }
}
So just use nio directly.
Here is the approach extracted from GB18030.java:
public void encode(CharBuffer src, ByteBuffer dst) {
    //int hiByte = 0, loByte = 0;
    while (src.hasRemaining()) {
        char c = src.get();
        if (c >= 0x0000 && c <= 0x007F) {
            // ASCII: one byte, copied as-is.
            dst.put((byte)c);
        } else if (c <= 0xA4C6 || c >= 0xE000) {
            // Two-byte code looked up in the mapping tables.
            int outByteVal = getGB18030(encoderIndex1, encoderIndex2, c);
            //hiByte = (outByteVal & 0xFF00) >> 8;
            //loByte = outByteVal & 0xFF;
            dst.put((byte)((outByteVal & 0xFF00) >> 8));
            dst.put((byte)(outByteVal & 0xFF));
        }
        // Characters outside these ranges are skipped in this extract.
    }
}
Here getGB18030 performs the lookup that maps a Unicode character to its GBK code.
The JDK does not seem to ship the source for this; you can download the OpenJDK source to see it.
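To show the table-lookup idea without copying the JDK's index arrays, here is a hypothetical sketch that builds its own char-to-GBK table once, using the JDK's GBK charset, and then encodes by plain array lookup. It is not the GB18030.java code; GbkTableDemo, TABLE, and encode are names invented for this example, and it assumes the GBK charset is available.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.util.Arrays;

public class GbkTableDemo {
    // TABLE[c] is -1 if c cannot be encoded, a value <= 0xFF for a
    // single-byte code, or (hi << 8) | lo for a two-byte code.
    private static final int[] TABLE = buildTable();

    private static int[] buildTable() {
        Charset gbk = Charset.forName("GBK");
        CharsetEncoder encoder = gbk.newEncoder();
        int[] table = new int[0x10000];
        for (int i = 0; i < table.length; i++) {
            char ch = (char) i;
            if (Character.isSurrogate(ch) || !encoder.canEncode(ch)) {
                table[i] = -1;
                continue;
            }
            byte[] b = String.valueOf(ch).getBytes(gbk);
            table[i] = (b.length == 1)
                    ? (b[0] & 0xFF)
                    : ((b[0] & 0xFF) << 8) | (b[1] & 0xFF);
        }
        return table;
    }

    /** Encodes text as GBK using only the precomputed table. */
    public static byte[] encode(String text) {
        byte[] buf = new byte[text.length() * 2];
        int p = 0;
        for (int i = 0; i < text.length(); i++) {
            int v = TABLE[text.charAt(i)];
            if (v < 0) {
                throw new IllegalArgumentException("Not encodable in GBK at index " + i);
            }
            if (v > 0xFF) {
                buf[p++] = (byte) (v >> 8); // lead byte of a two-byte code
            }
            buf[p++] = (byte) v;            // single byte or trail byte
        }
        return Arrays.copyOf(buf, p);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(encode("\u963F\u662F"))); // [-80, -94, -54, -57]
    }
}
```

The table costs 65536 ints of memory, and whether the lookup actually beats the nio encoder would need to be measured.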