------ Solution ---------------------------------------- ----
Speech recognition is a specialized technique; at its core it is the analysis of a sound wave. If you want to implement it in Java, you first need a basic understanding of how Java handles sound input and output. Below is a simple record-and-play-back program. While recording, each chunk of data that is read is stored in a byte array buffer and then written to the output. Recognition, then, means extracting information from that array, so what is the structure of the array? This program records 16-bit, two-channel audio, so every two bytes together form one sample value, and that value is the loudness of the sound at that instant. Taking 2 bytes as one unit, and since there are two channels, the units come in groups of two: one for the left channel and one for the right, stored alternately.
import java.io.*;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.SourceDataLine;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.FloatControl;
import javax.sound.sampled.TargetDataLine;
public class RecordAndPlay {
volatile int divider;
public RecordAndPlay(){
Play();
}
public static void main(String[] args) {
new RecordAndPlay();
}
// Play audio: capture from the microphone and play it straight back
public void Play() {
try {
AudioFormat audioFormat =
// new AudioFormat(AudioFormat.Encoding.PCM_SIGNED, 44100F,
// 8, 1, 1, 44100F, false);
new AudioFormat(AudioFormat.Encoding.PCM_SIGNED,44100F, 16, 2, 4,
44100F, true);
DataLine.Info info = new DataLine.Info(TargetDataLine.class,
audioFormat);
TargetDataLine targetDataLine = (TargetDataLine) AudioSystem.getLine(info);
targetDataLine.open(audioFormat);
SourceDataLine sourceDataLine;
info = new DataLine.Info(SourceDataLine.class, audioFormat);
sourceDataLine = (SourceDataLine) AudioSystem.getLine(info);
sourceDataLine.open(audioFormat);
targetDataLine.start();
sourceDataLine.start();
FloatControl fc=(FloatControl)sourceDataLine.getControl(FloatControl.Type.MASTER_GAIN);
double value=0.2;
float dB = (float)(Math.log(value==0.0?0.0001:value)/Math.log(10.0)*20.0);
fc.setValue(dB);
int nByte = 0;
final int bufSize=1024;
byte[] buffer = new byte[bufSize];
while (nByte != -1) {
//System.in.read();
nByte = targetDataLine.read(buffer, 0, bufSize);
sourceDataLine.write(buffer, 0, nByte);
}
sourceDataLine.stop();
} catch (Exception e) {
e.printStackTrace();
}
}
}
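The byte layout described before the program can be sketched as follows. This is a minimal illustration, not part of the original program: it assumes the same format as the recorder above (16-bit signed PCM, two channels, big-endian, 4 bytes per frame), and the class and method names (StereoPcmDecoder, toSample, deinterleave) are made up for the example.

```java
import java.util.Arrays;

class StereoPcmDecoder {
    /** Combines a big-endian byte pair into one signed 16-bit sample. */
    static short toSample(byte high, byte low) {
        // The high byte keeps its sign; the low byte is masked so it is treated as unsigned.
        return (short) ((high << 8) | (low & 0xFF));
    }

    /** Splits interleaved 16-bit stereo frames (L, R, L, R, ...) into {left[], right[]}. */
    static short[][] deinterleave(byte[] buffer, int length) {
        int frames = length / 4; // 2 bytes per sample, 2 samples per frame
        short[] left = new short[frames];
        short[] right = new short[frames];
        for (int f = 0; f < frames; f++) {
            int i = f * 4;
            left[f] = toSample(buffer[i], buffer[i + 1]);
            right[f] = toSample(buffer[i + 2], buffer[i + 3]);
        }
        return new short[][] { left, right };
    }

    public static void main(String[] args) {
        // Two hand-made frames standing in for data read from the TargetDataLine.
        byte[] buffer = { 0x01, 0x00, (byte) 0xFF, (byte) 0xFF,
                          0x00, (byte) 0x80, 0x7F, (byte) 0xFF };
        short[][] channels = deinterleave(buffer, buffer.length);
        System.out.println("left  = " + Arrays.toString(channels[0])); // [256, 128]
        System.out.println("right = " + Arrays.toString(channels[1])); // [-1, 32767]
    }
}
```

In the recording loop, the same call could be applied to each chunk right after targetDataLine.read(...), passing the returned byte count as the length.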
------ Solution ------------------------------------- -------
Following on from the recording program I gave you last time, I wrote a program that records and displays the waveform. It works as follows:
1. Click "Start" to begin recording and displaying the waveform; click "Pause" to pause the recording.
2. While recording is paused, you can move the scroll bar to view earlier parts of the waveform. Click "Playback" to play the sound back while the waveform is displayed; playback starts from the sound corresponding to the leftmost point of the currently displayed waveform. (I ran into a problem here: the last bit of sound does not get played back. It is rather strange, and I have not found the reason.)
The program is as follows:
import java.awt.*;
import java.awt.event.*;
import java.awt.geom.Line2D;
import javax.swing.*;
import javax.swing.event.*;
import java.util.*;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.SourceDataLine;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.FloatControl;
import javax.sound.sampled.TargetDataLine;
/**
* 2011-6-23 0:53:40
* @author Administrator
*/
public class ShowWave {
JFrame frame;
JPanel pan;
final int panHeight=400;
JScrollBar timeLocationScrollBar;
int point[];
int number;
byte bufferAll[];
int bufferAllIndex;
int vRate,hRate;
JButton startButton,pauseButton;
JButton replayButton,stopReplayButton;
boolean continueRecorde;
boolean continueReplay;
JPanel centerPane,buttonPane;
JSlider hSlider;
boolean jsbActive;
public ShowWave(){
initData();
frame=new JFrame("Record and display waveform");
pan=new JPanel(){
public void paint(Graphics g){
g.setColor(Color.WHITE);
g.fillRect(0, 0, this.getWidth(), this.getHeight());
g.setColor(Color.red);
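int x[]=new int[number];
int y[]=new int[number];
for(int i=0;i<number;i++){
x[i]=i;
// Flip into screen coordinates in a local array; modifying point[] here
// would invert the waveform again on every repaint
y[i]=panHeight-point[i];
}
g.drawPolyline(x, y, number);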
// Graphics2D g2d=(Graphics2D)g;
// for(int i=0;i<number-1;i++){
// g2d.draw(new Line2D.Double(x[i], point[i], x[i+1], point[i+1]));
// }
g.setColor(Color.blue);
}
};
pan.setPreferredSize(new Dimension(600,panHeight));
timeLocationScrollBar=new JScrollBar();
timeLocationScrollBar.setOrientation(JScrollBar.HORIZONTAL);
timeLocationScrollBar.setMaximum(0);
timeLocationScrollBar.setMinimum(0);
timeLocationScrollBar.setValue(0);
timeLocationScrollBar.addAdjustmentListener(new AdjustmentListener() {
public void adjustmentValueChanged(AdjustmentEvent e) {
if(jsbActive==false){
return;
}
synchronized(bufferAll){
int beginIndex=timeLocationScrollBar.getValue();
beginIndex=beginIndex*2*hRate;
if(beginIndex==0){
number=bufferAllIndex/hRate/2;
if(number>600){
number=600;
}
}
else{
number=600;
}
for(int i=0;i<number;i++,beginIndex+=2*hRate){
int hBit=bufferAll[beginIndex];
// Mask the low byte so it is not sign-extended when widened to int
int lBit=bufferAll[beginIndex+1]&0xFF;
point[i]=hBit<<8|lBit;
point[i]/=vRate;
point[i]+=panHeight/2;
}
pan.repaint();
}
}
});
centerPane=new JPanel();
centerPane.setLayout(new BorderLayout());
centerPane.add(pan);
centerPane.add(timeLocationScrollBar,BorderLayout.SOUTH);
frame.getContentPane().add(centerPane);
startButton=new JButton("Start");
startButton.addActionListener(new ActionListener() {
public void actionPerformed(ActionEvent e) {
continueRecorde=true;
jsbActive=false;
}
});
pauseButton=new JButton("Pause");
pauseButton.addActionListener(new ActionListener() {
public void actionPerformed(ActionEvent e) {
continueRecorde=false;
jsbActive=true;
}
});
hSlider=new JSlider();
hSlider.setOrientation(JSlider.HORIZONTAL);
hSlider.setMaximum(100);
hSlider.setMinimum(1);
hSlider.setValue(hRate);
hSlider.addChangeListener(new ChangeListener() {
public void stateChanged(ChangeEvent e) {
hRate=hSlider.getValue();
int length=bufferAllIndex/hRate/2;
length-=600;
int value=(int)((double)timeLocationScrollBar.getValue()/timeLocationScrollBar.getMaximum()*length);
jsbActive=false;
timeLocationScrollBar.setMaximum(length);
jsbActive=true;
timeLocationScrollBar.setValue(value);
}
});
replayButton=new JButton("Playback");
replayButton.addActionListener(new ActionListener() {
public void actionPerformed(ActionEvent e) {
if(continueRecorde||continueReplay){
return;
}
new Thread(){
public void run(){
try{
continueReplay=true;
AudioFormat audioFormat =new AudioFormat(AudioFormat.Encoding.PCM_SIGNED,
44100F, 16, 1, 2,44100F, true);
DataLine.Info info = new DataLine.Info(SourceDataLine.class, audioFormat);
SourceDataLine sourceDataLine;
sourceDataLine = (SourceDataLine) AudioSystem.getLine(info);
sourceDataLine.open(audioFormat);
sourceDataLine.start();
FloatControl fc=(FloatControl)sourceDataLine.getControl(FloatControl.Type.MASTER_GAIN);
double value=2;
float dB = (float)(Math.log(value==0.0?0.0001:value)/Math.log(10.0)*20.0);
fc.setValue(dB);
int beginIndex=timeLocationScrollBar.getValue();
beginIndex=beginIndex*2*hRate;
int bufSize=1024;
byte buffer[]=new byte[bufSize];
while (beginIndex < bufferAllIndex && continueReplay) {
synchronized (bufferAll) {
int nByte = bufferAllIndex - beginIndex > bufSize ? bufSize : bufferAllIndex - beginIndex;
System.arraycopy(bufferAll, beginIndex, buffer, 0, nByte);
sourceDataLine.write(buffer, 0, nByte);
// System.out.println(beginIndex+" "+bufferAllIndex);
beginIndex += nByte;
if(beginIndex/2/hRate<=timeLocationScrollBar.getMaximum()){
timeLocationScrollBar.setValue(beginIndex/2/hRate);
}
}
}
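// drain() blocks until the queued audio has actually been played;
// flush() would discard it and cut off the end of the playback
sourceDataLine.drain();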
sourceDataLine.stop();
sourceDataLine.close();
continueReplay=false;
}catch(Exception ee){ee.printStackTrace();}
}
}.start();
}
});
stopReplayButton=new JButton("Stop playback");
stopReplayButton.addActionListener(new ActionListener(){
public void actionPerformed(ActionEvent e) {
continueReplay=false;
}
});
Box box=Box.createHorizontalBox();
box.add(Box.createHorizontalGlue());
box.add(startButton);
box.add(Box.createHorizontalStrut(10));
box.add(pauseButton);
box.add(Box.createHorizontalStrut(10));
box.add(hSlider);
box.add(Box.createHorizontalStrut(10));
box.add(replayButton);
box.add(Box.createHorizontalStrut(10));
box.add(stopReplayButton);
box.add(Box.createHorizontalGlue());
box.setBorder(BorderFactory.createEmptyBorder(10, 10, 10, 10));
frame.getContentPane().add(box,BorderLayout.SOUTH);
frame.pack();
frame.setLocationRelativeTo(null);
frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
frame.setVisible(true);
play();
}
------ For reference only ----------------------------------- ----
this is probably not that clear Xieqing of twelve . . .
------ For reference only -------------------------------------- -
Waiting for the original poster to get this working!
------ For reference only -------------------------------------- -
Good idea.
Keep up the effort.
------ For reference only -------------------------------- -------
Impressive. Waiting for the code.
------ For reference only ------------------------ ---------------
Bumping this for you! I hope you figure it out soon. When you do, tell us about your ideas and paste the code so we can all learn from it!
------ For reference only -------------------------------------- -
learn. .
------ For reference only -------------------------------------- -
me too
------ For reference only ------------------------------- --------
There is an open source Java project in this area; someone did this not long ago... I have done something similar in C++, but it was simpler.
------ For reference only ---------------------------------------
Bump.
Following.
------ For reference only ---------------------------------- -----
Everyone is just bumping the thread, with no expert help. Where have all the experts gone?
------ For reference only ---------------------------------- -----
Doing this in Java is more trouble.
------ For reference only ------------------------------------ ---
The original poster could look into the speech recognition software available on Linux.
------ For reference only ------------------------------ ---------
Good stuff.
Learned something.
------ For reference only -------------------------------- -------
I am here to learn first; I have never worked with this before and do not know where to start. Do you know how to use sphinx2-0.1? I saw online that it is something made specifically for speech in Java, but I downloaded it and do not know how to use it.
------ For reference only ---------------------------------- -----
Could the poster on the 13th floor explain in more detail? Thanks. I was a little confused reading it.
------ For reference only -------------------------------------- -
Good, worth following.
------ For reference only -------------------------------------- -
Bump. Here to learn first!
------ For reference only -------------------------------------- -
Bump, learning.
------ For reference only ------------------------------ ---------
Actually, I do not know that much about it either. I once wrote bare-metal programs that controlled sound output, so I understand the sampling rate, the sample bit depth, and mono versus stereo, and I have some understanding of the structure and meaning of the sampled data. These are, of course, the foundation for recognition: if you do not know what the data you read means, you cannot even begin. For the captured data, as in the program I posted, every two bytes form one unit whose value is the loudness of the sound at one instant, and a series of consecutive values forms the sequence of changing loudness. Digitally sampled values like these are discrete. For example, at a sampling rate of 44100, 44,100 samples are taken per second, one every 1/44100 s. If you plot these points along a time axis and connect each adjacent pair, what you see is an approximation of the waveform. That waveform is what you need to analyze.
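The sampling description above can be sketched in a few lines. This is only an illustration under stated assumptions: 16-bit signed, big-endian, mono PCM at 44100 Hz (the format used by the waveform program in this thread), with the class and method names (SampleTimeline, toSamples, timeOfSample, indexOfPeak) invented for the example.

```java
class SampleTimeline {
    /** Decodes 16-bit signed big-endian mono PCM: one sample per byte pair. */
    static short[] toSamples(byte[] buffer, int length) {
        short[] samples = new short[length / 2];
        for (int i = 0; i < samples.length; i++) {
            // High byte keeps its sign, low byte is masked to stay unsigned.
            samples[i] = (short) ((buffer[2 * i] << 8) | (buffer[2 * i + 1] & 0xFF));
        }
        return samples;
    }

    /** Time in seconds at which sample number index was taken. */
    static double timeOfSample(int index, float sampleRate) {
        return index / (double) sampleRate;
    }

    /** Index of the loudest sample (largest absolute value); 0 for an empty array. */
    static int indexOfPeak(short[] samples) {
        int peak = 0;
        for (int i = 1; i < samples.length; i++) {
            if (Math.abs(samples[i]) > Math.abs(samples[peak])) {
                peak = i;
            }
        }
        return peak;
    }

    public static void main(String[] args) {
        // Four hand-made samples: 0, 1000, -20000, 500
        byte[] buffer = { 0x00, 0x00, 0x03, (byte) 0xE8,
                          (byte) 0xB1, (byte) 0xE0, 0x01, (byte) 0xF4 };
        short[] samples = toSamples(buffer, buffer.length);
        int peak = indexOfPeak(samples);
        System.out.println("peak value " + samples[peak] + " at sample " + peak
                + ", t = " + timeOfSample(peak, 44100f) + " s");
    }
}
```

Each sample index maps to a point on the time axis, which is all that plotting the waveform requires; any analysis would start from the samples array.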
------ For reference only -------------------------------------- -
Could you give me some useful information?
------ For reference only ------------------- --------------------
The code is too long, so I had to post it in separate parts.
public void initData(){
point=new int[600];
Arrays.fill(point, 0);
number=600;
bufferAll=new byte[130*1024*1024];
bufferAllIndex=0;
vRate=120;
hRate=20;//1470
continueRecorde=false;
continueReplay=false;
jsbActive=true;
}
public void play() {
try {
AudioFormat audioFormat =
// new AudioFormat(AudioFormat.Encoding.PCM_SIGNED, 44100F,
// 8, 1, 1, 44100F, false);
new AudioFormat(AudioFormat.Encoding.PCM_SIGNED,44100F, 16, 1, 2,
44100F, true);
DataLine.Info info = new DataLine.Info(TargetDataLine.class,
audioFormat);
TargetDataLine targetDataLine = (TargetDataLine) AudioSystem.getLine(info);
targetDataLine.open(audioFormat);
SourceDataLine sourceDataLine;
info = new DataLine.Info(SourceDataLine.class, audioFormat);
sourceDataLine = (SourceDataLine) AudioSystem.getLine(info);
sourceDataLine.open(audioFormat);
targetDataLine.start();
sourceDataLine.start();
FloatControl fc=(FloatControl)sourceDataLine.getControl(FloatControl.Type.MASTER_GAIN);
double value=2;
float dB = (float)(Math.log(value==0.0?0.0001:value)/Math.log(10.0)*20.0);
fc.setValue(dB);
int nByte = 0;
final int bufSize=256;
byte[] buffer = new byte[bufSize];
new Thread(){
public void run(){
while(true){
if(!continueRecorde){
try{
Thread.sleep(50);
}catch(InterruptedException ie){}
continue;
}
synchronized(bufferAll){
if(600*hRate*2<bufferAllIndex){
int beginIndex=bufferAllIndex-600*hRate*2;
for(int i=0;i<600;i++,beginIndex+=2*hRate){
int hBit=bufferAll[beginIndex];
// Mask the low byte so it is not sign-extended when widened to int
int lBit=bufferAll[beginIndex+1]&0xFF;
point[i]=hBit<<8|lBit;
point[i]/=vRate;
point[i]+=panHeight/2;
}
number=600;
pan.repaint();
}
else{
int beginIndex=0;
number=bufferAllIndex/hRate/2;
for(int i=0;i<number;i++,beginIndex+=2*hRate){
int hBit=bufferAll[beginIndex];
// Mask the low byte so it is not sign-extended when widened to int
int lBit=bufferAll[beginIndex+1]&0xFF;
point[i]=hBit<<8|lBit;
point[i]/=vRate;
point[i]+=panHeight/2;
}
pan.repaint();
}
int length=bufferAllIndex/hRate/2;
if(length>600){
timeLocationScrollBar.setMaximum(length-600);
timeLocationScrollBar.setValue(length-600);
}
}
try{
Thread.sleep(10);
}catch(InterruptedException ie){}
}
}
}.start();
while (nByte != -1) {
if(!continueRecorde){
try{
Thread.sleep(50);
}catch(InterruptedException ie){}
continue;
}
synchronized(bufferAll){
nByte = targetDataLine.read(buffer, 0, bufSize);
System.arraycopy(buffer, 0, bufferAll, bufferAllIndex, nByte);
bufferAllIndex+=nByte;
sourceDataLine.write(buffer, 0, nByte);
}
// try{
// Thread.sleep(10);
// }catch(InterruptedException ie){}
}
sourceDataLine.stop();
} catch (Exception e) {
e.printStackTrace();
}
}
public static void main(String args[]){
new ShowWave();
}
}
Really unreasonable: three hundred lines already counts as too long.
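The drawing logic in the program above comes down to two steps: keep every hRate-th sample, then scale it by vRate and centre it in the panel. A compact sketch of just that mapping follows. It works on already-decoded samples rather than raw bytes, and the names WaveformScaler, toY and toPoints are hypothetical; the default values in main (hRate 20, vRate 120, height 400, width 600) are taken from the posted code.

```java
import java.util.Arrays;

class WaveformScaler {
    /** Maps one sample to a y pixel: scale down, centre, then flip (screen y grows downward). */
    static int toY(int sample, int vRate, int panHeight) {
        return panHeight - (sample / vRate + panHeight / 2);
    }

    /** Takes every hRate-th sample from firstSample on, at most width points. */
    static int[] toPoints(short[] samples, int firstSample, int hRate,
                          int vRate, int panHeight, int width) {
        // Number of indices firstSample + i * hRate that still fall inside the array.
        int available = (samples.length - firstSample + hRate - 1) / hRate;
        int n = Math.max(0, Math.min(width, available));
        int[] y = new int[n];
        for (int i = 0; i < n; i++) {
            y[i] = toY(samples[firstSample + i * hRate], vRate, panHeight);
        }
        return y;
    }

    public static void main(String[] args) {
        // One second of a synthetic 441 Hz tone standing in for recorded samples.
        short[] samples = new short[44100];
        for (int i = 0; i < samples.length; i++) {
            samples[i] = (short) (12000 * Math.sin(2 * Math.PI * 441 * i / 44100.0));
        }
        int[] y = toPoints(samples, 0, 20, 120, 400, 600);
        System.out.println(y.length + " points, first few: "
                + Arrays.toString(Arrays.copyOf(y, 5)));
    }
}
```

Computing the flipped y values into a fresh array, instead of updating a shared array inside the paint method, keeps repeated repaints from changing the data.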
------ For reference only ------------------------ ---------------
The poster above is impressive!
------ For reference only -------------------------------------- -
Just passing by. I hear Kai-fu Lee has worked on speech recognition.
------ For reference only --------------------- ------------------
Really impressive. Learning from this.
------ For reference only ------------ ---------------------------
This is too awesome; I do not understand it at all.
------ For reference only ---------------------------------------
For embedded work, C++ is still relatively good.
------ For reference only ------------------------------------- -
Oh, thank you very much... I am going to close the thread.
------ For reference only ------------------------------ ---------
I had a look; it is nicely written. May I ask where the output goes after recording? If it is convenient for you, please add me on QQ 313158469 so I can ask for some advice.
Thank you!
------ For reference only -------------------------------------- -
I am also following this. Extract speech features (samples) -> store them in a library -> (voice command) -> compare against the library (the actual recognition).
Does anyone have ideas on how to extract the speech features?
------ For reference only ---------------------------------- -----
Noise removal: treat a byte as noise (that is, not what we are saying) when its abs() is below 80, or some such value?
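One possible reading of the idea above is a simple amplitude threshold (a noise gate): anything quieter than the threshold is treated as silence. The sketch below is only that reading, not a method confirmed anywhere in this thread. The value 80 is the figure mentioned in the post; applying it to decoded 16-bit samples rather than raw bytes is an assumption, and the class name NoiseGate is made up for the example.

```java
import java.util.Arrays;

class NoiseGate {
    /** Returns a copy in which every sample with |value| below threshold is set to 0. */
    static short[] gate(short[] samples, int threshold) {
        short[] out = new short[samples.length];
        for (int i = 0; i < samples.length; i++) {
            // Math.abs works on the int-promoted value, so -32768 is handled correctly.
            out[i] = Math.abs(samples[i]) < threshold ? 0 : samples[i];
        }
        return out;
    }

    public static void main(String[] args) {
        short[] samples = { 10, -79, 80, -500, 0 };
        // Threshold 80 is the value floated in the post; a real value would need tuning.
        System.out.println(Arrays.toString(gate(samples, 80))); // [0, 0, 80, -500, 0]
    }
}
```

A fixed threshold like this only suppresses quiet background hiss; it does nothing about noise that is as loud as the speech itself.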
------ For reference only ---------------------------------------
Thank you very much, 22nd floor.
------ For reference only --------------------------------- ------
33rd floor, any progress? Please share it with us.
------ For reference only -------------------------------------- -
Good stuff, worth learning.
------ For reference only -------------------------- -------------
The code needs comments.
------ For reference only ----------------- ----------------------
Bump. I am working on a Java program for speech scoring; anyone who knows this area, please add me on QQ: 508651685
------ For reference only ---------------------------------------
I have tried sphinx4 and can get the demo running, but the recognition rate is very low. I think my signature file is not configured properly, but I do not know what a correct setup looks like. If anyone has done this, please give me some guidance.
------ For reference only ---------------------------------------
Hello, I would like to ask you a question.
"
While recording, each chunk of data that is read is stored in a byte array buffer and then written to the output. Recognition, then, means extracting information from that array, so what is the structure of the array? This program records 16-bit, two-channel audio, so every two bytes together form one sample value, and that value is the loudness of the sound at that instant. Taking 2 bytes as one unit, and since there are two channels, the units come in groups of two: one for the left channel and one for the right, stored alternately.
"
I read your posts in the forum but do not quite understand them.
I am currently writing a program to calculate heartbeat frequency. I can record the sound, but once I have the byte array buffer I do not know what the next step is. Any guidance would be appreciated, thank you.