ERROR-TOLERANCE IN DIGITAL SPEECH RECORDING SYSTEMS

by

Haiyang Zhu

A Thesis Presented to the
FACULTY OF THE VITERBI SCHOOL OF ENGINEERING
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
MASTER OF SCIENCE (ELECTRICAL ENGINEERING)

May 2006

Copyright 2006 Haiyang Zhu

UMI Number: 1437586. This microform edition is protected against unauthorized copying under Title 17, United States Code. Published by ProQuest Information and Learning Company, 300 North Zeeb Road, P.O. Box 1346, Ann Arbor, MI 48106-1346.

Dedication

This thesis is dedicated to my parents: Xingyin Zhu and Cuilan Wu. Their love for family continues to inspire me today.

Acknowledgements

I would like to thank Dr. Melvin A. Breuer for advising me through the course of my research. I am thankful to him for giving me the opportunity to enjoy and appreciate research. It was both an honor and a pleasure to be his student and to work with him. I would also like to thank the other members of my defense committee, Dr. Antonio Ortega and Dr. Shrikanth S. (Shri) Narayanan, for their insightful advice. Special thanks go to all the professors, graduate students and project assistants in the testing group for their support and inspiration. I would like to thank my parents for their endless love and encouragement.

Table of Contents

Dedication
Acknowledgements
List of Tables
List of Figures
Abstract
Chapter 1: Introduction
    1.1 Motivation
    1.2 Contributions
    1.3 Outline
Chapter 2: Architecture of a DTAD
Chapter 3: PESQ
    3.1 Perceptual Models for Quality Assessment
    3.2 How PESQ Works
    3.3 PESQ Outputs
    3.4 PESQ Implementation
    3.5 Summary
Chapter 4: Codec Algorithms
    4.1 FS1016 CELP
    4.2 G.723.1
    4.3 Summary
Chapter 5: Functional Testing Method
    5.1 Functional Testing Method
    5.2 Simulation Results
    5.3 Summary
Chapter 6: Bit Sensitivity
    6.1 Bit Sensitivity
    6.2 Simulation Results
    6.3 Summary
Chapter 7: Simplescore Test Methodology
    7.1 Simplescore Test Methodology
    7.2 Simulation Results
    7.3 Summary
Chapter 8: Improvement of Error Tolerance via Error Correcting Codes
    8.1 Error Correcting Codes
    8.2 Simulation Results
    8.3 Summary
Chapter 9: Conclusion
    9.1 Summary
    9.2 Future Research Directions
Bibliography
List of Tables

Table 3.1: Listening quality scale
Table 4.1: Bit allocation for the FS1016 CELP coder
Table 4.2: Bit allocation of the 5.3 kbps mode of the G.723.1 codec
Table 5.1: Average PESQ MOS and standard deviation over all training patterns using a fault-free flash memory
Table 5.2: Acceptable fault density of FS1016 CELP and G.723.1
Table 5.3: Yield comparison using traditional and error-tolerance methods
Table 7.1: Verification of the simplescore test methodology
Table 8.1: Acceptable fault density for Hamming codes (15,11,1), (31,26,1), (63,57,1) and (127,120,1)
Table 8.2: Correction probabilities for Hamming codes (15,11,1), (31,26,1), (63,57,1) and (127,120,1) under fault density 1%
Table 8.3: Acceptable fault density for BCH codes (127,113,2), (127,106,3) and (127,99,4)
Table 8.4: Acceptable fault density for BCH codes (63,51,2), (63,45,3) and (63,39,4)

List of Figures

Figure 2.1: Architecture of a DTAD
Figure 3.1: Use of PESQ
Figure 3.2: Structure of PESQ
Figure 4.1: Subjective speech quality of various codecs
Figure 4.2: Block diagram of the FS1016 CELP encoder
Figure 4.3: Block diagram of the FS1016 CELP decoder
Figure 4.4: Block diagram of the G.723.1 encoder
Figure 4.5: Block diagram of the G.723.1 decoder
Figure 5.1: Functional testing methodology
Figure 5.2: Improved functional testing methodology
Figure 5.3: Average test score over 50 fault distributions vs. fault density
Figure 5.4: Average PESQ MOS vs. number of errors in 8 seconds of speech, FS1016 CELP
Figure 5.5: Average PESQ MOS vs. number of errors in 8 seconds of speech, G.723.1
Figure 5.6: Standard deviation of PESQ MOS over 50 fault distributions vs. fault density, G.723.1
Figure 5.7: Acceptable percentage vs. fault density, G.723.1
Figure 5.8: Probability that a memory falls in a fault density level
Figure 6.1: Bit sensitivity of subframe 3 in one FS1016 CELP frame
Figure 6.2: Bit sensitivity in one FS1016 CELP frame
Figure 6.3: Bit sensitivity of the combination of adaptive and fixed gains in one G.723.1 frame
Figure 6.4: Bit sensitivity in one G.723.1 frame
Figure 7.1: Simplescore test methodology
Figure 7.2: Definition of simplescore
Figure 7.3: Range of simplescore
Figure 7.4: Direct percentage vs. dividing point position P
Figure 8.1: Functional testing methodology with error correcting coding
Figure 8.2: Average test score over 50 fault distributions vs. fault density for Hamming codes (15,11,1), (31,26,1), (63,57,1) and (127,120,1)
Figure 8.3: Correction probabilities vs. fault density for Hamming codes (15,11,1), (31,26,1), (63,57,1) and (127,120,1)
Figure 8.4: Average test score over 50 fault distributions vs. fault density (0%-1%) for BCH codes (63,51,2), (63,45,3) and (63,39,4)
Figure 8.5: Average test score over 50 fault distributions vs. fault density (1%-5%) for BCH codes (63,51,2), (63,45,3) and (63,39,4)
Figure 8.6: Average test score over 50 fault distributions vs. fault density (0%-1%) for BCH codes (127,113,2), (127,106,3) and (127,99,4)
Figure 8.7: Average test score over 50 fault distributions vs. fault density (1%-5%) for BCH codes (127,113,2), (127,106,3) and (127,99,4)
Figure 8.8: Acceptable fault density vs. number of added bits

Abstract

As VLSI scaling continues, process variations and defect densities are increasing, resulting in decreased yield. Digital systems that exhibit acceptable behavior even though they contain defects and generate output errors are said to be error-tolerant. Error-tolerance can be used to enhance effective yield, and hence reduce system costs. This thesis contributes new test methodologies to support error-tolerance related to flash memories found in digital speech recording systems. Test methods are used to determine whether or not a memory results in acceptable performance. First, a functional test method is described. Then, a simplescore test methodology combined with memory diagnosis is described. Third, error correcting codes are used to improve error-tolerance. These methods have been evaluated experimentally via simulation. The results clearly demonstrate that the proposed test methodologies are capable of identifying erroneous yet acceptable performance, and hence lead to enhanced yield.

Chapter 1: Introduction

1.1 Motivation

As VLSI scaling continues along its traditional path, process variations and defect densities are increasing, resulting in decreased yield. This delays the point at which a new process becomes economically viable. There are many multimedia applications where perfect functional operation is not required. A system implementing these applications that incorporates circuits with defects might produce acceptable results, since human perception masks small errors in sound and vision. These circuits are said to be error-tolerant [2]. Error-tolerance can be used to enhance effective yield, and hence reduce system costs. Digital speech recording systems, such as digital telephone answering devices (DTADs) and digital voice recorders, have error-tolerance attributes.
The output speech of a DTAD may have good quality even if it contains faulty circuitry. This thesis studies the error-tolerance attributes of a DTAD due to defects in its flash memory. Some of the issues addressed are:

1. How can the quality of the output speech of a DTAD with and without defects be evaluated to determine whether or not its performance is acceptable?
2. How can the error-tolerance attributes of faulty flash memories be tested?
3. How can the error-tolerance of the flash memory be improved?

1.2 Contributions

This thesis presents three distinct contributions to the field of error-tolerance testing:

1. A functional testing method is described for determining whether or not the flash memory used in a DTAD provides acceptable (error-tolerant) results, and the effective yield using functional testing is given.
2. A simplescore testing method is described that, combined with memory diagnosis, determines whether or not the flash memory used in a DTAD provides acceptable (error-tolerant) results.
3. A technique for improving the error-tolerance of a flash memory using error correcting codes is described.

1.3 Outline

The rest of this thesis is organized as follows. Chapter 2 describes the typical architecture of a DTAD. Chapter 3 presents the ITU-T PESQ standard, which is used to evaluate the quality of speech. Chapter 4 details two codec algorithms: FS1016 CELP and ITU-T G.723.1. In Chapter 5 a functional testing method is proposed and simulation results are presented. Chapter 6 deals with the bit sensitivity of FS1016 CELP and G.723.1. In Chapter 7 the simplescore testing method is described. Improvement in error tolerance using error correcting codes is presented in Chapter 8. Chapter 9 offers conclusions and describes some future research directions.

Chapter 2: Architecture of a DTAD

There are hundreds of different types of DTADs on the market. They have similar architectures, as shown in Figure 2.1.
The two main components of a DTAD are the microcontroller and the flash memory. The microcontroller is composed of an analog-to-digital converter (ADC), a digital-to-analog converter (DAC), and a codec that contains an encoder and a decoder. Two typical microcontroller chips are the Philips PCD6001 [15] and the Freescale DSP56853 [7].

[Figure 2.1: Architecture of a DTAD — the original speech passes through the microcontroller's ADC and encoder, the encoded bit-stream is stored in the flash memory, and on playback the decoder and DAC produce the output speech.]

The operation of a DTAD is as follows. A DTAD is usually programmed to take a call after a certain number of rings. An ADC samples and quantizes the caller's speech, the encoder in the codec encodes the original speech, and the output bit-stream of the encoder is stored in a flash memory. The size of the flash memory determines the amount of speech that can be recorded. The process of encoding is also called compression, since the encoded speech file is much smaller than the raw data that enters the encoder. When a user wants to listen to a recorded message, the microcontroller extracts the encoded speech stored in the flash memory, the decoder in the codec decodes it, and finally the output speech is produced.

Usually the output speech has some quality loss compared with the original speech due to sampling, quantization and lossy compression. Thus there is some quality degradation in the output speech even if there are no defects in the DTAD. One question of interest is: which defects, if any, in the flash memory of a DTAD result in acceptable performance?

Typically, the original speech is sampled at a rate of 8000 samples/sec and each sample is quantized to 8 bits by a low-resolution ADC. The digitized speech rate is therefore 64 kbps (kbits/sec), which means 1 second of speech is converted into 64 kbits. If each sample were quantized to 16 bits, the output speech quality would increase; however, high-resolution ADCs and DACs are much more expensive than low-resolution ones.

Many codec algorithms exist.
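The rate arithmetic above, together with the codec rates used later in this thesis (4.8 kbps for FS1016 CELP and 5.3 kbps for G.723.1), can be checked with a short sketch. The 512-kbit flash size in the last line is a made-up example for illustration, not a figure from the text:

```python
# Sanity-check sketch of the DTAD rate arithmetic described above.

SAMPLE_RATE_HZ = 8000    # samples per second
BITS_PER_SAMPLE = 8      # low-resolution ADC

def raw_rate_kbps() -> float:
    """Rate of the digitized, uncompressed speech in kbit/s."""
    return SAMPLE_RATE_HZ * BITS_PER_SAMPLE / 1000.0

def compression_rate(encoded_kbps: float) -> float:
    """Ratio of the original speech rate to the encoded rate."""
    return raw_rate_kbps() / encoded_kbps

def recordable_seconds(flash_kbits: float, encoded_kbps: float) -> float:
    """Seconds of encoded speech a flash memory of a given size can hold."""
    return flash_kbits / encoded_kbps

print(raw_rate_kbps())                    # 64.0 kbps, as stated above
print(round(compression_rate(4.8), 1))    # 13.3 for FS1016 CELP
print(round(compression_rate(5.3), 1))    # 12.1 for G.723.1
print(recordable_seconds(512, 4.8))       # hypothetical 512-kbit flash
```

The helper names and the 512-kbit size are assumptions for the sketch; the rates themselves are the ones quoted in the text.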
Chip producers either use standard algorithms or develop proprietary ones. Compression rate and output speech quality are two important criteria associated with a codec algorithm, and there is usually a trade-off between them. High-compression-rate algorithms allow more speech to be stored in the same flash memory, but may generate low-quality output speech. Codec algorithms with high output quality usually need more flash memory space. In this thesis two standardized algorithms are studied: FS1016 Code-Excited Linear Prediction (CELP) [14] and ITU-T G.723.1 [9]. These two algorithms compress the original 64 kbps speech to 4.8 kbps and 5.3 kbps, respectively. If compression rate is defined as the ratio of the original speech rate to the encoded rate, the compression rates for the two algorithms are 13.3 and 12.1, respectively. Chapter 4 briefly describes aspects of the FS1016 CELP and G.723.1 encoding systems.

Chapter 3: PESQ

In this chapter we briefly describe one standard method to evaluate the quality of output speech, referred to as ITU-T P.862 PESQ.

3.1 Perceptual Models for Quality Assessment

One traditional method for determining speech quality is to conduct subjective tests using human listeners. Extensive guidelines for subjective tests are given in ITU-T Recommendations P.800/P.830 [10, 11]. The results of these subjective tests are averaged to give a mean opinion score (MOS). Such tests are expensive and impractical for testing in the field. For this reason the automated perceptual evaluation of speech quality between the original speech and the output speech provided by ITU-T Recommendation P.862 PESQ [13] is very useful. P.862 allows researchers to automatically estimate the quality scores that would be given in a typical subjective test. This is done by making an intrusive test, as shown in Figure 3.1, and processing the output speech through PESQ.
[Figure 3.1: Use of PESQ — the original speech is passed through the system under test, and PESQ compares the original and output speech to produce a PESQ score.]

Modeling perception, specifically human auditory perception, is the core concept behind PESQ and its predecessors. Signal compression algorithms used in modern speech and audio codecs use perceptual information to decide which parts of a signal to code and which to discard. Simple measures like SNR do not give an accurate measure of the quality of these systems: perceptually masked coding noise at a typical SNR of 13 dB can be completely inaudible, whereas random noise at the same SNR is extremely disturbing. A perceptual model is used to correctly distinguish between audible and inaudible distortions, and this has proven to be the best way to accurately predict the audibility and annoyance of complex distortions.

We have selected Perceptual Evaluation of Speech Quality (PESQ), described in ITU-T Recommendation P.862, as the objective method for measuring the quality of the output speech of a faulty DTAD. PESQ is recommended for speech quality assessment of 3.1 kHz (narrow-band) handset telephony and narrow-band speech codecs [13]. According to Recommendation P.862, PESQ has demonstrated acceptable accuracy for CELP and hybrid codecs at rates of 4 kbps and above; it is therefore applicable to FS1016 CELP (4.8 kbps) and G.723.1 (5.3 kbps). PESQ performs better than PAMS [17], PSQM [1] and MNB [18], and replaced P.861 PSQM [12] in February 2001. The correlation achieved by PSQM with respect to the subjective MOS in these benchmarks was only 0.26, whereas an ideal model would have a correlation of 1. PESQ, for the same test, has a correlation of 0.935 for both known and unknown data, which means that PESQ gives reliable results. Therefore PESQ is suitable as the measure to evaluate the quality of output speech.

3.2 How PESQ Works

The operation of PESQ is described in [16].
PESQ compares the original speech with the output speech that is generated by passing the original speech through the system under test. The output of PESQ is a prediction of the perceived quality that would be given to the output speech by subjects in a subjective listening test. The structure of PESQ is shown in Figure 3.2.

[Figure 3.2: Structure of PESQ — both the original and degraded signals pass through level alignment, input filtering and an auditory transform; the signals are time-aligned and equalized, bad intervals are identified and re-aligned, and disturbance processing yields the prediction of perceived speech quality.]

The model includes the following stages.

1. Level alignment. To compare signals, the reference speech signal and the degraded signal are aligned to the same constant power level. This corresponds to the normal listening level used in subjective tests.

2. Input filtering. PESQ models and compensates for filtering that takes place in the telephone handset and in the network.

3. Time alignment. The system may include a delay, which may change several times during a test; for example, Voice over IP often has variable delay. PESQ uses a powerful technique, based on PAMS, to identify and account for delay changes.

4. Auditory transform. The reference and degraded signals are passed through an auditory transform that mimics key properties of human hearing. This transform removes those parts of the signal that are inaudible to the listener.

5. Disturbance processing. Disturbance parameters are calculated using non-linear averages over specific areas of the error surface:
   - the absolute (symmetric) disturbance: a measure of absolute audible error;
   - the additive (asymmetric) disturbance: a measure of audible errors that are much louder than the reference.

3.3 PESQ Outputs

P.800 [10] defines the Mean Opinion Score (MOS), namely a listening quality score between 1 and 5, as shown in Table 3.1. The primary output of PESQ is such a MOS.
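Stage 1 of the model (level alignment) can be illustrated with a simplified sketch: both signals are scaled to a common RMS power so that a pure level difference does not register as degradation. The actual P.862 procedure aligns to a standard listening level using band-limited power after filtering, so this is only an approximation; the target RMS value here is arbitrary:

```python
import math

def align_level(signal, target_rms=1000.0):
    """Scale a signal to a fixed RMS power (simplified level alignment)."""
    rms = math.sqrt(sum(x * x for x in signal) / len(signal))
    if rms == 0.0:
        return list(signal)          # silent signal: nothing to scale
    gain = target_rms / rms
    return [x * gain for x in signal]

# A quiet copy of a signal aligns to the same level as the original:
reference = align_level([100.0, -200.0, 300.0, -400.0])
degraded  = align_level([10.0, -20.0, 30.0, -40.0])
# After alignment the two signals are sample-for-sample identical here,
# since they differed only by an overall gain.
```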
The MOS directly expresses speech quality. The PESQ MOS as defined by ITU-T Recommendation P.862 ranges from 1.0 (worst) up to 4.5 (best). This may be surprising at first glance, since the ITU scale ranges up to 5.0, but the explanation is simple: PESQ simulates a listening test and is optimized to reproduce the average result over all listeners, and statistics show that the best average result one can generally expect from a listening test is not 5.0 but 4.5. Subjects are reluctant to score a 5, meaning "excellent", even if there is no degradation at all.

Table 3.1: Listening quality scale

    MOS   Quality of the speech
    5     Excellent
    4     Good
    3     Fair
    2     Poor
    1     Bad

3.4 PESQ Implementation

The implementation of PESQ is detailed in ITU-T Recommendation P.862, which provides an ANSI-C reference implementation. A conformance testing procedure is also specified to allow a user to validate that an alternative implementation of the model is correct. The Recommendation includes an electronic attachment containing the ANSI-C reference implementation of PESQ and conformance testing data. In this work, the conformance speech files are used as original speeches, and the attached implementation is used to generate the PESQ MOS measure.

3.5 Summary

In this chapter ITU-T P.862 PESQ was described: its functionality and structure, and its advantages over other evaluation methods. The output of PESQ is the MOS, ranging from 1.0 (worst) up to 4.5 (best). The reference implementation in ITU-T P.862 is used throughout this thesis. The PESQ MOS is a reliable and accurate measure for determining whether or not the quality of the output speech of a faulty DTAD is acceptable.

Chapter 4: Codec Algorithms

This chapter provides an overview of two codec algorithms: FS1016 CELP [14] and G.723.1 [9]. First, a rudimentary comparison of various codec schemes in terms of their speech quality and bit-rate is given.
The published mean opinion score (MOS) values of various codecs are shown in Figure 4.1 [6]. Observe in the figure that a range of speech codecs has emerged over the years. Many of these codecs attained the quality of the 64 kbps pulse code modulation (PCM) speech codec, though at the cost of significantly increased coding delay and implementation complexity. Two important attributes of a DTAD codec are compression rate and output speech quality. FS1016 CELP and G.723.1 are the codecs considered in this work.

[Figure 4.1: Subjective speech quality of various codecs]

4.1 FS1016 CELP

In 1984, the U.S. Department of Defense (DoD) initiated a program to develop a new secure voice communication system to supplement the existing FS1015 linear prediction coefficient (LPC) coder. Between 1988 and 1989, the 4.8 kbps CELP coder, jointly developed by the DoD and Bell Labs, was selected. This codec was later enhanced and standardized as Federal Standard 1016 (FS1016) [14].

The FS1016 codec uses a standard CELP structure, with both a fixed and an adaptive codebook producing the excitation to an all-pole synthesis filter. A frame length of 30 ms is used, and each frame is split into four 7.5 ms subframes. The filter coefficients for a tenth-order all-pole synthesis filter are determined for each frame using forward-adaptive LPC analysis, then converted to line spectral frequencies (LSFs) and scalar quantized with 34 bits. The codebook gains are scalar quantized with 5 bits each. In odd subframes the adaptive codebook index is coded with 8 bits, but in even subframes it is encoded with 6 bits. One bit per frame is used for synchronization, and 4 bits per frame provide error correction for the most sensitive bits. Finally, 1 bit per frame is allocated for future expansion. One frame contains 144 bits. Table 4.1 summarizes the bit allocation scheme of the FS1016 CELP codec.
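As a sanity check, the per-frame allocations just described sum to the 144-bit frame and reproduce the 4.8 kbps rate (a verification sketch, not part of the codec):

```python
# Bits per frame in FS1016 CELP, from the allocation described above.
lsf_bits = 3 + 4 + 4 + 4 + 4 + 3 + 3 + 3 + 3 + 3   # 34 bits for 10 LSFs
pitch_bits = 8 + 6 + 8 + 6                          # adaptive codebook index
adaptive_gain_bits = 4 * 5                          # 5 bits per subframe
stochastic_index_bits = 4 * 9
stochastic_gain_bits = 4 * 5
overhead_bits = 1 + 4 + 1        # future expansion, error correction, sync

frame_bits = (lsf_bits + pitch_bits + adaptive_gain_bits +
              stochastic_index_bits + stochastic_gain_bits + overhead_bits)
print(frame_bits)                # 144 bits per frame
print(frame_bits / 0.030)        # 4800.0 bits/s = 4.8 kbps at 30 ms per frame
```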
Table 4.1: Bit allocation for the FS1016 CELP coder

    Parameter                         Number  Resolution           Bit numbers per frame
    LPC coefficients                  10      3,4,4,4,4,3,3,3,3,3  1-34
    Pitch period (adaptive cb index)  4       8,6,8,6              35-42, 62-67, 87-94, 114-119
    Adaptive codebook gain            4       5                    43-47, 68-72, 95-99, 120-124
    Stochastic codebook index         4       9                    48-56, 73-81, 100-108, 125-133
    Stochastic codebook gain          4       5                    57-61, 82-86, 109-113, 134-138
    Future expansion                  1       1                    139
    Error correction                  4       1                    140-143
    Synchronization                   1       1                    144
    Total                                                          144

Figure 4.2 and Figure 4.3 show the block diagrams of the FS1016 CELP encoder and decoder, respectively [4].

[Figure 4.2: Block diagram of the FS1016 CELP encoder — frame/subframe segmentation, LP analysis, perceptual weighting, adaptive and stochastic codebook searches, gain and LPC encoding, and packing into the CELP bit-stream.]

The FS1016 CELP algorithm uses several techniques to improve its performance with respect to bit transmission errors. A Hamming (15,11,1) error correction code is used to protect the 11 most sensitive bits of each frame. This, together with careful assignment of binary indices to codebook indices and the use of adaptive smoothers at the decoder, yields a codec that is reasonably resilient to channel errors. The adaptive smoothers operate on both the fixed and adaptive codebook gains, as well as the adaptive codebook index, when the decoding of the (15,11,1) Hamming code indicates error-free conditions. Finally, if an error in the 34 bits representing the quantized LSFs causes adjacent LSFs to overlap, it can be detected by the decoder and action is taken to mitigate it.
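The single-error-correcting behavior of a (15,11,1) Hamming code can be sketched as follows. This is the generic textbook construction with parity bits at the power-of-two positions, not the exact bit ordering used by the FS1016 standard:

```python
def hamming15_11_encode(data_bits):
    """Encode 11 data bits into a 15-bit Hamming codeword.
    Parity bits sit at positions 1, 2, 4, 8; each covers the positions
    whose binary index has the corresponding bit set."""
    assert len(data_bits) == 11
    code = [0] * 16                      # positions 1..15; index 0 unused
    data = iter(data_bits)
    for pos in range(1, 16):
        if pos not in (1, 2, 4, 8):      # non-power-of-two positions carry data
            code[pos] = next(data)
    for p in (1, 2, 4, 8):
        code[p] = sum(code[i] for i in range(1, 16) if i & p) % 2
    return code[1:]

def hamming15_11_correct(code_bits):
    """Correct up to one bit error; the syndrome is the error position."""
    code = [0] + list(code_bits)
    syndrome = 0
    for p in (1, 2, 4, 8):
        if sum(code[i] for i in range(1, 16) if i & p) % 2:
            syndrome |= p
    if syndrome:
        code[syndrome] ^= 1
    return code[1:]

# Any single bit flip (a stuck-at fault that corrupts a stored bit,
# say) is repaired exactly:
word = hamming15_11_encode([1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0])
corrupted = list(word)
corrupted[6] ^= 1
assert hamming15_11_correct(corrupted) == word
```

This illustrates why the code protects only against a single error per 15-bit block; two faults landing in the same block defeat it, which is the effect studied later with the higher-rate Hamming and BCH codes of Chapter 8.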
The combination of the methods described above allows the FS1016 CELP to provide reasonable speech quality at bit error rates as high as 1% [8]. This means that the defect density, measured in terms of bits of flash memory, can be as high as 1%. In our analysis we use the CELP Source 3.3 Code Package [3] to simulate the operation of FS1016 CELP.

[Figure 4.3: Block diagram of the FS1016 CELP decoder — the unpacked indices drive the adaptive and stochastic codebooks and gains; the excitation passes through the formant synthesis filter and a postfilter to produce the synthesized speech.]

4.2 G.723.1

The ITU-T G.723.1 [9] dual-rate codec is part of the H.324 multimedia compression and transmission standard. This codec has two bit rates associated with it, namely 5.3 and 6.3 kbps. The higher bit rate gives greater quality; the lower bit rate also gives good quality and provides system designers with additional flexibility. In this thesis only the 5.3 kbps rate is considered. This codec encodes speech or other audio signals in frames using linear predictive analysis-by-synthesis coding. The excitation signal for the high-rate coder is Multipulse Maximum Likelihood Quantization (MP-MLQ), and for the low-rate coder it is Algebraic-Code-Excited Linear-Prediction (ACELP). The frame size is 30 ms and there is an additional look-ahead of 7.5 ms, resulting in a total algorithmic delay of 37.5 ms. One frame contains 158 bits. Table 4.2 summarizes the bit allocation scheme of the G.723.1 codec. Unlike FS1016 CELP, G.723.1 does not include an error correction code.
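The quoted 5.3 kbps rate and 37.5 ms delay follow directly from the frame parameters above (a quick arithmetic check):

```python
FRAME_BITS = 158          # bits per frame in the 5.3 kbps mode
FRAME_S = 0.030           # 30 ms frame duration
LOOKAHEAD_MS = 7.5        # additional look-ahead

bit_rate_kbps = FRAME_BITS / FRAME_S / 1000.0
print(round(bit_rate_kbps, 2))        # 5.27, quoted as 5.3 kbps
print(30.0 + LOOKAHEAD_MS)            # 37.5 ms total algorithmic delay
```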
Table 4.2: Bit allocation of the 5.3 kbps mode of the G.723.1 codec

    Parameter                    Subframe 1  Subframe 2  Subframe 3  Subframe 4  Total
    LPC indices                                                                  24
    Adaptive codebook lag        7           2           7           2           18
    Excitation and pitch gains   12          12          12          12          48
      combined
    Pulse positions              12          12          12          12          48
    Pulse signs                  4           4           4           4           16
    Grid index                   1           1           1           1           4
    Total                                                                        158

Figure 4.4 and Figure 4.5 show the block diagrams of the G.723.1 encoder and decoder, respectively [9]. The ANSI-C reference implementation provided with ITU-T Recommendation G.723.1 is used in our simulation of this DTAD codec.

[Figure 4.4: Block diagram of the G.723.1 encoder — high-pass filtering, LPC analysis, LSP quantization, formant perceptual weighting, pitch estimation, harmonic noise shaping, and MP-MLQ/ACELP excitation coding with a local decoder.]

4.3 Summary

This chapter presented an overview of two codec algorithms, FS1016 CELP and G.723.1. Block diagrams of their encoders and decoders and their bit allocations were given. These two codec algorithms are used in our simulation of the DTAD codec.

[Figure 4.5: Block diagram of the G.723.1 decoder — the decoded excitation and pitch information pass through a pitch postfilter, the synthesis filter, a formant postfilter and a gain scaling unit.]

Chapter 5: Functional Testing Method

This chapter presents the functional testing method for a DTAD system containing a faulty flash memory. In this work the codec of the DTAD, including encoder and decoder, is considered to be fault-free. The proposed functional testing method is not based on a structural model of the system under test, the DTAD, but attempts to exercise the functions of the system. The quality of the output speech of the DTAD determines whether the faulty flash memory is acceptable or not.
5.1 Functional Testing Method

Figure 5.1 shows the functional testing methodology, which is composed of four components:

1. the test pattern memory, where training patterns of several typical male and female speeches are stored;
2. the flash memory under test;
3. the codec, including the same encoder and decoder;
4. the MOS predictor, which evaluates the speech quality between the training patterns and the output speeches using PESQ and generates the test score.

[Figure 5.1: Functional testing methodology — test patterns (original speeches) are encoded, the encoded bit-stream passes through the flash memory under test, the decoder produces output speeches, and the MOS predictor (PESQ) generates the test score.]

Two types of codec implementation exist, namely hardware and software. In a hardware implementation, training patterns and output speeches are stored in the test pattern memory. The encoder, decoder and MOS predictor can be implemented using an ASIC, FPGA or DSP chip. A DSP is programmable, and changing the program can make the test board applicable to other codec algorithms. In a software implementation, the codec and MOS predictor are implemented using PC software and only the socket for the flash memory is needed. Training patterns and output speeches are stored in the computer. A software implementation is more flexible and less costly than hardware; however, a hardware implementation executes much faster.

An improved functional testing method is shown in Figure 5.2. Before the test is performed, the training patterns are encoded and the encoded bit-streams are stored in the test pattern memory. When the test is performed, the encoded bit-streams are copied to the flash memory under test. The encoder thus generates the encoded bit-streams only once, before testing. This scheme simplifies the test process and reduces test time.
[Figure 5.2: Improved functional testing methodology — the stored encoded bit-stream is copied to the flash memory under test, decoded into output speeches, and scored by the MOS predictor (PESQ) against the test patterns (original speeches).]

The functional testing procedure is as follows.

1. Encode the training patterns and store the patterns and the encoded bit-streams in the test pattern memory.
2. Using a fault-free flash memory, obtain the reference test score.
3. Using a faulty flash memory, obtain a PESQ MOS for each training pattern. The test score of a flash memory is the average or worst-case PESQ MOS over all training patterns.
4. Set the acceptance threshold T. If the test score of a given flash memory is greater than T, the memory is acceptable; otherwise, it is unacceptable.

5.2 Simulation Results

In this section we describe the results of a simulation-based implementation of our improved testing method. The flash memory is simulated using a file containing the encoded bit-stream generated by the encoder. Faults in the flash memory are implemented by changing the corresponding bit values in the file. The specifications of the functional testing are as follows.

1. Training patterns. The speech files provided for conformance validation of ITU-T Recommendation P.862 PESQ are used as training patterns. There are 26 training patterns, half from male and half from female speakers. The average duration of the training patterns is 7.9015 s. Each sample of speech is represented by 8 bits. The sampling rate is 8000 samples/sec, resulting in a speech rate of 64 kbps.

2. Flash memory. The size of the simulated flash memory is 100 kbits. We employ a multiple stuck-at fault model. The chance that a fault is stuck-at-1 or stuck-at-0 is 50/50. Faults are randomly allocated through the memory based upon a uniform distribution. The probability that a fault results in an error is assumed to be 0.5. The fault density is defined as the ratio between the number of faults and the size of the flash memory.
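The fault model just described can be sketched as follows. This is an illustrative reimplementation, not the thesis's actual simulation code; the stream contents and random seed are arbitrary:

```python
import random

def inject_stuck_at_faults(bits, fault_density, seed=None):
    """Multiple stuck-at fault model for a flash bit image.
    Fault sites are drawn uniformly without replacement; each fault is
    stuck-at-0 or stuck-at-1 with probability 1/2, so it corrupts the
    stored bit only when the stuck value differs (probability ~0.5)."""
    rng = random.Random(seed)
    n_faults = round(fault_density * len(bits))
    faulty = list(bits)
    for pos in rng.sample(range(len(bits)), n_faults):
        faulty[pos] = rng.randint(0, 1)   # stuck value overrides stored bit
    return faulty

# 100-kbit flash image at 0.5% fault density (500 fault sites):
rng0 = random.Random(0)
stream = [rng0.randint(0, 1) for _ in range(100_000)]
faulty = inject_stuck_at_faults(stream, fault_density=0.005, seed=1)
errors = sum(a != b for a, b in zip(stream, faulty))
# Roughly half of the 500 faults flip a stored bit and become errors.
```

In the real flow the faulty image would then be decoded and scored with PESQ; here only the fault-injection step is shown.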
Twenty different fault densities between 0% and 1% are simulated. For each fault density, 50 different random distributions of faults are considered. The same faulty flash memory model is used for both the FS1016 CELP and G.723.1 codec algorithms.

3. Codec. Two codec algorithms are simulated. CELP Source 3.3 Code Package [3] is used to implement a 4.8 kbps FS1016 CELP codec, including both encoder and decoder. One bug in this code package was fixed. The ANSI-C reference implementation provided with ITU-T Recommendation G.723.1 is used in the simulation of the DTAD codec. Both algorithms have a frame time of 30 msec. The bit rates are 4.8 kbps and 5.3 kbps, equivalent to 144 and 158 bits/frame, respectively. The average sizes of the encoded bit-stream files for these two codecs are 37.927 and 41.615 kbits.

4. MOS predictor. The C language reference implementation [13] of the electronic attachment of Recommendation P.862 is used as the MOS predictor. Its output is the PESQ MOS.

The simulation process and results are described next.

1. Encode the training patterns and store the encoded bit-streams for both codec algorithms.

2. Obtain reference test scores for the 26 training speeches assuming a fault-free flash memory. The average PESQ MOS (the reference test score) and the standard deviation over all training patterns are shown in Table 5.1.

    Algorithm      Average PESQ MOS   Standard deviation
    FS1016 CELP    3.081              0.1843
    G.723.1        3.610              0.1593

Table 5.1: Average PESQ MOS and standard deviation over all training patterns using a fault-free flash memory

Since the standard deviation is small, these two codec algorithms appear to generate fairly uniform MOS across the various training patterns. A reference test score is defined as the average PESQ MOS over all training patterns using a fault-free flash memory.
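The stuck-at fault model used in these simulations (item 2 of the specifications above) can be sketched as follows. This is a simplified stand-in for the thesis simulator, with made-up function names: faults are placed uniformly at random over distinct cells, each fault is stuck-at-0 or stuck-at-1 with probability 1/2, and a fault produces an error only when the stored bit disagrees with the stuck value, which is why a fault causes an error with probability 0.5.

```python
import random

def inject_stuck_at_faults(bits, fault_density, rng):
    """Corrupt a list of 0/1 values with multiple stuck-at faults.
    fault_density = (number of faults) / (memory size in bits)."""
    n_faults = round(fault_density * len(bits))
    corrupted = list(bits)
    # Faults fall on distinct, uniformly chosen cells.
    for pos in rng.sample(range(len(bits)), n_faults):
        stuck_value = rng.randint(0, 1)   # stuck-at-0 or stuck-at-1, 50/50
        corrupted[pos] = stuck_value      # an error occurs only if it differs
    return corrupted

rng = random.Random(0)
stream = [rng.randint(0, 1) for _ in range(100_000)]   # 100 kbit memory
faulty = inject_stuck_at_faults(stream, 0.005, rng)    # 0.5% fault density
errors = sum(a != b for a, b in zip(stream, faulty))
# Roughly half of the 500 injected faults produce errors.
print(errors)
```

Because the stored bit already equals the stuck value half the time, the observed error count hovers around half the fault count, matching the 0.5 error probability assumed in the text.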
Faults in a flash memory degrade the speech quality, so the reference test score is usually the highest test score for any given flash memory.

3. Test faulty flash memories and obtain test scores. The test score of a given flash memory is the average PESQ MOS over all training patterns. The average test score for a given fault density is obtained as follows:

• average over all 26 training speeches;

• average over all 50 fault distributions for this fault density.

The relation between the average test score and the fault density is shown in Figure 5.3. As expected, the average test score decreases as the fault density increases.

Figure 5.3: Average test score over 50 fault distributions vs. fault density (G.723.1 and FS1016 CELP, with the threshold T = 3 and the acceptable fault density marked)

We define the acceptable fault density to be the fault density at which the average test score equals the acceptance threshold T = 3 (Fair). The acceptable fault densities of the two codec algorithms are shown in Table 5.2.

    Algorithm      Acceptable fault density
    FS1016 CELP    0.11%
    G.723.1        0.20%

Table 5.2: Acceptable fault density of FS1016 CELP and G.723.1

Since a fault may not cause an error, the relation between the test score and the number of errors in 8 seconds of speech is also of interest; it is shown in Figure 5.4 and Figure 5.5.

Figure 5.4: Average PESQ MOS vs. number of errors in 8 secs of speech, FS1016 CELP

The reference test score for FS1016 CELP is close to the threshold T = 3. Therefore there is not much room for error tolerance. In the following simulations, the focus will be on G.723.1.
The test score for G.723.1 decreases faster than that for FS1016 CELP as the fault density increases. Chapter 8 will describe how to improve error tolerance for G.723.1 using error correcting codes.

Figure 5.5: Average PESQ MOS vs. number of errors in 8 secs of speech, G.723.1

In Figure 5.5 there is a range of error counts over which the average PESQ MOS reaches the threshold. We pick the middle value, 50 errors. Since the probability that a fault results in an error is 0.5, 50 errors corresponds to a fault density of (2 × 50) / 41615 = 0.24%, where 41.615 kbits is the size of the encoded bit-stream file. This fault density is approximately equal to that obtained from Figure 5.3.

We next focus on one training speech, namely "or105.wav". The standard deviation of the PESQ MOS over all 50 fault distributions as a function of fault density is shown in Figure 5.6. As the fault density increases, the standard deviation initially increases to a peak and then decreases. The explanation for this is based on bit sensitivity, which will be described in Chapter 6. When the fault density is low, the PESQ MOS scores for the 50 fault distributions are all relatively high and the standard deviation is low. With an increase in the fault density, some fault distributions result in high scores and some result in low scores due to bit sensitivity, so the standard deviation increases. When the fault density is high, the scores for all fault distributions are low and the standard deviation decreases again.

Figure 5.6: Standard deviation of PESQ MOS over 50 fault distributions vs. fault density, G.723.1

4.
Determine the threshold value T. According to the definition of MOS, "3" represents "fair" speech quality. Therefore the threshold value T is set to 3 in our simulations.

The acceptable percentage for a given fault density is defined as the percentage of faulty flash memories whose test scores are greater than the threshold T. The acceptable percentage is shown in Figure 5.7. Clearly the acceptable percentage decreases as the fault density increases. At the acceptable fault density of 0.2%, the acceptable percentage is about 53%.

Figure 5.7: Acceptable percentage vs. fault density with threshold T = 3, G.723.1

From the acceptable percentage we can obtain the effective yield using the functional testing method. Assume that the fault density levels are f_i, 0 ≤ i ≤ L, with f_i < f_{i+1}, where f_L is the largest fault density; that the probability that a memory falls in fault density level f_i is p_i; and that the acceptable percentage for fault density level f_i is A_i. The p_i can be obtained from the memory manufacturer, and the A_i are obtained from the above simulation. f_0 = 0 represents the fault-free level, so A_0 = 1.

If the traditional testing method is used, only fault-free memories are acceptable, and the total yield Y is

    Y = p_0 A_0 = p_0.

However, using the error-tolerance functional testing method, the effective yield is

    Y = Σ_{i=0}^{L} p_i A_i.

Therefore, the yield is increased by

    ΔY = Σ_{i=1}^{L} p_i A_i.

For example, given the probability that a memory has fault density level f_i shown in Figure 5.8, the resulting yields for the traditional and error-tolerance methods are listed in Table 5.3. The effective yield is increased substantially.
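As a worked sketch of the yield formulas, with made-up p_i and A_i values (not the distribution of Figure 5.8):

```python
def effective_yield(p, A):
    """Y = sum_i p_i * A_i; the traditional yield is just p_0 (A_0 = 1)."""
    return sum(pi * ai for pi, ai in zip(p, A))

# Hypothetical distribution over fault density levels f_0 = 0 < f_1 < f_2,
# and hypothetical acceptable percentages from a functional-testing run.
p = [0.40, 0.35, 0.25]   # probability a memory falls in each level
A = [1.00, 0.90, 0.50]   # acceptable fraction at each level

traditional = p[0]                      # only fault-free parts pass: 0.40
error_tolerant = effective_yield(p, A)  # 0.40 + 0.315 + 0.125 = 0.84
gain = error_tolerant - traditional     # the Delta-Y of the text: 0.44
print(traditional, error_tolerant, gain)
```

Even with most faulty memories at the higher fault level rejected, more than half of the otherwise-discarded parts are recovered in this toy example, which is the effect Table 5.3 reports for the simulated distribution.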
    Testing method    Yield
    Traditional       36.6%
    Error-tolerance   98.9%

Table 5.3: Yield comparison using traditional and error-tolerance methods

Figure 5.8: Probability that a memory falls in a fault density level

5.3 Summary

In this chapter a functional testing method was described and simulation results presented. The acceptable fault densities for FS1016 CELP and G.723.1 are 0.11% and 0.2%, respectively. The reference test score for FS1016 CELP is close to the threshold T = 3, so there is not much room for error tolerance. The standard deviation of the PESQ MOS over 50 fault distributions was given. We showed the relation between the acceptable percentage and the fault density, from which the effective yield is obtained. Using an example, we showed that the error-tolerance method increases the effective yield substantially. In the following chapters, another test methodology, called simple score, will be described.

Chapter 6 Bit Sensitivity

6.1 Bit Sensitivity

The encoder converts frames of sampled speech to an encoded bit-stream that represents speech parameters. Faults in the flash memory corrupt these parameters, thus degrading the quality of the decoded speech. Each bit's contribution to the decoded speech quality is different. Thus, the number of faults and the actual distribution of those faults in a flash memory are two factors that affect the degree of degradation of the output speech quality. Usually more faults cause more degradation, but flash memories with the same number of faults and different fault distributions may produce different output speech quality. Analyzing bit sensitivity gives a better understanding of how the fault distribution affects speech quality.

The encoded bit-stream is organized frame by frame, and a specified bit position in different frames represents the same parameter.
Therefore the sensitivity of each bit position within one frame, rather than of every bit in the stream, is studied; that is, only 144 bit positions for FS1016 CELP and 158 bit positions for G.723.1 need to be considered.

6.2 Simulation Results

Some bits are much more sensitive to errors than others and should be better protected. However, it is not obvious how the sensitivity of different bits should be measured. One commonly used approach [5] is, for a given bit, to invert that bit in every frame and measure the resulting segmental SNR degradation. In this work the functional testing method proposed in Chapter 5 is executed, and the average test score is used to indicate the bit sensitivity instead of the segmental SNR degradation.

First the training patterns are encoded into bit-streams. Then, the binary value of a given bit position in each frame of the encoded bit-stream is inverted. For example, to analyze the sensitivity of the 25th bit, the value of the 25th bit in every frame is inverted: a "1" is changed to a "0", and a "0" is changed to a "1". The encoded bit-stream with these inverted bits is said to be corrupted. Bit sensitivity is indicated by the test score of the output speech decoded from the corrupted encoded bit-stream. The test score is the average PESQ MOS over all training patterns, generated by PESQ as described in Chapter 3. The higher the test score of a given bit, the less sensitive it is.

Fig. 6.1 shows the sensitivity of each bit in subframe 3 of an FS1016 CELP encoded frame. Fig. 6.2 shows the sensitivity of each bit in an FS1016 CELP encoded frame, as indicated by the test score. These two figures are arranged by CELP parameter. The bits for each parameter are ordered left to right from least significant bit (LSB) to most significant bit (MSB). The following conclusions can be drawn from these results.
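The bit-inversion step of this measurement can be sketched as follows; the frame layout and function name are illustrative, and in the thesis pipeline the corrupted stream would then be decoded and scored by PESQ.

```python
def invert_bit_position(frames, pos):
    """Invert the bit at index `pos` (0-based) in every frame of an
    encoded bit-stream, producing the 'corrupted' stream whose decoded
    quality indicates the sensitivity of that bit position."""
    return [
        [b ^ 1 if i == pos else b for i, b in enumerate(frame)]
        for frame in frames
    ]

# Two toy 8-bit frames; invert bit position 3 in each.
frames = [[0, 1, 1, 0, 0, 1, 0, 1],
          [1, 0, 0, 1, 1, 0, 1, 0]]
corrupted = invert_bit_position(frames, 3)
print(corrupted[0])  # [0, 1, 1, 1, 0, 1, 0, 1]
print(corrupted[1])  # [1, 0, 0, 0, 1, 0, 1, 0]
```

Repeating this for every position (144 positions for FS1016 CELP, 158 for G.723.1) and scoring each corrupted stream yields the per-position sensitivity curves of Figures 6.1-6.4.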
Figure 6.1: Bit sensitivity of subframe 3 in one FS1016 CELP frame (pitch period, adaptive codebook gain, stochastic codebook index, stochastic codebook gain)

1. The MSB generally exhibits higher sensitivity to bit errors than the other bits associated with the same parameter.

2. The least sensitive bits are the synchronization bit (bit 144) and the error correcting code bits (bits 140–143), since the decoded speech remains the same in our simulation when these bits are corrupted.

3. The adaptive codebook delay indices (bits 40–42, 92–94), the MSBs of the adaptive codebook gain (bits 47, 72, 99, 124), the LPC (bits 6, 7, 10, 11, 14, 15, 18, 19, 22) and the future expansion bit (bit 139) are the most sensitive.

Figure 6.2: Bit sensitivity in one FS1016 CELP frame (LPC, subframes 1–4)

Fig. 6.3 shows the sensitivity of the bits representing the combination of adaptive and fixed gains over all subframes of a G.723.1 encoded frame. Fig. 6.4 shows the sensitivity of each bit in a G.723.1 encoded frame. The following conclusions can be drawn from these results.

1. The bits of the combined gains, except the 3 LSBs in each subframe (bits 43–90), are more sensitive than the other bits.

2. The bits of the pulse positions, pulse signs and grid index (bits 91–158) are fairly insensitive.
Figure 6.3: Bit sensitivity of the combination of adaptive and fixed gains in one G.723.1 frame (subframes 1–4)

Figure 6.4: Bit sensitivity in one G.723.1 frame (LPC indices, adaptive codebook lags, combined gains, pulse positions, pulse signs, grid index)

6.3 Summary

This chapter presented the bit sensitivity of FS1016 CELP and G.723.1. We described a method for measuring bit sensitivity by inverting a bit in each frame. The test score generated by the functional testing method proposed in Chapter 5 is used to indicate the bit sensitivity. Simulation results show the bit sensitivity of the two codec algorithms. The bit sensitivity will be used in Chapter 7 and Chapter 8.

Chapter 7 Simple score Test Methodology

This chapter presents another testing method, called simple score testing, to determine whether or not a faulty flash memory is acceptable. This method is based on a structural model and is combined with flash memory fault diagnosis.

7.1 Simple score Test Methodology

The simple score testing methodology is shown in Figure 7.1.

Figure 7.1: Simple score test methodology (memory diagnosis supplies fault locations to simple score testing, which classifies a memory as acceptable or unacceptable, or defers to the functional testing method when it cannot decide)

Memory diagnosis gives the locations of the faults in a flash memory. According to the fault locations, simple score decides whether the faulty memory is acceptable or not. If simple score cannot determine acceptability, the functional testing method can be executed. Since the run time of simple score is much less than that of functional testing, the cost of testing large numbers of flash memories can be reduced.
Figure 7.2: Definition of simple score (the memory is organized frame by frame; the 158 bits of each frame are sorted by sensitivity, and a dividing point P splits them into a high-sensitivity group G_1 and a low-sensitivity group G_2)

In this chapter only the G.723.1 codec algorithm is considered. The definition of simple score is illustrated in Figure 7.2. The memory is organized frame by frame. The 158 bits in one G.723.1 frame are sorted according to their sensitivity. A dividing point P (1 ≤ P ≤ 158) in the sorted bit series is selected to divide the 158 bits into two groups, G_1 and G_2. Group G_i is associated with a variable x_i, where 0 ≤ x_i ≤ 1, i = 1, 2. If the total number of faults in one flash memory is N, and the numbers of faults falling in groups G_1 and G_2 are N_1 and N_2 respectively, then the simple score ss of a flash memory is

    ss = [N_1  N_2] · [x_1  x_2]^T = N_1 x_1 + N_2 x_2.

The range of simple score is 0 ≤ ss ≤ N_1 + N_2 = N. P, x_1 and x_2 are unknown parameters that need to be determined; N, N_1 and N_2 are obtained through memory diagnosis.

Obviously, unacceptable memories have higher simple scores than acceptable memories. The range of simple scores over all acceptable and unacceptable memories is shown in Figure 7.3.

Figure 7.3: Range of simple score (acceptable memories span min_a to max_a; unacceptable memories span min_u to max_u; the two ranges overlap)

If a memory is fault-free, its simple score is 0, and this is the minimum over the acceptable memories, i.e., min_a = 0.

1. If the simple score of a memory is less than min_u, the memory can be classified as acceptable.

2. If the simple score of a memory is greater than max_a, the memory can be classified as unacceptable.

3. If the simple score of a memory is in the overlap range, i.e., between min_u and max_a, the memory cannot be classified via the simple score method, and the functional testing method is required.
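A minimal sketch of the simple score computation and the three-way decision above (the function names are ours; the parameter values are the optimal ones that Section 7.2 later reports):

```python
def simple_score(n1, n2, x1, x2):
    """ss = N1*x1 + N2*x2, with N1 faults in the sensitive group G1
    and N2 faults in the insensitive group G2."""
    return n1 * x1 + n2 * x2

def classify(ss, min_u, max_a):
    """Three-way decision: below min_u -> acceptable, above max_a ->
    unacceptable, otherwise fall back to the functional testing method."""
    if ss < min_u:
        return "acceptable"
    if ss > max_a:
        return "unacceptable"
    return "undecided"  # overlap range: run functional testing

x1, x2 = 1.00000, 0.00623      # group contributions (Section 7.2)
min_u, max_a = 13.237, 69.358  # range boundaries (Section 7.2)

# Many insensitive faults barely matter; a few sensitive faults dominate.
print(classify(simple_score(5, 300, x1, x2), min_u, max_a))   # acceptable
print(classify(simple_score(100, 10, x1, x2), min_u, max_a))  # unacceptable
```

Note how x_2 being nearly zero makes the score almost entirely a count of faults landing in the 44 most sensitive bit positions, which is exactly why the dividing point P matters.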
Next we determine P, x_1 and x_2 based on the 1000 memories simulated in Section 5.2, since the test scores and fault locations for these devices are known. The direct percentage is the percentage of memories with a simple score less than min_u or greater than max_a; this represents the fraction of memories that can be decided using the simple score test method alone. The objective is to find the values of P, x_1 and x_2 that maximize the direct percentage. We explicitly consider each value of P from 1 to 158. For each value of P, we determine x_1 and x_2 using the MATLAB function "fmincon", with the constraints 0 ≤ x_1, x_2 ≤ 1.

7.2 Simulation Results

The acceptance threshold T is set to 3. Using the above procedure, the optimal values of x_1 and x_2 are determined for each value of P. From this we have determined the following.

1. The optimal dividing point is P = 44.

2. The optimal group contributions are x_1 = 1.00000 and x_2 = 0.00623.

3. min_u = 13.237 and max_a = 69.358.

4. The maximal direct percentage is 56.1%: 12.0% of the memories have a simple score less than min_u and are thus classified as acceptable, and 44.1% have a simple score greater than max_a and are thus classified as unacceptable.

The relation between the direct percentage and the dividing point P is shown in Figure 7.4.

Figure 7.4: Direct percentage vs. dividing point position P

To verify the simple score method, 100 more faulty flash memories with fault densities of 0%–1% were simulated. N_1 and N_2 for these memories are obtained with P = 44. The simple scores of these flash memories are N_1 x_1 + N_2 x_2, where x_1 and x_2 are the optimal values obtained above. We then compare these simple scores with the min_u and max_a obtained above, and we also run functional testing for these memories. The results are listed in Table 7.1.
    Condition                                                # of memories
    simple score < min_u                                     14
    simple score < min_u and test score > threshold (3)      14
    simple score > max_a                                     44
    simple score > max_a and test score < threshold (3)      44

Table 7.1: Verification of simple score test methodology

From the verification results, the memories with a simple score less than min_u are all acceptable, and those with a simple score greater than max_a are all unacceptable. Therefore, the simple score testing method gives an accurate prediction of the acceptability of faulty memories. The direct percentage for this batch of memories is 58%.

7.3 Summary

This chapter described the simple score test methodology based on fault locations. We gave the definition of simple score and the process for determining the required values P, x_1 and x_2. The optimal values were obtained through simulation, and a verification simulation was run using them. The results show that the simple score method accurately determines whether or not a faulty flash memory is acceptable.

Chapter 8 Improvement of Error Tolerance via Error Correcting Codes

This chapter presents a method for improving error tolerance using error correcting codes.

8.1 Error Correcting Codes

From the simulation results of the functional testing method in Section 5.2, the acceptable fault density is 0.2%, and the average test score drops quickly at higher fault densities. The Hamming (15,11,1) error correcting code is embedded in FS1016 CELP, while there are no error correcting codes in G.723.1; therefore G.723.1 is vulnerable to errors in the bit-stream. Error correction is a good way to enhance the error tolerance of G.723.1, and the most sensitive bits identified in Chapter 6 are good candidates for protection. Although more flash memory space is needed to store the error correcting code, the total cost can still be reduced, since flash memories with high fault densities become acceptable and are cheaper than those with low fault densities.
Figure 8.1 shows the revised functional testing methodology with error correcting coding. Two components are added to the original functional test methodology:

1. an error correcting encoder, which encodes the most sensitive bits of each bit-stream frame and adds error correcting code bits to each frame,

2. an error correcting decoder, which decodes the error correcting codes and corrects the erroneous bits in each bit-stream frame.

Figure 8.1: Functional testing methodology with error correcting coding (test patterns -> encoder -> error correcting encoder -> flash memory under test -> error correcting decoder -> decoder -> MOS predictor (PESQ) -> test score)

Many error correcting codes exist. We consider only Hamming and Bose-Chaudhuri-Hocquenghem (BCH) codes in this work. Both Hamming and BCH codes are represented as (n,k,t), where k is the number of original bits in each bit-stream frame that are input to the error correcting encoder, n is the total number of bits generated by the error correcting encoder, and t is the number of bit errors that can be corrected by the error correcting decoder. Therefore n − k is the number of bits added to each bit-stream frame.

For the Hamming codes considered here, (n,k,t) = (2^m − 1, 2^m − 1 − m, 1), where m is any positive integer and t = 1. For example, if m = 4, the Hamming code is (15,11,1): 11 original bits of each bit-stream frame are encoded, 4 bits are added to each frame, and only 1 bit error can be corrected. For BCH codes, (n,k,t) = (2^m − 1, 2^m − 1 − mt, t), where m and t are any positive integers. For example, if m = 7 and t = 3, the BCH code is (127,106,3): 106 bits of each bit-stream frame are encoded, 21 bits are added to each frame, and 3 bit errors can be corrected. Hamming codes can be considered a special case of BCH codes with t = 1. The same functional testing process described in Chapter 5 is applied.

8.2 Simulation Results

The specifications are the same as those listed in Section 5.2.
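The (n,k,t) relations above are easy to tabulate; the small helpers below are ours, not from the thesis, and for binary BCH codes the stated k = 2^m − 1 − mt is the usual design value (in general k ≥ n − mt).

```python
def hamming_params(m):
    """Hamming code: (n, k, t) = (2^m - 1, 2^m - 1 - m, 1)."""
    n = 2 ** m - 1
    return n, n - m, 1

def bch_params(m, t):
    """Binary BCH code: (n, k, t) = (2^m - 1, 2^m - 1 - m*t, t)."""
    n = 2 ** m - 1
    return n, n - m * t, t

print(hamming_params(4))  # (15, 11, 1): 4 parity bits added per frame
print(bch_params(7, 3))   # (127, 106, 3): 21 bits added, corrects 3 errors
```

All ten codes simulated in this chapter follow these two formulas, e.g. Hamming (63,57,1) from m = 6 and BCH (63,45,3) from m = 6, t = 3.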
Hamming codes (15,11,1), (31,26,1), (63,57,1) and (127,120,1) and BCH codes (63,51,2), (63,45,3), (63,39,4), (127,113,2), (127,106,3) and (127,99,4) are used for the error correcting encoder and decoder. One implementation of a Hamming code is the MATLAB function pair

    code = encode(msg,n,k,'hamming/binary')
    msg = decode(code,n,k,'hamming/binary')

with the default generator polynomial. One implementation of a BCH code is the MATLAB function pair

    code = bchenc(msg,n,k)
    decoded = bchdec(code,n,k)

The bits in one bit-stream frame are sorted according to their sensitivity, and the most sensitive bits are encoded. The relation between the average test score and the fault density for Hamming codes (15,11,1), (31,26,1), (63,57,1) and (127,120,1) is shown in Figure 8.2. The acceptable fault densities for these Hamming codes are shown in Table 8.1.

    Error correcting code       Acceptable fault density
    Original G.723.1            0.20%
    Hamming code (15,11,1)      0.26%
    Hamming code (31,26,1)      0.44%
    Hamming code (63,57,1)      0.81%
    Hamming code (127,120,1)    0.54%

Table 8.1: Acceptable fault density for Hamming codes (15,11,1), (31,26,1), (63,57,1) and (127,120,1)

Figure 8.2: Average test score over 50 fault distributions vs. fault density for Hamming codes (15,11,1), (31,26,1), (63,57,1) and (127,120,1)

From Figure 8.2, the acceptable fault density is enhanced using any of these Hamming codes. The best acceptable fault density is achieved by Hamming code (63,57,1), which increases the acceptable fault density to four times that of the original G.723.1 algorithm at a cost of adding only 6 bits per frame. Hamming code (63,57,1) has a higher acceptable fault density than Hamming codes (15,11,1) and (31,26,1) since it protects more bits. However, the acceptable fault density for Hamming code (127,120,1) is less than that for Hamming code (63,57,1), even though it uses one more bit of overhead. The reason is as follows. Hamming code (127,120,1) encodes 63 more bits than (63,57,1), so more errors can occur within the encoded bits, yet both codes can correct only one bit per code word. If there is more than one bit error in one frame, the decoded frame will be wrong, which introduces additional errors into the frame and degrades the quality of the output speech. In addition, more of the less sensitive bits are protected, which wastes resources.

Let p be the probability that one bit is erroneous, n the total number of bits generated by the error correcting encoder, and t the number of bit errors that can be corrected. p is equal to one half of the fault density, since the probability that a fault causes an error is assumed to be 0.5. The correction probability P_c is defined as the probability that an error correcting code can correct all erroneous bits in the encoded bits:

    P_c = Σ_{i=0}^{t} C(n,i) (1 − p)^{n−i} p^i,

where C(n,i) is the binomial coefficient. The relation between the correction probabilities for the Hamming codes and the fault density is shown in Figure 8.3. For a given fault density of 1%, p = 0.01/2 = 0.005. The correction probabilities for the various Hamming codes at this fault density are shown in Table 8.2. From Table 8.2, Hamming code (63,57,1) protects the most sensitive bits and has a much higher correction probability than Hamming code (127,120,1). Under high fault densities, the correction probabilities for Hamming codes (63,57,1) and (127,120,1) drop very fast. When the fault density is above 4%, these codes become somewhat ineffective; here, BCH codes that can correct two or more errors might prove useful.
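The correction probability formula can be checked numerically; the short evaluation below (our code) reproduces the Table 8.2 values at p = 0.005.

```python
from math import comb

def correction_probability(n, t, p):
    """P_c = sum_{i=0}^{t} C(n, i) * (1 - p)^(n - i) * p^i:
    the chance that at most t of the n encoded bits are in error."""
    return sum(comb(n, i) * (1 - p) ** (n - i) * p ** i for i in range(t + 1))

p = 0.01 / 2  # fault density 1%, a fault causes an error with prob. 0.5
for n, k, t in [(15, 11, 1), (31, 26, 1), (63, 57, 1), (127, 120, 1)]:
    print(f"Hamming ({n},{k},{t}): P_c = {correction_probability(n, t, p):.4f}")
```

The longer the code word, the likelier it is to contain two or more errors that a single-error-correcting code cannot fix, which is why (127,120,1) falls to about 0.87 while (15,11,1) stays near 0.9975.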
    Error correcting code       Correction probability
    Hamming code (15,11,1)      0.9975
    Hamming code (31,26,1)      0.9894
    Hamming code (63,57,1)      0.9601
    Hamming code (127,120,1)    0.8668

Table 8.2: Correction probabilities for Hamming codes (15,11,1), (31,26,1), (63,57,1) and (127,120,1) under a fault density of 1%

Figure 8.3: Correction probability vs. fault density for Hamming codes (15,11,1), (31,26,1), (63,57,1) and (127,120,1)

The relation between the average test score and the fault density for BCH codes (63,51,2), (63,45,3) and (63,39,4) is shown in Figure 8.4. BCH code (63,51,2) results in higher test scores than BCH codes (63,45,3) and (63,39,4) under low fault densities, since it protects more sensitive bits. The acceptable fault densities of BCH codes (63,51,2), (63,45,3) and (63,39,4) are all greater than 1%. The simulation results under high fault densities are shown in Figure 8.5, and the corresponding acceptable fault densities are shown in Table 8.4.

    BCH code       Acceptable fault density
    (127,113,2)    1.12%
    (127,106,3)    1.85%
    (127,99,4)     2.69%

Table 8.3: Acceptable fault density for BCH codes (127,113,2), (127,106,3) and (127,99,4)

    BCH code       Acceptable fault density
    (63,51,2)      1.66%
    (63,45,3)      2.18%
    (63,39,4)      2.00%

Table 8.4: Acceptable fault density for BCH codes (63,51,2), (63,45,3) and (63,39,4)

The relation between the average test score and the fault density for BCH codes (127,113,2), (127,106,3) and (127,99,4) under low and high fault densities is shown in Figure 8.6 and Figure 8.7, and their acceptable fault densities are shown in Table 8.3.
BCH codes (127,113,2), (127,106,3) and (127,99,4) give very good performance under low fault densities, and their average test scores are almost constant.

In summary, the relation between the acceptable fault density and the number of added bits for the error correcting codes considered is shown in Figure 8.8. From this figure we see that adding more bits to a frame usually increases the acceptable fault density. However, Hamming code (63,57,1) and BCH codes (63,51,2) and (63,45,3) use fewer bits yet achieve a higher acceptable fault density than Hamming code (127,120,1) and BCH codes (127,113,2) and (127,106,3), respectively.

Figure 8.4: Average test score over 50 fault distributions vs. fault density (0%–1%) for BCH codes (63,51,2), (63,45,3) and (63,39,4)

Which error correcting code is best for maximizing error tolerance? We use the effective yield in seconds of speech, Y_s, to answer this question. In Chapter 5, we assumed that the fault density levels are f_i, 0 ≤ i ≤ L, with f_i < f_{i+1}, where f_L is the largest fault density, and that the probability that a memory falls in fault density level f_i is p_i. Similar to what was done for functional testing, we can obtain the acceptable percentage A_i(C_j) for error correcting code C_j, 1 ≤ j ≤ N, where N is the number of error correcting codes considered.

Figure 8.5: Average test score over 50 fault distributions vs. fault density (1%–5%) for BCH codes (63,51,2), (63,45,3) and (63,39,4)

Assume that one frame has F bits and that error correcting code C_j adds b_j bits.
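The code-selection rule based on Y_s can be sketched as follows, with F = 158 bits per G.723.1 frame; the p_i, A_i(C_j) and b_j values here are hypothetical illustrations, not the thesis simulation data.

```python
def yield_per_bit(p, A, F, b):
    """Y_s(C_j) = [sum_i p_i * A_i(C_j)] / (F + b_j): effective yield
    normalized by the bits stored per frame of speech."""
    return sum(pi * ai for pi, ai in zip(p, A)) / (F + b)

F = 158                # bits per G.723.1 frame
p = [0.5, 0.3, 0.2]    # hypothetical fault-density-level probabilities

# Hypothetical acceptable percentages A_i(C_j) and per-frame overheads b_j.
codes = {
    "no ECC":            ([1.0, 0.5, 0.1], 0),
    "Hamming (63,57,1)": ([1.0, 0.9, 0.4], 6),
    "BCH (63,45,3)":     ([1.0, 1.0, 0.9], 18),
}

scores = {name: yield_per_bit(p, A, F, b) for name, (A, b) in codes.items()}
best = max(scores, key=scores.get)
print(best)  # the code maximizing effective yield per stored bit
```

The denominator F + b_j captures the trade-off the chapter describes: a stronger code recovers more faulty memories but consumes more memory per second of recorded speech.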
The effective yield in seconds of speech for error correcting code C_j is

    Y_s(C_j) = Y(C_j) / (F + b_j) = [Σ_{i=0}^{L} p_i A_i(C_j)] / (F + b_j).

We select the error correcting code C_j that maximizes Y_s.

Figure 8.6: Average test score over 50 fault distributions vs. fault density (0%–1%) for BCH codes (127,113,2), (127,106,3) and (127,99,4)

8.3 Summary

This chapter described how to improve error tolerance with error correcting codes. Hamming codes (15,11,1), (31,26,1), (63,57,1) and (127,120,1) and BCH codes (63,51,2), (63,45,3), (63,39,4), (127,113,2), (127,106,3) and (127,99,4) were simulated. These codes yield a higher acceptable fault density, and thus improve error tolerance, at the cost of added bits. The effective yield in seconds of speech, Y_s, is defined for selecting among the error correcting codes.

Figure 8.7: Average test score over 50 fault distributions vs. fault density (1%–5%) for BCH codes (127,113,2), (127,106,3) and (127,99,4)

Figure 8.8: Acceptable fault density vs. # of added bits

Chapter 9 Conclusion

Digital speech recording systems have error-tolerance attributes. The output speech of a DTAD may have good quality even if the device contains faulty circuitry.
This chapter summarizes the contributions of this thesis to the field of error-tolerance in DTAD systems and discusses possible future directions.

9.1 Summary

In this work we studied several techniques dealing with the error-tolerance attributes of the flash memory in a DTAD system. We solved the problems presented in Chapter 1 and made the following contributions.

1. The ITU-T P.862 Recommendation PESQ is chosen to determine whether or not a faulty memory is acceptable. The PESQ MOS output gives the quality of the output speech of a DTAD system, and PESQ is suitable for our simulations.

2. A functional testing method is described for determining whether the flash memory used in a DTAD provides acceptable (error-tolerant) results. We describe the functional testing process and give the simulation results. The acceptable fault densities of FS1016 CELP and G.723.1 are 0.11% and 0.2%, respectively. Using the acceptable percentage, we show how the effective yield is increased by functional testing.

3. A simple score testing method is described that, combined with memory diagnosis, determines whether the flash memory used in a DTAD provides acceptable (error-tolerant) results. Simple score testing is structural and based on fault locations. We give the definitions of simple score and direct percentage; our objective is to maximize the direct percentage. Through simulation, the optimal parameters required by the definition of simple score are obtained, and a verification simulation shows that simple score testing accurately predicts the acceptability of flash memories.

4. A technique for improving the error-tolerance of a flash memory using error correcting codes is described. We simulate Hamming codes (15,11,1), (31,26,1), (63,57,1) and (127,120,1) and BCH codes (63,51,2), (63,45,3), (63,39,4), (127,113,2), (127,106,3) and (127,99,4).
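The Hamming codes just listed all correct one flipped bit per codeword. As a toy illustration of the mechanism (not the thesis's implementation), here is a minimal Hamming(15,11,1) encoder/decoder: four parity bits at the power-of-two positions let the decoder locate, and flip back, any single faulty bit.

```python
def hamming15_encode(data_bits):
    """Encode 11 data bits into a Hamming(15,11) codeword.

    Positions 1..15 are used; power-of-two positions (1, 2, 4, 8)
    hold parity bits, the remaining 11 positions hold data bits.
    """
    assert len(data_bits) == 11
    code = [0] * 16                      # index 0 unused
    data = iter(data_bits)
    for pos in range(1, 16):
        if pos & (pos - 1):              # not a power of two: data position
            code[pos] = next(data)
    for p in (1, 2, 4, 8):               # parity p covers positions with bit p set
        for pos in range(1, 16):
            if pos != p and (pos & p):
                code[p] ^= code[pos]
    return code[1:]

def hamming15_decode(code_bits):
    """Correct up to one flipped bit, then return the 11 data bits."""
    code = [0] + list(code_bits)
    syndrome = 0
    for pos in range(1, 16):
        if code[pos]:
            syndrome ^= pos              # XOR of set positions = error location
    if syndrome:
        code[syndrome] ^= 1              # flip the faulty bit back
    return [code[pos] for pos in range(1, 16) if pos & (pos - 1)]
```

Encoding adds 4 bits per 11 data bits, and a flip anywhere in the 15-bit codeword, whether in a parity or a data position, is corrected.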
The acceptable fault density is enhanced using any of these error correcting codes, at the cost of added bits. An indicator, the effective yield in seconds of speech, is defined to determine which error correcting code should be used.

9.2 Future Research Directions

Future research can be extended in the following direction. In this work, the codec (encoder and decoder) is assumed to be fault-free, and only faults in the flash memory are studied. We used PC software to implement the encoder, the decoder and the MOS predictor (PESQ). In a real DTAD system, these components are usually implemented in hardware such as an ASIC, FPGA or DSP. Such chips are more expensive than flash memories and may also have error-tolerance attributes, so the total cost of the whole DTAD system could be reduced by exploiting the error-tolerance of the encoder and decoder. How faults in a hardware implementation of the codec affect the quality of the output speech is a new research topic.