MEMS PIEZOELECTRIC RESONANT MICROPHONE ARRAYS
AND THEIR APPLICATIONS
by
Hai Liu
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(ELECTRICAL ENGINEERING)
August 2022
Copyright 2022 Hai Liu
Acknowledgments
The MEMS group in Electrical and Computer Engineering at USC has been a great place to
research cutting-edge MEMS sensors and actuators. I would like to express my deepest gratitude
to my advisor, Professor Eun Sok Kim, the PI of this group, who admitted me to the group, helped
me get the Annenberg fellowship, and supported me continuously through my six years of PhD
research. I would not have been able to become an expert on MEMS and complete this dissertation
without his guidance on the direction of the projects, his discussions of scientific and
engineering problems, his encouragement to explore unfunded projects, and his line-by-line
revision of my publications. In addition, I appreciate Prof. Kim’s generous and warm help in my
daily life, from the very early stage as a fresh PhD student arriving in the US for the first time
to the late stage as a PhD candidate when my daughter was born. His commitment to research, hard
work, meticulousness, and kindness have always encouraged me and guided me to strive for
excellence and to be a better person.
I have benefited significantly from Prof. Wei Wu and Prof. Qifa Zhou, who serve on my
dissertation committee. Their knowledge and insights have greatly enriched my work. I also
appreciate their generosity in allowing me to use the equipment in their labs and their constant
encouragement. I would also like to thank Prof. K. Kirk Shung, Prof. Constantine Sideris,
and Prof. Chris Kyriakakis, who served on my qualifying exam committee and gave me extremely
valuable suggestions.
I am also truly grateful to the former and current members of the USC MEMS
group, who have been excellent collaborators. I would like to thank Dr. Yongkui Tang and Dr.
Yufeng Wang especially, who helped me a lot in my research, my classes, and my daily life when I
was a fresh PhD student. I appreciate learning the microfabrication processes and test methods
from Dr. Yongkui Tang, whose extensive experience, insights, and kindness have always been
valuable. I enjoyed the fruitful collaboration with Dr. Song Liu, who introduced deep learning
algorithms into my work, greatly advancing the applications of the resonant microphones I
developed. I would also like to thank Dr. Anton Shkel, Dr. Yunqi Cao, Matin Barekatain, Jaehoon
Lee, Zohreh Azizi, Akash Roy, and Junyi Wang, who were tremendously supportive with their
outstanding expertise in digital signal processing, deep learning, circuit design, equipment
maintenance, fabrication processes, and acoustic simulation. I have also always
benefited from valuable conversations with Dr. Lurui Zhao and Kianoush Esfahani.
I also owe a debt of thanks to many students and friends who offered me valuable help and
support. I would like to thank Jun Tao from Prof. Rehan Kapadia’s group, Pan Hu, Deming Meng,
Hao Yang, and Zerui Liu from Prof. Wei Wu’s group, and Huandong Chen from Prof. Jayakanth
Ravichandran’s group for their tremendous help in the fabrication of my devices. I thank Yushun
Zeng, Gengxi Lu, Runze Li, and Ruimin Chen from Prof. Qifa Zhou’s group for their enormous
assistance with the processing and characterization of my MEMS microphones. I thank Jorge Gomez
Ponce and Guillermo Castro from Prof. Andreas Molisch’s group for their extensive help with
spectrum analysis. I thank Samer Idres from Prof. Hossein Hashemi’s group, Rezwan Rasul, and
Aoyang Zhang from Prof. Mike Chen’s group for the inspiring discussions about analog circuits.
Thanks are due to Daniel Cerrone and Prof. Roberta Kato at Children’s Hospital Los Angeles
(CHLA) for working with me on the lung sound recording and analysis. Their clinical data collection
and studies were very inspiring.
Cleanroom lab manager Dr. Donghai Zhu and engineer Alfonso Jimenez deserve
especially high praise. They not only supported the public facilities and machines in the cleanroom
but also helped me solve problems with the sputtering system in our lab. Their support has been
essential in completing this work.
I have learned a lot through discussions with the professors, TAs, and classmates from many
classes. Prof. Aluizio Prata always helped me solve my problems with electromagnetics during his
office hours. Prof. Armand Tanguay stayed after class to answer my questions about
optics. Prof. Alice Parker replied to my email about digital circuits half an hour after 11 pm.
Prof. Hossein Hashemi was always ready to answer my questions about analog circuits, and his
TA Dr. Aria Samiei was always very helpful. Dr. Nikolaos Flemotomos was still happy to help
me with speech recognition even one year after I took the class. I will not forget Dr. Inna
Abramova’s help with phase control for active noise cancellation, although I did not take her
class. I appreciate the fruitful discussions and friendship with Ran Wang, Sicheng Pan, Qinghao
Meng, and others in the classes. I could not have become a knowledgeable researcher without their help.
Finally, my wife Wenyuan has always been my rock no matter what ups and downs life brings.
Words are not enough to express my gratitude for her love, companionship, support, and sacrifices
over all these years to make this dissertation possible. I did it all for her and our daughter, Charlotte.
I am also eternally grateful to my parents and my younger brother for their endless
support and encouragement.
Table of Contents
Acknowledgments
List of Tables
List of Figures
Abstract
Chapter 1 Introduction
1.1 Background
1.2 State of the art
1.2.1 Flat band MEMS microphone
1.2.2 Resonant microphone array (RMA)
1.3 Problem statement
1.4 Aim, scope, and significance of the research
1.5 Overview of the dissertation
Chapter 2 Modeling
2.1 Lumped element model of MEMS cantilever microphone
2.1.1 Impedance of the air
2.1.2 Impedance of cantilever (Zc)
2.1.3 Piezoelectric Transduction
2.1.4 Electrical impedance
2.2 Lumped element model of MEMS cantilever microphone with Pre-amp
2.2.1 Sensitivity
2.2.2 Noise
2.3 Analytical model of width-stepped cantilever vibration
2.3.1 Mode of vibration
2.3.2 Forced vibration function
2.3.3 Stress and charge
2.4 Case study and discussion
2.4.1 Model
2.4.2 Sensitivity
2.4.3 Noise
2.5 Summary
Chapter 3 RMAs for Lung Sound Detection and Classification
3.1 Introduction
3.2 RMA of rectangular cantilevers
3.2.1 Design
3.2.2 Fabrication
3.2.3 Characterization
3.2.4 Lung sound detection and classification
3.3 RMA of spiral resonant microphones
3.3.1 Design
3.3.2 Fabrication
3.3.3 Characterization
3.3.4 Lung sound detection
3.4 RMA of serpentine beams supported cantilever
3.4.1 Design
3.4.2 Fabrication
3.4.3 Characterization
3.4.4 Lung sound detection and classification
3.5 RMA of width-stepped cantilever
3.5.1 Design
3.5.2 Fabrication
3.5.3 Characterization
3.5.4 Lung sound detection and classification
3.6 Summary
Chapter 4 RMAs for Active Noise Cancellation
4.1 Introduction
4.2 Resonant microphone arrays (RMAs)
4.3 Experimental setup for ANC
4.4 Digital algorithms for ANC
4.4.1 ANC phase compensation for RMA
4.4.2 ANC with digital adaptive filter
4.4.3 ANC with deep learning
4.5 ANC results
4.6 Speech recognition
4.7 Smaller RMA for ANC
4.7.1 Design
4.7.2 Small RMA fabricated
4.7.3 Sensitivity of the RMA
4.8 Summary
Chapter 5 RMAs for Speech Sensing and Recognition
5.1 Introduction
5.2 RMAs for wide band and narrow band speech spectrum
5.2.1 Design
5.2.2 RMAs fabricated
5.2.3 Sensitivity
5.2.4 Noise
5.3 RMA for wearable applications
5.4 Summary
Chapter 6 Conclusion and Future Directions
6.1 Conclusion
6.2 Future directions
6.2.1 Electronic stethoscope
6.2.2 Hearing aids
Bibliography
List of Tables
Table 1.1 Comparison of published and commercialized flat band microphones.
Table 1.2 Comparison of published resonant acoustic sensors.
Table 2.1 Parameters used in the model.
Table 2.2 Measured and modeled RMS noise of #1 resonant microphone in the RMA.
Table 3.1 Design of cantilever RMA.
Table 3.2 Thickness of different layers of a cantilever resonant microphone.
Table 3.3 Properties of piezoelectric materials [64].
Table 3.4 Equipment and parameters for the fabrication.
Table 3.5 Design of spiral resonant microphone array.
Table 3.6 Designed sizes and resonant frequencies of the resonant microphones with serpentine beams in the RMA.
Table 3.7 Thickness of different layers of a width-stepped cantilever resonant microphone.
Table 3.8 Design of RMA.
Table 3.9 Measured paralleled capacitance and resistance of the resonant microphones in the RMA.
Table 4.1 Resonant frequency and size of the microphones in the array of Fig. 4.2a.
Table 4.2 Resonant frequency and size of microphones in the array of Fig. 4.2b.
Table 4.3 Resonant frequencies in the RMA with 2 µm thick Si paddles for ANC.
Table 5.1 Resonant frequencies in narrow band and sizes of the resonant microphones in the RMA for ASR.
Table 5.2 Resonant frequencies in wide band and sizes of the resonant microphones in the RMA for ASR.
Table 5.3 Resonant frequencies in narrow band and sizes of the resonant microphones in the RMA.
List of Figures
Figure 1.1: Revenue forecast of MEMS microphones in the global market from 2017 to 2027 [2].
Figure 1.2: Shipment forecast of smart speakers and displays, which contain MEMS microphones. The increase is around 50 million units per year [3].
Figure 1.3: Hearing aid configuration with error microphone for ANC (active noise cancellation) [5].
Figure 1.4: Architecture of medical cyber physical systems (MCPS) [7].
Figure 1.5: Electronic stethoscope from Eko [10].
Figure 1.6: Frequency response of a typical MEMS microphone, the ICS-40720 from TDK [11].
Figure 1.7: Frequency range of speech, music, and human hearing limit [12].
Figure 1.8: Schematic of capacitive MEMS microphone [13].
Figure 1.9: Photo of piezoelectric MEMS transducer from Vesper [17].
Figure 1.10: Structure of a measurement capacitive MEMS microphone with a 0.5 µm thick, 1.95 mm diameter octagonal SiN diaphragm [19].
Figure 1.11: Cantilever piezoelectric MEMS transducer: a 2 mm × 2 mm cantilever covered with piezoelectric ZnO film [20].
Figure 1.12: Comparison of unamplified sensitivity of flat band microphones.
Figure 1.13: Comparison of SNR of flat band microphones within the bandwidth.
Figure 1.14: Structure of the human auditory system [34].
Figure 1.15: Frequency response of the human auditory system: (a) the eardrum [35] and (b) some positions of the basilar membrane in the cochlea [36].
Figure 1.16: Structure of capacitive cochlear analog transducer with a membrane diaphragm 30 mm long and 0.14 – 1.82 mm wide [37].
Figure 1.17: Structure of piezoelectric artificial basilar membrane [38].
Figure 1.18: Array of 64 SU-8 resonant beams with different thicknesses (2.99 – 142 µm), widths (0.1 – 0.6 mm), and lengths (0.75 – 1.5 mm) [42].
Figure 1.19: Array of 8 Kapton/Al triboelectric beams 8.2 – 32 mm long and 6 – 8 mm wide [43].
Figure 1.20: Array of 10 Si cantilevers, 2 µm thick, 0.3 mm wide, and 0.6 – 1.35 mm long, covered with 0.5 µm thick AlN [49].
Figure 1.21: Array of 10 piezoelectric PMN-PT cantilevers, 15 µm thick, 0.3 mm wide, and 0.55 – 3 mm long [48].
Figure 1.22: Array of 4 AlN bimorph cantilevers with 0.4 mm width and 0.3 – 0.443 mm length: (a) photo of the array; (b) sensitivity of the array, 0.05 – 0.1 mV/Pa [47].
Figure 1.23: Array of 13 Si cantilevers, 5 µm thick and 1 – 2.5 mm in width and length: (a) photo of the array; (b) sensitivity of the array, 10.8 – 202.6 mV/Pa [45].
Figure 1.24: Comparison of unamplified sensitivity of resonant microphone array (RMA) at resonance frequency and flat band microphones within the bandwidth.
Figure 1.25: Comparison of SNR of RMA at resonance frequency and flat band microphones within the bandwidth.
Figure 1.26: Lung sound spectrum [55].
Figure 1.27: Intelligibility of the speech at different frequencies [57].
Figure 1.28: Mel frequency distribution [59].
Figure 2.1: Ideal structure of cantilever microphone.
Figure 2.2: Real structure of cantilever microphone with warpage.
Figure 2.3: Lumped element model of MEMS cantilever microphone: (a) schematic of impedance with warped cantilever; (b) equivalent circuit.
Figure 2.4: (a) Rectangular channel [60]; (b) air gap around a cantilever.
Figure 2.5: Dimensions used to calculate the paralleled acoustic impedance with warped cantilever: (a) cross-section view; (b) top view.
Figure 2.6: Impedance of cantilever.
Figure 2.7: Cantilever impedance, piezoelectric transduction, and equivalent impedance of a piezoelectric layer.
Figure 2.8: Piezoelectric electrical impedance.
Figure 2.9: Full lumped element model of MEMS cantilever microphone together with Pre-amp.
Figure 2.10: Noise model of MEMS cantilever microphone together with Pre-amp.
Figure 2.11: (a) Noise voltage and (b) noise current of the LTC6244 op amp used for the microphone array developed in this study.
Figure 2.12: Schematic of the width-stepped cantilever.
Figure 2.13: Schematic of packaged RMA.
Figure 2.14: Lumped element model of the packaged RMA.
Figure 2.15: Impedance of metal box.
Figure 2.16: Modeled and experimental sensitivity of the RMA with width-stepped cantilever resonant microphones. The colored solid curves are the modeled sensitivities, while the dotted curves are the experimentally measured sensitivities.
Figure 2.17: Measured and modeled noise of #1 resonant microphone in the RMA.
Figure 3.1: Typical spectrogram of wheezing.
Figure 3.2: Design of the RMA with resonant cantilevers: (a) illustration of the array; (b) illustration of one cantilever microphone; (c) cross-section illustration of the cantilever microphone.
Figure 3.3: Simulated stress distribution on the cantilever microphone under sound pressure.
Figure 3.4: Fabrication process of the microphone array.
Figure 3.5: Fabricated cantilever RMA.
Figure 3.6: Test board with fabricated RMA and pre-amps.
Figure 3.7: Metal box package shielding the test board, which is mounted on the inside of the cover of the box. The reference microphone was placed as close as possible to the test board.
Figure 3.8: Test set-up for sensitivity characterization of the RMA.
Figure 3.9: Measured unamplified sensitivities of the resonant microphones in the array and their quality factors.
Figure 3.10: Designed and measured resonant frequencies for the rectangular RMA.
Figure 3.11: Test set-up for lung sound recording and wheeze detection and classification.
Figure 3.12: Signal of a lung sound (with strong wheezing) recorded with resonant cantilever microphone #2 (resonant frequency at 228 Hz) and the flat-band reference microphone: (a) waveform with resonant microphone; (b) waveform with reference microphone; (c) spectrogram with resonant microphone; (d) spectrogram with reference microphone.
Figure 3.13: Signal of a lung sound with wheezing recorded with the combination of two
resonant cantilever microphones #2 and #6 (resonant frequency at 228 Hz and 453
Hz) and the flat-band reference microphone: (a) waveform with RMA; (b)
waveform with reference microphone; (c) spectrogram with RMA; (d) spectrogram
with reference microphone. ..................................................................................... 55
Figure 3.14: Configuration of automatic classification with temporal convolutional network
(TCN). ...................................................................................................................... 56
Figure 3.15: Accuracies of wheezing identification based on deep learning with the resonant
microphone array (97.44%) and a flat-band reference microphone (89.74%). ....... 57
Figure 3.16: Design of the RMA with resonant spiral microphone: (a) illustration of the array; (b)
illustration of one spiral microphone; (c) crossing section illustration of the spiral
microphone. ............................................................................................................. 59
Figure 3.17: Simulated stress distribution on the spiral microphone under sound pressure. ........ 59
Figure 3.18: Warpage of the cantilever and spiral structures vs resonant frequency under the
compressive stress of ZnO (-500 MPa) from simulation. ........................................ 59
Figure 3.19: Fabrication process of the RMA of spiral microphones: (a) SOI wafer; (b)
patterning and etching of Si (in KOH) and buried oxide (in HF 7:1) from the
backside; (c) deposition and patterning of 0.2 µm sputtered Al ground, 0.5 µm
sputtered ZnO, 0.1 µm PECVD SiN and 0.2 µm sputtered Al top electrode; (d)
sputtering 0.5 µm Al protection layer at the backside; (e) patterning and etching of
the device layer with deep reactive ion etch (DRIE) from the front side; (f) etching
Al protection layer to release the cantilever. ........................................................... 61
Figure 3.20: Simulated temperatures and warpages of the spiral microphone (a) without and (b)
with Al layer at the backside, when the cantilever is just about to be released during
DRIE with 0.5 W/cm2 heat flux. .............................................................................. 61
Figure 3.21: Photos of the fabricated RMA of spiral resonant microphones. .............................. 62
Figure 3.22: Measured unamplified sensitivities of the RMA of spiral resonant microphones ... 63
Figure 3.23: Designed and measured resonant frequencies of the RMA with spiral microphones.
.................................................................................................................................. 63
Figure 3.24: Signal of a lung sound with wheezing recorded with resonant spiral microphones #5
(resonant frequency at 450 Hz) and the flat-band reference microphone: (a)
waveform with RMA; (b) waveform with reference microphone; (c) spectrogram
with RMA; (d) spectrogram with reference microphone ......................................... 64
Figure 3.25: Design of the RMA with cantilever supported by serpentine beams: (a) illustration
of the array; (b) illustration of one resonant microphone; (c) cross-sectional
illustration of the spiral microphone. ....................................................................... 66
Figure 3.26: Simulated stress distribution on the designed microphone under sound pressure. .. 66
Figure 3.27: Size of the resonant microphone with different designs. ......................................... 67
Figure 3.28: Warpage of the resonant microphone under the compressive stress of ZnO (-500
MPa) from simulation. ............................................................................................. 67
Figure 3.29: Fabrication process of the microphone array. .......................................................... 67
Figure 3.30: Simulated temperatures and warpages of the microphone (a) without and (b) with Al
layer at the backside, when the cantilever is just about to be released during DRIE
with 0.5 W/cm2 heat flux. ........................................................................................ 68
Figure 3.31: Photos of the fabricated RMA of resonant microphones with serpentine support
beams. ...................................................................................................................... 68
Figure 3.32: Measured unamplified sensitivities of the microphones in the array from 100 to
1,000 Hz. .................................................................................................................. 69
Figure 3.33: Comparison of sensitivity and bandwidth of this work, reported resonant
microphone arrays (RMAs) and flat band microphones. Sensitivity at resonance
frequencies is counted for the RMAs. In the dominant frequency of lung sounds,
this work has the highest unamplified sensitivity compared with all reported
microphones, and the widest bandwidth compared with all reported RMAs .......... 70
Figure 3.34: Waveform and spectrogram of a lung sound (containing relatively weak wheezing
sound) recorded : (a) waveform with a resonant microphone #7 (503 Hz resonance
frequency) in the array and (b) waveform with a reference flat band microphone
GRAS 40AO. (c) spectrogram with a resonant microphone #7 (503 Hz resonance
frequency) in the array and (d) spectrogram with the reference microphone. ......... 72
Figure 3.35: Accuracies of automatic wheezing identification through TCN-based deep learning,
with various combinations of the microphones in the array as well as with the
GRAS 40AO reference microphone. ....................................................................... 73
Figure 3.36: Width-stepped cantilever (a) center-stepped cantilever (b) edge-stepped cantilever.
.................................................................................................................................. 74
Figure 3.37: Average stress comparison between the width-stepped cantilever and standard
cantilever under pressure 1 Pa. ................................................................................ 74
Figure 3.38: Resonance of center width-stepped cantilever (a) fundamental resonance 397 Hz (b)
2nd resonance 758 Hz, and edge width-stepped cantilever (c) fundamental
resonance 384 Hz (d) 2nd resonance 1394 Hz. ........................................................ 75
Figure 3.39: Design of the RMA with resonant width-stepped cantilevers: (a) illustration of the
array; (b) illustration of one cantilever microphone; (c) cross-sectional illustration
of the cantilever microphone. .................................................................................. 75
Figure 3.40: Photos of the fabricated RMA of resonant microphones with width-stepped
cantilevers. ............................................................................................................... 76
Figure 3.41: Schematic of sensitivity measurement through plane wave tube. ............................ 77
Figure 3.42: Assembly for the test (a) RMA was assembled on a PCB with op-amps for each
resonant microphone in the array, which was attached to the cover of the metal box
(b) The metal cover, with sound inlet at the RMA location, was screwed to the body
of the metal box ....................................................................................................... 78
Figure 3.43: Photo of sensitivity test configuration with PWT (plane wave tube). ...................... 78
Figure 3.44: Measured unamplified sensitivities of the microphones in the array from 100 to
1,000 Hz. .................................................................................................................. 79
Figure 3.45: Comparison of sensitivity and bandwidth of this work, reported resonant
microphone arrays (RMAs) and flat band microphones. Sensitivity at resonance
frequencies is counted for the RMAs. In the dominant frequency of lung sounds,
this work has the highest unamplified sensitivity compared with all reported
microphones. ............................................................................................................ 80
Figure 3.46: Measured unamplified sensitivities of the RMA from 100 to 1,000 Hz. ................. 81
Figure 3.47: Measured quality factor at resonances of the RMA. ................................................ 81
Figure 3.48: RMA Noise measurement set up (a) The metal box with RMA, pre-amp, and ADC
was placed on vibration isolation table for noise test (b) RMA together with pre-
amp on PCB was placed in double metal boxes with all single electrical wires
restricted in the metal box to avoid electromagnetic interference. .......................... 82
Figure 3.49: Measured pre-amp input referred peak-to-peak noise (measured output noise
divided by amplification 101). ................................................................................. 83
Figure 3.50: Measured pre-amp input referred peak-to-peak noise (measured output
noise divided by amplification 101) after A-weighting. .......................................... 84
Figure 3.51: Pre-amp input referred RMS noise (a) Before A-Weighting (b) After A-Weighting.
.................................................................................................................................. 85
Figure 3.52: Minimum detectable sound and SNR of the resonant microphone in the RMA (a)
Minimum detectable sound (b) SNR. ...................................................................... 86
Figure 3.53: Minimum detectable sound and SNR of the resonant microphone in the RMA after
A-Weighting (a) Minimum detectable sound (b) SNR. ........................................... 87
Figure 3.54: Minimum detectable sound and SNR of the RMA between 200 and 660 Hz (a)
Minimum detectable sound (b) SNR. ...................................................................... 88
Figure 3.55: PSD of measured pre-amp input referred noise without A-Weighting. ................... 89
Figure 3.56: PSD of measured pre-amp input referred noise with A-Weighting. ........................ 90
Figure 3.57: A waveform of lung sound with wheezing (a) recorded by #7 cantilever resonant
microphone in the RMA, wheezing is distinguishable (b) recorded by reference
microphone GRAS 40 AO, wheezing is not distinguishable. .................................. 93
Figure 3.58: Wheezing is obvious in the spectrogram of lung sound of Fig. 3.56 for both (a)
recorded by #7 cantilever resonant microphone in the RMA (b) recorded by
reference microphone GRAS 40 AO. ...................................................................... 93
Figure 3.59: A waveform of lung sound with wheezing (a) recorded by reference microphone,
wheezing is not distinguishable (b) recorded by reference microphone followed by
digital band pass filtering (500-700 Hz), wheezing is distinguishable. ................... 94
Figure 3.60: Waveform and spectrogram of a lung sound with weak wheezing recorded by #6 in
the RMA (a) waveform, wheezing is not distinguishable (b) spectrogram, wheezing
is distinguishable. ..................................................................................................... 94
Figure 3.61: Waveform and spectrogram of a lung sound with weak wheezing recorded by
reference microphone GRAS 40AO (a) waveform, wheezing is not distinguishable
(b) spectrogram, wheezing is not distinguishable. ................................................... 95
Figure 3.62: Schematic of deep learning algorithms for the lung sound automatic classification
(a) TCN without recording pre-processing (b) CNN with pre-extracted features
from the recordings. ................................................................................................. 96
Figure 3.63: Schematic of K-fold cross-validation (K=10) [69]. ................................................. 97
Figure 3.64: Wheezing automatic classification accuracy for lung sounds recorded by the RMA
with width-stepped cantilever resonant microphones, and the reference microphone
GRAS 40AO, with two deep learning algorithms (a) TCN (temporal convolutional
networks) with features of MFCC, CSTFT, and mSpec extracted from the
recordings as input. .................................................................................................. 98
Figure 4.1: In hearing aids, noise through Path 2 can only be reduced by active noise
cancellation (ANC). ............................................................................................... 101
Figure 4.2: Photos of the fabricated resonant microphone arrays (a) with resonant frequencies
from 856 to 4,889 Hz to sense speech [45] and (b) with resonant frequencies from
5,380 to 8,820 Hz to actively cancel noise in that frequency range. (c) Photo of a
resonant microphone in the array. .......................................................................... 102
Figure 4.3: Frequency response of the resonant microphones to sense the speech with high
sensitivity from 856 to 4,889 Hz which covers the main spectrum of speech. ...... 104
Figure 4.4: Frequency response of the resonant microphones to sense the high frequency noise
with high sensitivity from 5,380 to 8,820 Hz for active noise cancellation in this
frequency range. ..................................................................................................... 104
Figure 4.5: Schematic for testing the ANC which mimics the situation illustrated in Fig. 4.1. 105
Figure 4.6: Photos of (a) the ANC experimental set up in an anechoic chamber, (b) the RMA
with preamplifiers, which is placed in a metal box, and (c) the front view of the
metal box with holes for sound transmission to RMA. ......................................... 106
Figure 4.7: Schematics of (a) the analog inverter for ANC RMA and (b) the high pass inverting
amplifier with f_c=5 kHz for ANC with a flat band microphone. ........................ 107
Figure 4.8: Digital phase compensator: s(n) is the noise signal; H(s) is the transfer function of
the resonant microphone; y(n) is the output of the phase compensator. ................ 109
Figure 4.9: Schematic of ANC with a digital adaptive filter. ................................................... 110
Figure 4.10: Schematic of the deep learning with TCN for ANC. ............................................. 111
Figure 4.11: Noise power spectral densities before, and after ANC with RMA, with analog
inverter and with digital phase compensator. ........................................................ 113
Figure 4.12: Noise power spectral densities before and after ANC with RMA plus adaptive filter
and with RMA plus deep learning. ........................................................................ 113
Figure 4.13: Noise change after ANC with adaptive filter for both RMA and flat band
microphone vs the noise power spectral density (PSD) before ANC. ................... 114
Figure 4.14: Noise power spectral density before and after ANC with flat band microphone plus
analog inverter. ...................................................................................................... 114
Figure 4.15: Noise power spectral density before and after ANC with flat band microphone plus
adaptive filter and with flat band microphone plus deep learning. ........................ 115
Figure 4.16: Percentage of noise reduction levels for ANC with analog inverter for RMA and flat
band microphone. ................................................................................................... 115
Figure 4.17: Word error rate (WER) of ASR before and after the ANC with RMA and analog
inverter for the speech with different SNR (with the 70 dB SPL on the test
microphone for all the speech files). ...................................................................... 117
Figure 4.18: Word error rate (WER) of ASR before and after the ANC with RMA and analog
inverter for the speech with the same SNR of -20 dB but with different SPL of the
speech on the test microphone. .............................................................................. 117
Figure 4.19: Size comparison of RMAs studied in Chapter 3 and new design: (a) Design of the
RMAs (b) Area of RMAs. ..................................................................................... 119
Figure 4.20: Size of the resonant microphones (RM) with 5 and 2 µm thick Si paddles. .......... 119
Figure 4.21: The fabricated RMAs with 2 µm thick width-stepped cantilevers for ANC (active noise
cancellation) ........................................................................................................... 120
Figure 4.22: Unamplified sensitivity and quality factor of the RMA with 2 µm thick cantilevers
for ANC. ................................................................................................................ 120
Figure 5.1: Speech recognition market size. ............................................................................. 123
Figure 5.2: Subjective pitch and frequency [58]. ...................................................................... 123
Figure 5.3: Eight Mel filters in (a) narrow band and (b) broad band. The center frequencies are
designed for the resonance frequencies of the RMA. ............................................ 124
Figure 5.4: Photos of the fabricated RMA for narrow band speech spectrum. ........................ 125
Figure 5.5: Photos of the fabricated RMA for wide band speech spectrum (a) Top side view (b)
back side view. ....................................................................................................... 126
Figure 5.6: Measured unamplified sensitivities of the RMAs (a) for narrow band speech
spectrum and (b) for wide band speech spectrum. ................................................. 127
Figure 5.7: Comparison of microphones’ unamplified sensitivities. ........................................ 128
Figure 5.8: Op-amp input referred noise RMS with the RMAs after A-weighting (a) RMA for
narrow band speech spectrum and (b) RMA for wide band speech spectrum. ...... 129
Figure 5.9: Minimum detectable sound of the RMAs with op-amp after A-weighting (a) RMA for
narrow band speech spectrum and (b) RMA for wide band speech spectrum. ...... 129
Figure 5.10: Measured SNR of the RMAs (a) for narrow band speech spectrum and (b) for wide
band speech spectrum. ........................................................................................... 130
Figure 5.11: Size comparison of RMAs studied for speech sensing and recognition: (a) Design of
the RMAs (b) Area of RMAs. ................................................................................ 131
Figure 5.12: The fabricated RMAs with 2 µm thick width-stepped cantilevers for speech sensing.
................................................................................................................................ 132
Figure 5.13: Unamplified sensitivity and quality factor of the RMA with 2 µm thick cantilevers
for speech sensing. ................................................................................................. 132
Abstract
This dissertation presents the development of piezoelectric MEMS resonant microphone arrays
(RMAs) with multiple resonances for sensing sound in applications including lung sound detection
and classification, active noise cancellation, and speech recognition.
A complete lumped element model of a MEMS piezoelectric resonant microphone based on a Si
cantilever with warpage is presented. The effect of the cantilever warpage on the acoustic
impedance is studied theoretically and is shown to lead to increased acoustic pressure leak through
the gap between the cantilever and the substrate wall, which decreases the sensitivity as the sound
frequency becomes lower than a critical value. At the same time, an analytical vibration model of
the width-stepped cantilever with multiple layers (piezoelectric film, electrodes, and insulating
layer) is built and used to derive the electrical equivalent impedance of the cantilever microphone
for the lumped element model. Also, a noise model coupled with an op amp’s noise model is
developed for the resonant microphone. The models for both the microphone’s sensitivity and
noise vs frequency are validated through a fabricated RMA with resonant microphones based on
width-stepped piezoelectric cantilevers.
For wheezing detection and classification of lung sounds, four RMAs with four types of
resonant microphones (rectangular cantilever, spiral microphone, rectangular plate with serpentine
support beams, and width-stepped cantilever) with low resonant frequencies 200 – 600 Hz are
designed, fabricated, and characterized, followed by lung sound recording and signal processing.
Very high unamplified sensitivities (86.0 – 265 mV/Pa) and extremely low noise floors (-4.0 – 7.4
dBA) at the resonance frequencies are obtained. Consequently, the acoustic features of wheezing
are distinguished better in both the time domain and the frequency domain than with a flat band
reference microphone. With this advantage, higher accuracy in identifying lung sounds with and
without wheezing is achieved.
For active noise cancellation (ANC), an RMA with resonance frequencies of 5 – 9 kHz is developed
and shown to be effective in actively canceling noise over that frequency range (which is above
the range where most of the speech information resides) and in improving speech recognition.
When the noise level is low, ANC with the RMA achieves more noise reduction than a similar ANC
based on a flat band microphone, owing to the high sensitivity and low noise floor of the RMA.
For this application, a much smaller RMA with thinner width-stepped cantilever resonant
microphones, which is better suited to wearable applications, is also developed.
For speech sensing and recognition, three RMAs (one for wide band speech spectrum, one for
narrow band speech spectrum, and one for small size) with width-stepped cantilevers are designed,
fabricated, and characterized. The signal-to-noise ratio (SNR) of the RMA for the narrow band
speech spectrum is higher than 73 dBA for 1 Pa sound at all the resonance frequencies.
Chapter 1
Introduction
This chapter states the aim, scope, and significance of the thesis. Before that, the background
is introduced to describe the context and motivation of the study, followed by the literature review,
in which the state of the art in both industry and academia is reviewed and the gap between the
existing research and the unmet need is stated.
1.1 Background
We are embracing a new era of technological revolution: cyber-physical systems (CPS), which
integrate computing, networks, distributed sensors, and actuators [1]. Humans, the physical world,
and the digital world interact closely in such systems, in which acoustic sensors, especially
micro-electro-mechanical systems (MEMS) microphones, play a significant role. A CMOS-compatible,
silicon-based fabrication process makes MEMS microphones small, low-cost, and manufacturable in
high volume with high yield. Consequently, they have been widely used in mobile devices, including
mobile phones, tablets, headphones, etc. The market has grown in recent years and is expected to
continue growing, as shown in Fig. 1.1, which was forecast in 2016 by IHS [2].
Besides mobile devices, another fast-growing market is the smart speaker (Amazon Echo, Google
Home, etc.), as shown in Fig. 1.2 [3], which was forecast in 2019 and contributes to boosting the
shipments of MEMS microphones.
As many as 7 MEMS microphones from Knowles are used in each Amazon Echo [4] for automatic
speech recognition (ASR) and speaker control. Unlike traditional input through a keyboard and
mouse for computers or a touch screen for smartphones, the more convenient voice input through
a microphone is becoming increasingly popular in cars, smart speakers, etc. Besides asking Echo
to play music, users can ask for information such as weather, news, and flights, request services
such as Uber and food delivery, and even turn on a coffee machine that is connected to Echo
wirelessly. This is a typical example of the internet of things (IoT), which is the infrastructure
of CPS. High-quality MEMS microphones are essential in this system as a new human-machine
interface to deliver clear speech to the system.
Hearing aids, medical devices for patients with hearing loss, have the same requirement for
high-performance microphones. High sensitivity and a low noise floor are critical for sensing
speech. At the same time, reducing the noise from the environment is important for the patients to
understand the speech well. More than one microphone is required to cancel the noise through
active noise cancellation (ANC) (Fig. 1.3) [5]. Also, smart hearing aids as part of IoT are coming
with more services that use microphones as an interface [6].
Figure 1.1: Revenue forecast of MEMS microphones in the global market from 2017 to 2027 [2].
Figure 1.2: Shipments forecast of smart speakers and displays, in which there are MEMS
microphones. The increment is around 50 million units per year [3].
Smart hearing aids belong to medical cyber-physical systems (MCPS), which incorporate
medical sensors (thermometer, ECG, stethoscope, etc.), medical treatment devices (portable
insulin pen, etc.), and medical decision centers (database, artificial intelligence, doctors, nurses,
etc.), as illustrated in Fig. 1.4 [7]. The electronic stethoscope is one of the important medical
sensors for monitoring lung sounds, which indicate various lung diseases [8]. With continuous
monitoring of lung sounds, patients can know immediately when an asthma attack is coming and
prepare for it. However, electronic stethoscopes are still rarely used in hospitals or homes,
for two reasons. One reason is that the diagnosis accuracy, which is 71.2% - 100% according
to a recent review [9], is not high enough. The other reason is that existing electronic stethoscopes
are not wearable because of the bulky acoustic coupler, as shown in Fig. 1.5 [10]. The diagnosis
accuracy can be increased, and the acoustic coupler can be made smaller, if MEMS microphones
with higher SNR and sensitivity become available for electronic stethoscopes.
Figure 1.3: Hearing aid configuration with error microphone for ANC (active noise
cancellation) [5].
Figure 1.4: Architecture of medical cyber physical systems (MCPS) [7].
Figure 1.5: Electronic stethoscope from Eko [10].
Based on the above analysis, we realize that better MEMS microphones with higher sensitivity
and SNR are an unmet need in CPS for voice human-machine interface and medical devices
including hearing aids and electronic stethoscopes.
1.2 State of the art
1.2.1 Flat band MEMS microphone
For traditional microphones, the band below the fundamental resonance frequency, where the
sensitivity is flat, is utilized to sense the sound (Fig. 1.6) [11]. Such a microphone can thus be
called a flat band microphone. The bandwidth is usually from 100 Hz to 10 kHz, covering the human
speech frequency range (100 Hz – 7 kHz) and most of the music frequency range (50 Hz – 12 kHz),
which is shown in Fig. 1.7 [12]. With this flat frequency response, sound within the band is
picked up without any change to its spectrum.
Most commercialized MEMS microphones are capacitive transducers, in which the capacitance
change between the backplate and the diaphragm vibrating under sound pressure is converted to
voltage as shown in Fig. 1.8 [13]. Their sensitivity (including both the sensitivity of the MEMS
transducer and the gain of the ASIC) is usually between 5 mV/Pa (TDK INMP411) [14] and
25.12 mV/Pa (TDK ICS-40730, differential mode) [15]. The signal-to-noise ratio (SNR) is usually
between 59 dBA (Knowles SPH2430HR5H-B) [16] and 74 dBA (TDK ICS-40730) [15]. It should be
noted that all specifications, including sensitivity and SNR, in the datasheet of a commercialized
MEMS microphone are based on the output of the package, without distinguishing the MEMS
transducer from the ASIC. Therefore, the sensitivity of the MEMS acoustic transducer itself is
lower than the specified value. At the same time, the SNR is usually tested by comparing the
noise over a certain frequency range with a 1 Pa signal at 1 kHz. Both SNR and sensitivity are
important specifications. If the level of a sound is lower than the equivalent input noise (EIN)
(EIN = 94 dB SPL - SNR), that sound is hard to detect. If the sensitivity is too low, the microphone
output for a weak sound may be lower than the input-referred noise of the pre-amplifier that follows.
Figure 1.6: Frequency response of typical MEMS microphone ICS-40720 from TDK [11].
Figure 1.7: Frequency range of speech, music, and human hearing limit [12].
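The EIN relation above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the SNR is specified against a 1 Pa (94 dB SPL) tone as described in the text; the function names are illustrative and not taken from the thesis.

```python
REF_SPL_DB = 94.0   # 1 Pa RMS is approximately 94 dB SPL (re 20 uPa)
P_REF_PA = 20e-6    # reference pressure for dB SPL


def equivalent_input_noise(snr_db: float) -> float:
    """EIN in dB SPL for an SNR specified against a 94 dB SPL tone."""
    return REF_SPL_DB - snr_db


def spl_to_pascal(spl_db: float) -> float:
    """Convert a sound pressure level (dB re 20 uPa) to RMS pressure in Pa."""
    return P_REF_PA * 10 ** (spl_db / 20.0)


def output_voltage_mv(sensitivity_mv_per_pa: float, spl_db: float) -> float:
    """Microphone output (mV RMS) for a tone at the given level."""
    return sensitivity_mv_per_pa * spl_to_pascal(spl_db)


# SNR values quoted in the text for two commercial microphones.
for name, snr_dba in [("SPH2430HR5H-B", 59.0), ("ICS-40730", 74.0)]:
    ein = equivalent_input_noise(snr_dba)
    print(f"{name}: SNR {snr_dba} dBA, EIN {ein} dBA")

# Output of a 5 mV/Pa microphone for a sound right at a 35 dB SPL noise floor.
print(f"{output_voltage_mv(5.0, 35.0) * 1000:.2f} uV")
```

The last line illustrates the second point of the paragraph: a sound near the noise floor of a low-sensitivity microphone produces only a few microvolts, which the pre-amplifier noise can easily mask.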
Vesper is the only company commercializing the piezoelectric MEMS microphone, which
converts the vibration of the piezoelectric AlN cantilever to voltage as shown in Fig. 1.9 [17]. The
advantage of the piezoelectric transducer is that it does not require a power supply (ASIC in the
microphone package still needs power) while bias is necessary for the capacitive sensor. The
sensitivity and SNR of the Vesper VM2000 are 12.59 mV/Pa and 65 dBA [18]. The sensitivity of
the piezoelectric transducer was disclosed as 7.94 mV/Pa [17].
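Datasheets usually quote sensitivity in dBV/Pa rather than mV/Pa, so converting between the two helps when comparing the numbers above. The sketch below does this for the two Vesper figures. The "implied gain" is only an inference that assumes both figures describe the same transducer under the same conditions, which the text does not state.

```python
import math


def mv_per_pa_to_dbv(sensitivity_mv_per_pa: float) -> float:
    """Convert sensitivity in mV/Pa to dB re 1 V/Pa."""
    return 20.0 * math.log10(sensitivity_mv_per_pa / 1000.0)


package_dbv = mv_per_pa_to_dbv(12.59)     # packaged microphone [18]
transducer_dbv = mv_per_pa_to_dbv(7.94)   # transducer alone [17]

# Assumption: the difference is attributed entirely to the ASIC gain.
implied_gain_db = package_dbv - transducer_dbv

print(f"package:    {package_dbv:.1f} dBV/Pa")
print(f"transducer: {transducer_dbv:.1f} dBV/Pa")
print(f"implied ASIC gain: {implied_gain_db:.1f} dB")
```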
Figure 1.8: Schematic of capacitive MEMS microphone [13].
Figure 1.9: Photo of piezoelectric MEMS transducer of Vesper [17].
Besides commercialized ones, MEMS microphones are still being researched and developed in both
industry and academia. The highest reported unamplified sensitivity of a capacitive MEMS
microphone transducer is 22.39 mV/Pa, obtained with a 0.5 µm thick, 1.95 mm diameter octagonal
SiN diaphragm as shown in Fig. 1.10; this microphone also has a very high SNR of 73 dBA within
22 Hz – 22 kHz [19]. However, its polarization voltage is as high as 200 V, much higher than the
10 V of a standard capacitive MEMS microphone, so it is not suitable for wearable devices. As far
as piezoelectric MEMS microphones are concerned, the highest reported sensitivity is 38 mV/Pa,
obtained with a SiN cantilever covered with a piezoelectric ZnO film as shown in Fig. 1.11 [20].
However, the bandwidth is narrow (100 – 700 Hz) because the fundamental resonant frequency is
890 Hz. The sensitivities of other capacitive and piezoelectric MEMS microphones studied range
from 0.039 mV/Pa to 19.95 mV/Pa [21 - 26]. Unfortunately, many of the publications do not disclose
whether the reported sensitivity includes the gain of the ASIC/pre-amplifier, which diminishes the
significance of the sensitivity as a measure of the transducer. Thus, the SNR, which is between
48.9 and 68 dBA [21 - 26], is more informative in this situation.
Figure 1.10: Structure of measurement capacitive MEMS microphone with 0.5 µm thick 1.95 mm
diameter octagonal SiN diaphragm [19].
Figure 1.11: Cantilever piezoelectric MEMS transducer: a 2 mm x 2 mm cantilever covered with
piezoelectric ZnO film [20].
Another type of MEMS microphone, the optical fiber microphone, has a high SNR, which can
be 67 dB [28] and 75 dB [27]. However, a laser source and an electro-optical unit are required,
which makes the system bulky and unwearable.
The performance of typical commercialized and published MEMS microphones is compared
in Table 1.1, along with several non-MEMS microphone transducers. The unamplified sensitivity
and SNR of the microphone transducers in Table 1.1 are also shown in Fig. 1.12 and Fig. 1.13.
Whether the sensitivity includes the gain of the ASIC/pre-amplifier is marked to reflect the
performance of the transducer itself. The frequency (1 kHz or a range within the bandwidth) at
which the noise for the SNR was measured is also listed, as the SNR at other frequencies in the
bandwidth may be lower than that at 1 kHz. The size of the microphone diaphragm or cantilever
is listed as well, since a larger transducer usually has higher sensitivity and SNR. The bias for
the capacitive transducers is also noted, as a higher bias leads to higher sensitivity.
Compared with traditional non-MEMS condenser microphone transducers, which offer unamplified
sensitivity as high as 50 mV/Pa [32] and SNR as high as 84.5 dB [31], existing MEMS microphones
are still not good enough. A better MEMS microphone with higher sensitivity and SNR is required
for wearable medical sensors.
Figure 1.12: Comparison of unamplified sensitivity of flat band microphones.
Figure 1.13: Comparison of SNR of flat band microphones within the bandwidth.
Table 1.1 Comparison of published and commercialized flat band microphones
Ref.
Classif
ication
Transduce
r type
Sensitivity SNR
Band-
width
(Hz)
Size***
(mm)
Bias
(V) Value
(mV/Pa)
Exclude
gain of
amp
Valu
e
(dB)
Noise
Frequency
(Hz)
[19]
MEMS
Capacitive
22.39 Yes 73 22 – 22k 25 – 20k
1.95 x
1.95
200
[21] 7.33 Yes 30.4 0.1 – 100k 0.1 – 100k
1.14 x
1.14
5.8
[22] 10.23
N/A
**
54 1k 50 – 22k Φ 0.8 N/A
[13] 0.39 N/A 53 1k 300 – 20k Φ 0.23 9.3
[14] 5 No 62 20 – 20k 28 – 20k N/A N/A
[15] 25.12 No 74 20 – 20k 35 – 10k N/A N/A
[16] 7.9 No 59 20 – 20k 50 – 10k N/A N/A
[17]
Piezo-
electric
7.94 Yes 66 20 – 20k 100 – 8k 1 x 1
No
[23] 10.5 Yes 68 15 – 100k 15 – 100k
5.48 x
5.48
[24] 0.039 Yes 54 1k 69 – 20k
0.414 x
0.414
[25] 19.95 N/A 64 1k 188 – 10k Φ 1
[26] 13.34 N/A 48.9 20 – 10k 190 – 9.5k 1 x 1
[20] 38 Yes / / 100 - 700 2 x 2
[27]
Optical
fiber
100 N/A 75 N/A 10 -10k N/A N/A
[28]
0.17/Pa
*
Yes 67 1k – 30k 700 – 8.6k
0.37 x
0.37
N/A
[29]
Piezo-
resistive
0.01 N/A -6 N/A 100 – 15k Φ 2.34 10
[30] N/A N/A 57 1k N/A
1.25 x
0.8
N/A
[31]
Non-
MEMS
Capacitive
44.7 Yes 84.5 N/A 250 – 8k Φ 12.7 200
[32] 50 Yes 80 N/A 5 – 20k Φ 12.7 No
[33] 100 N/A 74 100 – 10k 100 – 6k Φ 4 No
* Different unit was used and unable to convert to mV/Pa in Ref [28].
** N/A: Data not available
*** Size of vibration area: diaphragm, cantilever, etc.
1.2.2 Resonant microphone array (RMA)
The frequency response in Fig. 1.6 shows that the sensitivity of a microphone at its resonant frequency is much higher than in the flat band. The SNR at the resonant frequency is also higher than in the flat band [28]. Therefore, the high sensitivity and SNR of a resonant microphone can be exploited to detect weak sound. However, the bandwidth of a single resonator is narrow. Thus, a resonant microphone array (RMA) consisting of multiple acoustic resonators with different resonant frequencies can be formed to cover any frequency range of interest.
The human auditory system, which is illustrated in Fig. 1.14 [34], is itself composed of resonators. Fig. 1.15 (a) shows that the frequency response of the tympanic membrane (eardrum) is centered around 1 – 2 kHz [35]. In the cochlea of the inner ear, there are around 30,000 resonances at different positions along the basilar membrane, whose stiffness varies from one end to the other [36]. As the frequency responses at some positions of the basilar membrane in Fig. 1.15 (b) [36] illustrate, the basilar membrane is a natural RMA.
Figure 1.14: Structure of the human auditory system [34].
Acoustic artificial basilar membranes or cochleae [37 - 51], which mimic the human cochlea and basilar membrane with multiple resonant frequencies, have been researched to replace the existing commercial cochlear implant, in which an array of electrodes is stimulated by an external microphone and circuits [52]. There are three types of such acoustic transducers. The first type is a trapezoidal membrane with different widths at its two ends [37 - 41], which is a “copy” of the basilar membrane. The second type is an array of resonant beams [42 - 44], with both ends of each beam fixed on the substrate. The third type is an array of resonant cantilevers [45 - 51], with one edge fixed on the substrate.
The structure of the first membrane-type cochlear analog transducer is shown in Fig. 1.16. It is a capacitive transducer that senses sound transferred through fluid to mimic the real cochlear environment [37]. The resonant frequency is 10 – 70 kHz, which is higher than most of the sound spectrum that humans hear. It works well at the wide end of the diaphragm, with a sensitivity of 0.1 – 0.5 mV/Pa at the output of the preamplifier, but there is no meaningful electrical output at the narrow end. Another artificial basilar membrane is the piezoelectric type with 40 µm thick PVDF as the diaphragm, which can also work in a fluid environment, as shown in Fig. 1.17. The resonant frequency in air is 6.8 – 20 kHz with a sensitivity of 0.159 – 1.59 mV/Pa. The sensitivity becomes 0.0159 mV/Pa, with a resonant frequency of 1.5 – 5 kHz, when there is fluid in the chamber [38]. The sensitivity of the diaphragm type can be as high as 27.4 mV/Pa with a large diaphragm 150 mm long and 30 – 60 mm wide, on which PZT films serve as the transducer [40]. However, such a large device is not practical for cell phones and other mobile applications, let alone cochlear implants.
Figure 1.15: Frequency response of the human auditory system: (a) the eardrum [35] and (b) some positions of the basilar membrane in the cochlea [36].
Although diaphragm type transducer can work with fluid, which reflects the real environment
of the basilar membrane in the cochlear, it does not distinguish the resonant frequency as well as
the real basilar membrane. Therefore, beam array and cantilever array were developed with each
beam or cantilever having one particular resonant frequency [42 – 51]. In one study, 64 SU8 beams
with various thicknesses, widths, and lengths were developed with resonant frequencies of 11.5 –
290 kHz as shown in Fig. 1.18 [42]. Lower resonant frequencies of 250 Hz – 2.3 kHz with
relatively larger sensitivities of 1.74 – 13.1 mV/Pa can be achieved with very big beams as shown
in Fig. 1.19 [43].
Figure 1.16: Structure of capacitive cochlear
analog transducer with a membrane diaphragm
30 mm long 0.14 – 1.82 mm wide [37].
Figure 1.17: Structure of piezoelectric
artificial basilar membrane [38].
Compared with the diaphragm and beam structures, the residual stress of a cantilever does not affect its resonant frequency much, so the resonant frequencies of a cantilever array can be controlled much better. At the same time, the sensitivity can be higher because the cantilever deforms much more under acoustic pressure. Almost all cantilever acoustic transducers are piezoelectric because electrostatic transducers suffer from pull-in instability when the cantilever vibrates with large amplitude. J. Jang et al. studied an array of 10 Si cantilevers covered with AlN film, as shown in Fig. 1.20 [49]. The sensitivity is 0.354 – 1.67 mV/Pa with resonant frequencies of 2.63 – 13.3 kHz. S. Hur et al. studied an array of 10 piezoelectric PMN-PT cantilevers, shown in Fig. 1.21 [48]. The sensitivity is 1.27 – 80.4 mV/Pa. Except for [45 - 47], none of the resonant membrane or resonant beam/cantilever arrays reported above disclose whether the gain of the preamplifier is excluded from the sensitivity, and SNR was not measured. C. Zhao et al. designed an array of AlN bimorph cantilevers, shown in Fig. 1.22 [47]. The sensitivity excluding the gain of the preamplifier and the SNR were reported as 0.05 – 0.1 mV/Pa and 64 – 84 dBA in air at resonant frequencies of 18.8 – 40.8 kHz, and as 0.001 – 0.01 mV/Pa and 43 – 48 dBA in water at resonant frequencies of 5.6 – 13.5 kHz. Lukas Baumgartel designed an array of 13 Si paddle cantilevers covered with ZnO film, with a very high unamplified sensitivity of 10.8 – 202.6 mV/Pa at resonant frequencies of 860 – 6263 Hz, as shown in Fig. 1.23 [45]. A very high SNR of 85.8 dBA and a high unamplified sensitivity of 138 mV/Pa were measured for a similar paddle in [46].
Figure 1.18: Array of 64 SU-8 resonant beams with different thickness (2.99 – 142 µm), width (0.1 – 0.6 mm), and length (0.75 – 1.5 mm) [42].
Figure 1.19: Array of 8 Kapton/Al triboelectric beams 8.2 – 32 mm long and 6 – 8 mm wide [43].
Figure 1.20: Array of 10 Si cantilevers, 2 µm thick, 0.3 mm wide, and 0.6 – 1.35 mm long, covered with 0.5 µm thick AlN [49].
Figure 1.21: Array of 10 piezoelectric PMN-PT cantilevers, 15 µm thick, 0.3 mm wide, and 0.55 – 3 mm long [48].
Figure 1.22: Array of 4 AlN bimorph cantilevers with 0.4 mm width and 0.3 – 0.443 mm length: (a) photo of the array; (b) sensitivity of the array, 0.05 – 0.1 mV/Pa [47].
Figure 1.23: Array of 13 Si cantilevers, 5 µm thick and 1 – 2.5 mm wide and long: (a) photo of the array; (b) sensitivity of the array, 10.8 – 202.6 mV/Pa [45].
The comparison of reviewed resonant acoustic sensors and sensor arrays is summarized in
Table 1.2, in which whether the sensitivity excludes the gain of amplifiers is noted. The size of the
vibration part of the transducer (membrane, beams, cantilevers) is also listed. The unamplified
sensitivity and SNR are illustrated in Fig. 1.24 and Fig. 1.25. We can see that compared with flat
band MEMS microphone transducers, a much higher sensitivity of 202.6 mV/Pa [45] and SNR
85.8 dBA [46] can be achieved through a self-powered piezoelectric resonant cantilever
microphone array with reasonable cantilever size (1 – 2.5 mm). Therefore, such a resonant
microphone array has great potential for lung sound detection with the wearable electronic
stethoscope for medical cyber-physical systems (MCPS) and speech recognition in a voice human-
machine interface. However, there are still problems with the existing highly sensitive resonant
microphone array, which will be stated in the next section.
Figure 1.24: Comparison of unamplified
sensitivity of resonant microphone array
(RMA) at resonance frequency and flat band
microphones within the bandwidth.
Figure 1.25: Comparison of SNR of RMA at
resonance frequency and flat band
microphones within the bandwidth.
Table 1.2 Comparison of published resonant acoustic sensors.
Ref. | Classification | Transducer type | Sensitivity at resonant frequencies (mV/Pa) | Sensitivity excludes gain of amp | SNR at resonant freq. (dB) | Resonant frequency range (Hz) | Q | Size*** (mm) | Channel No.
[37]* | Membrane (fixed at four edges) | Capacitive | 0.05 – 0.35 | No | N/A | 10k – 70k | N/A | W: 0.14 – 1.82, L: 30 | 32
[38] | Membrane (fixed at four edges) | Piezoelectric | 0.159 – 1.59 | N/A** | N/A | 7k – 20k | N/A | W: 2 – 4, L: 30 | 24
[39] | Membrane (fixed at four edges) | Piezoelectric | 0.5 – 2.94 | N/A | N/A | 2k – 16k | N/A | W: 0.07 – 0.3, L: 0.7 | 2
[40] | Membrane (fixed at four edges) | Piezoelectric | 23.3 – 29.85 | N/A | N/A | 500 – 1k | N/A | W: 30 – 60, L: 150 | 3
[41] | Membrane (fixed at four edges) | Piezoelectric | 0.035 – 2.24 | N/A | N/A | 2.5 – 13.5k | N/A | W: 1 – 8, L: 28 | 23
[42] | Beams (fixed at two edges) | No transducer | N/A | N/A | N/A | 11.5k – 290k | N/A | 0.1 x 0.75 – 0.6 x 1.5 | 64
[43] | Beams (fixed at two edges) | Triboelectric | 1.74 – 13.1 | N/A | 24 | 290 – 2.3k | N/A | 8.2 x 6 – 32 x 8 | 8
[44] | Beams (fixed at two edges) | Piezoelectric | 0.0017 – 0.0063 | N/A | N/A | 10.2k – 24.3k | N/A | 1.14 x 0.4 – 3.3 x 0.4 | 10
[45] | Cantilevers (fixed at one edge) | Piezoelectric | 10.8 – 202.6 | Yes | N/A | 860 – 6.263k | 29 – 51.5 | 1.0 x 1.0 – 2.5 x 2.5 | 13
[46] | Cantilevers (fixed at one edge) | Piezoelectric | 138 | Yes | 85.8 | 1.36k | N/A | 2.3 x 2.3 | 1
[47] | Cantilevers (fixed at one edge) | Piezoelectric | 0.05 – 0.1 | Yes | 84 | 18.8k – 40.8k | 47 – 90 | 0.4 x 0.3 – 0.4 x 0.443 | 4
[48] | Cantilevers (fixed at one edge) | Piezoelectric | 1.74 – 80.4 | N/A | N/A | 380 – 13.6k | N/A | 0.55 x 0.3 – 3 x 0.3 | 10
[49] | Cantilevers (fixed at one edge) | Piezoelectric | 0.354 – 1.67 | N/A | N/A | 2.63k – 13.3k | 43.7 – 134 | 0.6 x 0.3 – 1.35 x 0.3 | 10
[50] | Cantilevers (fixed at one edge) | Optical | N/A | N/A | N/A | 286 – 6.948k | 9.38 – 14.1 | 1.5 x 0.1 – 7.5 x 0.1 | 4
[51] | Cantilevers (fixed at one edge) | No transducer | N/A | N/A | N/A | 1.6k – 7k | N/A | 1.67 x 0.0625 – 2.5 x 0.0625 | 24
* Ref. [37]: Sensitivity, SNR, and resonant frequencies were measured with fluid in the chamber.
** N/A: Data not available
*** Size of vibration area: membrane, beam, cantilever, etc.
1.3 Problem statement
We have discussed that the resonant cantilever microphone array (RCMA) has greater potential
than the flat band microphone for high sensitivity and SNR. At the same time, the resonant
microphones in the array are also natural acoustic filters with high Q at each resonant frequency
so that the noise out of the bandwidth of each resonant frequency can be suppressed, which makes
the transducer noise-robust. However, only one transducer reported [45] has a very high
unamplified sensitivity of 202.6 mV/Pa suitable for speech recognition and medical sensor
applications. It has been used for speech recognition [53] and lung sound classification [46, 54]
and shows good performance. However, there are still some problems with this transducer.
First, low resonant frequencies are missing. The resonant frequencies of the reported device [45] range from 860 to 6263 Hz, as shown in Fig. 1.23 (b). This range misses the main frequency band of lung sounds, which lies between 60 and 600 Hz, as shown in Fig. 1.26 [55]. As far as speech sensing is concerned, the low frequencies from 300 to 860 Hz are also missing, as the narrow speech band is 300 – 4000 Hz and the wide speech band is 50 – 7000 Hz according to the ITU specification [56]. Fig. 1.27 also shows that speech at around 500 Hz is an important portion for intelligibility [57].
Figure 1.26: Lung sound spectrum [55].
Figure 1.27: Intelligibility of the speech at
different frequencies [57].
Second, the resonant frequencies were designed with an even distribution. However, the human ear does not perceive sound frequency evenly; it distinguishes frequencies better at lower frequencies [58]. The Mel frequency distribution shown in Fig. 1.28 reflects this fact [59]. Therefore, an RMA with Mel-distributed resonant frequencies is required.
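As an illustration of what a Mel-distributed set of resonant frequencies could look like, the following minimal sketch spaces a hypothetical 13-channel array evenly on the Mel scale. It assumes the commonly used conversion mel = 2595 log10(1 + f/700), which may differ from the exact mapping in [59]; the 200 – 6000 Hz range and the channel count are illustrative choices, not design values from this work.

```python
import math

def hz_to_mel(f_hz):
    # One common Mel-scale formula (an assumption; other variants exist).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_spaced_frequencies(f_low_hz, f_high_hz, count):
    """Return `count` frequencies evenly spaced on the Mel scale."""
    m_lo, m_hi = hz_to_mel(f_low_hz), hz_to_mel(f_high_hz)
    step = (m_hi - m_lo) / (count - 1)
    return [mel_to_hz(m_lo + k * step) for k in range(count)]

# Hypothetical array: 13 channels between 200 Hz and 6 kHz.
freqs = mel_spaced_frequencies(200.0, 6000.0, 13)
for k, f in enumerate(freqs, start=1):
    print(f"channel {k:2d}: {f:7.1f} Hz")
```

The spacing between neighboring channels grows with frequency, so more channels fall in the low-frequency region where the ear resolves frequency better.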
Third, besides stimulating the hearing neurons in the cochlear [39, 43, 47, 49], speech
recognition [53], and lung sound classification [46, 54], other applications of RMA have not been
explored. One important potential application is active noise cancellation for hearing aids to
control the noise reaching the eardrum. The high sensitivity and low noise properties of RMA are
beneficial to detect all the noise and then cancel it through proper algorithms.
1.4 Aim, scope, and significance of the research
A self-powered piezoelectric resonant microphone array (RMA) covering low resonant frequencies (down to 200 Hz) with high sensitivity and SNR will be researched and developed for lung sound detection and classification, speech sensing, and active noise cancellation with hearing aids. Mel-distributed resonant frequencies for the RMA will also be studied. At the same time, a resonant microphone with a lower resonant frequency is usually larger. Therefore, this work will also address the challenge of miniaturizing the resonant microphones without sacrificing sensitivity and SNR.
Figure 1.28: Mel frequency distribution [59].
An RMA with high sensitivity and SNR and with low resonant frequencies covering the lung sound spectrum will help to better detect and classify adventitious lung sounds such as wheezing. At the same time, the acoustic coupler can also be miniaturized. Consequently, it will enable high-performance, low-power wearable electronic stethoscopes for patients. By monitoring lung sounds continuously in real time, patients and their care providers within medical cyber-physical systems (MCPS) can be alerted whenever abnormal adventitious lung sounds emerge. Patients will then be less likely to be put at risk by unpredicted, life-threatening disease attacks such as asthma.
For speech recognition, the proposed device will make it more accurate with weak voices or
speech from a long distance. Therefore, it will enhance the ability of the voice human-machine
interface. At the same time, patients with hearing loss wearing the hearing aids with proposed
RMA for both speech sensing and active noise cancellation will understand the speech much better
in a noisy environment.
1.5 Overview of the dissertation
This dissertation consists of six chapters.
This chapter (chapter 1) begins with the background of the high-performance microphone in
cyber-physical systems and the gap in the literature. Then the aim, scope, and significance of this
study are presented.
Chapter 2 presents the modeling of the resonant microphone array (RMA) including the
complete lumped element model of the acoustic, mechanical, and electrical domain of the RMA,
analytical model of the cantilever microphone vibration, and noise model of the RMA with pre-
amplifier.
Chapter 3 presents the resonant microphone array (RMA) with low resonant frequencies (200 – 800 Hz) for lung sound detection and classification. The design, fabrication, and characterization of the device, and its application to wheezing detection and classification in lung sounds, are discussed.
Chapter 4 presents the RMA for ANC (active noise cancellation) for hearing aids. The design of the RMA transducer and hearing aid system is shown first. Then the performance of ANC and its effect on ASR are discussed. Finally, the development of a smaller RMA for ANC for hearing aids is presented.
Chapter 5 presents the RMAs for speech sensing and recognition. The design and characterization of the RMA for both normal and wearable applications are presented.
Chapter 6 concludes this dissertation and discusses future directions.
Chapter 2
Modeling
This chapter presents the modeling of RMA. A complete lumped element model is established
with the warped MEMS cantilever microphone, and its effect on the acoustic pressure leakage is
studied. The cantilever vibration, acoustic impedance, and pressure-voltage transformation
through the piezoelectric effect are all included in the model. An electroacoustic model of a paddle-
shaped cantilever is used to develop a lumped equivalent-circuit model for acoustics through the
gap between the cantilever and the surrounding substrate. A noise model of a piezoelectric
cantilever together with an op-amp is described. Finally, the models for both sensitivity and noise
are validated through experimental results with an RMA.
2.1 Lumped element model of MEMS cantilever microphone
The basic structure of the MEMS microphone in the array is a support layer Si cantilever with
piezoelectric thin film ZnO, which converts the mechanical strain caused by sound pressure to
voltage. An ideal MEMS cantilever microphone structure is shown in Fig. 2.1. However, there is
warpage in the cantilever because of thin film stress (Fig. 2.2) in a real device fabricated, resulting
in an increased air gap (and thus pressure leakage) around the three edges of the cantilever. Thus,
for audio frequency (less than 10 kHz with wavelength larger than 36 mm), an electroacoustic
lumped-circuit-element model (Fig. 2.3b) is developed to account for the pressure leakage and
added to the rest of the model that accounts for piezoelectric conversion. The $Z_b$ is the equivalent electroacoustic impedance due to the pressure leakage through the air gap around the cantilever (Fig. 2.1) and is in parallel with the equivalent impedance of the cantilever vibration $Z_c$, which is derived in a later section. Since the plane acoustic wave propagates through the air, which has an equivalent impedance of air mass density ($\rho_0$) times sound speed ($c_0$), the $Z_a$ ($= \rho_0 c_0$) and $Z_d$ ($= \rho_0 c_0$) are added to model the free-space air acoustic impedance.
2.1.1 Impedance of the air gap
For a baffle having a rectangular opening with width 𝑎 and height 𝑑 (Fig. 2.4a) with the d
being not too large, the acoustic impedance for a wave traveling along the x-direction (with depth
b) is [60]
Figure 2.1: Ideal structure of cantilever microphone (labels: Si cantilever, gap, Si base, channel through KOH etch).
Figure 2.2: Real structure of cantilever microphone with warpage (label: warped Si cantilever).
Figure 2.3: Lumped element model of MEMS cantilever microphone: (a) schematic of impedance with warped cantilever; (b) equivalent circuit.
\[ Z = \left(\frac{1}{a^2} + \frac{1}{d^2}\right)\frac{\pi^6 \eta b}{64\,a d} + j\omega\,\frac{\pi^4 \rho b}{64\,a d} \tag{2-1} \]
where $\eta$ and $\rho$ are the viscosity and mass density of air, respectively. For audio sounds with a frequency less than 10 kHz (that has a wavelength larger than 36 mm), treating the gap around the three edges in Fig. 2.1 (and repeated with the definitions for air gap d and length l shown in Fig. 2.4b) as if it were one wide-and-short rectangular gap (as shown in Fig. 2.4a) would be reasonable for a small-sized cantilever plus gap (e.g., 3 x 4 mm), since the total length of each side is less than 10% of the wavelength.
According to Fig. 2.1, the dimension of the Si cantilever is $l \times c_l \times h$, which is obtained from a Si diaphragm with dimension $l_d \times c_d \times h$. The slot width for the cantilever release is $g$, and the width of the diaphragm rim that is left is $e$. The dimensions in the Si channel with the warped cantilever are shown in Fig. 2.5. The shunt acoustic impedance of the air gap can be calculated as:
𝑍
&
= (
!
(,
$
-,)
!
+
!
/
$
!
)
$
"
%0
'((,
$
-,)/
$
+𝑗𝜔
$
#
*0
'((,
$
-,)/
$
+
2×-.
!
,
$
!
+
!
1
%
&
'('&
!
2
!
/
$
"
%0
'(,
$
1
%
&
'('&
!
2
+𝑗𝜔
$
#
*0
'(,
$
1
%
&
'('&
!
2
0
where 𝑙
&
= sin(180°−54.74°)×(𝑙+𝑔+𝑒)/sin (54.74°−𝛼)
(a) (b)
Figure 2.4: (a) Rectangular channel [60] (b) Air gap around a cantilever.
(2-2)
2.1.2 Impedance of cantilever (Zc)
A cantilever can equivalently be represented with a capacitor, an inductor, and a resistor in
series (Fig. 2.6) [61].
Figure 2.5: Dimensions used to calculate the paralleled acoustic impedance with a warped cantilever: (a) cross-section view; (b) top view (label: air gap).
Figure 2.6: Impedance of cantilever.
The capacitance is due to the volume compliance under static pressure:
\[ C_c = \left.\frac{\forall_c}{p_c}\right|_{V=0} \tag{2-3} \]
where $\forall_c$ is the volume displacement under pressure $p_c$:
\[ \forall_c = c_l \int_0^{l} w(x)\,dx \tag{2-4} \]
where $w(x)$ is the cantilever displacement function, which is obtained through the analytical model in later sections; $c_l$ is the width of the cantilever; and $l$ is the length of the cantilever. For the stepped cantilever, Fig. 2.6 has
\[ \forall_c = c_s \int_0^{l_s} w_1(x)\,dx + c_l \int_{l_s}^{l} w_2(x)\,dx \tag{2-5} \]
where $c_s$ and $c_l$ are the total widths at the fixed end and the free end, respectively.
The inductance is equivalent to the mass $M_c$, which can be obtained through
\[ \frac{1}{2} M_c v_c^2 = \frac{1}{2}\int_0^{l} m(x)\,v(x)^2\,dx \]
For harmonic vibration, the velocity is $v_c = \dot{\forall}_c = j\omega\forall_c$ and $v(x) = \dot{w}(x) = j\omega w(x)$. Thus,
\[ L_c = M_c = \frac{\int_0^{l} m(x)\,w(x)^2\,dx}{\forall_c^2} \]
For the width-stepped cantilever,
\[ \frac{1}{2} M_c v_c^2 = \frac{1}{2}\int_0^{l_s} m_1(x)\,v_1(x)^2\,dx + \frac{1}{2}\int_{l_s}^{l} m_2(x)\,v_2(x)^2\,dx \tag{2-6} \]
\[ L_c = M_c = \frac{\int_0^{l_s} m_1(x)\,w_1(x)^2\,dx + \int_{l_s}^{l} m_2(x)\,w_2(x)^2\,dx}{\forall_c^2} \tag{2-7} \]
where $m_1(x)$ and $m_2(x)$ are obtained in a later section dealing with the cantilever analytical model.
The resistance is
\[ R_c = 2\zeta\sqrt{\frac{M_c}{C_c}} \tag{2-8} \]
where $\zeta$ is the damping coefficient, which is assumed to be 0.03 in this study, the same value used in similar literature [61].
2.1.3 Piezoelectric Transduction
The cantilever's electroacoustical impedance $Z_c$ is connected to a transformer and to $Z_e$, which model the piezoelectric transduction and the electrical impedance of the piezoelectric layer on the cantilever, respectively.
Figure 2.7: Cantilever impedance, piezoelectric transduction, and equivalent impedance of a piezoelectric layer.
Define the system piezoelectric coefficient
\[ d_p = \left.\frac{Q}{p_c}\right|_{V=0} = \left.\frac{\forall}{V}\right|_{p=0} \tag{2-9} \]
where $Q$ is the charge generated, which can be calculated through the cantilever vibration analytical model in a later section.
If $V = 0$,
\[ v_c = \frac{p_c}{Z_c}, \quad I = n v_c = \dot{Q} = j\omega Q, \quad Q = d_p \times p_c \]
Therefore
\[ n = j\omega d_p Z_c \tag{2-10} \]
If $p_c = 0$,
\[ v_c = j\omega\forall_c = j\omega d_p V, \quad v_c = \frac{nV}{n^2 Z_e} \]
Therefore
\[ n = \frac{1}{j\omega d_p Z_e} \tag{2-11} \]
2.1.4 Electrical impedance
The electrical impedance $Z_e$ of the piezoelectric resonant microphone is modeled with an equivalent capacitance $C_e$ and resistance $R_e$ in parallel, as shown in Fig. 2.8.
Figure 2.8: Piezoelectric electrical impedance.
2.2 Lumped element model of MEMS cantilever microphone with Pre-amp
The input impedance $Z_p$ of a pre-amp and a bias resistor $R_b$ ($10^9$ ohm) are added to the equivalent circuit together with an ideal op-amp and the resistors for non-inverting amplification (Fig. 2.9). The pre-amp LTC6244 is selected as it has a very high input impedance ($R_p = 10^{12}$ ohm in parallel with $C_p = 2.1$ pF).
2.2.1 Sensitivity
The sensitivity of the system can be derived through the lumped element model shown in Fig. 2.9.
Figure 2.9: Full lumped element model of MEMS cantilever microphone together with pre-amp.
Define
\[ Z_{in} = Z_e \parallel R_b \parallel Z_p \tag{2-12} \]
Then we can get
\[ V_{in} = \frac{n Z_{in}}{Z_c + n^2 Z_{in}} \times \frac{(Z_c + n^2 Z_{in}) \parallel Z_b}{Z_a + (Z_c + n^2 Z_{in}) \parallel Z_b + Z_d} \tag{2-13} \]
\[ V_{out} = V_{in}\left(1 + \frac{R_f}{R_g}\right) \]
Thus, the unamplified sensitivity is
\[ S_u = \frac{V_{in}}{p} \tag{2-14} \]
2.2.2 Noise
The electrical noises from the microphone and the pre-amp are modeled with equivalent noise sources as shown in Fig. 2.10, according to the Texas Instruments noise analysis of operational amplifier circuits [82].
Figure 2.10: Noise model of MEMS cantilever microphone together with pre-amp (circuit labels: Cin, Rin, Vout, Rf, Rg, Eg, Ef, Inn, Inp, En, Iin).
Here $C_{in}$ is the parallel combination of the microphone capacitance $C_e$ and the input capacitance of the pre-amp $C_p$, and $R_{in}$ is the parallel combination of the microphone resistance $R_e$, the input resistance of the pre-amp $R_p$, and the bias resistance $R_b$.
Input referred noise spectrum caused by $R_g$ and $R_f$:
\[ e_{in1}^2 = 4KT\,(R_f \parallel R_g) \tag{2-15} \]
Input referred noise spectrum caused by $R_{in}$:
\[ e_{in2}^2 = \frac{4KT R_{in}}{1 + (\omega C_{in} R_{in})^2} \tag{2-16} \]
Input referred noise spectrum caused by the pre-amp input referred voltage noise $e_n$:
\[ e_{en}^2 = \frac{k_e^2}{f} + e_{nw}^2 \tag{2-17} \]
where $k_e^2/f$ is flicker noise and $e_{nw}^2$ is white noise, which can be obtained from the op-amp datasheet (Fig. 2.11a).
Input referred noise spectrum caused by the op-amp's input referred current noise $i_{np}$:
\[ e_{inp}^2 = \frac{i_{np}^2 R_{in}^2}{1 + (\omega C_{in} R_{in})^2} \tag{2-18} \]
where
\[ i_{np}^2 = \frac{K^2}{f} + i_{npw}^2 \tag{2-19} \]
where $K^2/f$ and $i_{npw}^2$ can be obtained from the op-amp datasheet (Fig. 2.11b).
Input referred noise spectrum caused by the pre-amp input referred current noise $i_{nn}$:
\[ e_{inn}^2 = i_{nn}^2\,(R_f \parallel R_g)^2 \tag{2-20} \]
where $i_{nn}^2 = i_{np}^2$.
Therefore, the total input referred noise spectrum is
\[ e_t^2 = e_{in1}^2 + e_{in2}^2 + e_{en}^2 + e_{inp}^2 + e_{inn}^2 \tag{2-21} \]
The total input referred noise is
\[ E_t^2 = \int_{f_L}^{f_H} e_t^2\,df \tag{2-22} \]
$f_L$ is usually 20 Hz, and $f_H$ is usually 20 kHz for microphones for general audio applications.
Thus, we can get the minimum detectable sound pressure of the system:
\[ p_m = \frac{E_t}{S_u} \]
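To make the band integration of the noise spectrum concrete, the sketch below integrates only the thermal-noise term of the input resistance (the spectrum of the form 4KT·R_in / (1 + (ωC_in·R_in)²)) from 20 Hz to 20 kHz. The values of R_in, C_in, and temperature are hypothetical placeholders, not parameters of the device in this work, and the remaining noise terms are omitted for brevity.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def rin_noise_density_sq(f_hz, r_in, c_in, temp_k=300.0):
    """Noise power spectral density (V^2/Hz) of R_in shunted by C_in."""
    w = 2.0 * math.pi * f_hz
    return 4.0 * K_B * temp_k * r_in / (1.0 + (w * c_in * r_in) ** 2)

def integrate_log(func, f_lo, f_hi, points=20000):
    """Trapezoidal integration on a logarithmically spaced frequency grid."""
    ratio = (f_hi / f_lo) ** (1.0 / (points - 1))
    total, f_prev, y_prev = 0.0, f_lo, func(f_lo)
    for _ in range(points - 1):
        f_next = f_prev * ratio
        y_next = func(f_next)
        total += 0.5 * (y_prev + y_next) * (f_next - f_prev)
        f_prev, y_prev = f_next, y_next
    return total

# Hypothetical values chosen only for illustration.
R_IN = 1.0e9    # ohm
C_IN = 20e-12   # farad
e_sq = integrate_log(lambda f: rin_noise_density_sq(f, R_IN, C_IN), 20.0, 20e3)
e_rms = math.sqrt(e_sq)
print(f"R_in thermal noise over 20 Hz - 20 kHz: {e_rms * 1e6:.2f} uV rms")
```

Dividing a total rms noise obtained this way (with all terms included) by the unamplified sensitivity gives the minimum detectable pressure.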
2.3 Analytical model of width-stepped cantilever vibration
A width-stepped cantilever design (Fig. 2.12) has advantages over a standard cantilever,
because of a smaller size for the same resonance frequency as well as higher stress concentration
near the fixed end, which leads to the higher voltage generated.
Figure 2.11: (a) Noise voltage and (b) noise current of the op amp LTC6244 used for the microphone array developed in this study.
Figure 2.12: Schematic of the width-stepped cantilever (labels: Al electrodes, $h_{A1}$ and $h_{A2}$ thick; SiN insulators, $h_{SN1}$ and $h_{SN2}$ thick; ZnO piezoelectric layer, $h_Z$ thick; Si cantilever; axes $x$ and $z$).
2.3.1 Mode of vibration
Though the width-stepped cantilever is two-dimensional, it is approximated as a one-dimensional beam with two sections, with the following equations for free vibration (i.e., harmonic vibration with frequency $f$, or $\omega = 2\pi f$) of the beam [83]:
\[ \begin{cases} \dfrac{d^4 W_1(x)}{dx^4} - \beta_1^4 W_1(x) = 0, & 0 \le x \le l_s \\[2mm] \dfrac{d^4 W_2(x)}{dx^4} - \beta_2^4 W_2(x) = 0, & l_s < x \le l \end{cases} \tag{2-23} \]
where $\beta_1^4 = m_1\omega^2/E_1 I_1$ and $\beta_2^4 = m_2\omega^2/E_2 I_2$, with $m_1$, $m_2$ being the mass per unit length, $E_1$, $E_2$ being the Young's modulus, and $I_1$, $I_2$ being the moment of inertia, respectively.
For $0 \le x \le l_s$, the neutral axis $z_0$ is the solution of the following equation for the multi-layer structure:
\[ \sum\int E\,(z - z_0)\,dz = 0 \]
\[ \begin{aligned} &\int_{0}^{h} E_S (z - z_0)\,dz + \int_{h}^{h+h_{A1}} E_A (z - z_0)\,dz + \int_{h+h_{A1}}^{h+h_{A1}+h_{SN1}} E_{SN} (z - z_0)\,dz \\ &+ \int_{h+h_{A1}+h_{SN1}}^{h+h_{A1}+h_{SN1}+h_Z} E_Z (z - z_0)\,dz + \int_{h+h_{A1}+h_{SN1}+h_Z}^{h+h_{A1}+h_{SN1}+h_Z+h_{SN2}} E_{SN} (z - z_0)\,dz \\ &+ \int_{h+h_{A1}+h_{SN1}+h_Z+h_{SN2}}^{h+h_{A1}+h_{SN1}+h_Z+h_{SN2}+h_{A2}} E_A (z - z_0)\,dz = 0 \end{aligned} \tag{2-24} \]
where $E_S$, $E_A$, $E_{SN}$, $E_Z$ are the Young's modulus of Si, Al, PECVD SiN, and ZnO, respectively; $h$, $h_A$, $h_{SN}$, $h_Z$ are the thickness of Si, Al, SiN, and ZnO as shown in Fig. 2.12.
For $l_s < x \le l$, the neutral axis is $h/2$.
For the area moment of inertia, we have the following for $0 \le x \le l_s$:
\[ E_1 I_1 = \sum\int E\,(z - z_0)^2\,dA = c_s \sum\int E\,(z - z_0)^2\,dz \]
\[ \begin{aligned} E_1 I_1 = c_s \Big[ &\int_{0}^{h} E_S (z - z_0)^2\,dz + \int_{h}^{h+h_{A1}} E_A (z - z_0)^2\,dz + \int_{h+h_{A1}}^{h+h_{A1}+h_{SN1}} E_{SN} (z - z_0)^2\,dz \\ &+ \int_{h+h_{A1}+h_{SN1}}^{h+h_{A1}+h_{SN1}+h_Z} E_Z (z - z_0)^2\,dz + \int_{h+h_{A1}+h_{SN1}+h_Z}^{h+h_{A1}+h_{SN1}+h_Z+h_{SN2}} E_{SN} (z - z_0)^2\,dz \\ &+ \int_{h+h_{A1}+h_{SN1}+h_Z+h_{SN2}}^{h+h_{A1}+h_{SN1}+h_Z+h_{SN2}+h_{A2}} E_A (z - z_0)^2\,dz \Big] \end{aligned} \tag{2-25} \]
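For $l_s < x \le l$,
\[ E_2 I_2 = \frac{c_l E_S h^3}{12} \tag{2-26} \]
As far as the equivalent mass is concerned, for $0 \le x \le l_s$,
\[ m_1(x) = \sum_{i=1}^{n} \rho_i A_i = \sum_{i=1}^{n} \rho_i c_s h_i = c_s \sum_{i=1}^{n} \rho_i h_i \tag{2-27} \]
For $l_s < x \le l$,
\[ m_2(x) = \rho_S A_2 = c_l \rho_S h \tag{2-28} \]
The general solution of the fourth-order differential equation 2-23 is
\[ \begin{cases} W_1(x) = C_1(\cos\beta_1 x + \cosh\beta_1 x) + C_2(\cos\beta_1 x - \cosh\beta_1 x) \\ \qquad\quad + C_3(\sin\beta_1 x + \sinh\beta_1 x) + C_4(\sin\beta_1 x - \sinh\beta_1 x) \\ W_2(x) = C_5(\cos\beta_2 x + \cosh\beta_2 x) + C_6(\cos\beta_2 x - \cosh\beta_2 x) \\ \qquad\quad + C_7(\sin\beta_2 x + \sinh\beta_2 x) + C_8(\sin\beta_2 x - \sinh\beta_2 x) \end{cases} \tag{2-29} \]
Let $\lambda = \beta l$; we get
\[ \begin{cases} W_1(x) = C_1\left(\cos\lambda_1\frac{x}{l} + \cosh\lambda_1\frac{x}{l}\right) + C_2\left(\cos\lambda_1\frac{x}{l} - \cosh\lambda_1\frac{x}{l}\right) \\ \qquad\quad + C_3\left(\sin\lambda_1\frac{x}{l} + \sinh\lambda_1\frac{x}{l}\right) + C_4\left(\sin\lambda_1\frac{x}{l} - \sinh\lambda_1\frac{x}{l}\right) \\ W_2(x) = C_5\left(\cos\lambda_2\frac{x}{l} + \cosh\lambda_2\frac{x}{l}\right) + C_6\left(\cos\lambda_2\frac{x}{l} - \cosh\lambda_2\frac{x}{l}\right) \\ \qquad\quad + C_7\left(\sin\lambda_2\frac{x}{l} + \sinh\lambda_2\frac{x}{l}\right) + C_8\left(\sin\lambda_2\frac{x}{l} - \sinh\lambda_2\frac{x}{l}\right) \end{cases} \tag{2-30} \]
Resonance angular frequency:
\[ \omega = \beta_1^2\sqrt{\frac{E_1 I_1}{m_1(x)}} = \beta_2^2\sqrt{\frac{E_2 I_2}{m_2(x)}} = \lambda_1^2\sqrt{\frac{E_1 I_1}{m_1(x)\,l^4}} = \lambda_2^2\sqrt{\frac{E_2 I_2}{m_2(x)\,l^4}} \tag{2-31} \]
Thus
\[ \beta_1 = \beta_2 \times \sqrt[4]{\frac{E_2 I_2}{E_1 I_1} \times \frac{m_1(x)}{m_2(x)}}, \quad \lambda_1 = \lambda_2 \times \sqrt[4]{\frac{E_2 I_2}{E_1 I_1} \times \frac{m_1(x)}{m_2(x)}} \]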
The boundary conditions for the beam are as follows.
At $x = 0$:
\[ W_1(0) = 0, \quad W_1'(0) = 0 \]
At $x = l_s$:
\[ W_1(l_s) = W_2(l_s), \quad W_1'(l_s) = W_2'(l_s), \]
\[ E_1 I_1 W_1''(l_s) = E_2 I_2 W_2''(l_s), \quad E_1 I_1 W_1'''(l_s) = E_2 I_2 W_2'''(l_s) \]
At $x = l$:
\[ E_2 I_2 W_2''(l) = 0, \quad E_2 I_2 W_2'''(l) = 0 \]
From the boundary conditions above, $\lambda_1$ and $\lambda_2$ can be obtained, but the analytical expressions for $\lambda_1$ and $\lambda_2$ are complicated. Thus, a numerical method is used to solve the equations.
Then the resonance frequencies are calculated as
\[ f_n = \frac{\lambda_2^2}{2\pi}\sqrt{\frac{E_2 I_2}{m_2(x)\,l^4}} \tag{2-32} \]
And the expressions of the mode functions $W_1(x)$ and $W_2(x)$ are
\[ W_{1i}(x) = A_i W_{1i}^{b}(x), \quad W_{2i}(x) = A_i W_{2i}^{b}(x), \quad i = 1, 2, 3, \ldots \tag{2-33} \]
where $i = 1, 2, 3, \ldots$ denotes the vibration modes; $A_i$ is an arbitrary constant and should be normalized as
\[ A_i = \sqrt{\frac{1}{\int_0^{l_s} m_1(x)\left(W_{1i}^{b}(x)\right)^2 dx + \int_{l_s}^{l} m_2(x)\left(W_{2i}^{b}(x)\right)^2 dx}} \tag{2-34} \]
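The numerical solution for the stepped beam requires the full set of eight boundary conditions. As a simpler sanity check, the sketch below treats the limiting case in which both sections are identical (a uniform clamped-free cantilever), where the boundary conditions reduce to the well-known characteristic equation 1 + cos λ · cosh λ = 0. It finds the first root by bisection and evaluates the resonance-frequency expression for a hypothetical single-layer Si beam; the material constants and dimensions are illustrative assumptions, not design values from this work.

```python
import math

def characteristic(lam):
    # Clamped-free uniform beam: 1 + cos(lam) * cosh(lam) = 0
    return 1.0 + math.cos(lam) * math.cosh(lam)

def bisect_root(func, lo, hi, tol=1e-12):
    """Find a root of func in [lo, hi], assuming a sign change."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_lo > 0) == (f_mid > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

lam1 = bisect_root(characteristic, 1.0, 3.0)  # fundamental mode

# Hypothetical uniform Si cantilever (values are assumptions).
E_SI = 169e9      # Young's modulus, Pa
RHO_SI = 2330.0   # density, kg/m^3
H = 5e-6          # thickness, m
L = 2e-3          # length, m

# For a rectangular section, EI/m = E*h^2 / (12*rho); the width cancels.
ei_over_m = E_SI * H ** 2 / (12.0 * RHO_SI)
f1 = lam1 ** 2 / (2.0 * math.pi) * math.sqrt(ei_over_m / L ** 4)
print(f"lambda_1 = {lam1:.4f}, f_1 = {f1:.1f} Hz")
```

For the width-stepped beam, the same root-finding idea would be applied to the determinant of the 8 x 8 system formed by the boundary and interface conditions.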
2.3.2 Forced vibration function
The forced vibration function can be expressed as [83]
\[ \begin{aligned} w_1(x,t) &= \sum_{i=1}^{\infty} W_{1i}(x)\,q_i(t) = \sum_{i=1}^{\infty} A_i W_{1i}^{b}(x)\,q_i(t) \\ w_2(x,t) &= \sum_{i=1}^{\infty} W_{2i}(x)\,q_i(t) = \sum_{i=1}^{\infty} A_i W_{2i}^{b}(x)\,q_i(t) \end{aligned} \tag{2-35} \]
where $W_{1i}(x)$ and $W_{2i}(x)$ are the $i$th normal mode, and $q_i(t)$ is the corresponding generalized coordinate of the cantilever.
The general forced vibration function is
\[ \frac{\partial^2}{\partial x^2}\left[EI(x)\frac{\partial^2 w}{\partial x^2}(x,t)\right] + c(x)\frac{\partial w}{\partial t}(x,t) + \rho A(x)\frac{\partial^2 w}{\partial t^2}(x,t) = f(x,t) \tag{2-36} \]
Using $\sum_{i=1}^{\infty} W_i(x)\,q_i(t)$ to replace $w(x,t)$, we get
\[ \rho A(x)\sum_{i=1}^{\infty}\omega_i^2 W_i(x)\,q_i(t) + c(x)\sum_{i=1}^{\infty}\frac{dq_i(t)}{dt}W_i(x) + \rho A(x)\sum_{i=1}^{\infty}\frac{d^2 q_i(t)}{dt^2}W_i(x) = f(x,t) \tag{2-37} \]
Multiplying by $W_j(x)$ and integrating from 0 to $l$ results in:
\[ \begin{aligned} &\sum_{i=1}^{\infty}\omega_i^2 q_i(t)\int_0^{l}\rho A(x)W_i(x)W_j(x)\,dx + \sum_{i=1}^{\infty}\frac{dq_i(t)}{dt}\int_0^{l} c(x)W_i(x)W_j(x)\,dx \\ &+ \sum_{i=1}^{\infty}\frac{d^2 q_i(t)}{dt^2}\int_0^{l}\rho A(x)W_i(x)W_j(x)\,dx = \int_0^{l} W_j(x)\,f(x,t)\,dx \end{aligned} \tag{2-38} \]
According to orthogonality, we get
\[ \begin{aligned} &\frac{d^2 q_i(t)}{dt^2} + \left[\int_0^{l_s} c_1(x)W_{1i}^2(x)\,dx + \int_{l_s}^{l} c_2(x)W_{2i}^2(x)\,dx\right]\frac{dq_i(t)}{dt} + \omega_i^2 q_i(t) \\ &= \left[\int_0^{l_s} W_{1i}(x)\,f(x,t)\,dx + \int_{l_s}^{l} W_{2i}(x)\,f(x,t)\,dx\right] = Q_i(t) \end{aligned} \tag{2-39} \]
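which is
\[ \begin{aligned} &\frac{d^2 q_i(t)}{dt^2} + \frac{\int_0^{l_s} c_1(x)\left(W_{1i}^{b}(x)\right)^2 dx + \int_{l_s}^{l} c_2(x)\left(W_{2i}^{b}(x)\right)^2 dx}{\int_0^{l_s} \rho A_1(x)\left(W_{1i}^{b}\right)^2 dx + \int_{l_s}^{l} \rho A_2(x)\left(W_{2i}^{b}\right)^2 dx}\,\frac{dq_i(t)}{dt} + \omega_i^2 q_i(t) \\ &= \left[\int_0^{l_s} W_{1i}(x)\,f(x,t)\,dx + \int_{l_s}^{l} W_{2i}(x)\,f(x,t)\,dx\right] = Q_i(t) \end{aligned} \tag{2-40} \]
where
\[ \frac{\int_0^{l_s} c_1(x)\left(W_{1i}^{b}(x)\right)^2 dx + \int_{l_s}^{l} c_2(x)\left(W_{2i}^{b}(x)\right)^2 dx}{\int_0^{l_s} \rho A_1(x)\left(W_{1i}^{b}\right)^2 dx + \int_{l_s}^{l} \rho A_2(x)\left(W_{2i}^{b}\right)^2 dx} = 2\zeta_i\omega_i \tag{2-41} \]
For harmonic excitation
\[ f(x,t) = F(x)\cos\Omega t \]
\[ \frac{d^2 q_i(t)}{dt^2} + 2\zeta_i\omega_i\frac{dq_i(t)}{dt} + \omega_i^2 q_i(t) = F_i\cos\Omega t \tag{2-42} \]
where
\[ F_i = \int_0^{l_s} W_{1i}(x)F_1(x)\,dx + \int_{l_s}^{l} W_{2i}(x)F_2(x)\,dx = A_i\left[\int_0^{l_s} W_{1i}^{b}(x)F_1(x)\,dx + \int_{l_s}^{l} W_{2i}^{b}(x)F_2(x)\,dx\right] \tag{2-43} \]
The solution of equation 2-42 is
\[ q_i(t) = F_i H(\Omega)\cos(\Omega t - \phi_i) \tag{2-44} \]
Therefore
\[ \begin{aligned} w_1(x,t) &= \sum_{i=1}^{\infty} E_i H(\Omega)_i W_{1i}^{b}(x)\cos(\Omega t - \phi_i) \\ w_2(x,t) &= \sum_{i=1}^{\infty} E_i H(\Omega)_i W_{2i}^{b}(x)\cos(\Omega t - \phi_i) \end{aligned} \tag{2-45} \]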
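where
\[ H(\Omega)_i = \frac{1}{\omega_i^2\sqrt{\left[1 - \left(\frac{\Omega}{\omega_i}\right)^2\right]^2 + \left(2\zeta\frac{\Omega}{\omega_i}\right)^2}} = \frac{1}{\sqrt{(\omega_i^2 - \Omega^2)^2 + \left(\frac{c}{\rho A}\right)^2\Omega^2}} = \frac{1}{\left\{(\omega_i^2 - \Omega^2)^2 + (2\zeta\omega_i)^2\Omega^2\right\}^{1/2}} \tag{2-46} \]
where $\omega_i$ is the angular resonance frequency and $\Omega$ is the angular frequency of the excitation.
\[ E_i = \left(\int_0^{l_s} W_{1i}^{b}(x)F_1(x)\,dx + \int_{l_s}^{l} W_{2i}^{b}(x)F_2(x)\,dx\right)A_i^2 = \frac{\int_0^{l_s} W_{1i}^{b}(x)F_1(x)\,dx + \int_{l_s}^{l} W_{2i}^{b}(x)F_2(x)\,dx}{\int_0^{l_s} m_1(x)\left(W_{1i}^{b}(x)\right)^2 dx + \int_{l_s}^{l} m_2(x)\left(W_{2i}^{b}(x)\right)^2 dx} \tag{2-47} \]
where, for uniform pressure $p_c$,
\[ F_1(x) = p_c c_s, \quad F_2(x) = p_c c_l \tag{2-48} \]
If only the fundamental resonance is considered,
\[ \begin{aligned} w_1(x,t) &= E_1 H(\Omega)_1 W_{11}^{b}(x)\cos(\Omega t - \phi_1) \\ w_2(x,t) &= E_1 H(\Omega)_1 W_{21}^{b}(x)\cos(\Omega t - \phi_1) \end{aligned} \tag{2-49} \]
And the amplitude is
\[ w_1(x) = E_1 H(\Omega)_1 W_{11}^{b}(x), \quad w_2(x) = E_1 H(\Omega)_1 W_{21}^{b}(x) \tag{2-50} \]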
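As a quick numerical check of the frequency-response factor, the sketch below evaluates the last form of H(Ω) at resonance and at zero frequency. Their ratio is the resonant amplification 1/(2ζ), which is about 16.7 for the damping coefficient ζ = 0.03 assumed in this chapter. The 500 Hz resonance is a hypothetical value used only for illustration.

```python
import math

def modal_gain(omega, omega_i, zeta):
    """H(Omega)_i = 1 / sqrt((w_i^2 - W^2)^2 + (2*zeta*w_i)^2 * W^2)."""
    return 1.0 / math.sqrt(
        (omega_i ** 2 - omega ** 2) ** 2 + (2.0 * zeta * omega_i) ** 2 * omega ** 2
    )

ZETA = 0.03                        # damping coefficient assumed in the text
omega_1 = 2.0 * math.pi * 500.0    # hypothetical fundamental resonance, rad/s

gain_at_resonance = modal_gain(omega_1, omega_1, ZETA)
gain_static = modal_gain(0.0, omega_1, ZETA)
gain_ratio = gain_at_resonance / gain_static
print(f"amplification at resonance relative to static: {gain_ratio:.2f}")
```

This ratio is independent of the resonance frequency chosen, since it depends only on ζ.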
2.3.3 Stress and charge
The charge generated by the piezoelectric ZnO layer is determined by its piezoelectric
coefficient and stress induced by the cantilever bending. The ZnO layer is patterned to remain only
on the section close to the fixed end where the stress is higher.
The amplitude of the normal in-plane bending stress (sxx or syy) is
𝜎(𝑥,𝑧) = 𝑧×𝐸
P
×
𝑑𝑤
,
-
(𝑥)
𝑑𝑥
-
where 𝑧 is the distance from the neutral axis, and 𝐸
P
is elastic modulus of ZnO.
The average stress over the length direction and thickness direction, with the stress variation
along the width assumed to be small (though it actually is not) is
𝜎
$Q6
=
∫ ∫ 𝜎(𝑥,𝑧)𝑑𝑥𝑑𝑧
&
!
#
HIH
)*
IH
+,*
IH
-
RP
$
HIH
)*
IH
+,*
RP
$
ℎ
G
𝑙
+
The charge generated (short circuit) is
\[
Q = d_{31} \times \sigma_{ave} \times A_e
\]
where $A_e$ is the top-view area covered by the top and bottom electrodes with the ZnO piezoelectric
layer and the SiN insulating layer between them.
The $w_1(x)$ and $w_2(x)$ derived in this section are used in Eqs. 2-5, 2-6, and 2-7 for the lumped
element model, while the $Q$ derived here is used in Eq. 2-9.
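As a quick numerical illustration of the short-circuit charge relation above, the sketch below multiplies the piezoelectric coefficient by an average stress and an electrode area. Only $d_{31}$ is taken from Table 2.1; the stress and area values are hypothetical placeholders, not values from this work.

```python
def short_circuit_charge(d31_c_per_n, avg_stress_pa, electrode_area_m2):
    """Q = d31 * sigma_ave * A_e, all in SI units (coulombs out)."""
    return d31_c_per_n * avg_stress_pa * electrode_area_m2


D31_ZNO = 5e-12          # C/N, from Table 2.1
avg_stress = 1.0e5       # Pa, hypothetical average bending stress
electrode_area = 1.0e-6  # m^2 (1 mm^2), hypothetical electrode area

charge = short_circuit_charge(D31_ZNO, avg_stress, electrode_area)
print(f"Q = {charge:.2e} C")
```

Because the relation is linear, halving the piezoelectric coefficient or the average stress halves the charge, which is relevant to the later comparison between modeled and measured sensitivities.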
2.4 Case study and discussion
Consider the design of a resonant microphone array (RMA) with width-stepped
cantilever microphones, as shown in Fig. 3.39 in Section 3.5 of Chapter 3. The design parameters are
listed in Table 3.7 and Table 3.8.
(2-51)
(2-52)
(2-53)
2.4.1 Model
An RMA composed of eight cantilever microphones and a printed circuit board (PCB) with
eight pre-amps is sealed in a metal box for electromagnetic shielding. Only the backside of the
RMA is exposed for sound inlet, as illustrated in Fig. 2.13.
Thus, the RMA can be modeled as eight cantilever microphones connected in parallel, with
$Z_{bc}$ being the equivalent impedance of the metal box with a finite back cavity (Fig. 2.14). $Z_{bc}$
is composed of an equivalent capacitor (representing the back cavity) and the equivalent
impedance that models the holes (for acoustic and electrical signals) and any air gap on the metal box,
as shown in Fig. 2.15.
Figure 2.13: Schematic of packaged RMA (labeled parts: RMA, PCB with pre-amp, aluminum box, sound inlet).
Figure 2.14: Lumped element model of the packaged RMA.
Figure 2.15: Impedance of metal box.
where
\[
C_{bc} = \frac{V_{bc}}{\rho_0 c_0^{2}}
\tag{2-54}
\]
\[
Z_{C_{bc}} = \frac{1}{j\omega C_{bc}}
\tag{2-55}
\]
where $\rho_0$ is the density of air and $c_0$ is the speed of sound in air,
\[
Z_{hb} = \frac{\pi^{6}\eta t_b}{64\,l_h^{4}} + j\omega\frac{\pi^{4}\rho_0 t_b}{64\,l_h^{2}}
\tag{2-56}
\]
\[
Z_{gb} = \frac{\pi^{6}\eta t_b}{64\,p_{er}\,g_b^{3}} + j\omega\frac{\pi^{4}\rho_0 t_b}{64\,p_{er}\,g_b}
\tag{2-57}
\]
\[
Z_{hg} = Z_{hb} \parallel Z_{gb}
\tag{2-58}
\]
where $Z_{hb}$ is the impedance of the rectangular hole on the metal box for input/output wires,
and $Z_{gb}$ is the impedance of the gap between the cover and the body of the metal box.
\[
Z_{bc} = Z_{C_{bc}} \parallel Z_{hg}
\tag{2-59}
\]
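The metal-box impedance of Eqs. 2-54 to 2-59 can be evaluated at a given frequency with a few lines of code. The sketch below is an illustration under stated assumptions, not the model code used in this work: the parameter values are those listed in Table 2.1, the powers of $\pi$, $l_h$, and $g_b$ follow Eqs. 2-56 and 2-57 as read here, and the function and variable names are hypothetical.

```python
import math

# Parameter values from Table 2.1, converted to SI units.
RHO_0 = 1.225   # density of air, kg/m^3
C_0 = 343.0     # speed of sound in air, m/s
ETA = 1.82e-5   # dynamic viscosity of air, Pa*s
V_BC = 0.0016   # volume of the metal box, m^3
P_ER = 0.37     # perimeter of the cover, m
G_B = 0.2e-3    # gap between cover and body, m
L_H = 2e-3      # size of the open area of the wire hole, m
T_B = 1e-3      # wall thickness of the metal box, m


def parallel(z1, z2):
    """Impedance of two elements connected in parallel."""
    return z1 * z2 / (z1 + z2)


def box_impedance(freq_hz):
    """Return the acoustic impedances of Eqs. 2-55 to 2-59 at freq_hz."""
    omega = 2.0 * math.pi * freq_hz
    c_bc = V_BC / (RHO_0 * C_0**2)                      # Eq. 2-54
    z_cbc = 1.0 / (1j * omega * c_bc)                   # Eq. 2-55
    z_hb = (math.pi**6 * ETA * T_B / (64 * L_H**4)      # Eq. 2-56
            + 1j * omega * math.pi**4 * RHO_0 * T_B / (64 * L_H**2))
    z_gb = (math.pi**6 * ETA * T_B / (64 * P_ER * G_B**3)   # Eq. 2-57
            + 1j * omega * math.pi**4 * RHO_0 * T_B / (64 * P_ER * G_B))
    z_hg = parallel(z_hb, z_gb)                         # Eq. 2-58
    z_bc = parallel(z_cbc, z_hg)                        # Eq. 2-59
    return {"cbc": z_cbc, "hb": z_hb, "gb": z_gb, "hg": z_hg, "bc": z_bc}


parts = box_impedance(400.0)  # 400 Hz is an illustrative frequency
for name, z in parts.items():
    print(f"|Z_{name}| = {abs(z):.3e} Pa*s/m^3")
```

Sweeping `freq_hz` over the band of interest shows how the back-cavity compliance and the leak paths trade off, since the compliance term falls with frequency while the mass terms of the hole and gap rise.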
2.4.2 Sensitivity
Considering all eight resonant microphones in the array and the metal box, the sensitivity equation
(Eq. 2-13) becomes
𝑉
89,?
=
𝑍
$TT$U
𝑍
$TT$U
+𝑍
!"
×
𝑛
?
𝑍
89,?
𝑍
",?
+𝑛
?
-
𝑍
89,?
×
(𝑍
",?
+𝑛
?
-
𝑍
89,?
) ∥ 𝑍
!,?
𝑍
$,?
+^𝑍
",?
+𝑛
?
-
𝑍
89,?
_ ∥ 𝑍
!,?
+𝑍
%,?
(2-60)
Table 2.1 Parameters used in the model.
Symbol                   Value               Description
$t$                      400 µm              Thickness of Si wafer
$h, l, l_s, c_l, c_s$    Table 3.7, 3.8      Dimensions of cantilever and thin films
$g$                      20 µm               Gap between the cantilever and the surrounding base
$e$                      50 µm               Width of the diaphragm after cantilever release
$\alpha$                 1.6°                Observed cantilever warpage angle
$E_S$                    170×10^9 Pa         Modulus of silicon
$E_A$                    70×10^9 Pa          Modulus of aluminum
$E_{SN}$                 160×10^9 Pa         Modulus of PECVD SiN
$E_Z$                    150×10^9 Pa         Modulus of ZnO
$\rho_S$                 2.33×10^3 kg/m^3    Density of silicon
$\rho_A$                 2.7×10^3 kg/m^3     Density of aluminum
$\rho_{SN}$              2.5×10^3 kg/m^3     Density of PECVD SiN
$\rho_{ZnO}$             5.68×10^3 kg/m^3    Density of ZnO
$\rho_0$                 1.225 kg/m^3        Density of air
$c_0$                    343 m/s             Speed of sound in air
$\zeta$                  0.03                Damping coefficient
$d_{31}$                 5×10^-12 C/N        Piezoelectric coefficient of ZnO
$Z_{ce}$                 Table 3.9           Measured capacitance of microphones in RMA
$R_e$                    Table 3.9           Measured resistance of microphones in RMA
$R_b$                    10^9 ohm            Resistance of bias resistor
$R_i$                    10^12 ohm           Input resistance of pre-amp
$C_i$                    2.1 pF              Input capacitance of pre-amp
$V_{bc}$                 0.0016 m^3          Volume of the metal box for RMA
$p_{er}$                 0.37 m              Perimeter of the cover of metal box
$g_b$                    0.2 mm              Gap between cover and body of metal box
$l_h$                    2 mm                Size of the empty area in the hole on the metal box
$t_b$                    1 mm                Thickness of the metal box
$\eta$                   1.82×10^-5 Pa·s     Dynamic viscosity of air
where $k$ = 1 to 8 indexes the resonant microphones in the array, and $Z_{array}$ is the parallel
impedance of all eight resonant microphones together with their Si channels
𝑍
$TT$U
=
1
∑
1
𝑍
$,?
+^𝑍
",?
+𝑛
?
-
𝑍
89,?
_ ∥ 𝑍
!,?
+𝑍
%,?
N
?( ,
The modeled sensitivities are compared to the experimental results (Fig. 2.16). The simulated
sensitivities are much higher, though the general trend of the frequency response is well matched.
The difference is likely due to the increased mechanical stiffness of a warped cantilever (which
has not been modeled), the ZnO’s piezoelectric coefficient being lower in the fabricated cantilever,
and/or the damping coefficient 𝜁 being higher than the assumed value of 0.03.
(2-61)
Figure 2.16: Modeled and experimental sensitivity of the RMA with width-stepped cantilever resonant
microphones. The colored solid curves are the modeled sensitivities, while the dotted curves are the
experimentally measured sensitivities.
2.4.3 Noise
The modeled and measured input-referred noise spectra (Fig. 2.17) are close to each other, as are
the RMS noise values before A-weighting (Table 2.2).
Figure 2.17: Measured and modeled noise of the #1 resonant microphone in the RMA.
Table 2.2 Measured and modeled input-referred RMS noise of the #1 resonant microphone in the RMA.
            RMS noise over 20 Hz – 20 kHz
Measured    9.4 µV
Modeled     6 µV
2.5 Summary
A complete lumped element model of a MEMS piezoelectric resonant microphone, consisting of a warped Si cantilever in a Si channel, is presented. The warped cantilever leads to a larger pressure leak in the Si channel at lower frequencies because of the large air gap formed between the cantilever and the wall of the channel. Because the air gap has a shunting effect on the sound pressure, it reduces the differential pressure between the two sides of the cantilever, and this reduction is more pronounced at lower frequencies.
An analytical vibration model of the width-stepped cantilever with multiple layers (piezoelectric film, electrodes, and insulating layer) is developed based on one-dimensional beam theory, though the width-stepped cantilever is two-dimensional in nature. Added to the analytical vibration model are the equivalent electrical impedance of a cantilever (again based on beam theory) and the one-dimensional piezoelectric transform parameters, which together give a complete lumped element model. The noise model is developed from the pre-amp circuit, with the microphone modeled as an equivalent capacitor and resistor in parallel, and incorporates contributions from the pre-amp circuit and resistors.
Chapter 3
RMAs for Lung Sound Detection and Classification
This chapter presents RMAs with low resonance frequencies (200 – 600 Hz) for wheezing detection and classification in lung sounds. Four RMAs with four types of resonant microphones (rectangular cantilever, spiral microphone, rectangular plate with serpentine support beams, and width-stepped cantilever) were designed, fabricated, and characterized, followed by their application to wheezing detection and classification [79, 80]. Very high unamplified sensitivities of 86 – 265.4 mV/Pa and extremely low noise floors of -4 to 7.4 dBA at the resonances were achieved for the novel RMA. Consequently, the acoustic feature of wheezing was distinguished better in both the time domain and the frequency domain than with a flat-band reference microphone. With this advantage, lung sounds with and without wheezing were identified with higher accuracy than with the traditional microphone.
3.1 Introduction
Wheeze, which is one of the most significant adventitious lung sounds, is closely related to asthma, chronic obstructive pulmonary disease (COPD), etc. [62]. Consequently, constant monitoring of lung sounds (with a wearable stethoscope) for automatic detection of wheezing is highly desirable for patients, especially those with asthma. Around 10 people die from asthma in the US every day [63]. Many of these deaths could be avoided through prompt treatment and care when an asthma attack is imminent, which can be indicated by wheezing detected through continuous real-time monitoring of lung sounds. The dominant frequency of wheezing is about 400 Hz, and wheezing is prevalent between 200 and 800 Hz [62]. A typical spectrogram of lung sound with wheezing is shown in Fig. 3.1. Therefore, the target resonance frequencies of the proposed RMA, at which high sensitivity and SNR are sought, are from 200 to 800 Hz.
3.2 RMA of rectangular cantilevers
3.2.1 Design
The rectangular cantilever is a straightforward design. As shown in Fig. 3.2 (a), six silicon rectangular cantilevers are designed in an array for resonant frequencies from 200 to 500 Hz. The silicon cantilevers are 6.25 – 3.92 mm long, with a fixed width of 1 mm and a thickness of 5 µm. The sizes and resonant frequencies of the cantilever microphones in the RMA are summarized in Table 3.1. The thickness of each layer is given in Table 3.2.
Figure 3.1: Typical spectrogram of wheezing (annotation: wheezing).
A piezoelectric ZnO film is used to convert the vibration of the cantilever under sound pressure into voltage. ZnO is a popular piezoelectric material, as are lead zirconate titanate (PZT) and AlN. PZT has the highest piezoelectric coefficient (Table 3.3), which favors high sensitivity of the acoustic transducer [64]. However, PZT contains lead, which is toxic and undesirable for health care applications. ZnO is environmentally friendly and biocompatible, so it is well suited to medical applications [65]. At the same time, the piezoelectric coefficient of ZnO is higher than that of AlN (Table 3.3) [64]. Therefore, ZnO is used for the piezoelectric microphone transducer in this study. As illustrated in Fig. 3.2 (c), the ZnO film covers only the area close to the anchor of the cantilever, where the stress under sound pressure is larger, as simulated by finite element analysis (FEA) in COMSOL and shown in Fig. 3.3.
Figure 3.2: Design of the RMA with resonant cantilevers: (a) illustration of the array (microphones #1 to #6); (b) illustration of one cantilever microphone; (c) cross-sectional illustration of the cantilever microphone (layers: ZnO, top electrode, SiN, silicon, ground).
Figure 3.3: Simulated stress distribution on the cantilever microphone under sound pressure (color scale: stress in Pa).
Table 3.1 Design of cantilever RMA.
Table 3.2 Thickness of different layers of a cantilever resonant microphone.
3.2.2 Fabrication
The microphone arrays were fabricated based on an SOI wafer using the 5 µm device layer of
the SOI wafer as the support layer for the cantilever microphone. The thickness of the device layer
is so uniform and precise that the resonant frequency of each microphone can be very close to the
design value. The RMA was microfabricated through the fabrication process illustrated in Fig. 3.4.
First, 5 µm thick Si diaphragms were micromachined with KOH, followed by deposition and
patterning of ground Al, ZnO, SiN, and top electrode Al. The final structure was then released
through deep reactive ion etching (DRIE) of silicon. The equipment and parameters used for each
step are listed in Table 3.4.
Table 3.3 Properties of piezoelectric materials [64].
Table 3.4 Equipment and parameters for the fabrication.
Process              Method       Equipment               Parameters
Al deposition        Sputtering   Kurt J. Lesker PVD 75   Power 150 W; Pressure 3 mT
ZnO deposition       Sputtering   Mat-vac mrc 322         Power 200 W; Pressure 5 mT
SiN deposition       PECVD        Oxford PlasmaPro100     Temp. 350 ºC; Gas NH3, 2%SiH4/N2
Cantilever release   DRIE         PlasmaPro 100 ICP       Bosch etch, SF6 etch, C4F8 dep.
The cantilever RMA was fabricated successfully, as shown in Fig. 3.5. Some warpage was
observed, especially for the long cantilevers, because of the compressive stress of the ZnO film.
Figure 3.4: Fabrication process of the microphone array.
Figure 3.5: Fabricated cantilever RMA (dimension labels: 25.7 mm and 9.3 mm; labeled parts: top electrode, ground, cantilever microphones #1 to #6).
3.2.3 Characterization
The fabricated microphone array is mounted on a printed circuit board (PCB) with pre-amplifiers and connected electrically through wire bonding, as shown in Fig. 3.6. The pre-amplifiers are ADA4610 devices, chosen for their high input impedance (>10 TΩ) because the output impedance of the cantilever microphone is as high as 1 GΩ. The PCB with the RMA is mounted in a metal package for electromagnetic interference (EMI) shielding (Fig. 3.7). The sensitivities are characterized by comparing the output with that of a well-calibrated commercial microphone, GRAS 40AO, which has a flat sensitivity of 12.5 mV/Pa over 0.1 – 10 kHz, under sound swept in frequency over 0.1 – 1 kHz. As shown in Fig. 3.8, the outputs of the RMA amplifiers and of the reference microphone are connected to a DAQ from ROGA, whose data are recorded by a LabVIEW program on a computer. The unamplified sensitivity is then calculated as
\[
\text{Unamplified sensitivity} = \frac{A_0 \times S_f}{A_f \times G}
\]
where $A_0$ is the amplitude of the voltage at the output of the RMA pre-amplifiers, which have a gain of $G$, and $A_f$ is the amplitude of the voltage at the output of the reference microphone, which has a sensitivity of $S_f$.
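The sensitivity calculation above amounts to using the reference microphone to measure the sound pressure and then dividing the gain-corrected RMA output by that pressure. A minimal sketch follows; the function name and the example voltages and gain are hypothetical, while the 12.5 mV/Pa reference sensitivity is the value quoted for the GRAS 40AO.

```python
def unamplified_sensitivity(a_out_v, a_ref_v, s_ref_v_per_pa, gain):
    """Sensitivity of the bare transducer in V/Pa.

    a_out_v: amplitude at the pre-amplifier output (V)
    a_ref_v: amplitude at the reference microphone output (V)
    s_ref_v_per_pa: sensitivity of the reference microphone (V/Pa)
    gain: voltage gain of the pre-amplifier
    """
    pressure_pa = a_ref_v / s_ref_v_per_pa   # sound pressure seen by both mics
    return (a_out_v / gain) / pressure_pa


S_REF = 12.5e-3  # V/Pa, reference microphone sensitivity quoted above

# Hypothetical readings: reference output of 12.5 mV (1 Pa),
# pre-amplifier output of 0.5 V with a gain of 10.
sens = unamplified_sensitivity(a_out_v=0.5, a_ref_v=12.5e-3,
                               s_ref_v_per_pa=S_REF, gain=10.0)
print(f"unamplified sensitivity = {sens * 1e3:.1f} mV/Pa")
```

The result is only as good as the assumption that both microphones see the same pressure, which is why the reference microphone is placed as close as possible to the test board.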
Figure 3.6: Test board with fabricated RMA and pre-amps.
Figure 3.7: Metal box package shielding the test board, which is mounted on the inside of the cover of the box. The reference microphone was placed as close as possible to the test board. (Labeled parts: reference microphone, metal mesh covering the hole, metal box.)
The measured unamplified sensitivity at the six resonant frequencies between 149 and 453 Hz is 17.6 – 78.2 mV/Pa, as shown in Fig. 3.9. This is much higher than that of flat-band MEMS microphone transducers, which is usually lower than 12.5 mV/Pa, and that of previously reported RMAs within this frequency range, where wheezing is prominent in lung sounds. The quality factors are 11.04 – 19.34. A proper quality factor makes the RMA a natural acoustic filter, which picks up the wheezing feature around the resonant frequencies in lung sounds sensitively while suppressing noise away from the resonant frequencies. The optimal quality factor is related to the number of resonant frequencies within the frequency range of interest. With fewer resonant microphones in the RMA, the quality factor should be smaller so that the bandwidth covers enough of the frequency range of interest; otherwise, the wheezing feature may be missed. Conversely, the quality factor can be larger with more resonant microphones in the RMA. Ideally, more resonant microphones (and thus more resonant frequencies) with higher Q are preferred for a better filtering effect. However, the RMA would then be larger, which is undesirable for wearable devices, so there is a trade-off. The clear distinction of wheezing and the high accuracy of automatic lung sound classification reported in later sections indicate that the resonant microphones in this array and their quality factors are well suited to the task.
Figure 3.8: Test set-up for sensitivity characterization of the RMA.
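The trade-off between quality factor and the number of resonant microphones can be made concrete with the usual half-power bandwidth relation $\Delta f = f_0/Q$. The sketch below is an illustration under assumptions: it treats each band as symmetric about $f_0$ (a reasonable approximation only for moderately high Q), and the evenly spaced resonance list is hypothetical. The pair 228 Hz and Q = 11.04 combines a resonant frequency and a quality factor that both appear in this chapter, but pairing them is an assumption made for the example.

```python
def half_power_band(f0_hz, q):
    """Approximate -3 dB band (low, high) of a resonator, symmetric about f0."""
    bandwidth = f0_hz / q
    return f0_hz - bandwidth / 2.0, f0_hz + bandwidth / 2.0


def max_q_without_gaps(resonances_hz):
    """Largest common Q for which adjacent half-power bands still touch.

    For neighbors fa < fb, the bands touch when
    fa * (1 + 1/(2Q)) >= fb * (1 - 1/(2Q)), i.e. Q <= (fa + fb) / (2 * (fb - fa)).
    """
    freqs = sorted(resonances_hz)
    return min((fa + fb) / (2.0 * (fb - fa)) for fa, fb in zip(freqs, freqs[1:]))


low, high = half_power_band(228.0, 11.04)
print(f"228 Hz, Q = 11.04: band {low:.1f} to {high:.1f} Hz")

# Hypothetical array of six evenly spaced resonances from 200 to 500 Hz.
q_limit = max_q_without_gaps([200, 260, 320, 380, 440, 500])
print(f"contiguous coverage needs Q <= {q_limit:.2f}")
```

Adding resonances between the existing ones shrinks the spacing and raises the permissible Q, which is the quantitative form of the statement that more microphones allow sharper filtering at the cost of array size.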
The measured resonant frequencies are around 30 – 50 Hz lower than the designed values (200 – 500 Hz), as illustrated in Fig. 3.10. The reason is that the measured thickness of the resonant cantilever microphones is 4.5 µm, which is the lower limit of the specified thickness of the active layer of the SOI wafer (5 ± 0.5 µm). The resonant frequency calculated with the actual cantilever thickness of 4.5 µm is close to the measured value.
Figure 3.9: Measured unamplified sensitivities of the resonant microphones in the array and their quality factors.
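The effect of the thinner device layer can be illustrated with the textbook Euler-Bernoulli formula for the first mode of a uniform cantilever, $f_1 = \frac{\lambda_1^{2}}{2\pi}\frac{h}{L^{2}}\sqrt{\frac{E}{12\rho}}$ with $\lambda_1 \approx 1.875$. This is a simplified sketch, not the multilayer model of Chapter 2: it assumes a bare silicon beam and ignores the ZnO, SiN, and Al films, so the absolute values differ from the designed frequencies. The silicon modulus and density are the values listed in Table 2.1.

```python
import math

LAMBDA_1 = 1.875104  # first root of cos(x) * cosh(x) = -1 (clamped-free beam)


def cantilever_f1(length_m, thickness_m, youngs_pa=170e9, density_kg_m3=2330.0):
    """First resonant frequency (Hz) of a uniform rectangular cantilever."""
    return (LAMBDA_1**2 / (2.0 * math.pi)
            * thickness_m / length_m**2
            * math.sqrt(youngs_pa / (12.0 * density_kg_m3)))


length = 6.25e-3  # m, longest cantilever in the array
f_nominal = cantilever_f1(length, 5.0e-6)   # specified device-layer thickness
f_actual = cantilever_f1(length, 4.5e-6)    # measured thickness

print(f"silicon-only estimate at 5.0 um: {f_nominal:.0f} Hz")
print(f"silicon-only estimate at 4.5 um: {f_actual:.0f} Hz")
print(f"ratio: {f_actual / f_nominal:.2f}")
```

Because the frequency is linear in thickness, a device layer that is 10% thinner lowers every resonance by about 10%, that is, by roughly 20 Hz at 200 Hz and 50 Hz at 500 Hz, which is the same order as the shift reported above.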
3.2.4 Lung sound detection and classification
Lung sound recordings with and without wheezing were played through a loudspeaker in an anechoic chamber. As illustrated in Fig. 3.11, the sounds were recorded through both the fabricated RMA and, for comparison, the reference microphone GRAS 40AO, which was also used in the sensitivity characterization. The GRAS 40AO is a high-quality non-MEMS condenser microphone with a standard sensitivity of 12.5 mV/Pa and a high SNR of 74 dB. The lung sound from each of the six channels of the RMA was recorded separately, which provides more acoustic information than a single recording with all channels connected in parallel or in series, and any set of channels can be combined for signal analysis. The recordings from the two microphones were then analyzed and compared in both the time domain and the frequency domain, followed by automatic classification of the lung sounds with and without wheezing. Such classification is necessary for real-time auscultation monitoring with a wearable electronic stethoscope in medical cyber-physical systems (MCPS), so that patients and care providers can be alerted immediately when abnormal adventitious lung sounds occur.
Figure 3.10: Designed and measured resonant frequencies for the rectangular RMA.
It was found that wheezing is more distinguishable in recordings made with the RMA than in those made with the reference microphone. As shown in the waveforms of a recorded lung sound with strong wheezing in Fig. 3.12 (a) and (b), the amplitude of the wheezing during expiration recorded through the cantilever microphone with a resonant frequency of 228 Hz is twice that recorded with the reference microphone, while the amplitudes in other segments are lower, especially during the period of expiration without wheezing, when heart sound dominates. The high sensitivity of the resonant microphone around its resonant frequency, where wheezing occurs, and its low sensitivity far from the resonant frequency make the wheezing signal stand out while the noises, most of which are heart sounds, are suppressed. When the loudness of the same lung sound is reduced to mimic a weak lung sound, it is hard to distinguish the wheezing in the waveform recorded with the reference microphone, but it can still be clearly seen in the waveform recorded with the resonant microphone, as shown in Fig. 3.13 (a) and (b). At the same time, the wheezing feature is still clear in the spectrogram of the lung sound recorded by the resonant microphone (Fig. 3.13 (c)), while it is weak and hard to discern in the spectrogram of that recorded by the reference microphone (Fig. 3.13 (d)).
Figure 3.11: Test set-up for lung sound recording and wheeze detection and classification.
Figure 3.12: Signal of a lung sound (with strong wheezing) recorded with resonant cantilever microphone #2 (resonant frequency at 228 Hz) and the flat-band reference microphone: (a) waveform with resonant microphone; (b) waveform with reference microphone; (c) spectrogram with resonant microphone; (d) spectrogram with reference microphone. (Annotations: wheezing, inspiration, expiration.)
Based on this advantage in distinguishing wheezing in lung sounds, it is expected that automatic lung sound classification with recordings made by the RMA would be more accurate than with recordings made by a flat-band microphone. This expectation was verified through automatic classification of 40 episodes of lung sounds recorded from different positions on patients, including the trachea, left and right chest, and left and right back. Each sound segment lasted 4 – 7 seconds. Twenty of the samples contain wheezing, and the other twenty do not. The lung sounds were played and recorded through the setup shown in Fig. 3.11 by both the fabricated RMA and the reference microphone GRAS 40AO. The recordings were then classified automatically through deep learning (DL).
Figure 3.13: Signal of a lung sound with wheezing recorded with the combination of two resonant cantilever microphones #2 and #6 (resonant frequencies at 228 Hz and 453 Hz) and the flat-band reference microphone: (a) waveform with RMA; (b) waveform with reference microphone; (c) spectrogram with RMA; (d) spectrogram with reference microphone. (Annotations: wheezing, no wheezing feature, weak wheezing feature.)
The DL network used a 12-layer temporal convolutional network (TCN) [66] to extract a ten-dimensional feature vector, which was then fed to a fully connected layer with SoftMax activation. All samples were down-sampled to 8192-dimensional vectors before being fed into the network for classification. No pre-processing was applied to the data apart from basic normalization. The network was trained for 100 epochs on each data set, and cross-validation was used to evaluate the accuracy of wheezing detection. The configuration of the automatic classification is illustrated in Fig. 3.14.
Figure 3.14: Configuration of automatic classification with temporal convolutional network (TCN).
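The data preparation described above (down-sampling to 8192 points, basic normalization, and cross-validation) can be sketched as follows. This is an illustration only, not the pipeline used in this work: the resampling method (linear interpolation), the normalization (zero mean, unit peak), the number of folds (five), and the 8 kHz sample rate of the synthetic test tone are all assumptions, since the text does not specify them. The TCN itself is not reproduced here.

```python
import math


def resample_linear(signal, n_out=8192):
    """Resample a sequence to n_out points by linear interpolation."""
    n_in = len(signal)
    if n_in < 2:
        raise ValueError("need at least two samples")
    step = (n_in - 1) / (n_out - 1)
    out = []
    for k in range(n_out):
        pos = k * step
        i = min(int(pos), n_in - 2)   # left neighbor, clamped at the end
        frac = pos - i
        out.append(signal[i] * (1.0 - frac) + signal[i + 1] * frac)
    return out


def normalize(signal):
    """Remove the mean and scale to unit peak amplitude."""
    mean = sum(signal) / len(signal)
    centered = [s - mean for s in signal]
    peak = max(abs(s) for s in centered) or 1.0
    return [s / peak for s in centered]


def stratified_folds(labels, n_folds=5):
    """Split sample indices into folds with balanced class counts."""
    folds = [[] for _ in range(n_folds)]
    for cls in sorted(set(labels)):
        indices = [i for i, y in enumerate(labels) if y == cls]
        for j, i in enumerate(indices):
            folds[j % n_folds].append(i)
    return folds


# Synthetic 5 s, 400 Hz tone at a hypothetical 8 kHz sample rate.
raw = [0.3 * math.sin(2.0 * math.pi * 400.0 * n / 8000.0) for n in range(40000)]
vector = normalize(resample_linear(raw))

# 20 samples with wheezing (1) and 20 without (0), as in the data set above.
labels = [1] * 20 + [0] * 20
folds = stratified_folds(labels)
print(len(vector), [len(f) for f in folds])
```

Stratifying the folds keeps the 20/20 class balance in every held-out set, which matters with only 40 samples because a single unbalanced fold can noticeably shift the reported accuracy.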
To determine how many channels of the resonant microphone array are necessary for the classification, different combinations of the cantilever microphones in the array were tested and compared with the reference microphone. The combinations include 1, 2, 4, and all 6 resonant microphones in the array. The experimental results show that, no matter how many resonant microphones in the array are used, the classification accuracy is higher than that with the reference microphone, as shown in Fig. 3.15. At the same time, the accuracy increases with more resonant microphones. In practice, a trade-off is needed between the performance and the size of the microphone array, which grows with the number of channels. The RMA outperforms the reference microphone for two reasons. First, the resonant frequencies of the RMA were designed to give very high sensitivity at the characteristic frequencies of interest for wheezing detection. Second, the RMA itself acts as a band-pass filter, which filters out noises, such as heartbeats or white noise, at the sensor level.
Figure 3.15: Accuracies of wheezing identification based on deep learning with the resonant microphone array (97.44%) and a flat-band reference microphone (89.74%). (Bar labels: 94.87%, 97.44%, 89.74%; horizontal axis: combination of microphones in cantilever RMA, and ref. mic. GRAS 40AO.)
3.3 RMA of spiral resonant microphones
3.3.1 Design
Long rectangular cantilevers (6.25 – 3.92 mm) are required for low resonant frequencies, as shown in the last section. Such lengths are not suitable for wearable stethoscopes, which must be small. Therefore, smaller resonant microphones with the same low resonance frequencies (200 – 800 Hz), where wheezing is prevalent, are needed. Compared with a rectangular cantilever, a spiral structure is more flexible, so lower resonant frequencies can be achieved.
An array of 11 spiral resonant microphones based on 5 µm thick Si was designed, as shown in Fig. 3.16. The piezoelectric ZnO film covers only the spiral beams, where the stress is concentrated under sound pressure, as shown in Fig. 3.17. The size of the spiral resonant microphones is 2.68 – 1.65 mm, which is much smaller than the rectangular cantilevers. The size and resonant frequency of each spiral resonant microphone are listed in Table 3.5. The thickness of each layer (Si cantilever, piezoelectric ZnO film, ground and top electrodes, insulator SiN) is the same as that of the rectangular cantilever, listed in Table 3.2.
Warpage was found for the rectangular cantilever microphones, as shown in Fig. 3.5, especially for the long ones with lower resonant frequencies (#1, #2, and #3). A warped microphone leaks air and pressure; as a result, the sensitivity is lower, especially at lower frequencies. Therefore, the warpage, which is caused by the compressive stress of the ZnO film, should be reduced. The warpage simulation results in Fig. 3.18 show that the spiral structure has smaller warpage than a rectangular one with the same resonant frequency. At the same time, a thinner Si layer would cause larger warpage, although the microphone could be smaller.
Figure 3.16: Design of the RMA with resonant spiral microphones: (a) illustration of the array (microphones #1 to #11); (b) illustration of one spiral microphone; (c) cross-sectional illustration of the spiral microphone (layers: Si, ZnO, ground, top electrode, SiN).
Figure 3.17: Simulated stress distribution on the spiral microphone under sound pressure (color scale: stress in Pa).
Figure 3.18: Warpage of the cantilever and spiral structures vs resonant frequency under the compressive stress of ZnO (-500 MPa), from simulation.
3.3.2 Fabrication
The fabrication process is the same as that of the cantilever RMA shown in Fig. 3.4, except that an Al protective layer is required during the DRIE of Si that releases the spiral structure, as illustrated in Fig. 3.19. The Al protection layer deposited on the backside before deep reactive ion etching (DRIE) of the Si diaphragm greatly enhances the yield of the released structures for the following reason. At the beginning of the DRIE, the heat on the diaphragm is dissipated through the four edges of the diaphragm to the bulk of the wafer substrate. However, as the structure is close to being released, the heat is dissipated to the substrate mainly through the spiral beams if there is no Al layer on the backside. The long and narrow spiral beams are not efficient in transferring heat, and the temperature would become very high (as indicated by the simulation shown in Fig. 3.20 (a)). That temperature rise would cause warpage of the structures, causing some to break, and could also damage the Al electrodes and the ZnO layer. This problem is solved by depositing 0.5 µm thick Al on the backside, which provides a heat conduction path, as indicated by the simulation shown in Fig. 3.20 (b).
Table 3.5 Design of spiral resonant microphone array.
No.   Resonant frequency (Hz)   Size (mm)
1     200                       2.68
2     250                       2.47
3     300                       2.32
4     400                       2.10
5     450                       2.01
6     500                       1.94
7     600                       1.82
8     650                       1.77
9     700                       1.73
10    750                       1.69
11    800                       1.65
Figure 3.19: Fabrication process of the RMA of spiral microphones: (a) SOI wafer; (b) patterning and etching of Si (in KOH) and buried oxide (in HF 7:1) from the backside; (c) deposition and patterning of 0.2 µm sputtered Al ground, 0.5 µm sputtered ZnO, 0.1 µm PECVD SiN, and 0.2 µm sputtered Al top electrode; (d) sputtering of a 0.5 µm Al protection layer on the backside; (e) patterning and etching of the device layer with deep reactive ion etching (DRIE) from the front side; (f) etching of the Al protection layer to release the cantilever.
Figure 3.20: Simulated temperatures and warpages of the spiral microphone (a) without and (b) with an Al layer on the backside, when the cantilever is just about to be released during DRIE with a heat flux of 0.5 W/cm² (color scale: temperature in K).
The RMA of the spiral resonant microphone was fabricated successfully as shown in Fig. 3.21.
It can be seen that the warpage of the spiral microphone is smaller than that of the rectangular
cantilever microphone.
3.3.3 Characterization
The unamplified sensitivity of the fabricated RMA with spiral microphones is characterized by the same method, shown in Fig. 3.8. The measured unamplified sensitivity ranges from 14 mV/Pa to 28.4 mV/Pa, as shown in Fig. 3.22, which is higher than the standard sensitivity (12.5 mV/Pa) of a flat-band MEMS microphone. Compared with the cantilever type, the sensitivity of the spiral microphone is lower, but the sensitivity variation is smaller because the spiral structure is more flexible, so the stress is less concentrated. The quality factors at the resonant frequencies range from 12.7 to 27.5. At the same time, the actual resonant frequencies of the spiral microphones are very close to the designed values (Fig. 3.23).
Figure 3.21: Photos of the fabricated RMA of spiral resonant microphones (labeled: microphones #1 to #11; dimension labels: 2.47 mm and 1.73 mm).
Figure 3.22: Measured unamplified sensitivities of the RMA of spiral resonant microphones (annotation: #1 has an electrical connection problem).
Figure 3.23: Designed and measured resonant frequencies of the RMA with spiral microphones.
3.3.4 Lung sound detection
The lung sounds with wheezing were recorded with both the RMA of spiral microphones and the flat-band reference microphone GRAS 40AO, as illustrated in Fig. 3.11. It was found that, when the loudness of the lung sound is low, the wheezing can be distinguished better in the spectrogram with the RMA of spiral microphones than with the flat-band reference microphone, although it is not distinguishable in either waveform, as shown in Fig. 3.24.
Figure 3.24: Signal of a lung sound with wheezing recorded with resonant spiral microphone #5 (resonant frequency at 450 Hz) and the flat-band reference microphone: (a) waveform with RMA; (b) waveform with reference microphone; (c) spectrogram with RMA; (d) spectrogram with reference microphone. (Annotations: wheezing, wheezing not clear.)
3.4 RMA of serpentine beams supported cantilever
3.4.1 Design
The spiral resonant microphones are smaller than the rectangular cantilever type, but their sensitivity is also lower. For wearable electronic stethoscopes, a small resonant microphone with unamplified sensitivity as high as that of the long rectangular cantilever, or even higher, is desired. A novel structure, a 5 µm thick rectangular plate supported by two serpentine support beams, is therefore designed. It can be considered a cantilever with flexible springs, which help to reduce the size of a microphone with a low resonance frequency. The RMA with 11 such novel cantilevers, with sizes of 2.5 – 1.65 mm and resonant frequencies of 200 – 800 Hz, is illustrated in Fig. 3.25. The gap within the serpentine support beams, and between the microphone and the surrounding Si substrate, is also 20 µm. The thickness of each layer of the microphone (Si cantilever, piezoelectric ZnO film, Al ground, Al top electrode, and insulator SiN) is the same as that of the rectangular cantilever and spiral structures, as listed in Table 3.2. The ZnO film covers only the serpentine support beam area, where the stress is large during vibration under sound pressure, as shown in Fig. 3.26. The designed resonant frequency of each microphone in the RMA and its size are listed in Table 3.6. Its size is much smaller than that of the long cantilever microphones and close to that of the spiral microphones with the same resonant frequency, as shown in Fig. 3.27.
At the same time, the simulation results in Fig. 3.28 indicate that the warpage of this novel structure under the compressive stress of ZnO is the smallest, compared with the rectangular cantilever and spiral microphones. Consequently, there will be less air leakage under sound pressure, which favors high sensitivity at lower frequencies.
Figure 3.25: Design of the RMA with cantilevers supported by serpentine beams: (a) illustration of the array; (b) illustration of one resonant microphone; (c) cross-sectional illustration of the microphone. (Labels: fixed edge, ZnO, top electrode, SiN, Si, ground, 20 µm.)
Figure 3.26: Simulated stress distribution on the designed microphone under sound pressure (color scale: stress in Pa).
Table 3.6 Designed sizes and resonant frequencies of the resonant microphones with serpentine beams in the RMA.
No.   Resonant frequency (Hz)   Length of side (mm)
1     200                       2.50
2     250                       2.50
3     300                       2.50
4     400                       2.20
5     450                       2.08
6     500                       2.00
7     600                       1.80
8     650                       1.80
9     700                       1.73
10    750                       1.68
11    800                       1.65
3.4.2 Fabrication
As shown in Fig. 3.29, the fabrication, based on an SOI wafer with a 5 µm thick Si active layer, is the same as that of the RMA with spiral microphones, including the protective Al layer, which lowers the temperature and thermal stress during the DRIE of Si that releases the final structures, as shown in Fig. 3.30. The proposed RMA was fabricated successfully, as shown in Fig. 3.31.
Figure 3.27: Size of the resonant microphones with different designs.
Figure 3.28: Warpage of the resonant microphones under the compressive stress of ZnO (-500 MPa), from simulation.
Figure 3.29: Fabrication process of the microphone array.
Figure 3.30: Simulated temperatures and warpages of the microphone (a) without and (b) with an Al layer on the backside, when the cantilever is just about to be released during DRIE with a heat flux of 0.5 W/cm² (color scale: temperature in K).
Figure 3.31: Photos of the fabricated RMA of resonant microphones with serpentine support beams.
3.4.3 Characterization
The unamplified sensitivity of the fabricated RMA was characterized by the same method as
shown in Fig. 3.8. As shown in Fig. 3.32, it is 34.6 – 131.4 mV/Pa at the resonant frequencies, which
is higher than that of both the spiral resonant microphones and the rectangular cantilever resonant
microphones presented previously in this chapter. At the same time, as shown in Fig. 3.33,
131.4 mV/Pa is also the highest unamplified sensitivity among all reported microphones in the
frequency range below 800 Hz, where the main frequencies of both normal and adventitious lung
sounds reside. The quality factor at the resonant frequencies is 8.6 – 12.4. The measured resonant
frequencies are lower than the designed values in Table 3.6 because the actual device layer of the
SOI wafer turned out to be 4 µm thick, thinner than the specified 5 µm.
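As a rough check on the thickness argument, assume that the microphone behaves like a homogeneous Euler–Bernoulli cantilever of length L and thickness t in bending. This is an idealization: the real device is a Si plate on serpentine beams carrying ZnO, Al and SiN films, and the symbols E (Young's modulus) and ρ (density) are introduced here only for illustration. Under that assumption the fundamental frequency scales linearly with thickness:

```latex
f_1 = \frac{1.875^2}{2\pi}\,\frac{t}{L^2}\sqrt{\frac{E}{12\rho}}
\quad\Longrightarrow\quad
\frac{f_1(t = 4\,\mu\mathrm{m})}{f_1(t = 5\,\mu\mathrm{m})} \approx \frac{4}{5} = 0.8
```

With this scaling, a microphone designed for 600 Hz (No. 7 in Table 3.6) would be expected near 480 Hz, which is close to the 503 Hz reported for microphone #7 in Fig. 3.34.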
Figure 3.32: Measured unamplified sensitivities of the microphones in the array from 100 to 1,000 Hz.
Figure 3.33: Comparison of sensitivity and bandwidth of this work, reported resonant microphone
arrays (RMAs), and flat band microphones. Sensitivity at resonance frequencies is counted for the
RMAs. In the dominant frequency range of lung sounds, this work has the highest unamplified
sensitivity compared with all reported microphones, and the widest bandwidth compared with all
reported RMAs.
*Unamplified sensitivity of the acoustic transducer
**Whether the gain of amplifiers is excluded or not is not disclosed
3.4.4 Lung sound detection and classification
Twenty-two sets of wheezing sounds and twenty-two sets of lung sounds without wheezing from
the International Conference on Biomedical and Health Informatics (ICBHI) Respiratory Sound
Database (https://bhichallenge.med.auth.gr) were played through a loudspeaker in an anechoic
chamber and recorded with a fabricated resonant microphone array and a reference microphone
(GRAS 40AO), as illustrated in Fig. 3.11. The signal from each microphone of the array was
acquired separately.
The lung sounds obtained through each resonant microphone in the microphone array and the
reference microphone are analyzed in both time and frequency domains. The results show that the
resonant microphones in the array distinguish the wheezing signature much better than the
reference flat-band microphone, especially when the wheezing sound level is low. As can be seen
in Fig. 3.34, the resonant microphones show (or record) wheezing sound prominently in both
frequency and time domains, but the wheezing sound is not found in the signal recorded by the
reference microphone under the same condition. Unlike the flat-band reference microphone, the
resonant microphone has higher sensitivity around the resonant frequency and lower sensitivity
outside the resonant frequency, which makes the weak wheezing sound stand out. Eleven resonant
microphones with eleven resonant frequencies in the array can ensure that all typical frequencies
of wheezing are covered.
In continuous monitoring of lung sounds with a wearable stethoscope, automatic classification of
adventitious lung sounds such as wheezing is necessary. We have carried out automatic wheezing
classification based on a state-of-the-art deep learning (DL) method, TCN, with the resonant
microphone array and with the flat-band reference microphone, as illustrated in Fig. 3.14. The
accuracy of the wheezing classification for lung sounds with different combinations of the
channels in the resonant microphone array is compared with that obtained with the reference
microphone in Fig. 3.35. The results show that the accuracy increases as lung sounds from more
channels of the microphone array are combined, because wheezing in a lung sound usually exists at
several frequencies, which should be covered by several resonant microphones in the array. The
accuracy with nine and with all eleven channels can be as high as 97.73%, much higher than the
81.82% obtained with the reference microphone. Since the accuracy is the same with nine and with
eleven channels, eleven microphones in the array appear to be sufficient for wheezing detection;
adding more microphones to the array is not expected to improve the detection accuracy further.
Figure 3.34: Waveform and spectrogram of a lung sound (containing a relatively weak wheezing sound)
recorded with resonant microphone #7 (503 Hz resonance frequency) in the array and with the reference
flat band microphone GRAS 40AO: (a) waveform with resonant microphone #7, (b) waveform with the
reference microphone, (c) spectrogram with resonant microphone #7, and (d) spectrogram with the
reference microphone. Wheezing is marked in (a) and (c); the wheezing feature is not obvious in (b)
and (d).
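The percentages reported in Fig. 3.35 are consistent with whole numbers of correctly classified recordings out of the 44 used in this test (22 with and 22 without wheezing). The short sketch below makes that arithmetic explicit. The counts of correct recordings are inferred from the percentages and are not given in the source, and the function name is hypothetical.

```python
# Hypothetical check: map each reported accuracy to a count out of 44 recordings.
TOTAL_RECORDINGS = 22 + 22  # wheezing + non-wheezing sets stated in the text

# Percentages read from Fig. 3.35 (five RMA combinations, then the reference mic).
reported = [84.09, 86.36, 93.18, 97.73, 97.73, 81.82]


def correct_count(accuracy_percent, total=TOTAL_RECORDINGS):
    """Return the integer number of correct recordings closest to the accuracy."""
    return round(accuracy_percent / 100 * total)


counts = [correct_count(a) for a in reported]
recomputed = [100 * c / TOTAL_RECORDINGS for c in counts]

for a, c, r in zip(reported, counts, recomputed):
    print(f"{a:.2f}% -> {c}/{TOTAL_RECORDINGS} correct ({r:.2f}%)")
```

Under this reading, the best RMA result corresponds to one misclassified recording out of 44, against eight for the reference microphone.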
Figure 3.35: Accuracies of automatic wheezing identification through TCN-based deep learning, with
various combinations of the microphones in the array as well as with the GRAS 40AO reference
microphone. Bar values: 84.09%, 86.36%, 93.18%, 97.73%, and 97.73% for the combinations of the
resonant microphones in the RMA, and 81.82% for the reference microphone.
3.5 RMA of width-stepped cantilever
3.5.1 Design
We next consider how to obtain even higher unamplified sensitivity together with even lower
input-referred noise and a lower minimum detectable sound, which is the unamplified output noise
divided by the sensitivity. The width-stepped cantilever (Fig. 3.36), with a narrower width near the
anchor, is not only more flexible than a standard rectangular cantilever but also has a higher stress
concentration (Fig. 3.37). This indicates that a width-stepped cantilever can be both smaller and
more sensitive than a standard rectangular cantilever. Compared with the spiral structure and the
rectangular plate supported by two serpentine support beams, a stepped cantilever is less flexible
but has a higher stress concentration. Thus, higher sensitivity is expected. There are two types of
design for the width-stepped cantilever: the center width-stepped cantilever (Fig. 3.36a) and the
edge width-stepped cantilever (Fig. 3.36b). If they have the same cantilever widths and lengths (the
dimensions labeled in Fig. 3.36), their fundamental resonances are essentially the same (397 Hz and
384 Hz in Fig. 3.38). However, the second resonance of the edge width-stepped cantilever is higher
than that of the center width-stepped cantilever (Fig. 3.38). The fundamental resonance will be used
for lung sound detection, and it is better to have the second resonance far away to avoid
interference. Thus, an edge width-stepped cantilever design is adopted.
Figure 3.36: Width-stepped cantilever: (a) center-stepped cantilever; (b) edge-stepped cantilever.
Figure 3.37: Average stress comparison between the width-stepped cantilever and the standard
cantilever under a pressure of 1 Pa.
The RMA with 8 edge width-stepped cantilevers, 3.6 – 2.3 mm in size and with 200 – 800 Hz
resonance frequencies, is illustrated in Fig. 3.39. The gap between each microphone and the
surrounding Si substrate is again 20 µm. The thickness of each layer of the microphone (Si
cantilever, piezoelectric ZnO film, Al ground, Al top electrode, and SiN insulator) is listed in
Table 3.7. The ZnO film covers only the narrow stepped area, where the stress is large during
vibration under sound pressure, as shown in Fig. 3.39. The designed resonant frequencies of the
microphones in the RMA, which are Mel-distributed, and their sizes are listed in Table 3.8.
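Since the values of Table 3.8 are not reproduced in this text, the sketch below only illustrates what a Mel-distributed set of eight resonant frequencies between 200 and 800 Hz could look like. It assumes the common formula m = 2595 log10(1 + f/700) for the Mel scale and equal spacing on the Mel axis between the two end frequencies. The function names are hypothetical and the resulting numbers are not the author's design values.

```python
import math


def hz_to_mel(f_hz):
    """Convert frequency in Hz to Mel (a common formula, assumed here)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_spaced_frequencies(f_low, f_high, count):
    """Return `count` frequencies equally spaced on the Mel axis."""
    m_low, m_high = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (m_high - m_low) / (count - 1)
    return [mel_to_hz(m_low + i * step) for i in range(count)]


# Eight microphones spanning the 200 - 800 Hz range quoted in the text.
frequencies = mel_spaced_frequencies(200.0, 800.0, 8)
for number, f in enumerate(frequencies, start=1):
    print(f"#{number}: {f:.0f} Hz")
```

The spacing between neighbouring resonances grows with frequency, which is the defining property of a Mel-distributed set.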
Figure 3.38: Resonances of the center width-stepped cantilever, (a) fundamental resonance at 397 Hz
and (b) 2nd resonance at 758 Hz, and of the edge width-stepped cantilever, (c) fundamental resonance
at 384 Hz and (d) 2nd resonance at 1394 Hz.
Figure 3.39: Design of the RMA with resonant width-stepped cantilevers: (a) illustration of the array
with microphones #1 to #8; (b) illustration of one cantilever microphone (5 µm thick Si width-stepped
cantilever microphone with its anchor and a 0.6 µm ZnO piezoelectric layer deposited on the narrow
width-stepped area); (c) cross-sectional illustration of the cantilever microphone (Si, Al ground,
SiN insulator, ZnO, SiN insulator, Al electrode).
3.5.2 Fabrication
The fabrication process is the same as the rectangular cantilever RMA as shown in Fig. 3.4.
The proposed RMA composed of width-stepped cantilever microphones was fabricated
successfully as shown in Fig. 3.40.
Table 3.7 Thickness of different layers of a width-stepped cantilever resonant microphone.
Layer #  Name           Material  Thickness (µm)
1        Cantilever     Si        4
2        Ground         Al        0.2
3        Insulator      SiN       0.2
4        Piezoelectric  ZnO       0.5
5        Insulator      SiN       0.2
6        Top electrode  Al        0.2
Table 3.8 Design of RMA.
Figure 3.40: Photos of the fabricated RMA of resonant microphones with width-stepped cantilevers.
Labels in the photos: microphones #1 to #8; dimensions 9.3 mm, 25.7 mm, 2.5 mm, and 2.5 mm;
#6 width-stepped cantilever; piezoelectric ZnO film and Al electrodes.
3.5.3 Characterization
3.5.3.1 Sensitivity
The unamplified sensitivity of the fabricated RMA was characterized with a plane wave tube (PWT),
as illustrated in Fig. 3.41, in which only plane waves can propagate at the frequencies of interest,
so that the DUT and the reference measurement microphone at the same position are exposed to the
same sound pressure [67]. At the same time, the sound is confined within the tube so that it can
reach the RMA only through its inlet, without leaking in through other paths (such as the hole on
the metal box in Fig. 3.42). The RMA, together with the pre-amplification PCB, was placed on the
cover of a metal box with a sound inlet (Fig. 3.42). A rigid acrylic PWT with a smooth surface was
used for the sensitivity characterization (Fig. 3.43). The PWT wall is 3 mm thick, and the tube has
a 25.4 mm × 25.4 mm square cross section. The inlets for the RMA and the reference microphone on the
PWT were cut with a laser. A speaker with a 25 mm diameter (AS02504PR-N50-R from PUI Audio) was
placed at one end of the tube to play sound, while foam was placed at the other end of the PWT to
absorb the sound and prevent it from being reflected back or transmitted out of the tube. The whole
test system was placed in an anechoic chamber.
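As a rough plausibility check on the plane-wave condition, the sketch below estimates the frequency below which only the plane wave mode propagates in a rigid-walled square duct, f_c = c / (2a), where a is the side of the cross section. It assumes that the 25.4 mm figure is the inner side length and that the speed of sound is 343 m/s (air at about 20 °C). Neither is stated in the source, and the function name is hypothetical.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly 20 degrees C


def plane_wave_cutoff_hz(side_m, c=SPEED_OF_SOUND_M_S):
    """Cut-on frequency of the first higher-order mode in a rigid square duct."""
    return c / (2.0 * side_m)


side_m = 25.4e-3  # assumed inner side length of the PWT cross section
cutoff_hz = plane_wave_cutoff_hz(side_m)
upper_test_frequency_hz = 1000.0  # top of the 100 - 1,000 Hz measurement band

print(f"Plane-wave-only propagation below about {cutoff_hz:.0f} Hz")
print(f"Margin over the test band: {cutoff_hz / upper_test_frequency_hz:.1f}x")
```

With these assumptions the cutoff is several kilohertz, well above the 100 – 1,000 Hz band of the sensitivity measurements.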
Figure 3.41: Schematic of the sensitivity measurement through a plane wave tube, in which a uniform
plane wave reaches the DUT and the reference microphone.
Figure 3.42: Assembly for the test: (a) the RMA was assembled on a PCB with op-amps for each resonant
microphone in the array, and the PCB was attached to the cover of the metal box; (b) the metal cover,
with a sound inlet at the RMA location, was screwed to the body of the metal box.
Figure 3.43: Photo of the sensitivity test configuration with the PWT (plane wave tube) in the
anechoic chamber, showing the speaker, the RMA in the metal box, the reference microphone, the PWT
and its end, the output of the 8 channels, and the power supply for the pre-amp.
The RMA is very sensitive, with unamplified sensitivities of 265 mV/Pa to 86 mV/Pa at the 8
resonances between 230 Hz and 630 Hz (Fig. 3.44), the highest among all reported microphones,
including both resonant microphone arrays and flat band microphones (Fig. 3.45). At the same time,
the sensitivity of the RMA within this band (between 230 Hz and 630 Hz) is high not only at the
resonances but also at all other frequencies (above 35 mV/Pa) (Fig. 3.46), which is good for the
detection of wheezing (which mainly resides within this frequency range) in lung sounds. The
quality factor is between 13.5 and 22 (Fig. 3.47). It is lower for a cantilever microphone with a
lower resonance frequency, which is larger in size than one with a higher resonance frequency. The
damping of the microphone comes mainly from the air around the vibrating cantilever or diaphragm;
thus a larger cantilever or diaphragm leads to a higher damping coefficient and a lower quality factor.
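To give a feel for what these quality factors mean, the sketch below converts Q into an approximate −3 dB bandwidth using Δf ≈ f0 / Q. Pairing the lowest Q (13.5) with the lowest resonance (230 Hz) and the highest Q (22) with the highest resonance (630 Hz) is an assumption based on the trend described above, since the per-microphone values appear only in Fig. 3.47. The function name is hypothetical.

```python
def half_power_bandwidth_hz(resonance_hz, quality_factor):
    """Approximate -3 dB bandwidth of a resonator, delta_f = f0 / Q."""
    return resonance_hz / quality_factor


# Assumed pairings of the end values quoted in the text (not per-device data).
examples = [(230.0, 13.5), (630.0, 22.0)]
bandwidths = [half_power_bandwidth_hz(f0, q) for f0, q in examples]

for (f0, q), bw in zip(examples, bandwidths):
    print(f"f0 = {f0:.0f} Hz, Q = {q}: bandwidth of about {bw:.1f} Hz")

# Average spacing if the 8 resonances were spread evenly over 230 - 630 Hz.
average_spacing_hz = (630.0 - 230.0) / (8 - 1)
print(f"average spacing of 8 resonances: about {average_spacing_hz:.0f} Hz")
```

For comparison, eight resonances spread evenly over 230 – 630 Hz would be about 57 Hz apart, so each microphone's half-power band is narrower than the spacing between neighbours.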
Figure 3.44: Measured unamplified sensitivities of the microphones in the array from 100 to 1,000 Hz.
Figure 3.45: Comparison of sensitivity and bandwidth of this work, reported resonant microphone
arrays (RMAs) and flat band microphones. Sensitivity at resonance frequencies is counted for the
RMAs. In the dominant frequency of lung sounds, this work has the highest unamplified sensitivity
compared with all reported microphones.
*Unamplified sensitivity of the acoustic transducer
**Whether the gain of amplifiers is excluded or not is not disclosed
Figure 3.46: Measured unamplified sensitivities of the RMA from 100 to 1,000 Hz.
Figure 3.47: Measured quality factor at resonances of the RMA.
3.5.3.2 Noise
The PCB with the pre-amps and the assembled RMA was placed in double metal boxes for the
noise measurement (Fig. 3.48).
Figure 3.48: RMA noise measurement setup: (a) the metal box with the RMA and pre-amp inside and the
metal box with the audio ADC inside were placed on a vibration isolation table for the noise test;
(b) the RMA, together with the pre-amps on a PCB and a battery, was placed in double metal boxes (a
small metal box inside a big metal box), with all single electrical wires kept inside the metal box
to avoid electromagnetic interference, and a signal cable leading to a BNC connector to the outside.
Although a double box without an acoustic port would not be used in practice because of its large
weight, volume, and cost and its lack of an acoustic port, a double metal box was used here to block
sound and vibration noise from the environment as well as electromagnetic interference (noise that
does not come from the microphone and pre-amp), in order to estimate the maximum achievable
signal-to-noise ratio (SNR). The metal box was placed on a vibration isolation table to prevent
unwanted mechanical vibration. Thus, the SNR in real applications with a single metal package and an
acoustic port is expected to be lower than the value reported here, depending on the noise in the
environment.
The measured pre-amp input-referred peak-to-peak noise over 20 Hz – 20 kHz (i.e., the measured
output noise divided by the pre-amp’s gain of 101) is around 50 µV for all resonant microphones in
the RMA (Fig. 3.49). The 20 Hz – 20 kHz band, over which noise is usually reported for traditional
microphones, was used for comparison. Also, A-weighting (following the human ear’s frequency
response) is usually applied to noise estimates for microphones. The pre-amp input-referred
peak-to-peak noise after A-weighting is around 20 µV (Fig. 3.50).
Figure 3.49: Measured pre-amp input-referred peak-to-peak noise over 20 Hz – 20 kHz (measured
output noise divided by the amplification of 101) for resonant microphones #1 to #8.
The root mean square (RMS) noise voltage ($E_{t,\mathrm{RMS}}$) is related to the peak-to-peak noise
voltage ($E_{t,\mathrm{PP}}$) for a random signal as follows [84]:
$$E_{t,\mathrm{RMS}} = \frac{E_{t,\mathrm{PP}}}{6.6} \tag{3-1}$$
It is around 9 µV for all resonant microphones in the array before A-weighting (Fig. 3.51a)
and around 4 µV after A-weighting (Fig. 3.51b).
Figure 3.50: Measured pre-amp input-referred peak-to-peak noise over 20 Hz – 20 kHz (measured
output noise divided by the amplification of 101) after A-weighting, for resonant microphones #1 to #8.
The minimum detectable sound (MDS) level in Pa, which is also the noise floor, can be
obtained through
$$p_{\mathrm{MDS}} = \frac{E_{t,\mathrm{RMS}}}{\mathrm{Sensitivity}_{\mathrm{unamplified}}} \tag{3-2}$$
where $p_{\mathrm{MDS}}$ depends on the frequency, as the sensitivity is not uniform over frequency
and is higher around the resonances.
The SNR is typically defined as the signal-to-noise ratio for a sound pressure of 1 Pa (94 dB SPL)
and can be obtained with the following equation [85]:
$$\mathrm{SNR} = 94 - 20\log_{10}\frac{p_{\mathrm{MDS}}}{2\times10^{-5}} \tag{3-3}$$
$p_{\mathrm{MDS}}$ and SNR are uniform over the audio frequency range for a flat band microphone with
uniform sensitivity and noise floor over that range. However, with the set of resonant
microphones, the SNRs (89 ~ 80.6 dB before A-weighting and 98 ~ 86.6 dBA after A-weighting)
are very high over 230 – 630 Hz because of the very high sensitivities near the resonant frequencies
(Fig. 3.52 before A-weighting, and Fig. 3.53 after A-weighting). The SNRs near the resonant
frequencies are much higher than the SNR of a flat band MEMS microphone, the highest of which is
74 dBA for the ICS-40730 from TDK InvenSense [68]. Also, the noise floor is low (-4 ~ 15 dBA after
A-weighting) over 230 – 630 Hz (the range covered by the set of resonant microphones) (Fig. 3.54),
where wheezing mainly resides in lung sounds.
Figure 3.51: Pre-amp input-referred RMS noise (a) before A-weighting and (b) after A-weighting.
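Equations (3-2) and (3-3) can be checked numerically. The sketch below uses the rounded values quoted in the text (about 9 µV RMS noise before A-weighting and the 265 mV/Pa peak sensitivity). Because these inputs are rounded, the result should only be expected to land near the 89 dB quoted above, not exactly on it. The function and variable names are illustrative.

```python
import math

REFERENCE_PRESSURE_PA = 2e-5  # 20 uPa, the 0 dB SPL reference
ONE_PASCAL_DB_SPL = 94.0      # 1 Pa expressed in dB SPL (rounded)


def minimum_detectable_sound_pa(noise_rms_v, sensitivity_v_per_pa):
    """Eq. (3-2): input-referred RMS noise divided by unamplified sensitivity."""
    return noise_rms_v / sensitivity_v_per_pa


def snr_db(p_mds_pa):
    """Eq. (3-3): SNR for a 1 Pa signal, given the noise floor in Pa."""
    return ONE_PASCAL_DB_SPL - 20.0 * math.log10(p_mds_pa / REFERENCE_PRESSURE_PA)


noise_rms_v = 9e-6            # ~9 uV RMS before A-weighting (rounded, from the text)
sensitivity_v_per_pa = 0.265  # 265 mV/Pa, highest resonance sensitivity

p_mds = minimum_detectable_sound_pa(noise_rms_v, sensitivity_v_per_pa)
snr = snr_db(p_mds)
print(f"MDS = {p_mds * 1e6:.1f} uPa, SNR = {snr:.1f} dB")
```

The same two functions can be applied per frequency to a measured sensitivity curve, which is how a frequency-dependent noise floor and SNR such as those in Figs. 3.52 to 3.54 would arise.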
Figure 3.52: Minimum detectable sound (noise floor) and SNR of the resonant microphones in the
RMA: (a) minimum detectable sound; (b) SNR.
Figure 3.53: Minimum detectable sound and SNR of the resonant microphones in the RMA after
A-weighting: (a) minimum detectable sound; (b) SNR.
Figure 3.54: Minimum detectable sound and SNR of the RMA between 200 and 660 Hz: (a) minimum
detectable sound; (b) SNR.
The power spectral density (PSD) of the pre-amp input-referred noise between 20 Hz and 20
kHz is -120 dB ~ -160 dB (Fig. 3.55). It is -145 dB ~ -170 dB after A-Weighting (Fig. 3.56).
Figure 3.55: PSD of the measured pre-amp input-referred noise without A-weighting, for resonant
microphones #1 to #8.
Figure 3.56: PSD of the measured pre-amp input-referred noise with A-weighting, for resonant
microphones #1 to #8.
3.5.3.3 Impedance
The impedance of the resonant microphones in the RMA was measured with an Agilent 4294A
impedance analyzer, and the capacitance and resistance of the parallel model (Fig. 2.10) were
extracted. The parallel capacitance of the resonant microphones is between 20 and 30 pF, and the
parallel resistance is 10 GΩ (Table 3.9).
Table 3.9 Measured parallel capacitance and resistance of the resonant microphones in the RMA.
Resonant microphone #  Capacitance (pF)  Resistance (GΩ)
1                      23.9              10
2                      25.4              10
3                      23.5              10
4                      24                10
5                      26.3              10
6                      27.6              10
7                      28.5              10
8                      27.2              10
3.5.4 Lung sound detection and classification
Lung sound recordings from the International Conference on Biomedical and Health Informatics
(ICBHI) Respiratory Sound Database (https://bhichallenge.med.auth.gr), in which the presence and
timing of wheezing are annotated, were played through a loudspeaker into a plane wave tube
(Fig. 3.43) and recorded with both the fabricated RMA and the reference microphone GRAS 40AO
for comparison. The same configuration was used for the sensitivity characterization. The lung
sound from each of the 8 channels of the RMA with width-stepped cantilevers was recorded
separately, so that the channels can be combined or kept separate for the signal analysis, as with
the other RMAs studied in previous sections of this chapter. The lung sounds recorded by the RMA and
the reference microphone were then analyzed and compared in both the time domain and the frequency
domain, followed by automatic classification of the lung sounds with and without wheezing,
which is necessary for real-time auscultation monitoring with a wearable electronic stethoscope.
3.5.4.1 Basic analysis
There are three advantages of the RMA over a traditional microphone with uniform
sensitivity: high sensitivity (Fig. 3.44), low noise (Figs. 3.52 and 3.53), and natural filtering.
Therefore, when the wheezing signature is obvious, the wheezing in lung sounds recorded by the RMA
can easily be distinguished directly from the waveform (Fig. 3.57a), while it is hard to identify
in the waveform recorded with the reference microphone (Fig. 3.57b), although the wheezing is
obvious in the spectrograms of the recordings with both the RMA and the reference microphone
(Fig. 3.58). The wheezing becomes distinguishable when a digital filter is applied to the waveform
recorded by the reference microphone (Fig. 3.59b), but it is still not as obvious as with the RMA.
At the same time, digital filtering takes time and consumes power, whereas the RMA has no such
cost because its filtering is a natural physical effect.
When the wheezing is not obvious, it can still be distinguished in the spectrogram of the
lung sound recorded with the RMA even when it is hard to identify in the waveform (Fig. 3.60).
However, in this condition it is impossible to distinguish the wheezing in either the waveform or
the spectrogram recorded with the reference microphone (Fig. 3.61). The lower noise floor of the RMA
makes it possible to detect weaker sounds than the traditional reference microphone GRAS 40AO, which
also has a low noise floor (25 dB(A) with pre-amp) but not as low as that of the RMA (Fig. 3.54a:
15 dB(A) over the whole lung sound band and 7.4 dB(A) around the resonances, with pre-amp).
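For reference, the digital band-pass step applied to the reference microphone recording (500 – 700 Hz in Fig. 3.59) could look like the sketch below. The source does not say which filter was used. The single second-order (biquad) band-pass with coefficients from the widely used Audio EQ Cookbook formulas, the 600 Hz center frequency, Q = 3 (roughly a 200 Hz wide passband), and the 8 kHz sampling rate are all assumptions chosen for illustration, as are the function names.

```python
import math


def bandpass_biquad_coefficients(center_hz, q, sample_rate_hz):
    """Second-order band-pass (0 dB peak gain), Audio EQ Cookbook form."""
    w0 = 2.0 * math.pi * center_hz / sample_rate_hz
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = [alpha / a0, 0.0, -alpha / a0]
    a = [1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0]
    return b, a


def filter_signal(b, a, samples):
    """Direct-form I difference equation applied sample by sample."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out


def steady_state_peak(freq_hz, b, a, sample_rate_hz=8000, seconds=1.0):
    """Peak output amplitude for a unit sine, ignoring the initial transient."""
    n = int(sample_rate_hz * seconds)
    tone = [math.sin(2.0 * math.pi * freq_hz * i / sample_rate_hz) for i in range(n)]
    filtered = filter_signal(b, a, tone)
    return max(abs(v) for v in filtered[n // 2:])


b, a = bandpass_biquad_coefficients(center_hz=600.0, q=3.0, sample_rate_hz=8000)
in_band = steady_state_peak(600.0, b, a)   # inside the 500 - 700 Hz band
out_band = steady_state_peak(100.0, b, a)  # a low-frequency component outside it
print(f"gain at 600 Hz: {in_band:.2f}, gain at 100 Hz: {out_band:.2f}")
```

Every sample costs several multiplications and additions in such a filter, which is the processing time and power the text refers to; the resonant microphone performs the equivalent selection mechanically.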
Figure 3.57: Waveform of a lung sound with wheezing (a) recorded by cantilever resonant microphone #7
in the RMA, in which the wheezing is distinguishable, and (b) recorded by the reference microphone
GRAS 40AO, in which the wheezing is not distinguishable.
Figure 3.58: Spectrograms of the lung sound of Fig. 3.57, in which the wheezing is obvious for both
(a) the recording by cantilever resonant microphone #7 in the RMA and (b) the recording by the
reference microphone GRAS 40AO.
Figure 3.59: Waveform of a lung sound with wheezing (a) recorded by the reference microphone, in which
the wheezing is not distinguishable, and (b) recorded by the reference microphone followed by digital
band-pass filtering (500 – 700 Hz), in which the wheezing is distinguishable.
Figure 3.60: Waveform and spectrogram of a lung sound with weak wheezing recorded by microphone #6 in
the RMA: (a) waveform, in which the wheezing is not distinguishable; (b) spectrogram, in which the
wheezing is distinguishable.
3.5.4.2 Automatic classification
Automatic classification of lung sounds with and without wheezing is necessary for lung sound
monitoring so that the patient and the care provider can be alerted as soon as wheezing occurs.
Classification accuracy is the most important factor. It depends on both the quality of the lung
sound recorded by the microphones and the algorithm used for classification. In this study, fifty
lung sound recordings from the ICBHI lung sound database, twenty-five with wheezing and twenty-five
without, were played by a speaker and recorded by the RMA with width-stepped cantilever resonant
microphones and by the reference microphone (GRAS 40AO) through the plane wave tube (Fig. 3.43) in
an anechoic chamber. The fifty recordings were then classified automatically, and the
classification accuracies were compared between the RMA and the reference microphone.
Figure 3.61: Waveform and spectrogram of a lung sound with weak wheezing recorded by the reference
microphone GRAS 40AO: (a) waveform, in which the wheezing is not distinguishable; (b) spectrogram,
in which the wheezing is not distinguishable.
Two types of deep learning algorithms were used for the automatic lung sound classification.
One is a TCN (Temporal Convolutional Network), in which the recorded lung sound waveforms (8
channels with the RMA and 1 channel with the reference microphone) were fed in directly, without
any pre-processing. Twelve layers were designed in the TCN to extract a ten-dimensional feature
vector for the classification (Fig. 3.14). The other is a traditional CNN (Convolutional Neural
Network), in which features extracted from the recorded lung sound waveforms, namely MFCC
(Mel-frequency cepstral coefficients), CSTFT (Chroma Short-Time Fourier Transform), and mSpec
(Mel-Spectrogram), were processed for the classification. The differences between these two
algorithms are shown in Fig. 3.62.
Figure 3.62: Schematic of the deep learning algorithms for the automatic lung sound classification:
(a) TCN without pre-processing of the recordings (recorded waveform in time domain → TCN →
classification results); (b) CNN with features pre-extracted from the recordings (recorded waveform
in time domain → feature extraction: MFCC, CSTFT, mSpec → CNN → classification results).
The K-fold cross-validation method was applied for training and testing in the deep learning
algorithms so that all the data could be used for both training and testing. The recordings were
divided randomly into K subgroups, where K is an integer. In each iteration, one of the groups was
used for testing while the other groups were used for training, until every group had been used for
testing (Fig. 3.63). The total classification accuracy is the average accuracy over all K
iterations. K was 5 in this study, which means that there were 5 iterations in total, each with 40
recordings for training and 10 recordings for testing.
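The split described above can be sketched in a few lines. This is a generic illustration of 5-fold cross-validation on 50 recording indices using only the Python standard library. The random seed, the function name, and the use of plain indices in place of the actual recordings are assumptions; the training code used in this work is not shown in the source.

```python
import random


def k_fold_splits(num_items, k, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(num_items))
    random.Random(seed).shuffle(indices)  # random assignment to subgroups
    fold_size = num_items // k            # assumes num_items is divisible by k
    folds = [indices[i * fold_size:(i + 1) * fold_size] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_splits(num_items=50, k=5))
for fold_number, (train, test) in enumerate(splits, start=1):
    print(f"fold {fold_number}: {len(train)} training, {len(test)} test recordings")

# Every recording is used for testing exactly once across the 5 iterations.
tested = sorted(idx for _, test in splits for idx in test)
```

The overall accuracy would then be the mean of the five per-fold test accuracies.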
The automatic classification accuracy for the lung sounds recorded by the RMA was higher than
that for the sounds recorded by the reference microphone (Fig. 3.64) because of the RMA’s high
sensitivity and low noise floor, which make the wheezing feature more distinguishable (Figs. 3.57 –
3.61). Thus, the RMA performs better for lung sound monitoring than the traditional reference
microphone, which has uniform but lower sensitivity and a higher noise floor.
Figure 3.63: Schematic of K-fold cross-validation (K=10) [69].
Figure 3.64: Automatic wheezing classification accuracy for lung sounds recorded by the RMA with
width-stepped cantilever resonant microphones and by the reference microphone GRAS 40AO, with two
deep learning algorithms: (a) TCN (temporal convolutional networks) and (b) CNN with features of
MFCC, CSTFT, and mSpec extracted from the recordings as input. Values shown in the figure: 92%, 84%,
92%, and 86%.
3.6 Summary
Four RMAs, with rectangular cantilever microphones, spiral microphones, plate microphones with
serpentine support beams, and width-stepped cantilever microphones, were developed successfully for
wheezing detection and lung sound classification, with resonance frequencies of 200 – 800 Hz, where
wheezing is prevalent. The novel RMAs achieved higher unamplified sensitivity and a lower noise
floor than a conventional microphone. The width-stepped cantilever microphones have the highest
unamplified sensitivity (86 mV/Pa ~ 265 mV/Pa at resonances) among all reported microphones in the
frequency range below 800 Hz, and the resonant plate microphones with serpentine support beams
combine a smaller size with high unamplified sensitivity. Based on these advantages, it has been
shown that wheezing can be distinguished better, and that the automatic classification accuracy is
higher, with the RMAs than with the conventional microphone.
Chapter 4
RMA for Active Noise Cancellation
This chapter presents active noise cancellation (ANC) based on two sets of MEMS resonant
microphone arrays (RMAs), which offer very high sensitivities (and thus very low noise floors) near
their resonant frequencies and also provide a natural filtering effect. The chapter describes ANC
between 5 and 9 kHz and shows improved speech recognition in a noisy environment through an
automatic speech recognition (ASR) test of speech with different noises (of different intensity
levels) added [81]. In all the tested cases, the word error rate (WER) is lower with the ANC than
without it. In addition, smaller RMAs with 2 µm thick Si width-stepped cantilever resonant
microphones were developed successfully for hearing aids.
4.1 Introduction
About 27.7 million adults aged 20 – 69 in the United States [70] suffer from hearing
loss and need hearing aids. Despite marked improvements in hearing aids, they still do not offer
clear and understandable speech in a noisy environment [71]. Noise can be reduced with analog
filters or digital signal processing applied to the signals picked up by a microphone [72] or a
resonant microphone array [53]. But these methods cannot cancel the noise that reaches the ear
directly without being picked up by a microphone, as illustrated in Fig. 4.1 [73]. Such noise can be
canceled or suppressed only through active noise cancellation (ANC). Present research on ANC for
hearing aids uses feedforward and feedback microphones plus digital algorithms to control the noise
through a secondary path [73-75]. These solutions, however, would be power hungry, and their
ANC performance depends on how low the noise floor of the microphones is. Resonant
microphone arrays (RMAs), on the other hand, do not need additional filters because they are
natural acoustic filters. Any noise frequencies of interest can be targeted through a combination of
different resonant microphones in the RMA, so less power would be consumed.
Through this self-filtering effect, plus high sensitivity and a low noise floor, better ANC at low
power can be achieved with RMAs.
4.2 Resonant microphone arrays (RMAs)
For the ANC, we used two sets of resonant microphone arrays shown in Fig. 4.2. Each array
is composed of multiple piezoelectric center width-stepped cantilever (which can be called paddle)
microphones with different resonant frequencies covering two frequency ranges: one between 0.8
and 5 kHz and the other between 5 and 9 kHz. The microphones provide high sensitivities and a
natural filtering effect near the resonant frequencies.
The basic structure of each cantilever microphone [45] is a 5 µm thick Si paddle, a cantilever-like
structure with a narrower width near the anchor (Fig. 4.2c). This structure reduces the footprint
of each microphone compared to a traditional rectangular cantilever for a given resonant frequency.
The size and resonant frequency of each microphone in the two arrays are shown in Tables 4.1 and
4.2. The gap between the edge of the microphone paddle and the Si substrate is kept as small as
20 µm to reduce sound leakage through the gap (for a good low-frequency response). A patterned
0.5 µm thick piezoelectric ZnO film covers only the narrow area of the paddle (near the anchor) to
convert the sound to voltage.
Figure 4.1: In hearing aids, noise through Path 2 can only be reduced by active noise
cancellation (ANC).
The fabrication process of the device is the same as that of the RMA with resonant rectangular
cantilevers described in Fig. 3.4. Each cantilever microphone works independently, so that any of
them can be combined for different applications. The microphone array shown in Fig. 4.2a has 13 such
resonant microphones [45], and we use ten of them, with resonance frequencies of 856 – 4,889 Hz, to
sense speech, as the main speech spectrum is within 300 – 4,000 Hz. The other array (Fig. 4.2b) has
9 microphones with resonant frequencies of 5,380 – 8,820 Hz and is used to pick up sound in that
range so that the noise within that range can be actively canceled.
Figure 4.2: Photos of the fabricated resonant microphone arrays (a) with resonant frequencies from
856 to 4,889 Hz to sense speech [45] and (b) with resonant frequencies from 5,380 to 8,820 Hz to
actively cancel noise in that frequency range. (c) Photo of a resonant microphone in the array.
The measured frequency response and quality factor of the resonant microphones in each array
are shown in Figs. 4.3 and 4.4. The unamplified sensitivities at the resonant frequencies are higher
than that of a typical commercial microphone which is around 12.5 mV/Pa. The highest
unamplified sensitivity is 202.1 mV/Pa at 856 Hz, 16 times that of the commercial microphone.
The quality factors of the resonant microphones for speech sensing are 21.6 – 50.9 (Fig. 4.3), while
those for noise cancellation are 35.7 – 75.1 (Fig. 4.4).
Table 4.1 Resonant frequency and size of the microphones in the array Fig. 4.2a.
Leg# Resonant freq. (Hz) Size D (mm) Leg# Resonant freq. (Hz) Size D (mm)
1 856 2.5 6 3,046 1.5
2 1,342 2.3 7 3,460 1.3
3 1,750 2.0 8 3,793 1.2
4 2,180 1.8 9 4,356 1.2
5 2,595 1.6 10 4,889 1.2
Table 4.2 Resonant frequency and size of microphones in the array Fig. 4.2b.
Leg# Resonant freq. (Hz) Size D (mm) Leg# Resonant freq. (Hz) Size D (mm)
1 5,380 0.90 6 7,580 0.75
2 5,940 0.85 7 8,260 0.72
3 6,500 0.85 8 8,600 0.70
4 6,760 0.80 9 8,820 0.70
5 7,120 0.80
Figure 4.3: Frequency response of the resonant microphones to sense the speech with
high sensitivity from 856 to 4,889 Hz which covers the main spectrum of speech.
Figure 4.4: Frequency response of the resonant microphones to sense the high frequency
noise with high sensitivity from 5,380 to 8,820 Hz for active noise cancellation in this
frequency range.
4.3 Experimental setup for ANC
The experimental setup for testing the active noise cancellation (ANC) is illustrated in Fig. 4.5.
It mimics Fig. 4.1, which illustrates the need for ANC to cancel the noise arriving through
Path 2. The goal in this study is to actively cancel every sound (whether useful or not) over
5 – 9 kHz, which is above the range where the main speech information resides, both to obtain
better speech recognition and to remove the high frequency (or high pitched) sounds that bother
some listeners, especially those with hyperacusis. In the experimental setup (Fig. 4.6), the
resonant microphone array with resonance frequencies of 856 – 4,889 Hz, called the speech RMA,
picks up the speech over the frequency range where most of the speech information is present, while
the other resonant microphone array, with resonance frequencies of 5,380 – 8,820 Hz and called the
ANC RMA, picks up all the sounds in the frequency range where very little speech information is
present. The signal picked up by the ANC RMA is processed and then played back to cancel the noise
that would otherwise reach the eardrum through Path 2. A reference microphone (GRAS 40AO)
mimicking the eardrum is used to test the ANC performance.
Figure 4.5: Schematic for testing the ANC which mimics the situation illustrated in Fig. 4.1.
For comparison, ANC with a traditional flat-band electret condenser microphone (ECM)
(POM-3535L-3-LW100-R from PUI Audio, amplified sensitivity 17.8 mV/Pa, SNR 68 dB) is also
implemented by replacing the ANC RMA (and its accompanying electronics) with the flat-band
microphone in Fig. 4.5 and Fig. 4.6a.
Figure 4.6: Photos of (a) the ANC experimental setup in an anechoic chamber, (b) the RMA with
preamplifiers, which is placed in a metal box, and (c) the front view of the metal box with holes for
sound transmission to the RMA.
An analog summing and inverting circuit based on an op-amp, shown in Fig. 4.7a, is connected
to the ANC RMA to sum and invert the noise signals picked up by the resonant microphones so
that the noise is actively canceled with 180° out-of-phase sound, as illustrated in Fig. 4.5. For ANC
with a flat-band microphone, a high-pass (>5 kHz) op-amp inverter, shown in Fig. 4.7b, is applied
to the output of the flat-band microphone so that only the signal above 5 kHz passes through. Such
a high-pass filter is not required for the ANC RMA, which picks up sound over 5 – 9 kHz with its
resonances spread over that range, as the RMA has inherent filtering in the acoustic domain with a
high quality factor. The analog gains in Fig. 4.7 were optimized manually for maximum noise
reduction by adjusting R11-R19 in Fig. 4.7a and R2 in Fig. 4.7b so that the amplitude of the
noise-canceling sound played by the speaker is close to that of the noise. Although adaptive (or
automatic) gain control is needed for ANC in real applications, it is not necessary for comparing
the analog approach based on the RMA with that based on a flat-band microphone. The gain of the
amplification circuit for the speech RMA can be adjusted separately without affecting ANC
performance, as illustrated in Fig. 4.5.
Figure 4.7: Schematics of (a) the analog inverter for ANC RMA and (b) the high pass inverting
amplifier with f_c=5 kHz for ANC with a flat band microphone.
4.4 Digital algorithms for ANC
Besides the ANC experiments with the analog inverter described in the previous paragraphs,
digital signal processing was also used on the signals picked up by the ANC RMA and the flat-
band microphone to compare the various ANC approaches.
4.4.1 ANC phase compensation for RMA
The resonant microphone can be considered a second-order system with the transfer function

$$H(s) = \frac{k\,\omega_0^2}{s^2 + \omega_0 s/Q + \omega_0^2} \tag{4-1}$$

where $k$ is the DC gain, and $\omega_0$ and $Q$ are the resonance frequency and the quality factor
at resonance, respectively. Both are obtained from Figs. 4.3 and 4.4 and used with Eq. 4-1 to
implement the phase compensator described below. According to Eq. 4-1, there is a 90° phase
shift at the resonance frequency, which impacts the ANC performance.
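As a quick numerical check of the 90° phase shift, the following minimal sketch evaluates the second-order transfer function at s = jω. The values k = 1, f0 = 5,380 Hz, and Q = 35.7 are illustrative picks (the lowest ANC resonance in Table 4.2 and the low end of the reported quality-factor range), not a measured pairing, and the function names are hypothetical.

```python
import cmath
import math

def resonator_response(f, f0, q, k=1.0):
    """Evaluate H(s) = k*w0^2 / (s^2 + w0*s/Q + w0^2) at s = j*2*pi*f."""
    w0 = 2.0 * math.pi * f0
    s = 1j * 2.0 * math.pi * f
    return k * w0**2 / (s**2 + w0 * s / q + w0**2)

def phase_deg(h):
    """Phase of a complex response in degrees."""
    return math.degrees(cmath.phase(h))

f0, q = 5380.0, 35.7  # illustrative values only (assumption, see text)
for f in (0.5 * f0, f0, 2.0 * f0):
    h = resonator_response(f, f0, q)
    print(f"{f:8.0f} Hz  |H| = {abs(h):7.3f}  phase = {phase_deg(h):8.2f} deg")
```

With these assumptions the phase is close to 0° well below resonance, -90° at resonance (where the gain equals kQ), and close to -180° well above it.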
Consequently, a digital algorithm, illustrated in Fig. 4.8, is designed to compensate for the phase
shift. First, we note that the resonant microphone's transfer function converts the input signal
$s(n)$ to the output signal $x(n)$, which in the frequency domain is

$$X(e^{j\omega}) = S(e^{j\omega})\,H(e^{j\omega}) \tag{4-2}$$

where $X(e^{j\omega})$ and $S(e^{j\omega})$ are the Fourier transforms of $x(n)$ and $s(n)$,
respectively, while $H(e^{j\omega})$ is the transfer function of the resonant microphone.
If the digitized values of $x(n)$ are flipped in time (i.e., the signal in a buffer is convolved from
right to left), the time-reversed sequence is $x_{inv}(n)$, which in the frequency domain is

$$X_{inv}(e^{j\omega}) = S(e^{-j\omega})\,H(e^{-j\omega}) \tag{4-3}$$
where $X_{inv}(e^{j\omega})$ is the Fourier transform of $x_{inv}(n)$, the flipped $x(n)$. If
$X_{inv}(e^{j\omega})$ is passed through the transfer function of the resonant microphone
$H(e^{j\omega})$, we get

$$Y_{inv}(e^{j\omega}) = S(e^{-j\omega})\,H(e^{-j\omega})\,H(e^{j\omega}) \tag{4-4}$$
Then, by flipping $y_{inv}(n)$, which is the inverse Fourier transform of $Y_{inv}(e^{j\omega})$, in
the time domain, we have

$$Y(e^{j\omega}) = S(e^{j\omega})\,H(e^{j\omega})\,H(e^{-j\omega}) = S(e^{j\omega})\,\lvert H(e^{j\omega})\rvert^2 \tag{4-5}$$
Since $\lvert H(e^{j\omega})\rvert^2$ is real and non-negative, the phase of $y(n)$, which is the
inverse Fourier transform of $Y(e^{j\omega})$, is the same as that of the original noise $s(n)$. The
digital phase compensator shown in Fig. 4.8 thus removes the phase shift near the resonance of
the resonant microphone.
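The flip, filter, flip sequence can be tried numerically. The sketch below is a minimal illustration under stated assumptions: a simple two-pole digital resonator stands in for both the microphone and its digital model (the thesis derives its model from Eq. 4-1 with measured parameters), and the tone frequency, pole radius `r`, and all function names are hypothetical.

```python
import math

def resonator(x, theta, r):
    """Two-pole resonator: y[n] = x[n] + 2r*cos(theta)*y[n-1] - r^2*y[n-2]."""
    a1, a2 = 2.0 * r * math.cos(theta), -r * r
    y, y1, y2 = [], 0.0, 0.0
    for v in x:
        y0 = v + a1 * y1 + a2 * y2
        y.append(y0)
        y1, y2 = y0, y1
    return y

def phase_deg(sig, theta, start, stop):
    """Phase of the component at theta rad/sample, relative to sin(theta*n)."""
    c = sum(sig[n] * math.cos(theta * n) for n in range(start, stop))
    s = sum(sig[n] * math.sin(theta * n) for n in range(start, stop))
    return math.degrees(math.atan2(c, s))

def compensate(x, theta, r):
    """Flip in time, pass through the same resonator model, flip back."""
    return resonator(x[::-1], theta, r)[::-1]

N = 4000
theta = 2.0 * math.pi * 0.1      # test tone at the resonance, 0.1 cycles/sample
r = 0.98                         # pole radius (assumed), sets the quality factor
s = [math.sin(theta * n) for n in range(N)]
x = resonator(s, theta, r)       # stand-in for the phase-shifted microphone output
y = compensate(x, theta, r)      # output of the phase compensator

mid = (1000, 3000)               # whole number of periods, away from edge transients
print("phase of x:", round(phase_deg(x, theta, *mid), 2), "deg")
print("phase of y:", round(phase_deg(y, theta, *mid), 2), "deg")
```

In this sketch the resonator output `x` lags the input tone by tens of degrees, while the compensated `y` is back in phase with it. The cost is that the backward pass needs a buffered block of samples, so it adds latency.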
For ANC with the RMA and the digital phase compensator, the gains of $y(n)$ at the 9 channels
of the ANC RMA are optimized so that $E$, the noise energy after ANC, is minimized:

$$E = \sum_n \left( G \sum_{i=1}^{9} G_i\, y_i(n) - s_0(n) \right)^2 \tag{4-6}$$

where $s_0(n)$ is the noise reaching the test microphone, $G_i$ is the gain of $y_i(n)$ at the $i$-th
channel, and $G$ is the gain of the summed $y(n)$ for ANC. This optimization is currently
implemented on a laptop computer and takes 13 msec.
Figure 4.8: Digital phase compensator: s(n) is the noise signal; H(s) is the transfer function of the
resonant microphone; y(n) is the output of the phase compensator.
4.4.2 ANC with digital adaptive filter
A digital adaptive filter (Fig. 4.9) has been used for both the RMA and the reference flat-band
microphone. A least mean square (LMS) optimizer is used to optimize a 101 order (100 taps)
causal FIR (finite impulse response) filter, with a step size of 0.1 for the gradient descent process,
so that the error noise energy $E = \sum_n (s(n) - y(n))^2$ is minimized. There is a tradeoff between
the convergence rate (determined by the step size) and the noise reduction performance.
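The LMS update can be sketched in a few lines. This is a generic textbook LMS loop, not the implementation used for the experiments: the 8-tap length, the synthetic `true_path`, and the uniform random input are assumptions chosen to keep the example short, and only the step size of 0.1 is taken from the text (the scaling convention of the step size may differ).

```python
import random

def lms(x, d, n_taps, mu):
    """Adapt FIR weights w so that y(n) = sum_k w[k]*x(n-k) tracks d(n)."""
    w = [0.0] * n_taps
    buf = [0.0] * n_taps              # most recent input sample first
    errors = []
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        y = sum(wk * bk for wk, bk in zip(w, buf))
        e = dn - y
        # Gradient-descent step on the instantaneous squared error e^2.
        w = [wk + mu * e * bk for wk, bk in zip(w, buf)]
        errors.append(e)
    return w, errors

random.seed(0)
true_path = [0.0, 0.5, -0.3, 0.1]     # hypothetical path from ANC mic to eardrum
x = [random.uniform(-1.0, 1.0) for _ in range(5000)]
d = [sum(h * (x[n - k] if n >= k else 0.0) for k, h in enumerate(true_path))
     for n in range(len(x))]

w, err = lms(x, d, n_taps=8, mu=0.1)
early = sum(e * e for e in err[:200])
late = sum(e * e for e in err[-200:])
print("weights:", [round(v, 3) for v in w])
print("error energy, first vs last 200 samples:", early, late)
```

A larger `mu` shortens the settling time but, with noisy signals, leaves a larger residual error, which is the tradeoff noted above.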
4.4.3 ANC with deep learning
Deep learning (Fig. 4.10) has also been applied to process the noise signal picked up by the
ANC RMA and the flat-band microphone to evaluate the performance of the RMA-based and
flat-band-microphone-based ANC approaches. The deep learning algorithm is based on a temporal
convolutional network (TCN) [9], which regresses the noise at the current time from the previous
0.09 seconds of data picked up by the ANC microphone, after training with a set of noises
(e.g., white noise, mixed sinusoidal noise, etc.). During training, it converges to the best
performance after 11 seconds of training sounds with a learning rate of 0.02.
Figure 4.9: Schematic of ANC with a digital adaptive filter.
4.5 ANC results
The noise levels before and after the ANC with RMA and analog inverter are shown in Fig.
4.11 for white noise over 5,380 – 8,820 Hz. The noise reduction is not uniform over the frequency
range. At most frequencies, the noise is reduced by 0 – 10 dB. At some frequencies, such as 5.985,
7.500, 8.160, and 8.500 kHz, the noise level changes by -20 to -25 dB. These frequencies are close
to the resonance frequencies (vertical dashed lines in Fig. 4.11) of the ANC RMA. Thus, the
resonance frequencies of the ANC RMA can be designed specifically for particular noise
frequencies for the best noise reduction. In general, noise reduction can be improved with a larger
number of resonant
microphones in the ANC RMA so that noise at more frequencies can be canceled out well.
However, the size of the RMA would be larger, and there is a trade-off between ANC performance
and device size. Also, the ANC with RMA can be more effective, if the phase shift issue at the
microphone’s resonance frequency is compensated, as can be seen in Fig. 4.11 with the
experimental data obtained with the digital phase compensator (Fig. 4.8).
Figure 4.10: Schematic of the deep learning with TCN for ANC.
In comparison, the noise reduction with ANC through a flat-band microphone is uniform, but
the reduction level is very limited, as shown in Fig. 4.14. Figure 4.16 shows the percentage of
each noise reduction level within the frequency range of 5,380 – 8,820 Hz for the two approaches.
Noise reduction of -10 to -30 dB occurs more often with the RMA-based approach than
with the flat-band-microphone-based approach.
The ANC with a digital adaptive filter (Fig. 4.9) improves the ANC performance with both
RMA and flat band microphone over the ANC with deep learning, as shown in Figs. 4.12 and 4.15.
We can see that -20 dB to -10 dB noise reduction can be achieved at most frequencies for white
noise over 5,380 – 8,820 Hz. However, the noise reduction with the adaptive filter is still limited
when the noise signal level before ANC is low, as shown in Fig. 4.12 at 7.6 kHz and Fig. 4.15 at
7.5 kHz. In this situation, the RMA-based approach performs better than the flat-band-
microphone-based approach, as shown in Fig. 4.13, which plots the noise change after ANC versus
the noise level. The reason is that when the noise signal level is lower than the equivalent input-
referred noise (or the noise floor) of the flat band microphone, the microphone is unable to pick
up the noise for ANC. Thus, the highly sensitive RMA (with a much lower noise floor than the
flat band microphone at the RMA’s resonance frequencies) offers a means for better ANC in the
case of a low level of sound/noise.
The noise reduction with the deep learning is lower than that with the adaptive filter but can
be improved with optimized deep learning. However, the deep learning technique works only for
noises that can be trained into deep learning and cannot cancel any and every kind of noise.
Moreover, long training time undermines the application of deep learning to real time ANC with
time-varying noises. In the case of the ANC with the adaptive filter, the noise energy reduction
is 30%, 85%, and 96% for filter-optimization times of 0.016, 0.172, and 3.127 sec,
respectively. For fast time-varying noise, a high convergence rate and a short settling time are
necessary for good ANC performance.
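To compare these percentages with the dB values used elsewhere in this chapter, an energy reduction can be converted to a level change with 10·log10 of the remaining energy fraction. The short sketch below does this for the three figures quoted above; the helper name is hypothetical.

```python
import math

def reduction_db(fraction_removed):
    """Noise level change in dB after removing a fraction of the noise energy."""
    return 10.0 * math.log10(1.0 - fraction_removed)

for t, frac in ((0.016, 0.30), (0.172, 0.85), (3.127, 0.96)):
    print(f"{t:5.3f} s: {frac:.0%} energy removed -> {reduction_db(frac):6.2f} dB")
```

The three cases correspond to roughly -1.5 dB, -8.2 dB, and -14.0 dB.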
Figure 4.11: Noise power spectral densities before, and after ANC with RMA, with analog inverter
and with digital phase compensator.
Figure 4.12: Noise power spectral densities before and after ANC with RMA plus adaptive filter and with RMA
plus deep learning.
Figure 4.13: Noise change after ANC with adaptive filter for both RMA and flat band microphone vs the
noise power spectral density (PSD) before ANC.
Figure 4.14: Noise power spectral density before and after ANC with flat band microphone plus analog
inverter.
Figure 4.15: Noise power spectral density before and after ANC with flat band microphone plus adaptive
filter and with flat band microphone plus deep learning.
Figure 4.16: Percentage of noise reduction levels for ANC with analog inverter for RMA and flat band
microphone.
4.6 Speech recognition
Speech plus noise with different SNRs was played and then recorded by the test mic in Fig.
4.5 with and without active noise cancellation. The recorded speech was fed to automatic
speech recognition (ASR) on the IBM deep learning platform [76]. The measured word error rates
(WERs) of the ASR with and without the RMA-based ANC with the analog inverter, for both white
noise and sinusoidal noises (5.985, 7.500, 8.160, and 8.500 kHz, which are close to resonance
frequencies of the ANC RMA), are shown in Fig. 4.17. Overall, the WER with the ANC is lower
than that without the ANC, as expected. The improvement of the WER with the ANC is largest
for speech with a signal-to-noise ratio (SNR) around -25 dB. As the acceptable WER is typically
25% [77], the minimum acceptable SNR is -22 dB without the ANC, while it is -26 dB with the
ANC. If the frequencies of the noises in a real environment are not near the resonance
frequencies of the ANC RMA, the WER in ASR will suffer, and more resonant microphones in the
ANC RMA (at the cost of a larger device size) or a digital adaptive filter (e.g., Fig. 4.9 with its
optimization time substantially reduced) may be needed.
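For readers unfamiliar with the metric, WER is conventionally the word-level edit distance (substitutions, deletions, and insertions) between the recognized text and the reference transcript, divided by the number of reference words. The sketch below shows that standard definition; it is not the scoring code of the ASR platform used here, and the example sentences are made up.

```python
def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance / number of words in the reference."""
    ref, hyp = reference.split(), hypothesis.split()
    # dist[j] holds the edit distance between ref[:i] and hyp[:j].
    dist = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        prev_diag, dist[0] = dist[0], i
        for j, h in enumerate(hyp, start=1):
            cur = min(dist[j] + 1,            # deletion of a reference word
                      dist[j - 1] + 1,        # insertion of an extra word
                      prev_diag + (r != h))   # substitution, or match at no cost
            prev_diag, dist[j] = dist[j], cur
    return dist[-1] / len(ref)

# Made-up sentences: one of four reference words is misrecognized.
print(word_error_rate("turn the volume down", "turn the volume town"))
```

With the 25% threshold quoted above, this made-up result would sit exactly at the limit of acceptability. The function assumes a non-empty reference.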
Since the test mic (in Fig. 4.5) has a noise floor, the sound pressure level (SPL) of the speech
also affects speech recognition for a given SNR in the environment. As shown in Fig. 4.18, the
measured WER increases with lower SPL for the same SNR of -20 dB of the speech. As the SPL
is decreased, the WER worsens more slowly with the ANC than without it. The minimum
acceptable SPL (defined by the maximum acceptable WER of 25%) is 67 dB when ANC is used,
lower than the 69 dB when no ANC is used. The significance of this finding is that the hearing-aid
wearer can understand speech without going closer to the speaker. It is noted that the noise floor
of the test microphone (in this case the reference microphone GRAS 40AO) affects the net SNR
as the SPL of the speech is reduced, for a given SNR in the environment.
Figure 4.17: Word error rate (WER) of ASR before and after the ANC with RMA and analog inverter
for the speech with different SNR (with the 70 dB SPL on the test microphone for all the speech files).
Figure 4.18: Word error rate (WER) of ASR before and after the ANC with RMA and analog inverter
for the speech with the same SNR of -20 dB but with different SPL of the speech on the test microphone.
4.7 Smaller RMA for ANC
4.7.1 Design
The RMAs tested for ANC of hearing aids, shown in Fig. 4.2, are 9.3 mm × 25.7 mm, which is too
big to be placed in real hearing aids. Therefore, smaller RMAs are required. A straightforward way
to shrink the resonant microphones while keeping similar resonant frequencies is to reduce
the thickness of the microphone cantilevers/paddles. Cantilevers 2 µm thick were investigated
instead of the 5 µm thick ones used in the previous RMAs we studied. The cost is that the
fabrication yield was lower because thin structures break easily during the fabrication process. At
the same time, there was more warpage because of the residual stress of the deposited thin films,
which led to lower sensitivity through sound leakage or even to broken cantilevers. There is little
stress in the cantilever base released from the SOI wafer. The stress of the thin films of sputtered
piezoelectric ZnO, insulator PECVD SiN, and electrode Al was characterized and minimized by
optimizing the deposition parameters.
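A first-order estimate shows why thinning the cantilever shrinks it. For a uniform, single-material Euler-Bernoulli cantilever, the resonance frequency scales as thickness over length squared, so keeping the frequency fixed gives a length that scales with the square root of the thickness ratio. The sketch below applies that rule; it ignores the width steps and the ZnO, SiN, and Al layers of the actual devices, so it is only a rough guide and not the design procedure used here. The function name and the 0.90 mm example are illustrative.

```python
import math

def scaled_length(length_mm, t_old_um, t_new_um):
    """Length giving the same resonance after a thickness change, using f ~ t / L^2."""
    return length_mm * math.sqrt(t_new_um / t_old_um)

ratio = scaled_length(1.0, 5.0, 2.0)
print(f"length scale factor for 5 um -> 2 um: {ratio:.3f}")
print(f"a 0.90 mm paddle would shrink to about {scaled_length(0.90, 5.0, 2.0):.2f} mm")
```

Under this idealization the length drops to about 63% of its original value. The multilayer, width-stepped devices will deviate from this figure.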
The designed RMA with thinner width-stepped cantilevers, illustrated in Fig. 4.19, is much
smaller than the previous ones we studied. The area of the RMA is only one tenth of the
previous size. The sizes and resonance frequencies of the resonant microphones in the RMA for
ANC are shown in Table 4.3. The size comparison between the newly designed resonant
microphones with 2 µm thick Si cantilevers and those studied in Chapter 3 with 5 µm thick
cantilevers is shown in Fig. 4.20. The fabrication is based on an SOI wafer with a 2 µm thick active
layer. The fabrication process is the same as that of the RMAs studied in Section 3.2, as illustrated
in Fig. 3.4.
Figure 4.19: Size comparison of RMAs studied in Chapter 3 and new design: (a) Design of the RMAs
(b) Area of RMAs.
Figure 4.20: Size of the resonant microphones (RM) with 5 and 2 µm thick Si paddles.
Table 4.3 Resonant frequencies in the RMA with 2 µm thick Si paddles for ANC.
No. Resonance frequency (Hz) Size (mm) No. Resonance frequency (Hz) Size (mm)
1 8,000 0.66 5 12,000 0.54
2 9,000 0.615 6 13,000 0.52
3 10,000 0.58
4 11,000 0.56
4.7.2 Small RMA fabricated
The RMA with 2 µm thick width-stepped cantilevers for ANC was fabricated successfully (Fig.
4.21), with the same processes as previous RMAs (Fig. 3.4).
4.7.3 Sensitivity of the RMA
The unamplified sensitivities of the RMA with 2 µm thick cantilevers are shown in Fig. 4.22.
The measured resonance frequencies are lower than the designed ones (Table 4.3) because the
active layer of the SOI wafer, from which the Si cantilevers were released, is thinner than expected.
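Figure 4.21: The fabricated RMAs with 2 µm thick width-stepped cantilevers for ANC (active noise
cancellation), panels (a) and (b). Photo labels: one array with microphones #1 – #8 (dimensions
6.6 mm and 10 mm) and one array with microphones #1 – #6 (dimensions 4.9 mm and 5.5 mm).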
Figure 4.22: Unamplified sensitivity and quality factor of the RMA with 2 µm thick cantilevers for ANC.
4.8 Summary
Active noise cancellation (ANC) based on two MEMS resonant microphone arrays (RMAs)
(one for speech and the other for ANC) is presented and shown to be effective in actively
canceling the noise over 5 – 9 kHz (above the range where most of the speech information resides)
and in improving speech recognition. Compared to a similar ANC based on a flat-band microphone,
ANC with the RMA can cancel more noise around its resonance frequencies because of the very
high sensitivities at those frequencies. The ANC performance with the RMA can be improved with
a digital phase compensator that takes care of the phase shift near the resonance frequency. Also,
ANC with the RMA reduces noise more than ANC with the flat-band microphone when the noise
level is low. Furthermore, the ANC with the RMA and analog inverter is tested for ASR as
a function of (1) the signal-to-noise ratio in the environment and (2) the sound pressure level of
speech, with two types of noise added. The ANC is shown to improve ASR performance. In
addition, a much smaller RMA with thinner width-stepped cantilever resonant microphones, which
is better suited to wearable applications, was developed successfully for ANC.
Chapter 5
RMAs for Speech Sensing and Recognition
This chapter presents RMAs with resonant frequencies of 300 – 8000 Hz for speech
recognition in human-machine interfaces and hearing aids. Three RMAs (one for the wide band
speech spectrum, one for the narrow band speech spectrum, and a small one for wearable
applications) with width-stepped cantilever resonant microphones were designed, fabricated, and
characterized successfully. Very high sensitivity and high SNR were achieved for the RMAs. The
SNR of the RMA for the narrow band speech spectrum is higher than 73 dBA at all resonances,
which is very good for speech sensing and recognition. At the same time, a smaller RMA with one
fifth of the area of the previously reported one was developed successfully for speech sensing,
which is good for wearable applications.
5.1 Introduction
Delivering clear speech in communication instruments such as telephones and cell phones, and
in medical instruments such as hearing aids, is extremely important. At the same time, automatic
speech recognition (ASR) is becoming more and more popular in smart personal assistants in cell
phones, smart speakers, cars, etc. as an important sensing part of the internet of things (IoT) and
cyber-physical systems (CPS) [78]. Therefore, a microphone with high sensitivity and SNR remains
an unmet need for clear speech sensing and accurate speech recognition.
5.2 RMAs for wide band and narrow band speech spectrum
5.2.1 Design
The frequencies are evenly distributed in existing RMAs. However, the human ear does not
perceive sound linearly with frequency, as shown in Fig. 5.2 [58], but perceives sound pitch
linearly in Mel, which can be expressed by the equation

$$\mathrm{Mels} = 1127 \ln\left(1 + \frac{f}{700}\right)$$

where $f$ is the frequency in Hz. Thus, the ASR would be better with the resonance frequencies
of the RMA evenly distributed on the Mel scale.
Figure 5.1: Speech recognition market size.
Figure 5.2: Subjective pitch and frequency [58].
Ideally, more resonant frequencies in the RMA would be better for ASR. However, the RMA
would be larger with more resonant microphones, which is not preferred for wearable
applications. Therefore, there is a trade-off between performance and size. Eight Mel-distributed
resonance frequencies are designed for both the narrow speech band (300 – 4000 Hz) and the
broad speech band (50 – 7000 Hz), as shown in Fig. 5.3, Table 5.1, and Table 5.2.
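The Mel spacing can be reproduced with a few lines of code. The sketch below places the centers of eight triangular filters evenly in Mel between two band edges. The edges used here (200 Hz and 4,000 Hz for the narrow band, 200 Hz and 8,000 Hz for the wide band) are assumptions: they were picked because they reproduce the resonance frequencies listed in Tables 5.1 and 5.2 to within 1 Hz, and they differ from the band limits quoted in the text. Function names are hypothetical.

```python
import math

def hz_to_mel(f):
    """Mel value of a frequency in Hz, using Mels = 1127 ln(1 + f/700)."""
    return 1127.0 * math.log(1.0 + f / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (math.exp(m / 1127.0) - 1.0)

def mel_centers(f_low, f_high, n_filters):
    """Center frequencies of n triangular filters evenly spaced in Mel."""
    lo, hi = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + k * step) for k in range(1, n_filters + 1)]

narrow = mel_centers(200.0, 4000.0, 8)   # band edges are an assumption
wide = mel_centers(200.0, 8000.0, 8)     # band edges are an assumption
print([round(f, 1) for f in narrow])
print([round(f, 1) for f in wide])
```

Changing the lower edge to 300 Hz or 50 Hz shifts every center, which makes the function a convenient way to see how the band limits determine the array design.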
Figure 5.3: Eight Mel filters in (a) narrow band and (b) broad band. The center frequencies are
designed for the resonance frequencies of the RMA.
5.2.2 RMAs fabricated
The RMAs are fabricated (Figs. 5.4 and 5.5) with the same process as previous RMAs for lung
sound detection and active noise cancellation (Fig. 3.4).
Table 5.1 Resonant frequencies in narrow band and sizes of the resonant microphones in the
RMA for ASR.
No. Resonance frequency (Hz) Size (mm) No. Resonance frequency (Hz) Size (mm)
1 381 3 5 1555 1.7
2 599 2.4 6 2009 1.58
3 861 2.2 7 2555 1.43
4 1176 1.87 8 3211 1.3
Table 5.2 Resonant frequencies in wide band and sizes of the resonant microphones in the
RMA for ASR.
No. Resonance frequency (Hz) Size (mm) No. Resonance frequency (Hz) Size (mm)
1 458 2.6 5 2474 1.42
2 790 2.2 6 3384 1.3
3 1217 1.8 7 4555 1.1
4 1767 1.6 8 6062 0.95
Figure 5.4: Photos of the fabricated RMA for narrow band speech spectrum (microphones #1 – #8;
labeled dimensions 8.25 mm and 18.43 mm).
5.2.3 Sensitivity
The sensitivity is measured in a plane wave tube acoustic environment (Fig. 3.41) in which
only plane waves can propagate, so that the DUT and the reference microphone GRAS 40AO
receive the same sound pressure. The unamplified sensitivity (measured sensitivity divided by the
amplification gain of 100) of the RMA for the narrow band speech spectrum is higher than those of
other MEMS flat-band microphones in the same frequency range (Figs. 5.6a and 5.7). For the RMA
for the wide band speech spectrum, only the MEMS microphone in [19] has higher sensitivity at
frequencies higher than 2500 Hz (Figs. 5.6b and 5.7). The RMA in [45] has higher unamplified
sensitivity at its resonances. However, its lowest resonance frequency is higher than 800 Hz, much
higher than the low frequencies in speech (300 Hz). The RMAs developed in this work provide
both high unamplified sensitivities and a wide frequency range for speech sensing and recognition.
The quality factors of the resonant microphones increase with resonance frequency because the
area is smaller at higher resonance frequencies, so that there is less damping, which is mainly
caused by the air around the microphone.
Figure 5.5: Photos of the fabricated RMA for wide band speech spectrum: (a) top side view, (b)
back side view (microphones #1 – #8; labeled dimensions 17.42 mm and 8.18 mm).
Figure 5.6: Measured unamplified sensitivities of the RMAs (a) for narrow band speech spectrum
and (b) for wide band speech spectrum.
Figure 5.7: Comparison of microphones’ unamplified sensitivities.
5.2.4 Noise
The noise of the RMAs with the op-amps LTC6244 was measured inside double metal boxes on
a vibration isolation table (Fig. 3.48). This test configuration shields the measurement from
electromagnetic noise, ambient sound, and vibration. The measured input-referred RMS noise
over 20 Hz – 20 kHz of the op-amp with the RMA is around 4.5 µV for both fabricated RMAs (Fig.
5.8). The minimum detectable sound of the RMA for the narrow band speech spectrum is 0.7 dBA –
20.1 dBA at the resonances, and that of the RMA for the wide band speech spectrum is 1.7 dBA –
31.4 dBA at the resonances (Fig. 5.9). Accordingly, the SNR of the RMA for the narrow band
speech spectrum is 93.3 dBA – 73.9 dBA, and that of the RMA for the wide band speech spectrum
is 92.3 dBA – 62.6 dBA (Fig. 5.10). The SNR is higher at lower frequencies because the
sensitivities at the lower resonance frequencies are higher while the noise level is similar. Such
high SNR is good for speech sensing and recognition.
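The minimum detectable sound and the SNR quoted above are two views of the same ratio of noise to sensitivity. The sketch below assumes the usual conventions (20 µPa reference pressure, SNR referred to a 1 Pa tone, which is about 94 dB SPL); the helper names and the 100 mV/Pa sensitivity in the last line are hypothetical, and A-weighting of the noise is taken as already applied.

```python
import math

P_REF = 20e-6                                  # Pa, reference pressure for dB SPL
SPL_1PA = 20.0 * math.log10(1.0 / P_REF)       # about 94 dB SPL

def min_detectable_spl(noise_v_rms, sensitivity_v_per_pa):
    """SPL at which the microphone output equals its input-referred noise."""
    return 20.0 * math.log10(noise_v_rms / sensitivity_v_per_pa / P_REF)

def snr_at_1pa(noise_v_rms, sensitivity_v_per_pa):
    """SNR in dB for a 1 Pa tone."""
    return 20.0 * math.log10(sensitivity_v_per_pa / noise_v_rms)

def snr_from_mds(mds_db):
    """SNR implied by a minimum detectable sound level, in dB."""
    return SPL_1PA - mds_db

# Minimum detectable sound levels quoted in the text.
for mds in (0.7, 20.1, 1.7, 31.4):
    print(f"MDS {mds:5.1f} dBA -> SNR {snr_from_mds(mds):5.1f} dBA")

# Hypothetical device: 4.5 uV noise and 100 mV/Pa unamplified sensitivity.
print(round(min_detectable_spl(4.5e-6, 0.1), 1), "dB SPL")
```

Under these conventions the four quoted minimum detectable sound levels map onto the four quoted SNR values.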
Figure 5.8: Op-amp input-referred RMS noise over 20 Hz – 20 kHz with the RMAs after A-weighting:
(a) RMA for narrow band speech spectrum and (b) RMA for wide band speech spectrum.
Figure 5.9: Minimum detectable sound of the RMAs with op-amp after A-weighting: (a) RMA for
narrow band speech spectrum and (b) RMA for wide band speech spectrum.
Figure 5.10: Measured SNR of the RMAs (a) for narrow band speech spectrum and (b) for wide
band speech spectrum.
5.3 RMA for wearable applications
The RMAs tested for speech sensing of hearing aids, shown in Fig. 4.2, are 9.3 × 25.7 mm²,
which is too big for real hearing aids and other wearable devices. The RMAs developed in this
chapter are smaller: 8.25 × 18.43 mm² for the narrow band speech spectrum and 8.18 ×
17.42 mm² for the wide band speech spectrum. However, they are still not small enough for
wearable applications. Thus, an even smaller RMA based on 2 µm thick Si width-stepped
cantilevers, replacing the previous 5 µm thick Si cantilevers, was designed, fabricated, and
characterized successfully.
The size of the smaller RMA for speech sensing is 5.6 × 8.3 mm², which is much smaller
than the previous RMAs based on 5 µm thick Si cantilevers (Fig. 5.11). The design of each resonant
microphone in the array is shown in Table 5.3.
Figure 5.11: Size comparison of RMAs studied for speech sensing and recognition: (a) Design of the
RMAs (b) Area of RMAs. Labeled sizes: RMA in Fig. 4.2(a), 9.3 mm × 25.7 mm; RMA in Fig. 5.4,
8.25 mm × 18.43 mm; RMA in Fig. 5.5, 8.18 mm × 17.42 mm; current design, 5.6 mm × 8.3 mm.
Table 5.3 Resonant frequencies in narrow band and sizes of the resonant microphones in the RMA.
No. Resonance frequency (Hz) Size (mm) No. Resonance frequency (Hz) Size (mm)
1 381 2.2 5 1555 1.235
2 599 1.835 6 2009 1.111
3 861 1.58 7 2555 1.15
4 1176 1.39 8 3211 1.032
The RMA was fabricated successfully (Fig. 5.12). The unamplified sensitivities of the RMA
with 2 µm thick cantilevers (Fig. 5.13) are lower than those with 5 µm thick cantilevers (Fig. 5.6)
because the 2 µm thick resonant cantilever microphones have larger warpage caused by the
compressive stress of the ZnO thin film (Fig. 5.12). The warpage may also cause the quality factors
of the resonant microphones in the RMA (between 5 and 25, Fig. 5.13) to be smaller than those
with 5 µm thick cantilevers (between 15 and 40, Fig. 5.6).
Figure 5.12: The fabricated RMAs with 2 µm thick width-stepped cantilevers for speech sensing
(microphones #1 – #8; labeled dimensions 5.6 mm and 8.3 mm).
Figure 5.13: Unamplified sensitivity and quality factor of the RMA with 2 µm thick cantilevers for speech
sensing.
5.4 Summary
Two RMAs based on 5 µm thick width-stepped Si cantilever resonant microphones were
developed successfully, one for the narrow band speech spectrum and one for the wide band
speech spectrum. High sensitivities and SNRs were achieved for speech sensing and recognition.
At the same time, a small RMA based on 2 µm thick width-stepped Si cantilever resonant
microphones, with one fifth of the area of the previously reported RMA for speech recognition,
was designed and fabricated successfully for wearable applications.
Chapter 6
Conclusion and Future Directions
6.1 Conclusion
Sensitivities and SNR higher than those of reported microphones are achieved by developing
novel piezoelectric MEMS resonant microphone arrays (RMAs), which work as both sensors and
filters at resonances covering the frequency range where wheezing is prominent. Accordingly,
wheezing can be distinguished better in both the frequency domain and the time domain.
Consequently, higher automatic classification accuracy is achieved. The present work paves the
way to a highly sensitive and thin wearable electronic stethoscope with which asthma can be
detected in advance through lung sound monitoring. Through medical cyber-physical systems
(MCPS), the caregiver can respond quickly according to the monitoring results and have the
patients treated in real time.
At the same time, MEMS RMA is developed and applied for active noise cancellation (ANC)
successfully for the first time. ANC has been widely used in headphones and cabins of cars to
control the noise from the environment, and can also be used for hearing aids. The high SNR of
the RMA makes it detect weak noise in the environment better than the traditional microphone
with uniform but low sensitivities and SNR. Higher speech recognition rate is demonstrated for
hearing aid application with the RMA for ANC.
Also, RMAs of Mel-distributed resonance frequencies with high sensitivities and SNR are
developed for speech recognition. Both the narrow band and the wide band speech spectra are
considered for trading off the RMA size, which is affected by the number of resonant
microphones in the array and the frequency range covered. Meanwhile, much smaller RMAs based
on thinner Si cantilevers are developed for wearable applications. This work provides low-power
and highly sensitive microphones not only for traditional communication and speech sensing in
telephones and hearing aids but also for the new human-machine interface of speech control
in smart phones, smart speakers, smart hearing aids, etc. in the scope of the internet of things (IoT)
and cyber-physical systems (CPS).
Besides experimental work, a complete lumped element model of MEMS piezoelectric
resonant microphone of Si cantilever with warpage in Si channel is developed. The warpage effect
on the acoustic impedance is studied theoretically, and shown to increase the air gap through which
acoustic pressure leaks (more at a lower frequency), leading to lower sensitivity as the frequency
decreases. At the same time, an analytical vibration model based on beam theory is developed for
a width-stepped cantilever with multiple thin film layers (piezoelectric film, electrodes, and
insulating layer), and is used for the impedance of the cantilever microphone and the piezoelectric
transform parameters in a lumped element model. A noise model is also built for the resonant
microphone together with the op-amp. The models for the microphone sensitivity’s frequency
response and noise are compared with measured data on RMA with width-stepped cantilevers.
The piezoelectric MEMS RMAs are expected to play important roles in CPS and MCPS for
smart consumer products, vehicles and medical devices due to their high sensitivities, low noises,
high SNR, zero-power, and natural filtering effect.
6.2 Future directions
Future directions include even better RMAs, low power algorithms, and system integration for
various applications of the electronic stethoscope and hearing aids, etc.
6.2.1 Electronic stethoscope
The high SNR of the developed RMAs not only makes it possible for the electronic stethoscope
to detect and classify lung sounds more accurately, but also makes the stethoscope more wearable
by eliminating, or reducing the size of, the acoustic coupler that is required to amplify the weak
lung sound mechanically for microphones with lower SNR. Smaller RMAs for wearable electronic
stethoscopes can be developed in the future by utilizing a thinner Si cantilever (2 µm or 1 µm),
which has been developed for speech sensing and active noise cancellation in this thesis. The
challenge of using a thinner Si cantilever is that it is more sensitive to the stress from the thin films
of piezoelectric material and electrodes. Fabrication processes should be further optimized to
control the stress of the various thin films so that there is little warpage of the resonant
microphones in the array. Alternatively, a proof mass is also very effective in reducing the size of
the cantilever at the target resonance frequency.
At the same time, though op-amps are used in current work to demonstrate the performance of
the RMAs, application specific integrated circuits (ASIC) with multiple channels for amplifying
the signals from the RMA and converting analog signals to digital signals can be developed for
the wearable stethoscope.
Besides hardware, low-power lung sound signal processing algorithms should be developed.
Two specifications are important for the algorithms. One is the lung sound classification accuracy.
The algorithms should accurately classify lung sounds with disease signatures such as wheezing so
that the patients and the care providers can act properly. Also, there are various noises in the lung
sound during recording, especially when the patients wear the stethoscope in changing
environments. The algorithms should be robust enough to classify the lung sounds in the presence
of all these noises. The other is the power consumption of the algorithms. It is very important for
wearable stethoscopes, as the battery cannot last long if the algorithms consume too much power.
Deep learning is powerful, but it usually consumes a lot of power. Low-power machine learning
algorithms can be developed for wearable stethoscopes with RMAs.
The algorithms can be implemented on a low power digital signal processor (DSP) or
microcontroller unit (MCU) for the wearable electronic stethoscope system.
6.2.2 Hearing aids
Smaller RMAs based on 2 µm thick Si cantilevers instead of 5 µm thick ones have been developed
in this thesis for both speech sensing and active noise cancellation (ANC) in hearing aids. The
resonance frequency shift is pronounced for the ANC RMA with high resonance frequencies because
of the large variation of the active layer thickness (±0.5 µm for a 2 µm thick active layer)
of the SOI wafer, whose active layer forms the Si cantilevers of the resonant microphones in the
array. The higher the resonance frequency, the larger its shift for the same thickness
variation. Two directions can be considered for accurate resonance frequency control. One is to
use an SOI wafer with tighter active layer thickness control. The other is to use a
proof mass to tune the resonance frequency. A proof mass also significantly reduces the
cantilever size needed for the same resonance frequency.
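The scaling of the frequency shift with the resonance frequency can be seen from a first-order estimate. For a uniform beam of fixed length, the resonance frequency is proportional to the thickness, so the fractional frequency shift equals the fractional thickness error. The sketch below applies that simplified relation (an assumption: the actual cantilevers also carry piezoelectric and electrode films, so their shift will differ somewhat) to the ±0.5 µm tolerance on a 2 µm active layer and to the two ends of the 5 – 9 kHz ANC band.

```python
def frequency_shift_hz(nominal_hz, nominal_thickness_um, thickness_error_um):
    """First-order shift, assuming f is proportional to thickness at fixed length."""
    return nominal_hz * thickness_error_um / nominal_thickness_um


if __name__ == "__main__":
    for f0 in (5000.0, 9000.0):
        shift = frequency_shift_hz(f0, 2.0, 0.5)
        print(f"{f0:.0f} Hz nominal -> +/- {shift:.0f} Hz")
    # prints:
    # 5000 Hz nominal -> +/- 1250 Hz
    # 9000 Hz nominal -> +/- 2250 Hz
```

In this simplified picture the same ±25% thickness error produces a shift of about 1.25 kHz at 5 kHz but about 2.25 kHz at 9 kHz.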
A challenge for ANC with an RMA is the 180-degree phase change across the resonance.
At frequencies below its resonance, the signal from a resonant microphone has the same
phase as the physical sound in the air. Thus, an analog inverter can be
connected to the microphone to reverse the phase, and the reversed signal can then be played
through a speaker to actively cancel the noise in the environment. However, at frequencies above
its resonance, the signal from the resonant microphone has the opposite phase to the physical sound
in the air. As a result, the signal should be played directly without an analog inverter to cancel the
noise actively. Proper algorithms should be developed to compensate for the phase of the signal
at frequencies above the resonance frequency of the resonant microphone.
At the same time, the algorithms should run fast enough to process the signals in real time with
little delay and should consume little power.
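The phase behavior described above follows from the response of a damped single-degree-of-freedom resonator. The sketch below is a simplified model, not the lumped element model of this thesis: it assumes the microphone output is proportional to the displacement of a second-order resonator, and the 7 kHz resonance frequency and quality factor of 20 are assumed values for illustration.

```python
import cmath
import math


def resonator_phase_deg(freq_hz, f0_hz, q_factor):
    """Phase of H = 1 / (1 - r^2 + j*r/Q), with r = f/f0, in degrees."""
    r = freq_hz / f0_hz
    h = 1.0 / complex(1.0 - r * r, r / q_factor)
    return math.degrees(cmath.phase(h))


def needs_inverter(freq_hz, f0_hz, q_factor):
    """True if the output is closer to in-phase than to anti-phase."""
    return abs(resonator_phase_deg(freq_hz, f0_hz, q_factor)) < 90.0


if __name__ == "__main__":
    f0, q = 7000.0, 20.0                # assumed resonance and quality factor
    for f in (3500.0, 6900.0, 7100.0, 14000.0):
        phase = resonator_phase_deg(f, f0, q)
        print(f"{f:.0f} Hz: {phase:.1f} deg, inverter: {needs_inverter(f, f0, q)}")
```

In this model the phase is close to 0 degrees well below the resonance and close to -180 degrees well above it, passing through -90 degrees at the resonance, so a fixed inverter is correct on only one side. Close to the resonance the phase is intermediate, which is one reason a frequency-dependent compensation may be needed rather than a simple switch.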
Bibliography
[1] R. (Raj) Rajkumar, I. Lee, L. Sha, and J. Stankovic, “Cyber-physical systems: the next
computing revolution,” in Proceedings of the 47th Design Automation Conference on -
DAC ’10, Anaheim, California, 2010, p. 731, doi: 10.1145/1837274.1837461.
[2] “Global MEMS Microphones Market”, https://www.kbvresearch.com/mems-
microphones-market/, (accessed May 1, 2022).
[3] A.-F. Pelé, “EETimes - Internet Giants to Boost MEMS Demand -,” EETimes, Oct. 08,
2019. https://www.eetimes.com/internet-giants-to-boost-mems-demand/ (accessed Apr. 11,
2020).
[4] “Amazon Echo Teardown,” iFixit, Dec. 16, 2014.
https://www.ifixit.com/Teardown/Amazon+Echo+Teardown/33953 (accessed Apr. 11,
2020).
[5] Derya Dalga, and Simon Doclo. "Combined feedforward-feedback noise reduction schemes
for open-fitting hearing aids." In 2011 IEEE Workshop on Applications of Signal
Processing to Audio and Acoustics (WASPAA), pp. 185-188. IEEE, 2011.
[6] B. Rajan, B. Bhavana, K. R. Anusha, G. Kusumanjali and G. S. Pavithra, "IoT based Smart
and Efficient Hearing Aid using ARM Cortex Microcontroller," 2020 International
Conference on Smart Technologies in Computing, Electrical and Electronics (ICSTCEE),
2020, pp. 229-233
[7] Insup Lee et al., “Challenges and Research Directions in Medical Cyber–Physical Systems,”
Proc. IEEE, vol. 100, no. 1, pp. 75–90, Jan. 2012, doi: 10.1109/JPROC.2011.2165270.
[8] K. N. Priftis, L. J. Hadjileontiadis, and M. L. Everard, Eds., Breath Sounds. Cham: Springer
International Publishing, 2018.
[9] R. X. A. Pramono, S. Bowyer, and E. Rodriguez-Villegas, “Automatic adventitious
respiratory sound analysis: A systematic review,” PLoS ONE, vol. 12, no. 5, p. e0177926,
May 2017, doi: 10.1371/journal.pone.0177926.
[10] “Electronic & Digital Stethoscopes for Sale | Eko.” https://shop.ekohealth.com/ (accessed
Apr. 11, 2020).
[11] “Data Sheet of ICS-40720”, https://product.tdk.com/info/en/documents/catalog_datasheet/
mems-mic/DS-000045-v1.3-ICS-40720.pdf, (accessed Apr. 11, 2020).
[12] L. R. Rabiner and R. W. Schafer, Theory and applications of digital speech processing, 1st
ed. Upper Saddle River: Pearson, 2011, p. 140.
[13] D. T. Martin, J. Liu, K. Kadirvel, R. M. Fox, M. Sheplak, and T. Nishida, “A
Micromachined Dual-Backplate Capacitive Microphone for Aeroacoustic Measurements,”
J. Microelectromech. Syst., vol. 16, no. 6, pp. 1289–1302, Dec. 2007.
[14] “Datasheet of INMP411”, https://product.tdk.com/info/en/documents/catalog_datasheet/
mems-mic/INMP411.pdf, (Accessed Apr. 11, 2020)
[15] “Datasheet of ICS-40730”, https://product.tdk.com/info/en/documents/catalog_datasheet/
mems-mic/ICS-40730-Datasheet.pdf, (accessed Apr. 11, 2020).
[16] “Datasheet of SPW2430HR5H-B”, https://www.knowles.com/docs/default-source/model-
downloads/spw2430hr5h-b-revb.pdf?Status=Master&sfvrsn=bd77b1_4, (accessed Apr. 11,
2020).
[17] R. Littrell and R. Gagnon, “PIEZOELECTRIC MEMS MICROPHONES NOISE
SOURCES,” in 2016 Solid-State, Actuators, and Microsystems Workshop Technical Digest,
Hilton Head, South Carolina, USA, May 2016, pp. 258–261, doi: 10.31438/trf.hh2016.69.
[18] “Datasheet of VM2000”, https://vespermems.com/wp-content/uploads/2018/01/Vesper-
VM2000-Brochure.pdf, (Accessed Apr. 11, 2020)
[19] P. R. Scheeper et al., “A new measurement microphone based on MEMS technology,” J.
Microelectromech. Syst., vol. 12, no. 6, pp. 880–891, Dec. 2003, doi:
10.1109/JMEMS.2003.820260.
[20] Seung S. Lee, R. P. Ried and R. M. White, "Piezoelectric cantilever microphone and
microspeaker," in Journal of Microelectromechanical Systems, vol. 5, no. 4, pp. 238-242,
Dec. 1996.
[21] S. T. Hansen, A. S. Ergun, W. Liou, B. A. Auld, and B. T. Khuri-Yakub, “Wideband
micromachined capacitive microphones with radio frequency detection,” The Journal of the
Acoustical Society of America, vol. 116, no. 2, pp. 828–842, Aug. 2004, doi:
10.1121/1.1771617.
[22] S.-C. Lo, S.-K. Yeh, J.-J. Wang, M. Wu, R. Chen, and W. Fang, “Bandwidth and SNR
enhancement of MEMS microphones using two poly-Si micromachining processes,” in
2018 IEEE Micro Electro Mechanical Systems (MEMS), Belfast, Jan. 2018, pp. 1064–1067,
doi: 10.1109/MEMSYS.2018.8346743.
[23] J. Hillenbrand and G. M. Sessler, “High-sensitivity piezoelectric microphones based on
stacked cellular polymer films (L),” The Journal of the Acoustical Society of America, vol.
116, no. 6, pp. 3267–3270, Dec. 2004, doi: 10.1121/1.1810272.
[24] M. D. Williams, B. A. Griffin, T. N. Reagan, J. R. Underbrink, and M. Sheplak, “An AlN
MEMS Piezoelectric Microphone for Aeroacoustic Applications,” J. Microelectromech.
Syst., vol. 21, no. 2, pp. 270–283, Apr. 2012, doi: 10.1109/JMEMS.2011.2176921.
[25] J.-L. Huang et al., “High sensitivity and high S/N microphone achieved by PZT film with
central-circle electrode design,” in 2017 IEEE 30th International Conference on Micro
Electro Mechanical Systems (MEMS), Las Vegas, NV, USA, Jan. 2017, pp. 1188–1191, doi:
10.1109/MEMSYS.2017.7863628.
[26] S.-H. Tseng, S.-C. Lo, Y.-C. Chen, Y.-C. Lee, M. Wu, and W. Fang,
“IMPLEMENTATION OF PIEZOELECTRIC MEMS MICROPHONE FOR
SENSITIVITY AND SENSING RANGE ENHANCEMENT,” p. 4.
[27] “Datasheet of OPTIMIC 1190”, http://www.optoacoustics.com/sites/default/files/
documents/optimic-1190-datasheet.pdf, (Accessed, Apr. 11, 2020)
[28] W. Jo, O. C. Akkaya, O. Solgaard, and M. J. F. Digonnet, “Miniature fiber acoustic sensors
using a photonic-crystal membrane,” Optical Fiber Technology, vol. 19, no. 6, pp. 785–792,
Dec. 2013, doi: 10.1016/j.yofte.2013.07.009.
[29] “Datasheet of ENDEVCO model 8507C-2”, http://www.ic72.com/pdf_file/8/182232.pdf,
(Accessed, Apr. 11, 2020)
[30] Y. Fuji et al., “An ultra-sensitive spintronic strain-gauge sensor with gauge factor of 5000
and demonstration of a Spin-MEMS Microphone,” in 2017 19th International Conference
on Solid-State Sensors, Actuators and Microsystems (TRANSDUCERS), Kaohsiung, Jun.
2017, pp. 63–66, doi: 10.1109/TRANSDUCERS.2017.7993988.
[31] “Datasheet of Bruel & Kjaer 4160”, https://www.bksv.com/-/media/literature/Product-
Data/bp0459.ashx, (Accessed, Apr. 11, 2020)
[32] “Datasheet of GRAS 40AZ”, https://www.gras.dk/products/measurement-microphone-
cartridge/prepolarized-cartridges-0-volt/product/ss_export/pdf2?product_id=152,
(Accessed, Apr. 11, 2020)
[33] S. Woo et al., “Realization of a High Sensitivity Microphone for a Hearing Aid Using a
Graphene–PMMA Laminated Diaphragm,” ACS Appl. Mater. Interfaces, vol. 9, no. 2, pp.
1237–1246, Jan. 2017, doi: 10.1021/acsami.6b12184.
[34] “How The Ear Works,” True Sound Hearing Aid Center, Aug. 08, 2018.
http://truesoundhac.com/hearing-loss/how-the-ear-works/ (accessed Apr. 12, 2020).
[35] J. O. Pickles, An introduction to the physiology of hearing, 4. ed. London: Emerald, 2012,
p. 20.
[36] Richard M. Stern and Nelson Morgan, “Features Based on Auditory Physiology and
Perception”, Chapter 8 in Techniques for Noise Robustness in Automatic Speech
Recognition, T. Virtanen, B. Raj, and R. Singh, eds., Wiley Press, 2012, pp. 193-227
[37] R. D. White, R. Littrell, and K. Grosh, “Copying the Cochlea: Micromachined Biomimetic
Acoustic Sensors,” p. 8, 2007.
[38] H. Shintaku, T. Nakagawa, D. Kitagawa, H. Tanujaya, S. Kawano, and J. Ito, “Development
of piezoelectric acoustic sensor with frequency selectivity for artificial cochlea,” Sensors
and Actuators A: Physical, vol. 158, no. 2, pp. 183–192, Mar. 2010, doi:
10.1016/j.sna.2009.12.021.
[39] T. Inaoka et al., “Piezoelectric materials mimic the function of the cochlear sensory
epithelium,” Proceedings of the National Academy of Sciences, vol. 108, no. 45, pp. 18390–
18395, Nov. 2011, doi: 10.1073/pnas.1110036108.
[40] H. S. Lee et al., “Flexible Inorganic Piezoelectric Acoustic Nanosensors for Biomimetic
Artificial Hair Cells,” Adv. Funct. Mater., vol. 24, no. 44, pp. 6914–6921, Nov. 2014, doi:
10.1002/adfm.201402270.
[41] Y. Jung, J.-H. Kwak, Y. Lee, W. Kim, and S. Hur, “Development of a Multi-Channel
Piezoelectric Acoustic Sensor Based on an Artificial Basilar Membrane,” Sensors, vol. 14,
no. 1, pp. 117–128, Dec. 2013, doi: 10.3390/s140100117.
[42] H. Shintaku, T. Kobayashi, K. Zusho, H. Kotera, and S. Kawano, “Wide-range frequency
selectivity in an acoustic sensor fabricated using a microbeam array with non-uniform
thickness,” J. Micromech. Microeng., vol. 23, no. 11, p. 115014, Nov. 2013, doi:
10.1088/0960-1317/23/11/115014.
[43] J. Jang, J. Lee, J. H. Jang, and H. Choi, “A Triboelectric-Based Artificial Basilar Membrane
to Mimic Cochlear Tonotopy,” Adv. Healthcare Mater., vol. 5, no. 19, pp. 2481–2487, Oct.
2016, doi: 10.1002/adhm.201600232.
[44] J. Jang, “MEMS piezoelectric artificial basilar membrane with passive frequency selectivity
for short pulse width signal modulation,” p. 5, 2013.
[45] L. Baumgartel, A. Vafanejad, S.-J. Chen, and E. S. Kim, “Resonance-Enhanced
Piezoelectric Microphone Array for Broadband or Prefiltered Acoustic Sensing,” J.
Microelectromech. Syst., vol. 22, no. 1, pp. 107–114, Feb. 2013, doi:
10.1109/JMEMS.2012.2216505.
[46] A. A. Shkel and E. S. Kim, “Continuous Health Monitoring With Resonant-Microphone-
Array-Based Wearable Stethoscope,” IEEE Sensors J., vol. 19, no. 12, pp. 4629–4638, Jun.
2019, doi: 10.1109/JSEN.2019.2900713.
[47] C. Zhao, K. E. Knisely, D. J. Colesa, B. E. Pfingst, Y. Raphael, and K. Grosh, “Voltage
readout from a piezoelectric intracochlear acoustic transducer implanted in a living guinea
pig,” Sci Rep, vol. 9, no. 1, p. 3711, Dec. 2019, doi: 10.1038/s41598-019-39303-1.
[48] S. Hur, J.-H. Kwak, Y. Jung, and Y. H. Lee, “Biomimetic acoustic sensor based on
piezoelectric cantilever array,” IEICE Electron. Express, vol. 9, no. 11, pp. 945–950, 2012,
doi: 10.1587/elex.9.945.
[49] J. Jang et al., “A microelectromechanical system artificial basilar membrane based on a
piezoelectric cantilever array and its characterization using an animal model,” Sci Rep, vol.
5, no. 1, p. 12447, Dec. 2015, doi: 10.1038/srep12447.
[50] T. Xu, M. Bachman, F.-G. Zeng, and G.-P. Li, “Polymeric micro-cantilever array for
auditory front-end processing,” Sensors and Actuators A: Physical, vol. 114, no. 2–3, pp.
176–182, Sep. 2004, doi: 10.1016/j.sna.2003.11.035.
[51] K. Tanaka, M. Abe, and S. Ando, “A novel mechanical cochlea ‘Fishbone’ with dual
sensor/actuator characteristics,” IEEE/ASME Trans. Mechatron., vol. 3, no. 2, pp. 98–105,
Jun. 1998, doi: 10.1109/3516.686677.
[52] “Cochlear Implants,” NIDCD, Aug. 18, 2015. https://www.nidcd.nih.gov/health/cochlear-
implants (accessed Apr. 12, 2020).
[53] A. A. Shkel, L. Baumgartel, and E. S. Kim, “A resonant piezoelectric microphone array for
detection of acoustic signatures in noisy environments,” in 2015 28th IEEE International
Conference on Micro Electro Mechanical Systems (MEMS), Estoril, Portugal, Jan. 2015,
pp. 917–920, doi: 10.1109/MEMSYS.2015.7051109.
[54] A. A. Shkel and E. S. Kim, “Wearable low-power wireless lung sound detection enhanced
by resonant transducer array for pre-filtered signal acquisition,” in 2017 19th International
Conference on Solid-State Sensors, Actuators and Microsystems (TRANSDUCERS),
Kaohsiung, Jun. 2017, pp. 842–845, doi: 10.1109/TRANSDUCERS.2017.7994180.
[55] V. Gross, A. Dittmar, T. Penzel, F. Schüttler, and P. von Wichert, “The Relationship
between Normal Lung Sounds, Age, and Gender,” Am J Respir Crit Care Med, vol. 162,
no. 3, pp. 905–909, Sep. 2000, doi: 10.1164/ajrccm.162.3.9905104.
[56] “G series.” http://www.itu.int/net/itu-t/sigdb/speaudio/Gseries.htm (accessed Apr. 12,
2020).
[57] “Speech intelligibility - Facts about human voice frequency range,” DPA.
https://www.dpamicrophones.com/mic-university/facts-about-speech-intelligibility
(accessed Apr. 12, 2020).
[58] L. R. Rabiner and R. W. Schafer, Theory and applications of digital speech processing, 1st
ed. Upper Saddle River: Pearson, 2011. pp. 143 - 145.
[59] L. R. Rabiner and R. W. Schafer, Theory and applications of digital speech processing, 1st
ed. Upper Saddle River: Pearson, 2011. p. 464.
[60] T. Veijola, "Acoustic impedance elements modeling oscillating gas flow in micro
channels." In Proceedings of MSM, pp. 96-99. 2001.
[61] Matthew D. Williams, "Development of a MEMS piezoelectric microphone for
aeroacoustic applications." PhD diss., University of Florida, pp. 71-82, 2011.
[62] N. Meslier, G. Charbonneau, and J.-L. Racineux, “Wheezes,” European Respiratory
Journal, vol. 8, no. 11, pp. 1942–1948, Nov. 1995, doi: 10.1183/09031936.95.08111942.
[63] “Asthma mortality data | CDC,” Oct. 04, 2019. https://www.cdc.gov/asthma/data-
visualizations/mortality-data.htm (accessed Apr. 13, 2020).
[64] M. A. Fraga, H. Furlan, R. S. Pessoa, and M. Massi, “Wide bandgap semiconductor thin
films for piezoelectric and piezoresistive MEMS sensors applied at high temperatures: an
overview,” Microsyst Technol, vol. 20, no. 1, pp. 9–21, Jan. 2014, doi: 10.1007/s00542-
013-2029-z.
[65] Z. Li, R. Yang, M. Yu, F. Bai, C. Li, and Z. L. Wang, “Cellular Level Biocompatibility
and Biosafety of ZnO Nanowires,” J. Phys. Chem. C, vol. 112, no. 51, pp. 20114–20117,
Dec. 2008, doi: 10.1021/jp808878p.
[66] C. Lea, M. D. Flynn, R. Vidal, A. Reiter, and G. D. Hager, “Temporal Convolutional
Networks for Action Segmentation and Detection,” in 2017 IEEE Conference on Computer
Vision and Pattern Recognition (CVPR), Honolulu, HI, Jul. 2017, pp. 1003–1012, doi:
10.1109/CVPR.2017.113.
[67] Marshall Buck, "Plane Wave Tubes-Uses and Limitations." In Audio Engineering Society
Convention 117. Audio Engineering Society, 2004.
[68] “ICS-40730 Datasheet”, https://invensense.tdk.com/download-pdf/ics-40730-datasheet/,
(Accessed, May 13th, 2022).
[69] Phauk Sokkhey, and Takeo Okazaki. "Hybrid machine learning algorithms for predicting
academic performance." Int. J. Adv. Comput. Sci. Appl 11, no. 1 (2020): 32-41.
[70] H. J. Hoffman, R. A. Dobie, K. G. Losonczy, C. L. Themann, and G. A. Flamme, “Declining
Prevalence of Hearing Loss in US Adults Aged 20 to 69 Years,” JAMA Otolaryngol Head
Neck Surg, vol. 143, no. 3, p. 274, Mar. 2017.
[71] E. M. Picou, “MarkeTrak 10 (MT10) Survey Results Demonstrate High Satisfaction with
and Benefits from Hearing Aids,” Semin Hear, vol. 41, no. 01, pp. 021–036, Oct. 2020.
[72] K. Chung, “Challenges and Recent Developments in Hearing Aids: Part I. Speech
Understanding in Noise, Microphone Technologies and Noise Reduction Algorithms,”
Trends in Amplification, vol. 8, no. 3, pp. 83–124, Jan. 2004.
[73] C.-Y. Ho, K.-K. Shyu, C.-Y. Chang, and S. M. Kuo, “Integrated active noise control for
open-fit hearing aids with customized filter,” Applied Acoustics, vol. 137, pp. 1–8, Aug.
2018.
[74] R. Serizel, M. Moonen, J. Wouters, and S. H. Jensen, “A zone of quiet based approach to
integrated active noise control and noise reduction in hearing aids,” in 2009 IEEE
Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, NY,
2009, pp. 229–232.
[75] D. Dalga and S. Doclo, “Combined feedforward-feedback noise reduction schemes for
open-fitting hearing aids,” in 2011 IEEE Workshop on Applications of Signal Processing
to Audio and Acoustics (WASPAA), New Paltz, NY, USA, 2011, pp. 185–188.
[76] IBM Watson speech to text service. https://speech-to-text-demo.ng.bluemix.net. [online].
Accessed March 26th 2020.
[77] C. Munteanu, G. Penn, R. Baecker, E. Toms, and D. James, “Measuring the Acceptable
Word Error Rate of Machine-Generated Webcast Transcripts,” p. 5, 2006.
[78] “Voice and Speech Recognition Market Size - Industry Report, 2019-2026”,
https://www.polarismarketresearch.com/industry-analysis/voice-recognition-market,
(accessed May 1, 2022).
[79] Hai Liu, Song Liu, Anton A. Shkel, Yongkui Tang, and Eun Sok Kim. "MEMS resonant
microphone array for lung sound classification." In 2019 IEEE International Electron
Devices Meeting (IEDM), pp. 34-4. IEEE, 2019.
[80] Hai Liu, Song Liu, Anton A. Shkel, Yongkui Tang, and Eun Sok Kim. "Multi-band MEMS
resonant microphone array for continuous lung-sound monitoring and classification."
In 2020 IEEE 33rd International Conference on Micro Electro Mechanical Systems
(MEMS), pp. 857-860. IEEE, 2020.
[81] Hai Liu, Song Liu, Anton A. Shkel, and Eun Sok Kim. "Active noise cancellation with
MEMS resonant microphone array." Journal of Microelectromechanical Systems 29, no. 5
(2020): 839-845.
[82] Texas Instruments. "Noise analysis in operational amplifier circuits." Application Report,
SLVA043B (2007).
[83] Singiresu S. Rao, "Mechanical Vibrations, in SI Units, Global Edition." ed: Pearson,
London (2017)
[84] Analog Devices, "Op amp noise relationships: 1/f noise, rms noise, and equivalent noise
bandwidth." MT-048 Tutorial (2009).
[85] Jerad Lewis. "Understanding microphone sensitivity." Analog Dialogue 46, no. 2 (2012):
14-16.
Abstract
This dissertation presents the research and development of piezoelectric MEMS resonant microphone arrays (RMAs) with multiple resonances for sound sensing in lung sound detection and classification, active noise cancellation, speech recognition, etc.
A complete lumped element model of a MEMS piezoelectric resonant microphone based on a warped Si cantilever is presented. The effect of the cantilever warpage on the acoustic impedance is studied theoretically and is shown to increase the acoustic pressure leak through the gap between the cantilever and the substrate wall, which decreases the sensitivity once the sound frequency falls below a critical value. In addition, an analytical vibration model of the width-stepped cantilever with multiple layers (piezoelectric film, electrodes, and insulating layer) is built and used to derive the equivalent electrical impedance of the cantilever microphone for the lumped element model. A noise model of the resonant microphone, coupled with an op-amp noise model, is also developed. The models of the microphone’s sensitivity and noise versus frequency are validated with a fabricated RMA whose resonant microphones are based on width-stepped piezoelectric cantilevers.
For wheezing detection and classification of lung sounds, four RMAs with four types of resonant microphones (rectangular cantilever, spiral microphone, rectangular plate with serpentine support beams, and width-stepped cantilever) with low resonance frequencies of 200 – 600 Hz are designed, fabricated, and characterized, followed by lung sound recording and signal processing. Very high unamplified sensitivities of 265 ~ 86.0 mV/Pa and extremely low noise floors of -4.0 ~ 7.4 dBA at the resonance frequencies are obtained. Consequently, the acoustic features of wheezing are distinguished better in both the time domain and the frequency domain than with a flat band reference microphone. With this advantage, higher accuracy in identifying lung sounds with and without wheezing is achieved.
For active noise cancellation (ANC), an RMA with resonance frequencies of 5 – 9 kHz is developed and shown to be effective in actively canceling the noise over that frequency range (which is above the range where most of the speech information resides) and in improving speech recognition. Compared with a similar ANC based on a flat band microphone, ANC with the RMA achieves more noise reduction when the noise level is low, owing to the RMA’s high sensitivity and low noise floor. For this application, a much smaller RMA with thinner width-stepped cantilever resonant microphones, which is better suited to wearable applications, is developed.
For speech sensing and recognition, three RMAs (one for wide band speech spectrum, one for narrow band speech spectrum, and one for small size) with width-stepped cantilevers are designed, fabricated, and characterized. The signal-to-noise ratio (SNR) of the RMA for the narrow band speech spectrum is higher than 73 dBA for 1 Pa sound at all the resonance frequencies.
Asset Metadata
Creator: Liu, Hai (author)
Core Title: MEMS piezoelectric resonant microphone arrays and their applications
School: Viterbi School of Engineering
Degree: Doctor of Philosophy
Degree Program: Electrical Engineering
Degree Conferral Date: 2022-08
Publication Date: 07/27/2023
Defense Date: 05/25/2022
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: active noise cancellation, lung sound, MEMS, Microphone, OAI-PMH Harvest, piezoelectric, resonant, speech recognition
Format: application/pdf (imt)
Language: English
Contributor: Electronically uploaded by the author (provenance)
Advisor: Kim, Eun Sok (committee chair), Wu, Wei (committee member), Zhou, Qifa (committee member)
Creator Email: hai.liu@outlook.com, hailiu@usc.edu
Permanent Link (DOI): https://doi.org/10.25549/usctheses-oUC111375366
Unique identifier: UC111375366
Legacy Identifier: etd-LiuHai-11029
Document Type: Dissertation
Rights: Liu, Hai
Type: texts
Source: 20220728-usctheses-batch-962 (batch), University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright. The original signature page accompanying the original submission of the work to the USC Libraries is retained by the USC Libraries and a copy of it may be obtained by authorized requesters contacting the repository e-mail address given.
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email: cisadmin@lib.usc.edu