Functional real-time MRI of the upper airway
(USC Thesis Other)
FUNCTIONAL REAL-TIME MRI OF THE UPPER AIRWAY

by Weiyi Chen

A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the Requirements for the Degree
DOCTOR OF PHILOSOPHY (ELECTRICAL ENGINEERING)

December 2019

Copyright 2019 Weiyi Chen

Abstract

Magnetic Resonance Imaging (MRI) is the most promising modality for evaluating upper airway dynamics, because it is non-invasive and involves no ionizing radiation. For the past decade, real-time MRI (RT-MRI) has been used extensively, with significant improvement in spatiotemporal resolution, to time-resolve the dynamics of natural speech production and sleep disorders. Existing techniques track tissue surfaces, such as the vocal tract and airway. However, they lack the ability to measure upper airway functions, such as internal muscle movement and muscle tone variation across different airway sites. This dissertation introduces new techniques and novel experiment designs compatible with RT-MRI techniques in order to reveal internal muscle motion and physiological traits of the upper airway.

I develop intermittent tagging during RT-MRI as a means to visualize internal tongue motion during speech production. This approach eliminates the need for re-binning data using multiple repetitions and is suitable for investigations of natural speech production. I demonstrate a framework to select imaging parameters in consideration of image quality and tag persistence and achieve an imaging window of approximately 650-800 ms at 1.5 T. I demonstrate the ability to capture tongue motion patterns and their relative timing as exemplified by internal tongue deformation during American English diphthong vowels and consonants.

Next, I demonstrate intermittent tagging with REALTAG to extend tag persistence. This approach provides 2× improvement in contrast-to-noise ratio (CNR), > 1.9× longer tag persistence, and is suitable for investigations of natural speech production.
I develop an improved method for phase sensitive reconstruction that provides superior image quality compared to prior approaches in the presence of time-varying background phase. I demonstrate an imaging window of 1250 ms at 1.5 T, which is adequate to capture internal tongue deformations during American English vowel-to-vowel transitions in separate words. This provides a powerful new tool for imaging muscle movement during natural speech production and other applications, where CINE imaging is not applicable or suboptimal due to the natural variability of human action.

Finally, I present a novel RT-MRI based experiment that measures upper airway biomarkers relevant to the study of sleep-related breathing disorders. These are upper airway loop gain (UALG) and the fluctuation of airway area (FAA). I combine simultaneous multi-slice RT-MRI with continuous positive airway pressure (CPAP) and carefully designed pressure changes. I demonstrate that this new test can localize specific airway sites that are prone to collapse, while the conventional apnea-hypopnea index provides only an estimate of the overall severity of obstructive sleep apnea. This new experiment can directly measure location-specific active (UALG) and passive (FAA) physiological traits, and visually resolve airway dynamics.

In this dissertation, I introduce novel techniques and experiment designs while leveraging maturing fast imaging methods to provide the needed spatiotemporal resolution for upper airway RT-MRI. With the proposed methods, we can set our sights further than the anatomical structures and onto the even more interesting yet intrinsically complex functions of the upper airway.

To mom and dad, who have been building my grit since day zero by naming me after it.

Acknowledgements

Throughout the years I have read many dissertations. Acknowledgements are among my favorite parts.
They share not only the joy of accomplishment, but also the gratitude for collective contributions, generous support and unfailing love. Today I am writing my own acknowledgements in the most heartfelt way possible.

I could not wish for a better advisor than Prof. Krishna Nayak. His MRI classes are my favorites and were the reason that I decided to start my PhD in this field. Since then Krishna has given me the means and freedom to chase my research direction. He trained me with the greatest patience to be an engineer and a scientist. Krishna shaped my life philosophy in countless ways with wisdom, rigor, ethics and friendship. I look forward to reflecting back and perceiving even more. I look forward to continuing our friendship as we go beyond this point.

It is my great fortune to work with Prof. Dani Byrd. Dani is the best role model one can ask for as a true scientist. She patiently provided answers to every question I had. She passionately and rigorously edited my papers, and each of her comments is worth gold. Through that I learned the most precious of what I know about writing.

It has been a great pleasure to work with Prof. Michael Khoo. His kind guidance paved a clear way for me to pilot through the complicated experiment design for sleep studies. He provided me opportunities to work closely with sleep clinicians. He offered the best access to world-leading sleep apnea specialists and helped promote my work. For those I sincerely thank him.

I am extremely lucky to have not only one world-class MRI expert, but two, on my committee. Prof. Justin Haldar has been a great source for learning how to do signal processing properly. Justin is a phenomenal teacher; his classes on MRI and inverse problems are among my favorite classes ever taken. From him I learned that MRI can be perfectly explained via classical physics. Since then I stopped pretending that I know more about quantum physics as a physics undergrad. And yes, I know MRI history better now!
I am deeply grateful to Prof. Shrikanth Narayanan. He took me into the SPAN family and supported my research on speech production. He showed me how to be a visionary engineer. He showed how to translate techniques into scientific research and real-world applications. From him I learned to try my best to be energetic and to work with contagious enthusiasm.

MREL has been a beautiful lab to work and live in, mostly because of the past and present labmates. I cannot thank my lab pals enough, but I will try. Brian and Yinghua have been great tutors and friends. I still remember Yinghua coming to my apartment and working with me through midnight to deliver my first ever presentation on MRI for Krishna's class. Hopefully my first ever MRI subject experience for his experiment was a proper return. Brian carried me during his time here and has been a role model and wonderful friend ever since. From him I got all I know of being strategic and that non-Cartesian parallel imaging is the deal. I thank him (not) for dragging me up to meet at RTH cafe at 7am for the weekly journal club even though I stood him up many times. I hope that renaming MREL's Early Bird Journal Club after him and carrying on the torch is enough to make up for it. Sajan and Yoon shaped upper airway MRI in this lab and this work is full of their fingerprints. I am sure that they, along with Krishna, remain the most cited persons in this dissertation. Johannes was the reflective thinking guru in the lab, and I did enjoy every warm hug during otherwise serious professional conferences. I thank Hung and Eamon for being role models as senior students and for being such good friends in the lab. Terrence was a great office mate. We enjoyed many conversations, and I learned a lot from his deep understanding of physiology and MR physics. Xin and I shared much empathy for each other's ups and downs during our entire PhD training. I have always enjoyed our conversations, whether they were about research, life, gossip, or just procrastination breaks.
I thank Yi for being the honey badger of the lab and for his inspiring focus on research. I am totally amazed by him for always being so happy and at the same time complaining about everything. My theory is that his focus helps him shape the best human multi-band excitation profile so that he can do both and suppress other insignificance. Yongwan is the best teammate one can ask for. In particular I thank him for making spirals a lot more friendly. I enjoyed our discussions on research and chitchat on many other subjects. He is a younger MRELer but I have always respected him as a hyeong (older brother) for his inspirational hard work. Ahsan is awesome. I thank him for being an awesome office mate and conversational companion. Vanessa was always there for me when I needed help with writing. She was also there whenever I needed a native speaker for my experiment, even when it was 4:45am. All these kept me training hard so that I could provide A and V the best heart beat and breath hold. I thank Yannick for his sharpness in every detailed concept and for being crystal clear during every presentation, from which I learned a lot. I also thank him for his swimming skills, which I (secretly but not creepily) observed and learned from in the pool. To all of the younger MRELers, I thank each of you for the friendly research environment. You might have no idea that your collective contributions have already kicked in. I hope that you will be proud of being an MRELer and that one day MREL will be proud of you.

I am very grateful to my clinical collaborators. Dr. Sally Ward, Dr. Eric Kezirian and Dr. Emily Gillett provided me the opportunity to work closely with them. I thank all of the collaborators from the SRBD and SPAN groups. Leo and Winston helped me with all of the sleep experiments with their deep understanding of physiology. Asterios, Tanner, Colin, Miran, Sarah and Mai bore with me during Sunday afternoon scans and we shared many fruitful discussions.
I can tell you that I did enjoy every Sunday scan because you made it fun.

My PhD journey was not always downhill, for good reasons, but my friends in Los Angeles and the Bay Area made sure they were my endless source of moral support, albeit they always started the conversation by asking when I was going to graduate. For that I thank them all, and hopefully I made their lives more enjoyable as well through our friendship. A tip of my hat to all of the cyclists: I regret I did not figure out the coexistence of a higher FTP and my productivity, but I did share my purest happiness with you. Hooray to Jiangyang and Ray as my race rivals and best friends, and yes, I finally made it!

I couldn't have done any of this without the love and support of my family. My mom and dad have always been my strongest support, with unconditional love since day zero. They named me after "gritty", and since then have been raising me to cultivate that quality. To this day I still think it is the gift of my life. My cousin Jess, auntie and uncle have been my closest family though they are 3000 miles away to the east. They picked me up at LAX with the warmest welcome when I first came to this country. Jess has really been an older sister to me; I thank her for sharing my ups and downs remotely via those cross-continental calls.

My dearest, sharp, diligent and beautiful wife Yuchi is the best thing that ever happened to me. Earning a PhD has become easier than it should have been, as she is such a role model for me. She has granted me love and understanding through both intellectual and mental support. I swear I did not intend to make her back hurt in the scanner. I owe her a great debt of gratitude for her generosity and unbelievable patience during this uphill fight. I literally owe her a great debt as she flew here every month from 400 miles north for the past 5 years. I guess now it is time for us to finally give up that Southwest A-list status.
Thank you for being my best friend and heroine. Let this dissertation also serve as a memorial of our unwavering love.

Contents

Abstract
Dedication
Acknowledgements
List of Tables
List of Figures
Abbreviations
1 Introduction
  1.1 Seeing the upper airway
  1.2 MRI of the upper airway: the pursuit of fast imaging
  1.3 Fast is not enough
    1.3.1 Muscle mechanism in the internal tongue
    1.3.2 Endotypes in obstructive sleep apnea
  1.4 Outline of contributions
2 Magnetic Resonance Imaging
  2.1 MRI fundamentals
    2.1.1 Nuclear magnetic resonance
    2.1.2 From polarization to Fourier reconstruction
    2.1.3 Advanced acquisition
    2.1.4 Advanced Reconstruction
  2.2 Real-time upper airway MRI
    2.2.1 Seeing speech: imaging requirements
    2.2.2 Seeing sleep: imaging requirements
    2.2.3 Acquisition: spoiled gradient echo imaging
  2.3 Tagged MRI
    2.3.1 Tagging pulses
    2.3.2 Tagged CINE MRI
    2.3.3 Tagged real-time MRI
3 Visualizing internal tongue motion: intermittently tagged real-time MRI
  3.1 Methods
    3.1.1 Tagged real-time MRI implementation
    3.1.2 Selection of acquisition parameters
    3.1.3 Triggering mechanism
    3.1.4 Speech experiments
  3.2 Results
    3.2.1 Acquisition parameters
    3.2.2 Triggering mechanism
    3.2.3 Visualization of tongue deformation
  3.3 Discussion
    3.3.1 Potential applications
    3.3.2 Triggering
    3.3.3 Tagging consideration
    3.3.4 Imaging consideration
  3.4 Conclusions
4 Capturing longer motion patterns in speech production: real-time MRI with REALTAG
  4.1 Methods
    4.1.1 Imaging methods
    4.1.2 Experiments
    4.1.3 Tag persistence
  4.2 Results
    4.2.1 Off-resonance and the low pass filter width
    4.2.2 Improved tag persistence
    4.2.3 Tongue deformation with improved contrast
  4.3 Discussion
    4.3.1 REALTAG for the tongue
    4.3.2 Imaging consideration
    4.3.3 Phase estimation
    4.3.4 Tag line visualization
  4.4 Conclusions
5 Seeing functional endotypes of OSA: real-time multi-slice MRI during continuous positive airway pressure
  5.1 Methods
    5.1.1 Experiments
    5.1.2 MRI protocol
    5.1.3 Data analysis
  5.2 Results
    5.2.1 Visualizing the physiological fluctuation
    5.2.2 Statistical findings
  5.3 Discussion
    5.3.1 OSA patients: less stable neuromuscular control systems and higher collapsibility
    5.3.2 Experimental considerations
    5.3.3 Toward seeing the endotypes
  5.4 Conclusions
6 Concluding remarks
Bibliography

List of Tables

1.1 Upper airway imaging modalities
3.1 American English diphthong stimuli
3.2 American English consonant stimuli
4.1 Vowel-to-vowel transition stimuli
5.1 Upper airway loop gain and fluctuation of airway area comparison between OSA patients and the control group.
5.2 UALG and FAA results for different slices of one representative OSA patient.
5.3 FAA is significantly larger in OSA subjects and can be reduced under therapeutic airway pressure.
5.4 Airway area mean value during CPAP indicates a significantly less stiff airway in the sampled OSA patients.
5.5 Intra-subject reproducibility of RT-MRI during CPAP.

List of Figures

1.1 Upper airway imaging modalities.
1.2 Example MRI of the upper airway.
1.3 Endotypes in obstructive sleep apnea.
2.1 Spin returns to equilibrium through relaxation.
2.2 Excitation in the laboratory and rotating frame.
2.3 Slice selective excitation.
2.4 2DFT acquisition and k-space.
2.5 Point spread function and image resolution.
2.6 Examples of non-Cartesian sampling trajectories and their PSFs.
2.7 Image resolution is different from pixel size.
2.8 The ideal k-t space.
2.9 Motion artifacts by the Central Section Theorem of k-t space.
2.10 View sharing increases frame rate.
2.11 Seeing the upper airway: imaging requirements
2.12 RF-spoiled gradient echo imaging.
2.13 Examples of tagged MRI.
2.14 1-1 SPAMM pulse sequence with total flip angle of 90°
2.15 Higher order SPAMM sequences generate sharper tag lines.
2.16 1D tagging and 2D tagging with composite SPAMM pulses.
2.17 CINE MRI
3.1 Speech RT-MRI with intermittent tagging
3.2 Simulation of tag persistence and steady-state signal as a function of imaging flip angle
3.3 Tag persistence in human tongue at 1.5T.
3.4 Example images of tag fading
3.5 American English vowel charts
3.6 Tagged RT-MRI reveals internal tongue deformations and their relative timing during American English diphthong articulation.
3.7 Tagged RT-MRI reveals deformation relative to the relatively neutral posture of the schwa /ə/ ("a") of the carrier sentence.
3.8 Tagged RT-MRI shows different deformation patterns (relative to preceding schwa postures) during the articulation of consonants /ɹ/, /ʃ/, and /tʃ/
4.1 Phase sensitive reconstruction flowchart and example images from intermediate steps.
4.2 Phase of an un-tagged image after low pass filtering with different widths.
4.3 Phase compensation by the low pass filter with different width W.
4.4 Simulated and measured tag CNR decay for TFA = 90° in the original implementation and with TFA = 180° with REALTAG in the proposed method.
4.5 Time threshold improvement by using REALTAG in 5 in-vivo scans.
4.6 Representative images during the speech stimuli production.
5.1 CPAP pressure level manipulation.
5.2 Results from a representative OSA patient illustrate the measurement of airway area change when there are interruptions due to airway collapse.
5.3 RT-MRI during rapid CPAP change.

Abbreviations

AHI  apnea/hypopnea index
bSSFP  balanced steady state free precession
CPAP  continuous positive airway pressure
CSPAMM  Complementary SPAtial Modulation of Magnetization
CT  computed tomography
DENSE  Displacement ENcoding with Stimulated Echoes
DISE  drug-induced sleep endoscopy
EMA  electromagnetic articulography
EPI  Echo Planar Imaging
FAA  fluctuation of airway area
GRE  gradient echo
HARP  HARmonic Phase
ICC  intra-class correlation
OCT  optical coherence tomography
OSA  obstructive sleep apnea
PSF  point spread function
PSG  polysomnography
RT-MRI  real-time MRI
SENC  Strain ENCoding
SMS  simultaneous multi-slice
SPAMM  SPAtial Modulation of Magnetization
SPGR  RF-spoiled gradient echo
UALG  upper airway loop gain

Chapter 1 Introduction

The upper airway is a remarkably complex system that consists of both movable and immovable structures. It is a critical component in vocalization, respiration and digestion and is involved in various human functions. Speech production involves complex spatiotemporal coordination of multiple vocal organs in the oral and pharyngeal airways. Respiration during sleep involves periodic motion of the tongue, velum, and pharyngeal wall to maintain patency of the upper airway during breathing. Eating and swallowing require the tongue and velum to pass substances from the mouth to the pharynx and esophagus, while the epiglottis is kept shut.
These functions are dynamic in nature, with timely coordination of multiple organs.

Visualizing and measuring the spatiotemporal coordination of these organs is the subject of scientific investigation in order to understand how healthy function is controlled. For instance, techniques and tools have been developed to investigate the state of the vocal tract over time during speech production, unraveling its morphology and function [1]. This allows for investigation of typical and atypical speech production [2].

Investigation of upper airway dynamics reveals nuances in upper airway disease. For example, obstructive sleep apnea (OSA) is characterized by repetitive cessation of airflow due to physical narrowing or collapse of the airway as a result of anatomical and physiological abnormalities in pharyngeal structure [3]. This collapse is typically attributed to excessive soft tissue elements, such as the tongue, velum, uvula and epiglottis, and/or increased collapsibility of the pharyngeal airway [4]. OSA is a very common sleep disorder in the United States [5], with a prevalence of 4-9% in adults and 2% in children [6]. OSA places a substantial financial burden on society, with the cost of untreated OSA estimated to be US$67-165 billion [7]. Untreated OSA can contribute to the development of hypertension [8], coronary artery disease [9], congestive heart failure [10], arrhythmia [11], stroke [12], glucose intolerance and diabetes [13, 14]. It is anticipated that improved diagnosis and treatment of OSA in patients will contribute to the prevention of these diseases and slow down their progression [8].

1.1 Seeing the upper airway

Visualizing organ dynamics is an important step toward understanding the spatiotemporal properties of the upper airway.
Figure 1.1 depicts the upper airway using several modalities: electromagnetic articulography (EMA) [15], drug-induced sleep endoscopy (DISE) [16], optical coherence tomography (OCT) [17], x-ray [18], computed tomography (CT) [19, 20], ultrasound [21, 22], and magnetic resonance imaging (MRI) [23, 1, 24, 25, 26, 27, 28, 29, 30]. Table 1.1 lists and compares these modalities.

EMA is used in linguistic studies to track the movement of sensor coils adhered to the tongue and lips [15]. DISE is used in patients with OSA during drug-induced sleep to visualize the airway lumen [16]. OCT provides three-dimensional images of the upper airway with high spatial resolution [39]. However, all three modalities are invasive and provide information only on soft tissue surfaces. X-ray [18] is used to image the airway; however, it is limited by radiation exposure and poor visualization of soft tissue. Tomography, such as CT [19, 20], ultrasound [21, 22] and MRI, generates images of tissue slices non-invasively. Among these modalities, MRI uniquely provides static images with excellent soft tissue contrast and dynamic images with high frame rate, without the use of ionizing radiation.

Figure 1.1: Upper airway imaging modalities. Figures are adapted from: EMA, electromagnetic articulography [31]; DISE, drug-induced sleep endoscopy [32]; OCT, optical coherence tomography [33]; ultrasound [34]; x-ray [35]; CT, computed tomography [36]; MRI, magnetic resonance imaging: sagittal [37] and axial [28]. MRI is arguably the most promising imaging modality for evaluating upper airway anatomy and function, as it provides adequate spatiotemporal resolution, versatile contrast, and involves no ionizing radiation or invasive procedure.
Table 1.1: Upper airway imaging modalities

                      EMA   DISE  OCT   Ultrasound  X-ray  CT    MRI
Speech example        [15]              [22]        [18]   [20]  [38]
Sleep example               [16]  [17]  [21]               [19]  [29]
Spatial resolution    ✗     ✓     ✓     ✗           ✓      ✓     −
Temporal resolution   ✓     ✓     ✓     ✓           ✗      −     −
Tissue contrast       ✗     ✓     ✗     ✗           ✗      ✗     ✓
Tomography/surface    ✗     ✗     ✗     −           ✓      ✓     ✓
Invasive              ✗     ✗     ✗     −           ✓      ✓     ✓
Ionizing radiation    ✓     ✓     ✓     ✓           ✗      ✗     ✓
Operator bias         ✗     ✗     ✗     ✗           ✓      ✓     ✓
Cost                  ✓     ✗     −     ✓           ✓      −     ✗

✓: strength; ✗: weakness; −: neutral. Modality acronyms: EMA, electromagnetic articulography; DISE, drug-induced sleep endoscopy; OCT, optical coherence tomography; CT, computed tomography; MRI, magnetic resonance imaging.

1.2 MRI of the upper airway: the pursuit of fast imaging

Figure 1.2: Example MRI of the upper airway. Panels: T2-weighted static image; 3D static image during sustained sound; 2D real-time dynamic images (labeled structures: tongue, lips, pharyngeal wall). Top: Two dimensional (2D) and three dimensional (3D) static MRI provide superb contrast, resolution and/or spatial coverage to reveal the anatomical structures. Bottom: 2D real-time MRI (RT-MRI) has been demonstrated with a focus on tracking the air-tissue interfaces, such as vocal tract surfaces. Bottom figure adapted from Ref [40].

MRI is arguably the most promising imaging modality for evaluating upper airway anatomy and function, because it is non-invasive and involves no ionizing radiation. Figure 1.2 shows example MRI of the human upper airway. Two dimensional (2D) and three dimensional (3D) static MRI provide superb contrast and resolution to reveal anatomical structures [41, 28, 1]. Anatomical features have been shown to potentially contribute to airway collapse [41, 28]; much research has been performed while the subject is either silent or sustaining phonation [1], yielding static images that reveal the larynx and vocal tract shape, shedding light on vocal tract morphology in healthy and patient cohorts. However, due to
the intrinsically dynamic nature of upper airway function, imaging at rest or during sustained posture provides limited information on speech production or sleep disorders.

Over the past decades, tremendous effort has been made in the pursuit of high spatiotemporal resolution in dynamic (time-resolved) imaging. 2D CINE MRI was proposed to measure the airway change during tidal breathing [42]. 2D real-time MRI (RT-MRI) has been demonstrated alongside synchronized recording of physiological signals, such as polysomnography (PSG) used in sleep studies [30] and audio signals used in speech production imaging [23]. Several research groups have reported dynamic 3D MRI of the vocal tract, despite limited spatiotemporal resolution or the requirement of long acquisitions with subject repetitions [43, 44, 45]. Recently, 3D RT-MRI has been demonstrated during natural sleep [29] and natural speech [46]. Furthermore, recent RT-MRI advances include improvements in reducing reconstruction latency [47, 40, 48], mitigating off-resonance artifacts [49, 50], and combinations of the above. Reviews of current state-of-the-art MRI protocols can be found in Lingala et al. [40] for speech production studies and in Kim [37] for general upper airway imaging.

1.3 Fast is not enough

RT-MRI techniques have been used extensively to image upper airway shaping, such as the air-tissue interface at articulators, vocal tract surfaces [40], and the pharyngeal wall [30]. However, fast imaging is not enough to resolve function; additional information must be provided through other techniques. Functional evaluation often requires imaging of functional changes with co-recorded, synchronized audio or physiological signals. For instance, understanding tongue function during articulation requires resolving tongue muscle movements with synchronized audio recording. The sole pursuit of fast imaging must change in order to resolve function.
1.3.1 Muscle mechanism in the internal tongue

The upper respiratory tract consists of a series of connected resonance cavities that are used in speech production. Different sounds are produced through coordinated movements of the velum, jaw, pharyngeal tongue root, tongue body, tongue tip, and lips [1]. Among these articulators, the human tongue is the most powerful enabler of the remarkably complex shaping occurring in speech. The tongue is a muscular hydrostat comprised of numerous intrinsic and extrinsic muscles [51]. The internal deformation of tongue muscles cannot be easily inferred from the contours of the tongue surface. The relationship between muscle activity and tongue shaping is the subject of scientific investigation as an important component in understanding how healthy speech is controlled and how it is disrupted in disease [52, 53]. However, scientists have remained reliant on inverse modeling of surface contours, which is heavily contingent on modeling assumptions [53, 54, 55, 56].

RT-MRI techniques have been used extensively over the past decade to study speech production. Specifically, the dynamics of vocal tract shaping have been studied to visualize articulators at the air-tissue interface and vocal tract surfaces [38]. These techniques all lack the ability to measure internal muscle movements. They cannot image and quantify the deformations of local regions within the human tongue, arguably the most important articulator, during natural speech.

1.3.2 Endotypes in obstructive sleep apnea

OSA is a heterogeneous sleep disorder characterized by structural and physiological risk factors [58, 59]. Airway obstruction is caused by different mechanisms across different subjects, indicating various endotypes of OSA [60]. (An endotype is a subtype of a disease defined by a unique or distinctive functional or pathophysiologic mechanism [61].
It is distinct from a phenotype, which is defined as an observable expression of an individual's characteristics that results from the interaction between the genotype and the environment, without any implication of a mechanism [62].)

[Figure 1.3 diagram: Obstructive Sleep Apnea. Impaired anatomy (narrow/collapsible upper airway): 100% but variable magnitude. Functional endotypes: (A) low respiratory arousal threshold, 37%; (B) unstable ventilatory control, 36%; (C) ineffective upper-airway dilator muscles, 36%.]

Figure 1.3: Endotypes in obstructive sleep apnea, modified from Ref [57]. MRI techniques can image impaired anatomy or quantify collapsibility of the upper airway. However, functional endotypes, shown in the dashed rectangle, mediate OSA severity. Functional endotypes are not mutually exclusive and are collectively present in 70% of all OSA patients. Existing MRI techniques lack the ability to provide insight into the functional endotypes of OSA (A)-(C).

Figure 1.3 shows the non-anatomical endotypes that contribute to OSA pathogenesis besides impaired anatomy: ineffective dilator muscles (36%), unstable ventilatory control (36%), and low respiratory arousal threshold (37%) are collectively present in 70% of OSA patients [57]. Although impaired anatomy is the crucial factor that results in OSA, these functional endotypes can mediate the severity of OSA. For example, recent studies suggest that obese individuals who do not get OSA are "protected" by two main mechanisms: an airway more resistant to collapse, and an augmented reflex response of the airway dilator muscles [63]. Therefore, there is a demand for a systematic endotypic approach to diagnosing OSA before further treatment [57, 59, 61].

Conventional PSG with apnea/hypopnea index (AHI) measurement provides only an estimate of the overall severity of OSA and cannot localize the specific airway sites that are prone to collapse. Existing static MRI can only assess the anatomical risk factors related to OSA [28]. Dynamic MRI developed by Wu et al.
[30] provided a new perspective on the anatomical endotype by quantitatively measuring the collapsibility of the airway. Endotypes (A) and (B) in Figure 1.3 require knowledge of the neuromuscular reflex and passive collapsibility of the upper airway [64, 65]. Endotype (C) can be visualized through internal deformation of the upper airway dilator muscle. However, existing RT-MRI techniques are unable to evaluate the functional endotypes in the dashed rectangle.

1.4 Outline of contributions

In this dissertation, I develop advanced techniques to image the function of the upper airway. I develop an MR tagging method compatible with RT-MRI to visualize muscle deformation for the study of natural human speech production. Although not investigated here, this method is also expected to visualize airway muscle, as in [66], during continuous RT scanning. I also develop a novel experiment combining simultaneous multi-slice RT-MRI and continuous positive airway pressure to measure physiological traits during natural sleep.

Chapter 2. Magnetic Resonance Imaging presents the basic principles of magnetic resonance imaging, including discussions of nuclear magnetic resonance physics, the MRI pipeline, and advanced acquisition and reconstruction techniques. I also discuss advanced topics, such as real-time upper airway MRI and tagged MRI. I provide more insight into the critical trade-offs between resolution, SNR, and acquisition time in upper airway imaging.

Chapter 3. Visualizing internal tongue motion: intermittently tagged real-time MRI introduces a new MR tagging technique to visualize internal tongue deformation. I develop a tagging method compatible with RT-MRI for the study of natural human speech production. I apply tagging as a brief interruption of continuous RT-MRI data acquisition. I explore the selection of imaging parameters for such speech studies to optimize image quality and tag persistence.
I evaluate this method using simulations and in vivo studies of American English diphthong and consonant production. I show that the proposed method can capture tongue motion patterns and their relative timing through internal tongue deformation, and therefore provides a potential tool for studying muscle function in speech production and similar applications.

Chapter 4. Capturing longer motion patterns in speech production: real-time MRI with REALTAG focuses on tag persistence, as it is a principal limitation of the method described in Chapter 3. I apply an MR inversion technique, called REALTAG, to intermittently tagged RT-MRI and demonstrate its successful application to the study of human speech production. I develop an improved phase correction technique for the setting of time-varying background phase. Specifically, I use a spatial low-pass filter to extract and compensate for smooth image phase and isolate tag lines. I validate this method using simulations and in vivo studies of American English vowel-to-vowel transitions. In experiments focused on the human tongue, I show that the proposed method extends tag persistence by a factor of 1.9×.

Chapter 5. Seeing functional endotypes of OSA: real-time multi-slice MRI during continuous positive airway pressure introduces a new experiment for measurement of upper airway physiology during sleep. I optimize and apply a previously developed simultaneous multi-slice (SMS) RT-MRI technique to image and quantify upper airway changes during rapid programmed changes in continuous positive airway pressure (CPAP) level. I use this tool to demonstrate that RT-MRI during CPAP can measure the neuromuscular reflex and passive collapsibility of the upper airway in individuals with OSA. I show that SMS RT-MRI during CPAP can reproducibly identify physiological traits and anatomic risk factors that are valuable in the assessment of OSA.

Chapter 6.
Concluding remarks summarizes the key accomplishments in this dissertation and related future research directions.

Chapter 2

Magnetic Resonance Imaging

2.1 MRI fundamentals

MR is fundamentally linked to quantum mechanics and is often presented as a phenomenon that necessitates a quantum mechanical explanation. However, MRI really is a classical effect and a consequence of the common sense expressed in classical mechanics [67]. Comprehensive explanations of MRI can be found in great detail in many textbooks [68, 69, 70, 71]. This section provides a concise description of the principles of MRI.

2.1.1 Nuclear magnetic resonance

When nuclear spins are subjected to a static magnetic field $\mathbf{B}_0$, the spins tend to align in the same direction as $\mathbf{B}_0$, giving rise to a net magnetization moment $\mathbf{M}$. Moreover, the spins exhibit resonance, which is manifested by precessional behavior at a well-defined frequency.

Precession

At thermal equilibrium, $\mathbf{M}$ and $\mathbf{B}_0$ align in the same direction. If $\mathbf{M}$ is made to point in a different direction, precession will occur according to

$$\frac{d\mathbf{M}(t)}{dt} = \mathbf{M}(t) \times \gamma \mathbf{B}_0, \tag{2.1}$$

where $\gamma$ is called the gyromagnetic ratio, a known constant unique to each type of nucleus. For $^1$H, $\gamma/2\pi = 42.58$ MHz/T.

Relaxation

If the perturbation to $\mathbf{M}$ is removed, the magnetization will return to its equilibrium state. This process is called relaxation and can be divided into two components: longitudinal ($z$) relaxation and transverse ($xy$) relaxation. Figure 2.1 depicts the relaxation. The transverse component of the magnetization decays away while the longitudinal component returns to its thermal equilibrium state. The magnetization behaves according to

$$\frac{dM_z(t)}{dt} = -\frac{M_z(t) - M_0}{T_1}, \qquad \frac{dM_{xy}(t)}{dt} = -\frac{M_{xy}(t)}{T_2}, \tag{2.2}$$

where $T_1$ is called the spin-lattice time constant and characterizes the return to equilibrium along the $z$ direction, and $T_2$ is called the spin-spin time constant and characterizes the decay of the transverse magnetization.
The solution of Equation 2.2 is

$$M_z(t) = M_0 + [M_z(0) - M_0]\,e^{-t/T_1}, \qquad M_{xy}(t) = M_0\,e^{-t/T_2}, \tag{2.3}$$

where the transverse solution is written for $M_{xy}(0) = M_0$, that is, for magnetization fully tipped into the transverse plane. Equation 2.3 implies an exponential recovery back to $M_0$ along the $z$ direction and an exponential decay of the transverse component. These processes are referred to as $T_1$ and $T_2$ relaxation, respectively.

In the presence of field inhomogeneity, spins precess differently based on local field strength. Intra-voxel dephasing further accelerates the signal decay in the transverse plane. We denote the time constant of this faster decay by $T_2^*$. As a result, transverse relaxation takes the form

$$M_{xy}(t) = M_0\,e^{-t/T_2^*}. \tag{2.4}$$

Figure 2.1: Spin returns to equilibrium through relaxation. This example shows relaxation with $T_1 = 250$ ms, $T_2 = 70$ ms. Left shows the spin in 3D coordinates. Note that the transverse relaxation ($T_2$ relaxation, right bottom) is more rapid than the longitudinal relaxation ($T_1$ relaxation, right top).

Bloch equation

The dynamics of the magnetization described above can be combined into the Bloch equation:

$$\frac{d\mathbf{M}(t)}{dt} = \mathbf{M}(t) \times \gamma \mathbf{B}_0 - \frac{\mathbf{i}\,M_x(t) + \mathbf{j}\,M_y(t)}{T_2^*} - \frac{\mathbf{k}\,[M_z(t) - M_0]}{T_1}. \tag{2.5}$$

Note that $T_2^*$ will be used throughout this dissertation. There are MRI sequences that can reverse $T_2^*$ decay back to $T_2$ decay, such as spin echo sequences. However, the work in this dissertation does not include this class of sequences.

2.1.2 From polarization to Fourier reconstruction

There are three core components at work in a typical clinical MRI scanner: a main magnetic field $\mathbf{B}_0$, an oscillating magnetic field $\mathbf{B}_1(t)$, and three pairs of gradient coils generating linear magnetic fields $G_x$, $G_y$, $G_z$ in three orthogonal directions. These components are essential to acquire an MR image.
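As a brief numerical aside on Section 2.1.1, the following sketch integrates Equation 2.2 with a simple forward-Euler step and compares the result with the closed-form solution in Equation 2.3. It uses the $T_1 = 250$ ms and $T_2 = 70$ ms values from Figure 2.1 and assumes full excitation ($M_z(0) = 0$, $M_{xy}(0) = M_0 = 1$). The function names, step size, and evaluation time are illustrative choices, not part of the dissertation.

```python
import numpy as np

T1, T2, M0 = 250.0, 70.0, 1.0   # ms, ms, arbitrary units (values from Figure 2.1)

def relax_closed_form(t, mz0=0.0, mxy0=M0):
    """Closed-form relaxation (Equation 2.3) from the initial state (mz0, mxy0)."""
    mz = M0 + (mz0 - M0) * np.exp(-t / T1)
    mxy = mxy0 * np.exp(-t / T2)
    return mz, mxy

def relax_euler(t_end, dt=0.01, mz0=0.0, mxy0=M0):
    """Forward-Euler integration of Equation 2.2 from t = 0 to t_end (ms)."""
    mz, mxy = mz0, mxy0
    for _ in range(int(round(t_end / dt))):
        mz += dt * (-(mz - M0) / T1)     # longitudinal recovery toward M0
        mxy += dt * (-mxy / T2)          # transverse decay toward zero
    return mz, mxy

mz_num, mxy_num = relax_euler(100.0)
mz_ref, mxy_ref = relax_closed_form(100.0)
print(f"Mz : Euler {mz_num:.4f}, closed form {mz_ref:.4f}")
print(f"Mxy: Euler {mxy_num:.4f}, closed form {mxy_ref:.4f}")
```

The two results should agree closely for a small step size, and the transverse component is already much closer to its end state than the longitudinal component, as in Figure 2.1.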
The pipeline of obtaining an MR image can be described as a four-stage process.

Polarization

Atoms with an odd number of protons and/or an odd number of neutrons, such as hydrogen $^1$H, possess a nuclear spin angular momentum and exhibit the MR phenomenon. We refer to these MR-relevant nuclei as spins. In the absence of an external magnetic field, the spins are oriented randomly and the net macroscopic magnetic moment is zero. However, in the presence of $\mathbf{B}_0$, the magnetic moment vectors tend to align in the direction of $\mathbf{B}_0$. At 1 T and 310 K, the excess of spins in the parallel state is only about 7 out of $10^6$. Macroscopically, however, this excess accounts for the polarization and gives rise to a small net magnetization $M_0$. By convention, we call the direction of $\mathbf{B}_0$ the $z$ direction or longitudinal direction. Moreover, the nuclear spins precess at a frequency called the Larmor frequency $\omega_0$, which is related to the applied magnetic field by

$$\omega_0 = \gamma B_0, \tag{2.6}$$

or

$$f_0 = \frac{\gamma}{2\pi} B_0. \tag{2.7}$$

Most whole-body imaging systems operate at a fixed field strength within the range of 0.1 to 3 T. In this range, the Larmor frequency lies in the radio frequency (RF) range.

Figure 2.2: Excitation in the laboratory and rotating frame. A $\mathbf{B}_1$ RF field tuned to the Larmor frequency and applied in the $xy$ plane induces nutation of the magnetization vector as it tips away from the $z$ axis. Left and right show the same tip-down in the laboratory and rotating frames, respectively.

Excitation

The polarization creates a net magnetization that aligns in the $z$ direction. However, this magnetization is too small compared to $\mathbf{B}_0$ and is undetectable unless a perturbation is introduced such that part of the magnetization is tipped into the transverse plane. This is accomplished by applying the oscillating magnetic field $\mathbf{B}_1(t)$. When $\mathbf{B}_1(t)$ is applied in the transverse direction and rotates at the resonant frequency of the magnetization, it excites the magnetization.
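To give a sense of scale for Equation 2.7 and for the polarization figure quoted above, the sketch below evaluates the $^1$H Larmor frequency at a few field strengths and the fractional excess of spins at 1 T and 310 K. The two-level Boltzmann ratio $\exp(\gamma\hbar B_0 / k_B T)$ is standard physics but is an assumption of this sketch rather than something stated in the text; the function names are hypothetical.

```python
import math

GAMMA_BAR_1H = 42.58e6      # gamma / (2*pi) for 1H in Hz/T (Section 2.1.1)
HBAR = 1.054571817e-34      # reduced Planck constant, J*s
K_B = 1.380649e-23          # Boltzmann constant, J/K

def larmor_frequency_hz(b0_tesla):
    """Equation 2.7: f0 = (gamma / 2*pi) * B0."""
    return GAMMA_BAR_1H * b0_tesla

def spin_excess_fraction(b0_tesla, temperature_k):
    """Fractional excess N_parallel / N_antiparallel - 1.

    Assumes a two-level Boltzmann distribution with energy gap
    hbar * gamma * B0 (an assumption of this sketch).
    """
    delta_e = HBAR * 2.0 * math.pi * GAMMA_BAR_1H * b0_tesla
    return math.expm1(delta_e / (K_B * temperature_k))

for b0 in (0.1, 1.5, 3.0):
    print(f"B0 = {b0} T -> f0 = {larmor_frequency_hz(b0) / 1e6:.2f} MHz")
print(f"excess at 1 T, 310 K: {spin_excess_fraction(1.0, 310.0) * 1e6:.1f} per million")
```

Under these assumptions the Larmor frequency stays in the MHz range across clinical field strengths, and the computed excess is a few parts per million, in line with the figure of about 7 in $10^6$.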
Since the resonant frequency is in the RF range, the $\mathbf{B}_1(t)$ field is often referred to as the RF field. Figure 2.2 shows excitation in the laboratory frame and in a frame rotating at the resonance frequency.

If only a small region of the whole body needs to be imaged, the linear gradient field $\mathbf{G}(t)$ can be turned on together with the RF pulse to achieve selective excitation. Figure 2.3 shows selective excitation. With the gradient on, the resonance frequencies of the spins vary linearly along the gradient direction. If a band-limited RF pulse is played simultaneously, then only the spins whose Larmor frequencies fall within the RF pulse's bandwidth are excited. Under the small flip angle approximation, it can be shown by solving the Bloch equation (Equation 2.5) that the slice profile is approximately the Fourier transform of the RF envelope. A band-limited sinc RF pulse is used extensively in MRI because it produces a near-rectangular slice profile.

Figure 2.3: Slice selective excitation. The gradient coils create a linear gradient field on top of $\mathbf{B}_0$ across the whole object. The gradient amplitude can be tuned so that the RF pulse's frequency band matches the spins' Larmor frequency range within the region of interest. Only that region satisfies the resonance condition and is excited. [Figure annotation: excited bandwidth $\Delta\omega = \gamma G_z \Delta z$ for slice thickness $\Delta z$.]

MR signal equation and k-space

Starting from the Bloch equation (Equation 2.5), the transverse magnetization after excitation is

$$M_{xy}(\mathbf{r}, t) = M_{xy}(\mathbf{r}, 0)\, e^{-i\omega_0 t}\, e^{-t/T_2^*(\mathbf{r})}\, e^{-i\Delta\omega(\mathbf{r}) t}, \tag{2.8}$$

where the last term accounts for the extra phase introduced by off-resonance due to $\mathbf{B}_0$ inhomogeneity. A receive coil that is perpendicular to the transverse plane can pick up the magnetization. The signal is an accumulation of all of the magnetization. We can write the MR signal equation as

$$s(t) = \int_{\mathbf{r}} M_{xy}(\mathbf{r}, 0)\, e^{-i\omega_0 t}\, e^{-t/T_2^*(\mathbf{r})}\, e^{-i\Delta\omega(\mathbf{r}) t}\, d\mathbf{r}. \tag{2.9}$$

However, the object cannot be resolved spatially from this free induction decay (FID).
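The FID of Equation 2.9 can be simulated for a single voxel to show how a spread of off-resonance frequencies shortens the apparent decay, which is the $T_2^*$ effect of Equation 2.4. This sketch works in the frame rotating at $\omega_0$, models the intrinsic decay with a single $T_2 = 70$ ms, and assumes a uniform spread of off-resonance frequencies over ±20 Hz with equal spin weights. The spread, the time axis, and all names are illustrative assumptions.

```python
import numpy as np

T2_MS = 70.0                                   # intrinsic T2 in ms (value from Figure 2.1)
t_ms = np.linspace(0.0, 100.0, 201)            # time axis in ms
delta_f_hz = np.linspace(-20.0, 20.0, 401)     # assumed intra-voxel off-resonance spread in Hz

def fid_magnitude(t_ms, delta_f_hz, t2_ms):
    """|s(t)| of Equation 2.9 in the rotating frame for equal-weight spins."""
    t_s = t_ms[:, None] * 1e-3
    # Per-spin off-resonance phase term exp(-i * delta_omega * t).
    phase = np.exp(-1j * 2.0 * np.pi * delta_f_hz[None, :] * t_s)
    dephasing = np.abs(phase.mean(axis=1))     # coherent sum over all spins in the voxel
    return dephasing * np.exp(-t_ms / t2_ms)

signal = fid_magnitude(t_ms, delta_f_hz, T2_MS)
t2_only = np.exp(-t_ms / T2_MS)
print(f"at 20 ms: with dephasing {signal[40]:.3f}, T2 decay only {t2_only[40]:.3f}")
```

In this model the simulated signal never exceeds the pure $T_2$ envelope once $t > 0$, because the off-resonance phases progressively cancel in the sum.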
In order to obtain an image, one needs to use the gradients to spatially encode the voxels. With the gradients on, linear phase accumulates across the object. After demodulation at $\omega_0$, and ignoring $T_2^*$ decay and the off-resonance phase term, the signal is

$$s(t) = \int_{\mathbf{r}} M_{xy}(\mathbf{r}, 0)\, e^{-i\gamma \int_0^t \mathbf{G}(\tau)\cdot\mathbf{r}\, d\tau}\, d\mathbf{r}. \tag{2.10}$$

Defining

$$\mathbf{k}(t) = \frac{\gamma}{2\pi} \int_0^t \mathbf{G}(\tau)\, d\tau, \tag{2.11}$$

we can rewrite Equation 2.10 as

$$s(\mathbf{k}) = \int_{\mathbf{r}} M_{xy}(\mathbf{r}, 0)\, e^{-i 2\pi \mathbf{k}(t)\cdot\mathbf{r}}\, d\mathbf{r} = \mathcal{F}\{M_{xy}(\mathbf{r}, 0)\}\Big|_{\mathbf{k}(t) = \frac{\gamma}{2\pi}\int_0^t \mathbf{G}(\tau)\, d\tau}. \tag{2.12}$$

This is exactly the form of a Fourier transform (FT), where $s(\mathbf{k})$ and $M_{xy}(\mathbf{r})$ form a Fourier pair: $s(\mathbf{k})$ is the spatial frequency spectrum of $M_{xy}(\mathbf{r})$, and $\mathbf{k}(t)$ is the spatial frequency at which it is evaluated. The signal is therefore acquired in this spatial frequency domain, the so-called k-space. Equation 2.12 implies that sampling takes the form of traversing k-space as a function of time, and that the trajectory in k-space is determined by the gradient waveforms. Figure 2.4 shows an example of a 2DFT sequence acquiring a k-space containing three lines.

Figure 2.4: 2DFT acquisition and k-space. Left shows an example of a 2DFT pulse sequence (RF, $G_z$, $G_x$, $G_y$ waveforms) with three k-space lines (right). Three different $G_y$ gradient areas are used in three TRs, resulting in three different phase encoding lines in the k-space trajectory (marked with 1, 2, 3 in the $G_y$ waveform and in the trajectory).

Fourier reconstruction

Equation 2.12 indicates that the information about the object magnetization $M_{xy}(\mathbf{r})$ lies in an infinite and continuous k-space. Ideally, if we know the entire infinite k-space, a perfect recovery of the magnetization $M_{xy}(\mathbf{r})$, denoted as the object function $\rho(\mathbf{r})$, can be obtained by taking the inverse Fourier transform:

$$\rho(\mathbf{r}) = \int_{\mathbf{k}} s(\mathbf{k})\, e^{i 2\pi \mathbf{k}\cdot\mathbf{r}}\, d\mathbf{k} = \mathcal{F}^{-1}\{s(\mathbf{k})\}, \tag{2.13}$$

where $\mathcal{F}^{-1}$ is the inverse Fourier transform. In practice, however, we cannot sample an infinite k-space. For simplicity, let us consider the 1D case and assume a finite k-space with maximum extent $k_{\max}$.
Now we can only estimate the object $\rho(r)$ by an image, denoted as $\hat{\rho}(r)$. Equation 2.13 needs to be rewritten as

$$\hat{\rho}(r) = \int_{k} [s(k)\cdot u(2k_{\max})]\, e^{i 2\pi k r}\, dk = \rho(r) * \mathrm{sinc}(r), \tag{2.14}$$

where $u(2k_{\max})$ is a box function with width $2k_{\max}$, $\mathrm{sinc}(\cdot)$ is the sinc function, and $*$ denotes convolution. The convolution with the sinc function is a result of the finite k-space support and implies a blurring of the object $\rho(r)$ in the image $\hat{\rho}(r)$. This blurring determines the spatial resolution of the image, which is further explained in Section 2.1.3.

Now let us consider discrete sampling of k-space. Assume $N$ k-space samples are acquired at an equal interval of $\Delta_k$. We further assume that the image has $N$ pixels. Then the image $\hat{\rho}[m]$ is the discrete inverse Fourier transform (IFT) of the k-space $s[n]$:

$$\hat{\rho}[m] = \sum_{n=-N/2}^{N/2-1} s[n]\, e^{i 2\pi n m / N}, \qquad -N/2 \le m < N/2. \tag{2.15}$$

In matrix notation, the image $\hat{\boldsymbol{\rho}}$ and the k-space data $\mathbf{s}$ satisfy

$$\hat{\boldsymbol{\rho}} = \mathbf{F}^H \mathbf{s}, \qquad \mathbf{s} = \mathbf{F}\hat{\boldsymbol{\rho}}. \tag{2.16}$$

Equation 2.16 is known as Fourier reconstruction. It shows that the reconstruction of 2DFT imaging can be implemented efficiently with the fast Fourier transform (FFT). Equation 2.16 also indicates that k-space and image space are reciprocal lattices. Therefore, the field of view (FOV) of the image and the sampling interval in a 2DFT k-space are reciprocal pairs:

$$\mathrm{FOV} = \frac{1}{\Delta_k}. \tag{2.17}$$

Similarly, the image pixel size $\delta$ and the k-space extent satisfy

$$\delta = \frac{1}{N\Delta_k}. \tag{2.18}$$

Note that the pixel size $\delta$ is sometimes referred to as the image resolution, such as in 2DFT imaging. However, it does not dictate the resolution limit of the imaging system and is therefore not accurate in many applications. A more comprehensive discussion of image resolution continues in Section 2.1.3.

2.1.3 Advanced acquisition

In Section 2.1.2, we introduced the MRI pipeline with one simple example of 2DFT static imaging. However, in many applications the imaging considerations are more complicated. For example, motion exists in every scan.
Temporal resolution needs to be high enough to mitigate motion artifacts. One effective solution is to use other sampling strategies for dynamic imaging, as 2DFT sampling is not efficient or fast enough. As discussed in Section 2.1.2, image resolution is not accurately defined by pixel size for an arbitrary sampling pattern. One needs to address this issue when considering advanced data sampling. This section discusses advanced acquisition techniques with more comprehensive imaging considerations, including spatiotemporal resolution, motion, and spatial coverage.

Figure 2.5: PSF and image resolution (adapted from Ref [69], section 8.1). In this non-MRI example, the PSF is a box function with width $w$. (A) The two points ($\delta$ functions in the object $\rho(r)$) can be resolved only when the distance $d > w$. (B)(C) The two points become one single point in the image $\hat{\rho}(r)$ when $d \le w$. Therefore the resolution of this system can be defined as $w$. Figure 2.7 shows an MRI example.

Point spread function and image resolution

Equation 2.14 gives a convolution relation between the object $\rho(r)$ and the image $\hat{\rho}(r)$. In fact, this convolution relation holds for all linear imaging systems [69]. If we denote the convolution kernel by $h(r)$, any linear imaging system can be written as

$$\hat{\rho}(r) = \rho(r) * h(r). \tag{2.19}$$

In MRI, $h(r)$ is determined by the k-space sampling pattern and is called the point spread function (PSF) of the imaging system. If the PSF is not a $\delta$ function, the convolution in Equation 2.19 represents a blurring of the object in the image. Figure 2.5 shows two points in an imaging system whose PSF is a box function of width $w$. The two points are resolvable only if their distance $d > w$, as shown in Figure 2.5 (A). Similarly, we can determine the resolution for other PSFs if their widths are well defined. For example, Equation 2.14 indicates that the PSF of 2DFT imaging is a sinc function.
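The blurring in Equation 2.14 and the resolution argument of Figure 2.5 can be reproduced numerically in 1D. The sketch below builds a finely sampled object containing two point sources, keeps only the central part of its k-space, and checks whether the two points remain distinguishable in the reconstructed magnitude image. The grid size, the retained k-space extent, the point separations, and the dip criterion are arbitrary illustrative choices.

```python
import numpy as np

N_FINE = 512    # fine grid standing in for the continuous object
K_KEEP = 64     # number of central k-space samples retained (finite k_max)

def reconstruct_truncated(obj, n_keep):
    """Fourier-transform obj, zero all but the central n_keep samples, and invert."""
    k = np.fft.fftshift(np.fft.fft(obj))
    mask = np.zeros(obj.size)
    c = obj.size // 2
    mask[c - n_keep // 2 : c + n_keep // 2] = 1.0   # box function of Equation 2.14
    return np.abs(np.fft.ifft(np.fft.ifftshift(k * mask)))

def two_points_resolved(separation):
    """True if the image shows a clear dip midway between two point sources."""
    obj = np.zeros(N_FINE)
    a = N_FINE // 2 - separation // 2
    b = a + separation
    obj[a] = obj[b] = 1.0
    img = reconstruct_truncated(obj, K_KEEP)
    mid = (a + b) // 2
    return bool(img[mid] < 0.5 * min(img[a], img[b]))

nominal_pixel = N_FINE // K_KEEP    # pixel size of Equation 2.18, in fine-grid samples
for d in (4, 8, 16):
    print(f"separation {d:2d} samples (pixel = {nominal_pixel}): resolved = {two_points_resolved(d)}")
```

With this dip criterion, two points 16 samples apart are separated while two points 4 samples apart merge into one, which mirrors panels (A) and (C) of Figure 2.5 with the sinc-like PSF of truncated k-space in place of the box PSF.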
The full width at half maximum (FWHM) of the main lobe can be defined as the width and hence the resolution in 2DFT imaging [69]. More importantly, Equation 2.14 also points out that the PSF is the impulse response of the imaging system, given any sampling pattern and k-space support. Therefore, the PSF can be obtained by taking the IFT of the sampled k-space when the object is a $\delta$ function. For example, the PSF is a 2D sinc function for a square k-space support; it becomes a truncated and periodic sinc function with 2DFT sampling; and it is a jinc function for a circular k-space support. More PSFs are shown in Figure 2.6. Figure 2.7 shows an example in which the image resolution is determined by the width of the PSF, rather than the pixel size, in non-Cartesian sampling.

Non-Cartesian sampling trajectory

2DFT is the most commonly used sampling trajectory and dominates in clinical scans. The data are uniformly sampled, and images can be easily and quickly reconstructed using a fast Fourier transform. However, 2DFT does not use the gradients efficiently. 2DFT also creates coherent artifacts when undersampled and is therefore more affected by motion, making it less appealing for resolving object movement, such as in dynamic imaging.

Figure 2.6: Examples of non-Cartesian sampling trajectories (EPI, radial, spiral; adapted from Ref [72]) and their PSFs. EPI samples multiple lines on the Cartesian grid by zig-zagging. Spiral and radial samples do not lie on the Cartesian grid. Therefore extra reconstruction steps, such as gridding or NUFFT, are needed to resample onto the Cartesian grid. The bottom two rows show the corresponding PSFs. Dashed lines indicate the pixel size determined by the reciprocal of the k-space extent in Equation 2.18.

Sampling along a non-Cartesian trajectory can have many benefits based on the unique properties of these trajectories. Figure 2.6 shows several non-Cartesian sampling trajectories.
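To make the sampling patterns of Figure 2.6 concrete, the sketch below generates k-space coordinates for a set of radial spokes and for a simple interleaved Archimedean spiral. These are idealized geometric trajectories with constant angular increments and no gradient amplitude or slew-rate limits, so they illustrate the sampling pattern only and not a realizable waveform design. All function names and parameter values are hypothetical.

```python
import numpy as np

def radial_trajectory(n_spokes, n_samples, k_max=0.5):
    """Complex k-space coordinates (kx + i*ky) of full-diameter radial spokes."""
    radius = np.linspace(-k_max, k_max, n_samples)
    angles = np.arange(n_spokes) * np.pi / n_spokes       # uniform angles in [0, pi)
    return radius[None, :] * np.exp(1j * angles[:, None])

def spiral_trajectory(n_interleaves, n_samples, n_turns, k_max=0.5):
    """Complex k-space coordinates of spiral-out interleaves (Archimedean)."""
    s = np.linspace(0.0, 1.0, n_samples)                  # normalized position along the readout
    base = k_max * s * np.exp(1j * 2.0 * np.pi * n_turns * s)
    rotation = np.exp(1j * 2.0 * np.pi * np.arange(n_interleaves) / n_interleaves)
    return rotation[:, None] * base[None, :]              # each interleaf is a rotated copy

radial = radial_trajectory(n_spokes=32, n_samples=65)
spiral = spiral_trajectory(n_interleaves=8, n_samples=500, n_turns=6)
print(radial.shape, spiral.shape)
```

Both patterns sample $k = 0$ in every spoke or interleaf (the middle sample of each spoke and the first sample of each interleaf), and both reach the same maximum radius `k_max`.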
EPI and spiral use longer readouts to traverse the entire k-space more rapidly: EPI zig-zags through k-space with fewer RF excitations; spiral swirls in and/or out. Spiral and radial visit the k-space center in every TR, which allows the k-space center to be updated more frequently throughout data acquisition. They also produce fewer coherent artifacts from undersampling [73] and are therefore less affected by motion [74, 75].

Figure 2.7: Image resolution is different from pixel size. This example shows two points in a spiral acquisition. Dashed lines indicate the pixel size determined by the reciprocal of the k-space extent in Equation 2.18. In (A) the distance between the two points is wider than the resolution (the width of the PSF), so we can distinguish them in the acquired image. In (B) the two points overlap and become one single point, as their distance is smaller than the resolution, albeit larger than the pixel size. In (C) the distance is smaller than the pixel size.

However, it is also considerably more difficult to reconstruct images from non-Cartesian data because the non-Cartesian data points do not fall on a grid in k-space. Moreover, there are other considerations when using these advanced trajectories. For example, the longer readouts of EPI and spiral are more susceptible to off-resonance, which introduces shift, blur, and/or distortion [76, 77, 50]. Radial sampling is less efficient than Cartesian sampling: it oversamples the k-space center at the expense of the periphery, so more samples are needed to satisfy the Nyquist criterion at every k-space location. Another consideration is that non-Cartesian sampling often operates at the hardware limits. Imperfections and nonidealities in the hardware of an MRI scanner can also generate image artifacts [72].

Dynamic imaging and k-t space

So far we have discussed acquisition methods under the assumption that the object is stationary.
When the object moves or contains moving structures, motion artifacts arise [72]. In this section, we discuss motion in MRI from a k-t space perspective [78, 79] and show how artifacts arise from temporal undersampling.

Motion degrades the image in two ways [77]: intra-acquisition and inter-acquisition. The former is motion during the acquisition of a PE line in 2DFT, an interleaf in spiral, or a spoke in radial sampling. Additional phase due to the motion accumulates and is present in the data, causing blurring and distortion along the acquisition direction. The latter introduces inconsistencies in the acquired k-space, as the data acquired before and after the motion do not correspond to the identical object pose. This causes a superposition of undersampled poses and presents as aliasing and displacement artifacts [78], such as ghosting in 2DFT and EPI, streaking in radial, and swirling in spiral, further blurring the images and lowering the SNR.

Here we provide an example explanation of motion artifacts in 2DFT sampling. Figure 2.8 illustrates the k-t space of a moving object. We have an object with only its center-piece moving. Assuming there is no intra-acquisition motion, we have a series of images stacked along the time axis ($t$-axis) with interval TR in the $xy$-$t$ space. Its 2D FT is the k-t space. If we can fully sample this k-t space and satisfy the Nyquist criterion, the 2D IFT can perfectly recover the image at each time frame. However, this condition is rarely satisfied in practice. As a result, motion artifacts arise in the image. For 2DFT sampling, motion artifacts take the form of ghosting in the phase encoding (PE) direction, which can be explained by k-t space and the Central Section Theorem [68].

Figure 2.9 (A) is the same k-t space shown in Figure 2.8 (C). In practice, we can only sample a part of the k-t space in one single time frame. For example, consider the 2DFT sampling in Figure 2.9 (B).

Figure 2.8: The ideal k-t space. (A) shows an object along time, with only the center-piece moving.
(B) shows the object in the $xy$-$t$ space. We assume the object stays static within one single TR but moves between adjacent TRs. (C) shows the Fourier transform of the $xy$-$t$ space: the ideal k-t space. If this k-t space is fully sampled according to the Nyquist criterion, the object motion can be resolved in each time frame.

Figure 2.9 (B) shows 2DFT sampling in k-t space. We sample a single PE line in each TR, resulting in a sloped slice in the k-t space through multiple time points. This slice in k-t space determines the temporal resolution of dynamic imaging. (C) shows the 3D ($k_x$-$k_y$-$t$) IFT of k-t space, which is the $xy$-$\omega$ space. Only the center-piece information exists in the higher frequency planes, since it is the only moving part. The arrows in (B) and (C) mark the same direction, which is perpendicular to the sampled slice (the normal direction, $\mathbf{n}$). (D) By the Central Section Theorem, the 2D ($k_x$-$k_y$) IFT of this slice is the projection of the 3D $xy$-$\omega$ space along the direction perpendicular to the slice. The projection along the sloped direction results in the center-pieces at higher frequency $\omega$ falling onto the reconstructed image and forming ghosting artifacts.

Figure 2.9: Motion artifacts explained by the Central Section Theorem in k-t space (modified from Ref [78]). (A) shows the same k-t space as in Figure 2.8. (B) In practice, however, we can only sample part of the k-t space, such as a sloped slice in 2DFT sampling. (C) By the Central Section Theorem, the 2D Fourier transform of a slice in the k-t space is the projection of the 3D $xy$-$\omega$ space along the direction perpendicular to this slice (denoted by the normal direction, $\mathbf{n}$). This can explain the ghosting artifacts in the image domain (D).

Non-Cartesian sampling, such as spiral, samples the k-t space on a more complicated shape than the sloped slice in 2DFT sampling.
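The ghosting mechanism described above for 2DFT sampling can be reproduced with a small 1D phase-encoding experiment. In this sketch, each phase-encoding line is acquired in a different TR while the object's signal amplitude alternates from one TR to the next, a simple stand-in for periodic motion. The resulting periodic inconsistency across k-space lines shows up as a replica displaced by half the FOV. The object, the modulation depth, and all names are illustrative assumptions, and intra-acquisition motion is ignored.

```python
import numpy as np

N_PE = 128                                  # number of phase-encoding lines, one per TR
obj = np.zeros(N_PE)
obj[22:42] = 1.0                            # 1D object profile along the PE direction

kspace_static = np.fft.fft(obj)             # k-space of a stationary object

# Assumed periodic change: line n is scaled by 1 + 0.3 * (-1)^n, so data from
# adjacent TRs correspond to slightly different object states.
line_index = np.arange(N_PE)
modulation = 1.0 + 0.3 * (-1.0) ** line_index
kspace_dynamic = kspace_static * modulation

image_static = np.abs(np.fft.ifft(kspace_static))
image_dynamic = np.abs(np.fft.ifft(kspace_dynamic))

ghost_region = slice(22 + N_PE // 2, 42 + N_PE // 2)   # object location shifted by FOV/2
print(f"ghost amplitude: {image_dynamic[ghost_region].mean():.2f}")
print(f"object amplitude: {image_dynamic[22:42].mean():.2f}")
```

Because alternating the sign of every other k-space line is equivalent to shifting the image by half the FOV, the modulated acquisition reproduces the object at full amplitude plus a ghost copy whose amplitude equals the modulation depth.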
Sampling on such a shape creates artifacts that are more incoherent than ghosting and that can be efficiently removed by advanced reconstruction techniques [80]. Furthermore, the view order, such as the order of the interleaves in spiral sampling, can be changed so that adjacent interleaves create more incoherence in k-t space by suppressing the side lobes of the point spread function [77], which can further improve the reconstruction quality [81].

Multiband excitation

Multiband RF pulses simultaneously excite several slices [82]. The multiband RF pulse can be described as the product of two functions:

$$R(t) = A(t) \cdot \sum_{n=1}^{N} P_n(t), \tag{2.20}$$

where $A(t)$ is the standard complex RF waveform, such as in Figure 2.3, and $P_n(t)$ is an additional phase modulation that determines the slice position, $\Delta\omega_n$, and its phase, $\phi_n$, at TE = 0:

$$P_n(t) = e^{i\Delta\omega_n t + \phi_n}. \tag{2.21}$$

The acquired slices are superimposed in the image. However, with the phase modulation in Equation 2.21, their relative locations can be controlled along the PE direction. This technique can be combined with parallel imaging (see Section 2.1.4) to unalias the individual slice images, such as in [83]. It has also been used in upper airway imaging to simultaneously measure the compliance of multiple airway slices [30].

2.1.4 Advanced Reconstruction

Gridding and NUFFT

There are many approaches for reconstructing non-Cartesian data. One class of methods transforms non-Cartesian k-space data onto a Cartesian grid and is referred to as gridding [84, 85]. Another class of methods directly transforms non-Cartesian k-space data into images [86] and is referred to as the Non-Uniform Fast Fourier Transform (NUFFT). These operations typically require sampling density information to account for the non-uniformity of non-Cartesian sampling. This information is called the density compensation function (DCF). For example, the DCF can be determined for radial or spiral data using convolution interpolation [87].
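As a simpler stand-in for the convolution-based approach of [87], the sketch below computes area-based density compensation weights for evenly spaced, full-diameter radial spokes, where each sample is weighted by the share of k-space area it represents. This yields the familiar ramp weighting proportional to $|k|$. It is not the method of [87] and not an implementation from [88]; the function name, normalization, and handling of the center sample are choices made for this illustration.

```python
import numpy as np

def radial_ramp_dcf(n_spokes, n_samples, k_max=0.5):
    """Area-based density compensation weights for uniform radial spokes.

    Each annulus of radius r and width dk has area 2*pi*r*dk and is shared by
    2*n_spokes samples. The central disc of radius dk/2 is shared by the
    n_spokes center samples, one per spoke.
    """
    if n_samples % 2 == 0:
        raise ValueError("n_samples must be odd so that one sample lies at k = 0")
    radius = np.abs(np.linspace(-k_max, k_max, n_samples))
    dk = 2.0 * k_max / (n_samples - 1)
    per_spoke = np.pi * radius * dk / n_spokes                      # ramp in |k|
    per_spoke[n_samples // 2] = np.pi * dk**2 / (4.0 * n_spokes)    # center sample
    return np.tile(per_spoke, (n_spokes, 1))

dcf = radial_ramp_dcf(n_spokes=32, n_samples=65)
covered_area = np.pi * (0.5 + 0.5 / 64) ** 2     # disc of radius k_max + dk/2
print(f"sum of weights {dcf.sum():.6f}, covered k-space area {covered_area:.6f}")
```

With this normalization the weights sum to the area of the sampled disc, which gives a quick sanity check that densely sampled central regions are not over-counted.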
There are also iterative numerical methods based on the coordinates of the sampled data, such as the method by Pipe et al. [85]. Both of these density compensation methods are available as open source in [88].

Note that knowledge of the actual trajectory used to acquire the non-Cartesian data is required for these reconstruction methods. Trajectory correction [89, 90] should be used to avoid image artifacts due to differences between the desired and actual trajectories [77].

View sharing

Temporal resolution in Section 2.1.3 and Figure 2.9 is determined by the time span of the k-t space data used to reconstruct one image. Temporal resolution can be improved by undersampling the k-t space. In addition, some methods increase the apparent frame rate of the dynamic image, such as view sharing.

Figure 2.10: View sharing increases frame rate. Each gray block in the figure represents one single TR. Top shows dynamic imaging without view sharing, where multiple TRs compose one single time frame without any overlap. Bottom shows the view sharing technique with a sliding window of 2. Each TR can be shared by multiple time frames to increase the apparent frame rate.

Figure 2.10 shows the view sharing technique. View sharing allows individual TRs to be shared by adjacent time frames and therefore increases the frame rate. Suppose $N$ TRs are used to generate an image; then the frame rate without any view sharing is $1/(N\cdot\mathrm{TR})$. However, if view sharing is used with a sliding window of $M < N$ (consecutive frames are offset by $M$ TRs), then the new frame rate is $1/(M\cdot\mathrm{TR})$, which is higher than the original frame rate $1/(N\cdot\mathrm{TR})$.

Parallel imaging

Parallel imaging (PI) utilizes an array of localized, phased receiver coils. Phased array coils are conceptually similar to phased array radar and ultrasound and were initially developed to increase signal-to-noise ratio (SNR) [91].
Since then, parallel imaging has been used to accelerate scans by acquiring a reduced amount of k-space data [92]. While undersampling leads to aliasing, the localized coil elements are sensitive to different regions across the object and therefore provide extra encoding through their different sensitivity profiles. This extra information can be applied to resolve the aliasing caused by insufficient Fourier encoding. Most PI techniques fall into two categories: one operates in the image domain, such as SENSE [93]; the other operates in k-space, represented by GRAPPA [94].

Compressed sensing

Compressed sensing (CS) was originally developed to reconstruct signals and images from significantly fewer measurements, each a "random" linear combination of the signal values [95, 96]. MRI has been shown to be a successful application of compressed sensing theory [97].

There are three ingredients in compressed sensing MRI [98]. First, images are sparse in some domain and can thus be represented with fewer coefficients than the number of pixels. The sparse domain can be either the image itself, such as in angiography, or the image transformed to another domain, such as wavelets. Second, undersampling artifacts should be incoherent (noise-like) in the sparsifying transform domain. This is typically achieved by pseudo-random sampling, such as pseudo golden angle spiral [81]. Last, nonlinear reconstruction is required to enforce both the sparsity constraint and the consistency of the reconstruction with the acquired samples.

In the case of dynamic images, the sparsity constraint is typically imposed on temporal finite differences, provided that only a small region has dramatic motion over time. In upper airway imaging, for example, dramatic motion occurs mostly in the airway region, while the other parts of the head and neck remain mostly still [38, 30, 46, 99].
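The sparsity argument above can be illustrated with a synthetic image series in which only a small region changes over time. The sketch below compares the fraction of nonzero entries in the frames themselves with that in their temporal finite differences. The image size, the static background, and the changing region are invented for illustration and do not model real airway data.

```python
import numpy as np

N_FRAMES, NY, NX = 20, 64, 64

frames = np.zeros((N_FRAMES, NY, NX))
frames[:, 8:56, 8:56] = 1.0                    # static background tissue
for t in range(N_FRAMES):
    width = 4 + (t % 5)                        # small dark region whose width changes over time
    frames[t, 30:34, 30:30 + width] = 0.0      # stand-in for a moving airway boundary

temporal_diff = np.diff(frames, axis=0)        # finite differences along the time axis

frac_nonzero_frames = np.count_nonzero(frames) / frames.size
frac_nonzero_diff = np.count_nonzero(temporal_diff) / temporal_diff.size
print(f"nonzero fraction in frames:            {frac_nonzero_frames:.3f}")
print(f"nonzero fraction in temporal diffs:    {frac_nonzero_diff:.4f}")
```

In this toy series more than half of the pixels in each frame are nonzero, while fewer than 1% of the temporal-difference entries are, which is the kind of sparsity a temporal finite difference constraint exploits.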
This prior information can be incorporated into the reconstruction as a sparsity constraint, so that less sampling is required to recover the original image.

2.2 Real-time upper airway MRI

For the past decade, the main goal of upper airway MRI has been to obtain high quality upper airway dynamic image frames with high temporal fidelity [37]. Imaging the upper airway in real time poses unique challenges compared to other parts of the body. In particular, upper airway RT-MRI presents a challenging trade-off between spatial resolution, temporal resolution, signal-to-noise ratio (SNR), and artifact suppression [38]. Several of these factors can be traded differently based on the upper airway task of interest.

Several papers provide comprehensive reviews of current techniques. Lingala et al. [38], Scott et al. [1], and Bresch et al. [23] address the technical aspects of upper airway MR acquisition for speech production. Ramanarayanan et al. [100] present an in-depth review of image analysis techniques for RT-MRI of vocal tract motion. Nayak and Fleck [101] introduce MRI techniques for assessment of OSA. Kim [37] presents a comprehensive and up-to-date review of fast upper airway MRI, including imaging strategies for both speech production and sleep apnea research. In this section, we provide a brief introduction to the imaging requirements for RT upper airway imaging and current state-of-the-art imaging strategies.

2.2.1 Seeing speech: imaging requirements

Figure 2.11 contains a recently reported consensus opinion among speech imaging researchers and linguists [38]. 
Each bright gray cloud in Figure 2.11 represents the current consensus opinion among speech imaging researchers in attendance at the 2014 Speech MRI Summit. Boundaries are approximate due to the lack of gold-standard imaging techniques, and are being refined through the more widespread adoption of noninvasive techniques such as RT-MRI.

Figure 2.11: Seeing the upper airway: imaging requirements. [Plot of spatial resolution (mm) versus temporal resolution (ms), with curves for fully-sampled and under-sampled spiral. Labeled speech production tasks: velic movements, sustained sounds, velo-pharyngeal closure, tongue movements, vowel to consonant co-articulation events, consonant constriction, closures of alveolar trills. Labeled sleep-related disorder tasks: airway closure during sleep, airway closure during inspiratory load, snoring, airway collapse during sub-therapeutic pressure level.] Spatial versus temporal resolution trade-offs in RT-MRI using short interleaved spiral trajectories. In comparison to full sampling, sparse sampling reduces spatiotemporal trade-offs and enables improved visualization of several speech tasks in single plane imaging. The clouds in the figure represent a recently reported consensus opinion among speech imaging researchers and linguists [38]. In general, vocal tract motion during speech production is faster than pharyngeal airway motion during sleep, and generally requires higher temporal resolution than imaging for sleep-related disorder [37]. Sleep-related disorder requires finer spatial resolution, as airway size is small in some patients; it also requires larger spatial coverage (not shown) in order to cover all upper airway sites. Speech imaging requirements and the spiral trade-off are adapted from Ref [38]; sleep imaging requirements are an approximation based on [102, 42, 29, 30, 99].

Figure 2.11 also contains a trade-off among spatial and temporal resolution using single slice spiral acquisition [40]. 
We show spiral trajectories rather than alternative trajectories because they have been shown to provide a superior trade-off among spatial resolution, temporal resolution, and robustness to motion artifacts. The spiral trajectories were designed to make maximum use of the gradients (40 mT/m maximum gradient amplitude and 150 mT/m/ms slew rate). The trade-off curve is based on a simulation with a field of view (FOV) of 20 cm² at Nyquist (full) sampling and at 6.5-fold undersampling for a single slice.

2.2.2 Seeing sleep: imaging requirements

Figure 2.11 also includes an approximate spatiotemporal resolution for sleep-related disorder imaging. This approximation is based on our experience and the recent review in Kim [37]. In general, pharyngeal airway motion during sleep is slower than vocal tract motion. It involves closure of the airway and generally requires lower temporal resolution than imaging for vocal tract shaping [37].

The goal of imaging sleep is to locate the specific position of airway closure or narrowing during apnea and hypopnea. Therefore, spatial coverage needs to be increased to capture airway dynamics from the retro-palatal to the retro-glossal pharynx. Furthermore, the airway collapse pattern is valuable for OSA endotyping [28, 59]. As a result, in-plane spatial resolution requirements are generally higher, in order to resolve the relatively small airway cross section. In RT-MRI for sleep, the time saved by faster data acquisition is typically traded for finer spatial resolution than would be possible otherwise. Alternatively, it can be traded for increased anatomical coverage.

Figure 2.12: RF-spoiled gradient echo (SPGR) imaging. Example of an SPGR sequence with Cartesian data acquisition. A quadratic phase increment (117°) between adjacent RF excitations is used, in addition to a gradient crusher, to eliminate transverse magnetization. 
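The quadratic RF phase schedule named in the Figure 2.12 caption can be sketched as follows. This is a minimal illustration, assuming the common convention in which the phase step between successive excitations grows by a fixed increment (117° here, as in the figure). The function name `rf_spoiling_phases` is hypothetical and is not taken from any scanner software.

```python
def rf_spoiling_phases(n_pulses, increment_deg=117.0):
    """Return RF phases in degrees, wrapped to [0, 360), for quadratic RF spoiling.

    Assumed convention: phase[n] = phase[n-1] + n * increment_deg.
    """
    phases = []
    phase = 0.0  # phase of the current RF pulse
    step = 0.0   # phase difference to the next RF pulse
    for _ in range(n_pulses):
        phases.append(phase % 360.0)
        step += increment_deg  # the step grows linearly with pulse index ...
        phase += step          # ... so the phase itself grows quadratically
    return phases


if __name__ == "__main__":
    print(rf_spoiling_phases(4))
```

Because the phase never repeats with a short period, residual transverse magnetization from earlier TRs does not add coherently, which is the purpose of the spoiling scheme described in the caption.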
2.2.3 Acquisition: spoiled gradient echo imaging

Gradient echo (GRE) sequences are usually the choice for rapid imaging of the upper airway [37]. Compared with spin-echo techniques, the short repetition times of GRE methods enable very rapid 2D and 3D imaging. Rapid GRE sequences [103] typically consist of a single RF excitation, imaging gradients, and acquisition. The echo time (TE) is the time from the RF pulse to the formation of a gradient echo, and the repetition time (TR) is the time between excitation pulses.

Among GRE sequences, RF-spoiled gradient echo (SPGR) with small flip angles is typically used. Figure 2.12 shows an example of an SPGR sequence. With a small imaging flip angle (< 15°, near the Ernst angle), it uses a gradient spoiler at the end of the repetition and additionally varies the phase of each RF pulse to eliminate transverse magnetization, providing near proton-density contrast. The small flip angle allows fast recovery along the longitudinal axis and therefore imaging near the maximum possible SNR, which is attained at the (small) Ernst angle of the tissue of interest. SPGR sequences can use Cartesian, EPI, radial, or spiral acquisitions. They have been widely used in dynamic 2D, multi-slice 2D, and 3D imaging of the upper airway [37].

2.3 Tagged MRI

MR tagging is an established technique for measuring regional muscle function, such as in myocardium [105] and skeletal muscle [106, 107, 108]. Figure 2.13 shows examples of tagged MRI for canine myocardium and human tongue muscle. MR tagging is performed using a preparation pulse that spatially tags tissue by saturating or inverting a series of parallel strips or orthogonal grid lines. The deformation of these tag lines can be measured using dynamic imaging and then used to evaluate the properties of the imaged object, such as myocardial motion [105], limb maneuver [106], eye-ball rotation [107], brain deformation [109], and tongue movement [66]. 
The measured deformations can be further used to quantify muscle mechanics, such as strain [110] and torsion [111] in the heart, and to develop atlases of motion within the tongue [108].

There are many review articles on tagged MRI covering its complete history, techniques, and applications [112, 113, 105]. A comprehensive introduction to tagging pulses can be found in Section 5.5 of [71]. In this section, we present a brief introduction to tagged MRI. Firstly, we provide an abridged history of MR tagging techniques. Then we introduce one particular example, SPAtial Modulation of Magnetization (SPAMM), in detail. In the last two sections we discuss tagging pulses applied with dynamic imaging.

Figure 2.13: Examples of tagged MRI. (A) Tagged myocardium at end-diastole and (B) at end-systole: canine myocardial tagging (courtesy of Daniel B. Ennis, Stanford University). (C) Tagged tongue before deformation and (D) after deformation: tagged human tongue with CINE data acquisition, in which dozens of subject repetitions are needed (adapted from Ref [104]).

2.3.1 Tagging pulses

Tags are typically applied as a magnetization preparation pulse prior to the actual imaging pulse sequence [113]. By perturbing the magnetization of limited regions in the tissue, the tagging pulse causes those regions to appear different (tagged) during the succeeding imaging sequence. If the tagged tissue moves between the times of tagging and imaging, the magnetization tag will move with the underlying tissue, directly revealing the displacement of the tagged region in the subsequent images. However, the tagging information gradually disappears due to signal recovery, so the imaging module needs to be applied before the magnetization of the perturbed regions recovers to be indistinguishable from adjacent regions. 
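The fading of tag contrast due to longitudinal recovery can be illustrated with a short sketch. It assumes an ideal saturation tag (tagged $M_z = 0$ at the time of tagging), fully relaxed untagged tissue, no imaging RF pulses, and an illustrative T1 of 850 ms. The function name `tag_contrast` and its parameters are hypothetical.

```python
import math


def tag_contrast(t_ms, t1_ms=850.0, mz_tag0=0.0, m0=1.0):
    """Longitudinal contrast between untagged and tagged tissue at time t_ms.

    Assumes untagged tissue stays at m0 and tagged tissue recovers from
    mz_tag0 toward m0 with time constant t1_ms (no imaging RF pulses).
    """
    mz_tagged = m0 + (mz_tag0 - m0) * math.exp(-t_ms / t1_ms)
    return m0 - mz_tagged


if __name__ == "__main__":
    for t in (0.0, 250.0, 500.0, 850.0):
        print(f"t = {t:6.1f} ms  contrast = {tag_contrast(t):.3f} M0")
```

Under these assumptions the contrast decays as a single exponential with time constant T1, so the usable imaging window is bounded by a small multiple of T1 even before the effect of imaging excitations is considered.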
The development of MR tagging has been driven by the need to visualize regional myocardial function [105]. The first tagging technique was invented by Zerhouni et al. [114]. Since then, a series of tagging sequences has been invented to improve spatiotemporal resolution and extend tag line contrast. Well-known techniques in use today include SPAMM [115], Delay Alternating with Nutations for Tailored Excitation (DANTE) [116], Complementary SPAtial Modulation of Magnetization (CSPAMM) [117], HARmonic Phase (HARP) [118], Displacement ENcoding with Stimulated Echoes (DENSE) [119], and Strain ENCoding (SENC) [120].

In the following sections, we focus on the SPAMM sequence [115]. This technique creates a visible pattern of magnetization by saturation or inversion, usually parallel stripes, grid lines, or radial spokes on the reconstructed images. It allows immediate observation of tissue motion once the images are reconstructed. Furthermore, it can provide k-space data with tagging information for other techniques that allow faster and more automatic post-processing, such as HARP [118].

SPAMM

SPAMM was invented by Leon Axel et al. in 1989 [115] and has been extensively validated in vitro [121] and in vivo [122].

Typically, an RF tagging pulse or a train of RF tagging pulses is combined with gradients to specify the tagging pattern. The gradient is played alternately with the RF tagging pulses. Figure 2.14 shows its simplest form: a non-selective RF pulse is played, followed by a gradient pulse, and subsequently another non-selective RF excitation. The example in Figure 2.14 has a total flip angle (TFA) of 90°, with each RF excitation being 45°.

Figure 2.14: 1-1 SPAMM pulse sequence with total flip angle of 90°. (A) The sequence diagram for a SPAMM sequence in its simplest form (1-1 SPAMM with total flip angle of 90°): a (multiple-cycle) dephasing gradient is applied between two 45° RFs, followed by a crusher. The corresponding magnetization at each time point $t_i$ is illustrated at the bottom. (B) Simulated longitudinal magnetization $M_z/M_0$ versus position (cm) immediately following the 1-1 SPAMM (at $t_4$). (C) A depiction of the spins at each time point of the SPAMM sequence.

The effect of the first excitation RF is uniform excitation of the volume affected by the transmitting RF coil. All of the spins precess in phase until the dephasing gradient produces a periodic phase variation. The wavelength of the periodicity depends on the strength and duration of the gradient pulse, and is given by

$$\lambda = \frac{2\pi}{\gamma \int_0^T G(t)\,dt}, \tag{2.22}$$

where $\gamma$ is the gyromagnetic ratio, and $T$ is the duration of the gradient pulse ($t_2 - t_1$ in Figure 2.14). The planes of constant phase are perpendicular to the gradient direction. At this point, however, the magnitude of the magnetization is still uniform. The second excitation pulse turns the phase variation into a corresponding magnitude variation of the magnetization. After the crusher dephases all of the remaining transverse magnetization, there is a sinusoidal variation of the longitudinal magnetization along the direction of the gradient. A second such set of tagging pulses can be applied in another direction to create a grid of tags.

One drawback of the 1-1 SPAMM sequence is the coarse tag line boundary produced by the simple sinusoidal modulation. A longer sequence of alternating RF and gradient pulses can be used to create narrower bands of altered magnetization with sharper tag line boundaries.

Composite binomial RF pulses

Composite RF pulses consist of concatenated subpulses to achieve a sharp frequency profile. In particular, binomial RF pulses can be used to generate a frequency response that is a higher order sinusoidal waveform. 
Consider a frequency response $S_n(f)$. In the small flip angle approximation, the envelope of the RF pulse that produces $S_n(f)$ is approximately given by its Fourier transform. It can be shown recursively, using the modulation and convolution theorems of the Fourier transform, that for a high order sinusoidal waveform

$$S_n(f) = \cos^n(\pi f \tau), \tag{2.23}$$

its Fourier transform is

$$\mathcal{F}\{S_n(f)\} \propto \sum_{k=0}^{n} q_{n,k}\,\delta\!\left(t - \frac{n\tau}{2} + k\tau\right), \tag{2.24}$$

where $n$ is an integer order, $\tau$ is the time interval between the centers of two adjacent RF pulses, $k$ is the RF pulse index, and $q_{n,k}$ is the binomial coefficient

$$q_{n,k} := \binom{n}{k} = \frac{n!}{(n-k)!\,k!}. \tag{2.25}$$

Equation 2.24 shows that the RF pulses that generate such a frequency profile should be a train of delta pulses with relative magnitudes given by the binomial coefficients. For example, an order-3 binomial RF pulse consists of 4 subpulses (k = 0, 1, 2, 3) with relative flip angles of 1-3-3-1. Note that in practice we typically use hard pulses for the composite RFs.

SPAMM with composite pulses

Composite RF pulses can be applied alternately with dephasing gradients to generate sharper tag lines [123]. Figure 2.15 compares simulated tag lines between 1-1 SPAMM and 1-3-3-1 SPAMM. In high-order SPAMM, the tagging part can consist of any number of RF pulses such that their relative flip angles follow a binomial pattern, e.g. 1-2-1, 1-3-3-1, etc. The modulating gradients lie in between the RF pulses and point in the same direction, and the RF pulses have the same total flip angle as in 1-1 SPAMM. The higher the binomial order, the sharper the tag lines. This improves the quality of the tagging pattern at the cost of a slightly longer tagging pulse.

Composite SPAMM sequences can also be used to generate multi-dimensional tag lines. 
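The binomial weighting of Equation 2.25 and its effect on tag sharpness can be sketched with a small rotation-based simulation. This is a minimal illustration under stated assumptions: hard pulses applied about the x axis, identical gradient lobes between sub-pulses, no relaxation or off-resonance, and a crusher that removes all transverse magnetization. The function names (`binomial_flip_angles`, `spamm_mz`) are hypothetical.

```python
import math


def binomial_flip_angles(order, total_flip_deg=90.0):
    """Split a total flip angle into sub-pulse angles weighted by binomial coefficients."""
    coeffs = [math.comb(order, k) for k in range(order + 1)]
    scale = total_flip_deg / sum(coeffs)
    return [c * scale for c in coeffs]


def rotate_x(m, angle_rad):
    """Rotate magnetization m = (mx, my, mz) about the x axis (hard RF pulse)."""
    mx, my, mz = m
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return (mx, c * my - s * mz, s * my + c * mz)


def rotate_z(m, angle_rad):
    """Rotate m about the z axis (phase accrued under one gradient lobe)."""
    mx, my, mz = m
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return (c * mx - s * my, s * mx + c * my, mz)


def spamm_mz(order, phase_rad, total_flip_deg=90.0):
    """Longitudinal magnetization (units of M0) after a binomial SPAMM train.

    phase_rad is the phase accrued between adjacent sub-pulses at a given
    position x, i.e. 2*pi*x/lambda for a tag spacing lambda.
    """
    angles = binomial_flip_angles(order, total_flip_deg)
    m = (0.0, 0.0, 1.0)  # start from equilibrium
    for i, angle in enumerate(angles):
        m = rotate_x(m, math.radians(angle))
        if i < len(angles) - 1:
            m = rotate_z(m, phase_rad)  # gradient lobe between sub-pulses
    return m[2]  # the crusher removes whatever remains in the transverse plane


if __name__ == "__main__":
    print(binomial_flip_angles(3))  # sub-pulse angles for 1-3-3-1 SPAMM
    for order in (1, 3):
        profile = [spamm_mz(order, 2 * math.pi * i / 8) for i in range(9)]
        print(order, [round(v, 3) for v in profile])
```

In this sketch the tag line center (zero accrued phase) is fully saturated for both orders, while away from the center the higher-order train leaves more longitudinal magnetization intact, which corresponds to a narrower dark band.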
Figure 2.16 shows examples of 1D-x, 1D-y, and 2D 1-3-3-1 SPAMM sequences. SPAMM sequences with different dephasing gradient directions are applied sequentially to generate a 2D tag grid. RF pulses that are 90° out of phase and a crusher gradient of different size in the second direction are needed to avoid stimulated echoes in 2D tagging.

Figure 2.15: Higher order SPAMM sequences generate sharper tag lines. Higher order SPAMM sequences use binomial RF pulses alternately with dephasing gradients to generate sharper tag lines with longer pulse duration. Top row shows 1-1, 1-2-1, and 1-3-3-1 SPAMM sequences (sub-pulse flip angles of 45°-45°, 22.5°-45°-22.5°, and 11.25°-33.75°-33.75°-11.25°, respectively). Middle row shows simulated longitudinal magnetization ($M_z/M_0$ versus position in cm) immediately after the SPAMM sequences. Note the narrower tag lines generated by higher order SPAMM (arrows compare results between 1-1 and 1-3-3-1). Bottom row (left to right) shows representative images, with progression in binomial order and sharpness of the tag lines.

Figure 2.16: 1D tagging and 2D tagging with composite SPAMM pulses. Composite pulses can also generate multi-dimensional tag lines. Top to bottom rows show 1D-x, 1D-y, and 2D SPAMM sequences with example images, respectively. Note that tag lines are perpendicular to the directions of the corresponding dephasing gradients (orange dashed rectangles). In 2D tagging, two 1D tagging sequences are applied sequentially. 
Note that RF pulses 90° out of phase (green) and a crusher gradient of different size (blue) in the second direction are needed to avoid stimulated echoes in 2D tagging.

2.3.2 Tagged CINE MRI

MR tagging has been extensively used with CINE imaging [105]. Figure 2.17 illustrates a simple example of CINE imaging [124]. During acquisition, an ECG signal triggers a pulse sequence. The sequence acquires a specific k-space line or view. Identical pulse sequences are repeated until the next trigger signal is received. With the new trigger signal, a different k-space line is acquired. This process is repeated until all k-space lines are acquired. Images reconstructed from the k-space data sets can be viewed as a CINE loop to reveal the dynamics during an averaged cardiac cycle (over 3 R-R intervals in the example).

Figure 2.17: CINE MRI uses multiple repetitions to reconstruct one single period of motion. [Diagram: an ECG trace with R-R intervals divided into cardiac phases 1 to 10; k-space lines 1, 2, and 3 are acquired in successive R-R intervals and sorted into k-space for each cardiac phase.] During each repetition, the sequence acquires only a specific k-space line or multiple lines (view). The acquired views are re-arranged during reconstruction to obtain k-space at each cardiac phase.

Tagged CINE MRI has been employed in upper airway imaging. Brown et al. [66] used tagged CINE MRI to evaluate tongue and lateral upper airway movement with mandibular advancement, in order to predict treatment efficiency in obstructive sleep apnea patients. For speech production, tagged MRI was utilized in the early 1990s as snapshots at designated points in a tongue movement to visualize the deformation [125, 126, 127]. Later, in the 2000s, tagged CINE MRI was used to analyze the motion of the internal tongue during speech [104, 128, 129]. Recently, it has been utilized to provide images for measurement of 4D tongue motion and to generate an atlas of the human tongue during articulation [130, 131]. 
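The CINE acquisition order described in this section (one k-space line per trigger, repeated at every cardiac phase, then re-sorted by phase) can be sketched as below. This is a simplified illustration that assumes exactly one k-space line per heartbeat and perfectly repeatable motion; the function names are hypothetical.

```python
def cine_acquire(n_lines, n_phases):
    """Return the acquisition order as (heartbeat, cardiac_phase, kspace_line).

    Assumes one k-space line per heartbeat, re-acquired at every cardiac phase.
    """
    return [(beat, phase, beat)
            for beat in range(n_lines)
            for phase in range(n_phases)]


def rebin_by_phase(acquisitions, n_phases):
    """Group acquired k-space lines by cardiac phase before reconstruction."""
    kspace = {phase: [] for phase in range(n_phases)}
    for _beat, phase, line in acquisitions:
        kspace[phase].append(line)
    return kspace


if __name__ == "__main__":
    acq = cine_acquire(n_lines=3, n_phases=10)  # matches the 3-line example
    binned = rebin_by_phase(acq, n_phases=10)
    print(len(acq), binned[0], binned[9])
```

The sketch makes the repetition requirement explicit: the number of motion cycles needed equals the number of k-space lines (or views), and every cycle must reproduce the same motion for the re-sorted data to be consistent.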
This technique requires multiple repetitions. For example, 16 speech utterance repetitions were required for each slice in [128], 144 voluntary head rotations for studying traumatic brain injury in [109], and more than 135 repeated left-to-right eye movements for extraocular muscle motion in [107]. Such CINE methods rely on repetition with perfect synchronization, which allows tagged MRI to be used to analyze cardiac motion [38], [39], as heart beats in sinus rhythm are highly repeatable, independent of rate of contraction [40], and can be easily synchronized with ECG. Robust deployment of tagged CINE MRI in non-cardiac applications has been hindered by variability in motion, such as voluntary effort discrepancy during body movement [106] or normal and natural token and type variability during speech production [132].

2.3.3 Tagged real-time MRI

An alternative to tagged CINE strategies is to use RT-MRI approaches that do not require repetition or synchronization. Tagged RT-MRI has been proposed to reduce scan time for cardiac applications [133, 134, 135, 136]. Tagged RT-MRI with Cartesian sampling was explored for cardiac applications [133, 134]. However, these methods provide only 1D deformation in real time, as they implement fast imaging either by compromising resolution in the phase encoding direction [133] or by acquiring only a small island around a harmonic peak in k-space [134]. They need at least two heartbeats to resolve motion in both directions. Real-time SENC techniques [135, 136], although able to provide quantitative strain for cardiac applications, measure on a plane that is perpendicular to the imaging plane and are not compatible with speech applications.

Chapter 3

Visualizing internal tongue motion: intermittently tagged real-time MRI

In this chapter, we demonstrate a tagging method compatible with RT-MRI for the study of natural human speech production. We apply tagging as a brief interruption of continuous RT-MRI data acquisition. 
We explore the selection of imaging parameters for such speech studies to optimize image quality and tag persistence. We evaluate this method using simulations and in-vivo studies of American English diphthong and consonant production. We show that the proposed method can capture tongue motion patterns and their relative timing through internal tongue deformation, and therefore provides a potential tool for studying muscle function in speech production and similar scientific and clinical applications.

3.1 Methods

3.1.1 Tagged real-time MRI implementation

Experiments were performed on a Signa Excite HD 1.5 T scanner with a custom eight-channel upper-airway coil [40]. The pulse sequence was implemented within a real-time imaging platform (RTHawk Research v2.3.4, HeartVista, Inc., Los Altos, CA, USA) [137].

Figure 3.1 illustrates the acquisition timing and the pulse sequence diagrams for tagging and imaging. As shown in Figure 3.1(a), tagging is applied as a brief interruption to continuous real-time spiral acquisition. A button was added to the RTHawk graphical user interface to allow operator control of intermittent tagging. Manually pushing the button causes the tagging module to be applied right after the current imaging TR and before the next imaging TR. Real-time spiral data acquisition experiences only a brief interruption of less than 6 ms (comparable to one imaging TR). The persistence of the tag grid depends on the longitudinal relaxation (T1) of the tongue muscle and the effect of imaging RF excitation [117].

Figure 3.1(b) illustrates the tagging sequence, which is a standard 2D 1-3-3-1 binomial SPAMM sequence [123, 115], with a 1 cm spacing in both in-plane directions. Two SPAMM pulses were sequentially applied along the x and y axes, followed by crushers to eliminate any remaining transverse magnetization [121]. 
The second composite SPAMM sequence had its phase shifted by 90° relative to the first one [115] and used a different crusher area to avoid stimulated echoes. The overall duration was 5.66 ms.

Figure 3.1(c) illustrates the imaging sequence, which is a standard spiral spoiled gradient echo designed to make maximum use of the gradients (40 mT/m amplitude and 150 mT/m/ms slew rate). The imaging parameters were: FOV 20 cm, slice thickness 7 mm, readout duration 2.49 ms, TE/TR 0.71 ms/5.58 ms, and 13 interleaves with bit-reversed view ordering.

Figure 3.1: Speech RT-MRI with intermittent tagging. (a) Overall acquisition timing, with the trigger applied at $t_0$. Continuous imaging is performed using interleaved spiral GRE imaging (c, blue block; imaging TR 5.58 ms) with view-sharing reconstruction. Thirteen interleaves were utilized to fully sample k-space at each time frame using a bit-reversed interleaf order. Tag placement is performed using two 1-3-3-1 SPAMM pulses along x and y (b, yellow block; tagging TR 5.66 ms). Note that the second composite SPAMM pulse is shifted by a 90° relative phase and uses a slightly larger crusher to avoid stimulated echoes.

Coil-by-coil gridding reconstruction with view sharing was performed on the fly during data acquisition. The Walsh method was used to estimate the sensitivity map for coil combining [138]. We utilized a step size of 5 TRs for the sliding window, resulting in a nominal temporal resolution of 36 frames/sec. The approximate end-to-end reconstruction latency was 27 ms. This setup enables the operator to observe the tag line deformation in real time, to monitor the subject's completion of the designed articulation task, and to determine whether the timing of triggering conformed to design. Concomitant field correction [139] and image unwarping that accounts for gradient nonuniformity [140] were applied with gridding reconstruction. 
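The frame timing implied by these settings can be checked with a short sketch. It uses the TR (5.58 ms), the number of interleaves (13), and the sliding-window step (5 TRs) stated above; the function names and the windowing helper are hypothetical and only illustrate the bookkeeping, not the actual RTHawk reconstruction.

```python
def frame_rate_fps(tr_ms, step_trs):
    """Nominal frame rate when a new frame is produced every step_trs TRs."""
    return 1000.0 / (tr_ms * step_trs)


def frame_windows(n_trs, window_trs, step_trs):
    """Return (start, stop) TR indices for each view-shared frame.

    Each frame uses window_trs consecutive TRs; consecutive frames are
    offset by step_trs, so neighboring frames share TRs when step < window.
    """
    return [(start, start + window_trs)
            for start in range(0, n_trs - window_trs + 1, step_trs)]


if __name__ == "__main__":
    tr_ms, interleaves, step = 5.58, 13, 5
    print(f"fully sampled frame duration: {tr_ms * interleaves:.2f} ms")
    print(f"frame rate without view sharing: {frame_rate_fps(tr_ms, interleaves):.1f} fps")
    print(f"frame rate with step of {step} TRs: {frame_rate_fps(tr_ms, step):.1f} fps")
    print(frame_windows(n_trs=23, window_trs=interleaves, step_trs=step))
```

With these numbers each frame still draws on about 72.5 ms of data, while frames are updated roughly every 27.9 ms, which is consistent with the nominal 36 frames/sec quoted above.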
3.1.2 Selection of acquisition parameters

Tag persistence can be quantitatively evaluated by analyzing the temporal evolution of the tag contrast-to-noise ratio, $\mathrm{CNR}_{tag}$. In the following section, we assume that the steady state signal $M_{ss}$ is reached prior to tagging.

For SPGR sequences, immediately after the tagging sequence at time $t_0$, the longitudinal magnetization can be expressed as [117]:

$$M_z(t_0) = M_{ss,z}\,Q(x,y), \tag{3.1}$$

where $Q(x,y)$ represents the modulation function due to the SPAMM sequence. The longitudinal magnetization immediately before the first RF at time $t_1$, considering $T_1$ recovery, is:

$$M_z(t_1) = M_{ss,z}\,Q(x,y)\,e^{-t_1/T_1} + M_0\left(1 - e^{-t_1/T_1}\right) = M_T + M_R. \tag{3.2}$$

The first term, denoted $M_T$, contains the fading tag information; the second term, denoted $M_R$, contains the recovery toward the equilibrium magnetization $M_0$. We calculate the temporal evolution of tag contrast by considering $n$ consecutive spiral GRE TRs, each with flip angle $\alpha$. Each such imaging RF scales the magnetization by a factor of $\cos\alpha$. The $M_T$ component immediately before the $n$th RF excitation (at time $t_n$) can be expressed as:

$$M_T^{(n)} = M_{ss,z}\,Q(x,y)\,e^{-t_n/T_1}\prod_{j=1}^{n-1}\cos\alpha = M_{ss,z}\,Q(x,y)\,e^{-t_n/T_1}\cos^{n-1}\alpha, \tag{3.3}$$

and the $M_R$ component can be recursively expressed as:

$$M_R^{(n)} = \left[M_R^{(n-1)}\cos\alpha - M_0\right]e^{-(t_n - t_{n-1})/T_1} + M_0. \tag{3.4}$$

Application of RF pulses during imaging contributes to reducing the tag information, as each pulse consumes part of the longitudinal magnetization. An optimal flip angle can be determined as described below.

The contrast in the image is the peak-to-valley difference in the magnetization $M_T^{(n)}$ that is tipped into the transverse plane by the imaging RF. The $\mathrm{CNR}_{tag}$ after the $n$th RF excitation is defined as the ratio between the contrast in the image and the standard deviation $\sigma$ of the image noise:

$$\mathrm{CNR}_{tag} = \frac{R\!\left(M_T^{(n)}\sin\alpha\right)}{\sigma}, \tag{3.5}$$

where $R(\cdot)$ denotes the peak-to-valley difference. 
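Equations 3.3 and 3.4 can be evaluated numerically with a short sketch. It is a minimal illustration under several assumptions that are not taken from the text: the RF pulses are evenly spaced with $t_n = n \cdot TR$ (the tagging module is treated as lasting one TR), the peak-to-valley range of $Q(x,y)$ is 1, and the standard SPGR steady-state expression is used for $M_{ss,z}$. The function names are hypothetical, and the result is the numerator of Equation 3.5 in units of $M_0$, not a CNR.

```python
import math


def spgr_steady_state(alpha_deg, tr_ms, t1_ms, m0=1.0):
    """Longitudinal SPGR steady state just before each RF pulse (units of m0)."""
    e1 = math.exp(-tr_ms / t1_ms)
    return m0 * (1.0 - e1) / (1.0 - e1 * math.cos(math.radians(alpha_deg)))


def tag_evolution(alpha_deg, n_rf, tr_ms=5.58, t1_ms=850.0, q_range=1.0, m0=1.0):
    """Return (t_n, tag contrast signal, M_R) just before each of n_rf RF pulses.

    The tag contrast signal is R(M_T^(n) sin(alpha)) from Eq. 3.3 and 3.5,
    with q_range the assumed peak-to-valley range of Q(x, y).
    """
    a = math.radians(alpha_deg)
    e1 = math.exp(-tr_ms / t1_ms)
    mss = spgr_steady_state(alpha_deg, tr_ms, t1_ms, m0)
    m_r = m0 * (1.0 - e1)  # recovery term of Eq. 3.2 with t_1 = TR
    rows = []
    for n in range(1, n_rf + 1):
        t_n = n * tr_ms
        if n > 1:
            m_r = (m_r * math.cos(a) - m0) * e1 + m0  # Eq. 3.4
        m_t = mss * q_range * math.exp(-t_n / t1_ms) * math.cos(a) ** (n - 1)  # Eq. 3.3
        rows.append((t_n, m_t * math.sin(a), m_r))
    return rows


if __name__ == "__main__":
    for alpha in (3.0, 5.0, 7.0):
        rows = tag_evolution(alpha, n_rf=180)
        t, contrast, m_r = rows[-1]
        print(f"alpha = {alpha:.0f} deg, t = {t:.0f} ms: "
              f"contrast = {contrast:.5f} M0, M_R = {m_r:.4f} M0")
```

As a consistency check, the recursion for $M_R$ converges to the SPGR steady state while $M_T$ decays to zero, so the total longitudinal magnetization returns to its pre-tagging value.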
The simulated σ is calculated as the simulated steady state signal divided by 15, as suggested by previous experiments [40]. The tag persistence can be defined as the time span between the grid being placed and $\mathrm{CNR}_{tag}$ dropping below a certain threshold. Markl et al. suggested a CNR threshold of 6 for cardiac applications [141].

Two healthy volunteers (27/M, 27/F) were scanned to verify tag persistence in the tongue and to identify the optimal imaging flip angle. Fifteen integer flip angles ranging from 1° to 15° were utilized in the experiment. A wide tag spacing of 5 cm was used to mitigate partial volume effects in the post-processing steps. The noise covariance matrix of the coils was measured in a separate scan with the excitation RFs turned off. The measured noise covariance matrix was utilized to pre-whiten the multi-coil data and to calculate the standard deviation of the noise to normalize the result. For each flip angle, a separate scan was employed to measure the steady state signal to properly scale between simulation and measurements.

The subjects were instructed to keep their mouth in a closed neutral position and remain still during the scan to minimize off-resonance and motion artifacts. The peak and valley values were calculated by taking an average over manually selected regions of interest (ROIs). The peak ROI was drawn as two 4-by-6-pixel rectangles in the bright regions of the tongue; the valley ROI was selected as one 3-by-16-pixel stripe at the center of the dark tag lines.

3.1.3 Triggering mechanism

In this study, we tested three different tag-triggering schemes to assess the best utilization of the imaging window after the intermittent tagging sequence. Each involved a specific approach to coordinating the tag triggering by the operator with the speech production by a subject (who read linguistic stimuli projected on a screen). 
Manual triggering

In the manual triggering approach, the subjects were instructed to speak the linguistic stimuli (described below) 10 times with a full pause between each production (to ensure the intermediate return to a neutral vocal tract posture). The operator used the first 2-3 utterances to ascertain the token-to-token rhythm or pacing of the subject for application of the tagging module for the rest of the trials. The operator controlled both the button for the tagging module and the projector presenting the stimuli one utterance at a time.

Table 3.1: American English diphthong stimuli

| Stimuli | Carrier phrase | Target diphthong | Starting posture description | Ending posture description |
|---|---|---|---|---|
| “I” | None | /aɪ/ | Low back | High front |
| “oy” | None | /ɔɪ/ | Mid back | High front |
| “ow” | None | /aʊ/ | Low back | Mid/high back |
| “a buy puppy” | a [·] puppy | /aɪ/ | Low back | High front |
| “a boy puppy” | a [·] puppy | /ɔɪ/ | Mid back | High front |
| “a bow puppy” | a [·] puppy | /aʊ/ | Low back | Mid/high back |

Note: Starting/ending posture description refers to the approximate tongue position when tagging began (if in a carrier, during the “a”) and ended.

Cued triggering

In the cued triggering approach, the MRI operator and the subject were instructed, respectively, to push the triggering button and to read the stimulus immediately upon its visual appearance on the projector screen.

Periodic triggering

In the periodic triggering approach, automatic triggering was implemented in the sequence system. The tagging module was applied every 182 TRs, with a period of approximately 1015 ms, which is equivalent to 14 fully sampled frames when no view sharing is applied. The subjects were instructed to say the stimuli during a 15 sec interval, placing a pause between each individual speech item.

3.1.4 Speech experiments

Four healthy volunteers (2 M, 2 F, 27-31 yrs), all native American English speakers, were scanned. The experiment protocol was approved by our Institutional Review Board, and informed consent was obtained from all volunteers. 
Audio recording and stimuli presentation were adapted from similar protocols successfully used in previous studies [40].

Table 3.1 shows the American English diphthong vowel stimuli used in this experiment [142, 143, 144]. Diphthongs are vowels in which the lingual postures, and their concomitant formant frequencies, require relatively large movements from one vowel target to another in the same syllable [142]. The diphthongs /aɪ/, /ɔɪ/, and /aʊ/ were chosen for this study because they involve substantial movement of the tongue when gliding from initial to final vowel quality, and the duration of this movement (approximately 180 ms to 300 ms [144]) can be thoroughly covered in the current imaging window.

The stimuli were both placed in carrier phrases and presented in isolation, so as to provide variation for investigating the proposed tagging sequence. Diphthong stimuli in isolation were the words/pseudo-words “I,” “oy,” and “ow.” The stimuli in carrier phrases placed the diphthongs after labial consonants in the words “buy,” “boy,” and “bow.” (For “ow,” subjects were instructed so as to ensure that their pronunciation rhymed with “now.”) A [b], a consonant made with lip rather than lingual closure, was used preceding and following the diphthong to minimize any co-articulation with other nearby lingual sounds. The tagging module was triggered in close temporal proximity to the onset of the diphthong. Different motion patterns and their relative timing during the transition between the component postures of the diphthongs were then imaged. The carrier phrase stimuli (“a buy/boy/bow puppy”) are presented in this work.

Table 3.2 shows the consonant stimuli used in the experiment. Stimuli occurred in the pseudo-words “ara,” “asha,” and “acha,” so as to place /ɹ/, /ʃ/, and /tʃ/ between two /ə/s having a relatively neutral vocal tract posture. 
All of these target consonants are produced using a tongue constriction in the anterior oral hard palate area immediately posterior to the alveolar ridge. /ɹ/ (for this speaker) places the tongue tip in a retroflex posture (though other American English speakers are known to make /ɹ/ with a bunched, tip-down posture), and /ʃ/ and /tʃ/ raise the tongue tip and blade up toward the postalveolar area; /ʃ/ retains a small airway opening allowing turbulent airflow, while /tʃ/ has a brief stop of airflow as the tongue fully contacts the palate, followed by turbulent airflow as it draws away.

Table 3.2: American English consonant stimuli
Stimuli | Target consonant | Articulation place and manner | Constriction area
“ara” | /ɹ/ | Retroflex approximant | Lips, pharynx, postalveolar ridge
“asha” | /ʃ/ | Postalveolar fricative | Postalveolar ridge
“acha” | /tʃ/ | Postalveolar affricate | Postalveolar ridge

3.2 Results

3.2.1 Acquisition parameters

Figure 3.2 shows CNR-based threshold time and signal intensity as functions of imaging flip angle. The longitudinal relaxation time of the tongue muscle, T1 = 850 ms at 1.5 T, was measured by inversion recovery fast spin echo (IR-FSE) with multiple inversion times. This value agreed with previous literature [1, 145]. Dashed lines in Figure 3.2(a) indicate the CNR-optimal flip angle that delivers the longest threshold time. The CNR-optimal flip angle increases from 3° to 6.5° as the CNR threshold increases, with higher thresholds giving shorter tag persistence. The Ernst angle for imaging the tongue is α_E = 6.2°, as shown in Figure 3.2(b). The simulation shows a trade-off between CNR-based tag persistence and image SNR when choosing the optimal excitation flip angle.

[Figure 3.2 panels. Top: CNR-based threshold time (msec) versus flip angle (°) for CNR = 4, 5, 6, 7. Bottom: steady-state signal (M0) versus flip angle (°).]
Figure 3.2: Simulation of tag persistence and steady-state signal as a function of imaging flip angle. Top: Threshold time is defined as the time span between the tag being placed and the tag CNR falling below the threshold value (shown for CNR cutoffs of 4, 5, 6, and 7). The dashed line marks the flip angles that deliver the longest threshold time for each CNR threshold. The longest persistence can be reached at a flip angle of 3-6.5°. Performance suffers quickly if the flip angle is too low, but less so if the flip angle is too high. Bottom: Steady-state signal as a function of flip angle for the imaging TR = 5.58 ms and tongue T1 = 850 ms at 1.5 T. The Ernst angle in this case is 6.2°. The actual imaging flip angle was selected based on both tag persistence and steady-state tongue SNR.

Figure 3.3 shows an in-vivo experiment on tag persistence in the human tongue. The measured signal at the center of the tag lines and the peak-to-valley contrast were plotted as functions of time, together with the corresponding simulated curves. The curves were normalized by the standard deviation of noise measured in a separate scan. Only a subset of the tested flip angles (3°, 5°, and 7°, out of 1-15°) is shown in the figure for illustrative purposes.

[Figure 3.3 panels: tag line signal recovery (SNR versus time, 0-1200 msec) and contrast decay (CNR versus time, 0-1200 msec), with measurements for FA = 3°, 5°, and 7°.]
Figure 3.3: Tag persistence in human tongue at 1.5 T. Left: simulation (line) and measurement (symbol) of the tag line signal for the first 1.2 s after the tag module was applied. Right: contrast decay after the tag module was applied. Tongue T1 = 850 ms was measured using an inversion recovery fast spin echo (IR-FSE) sequence with multiple inversion times. The signal and contrast were normalized by the standard deviation of noise, measured by a separate scan with RF excitation turned off.

The measured signal conformed to the simulation for all imaging flip angles.
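The trade-off between tag persistence and steady-state SNR described above can be reproduced qualitatively with a short calculation. The following is a minimal sketch, not the simulation code used in this work: it assumes an ideal spoiled gradient echo signal model, a tag with a 90° total flip angle (so the initial tag contrast equals the steady-state signal and decays by E1·cos(flip) per TR), and a hypothetical noise calibration (`SNR_PER_M0`) chosen so that the 5° steady state has SNR 17. All function names are illustrative, and because the model is idealized its numbers need not match the reported values exactly.

```python
import math

TR_MS = 5.58    # imaging TR reported in the text
T1_MS = 850.0   # tongue T1 at 1.5 T reported in the text


def e1(tr_ms=TR_MS, t1_ms=T1_MS):
    """Longitudinal relaxation factor over one TR."""
    return math.exp(-tr_ms / t1_ms)


def steady_state_signal(flip_deg):
    """Ideal spoiled-GRE steady-state signal in units of M0."""
    a = math.radians(flip_deg)
    return math.sin(a) * (1 - e1()) / (1 - e1() * math.cos(a))


def ernst_angle_deg():
    """Flip angle that maximizes the ideal steady-state signal."""
    return math.degrees(math.acos(e1()))


def threshold_time_ms(flip_deg, cnr_threshold, snr_per_m0):
    """Time until the tag CNR falls to the threshold (0 if it starts below)."""
    a = math.radians(flip_deg)
    cnr0 = snr_per_m0 * steady_state_signal(flip_deg)  # contrast right after tagging
    if cnr0 <= cnr_threshold:
        return 0.0
    decay_per_tr = e1() * math.cos(a)                  # contrast decay factor per TR
    n_tr = math.log(cnr_threshold / cnr0) / math.log(decay_per_tr)
    return n_tr * TR_MS


# Assumption: noise level calibrated so that FA = 5 degrees gives SNR = 17.
SNR_PER_M0 = 17.0 / steady_state_signal(5.0)

for fa in (3.0, 5.0, 7.0):
    t = threshold_time_ms(fa, cnr_threshold=5.0, snr_per_m0=SNR_PER_M0)
    print(f"FA = {fa:.0f} deg: tag CNR stays above 5 for about {t:.0f} ms")
```

Under these assumptions a larger flip angle raises the starting contrast but also speeds up its decay, which is the behavior the flip angle selection has to balance.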
The tag lines for FA = 3°, 5°, and 7° recovered to steady-state SNRs of 13, 17, and 20, respectively, with progressively shorter recovery times. Note that FA = 7° had the highest imaging SNR; however, the faster decay resulted in a CNR drop to 5 in only 600 ms. In contrast, the CNR for FA = 3° and 5° took more than 650 ms to fall to the threshold level, with the latter having 30% higher image SNR in the tongue than the former. In our experience, imaging with a very small flip angle (α < 5°) was sensitive to B1 inhomogeneity in the tongue, because the signal dropped dramatically when the flip angle was unintentionally decreased. As an overall result of the above considerations, we used a flip angle of 5° with an imaging window of around 650-800 ms, with an ending CNR of 5-6. Figure 3.4 shows example images of tag fading.

[Figure 3.4 image grid: rows FA = 3° (SNR = 13, CNR = 5.7), FA = 5° (SNR = 17, CNR = 5.5), and FA = 7° (SNR = 20, CNR = 4); columns at 111 ms, 307 ms, 502 ms, 698 ms, and 893 ms.]
Figure 3.4: Example images of tag fading with imaging flip angles of 3°, 5° and 7°. A wide tag spacing of 5 cm was used to mitigate partial volume effects. At around 700 ms (4th column), FA = 3° and 5° have similar CNR, while the latter has 30% higher SNR. As an overall consideration, we used a flip angle of 5° with an imaging window of around 650-800 ms, with an ending CNR of 5-6.

3.2.2 Triggering mechanism

Manual triggers were likely to miss the beginning of the diphthong even with the operator and the subject synchronized into the same rhythm with practice. The reflex delay of the human operator and the normal speech pacing and production variability of the subjects aggravated the miss rate. Further, the operator’s timing accuracy largely depended on the audible speech that emerged from the scanner, which was compromised by acoustic scanner noise.

Both cued and periodic triggering performed well.
During cued triggering, the normal reflex delay of the subject between seeing the stimuli on the projector and starting articulation was largely matched by the reflex delay of the MRI operator in executing the tagging button press, ensuring that the tag was reliably placed before the target tongue movement. Interestingly, in the periodic tagging protocol, the tagging module interrupted the acoustic sound of the readout gradient heard by the subjects and acted in effect as an auditory metronome, causing them to entrain to the tag triggering rhythm and thereby align their productions with the tagging timing after the first 1-2 triggers. Since no voluntary effort was required from the operator on the triggering side, operator alignment errors were not an issue.

3.2.3 Visualization of tongue deformation

Figure 3.5 uses American English Vowel Charts to provide a rough schema for understanding the tongue positioning. The blue curve in each chart marks the starting and ending points for the three diphthong vowels being studied. These vowels in English are known to produce sweeping lingual motions that move the tongue upward from a depressed and/or retracted posture to a raised and fronted or raised and retracted posture as follows: in /aɪ/ from a low-back posture to a high-front posture, in /ɔɪ/ from a mid-back posture (with lip rounding) to a mid-high front posture, and in /aʊ/ from a low posture to a high-back (lip rounded) posture. In these vowels, as in vowels generally, the tongue is more or less arched; it is not grooved or concave.

Figure 3.6 reveals internal tongue movement during three American English diphthong articulation examples. The videos can be found in Supporting Information Video S1 at Wiley Online.
For orientation, note that /aɪ/ and /aʊ/ start with similar low and retracted tongue postures (note the pharyngeal narrowing); /aɪ/ and /ɔɪ/ end with similar postures of the tongue bunched up high in the palatal vault; and the starting posture of /ɔɪ/ is similar to the ending posture of /aʊ/, with the tongue high and retracted toward the velum (soft palate).

[Figure 3.5 panels: vowel charts for I /aɪ/, OI /ɔɪ/, and OW /aʊ/.]
Figure 3.5: American English Vowel Charts illustrate a rough schema for understanding tongue position observed in the representative frames in Figure 3.6.

Figure 3.6 contains representative frames, illustrating multiple deformation patterns and capturing their relative timing. A shear between different parts of the tongue can be identified as square grids changing into parallelograms. Compression can be identified as square grids changing into bi-concave rectangles. Stretching and curving of the tongue can be identified by bent grid lines. Each of these types of deformation occurred during the course of diphthong articulation. Color arrows mark the start of one specific type of deformation in the representative frames.

[Figure 3.6 image grid: rows I /aɪ/, OI /ɔɪ/, and OW /aʊ/; frames at 36 ms, 176 ms, 315 ms, 455 ms, 595 ms, and 734 ms.]
Figure 3.6: Tagged RT-MRI reveals internal tongue deformations and their relative timing during American English diphthong articulation. Each color indicates the start of a different motion pattern: (left to right) tongue tip deformation (green), shear (cyan), tongue body compression (magenta), and tongue root compression (yellow). Importantly, the relative timing of motion patterns is seen; for example, deformation of the tongue tip (green) was followed by shear (cyan) and finally compression of the tongue root (yellow).

In the case of /aɪ/ (top row), parallelograms emerge at 315 ms (cyan), indicating shear between the tongue body and tongue root. Also at this time, bi-concave rectangles can be observed at the top of the tongue body (magenta). These compressions move the tongue forward and somewhat higher.
Compression of the tongue root happens later (frame 595 ms), further increasing the height of the tongue into the palatal vault (yellow).

In /ɔɪ/ (middle row), forward stretching of the tongue tip was identified by the vertical tag lines in that area starting to curve (green). Then, as the tongue moves forward and higher, upper-lower shear (cyan), compression in the tongue body (magenta), and some tongue root fronting (yellow) are observed in the later frames.

In /aʊ/ (bottom row), we again see early compression and curving of the tongue tip (green). Shear (cyan) appears as the tongue retracts and bunches toward the pharyngeal wall. Compression in both the tongue body (magenta) and the tongue root (yellow) further moves the tongue upward toward the velum.

The representative frames were chosen specifically to show the timing relations of these various internal tongue deformations, documented as the four colors distributed differently in time from left to right. For instance, in the top row, deformation of the tongue body (magenta) and tongue base (yellow) (which can be thought of as the tongue’s ‘undercarriage’) is seen during /aɪ/, with the former happening earlier (approximately 300 ms) than the latter (approximately 590 ms). Another example is tongue tip deformation, which happened early in all diphthongs tested, as indicated by the green arrows on the left.

Figure 3.7 shows diphthongs in carrier phrases: (a) “a buy puppy,” (b) “a boy puppy,” and (c) “a bow puppy.” Supporting Information Videos S2-4 (see Wiley Online) show the three diphthong stimuli in carrier phrases with synchronized audio recording. Intensity-time (x-t) plots are shown in the top rows of (a)-(c); the moment at which the tagging module was applied is indicated at the very top of Figure 3.7 and serves as the temporal alignment point for the figures. Six representative frames are shown in the bottom rows, with green and magenta dashed squares marking the start and end of the diphthong articulations.
(Note that the representative frames in (c) have a shorter time span compared to (a) and (b).) The tag persisted from the beginning of the mid-central /ə/ that preceded the target word in the carrier phrase and successfully visualized deformation of the tongue for the entire course of the target diphthong.

[Figure 3.7 panels (a)-(c): each panel shows the stimulus transcription, the time at which tagging was applied, an x-t plot with lip and tongue traces, and six representative frames. Frame times are 28, 140, 253, 365, 478, and 590 ms in (a) and (b), and 28, 112, 197, 281, 365, and 450 ms in (c).]
Figure 3.7: Tagged RT-MRI reveals deformation relative to the relatively neutral posture of the schwa /ə/ (“a”) of the carrier sentence. Stimuli occurred in carrier phrases: (a) “a buy puppy,” (b) “a boy puppy” and (c) “a bow puppy.” The intensity-time (x-t) plots in the top rows of (a)-(c) indicate tagging timing, and six representative frames are shown across time in each bottom row. Green and magenta dashed squares mark the start and end gestures of the diphthong articulation. Note the deformation differences in the internal tongue among the three diphthongs’ starting postures and across their ending postures (such as the start of /aɪ/ vs. /aʊ/ as in a4 vs. c3, and the start of /ɔɪ/ vs. the end of /aʊ/ as in b4 vs. c5).

The first frames in Figure 3.7(a)-(c) show the tag applied when the tongue started at a mid-central vowel /ə/ (the initial “a” of the carrier phrase), so that all of the deformations in the later frames are relative to this relatively neutral vocalic schwa posture. Note that while (a4) /aɪ/ and (c3) /aʊ/ start with similar low and retracted tongue postures marked by pharyngeal narrowing, differences in the internal tongue can be immediately visualized in the distinct grid deformations. This confirms the subtle distinction between the starting positions of /aɪ/ and /aʊ/, echoed in the American English Vowel Chart in Figure 3.5.
Similarly, the deformational difference between the starting posture of /ɔɪ/ (b4) and the ending posture of /aʊ/ (c5) was clearly evident; more bi-concave rectangles exist in (c5) in addition to the parallelograms in both (b4) and (c5), indicating a horizontal squeeze, which further packs the tongue up toward the palatal vault. This is consistent with the placement in the second and third vowel charts in Figure 3.5.

With a relatively neutral schwa posture (frame 1 in each panel) as a reference, the deformations also indicate regional motion within the tongue: in (b3-4), parallelograms in the middle of the tongue indicate shear serving to retract the tongue body back toward the pharyngeal wall; (a6, b6) indicate horizontal compression squeezing the tongue up toward the palate. Little or no deformation is observed during the maintenance of the most extreme postures such as (a6, b6, c3).

Figure 3.8 shows different deformation patterns in three example consonant stimuli. In /ɹ/, curved tag lines in the tongue tip (green) are evident, indicating the upward ‘bending’ deformation of the tongue front high into the palatal vault. Note that /ɹ/ has three constrictions: at the lips, in the postalveolar region, and in the pharynx; while /ʃ/ and /tʃ/ have only one constriction, in the postalveolar region. Thus in /ɹ/, vertical compression in the tongue body (yellow arrows in /ɹ/) is seen due to the tongue body and root being squeezed toward the pharyngeal wall. This vertical compression is not present in the other two consonant stimuli. In both /ʃ/ and /tʃ/, the x-t waveform shows a highly similar airway shape (i.e., tongue surface contour), as we would expect for the fricative portion (green dash). However, internal differences are visible, presumably arising from the pull-away characteristics of the blade, which remains pressed or stabilized upward more so for /tʃ/ (magenta) than for /ʃ/ (green).
Significantly, the tagged images show internal tongue deformation differences even when tongue surface contours and vocal tract constriction locations are comparable.

[Figure 3.8 panels: for each of /ɹ/, /ʃ/, and /tʃ/ (each between two /ə/), an x-t plot and three representative frames, with frame times between 225 ms and 421 ms.]
Figure 3.8: Tagged RT-MRI shows different deformation patterns (relative to preceding schwa postures) during the articulation of consonants /ɹ/, /ʃ/, and /tʃ/. The intensity-time (x-t) plots in the top rows indicate tagging timing, and three representative frames are shown across time in each bottom row. All stimuli involve constriction with the tongue tip and/or blade (i.e., the tongue front) in the postalveolar region of the vocal tract. Interestingly, the tagged images show internal tongue deformation differences (magenta vs. green) even when tongue surface contours and vocal tract constriction locations are comparable.

3.3 Discussion

We have demonstrated intermittent tagging during RT-MRI as a potential means to visualize internal tongue motion during speech production. This approach eliminates the need for re-binning data using multiple repetitions and is suitable for investigations of natural speech production. We demonstrated a framework to select imaging parameters in consideration of image quality and tag persistence and achieved an imaging window of approximately 650-800 ms at 1.5 T, with imaging SNR ≥ 17 and tag CNR ≥ 5 in the human tongue. This work leverages mature speech RT-MRI techniques [12, 45] to provide adequate spatiotemporal resolution for tagged imaging. The resulting method is able to capture tongue motion patterns and their relative timing, as exemplified by internal tongue deformation during American English diphthong vowels and consonants. This method can also provide images for further quantification of internal tongue motion [128, 131, 108, 146, 147].
3.3.1 Potential applications

The proposed method may provide insight into several open questions in speech science and linguistics. For instance, acoustic studies have shown that the vocalic formants of the initial and terminal portions of a diphthong are not necessarily the same as those found for the simple vowels in monophthongs used to describe them [142]. Hsieh et al. [143] hypothesized that strong biomechanical coupling between starting and ending gestures truncates diphthong articulation, leading to less extreme [a] vowels (as compared to the corresponding monophthong). That study used constriction degree to examine diphthong articulation, by assuming that constrictions can be identified with higher signal intensity in a region of interest. The tagging method proposed here can enable testing of this and similar hypotheses by directly examining the biomechanical subsystems in the tongue.

The proposed method may also serve to provide insights into disease states that affect speech production. CINE-tagging has been used by Lee et al. [147] to assess tongue impairment in amyotrophic lateral sclerosis (ALS) patients and by Stone et al. [129] to investigate articulation variance between post-glossectomy patients and controls. For these applications, the repeated motion required in CINE-tagging could be burdensome for some patients; moreover, re-binning the data requires highly consistent repetitions, which are challenging in impaired speech. This challenge is demonstrated in Supporting Information Video S5 (see Wiley Online). The proposed RT-MRI tagging method can substantially simplify the data acquisition and preclude errors from a re-binning process, at the cost of some resolution and/or SNR. Lastly, tongue muscle movement patterns in obstructive sleep apnea (OSA) patients have been characterized in clinical studies for treatment evaluation [66]. The proposed method with automatic periodic tagging could potentially allow studies during natural sleep.
3.3.2 Triggering

We investigated the performance of the proposed intermittent tagging with three different triggering mechanisms. Cued and periodic tagging performed well in all four subject scans. Although speech rate varies across subjects, the flexible nature of these intermittent-tagging protocols allows the triggering timing to be adjusted accordingly.

3.3.3 Tagging consideration

As a feasibility effort, this work employed a fairly simple tagging module. We used a 1-3-3-1 SPAMM tagging sequence, as established in the literature, and produced high quality visualization of tag grids in the tongue. There exist many alternatives to SPAMM. Several cardiovascular magnetic resonance (CMR) tagging approaches can potentially be adapted for speech applications [105]. Particularly appealing options include HARP [148] and DENSE [119], which allow faster and simpler post-processing and analysis. HARP has been adapted for speech production in the CINE framework [104, 131, 130, 108, 147]. A more rapid EPI-based acquisition was proposed for cardiac HARP [134], in which only the spectral peak of interest was acquired. DENSE provides higher sensitivity and spatial resolution. However, the technique is derived from the Stimulated Echo Acquisition Mode (STEAM) sequence and suffers from low SNR. Phase contrast imaging has been demonstrated for tissue velocity mapping of myocardial motion [149] as well as of skeletal muscle contraction [150]. This technique encodes information about velocity into the phase of the detected signal. Note that all three of these alternatives are phase-sensitive methods; phase errors introduced by unaccounted-for off-resonance need to be carefully considered when adapting them to speech applications [149, 151, 152, 153].

The SPAMM parameters may also be optimized. We used a grid spacing of 1 cm, but this spacing may need adjustment based on the size of the subject.
For example, we expect that a finer grid spacing will be required in smaller subjects, such as young children. The grid spacing may also need modification depending on the specific muscle groups or vocal tract subsystems of interest, so that the grid is fine enough to distinguish the contractions and internal movements of the specific lingual muscle system(s) of interest, such as the tongue tip.

Improvement in tag persistence is also of interest. Variable flip angle (VFA) imaging has been utilized in spiral myocardial tagging to improve contrast throughout the entire cardiac cycle. Ryf et al. [154] applied larger flip angles in the later stage of the imaging cycle to compensate for the faded longitudinal magnetization. This topic will be further explored in Chapter 4.

3.3.4 Imaging consideration

Motion artifacts exist in some of the current results. This is not surprising, as Lingala et al. [40] pointed out that fully sampled single slice RT-MRI cannot resolve all tongue movements, especially during faster-paced speech or movements of intrinsically faster subsystems such as the tongue tip. These artifacts can be mitigated by under-sampling and constrained reconstruction methods, which have yet to be explored in combination with tagging.

Imaging at 3 T is of interest because it could provide longer tag persistence and higher SNR. We conducted all of our experiments at 1.5 T field strength. Previous studies have compared imaging at 1.5 T and 3 T for cardiac applications for balanced steady state free precession (bSSFP) sequences [141]. With the same imaging parameters, the tag persists approximately 25-30% longer at 3 T due to slower T1 relaxation and higher intrinsic imaging SNR in the human tongue. This can be further improved by a smaller flip angle, considering the lower Ernst angle needed for longer T1. However, stronger off-resonance emerges at higher field strength, especially at air-tissue boundaries, where it amounts to approximately 9.4 ppm [155].
This could cause blurring of the grid near the tongue surface, or even total disappearance in subtle structures such as the tongue tip. To mitigate the off-resonance artifacts, dynamic off-resonance correction can be incorporated into the reconstruction pipeline [49, 50]. Subjects with a large proton density fat fraction at the base of the tongue (inferior-posterior) will also suffer from signal dephasing due to the off-resonance of 3.5 ppm between fat and water [156]. This signal loss can be reduced by shortening the readout duration of the spiral acquisition, at the expense of temporal resolution, or by using another sampling pattern with a short readout, such as radial sampling [26].

3.4 Conclusions

We have developed and demonstrated a method for intermittent tagging during real-time MRI of speech production to reveal internal deformations of the tongue. We incorporated 1-3-3-1 SPAMM tagging with rapid spiral GRE to reveal the internal tongue motion during articulation. We showed that this method can capture various motion patterns in the tongue and their relative timing using case examples of American English diphthongs and consonants. The proposed method can potentially provide tools to investigate muscle function or other applications of internal tissue movement in future scientific and clinical research.

Chapter 4
Capturing longer motion patterns in speech production: real-time MRI with REALTAG

Tagged RT-MRI has been proposed to reduce scan time for cardiac applications [133, 134, 135, 136] and, in Chapter 3, to eliminate the need for repetitions in imaging speech production [157]. These techniques are able to provide adequate spatiotemporal resolution, and tag persistence has been the primary limitation. In Chapter 3, tag persistence was demonstrated to be around 650 ms at 1.5 T for the tongue [157]. This duration is sufficient for imaging the production of single syllables in American English.
However, this is not sufficient for longer utterances in which the tongue motions of interest may be on the order of 1 second, such as vowel-to-vowel transitions across words or geminate (i.e., phonologically long) articulations. Tongue deformation patterns occurring in the formation and release of lingual constrictions over these longer speech intervals are not well understood due to limitations of other articulometry methods, which have focused on point-tracking or tongue-/lip-surface imaging. Yet the biomechanical underpinnings of internal tongue deformations remain an important aspect of characterizing speech production and speech motor control, particularly for the complex lingual hydrostat [158].

Several methods can potentially extend the duration of tag persistence. For SPGR, fewer excitation pulses per image can improve contrast [159], but this will increase the readout time and will be limited by the substantial off-resonance at air-tissue boundaries [38]. A bSSFP sequence improves contrast-to-noise ratio (CNR) and tag persistence by improving tissue signal-to-noise ratio (SNR) [160]. However, large off-resonance can introduce banding artifacts. CSPAMM uses two consecutive scans with opposite tagging RF polarity and ramped imaging flip angles to cancel out the tag fading [117]. This technique requires two repetitions and therefore requires gating. REALTAG, proposed by Derbyshire et al. [161], uses a total flip angle (TFA) of 180° for the tagging pulse and phase-sensitive reconstruction to correct the rectified inverted tags. This method prolongs tag persistence by roughly a factor of 2 without increasing scan time, and it is compatible with tagged RT-MRI approaches.

In this chapter, we apply REALTAG to intermittently tagged RT-MRI and demonstrate its successful application to the study of human speech production. We use a spatial low-pass filter to extract and compensate for smooth image phase and isolate tag lines.
We evaluate this method using both simulations and in vivo studies of American English vowel-to-vowel transitions. In experiments focused on the human tongue, the proposed method extended tag persistence by a factor of 1.9.

4.1 Methods

4.1.1 Imaging methods

A combinatorial 1-3-3-1 SPAMM sequence was employed with a TFA of 180° (proposed) and a TFA of 90° (original) implementation [157]. The tagging pulses were intermittently triggered to place a 2D grid of tag lines with 1 cm spacing during the continuous SPGR imaging. The imaging parameters were: 13 spiral interleaves, field of view 20 cm, slice thickness 7 mm, readout duration 2.49 ms, echo time/TR 0.71 ms/5.58 ms, and isotropic in-plane resolution 2 mm. Gradient delay was accounted for during pre-scan. First-order shimming was interactively performed in a carefully selected region of interest, including both the oral tongue-air interface and the pharyngeal tongue root, while the subject was instructed to sustain a relaxed open mouth posture.

[Figure 4.1 flowchart: coil-combined images (A) → low-pass filter → extract phase φ = ∠(·) (B) → phase compensation, multiplication by exp(−iφ) → Re{·}, phase sensitive recon (C) → max{0, ·} → final images (D). Example images show magnitude and phase (masked for visualization), with the phase color scale in radians.]
Figure 4.1: Phase sensitive reconstruction flowchart (left) and example images from intermediate steps (right). Low resolution and tag-line-free phase (B) was estimated for each time frame using a 1/5 synthesized k-space center (20-by-20) after gridding and coil combining (A), as the tag harmonic peaks are located at 20 points from the center of a 100-by-100 k-space with 2 mm image resolution and 1 cm tag spacing. Final images (D) were generated by taking the non-negative values from phase sensitive reconstruction (C) for better visualization. Note the bright spots existing at the intersections of tag lines in (D) due to double inversion.
In practice, we found this shimming procedure to be crucial to reduce off-resonance at the tongue surface boundary and to effectively enhance the performance of phase sensitive reconstruction. Concomitant field correction [139] and image unwarping that takes into account gradient nonuniformity [140] were applied with gridding reconstruction. The gridding reconstruction was applied to the fully sampled 13-interleaf spiral data coil-by-coil with a sliding window of 1 TR, providing a frame rate of 178 frames/s. ESPIRiT [162] was used to estimate coil sensitivity during pre-scan for coil combining.

Figure 4.1 illustrates the phase sensitive reconstruction pipeline. After gridding and coil combining, we selected a k-space center and implemented a low pass filter by zero padding. We denote the tag spacing as $\Delta_{\mathrm{tag}} = \alpha\delta$, where $\delta$ is the image resolution and $\alpha$ is a scaling factor. The tag pattern can be written as a finite cosine series having a fundamental frequency of $1/\Delta_{\mathrm{tag}}$ [163]. In k-space, modulating by this frequency results in replicating the image k-space at certain harmonic frequencies [118]. Using $\delta = 1/(2k_{\max})$, these harmonic frequencies are located at

$$k_{\mathrm{tag}} = \frac{N}{\Delta_{\mathrm{tag}}} = \frac{2 N k_{\max}}{\alpha}, \qquad (4.1)$$

with $N = \pm 1, \pm 2, \ldots$, where $2k_{\max}$ is the k-space extent. The width of the low pass filter $W$ was chosen such that it only passes the k-space inside half of the lowest harmonic frequency, that is,

$$W = \frac{2 k_{\max}}{\alpha}. \qquad (4.2)$$

We enforced non-negative values in the final images to perceptually increase contrast between dark tag lines and tongue tissue for visualization purposes.

4.1.2 Experiments

Experiments were performed on a Signa Excite HD 1.5 T scanner with a custom eight-channel upper-airway coil [40]. The pulse sequence was implemented using the RTHawk Research v2.3.4 custom real-time imaging platform (HeartVista, Inc., Los Altos, CA, USA) [137]. We used a previously described tagged RT-MRI technique with an imaging flip angle of 5° [157].
The experiment protocol was approved by our Institutional Review Board, and informed consent was obtained from all volunteers.

4.1.3 Tag persistence

Tag persistence with TFA = 90° (original) and 180° (proposed) was simulated and compared with in-vivo experiments. Five healthy volunteers (2 M/3 F, age 28-36 years) were scanned to verify tag persistence in the tongue. In this sub-study, subjects were instructed to keep their mouth in a closed neutral position and remain still during the scan to minimize off-resonance and motion artifacts. A wide tag spacing of 5 cm was used to mitigate partial volume effects in the post-processing steps. A separate scan was employed to measure the steady state signal to properly scale between simulation and measurements. The coil noise covariance matrix was measured by a separate scan with excitation RFs turned off, in order to pre-whiten the multi-coil data and to normalize the result [164]. CNR_tag is defined as the ratio between the tag line peak-to-valley difference in the image and the standard deviation of the image noise. The peak and valley values were calculated by averaging over manually selected regions of interest (ROIs). The peak ROI comprised two 4-by-6-pixel regions in the bright areas of the tongue; the valley ROI was one 3-by-16-pixel stripe at the center of the dark tag lines.

Table 4.1: Vowel-to-vowel transition stimuli
Stimuli | Instructed pronunciation | Target starting posture | Target ending posture | Measured transition duration∗
“A pie again” | /paɪ/ | Low back | High front | 175±25 ms
“A poppy again” | /papɪ/ | Low back | High front | 232±15 ms
“A pop pip again” | /pap pɪp/ | Low back | High front | 328±22 ms
Underlines mark target words (column 1) or target vowel sounds (column 2).
∗ The transition duration is measured as mean ± std in 10 trials for 2 speakers during speech production, for the time from the initiation of the first vowel’s constriction to the end of release of the second vowel’s constriction.
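The CNR_tag definition in Section 4.1.3 can be illustrated with a short calculation. This is a minimal sketch on a hypothetical synthetic image, not the analysis code used in this work; the function names, the image values, and the ROI positions are assumptions made for illustration, with only the ROI sizes taken from the text.

```python
import numpy as np


def roi_mean(image, rois):
    """Mean pixel value over a list of (row_slice, col_slice) ROIs."""
    values = np.concatenate([image[r, c].ravel() for r, c in rois])
    return float(values.mean())


def cnr_tag(image, peak_rois, valley_rois, noise_std):
    """Tag peak-to-valley difference divided by the noise standard deviation."""
    return (roi_mean(image, peak_rois) - roi_mean(image, valley_rois)) / noise_std


# Hypothetical 32x32 image: bright tissue (value 10) with one dark tag stripe (value 2).
image = np.full((32, 32), 10.0)
image[14:17, 8:24] = 2.0                      # 3-by-16-pixel dark stripe

peak_rois = [(slice(6, 10), slice(8, 14)),    # two 4-by-6-pixel bright regions
             (slice(20, 24), slice(8, 14))]
valley_rois = [(slice(14, 17), slice(8, 24))]

print(cnr_tag(image, peak_rois, valley_rois, noise_std=1.0))  # 8.0
```

In practice `noise_std` would come from the separate noise-only scan described above, after pre-whitening.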
Speech Production Experiments

Two volunteers (28/M, 28/F), both native American English speakers, were scanned in the second sub-study. The text for the speech stimuli was projected onto a screen visible to the subject in the scanner through a mirror. Audio recording was synchronized with the data acquisition [165]. Upon visual appearance of the task stimuli on the projector screen, the MRI operator and scan subject were instructed, respectively, to push the triggering button and to read the stimulus [157].

Table 4.1 details the stimuli used in this experiment. Three stimuli targeting the same vowel-to-vowel transition were placed in the carrier phrase “a [·] again.” All three stimuli involve a transition from (approximately) the same starting vowel sound /a/ to (approximately) the same ending vowel sound /I/. The three stimuli are the monosyllable “pie” with diphthong /aI/, “poppy” with the two vowels /a/ and /i/ in successive syllables in a word, and “pop pip” with the two vowels /a/ and /I/ in adjacent words. These English vowels in sequence produce sweeping lingual motions that move the tongue from a low-back (pharyngeal constriction) posture to a high-front (oral palatal constriction) posture. The approximate sweeping or transition time becomes increasingly longer in duration (all else equal) from the monosyllable “pie” to the bisyllable CVCV (CV: consonant-vowel) “poppy” to the cross-word CVCCVC sequence “pop pip.” Importantly, the words’ medial consonant [p] is made with the lips rather than the tongue, so it was used near the target vowel-to-vowel transition to minimize coarticulation with (i.e., interference of) other nearby lingual sounds. The tagging module was triggered in close temporal proximity with the onset of the starting mid-central vowel /ə/ (the initial word “a” of the carrier phrase), so that the lingual deformations in the later frames are relative to this fairly neutral vocalic schwa posture of the tongue.
4.2 Results

4.2.1 Off-resonance and the low pass filter width

Figure 4.2 illustrates the phase of a typical un-tagged image after low pass filtering is applied, with various widths. The subject is holding an open-mouth posture with large off-resonance. White arrows identify spatial variation of the phase due to off-resonance. Off-resonance within the tongue is minimal. Larger off-resonance only exists near other articulator boundaries, such as the hard palate and pharyngeal wall. A filter width of $2k_{\max}/\alpha$ provides an adequate estimate of the image phase for REALTAG phase sensitive reconstruction.

Figure 4.3 illustrates phase compensation using the low pass filter with different widths $W$. A smaller $W$ can cause over-smoothed phase estimation and shading artifacts, indicated by the yellow arrow. A larger $W$ can introduce spurious phase information from the spectral replicas created by tagging and rectify the negative-valued tag lines, indicated by the white arrows.

[Figure 4.2 panels: image phase, LR phase and error for $W = 0.5k_{\mathrm{tag}}$, $k_{\mathrm{tag}}$, $1.5k_{\mathrm{tag}}$ and $2k_{\mathrm{tag}}$.]

Figure 4.2: Phase of an un-tagged image after low pass filtering with different widths. Dashed lines indicate the designed filter width $W = k_{\mathrm{tag}} = 2k_{\max}/\alpha$ (in k-space samples). White arrows identify sharp spatial variations in phase due to off-resonance. This variation is suppressed when using a low pass filter with small width (see black arrow in $0.5k_{\mathrm{tag}}$). A filter width of $W = k_{\mathrm{tag}}$ allows for a frequency smaller than one half of the first harmonic frequency, minimizing phase contamination from the replicas due to tagging, while providing an adequate estimation of the background phase. Comparison among the $W = k_{\mathrm{tag}}$, $1.5k_{\mathrm{tag}}$ and $2k_{\mathrm{tag}}$ columns indicates that a larger filter width does not significantly improve the phase estimation in the tongue. Further, a larger filter width introduces unwanted artifacts in the tag lines, as shown in Figure 4.3.
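The reconstruction steps described in Section 4.1 and evaluated above (low pass filtering of the k-space center, phase removal, and enforcing non-negative values) can be sketched as follows. This is a minimal NumPy illustration on a synthetic 2D image, not the pipeline used in this work: the function name `phase_sensitive_recon`, the rectangular filter shape, and the test pattern are all assumptions, and gridding and coil combination are taken as already done.

```python
import numpy as np

def phase_sensitive_recon(img, alpha):
    """Sketch of phase sensitive reconstruction for a coil-combined complex image.

    img   : complex 2D array.
    alpha : tag spacing in pixels (Delta_tag = alpha * delta).
    Returns (windowed, signed): the non-negative final image and the signed real image.
    """
    ny, nx = img.shape
    kspace = np.fft.fftshift(np.fft.fft2(img))
    # Full filter width W = 2*k_max/alpha, i.e. N/alpha k-space samples per axis,
    # so only frequencies below half of the first tag harmonic are kept.
    wy = max(1, int(round(ny / alpha)))
    wx = max(1, int(round(nx / alpha)))
    mask = np.zeros((ny, nx))
    y0, x0 = ny // 2 - wy // 2, nx // 2 - wx // 2
    mask[y0:y0 + wy, x0:x0 + wx] = 1.0
    # Zero-padded k-space center gives a smooth, low-resolution phase estimate.
    low_res = np.fft.ifft2(np.fft.ifftshift(kspace * mask))
    phase = np.angle(low_res)
    signed = np.real(img * np.exp(-1j * phase))   # tag lines may be negative
    windowed = np.clip(signed, 0.0, None)         # enforce non-negative values
    return windowed, signed

# Synthetic example (assumed values): tag period of 8 pixels on a smooth phase.
n, alpha = 128, 8
x = np.arange(n)
m = 0.4 + 0.6 * np.cos(2 * np.pi * x / alpha)     # signed tag pattern
phi = 0.8 * np.sin(2 * np.pi * x / n)             # smooth background phase
img = np.tile(m * np.exp(1j * phi), (n, 1))
final_img, signed_img = phase_sensitive_recon(img, alpha)
print(float(np.abs(signed_img[0] - m).max()))
```

With a smooth background phase the signed image recovers the negative tag-line values, which magnitude reconstruction would rectify.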
4.2.2 Improved tag persistence

Figure 4.4 compares simulated and measured tag CNR decay for TFA = 90° in the original implementation and TFA = 180° with REALTAG in the proposed method. The left plot shows that the experimental measurements are consistent with simulations. The starting CNR is doubled in the proposed method, as expected with homodyne detection compared to magnitude detection. The tag persistence is therefore prolonged with the same decay rate. Example images on the right show the decay of tag lines with 5 cm spacing during an in-vivo scan. CNR drops below 6 at 650 ms (1st row, 3rd column) for TFA = 90°, while it reaches the same CNR level around 1250 ms for TFA = 180° with REALTAG (3rd row, 3rd column). Note that a minimum CNR threshold of 6 has been used for both cardiac [141] and speech applications [157]. The windowed final image (3rd row) shows perceptually darker and higher contrast compared to the phase sensitive reconstruction (2nd row) by enforcing non-negative pixel values.

[Figure 4.3 panels: uncompensated image, and phase-compensated images for $W = 0.2k_{\mathrm{tag}}$, $k_{\mathrm{tag}}$, $1.5k_{\mathrm{tag}}$ and $2k_{\mathrm{tag}}$.]

Figure 4.3: Phase compensation by the low pass filter with different widths $W$. A smaller $W$ can cause over-smoothed phase estimation and shading artifacts, indicated by the yellow arrow. A larger $W$ can introduce spurious phase information from the spectral replicas created by tagging and rectify the negative-valued tag lines. The white arrow indicates a bright spot in the tag lines caused by the unwanted phase compensation in the tag lines.

Figure 4.5 compares tag persistence between the original method and the proposed method in 5 subjects (from our first sub-study). The 2nd, 3rd and 4th rows compare the two methods with CNR thresholds of 7, 6 and 5, respectively. In all subjects and all cases, the proposed method significantly improved the persistence by a factor of >1.9×. For instance, the 3rd row shows the tag persistence was prolonged from 621 ms to 1251 ms with a CNR threshold of 6.
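The statement that doubling the starting CNR prolongs persistence at the same decay rate can be illustrated with a simple model. The sketch below assumes a mono-exponential CNR decay, CNR(t) = CNR0 · exp(−t/τ), which is a simplification introduced here for illustration (the simulations in this work are not reproduced); the function names are hypothetical. Under this assumption a 2× gain adds τ·ln 2 to the persistence regardless of threshold, so the 621 ms and 1251 ms values at a CNR threshold of 6 imply an effective τ of roughly 900 ms.

```python
import math

def persistence_ms(cnr0, tau_ms, threshold=6.0):
    """Time at which cnr0 * exp(-t / tau_ms) falls to the CNR threshold (assumed model)."""
    if cnr0 <= threshold:
        return 0.0
    return tau_ms * math.log(cnr0 / threshold)

def implied_tau_ms(t_original_ms, t_proposed_ms, cnr_gain=2.0):
    """Decay constant implied by two persistence times that differ only by a CNR gain."""
    return (t_proposed_ms - t_original_ms) / math.log(cnr_gain)

# Persistence values reported for a CNR threshold of 6.
tau = implied_tau_ms(621.0, 1251.0)
# Starting CNR of the original method implied by the assumed model (not a measured value).
cnr0_original = 6.0 * math.exp(621.0 / tau)
t_orig = persistence_ms(cnr0_original, tau)
t_prop = persistence_ms(2.0 * cnr0_original, tau)
print(f"tau = {tau:.0f} ms, original = {t_orig:.0f} ms, proposed = {t_prop:.0f} ms")
```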
4.2.3 Tongue deformation with improved contrast

Figure 4.6 shows representative images by the two methods during the production of the target speech stimuli (second sub-study). (A), (B) and (C) show the subject speaking “a pie again,” “a poppy again” and “a pop pip again,” progressing in time from the starting to the ending vowel constriction postures. For both methods in each stimulus, the arrows in the audio waveform indicate the relative timing of the representative images. In all stimuli, both the original and proposed method were able to visualize the deformation for the starting vowel /a/ (indicated by horizontally compressed bi-concave rectangles), as it was within the persistence time (396-486 ms). However, the results differed across stimuli for the ending vowel /I/ (evidenced by vertically compressed bi-concave rectangles).

[Figure 4.4 panels: (left) CNR decay, CNR versus time (ms), with measurement and simulation curves for TFA = 90° and TFA = 180°; (right) example images at 37, 341, 644, 947 and 1251 ms for TFA = 90°, TFA = 180° phase sensitive reconstruction, and TFA = 180° windowed final image.]

Figure 4.4: Simulated and measured tag CNR decay for TFA = 90° in the original implementation and for TFA = 180° with REALTAG in the proposed method. (Left) Solid lines show simulations; symbols and error bars show mean and standard deviation of the measurement, respectively. The in-vivo CNR measurement with 5 subjects is consistent with the simulation. (Right) Representative images from one subject with 5 cm tag spacing. The time stamp resides in the center of the 13 TRs that fully sample the images. Gray dashed panel outlines indicate that CNR drops below 6 around 650 ms for TFA = 90°; it reaches the same level around 1250 ms for TFA = 180° with REALTAG. Windowed final images (3rd row) have perceptually darker and higher contrast compared to the phase sensitive reconstruction (2nd row) by enforcing non-negative pixel values.
In the diphthong case (A), the vocalic tongue movement spanned a relatively short duration as the two vowel postures are in the same syllable; so, both methods can capture the ending deformation (orange). In (B) the two vowel sounds are in two successive syllables with one intervening segment, with the result that the ending vowel /I/ occurred around the cut-off time for the conventional method. The new proposed method, denoted with blue dashed lines in (B), provides clear internal tongue deformation through the vowel transition endpoint, while the tag line in the original method becomes obscured by this time point. Finally, in (C) the two vowels were located in two words with two intervening segments between the vowels, and consequently were further in time from each other. Crucially, while the tag lines in the original method nearly disappear by the time of the second vowel’s constriction, the proposed method is able to maintain the tag lines with CNR > 8 (gold) through the vocalic articulation of the second syllable.

[Figure 4.5 panels: persistence comparison, persistence (ms) for the original and proposed methods at CNR thresholds of 7, 6 and 5; example images at 37 ms and at the threshold times (original: 520, 621 and 745 ms; proposed: 1127, 1251 and 1374 ms).]

Figure 4.5: Time threshold improvement by using REALTAG in 5 in-vivo scans. The 2nd, 3rd and 4th rows compare the two methods with CNR thresholds of 7, 6 and 5, respectively. For all 5 subjects, the proposed method significantly improved the persistence by a factor of >1.9×.

[Figure 4.6 panels: original and proposed methods at the starting and ending vowels for stimuli (A), (B) and (C), with frame time stamps and audio waveforms on a 0-1200 ms axis.]

Figure 4.6: Representative images during the speech stimuli production. (A), (B) and (C) show the subject speaking “a pie again,” “a poppy again,” and “a pop pip again,” respectively, progressing from the starting to the ending target vowel vocal tract constrictions. Arrows in the audio waveform indicate the time points of the selected images. The proposed method captures the tongue deformation of the starting and the ending vowels for all three stimuli (right: red, blue and gold), while CNR by the original method drops below threshold in the latter two cases and the tag lines become obscured (left: blue and gold).

4.3 Discussion

4.3.1 REALTAG for the tongue

We have demonstrated intermittent tagging with REALTAG to extend the tag persistence in RT-MRI. This approach can provide 2× initial CNR and >1.9× longer persistence without repetitions and is suitable for investigations of natural speech production. We illustrated a method for phase sensitive reconstruction that provides better image quality compared to the existing intermittent tagging method [157]. We demonstrate a usable imaging window of 1250 ms at 1.5 T, with imaging CNR ≥ 6. The proposed method is able to capture internal tongue deformation during American English vowel-to-vowel transitions in separate words. This method provides a powerful new tool for imaging muscle movement in natural speech production and other similar RT applications, particularly where CINE imaging is not applicable or suboptimal due to its repetition requirements and the natural variability of human action.

4.3.2 Imaging consideration

Recently, RT speech MRI has been performed at a broad range of field strengths from 0.55 T [166] to 3 T [47, 45, 44, 167, 168] with adequate image quality. The proposed method could be particularly useful at low field strengths where both muscle $T_1$ is shorter and the baseline SNR is lower.
The $T_1$ is approximately 30% shorter at 0.55 T compared to 1.5 T [169]. This will cause approximately 30% shorter persistence but can be compensated by the 2× improvement achieved using REALTAG.

Derbyshire et al. used a bSSFP sequence in their original CMR REALTAG to further extend the tag persistence [161]. Further, bSSFP provides superior contrast between myocardium and blood, which offers additional improvement in image quality. However, in our implementation we did not use bSSFP. In our experience, banding artifacts inevitably appeared at the tongue tip and tongue body boundary due to the 9.4 ppm (600 Hz at 1.5 T) off-resonance at the air-tissue boundary. Shorter TR and less efficient acquisition have to be used to mitigate signal nulling. This topic remains for future work, which will need to consider the trade-off between spatiotemporal resolution and banding artifact removal.

4.3.3 Phase estimation

Another difference between our method and the original CMR REALTAG is phase estimation. CMR REALTAG uses a fixed linear fitting to the phase within an automatically selected static ROI [161]. We did not directly adopt this pipeline for two reasons. Firstly, linear phase is a reasonable model for cardiac imaging as the myocardial ROI is distant from the surface coils relative to the diameter of the coils [161]. Our custom eight-channel coil assembly wraps around the subject’s jaw anteriorly to both lateral sides and was designed to be in close proximity to the upper airway (see Lingala et al. [40]). Therefore, the linear phase assumption does not hold. Secondly, speech production involves rapid and irregular tongue movement relative to the phased array coils, introducing constantly varying image phase. Hence the assumption of a fixed image phase is not valid. A dynamic approach has been used to effectively depict and track articulators’ phase for off-resonance correction during speech production [50].
For all of these reasons, we use frame-by-frame estimates of smooth phase using the images themselves.

The design of the low pass filter determines the accuracy of the estimated phase. In our experience, a low-pass filter that is 10-20% wider than the designed width can be used to estimate the phase without any perceivable artifacts in the tag lines. This may outperform the current choice due to a more accurate homodyne detection [170] but inevitably introduces invisible phase errors from the tag information in the harmonic peaks. This trade-off between phase compensation and errors can potentially be resolved by using other advanced image phase estimation methods, such as ESPIRiT with virtual conjugate coils (VCC-ESPIRiT) [171]. VCC-ESPIRiT enforces smooth image phase (therefore avoiding phase errors in tag lines), while implicitly imposing data consistency from the entire k-space rather than only the synthesized center.

4.3.4 Tag line visualization

We enforced non-negative values in the final images to perceptually increase contrast between dark tag lines and tongue tissue. This process also eliminated the bright background noise from the phase sensitive reconstruction and therefore provided better visualization. It is worth noting that speech-scientist observers reported only one qualitative drawback with the REALTAG approach, which was the bright dots at the intersection of tag lines during earlier phases (< 400 ms), due to double-inversion. This would not be present in 1D tagging, and for 2D tagging could be easily read through with practice.

4.4 Conclusions

We demonstrated improved real-time tagged MRI with substantially increased tag persistence using REALTAG. The tag persistence was roughly 1250 ms compared to 650 ms with the prior conventional approach. This enables capturing longer motion patterns in speech production, such as lingual vowel-to-vowel transitions, and provides a powerful new window to study tongue muscle function.
Chapter 5

Seeing functional endotypes of OSA: real-time multi-slice MRI during continuous positive airway pressure

OSA is characterized by repetitive cessation of airflow due to physical narrowing or collapse of the airway as a result of anatomical and physiological abnormalities in pharyngeal structure [3]. This collapse is typically attributed to excessive soft tissue elements, such as the tongue, velum, uvula and epiglottis, and/or increased collapsibility of the pharyngeal airway [4]. Three-dimensional static MRI provides superb contrast and resolution to reveal the anatomical structures that potentially contribute to airway collapse [41, 28]. Respiratory-gated CINE techniques have been proposed to measure the airway change during tidal breathing, where multiple respiratory cycles are used to form one cycle of dynamic images [42]. Recently, 3D RT-MRI during natural sleep [29] and 2D RT-MRI during wakefulness [30] have been demonstrated alongside synchronized recording of physiological signals similar to polysomnography (PSG). These techniques have demonstrated a unique ability to identify airway collapse sites during natural sleep.

PSG is the standard technique for the diagnosis of sleep apnea, involving monitoring and recording multiple physiological signals in parallel that together reflect sleep physiology [172]. Research PSG can utilize a sealed facemask connected to a positive/negative pressure source to enable rapid switching between pressure levels so as to emulate the collapse of the upper airway (UA) during sleep. However, as PSG lacks visualization of pharyngeal structures, it cannot provide any information regarding the position and level of airway narrowing or collapse. Prior studies [29, 30] have applied inspiratory occlusion, such as the Mueller maneuver (MM), to observe airway collapsibility during simultaneous dynamic MRI and PSG. However, MM is a voluntary effort with poor reproducibility [173].
Previous studies [174] have also shown that MM is inherently unable to identify all types of collapse [175].

Continuous positive airway pressure (CPAP) acts as a pneumatic splint to prevent upper airway collapse and has been proven to be the most efficacious treatment for OSA to date [176]. Prior studies indicate that CPAP manipulation can be used to determine upper airway physiological traits, by alternating between therapeutic and sub-therapeutic levels [64, 65]. The direct effects of CPAP on soft tissues surrounding the upper airway have been extensively studied using static MRI [177]. However, the underlying mechanisms of airway tissue response to pressure change remain unclear. Due to acquisition speed and spatial coverage constraints, the relationship between soft tissue collapsibility and physiological traits of OSA is not completely understood.

In this chapter, we apply and assess a simultaneous multi-slice (SMS) RT-MRI technique [30] to image and quantify upper airway changes during rapid changes in CPAP pressure level. We use this tool to determine if RT-MRI during CPAP can be used to measure neuromuscular reflex and/or passive collapsibility of the upper airway in individuals with OSA.

5.1 Methods

5.1.1 Experiments

Four adolescent subjects with OSA and obesity (3M/1F) and 3 healthy volunteers (3M) were studied. The experiment protocol was approved by our Institutional Review Board. Written informed consent was obtained from all adult subjects and volunteers, and from the parents of subjects younger than 18 years of age. Subjects were scanned starting at 8 pm, and were instructed to refrain from consuming caffeine for 24 hours prior to the study. Total scan time per subject was 2 to 4 hours. We performed the experiments on a 3T GE Signa HDxt MRI scanner (GE Healthcare, Waukesha, WI) with gradients capable of 40 mT/m amplitude and 150 T/m/s slew rate.
A body coil was used for RF transmission, and a 6-channel carotid coil (NeoCoil, Pewaukee, WI) was used for signal reception.

During the MRI scan, we monitored and collected several physiological signals to determine sleep/wakefulness. All instrumentation was either noted as MRI compatible by the manufacturer or was tested and verified to contain no metallic components by our group. An optical fingertip plethysmograph (Biopac Inc., Goleta, CA) was used to monitor heart rate and oxygen saturation. A respiratory transducer (Biopac Inc., Goleta, CA) and the scanner’s built-in respiratory bellows (GE Healthcare, Waukesha, WI) were used to measure respiratory effort at the lower chest and abdomen.

A facemask (Hans Rudolph Inc., Kansas City, MO) covering both nose and mouth was used to measure airway pressure and to provide positive pressure for CPAP testing. Small-bore tubing from the mask port led to an MP-45 pressure transducer (Validyne Engineering Inc., Northridge, CA) for measurement of mask pressure. The inspiratory port of the mask was connected to a Philips Respironics System One CPAP machine (Respironics Inc., Murrysville, PA) through an extension tube with a length of 5 meters. Both the CPAP machine and the monitoring devices were located alongside the MRI console to enable the MRI scanner operator to change the mask pressure level during the scan and to monitor sleep/wakefulness in real-time during the study [65].

5.1.2 MRI protocol

All subjects first underwent overnight polysomnography in a sleep laboratory, which determined the therapeutic CPAP level. During each scan, the CPAP level in the facemask was alternated between the therapeutic value and 4 cm H2O. A positive pressure of 4 cm H2O is required to overcome the resistance of the long extension tube connecting the CPAP machine and the facemask. A representative scan protocol is shown in Figure 5.1. Each scan began with the pressure level at 4 cm H2O.
A 10-min pressure ramp was generated to gradually raise the CPAP level from 4 cm H2O to the pre-determined therapeutic level to avoid discomfort. Then the CPAP pressure level was maintained at the therapeutic level to facilitate sleep. The resting airway area ($A_{\mathrm{eupnea}}$) was recorded as an average value across a 20 s time span during which pressure was maintained at the therapeutic level. When the CPAP was dropped, there was an immediate reduction of airway cross sectional area as the airway narrowed. The reduction of airway area ($\Delta A_d$) can be measured by subtracting $A_{\mathrm{eupnea}}$ from an average value of airway area during the sub-therapeutic period after a 2-3 breath transition time. This increased upper airway resistance led to an increase in respiratory drive. The effect of this increase on the upper airway can be measured when rapidly restoring CPAP back to the therapeutic level, as the overshoot ($\Delta A_r$) of the airway area relative to the resting airway area $A_{\mathrm{eupnea}}$. For a CPAP drop to be used to measure the desired traits, no arousal related to apnea/hypopnea could occur during the sub-therapeutic period [65]. This alternating procedure was repeated 2-3 times during each scan, with at least 2-min intervals, resulting in a total scan time of 20-30 minutes.

We used an SMS golden angle radial fast gradient echo sequence to acquire real-time images [30]. This provided 1 mm in-plane spatial resolution and 4 simultaneous slices (2 retroglossal and 2 retropalatal), with 96 ms temporal resolution. Imaging parameters were: 5° flip angle, 200 samples per readout, FOV 20 × 20 cm², TE/TR 3.7/6.5 ms, slice thickness/gap 7/3 mm. Standard static volume localizer scans were performed to identify and prescribe the imaging slices.

5.1.3 Data analysis

We used a semi-automated region-growing algorithm [102] to segment the airway in each 2D slice. We manually placed 2-4 seeds into the airway in each slice for the first time frame. The algorithm then grew a region-of-interest that included the entire airway for all time frames.
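Seeded region growing of a dark airway can be sketched generically as below. This is a textbook 4-connected flood fill with a fixed intensity threshold, written here only to illustrate the idea; it is not the algorithm of [102], and the function name `grow_region`, the threshold, and the toy image are assumptions. The cross-sectional area follows from the pixel count and the 1 mm in-plane resolution.

```python
from collections import deque

def grow_region(image, seeds, threshold):
    """Collect pixels darker than `threshold` that are 4-connected to any seed."""
    rows, cols = len(image), len(image[0])
    queue = deque(s for s in seeds if image[s[0]][s[1]] < threshold)
    region = set(queue)
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region and image[nr][nc] < threshold:
                region.add((nr, nc))
                queue.append((nr, nc))
    return region

# Toy slice (assumed values): air is dark (0), tissue is bright (100).
toy_slice = [
    [0,   100, 100, 100, 100],   # isolated dark pixel, not part of the airway
    [100, 0,   0,   0,   100],
    [100, 0,   0,   0,   100],
    [100, 100, 100, 100, 100],
]
region = grow_region(toy_slice, seeds=[(1, 2)], threshold=50)
pixel_area_mm2 = 1.0 * 1.0          # 1 mm in-plane resolution
area_mm2 = len(region) * pixel_area_mm2
print(area_mm2)
```

For a time series, one plausible extension is to reuse the previous frame's region as the seeds for the next frame.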
Cross-sectional areas were calculated based on the segmented airway.

Figure 5.1 and Figure 5.2 show representative examples of a healthy volunteer and an OSA subject, respectively. Upper airway loop gain (UALG) represents the stability of neuromuscular reflex systems to recover from sudden ventilation reduction. Note that UALG is distinct from upper airway gain (UAG) defined in Ref [65]. The latter is a quantification of the airway reflex based on ventilation curves, while UALG is determined by direct measurement of cross sectional areas. We calculated UALG by taking the ratio of the overshoot $\Delta A_r$ to the area drop $\Delta A_d$ marked in Figure 5.1. This calculation is valid for healthy volunteers and patients whose sub-therapeutic sections underwent no interruption by apnea/hypopnea events. However, we frequently observed cases in which the sub-therapeutic periods were perturbed by apnea/hypopnea events in OSA patients, as shown in Figure 5.2. In such cases, the measurement accuracy would be reduced by the severe fluctuation of the airway if the same method was used. Therefore, we employed a direct measure of airway collapse and reopening, as follows. We identified each apnea/hypopnea event by facemask pressure and bellows signal, highlighted by arrows in Figure 5.2. We then averaged across all these time segments to estimate the area drop $\Delta A_d$. Similarly, we located the airway area reopening by examining the facemask pressure curve and detecting the 1-3 breath resurgence after each event. We determined airway reopening $\Delta A_r$ by subtracting $A_{\mathrm{eupnea}}$ from the average across all of the detected segments. UALG was calculated as the ratio of airway area reopening $\Delta A_r$ to the area drop $\Delta A_d$.

The fluctuation of airway area (FAA) represents passive collapsibility of the upper airway. We determined FAA by the standard deviation of airway area normalized by the mean value, in therapeutic and sub-therapeutic sections, respectively.
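The UALG and FAA definitions above reduce to a few arithmetic steps, sketched here for clarity. The function names, the use of positive magnitudes for both the drop and the overshoot, the choice of the population standard deviation, and all numerical values are assumptions made for illustration; they are not measured data.

```python
from statistics import mean, pstdev

def area_drop(a_eupnea, subtherapeutic_areas):
    """Delta A_d as a positive magnitude: resting area minus mean sub-therapeutic area."""
    return a_eupnea - mean(subtherapeutic_areas)

def area_overshoot(a_eupnea, recovery_areas):
    """Delta A_r as a positive magnitude: mean area over recovery breaths minus resting area."""
    return mean(recovery_areas) - a_eupnea

def ualg(overshoot, drop):
    """Upper airway loop gain: ratio of overshoot to area drop."""
    if drop == 0:
        raise ValueError("area drop must be non-zero")
    return overshoot / drop

def faa(areas):
    """Fluctuation of airway area: standard deviation normalized by the mean."""
    return pstdev(areas) / mean(areas)

# Hypothetical airway areas in mm^2 (illustration only).
a_eupnea = mean([99.0, 101.0, 100.0])            # resting area at therapeutic CPAP
drop = area_drop(a_eupnea, [78.0, 82.0, 80.0])   # sub-therapeutic period
overshoot = area_overshoot(a_eupnea, [108.0, 112.0, 110.0])  # first breaths after recovery
gain = ualg(overshoot, drop)
fluctuation = faa([90.0, 110.0, 100.0, 100.0])
print(f"UALG = {gain:.2f}, FAA = {100 * fluctuation:.1f}%")
```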
We calculated the mean value and the standard deviation of UALG for each subject to evaluate the stability of neuromuscular reflex systems. The mean value and standard deviation of the FAA were also calculated for both OSA patients and the control group. We performed Student’s t-test between the two groups to evaluate the statistical difference.

Figure 5.1: CPAP pressure level manipulation. A representative example of CPAP drop/recovery of a healthy volunteer is shown to illustrate physiological changes during the process. The bottom graph shows CPAP being alternated between the pre-determined therapeutic level of 8 cm H2O and the sub-therapeutic baseline level of 4 cm H2O. The effects of this manipulation on airway area, facemask pressure and breathing effort are shown in the top 3 graphs. The resting airway area ($A_{\mathrm{eupnea}}$) is determined by averaging the airway area before the CPAP drop. When the CPAP is dropped, there is an immediate narrowing of the airway ($\Delta A_d$), resulting in a ventilation reduction. This reduction in ventilation stimulates the respiratory drive to increase breathing effort. The effect of increased drive on upper airway recovery can be measured by the overshoot of the airway area ($\Delta A_r$) when rapidly recovering CPAP to the therapeutic level after 1 minute. $\Delta A_r$ is calculated by subtracting the mean airway area of the first 2-3 breaths after the CPAP recovery from $A_{\mathrm{eupnea}}$.

To evaluate the intra-subject reproducibility, one OSA patient and one healthy volunteer were removed from the MRI after one scan, given a short break, and then re-positioned into the scanner for a second scan. Results from both scans were then compared. We repeated the measurements by alternating the CPAP pressure level 3 times within each scan. We determined the physiological traits of 2 adjacent slices, in order to exclude large variation from different airway sites. Intra-class correlation (ICC) between the 2 scans was calculated.

Figure 5.2: Results from a representative OSA patient (Male, AHI 50.0, BMI 40.5) illustrate the measurement of airway area change when there are interruptions due to airway collapse. The collapse and recovery of the airway were directly measured when the sub-therapeutic interval was interrupted by airway collapse and/or arousal. Apnea/hypopnea events are highlighted by the arrows in the facemask pressure curve. The drop $\Delta A_d$ was calculated by subtracting the mean value of area across all collapsing sections from the resting airway area $A_{\mathrm{eupnea}}$. The airway recovery in response to the stimulated respiratory drive was determined by measuring the mean value across the 1-3 breaths immediately following the apnea/hypopnea events.

Figure 5.3: RT-MRI during rapid CPAP change. Shown are (Bottom row, left to right) four different time points marked in the graph (upper left). Red contour shows segmented results in the bottom images. The CPAP was turned to 11 cm H2O at time point (a). The rows correspond to 3 slices, marked with similar colors in the localizer image (upper right). The airway shape change during tidal breathing at a sub-therapeutic pressure, shown in the bottom-most two rows, is primarily in the lateral (right-left) direction. This suggests more passive tissue structures exist in the lateral walls, which may be relevant when planning surgical intervention.

5.2 Results

5.2.1 Visualizing the physiological fluctuation

Figure 5.1 contains a representative airway area curve from a healthy volunteer. We observed in healthy volunteers that the response of airway area to CPAP pressure change matched the ventilation curves from Ref [65].

Figure 5.2 contains a representative result from an OSA subject.
The sub-therapeutic section was frequently interrupted by apnea/hypopnea events, in contrast to that of the healthy volunteer. Airway recovery was observed at the end of each apnea/hypopnea event, typically across a span of 1-3 breaths. In almost all cases, the recovery following airway narrowing had a larger amplitude Ar than the overshoot measured in the control group, indicating a more dramatic change in muscle tone in response to airway collapse. The tidal-breathing-induced fluctuation of cross-sectional area in the OSA patients was also at least 2-3× larger than that in the healthy volunteers.

Figure 5.3 shows four example frames of dynamic MRI during a CPAP drop, from the same data set shown in Figure 5.2. The three columns represent three slices, marked with the same colors in the localizer image. The four rows, marked (a)-(d), represent four time points, highlighted in the area vs. time curve at bottom left. The images demonstrate that SMS RT-MRI provides adequate temporal resolution to resolve airway dynamics during dramatic fluctuations in cross-sectional area.

5.2.2 Statistical findings

Table 5.1 lists the UALG and FAA for all subjects. There was no statistically significant difference in UALG between the OSA patients and the control group. However, we observed that OSA subjects with higher AHI had higher UALG. Table 5.1 also lists FAA in the therapeutic and sub-therapeutic intervals. The OSA group had more severe fluctuations of airway area than the control group.

Table 5.1: Upper airway loop gain and fluctuation of airway area comparison between OSA patients and the control group.

Subject    Gender  AHI (events/hr)  UALG       FAA (sub-therapeutic)  FAA (therapeutic)
OSA 1      M       17.3             0.16±0.12  44.8%                  15.7%
OSA 2      M       50.0             3.01±1.61  37.2%                  25.5%
OSA 3      F       81.8             4.71±4.96  48.6%                  22.1%
OSA 4      M       10.3             0.60±0.42  28.8%                  9.7%
Control 1  M       −                0.42±0.41  10.5%                  4.0%
Control 2  M       −                1.60±1.49  13.6%                  9.2%
Control 3  M       −                1.60±2.16  13.0%                  5.0%

OSA: Obstructive sleep apnea; AHI: Apnea/hypopnea index; UALG: Upper airway loop gain; FAA: Fluctuation of airway area.
Note: There was no clear difference in UALG between the 2 groups. However, OSA subjects with higher AHI tended to have larger UALG, which implies a less stable neuromuscular control system in the upper airway. There was a significant difference (see Table 5.3) in FAA between the 2 groups, indicating that OSA patients in the cohort had more collapsible and less stable airways.

Table 5.2 lists representative results from 4 distinct slices from the OSA subject in Figure 5.2 and illustrates the variation among different airway sites. The first slice has the largest UALG and the most dramatic fluctuation during the sub-therapeutic interval, indicating that it possesses the least stable neuromuscular response and the least stable airway structure, and is therefore likely to be the most collapsible site. This interpretation is reinforced by the blue curve in Figure 5.2, which shows this slice significantly narrowed and fully collapsed near the 70-80 s interval.

Table 5.2: UALG and FAA results for different slices of one representative OSA patient.

Slice  UALG       FAA (sub-therapeutic)  FAA (therapeutic)
1      4.21±1.16  40.6%                  27.2%
2      2.59±0.93  33.1%                  24.3%
3      2.57±0.91  29.4%                  28.8%
4      1.80±0.21  32.4%                  27.6%

UALG: Upper airway loop gain; FAA: Fluctuation of airway area.
Note: We list results from four axial slices from one representative OSA patient to illustrate the variation among airway sites. Slice #1 has the largest UALG and FAA during the sub-therapeutic section, indicating it possessed the least stable neuromuscular reflex and the most passive airway tissue, and therefore was the most collapsible site.

Table 5.3 and Table 5.4 compare the FAA and the mean value of the airway area between the OSA subjects and the control group. Table 5.3 shows that the airways of the two groups had statistically different fluctuation characteristics, with p-values less than 0.05 for both the sub-therapeutic and therapeutic sections. Table 5.4 shows that the two groups possess the same range of airway size during the sub-therapeutic section, with no statistically significant difference (p = 0.672). However, the right column shows that CPAP remarkably dilates the airway for OSA patients during the therapeutic section, due to their less stable airways.

Table 5.3: FAA is significantly larger in OSA subjects and can be reduced under therapeutic airway pressure.

Group    Cohort size  FAA (sub-therapeutic)    FAA (therapeutic)
OSA      4            42.6%±9.4%               18.3%±7.0%
Control  3            13.6%±0.6% (p = 0.003)   6.2%±2.6% (p = 0.04)

OSA: Obstructive sleep apnea; FAA: Fluctuation of airway area.
Note: Shown are the FAA during the sub-therapeutic and therapeutic sections. There was a statistically significant difference between the OSA and control groups (Student’s t-test, p < 0.05) for both sections. Increasing CPAP pressure, as done in the therapeutic section, reduced the magnitude of the difference.

Table 5.4: Airway area mean value during CPAP indicates significantly less stiff airway in the sampled OSA patients.

Group    Cohort size  Airway area mean value (sub-therapeutic)  Airway area mean value (therapeutic)
OSA      4            83.3±53.7                                 147.1±72.6
Control  3            79.1±51.1% (p = 0.672)                    110.8±65.1% (p = 0.007)

OSA: Obstructive sleep apnea; FAA: Fluctuation of airway area.
Note: A Student’s t-test was used to compare the mean airway area during the sub-therapeutic and the therapeutic sections. There was no significant difference between the 2 groups during the sub-therapeutic section. CPAP was able to remarkably dilate the airway for OSA patients, who possess more passive airways. This implies that airway stiffness, instead of the anatomic profile, has an important role in maintaining airway patency in the sampled OSA cohort.

Table 5.5 shows representative results for intra-subject reproducibility. Although UALG had a large standard deviation across different airway sites for each subject, the intra-subject test-retest result indicates good repeatability within adjacent slices for both the OSA patient (ICC = 0.714) and the healthy volunteer (ICC = 0.757). ICC values for FAA were all higher than 0.76, indicating good reliability of the measurements.

5.3 Discussion

We present a novel MRI-based experiment that measures UALG and FAA, which are valuable for the study of sleep-related breathing disorders. We utilized SMS RT-MRI and CPAP with carefully designed pressure changes. This new test is valuable because conventional PSG with AHI measurement only provides an estimate of the overall severity of OSA and cannot localize specific airway sites that are prone to collapse. In contrast, the proposed experimental design can directly measure location-specific active (UALG) and passive (FAA) physiological traits and visually resolve airway dynamics.

Table 5.5: Intra-subject reproducibility of RT-MRI during CPAP.

Subject  Scan    UALG       FAA (sub-therapeutic)  FAA (therapeutic)
OSA      Scan 1  2.94±0.72  27.3%±3.1%             24.9%±1.6%
OSA      Scan 2  2.65±1.51  31.2%±4.1%             20.7%±4.5%
OSA      ICC     0.714      0.823                  0.761
Control  Scan 1  0.19±0.06  7.6%±1.5%              3.1%±1.0%
Control  Scan 2  0.21±0.14  7.9%±1.7%              3.3%±1.2%
Control  ICC     0.757      0.865                  0.878

OSA: Obstructive sleep apnea; UALG: Upper airway loop gain; FAA: Fluctuation of airway area; ICC: Intra-class correlation.
Note: One OSA patient and one control volunteer were scanned twice in the same session, with subject removal and replacement, to determine intra-subject test-retest reproducibility. Two adjacent slices were used. For each scan, the measurements were repeated by alternating the CPAP pressure level 3 times. ICC for UALG and FAA measurements were calculated for both subjects.
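The ICC values reported in Table 5.5 summarize test-retest agreement between the two scans. The text does not state which ICC form was used, so the following is only a minimal sketch that assumes a one-way random-effects ICC(1,1) computed from a one-way ANOVA. The function name `icc_oneway` and the sample values are hypothetical and are not the study data.

```python
from statistics import mean


def icc_oneway(ratings):
    """One-way random-effects ICC(1,1) (an assumed ICC form, see lead-in).

    ratings: list of rows, one row per measured target (e.g. one slice at
    one CPAP repetition), each row holding k repeated measurements
    (e.g. scan 1 and scan 2).
    """
    n = len(ratings)
    k = len(ratings[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need >= 2 targets with the same number (>= 2) of repeats")

    grand_mean = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]

    # Between-target and within-target mean squares from one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (v - m) ** 2 for row, m in zip(ratings, row_means) for v in row
    ) / (n * (k - 1))

    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("all measurements are identical; ICC is undefined")
    return (ms_between - ms_within) / denominator


# Hypothetical test-retest pairs (scan 1, scan 2); NOT values from the study.
example_pairs = [[2.9, 2.7], [3.4, 3.1], [2.1, 2.4], [3.8, 3.6], [2.5, 2.2], [3.0, 3.3]]
print(f"ICC(1,1) = {icc_oneway(example_pairs):.3f}")
```

Other ICC forms (for example, two-way models that separate a systematic offset between scans) can give different values for the same data, so the form should be reported alongside the number.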
5.3.1 OSA patients: less stable neuromuscular control systems and higher collapsibility

We observed that OSA subjects with higher AHI had higher UALG, suggesting that the OSA cohort has less stable neuromuscular control systems of the upper airway. The OSA group also exhibited larger fluctuations of airway area than the control group. This suggests that OSA subjects in the cohort possess less stable airways with greater collapsibility.

Occlusion studies can potentially measure biomarkers for passive and anatomical risk factors for OSA, such as closing pressure and compliance [30, 178]. In addition to these measurements, we demonstrate that the proposed experiment has the potential to estimate the active factors of the upper airway in response to collapse. Furthermore, CPAP provides accurate pressure control, while occlusion studies produce negative pressure only and suffer from more variability due to inconsistent voluntary respiratory effort.

5.3.2 Experimental considerations

It is possible to scan patients during wakefulness with occasional occlusions [30]; however, we have found CPAP to be more patient-friendly. Previously, subjects reported discomfort introduced by short occlusions during wakefulness. Patients with OSA typically have previous CPAP experience, which facilitates comfort and increases the likelihood of sleep in the MRI scanner. We gradually increased the pressure level before the scan procedure in order to minimize the chance of interrupting sleep. In our experience, all subjects did fall asleep during MRI scanning while wearing the CPAP apparatus (4 of 4 patients and 3 of 3 volunteers in this study).

It is important to measure active muscle reaction to airway collapse during natural sleep. During wakefulness, there is additional variability in UALG measurement. We speculate that this is due to different neuromuscular mechanisms during wakefulness, stiffer muscle tone, and airway motion due to swallowing.
We observed differences in measured UALG and FAA between sleep and wakefulness for all subjects. Specifically, we observed that during wakefulness, reopening is restrained and the overshoot after CPAP recovery is reduced.

Previous studies [8, 65, 179] that used PSG and CPAP to determine physiological traits had to exclude significant amounts of data where there were arousal interruptions. In those studies, the measurements of airway reaction were based on physiological modeling and the assumption that the ventilation drive compensation to a CPAP drop is due to a re-opened airway. Our observation (for all OSA patients and healthy volunteers) was that the airway did not necessarily reopen in response to the CPAP drop unless the airway itself underwent significant narrowing or total collapse. This could mean either that (a) the assumption in the previous studies is not correct, or (b) the subjects were not in a stable and sufficiently deep stage of sleep. In both cases, the upper airway never became totally passive.

This observation was made possible because the proposed MRI experiment includes direct measurement of cross-sectional area. Previous studies using static MRI have shown enlarged airway area with progressively increased pressure [177, 180]. However, with improved spatial coverage and enhanced temporal resolution, the fully resolved dynamics reveal that airway area depends on many factors in addition to the pressure level, including the specific airway section and muscle tone status.

5.3.3 Toward seeing the endotypes

The upper airway includes the pharynx, which is a structurally and physiologically complicated system serving multiple functions. Also, OSA is a heterogeneous syndrome, with several structural and physiological pathways [58, 59]. Therefore, we expect variation in both UALG and FAA across different airway sites. This study documented large variability in these quantities across patients and slice locations.
This establishes the value and importance of using simultaneous multi-slice imaging for this application. We observed that OSA 2 and 3 had 3-8× larger UALG than OSA 1 and 4, and that OSA 1, 2, and 3 had 1.5-3× larger FAA than OSA 4. We speculate that these large variations are due to different weighting of active/physiological and passive/anatomical factors for these subjects, because they represent different endotypes and severities of OSA. We also observed large intra-subject variation. For example, for OSA 2, slice 1 has the largest value for both UALG and FAA, potentially indicating that the region of slice 1 should be given higher priority for treatment for this patient. These observations suggest the possibility of personalized treatment for OSA patients [59].

This preliminary study has several limitations. First, we had a small cohort (4 patients and 3 controls), and the findings need to be confirmed in a larger sample. Second, we used a relatively large slice thickness of 7 mm, which is insufficient to fully resolve the motion of certain structures of interest, such as the uvula, during airway collapse. This, in combination with the 3 mm slice gap, makes it difficult to account for through-plane motion, which could introduce additional bias/variation. Third, our 2D area segmentation is based on a region-growing algorithm and was not optimized to handle rapid movement of the subject. In rare cases we needed to manually segment the airway when adjacent frames did not possess adequate airway overlap. 3D segmentation [181] with improved spatial coverage and adequate resolution is needed and remains future work. Fourth, this experiment would benefit from natural sleep in the scanner; however, this is not always practical.

5.4 Conclusions

In conclusion, we demonstrate a novel experiment that simultaneously measures upper airway active and passive traits regarding OSA, including physiological and anatomical factors, potentially enabling detailed endotyping of OSA patients.
By performing SMS RT-MRI during CPAP, we reveal that airway behavior in OSA patients exhibits large variation. Patients may deserve personalized examination before proceeding to specific treatment. We also demonstrate that the proposed approach can help locate the most collapsible airway sites, which have higher treatment priority, along with the specific possible cause (anatomical or physiological). With these demonstrated results, we also expect that this experiment can be used in other procedures, such as detailed CPAP titration or surgical planning.

Chapter 6

Concluding remarks

Magnetic Resonance Imaging provides arguably the most powerful tool for evaluating upper airway anatomy and function. It enables versatile contrast for soft tissues in a non-invasive way without any ionizing radiation. Ongoing research has advanced the resolution and imaging speed to time-resolve the dynamics of the upper airway.

RT-MRI of the upper airway is a promising research direction as it allows imaging the upper airway during natural human activities, such as speech production and respiration in sleep. Fast SPGR sequences, non-Cartesian sampling, parallel imaging with phased array coils, and constrained reconstruction enable superior spatiotemporal resolution during RT imaging.

However, being fast alone is not enough. RT-MRI techniques have been used extensively to image the dynamics of upper airway shaping, with a focus on tracking the air-tissue interface at articulators, vocal tract surfaces, and/or the pharyngeal wall. The function of the moving upper airway has not been extensively studied, as RT-MRI lacks the ability to visualize internal muscle movement, and few investigations have measured dynamic physiological aspects of the upper airway in motion. Visualization and measurement of these activities are crucial to understanding how function is controlled in health and how it is disrupted in disease.
This dissertation has presented methods to image the function of the upper airway with two applications: speech production and sleep disorder.

For speech production, I focused on arguably the most important articulator, the tongue, by visualizing its internal motion through tagged tissue deformation. I demonstrated intermittent tagging during RT-MRI. This approach eliminates the need for re-binning data using multiple repetitions and is suitable for investigations of natural speech production. I explored imaging parameters to maximize the image contrast between the tagged and non-tagged tissue, while leveraging mature speech RT-MRI techniques to provide adequate spatiotemporal resolution to capture tongue motion patterns and their relative timing. I further proposed a phase-sensitive inversion technique, named REALTAG, to double the tag line contrast and therefore extend the tag persistence during RT-MRI. This approach can provide 2× CNR and > 1.9× tag persistence without repetitions. The proposed tagging methods are exemplified during intra- and inter-word American English vowel-to-vowel transitions. These methods can provide images for quantification of internal tongue motion, and they offer a powerful new tool for imaging muscle movement in natural speech production and other similar RT applications, particularly where CINE imaging is not applicable or is suboptimal due to its repetition requirements and the natural variability of human action.

For sleep disorder, I focused on proposing methods to help endotyping of obstructive sleep apnea. I presented a novel MRI-based experiment that measures upper airway loop gain (UALG) and fluctuation of airway area (FAA), which are two crucial physiological traits that mediate the severity of OSA and are valuable for the study of sleep-related breathing disorders.
I utilized SMS RT-MRI to increase spatial coverage compared to conventional RT-MRI, making it possible to observe multiple airway slices in the same scan. We used CPAP with carefully designed pressure changes to enable measurement of the target physiological traits. This new test is valuable because conventional PSG with AHI measurement only provides an estimate of the overall severity of OSA and cannot localize specific airway sites that are prone to collapse. The proposed experimental design can directly measure location-specific active (UALG) and passive (FAA) physiological traits and visually resolve airway dynamics simultaneously.

Future directions

Near term

I have discussed future possibilities in each of the above chapters. Among those discussed, I believe several will be the most valuable in the near term.

Chapter 3 pointed out that motion artifacts exist in some of the current results, as fully sampled single-slice RT-MRI cannot resolve all tongue movements. One possible solution is to use constrained reconstruction with undersampling to mitigate the motion artifacts. My preliminary results show that a temporal finite difference constraint can efficiently reduce undersampling and motion artifacts.

Chapter 4 identified a trade-off between phase compensation and errors, which may be resolvable using other advanced image phase estimation methods. One possible candidate is VCC-ESPIRiT [171], which enforces smooth image phase (therefore avoiding phase errors in tag lines) while implicitly imposing data consistency from the whole k-space rather than only the synthesized center.

Speech production

MR tagging provides the opportunity to quantify internal muscle motion. HARP [148] allows faster and simpler post-processing for quantitative analysis. HARP has been adapted for speech production in the CINE framework [104, 131, 130, 108, 147]. The measured deformations have been shown to quantify muscle mechanics and have been used to develop atlases of motion within the tongue [131].
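The harmonic-phase idea behind HARP can be illustrated with a toy one-dimensional example. This is only a sketch under strong simplifying assumptions, not the HARP implementation of [148]: it assumes a sinusoidal tag pattern, a uniform (rigid) shift, and a profile length that is a whole number of tag periods, whereas HARP applies a bandpass filter to 2D images to obtain a local phase at every pixel. The function names and the tag period are hypothetical.

```python
import cmath
import math


def tag_profile(n_samples, period, shift):
    """1D tagged intensity profile 1 + cos(k * (x - shift)), with k = 2*pi/period."""
    k = 2 * math.pi / period
    return [1.0 + math.cos(k * (x - shift)) for x in range(n_samples)]


def harmonic_phase_shift(profile, period):
    """Estimate a uniform shift from the phase of the spectral peak at the tag frequency.

    Demodulating at the tag frequency isolates the first harmonic; its phase
    equals -k * shift as long as |k * shift| < pi (no phase wrapping).
    """
    k = 2 * math.pi / period
    peak = sum(s * cmath.exp(-1j * k * x) for x, s in enumerate(profile))
    return -cmath.phase(peak) / k


# Hypothetical numbers: 64 samples, tag period of 8 samples, true shift of 1.5 samples.
reference = tag_profile(64, 8.0, 0.0)
deformed = tag_profile(64, 8.0, 1.5)
estimated = harmonic_phase_shift(deformed, 8.0) - harmonic_phase_shift(reference, 8.0)
print(f"estimated shift: {estimated:.3f} samples")
```

Because the estimate comes from signal phase, any additional phase error (for example from off-resonance) maps directly into a displacement error, which is why phase errors matter for quantitative use.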
Another option to quantify muscle movement is phase contrast imaging, which has been demonstrated for tissue velocity mapping in myocardial motion [149] as well as in skeletal muscle contraction [150]. This technique encodes information about velocity into the phase of the detected signal. Note that both methods are phase-sensitive; phase errors introduced by unaccounted off-resonance need to be carefully addressed when adapting them to quantitative measurement in the upper airway [149, 151, 152, 153].

Sleep disorder

The MR tagging methods developed for speech production can be used to evaluate the motion patterns of dilator muscles, including the tongue, during sleep. Brown et al. [66] used MR tagging during mandibular advancement to predict the treatment outcome. The same group showed in another study [182] that there exist three different motion patterns in the tongue muscle during forced mandibular advancement. More interestingly, they showed that “en bloc” tongue deformation was associated with positive treatment outcomes among obese participants. This is echoed by the finding that obese individuals who do not get OSA are “protected” by an augmented reflex response of airway dilator muscles [63]. The proposed tagging methods can significantly simplify the data acquisition process and allow continuous scanning, which potentially enables experiments during natural sleep. This will fill in the missing puzzle piece (C) in Figure 1.3 in Chapter 1 and provide a comprehensive MRI-based imaging tool-set for endotyping of OSA.

New imaging paradigms

The recent excitement regarding low-field RT speech MRI [166] has provided a look into the bright future of upper airway imaging. Less off-resonance and a quieter scanning environment are particularly appealing for RT speech and sleep imaging. The proposed methods can be adapted to scan at low field. At low field strengths, muscle T1 is shorter and the baseline SNR is lower.
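The link between a shorter T1 and faster tag fading can be sketched with a simple relaxation model. The sketch below assumes that tag contrast decays purely as exp(−t/T1), which ignores the additional decay caused by repeated RF excitation as well as noise, so it is a rough illustration rather than a prediction of the actual imaging window. The T1 value, the contrast threshold, and the function names are placeholders; a T1 ratio of 0.7 (a 30% reduction) is used as an example.

```python
import math


def tag_contrast(t_ms, t1_ms):
    """Relative tag contrast at time t_ms after tagging, assuming pure T1 recovery."""
    return math.exp(-t_ms / t1_ms)


def persistence_ms(t1_ms, threshold):
    """Time for the relative contrast to fall to `threshold` (fraction in (0, 1))."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    return -t1_ms * math.log(threshold)


T1_REFERENCE_MS = 1000.0  # placeholder muscle T1, not a measured value
T1_RATIO = 0.7            # example: T1 that is 30% shorter
THRESHOLD = 0.5           # placeholder: contrast considered usable above 50%

reference_window = persistence_ms(T1_REFERENCE_MS, THRESHOLD)
shorter_window = persistence_ms(T1_REFERENCE_MS * T1_RATIO, THRESHOLD)
print(f"reference T1: {reference_window:.0f} ms, shorter T1: {shorter_window:.0f} ms")
print(f"window ratio: {shorter_window / reference_window:.2f}")
```

Under this model the persistence time scales linearly with T1 regardless of the chosen threshold, so the window ratio equals the T1 ratio.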
The T1 is approximately 30% shorter at 0.55 T compared to 1.5 T [169]. This will impose challenges such as lower SNR and faster tag line decay. However, these could be mitigated by adjusting the imaging strategy, such as using more efficient data acquisition. For instance, a longer readout can be used. Another option is to use a different imaging sequence that maximizes image SNR, such as bSSFP. Notably, both strategies can be used to extend the tag persistence and therefore compensate for the faster tag decay of the proposed tagging method at lower field strength.

bSSFP provides superior SNR and contrast between muscle and fluid, which offers additional improvement to image quality and tag persistence in the myocardial REALTAG [161]. My ongoing research has investigated upper-airway imaging using bSSFP. My preliminary results show banding artifacts in the tongue tip and at the tongue body boundary due to the 9.4 ppm (600 Hz at 1.5 T) off-resonance at the air-tissue boundary. Careful shimming and shorter TR have to be used to mitigate signal nulling. However, shorter TR inevitably results in less efficient sampling. Other solutions to mitigate the banding artifacts include frequency-modulated SSFP (fmSSFP) [183] or wide-band bSSFP [184]. Recently, Roeloffs et al. [185] used fmSSFP with subspace reconstruction to demonstrate banding-free, high-SNR 3D stack-of-stars imaging without intermediate preparation phases. This technique can be adapted for dynamic imaging. This direction remains promising for creating longer tag persistence at all field strengths.

Coda

RT-MRI of the upper airway has been a pursuit of fast imaging. We can leverage the maturing fast scan methods to provide adequate spatiotemporal resolution, while introducing novel techniques and experiment designs to go beyond fast. We can set our sights beyond the anatomical structures onto the even more interesting yet intrinsically complex functions of the upper airway.
With this work, and other ongoing research, we will start to unveil the intriguing human upper airway functions.

Bibliography

[1] Andrew D. Scott, Marzena Wylezinska, Malcolm J. Birch, and Marc E. Miquel. Speech MRI: Morphology and function, sep 2014.
[2] Christina Hagedorn, Tanner Sorensen, Adam Lammert, Asterios Toutios, Louis Goldstein, Dani Byrd, and Shrikanth Narayanan. Engineering Innovation in Speech Science: Data and Technologies. Perspectives of the ASHA Special Interest Groups, 4(2):411–420, apr 2019.
[3] Patrick J Strollo and Robert Rogers. Obstructive sleep apnea. Current Treatment Options in Neurology, 334(2):99–104, 1996.
[4] Emilia Sforza, William Bacon, Thomas Weiss, Anne Thibault, Christophe Petiau, and Jean Krieger. Upper airway collapsibility and cephalometric variables in patients with obstructive sleep Apnea. American Journal of Respiratory and Critical Care Medicine, 161(2 I):347–352, 2000.
[5] David P. White. Central Sleep Apnea. In Principles and Practice of Sleep Medicine, pages 969–982. 2005.
[6] Raanan Arens and Carole L Marcus. Pathophysiology of upper airway obstruction: a developmental perspective. Sleep, 27(5):997–1019, 2004.
[7] Harvard Medical School and McKinsey. The Price of Fatigue: The surprising economic costs of unmanaged sleep apnea. Technical report, 2010.
[8] Terry Young, Paul E. Peppard, and Daniel J. Gottlieb. Epidemiology of obstructive sleep apnea: A population health perspective, 2002.
[9] S Andreas, R Schulz, G S Werner, and H Kreuzer. Prevalence of obstructive sleep apnoea in patients with coronary artery disease. Coronary artery disease, 7(7):541–545, 1996.
[10] S Javaheri, T J Parker, J D Liming, W S Corbett, H Nishiyama, L Wexler, and G a Roselle. Sleep apnea in 81 ambulatory male patients with stable heart failure. Types and their prevalences, consequences, and presentations. Circulation, 97(21):2154–2159, 1998.
[11] C Guilleminault, S J Connolly, and R a Winkle. Cardiac arrhythmia and conduction disturbances during sleep in 400 patients with sleep apnea syndrome, 1983.
[12] T Douglas Bradley, Alexander G Logan, R John Kimoff, Frédéric Sériès, Debra Morrison, Kathleen Ferguson, Israel Belenkie, Michael Pfeifer, John Fleetham, Patrick Hanly, Mark Smilovitch, George Tomlinson, and John S Floras. Continuous positive airway pressure for central sleep apnea and heart failure. The New England journal of medicine, 353(19):2025–2033, 2005.
[13] Mary S M Ip, Bing Lam, Matthew M T Ng, Wah Kit Lam, Kenneth W T Tsang, and Karen S L Lam. Obstructive sleep apnea is independently associated with insulin resistance. American Journal of Respiratory and Critical Care Medicine, 165(5):670–676, 2002.
[14] Naresh M. Punjabi, John D. Sorkin, Leslie I. Katzel, Andrew P. Goldberg, Alan R. Schwartz, and Philip L. Smith. Sleep-disordered breathing and insulin resistance in middle-aged and overweight men. American Journal of Respiratory and Critical Care Medicine, 165(5):677–682, 2002.
[15] Joseph S. Perkell, Marc H. Cohen, Mario A. Svirsky, Melanie L. Matthies, Iñaki Garabieta, and Michel T. T. Jackson. Electromagnetic midsagittal articulometer systems for transducing speech articulatory movements. Journal of the Acoustical Society of America, 92(6):3078–3096, dec 2005.
[16] Katherine K. Green, David T. Kent, Mark A. D’Agostino, Paul T. Hoff, Ho Sheng Lin, Ryan J. Soose, M. Boyd Gillespie, Kathleen L. Yaremchuk, Marina Carrasco-Llatas, B. Tucker Woodson, Ofer Jacobowitz, Erica R. Thaler, José E. Barrera, Robson Capasso, Stanley Yung Liu, Jennifer Hsia, Daljit Mann, Taha S. Meraj, Jonathan A. Waxman, and Eric J. Kezirian. Drug-Induced Sleep Endoscopy and Surgical Outcomes: A Multicenter Cohort Study. Laryngoscope, 129(3):761–770, mar 2019.
[17] Adam M. Zysk, Freddy T. Nguyen, Amy L. Oldenburg, Daniel L. Marks, and Stephen A. Boppart.
Optical coherence tomography: a review of clinical development from bench to bedside. Journal of Biomedical Optics, 12(5):051403, 2007.
[18] Pierre Delattre and Donald C. Freeman. A dialect study of american R’S by x-ray motion picture. Linguistics, 6(44):29–68, 1968.
[19] J R Galvin, S A Rooholamini, and W Stanford. Obstructive sleep apnea: diagnosis with ultrafast CT. Radiology, 171(3):775–8, jun 1989.
[20] Pascal Perrier, Louis-Jean Boë, and Rudolph Sock. Vocal Tract Area Function Estimation From Midsagittal Dimensions With CT Scans and a Vocal Tract Cast. Journal of Speech, Language, and Hearing Research, 35(1):53–67, feb 2014.
[21] J. J. Fredberg, M. E. Wohl, G. M. Glass, and H. L. Dorkin. Airway area by acoustic reflections measured at the mouth. Journal of Applied Physiology, 48(5):749–758, may 1980.
[22] Maureen Stone, Thomas H. Shawker, Thomas L. Talbot, and Alan H. Rich. Cross-sectional tongue shape during the production of vowels. Journal of the Acoustical Society of America, 83(4):1586–1596, apr 2005.
[23] Erik Bresch, Yoon Chul Kim, Krishna Nayak, Dani Byrd, and Shrikanth Narayanan. Seeing speech: Capturing vocal tract shaping using real-time magnetic resonance imaging. IEEE Signal Processing Magazine, 25(3):123–132, may 2008.
[24] Didier Demolin, Sergio Hassid, Thierry Metens, and Alain Soquet. Real-time MRI and articulatory coordination in speech. Comptes Rendus - Biologies, 325(4):547–556, 2002.
[25] Syed Nabeel Zafar, Navin R. Changoor, Kibileri Williams, Rafael D. Acosta, Wendy R. Greene, Terrence M. Fullum, Adil H. Haider, Edward E. Cornwell, and Daniel D. Tran. Race and socioeconomic disparities in national stoma reversal rates Oral presentation at the 25th Annual Scientific Assembly of the Society of Black Academic Surgeons, April 9-11, 2015, Chapel Hill, NC. American Journal of Surgery, 211(4):710–715, 2016.
[26] Moriel S. NessAiver, Maureen Stone, Vijay Parthasarathy, Yuvi Kahana, and Alex Paritsky. Recording high quality speech during tagged cine-MRI studies using a fiber optic microphone. Journal of Magnetic Resonance Imaging, 23(1):92–97, 2006.
[27] S. R. Ventura, D. R. Freitas, and João Manuel R.S. Tavares. Application of MRI and biomedical engineering in speech production study. Computer Methods in Biomechanics and Biomedical Engineering, 12(6):671–681, 2009.
[28] Richard J. Schwab, Michael Pasirstein, Robert Pierson, Adonna Mackley, Robert Hachadoorian, Raanan Arens, Greg Maislin, and Allan I. Pack. Identification of upper airway anatomic risk factors for obstructive sleep apnea with volumetric magnetic resonance imaging. American Journal of Respiratory and Critical Care Medicine, 168(5):522–530, 2003.
[29] Yoon Chul Kim, R. Marc Lebel, Ziyue Wu, S. L. Davidson Ward, Michael C.K. Khoo, and Krishna S. Nayak. Real-time 3D magnetic resonance imaging of the pharyngeal airway in sleep apnea. Magnetic Resonance in Medicine, 71(4):1501–1510, 2014.
[30] Ziyue Wu, Weiyi Chen, Michael C.K. Khoo, Sally L. Davidson Ward, and Krishna S. Nayak. Evaluation of upper airway collapsibility using real-time MRI. Journal of Magnetic Resonance Imaging, 44(1):158–167, dec 2016.
[31] Adam Albright. Articulators in motion during speech. Available at http://phonetics.linguistics.ucla.edu/demos/croatian/index.html.
[32] Olivier M. Vanderveken, Joachim T. Maurer, Winfried Hohenhorst, Evert Hamans, Ho-Sheng Lin, Anneclaire V. Vroegop, Clemens Anders, Nico de Vries, and Paul H. Van de Heyning. Evaluation of Drug-Induced Sleep Endoscopy as a Patient Selection Tool for Implanted Upper Airway Stimulation for Obstructive Sleep Apnea. Journal of Clinical Sleep Medicine, 09(05):433–438, may 2013.
[33] Joseph C Jing, Lidek Chou, Erica Su, Brian J F Wong, and Zhongping Chen.
Anatomically correct visualization of the human upper airway using a high-speed long range optical coherence tomography system with an integrated positioning sensor. Scientific Reports, 6:39443, dec 2016.
[34] Amal Isaiah, Reuben Mezrich, and Jeffrey Wolf. Ultrasonographic Detection of Airway Obstruction in a Model of Obstructive Sleep Apnea. Ultrasound International Open, 03(01):E34–E42, feb 2017.
[35] David Núñez-Fernández. Upper Airway Evaluation in Snoring and Obstructive Sleep Apnea: Overview of OSA, Relevant Upper Airway Anatomy, Pathologic Conditions Associated With OSA.
[36] Ian Stavness, John E. Lloyd, Yohan Payan, and Sidney Fels. Coupled hard-soft tissue simulation with contact and constraints applied to jaw-tongue-hyoid dynamics. International Journal for Numerical Methods in Biomedical Engineering, 27(3):367–390, mar 2011.
[37] Yoon-Chul Kim. Fast upper airway magnetic resonance imaging for assessment of speech production and sleep apnea. Precision and Future Medicine, 2(4):131–148, dec 2018.
[38] Sajan Goud Lingala, Brad P. Sutton, Marc E. Miquel, and Krishna S. Nayak. Recommendations for real-time speech MRI, jan 2016.
[39] Joseph Jing, Jun Zhang, Anthony Chin Loy, Brian J F Wong, and Zhongping Chen. High-speed upper-airway imaging using full-range optical coherence tomography. Journal of Biomedical Optics, 17(11):110507, nov 2012.
[40] Sajan Goud Lingala, Yinghua Zhu, Yoon Chul Kim, Asterios Toutios, Shrikanth Narayanan, and Krishna S. Nayak. A fast and flexible MRI system for the study of dynamic vocal tract shaping. Magnetic Resonance in Medicine, 77(1):112–125, jan 2017.
[41] R Arens, J M McDonough, a T Costarino, S Mahboubi, C E Tayag-Kier, G Maislin, R J Schwab, and a I Pack. Magnetic resonance imaging of the upper airway structure of children with obstructive sleep apnea syndrome.
American Journal of Respiratory and Critical Care Medicine, 164(4):698–703, 2001.
[42] Mark E. Wagshul, Sanghun Sin, Michael L. Lipton, Keivan Shifteh, and Raanan Arens. Novel retrospective, respiratory-gating method enables 3D, high resolution, dynamic imaging of the upper airway during tidal breathing. Magnetic Resonance in Medicine, 70(6):1580–1590, 2013.
[43] Yinghua Zhu, Yoon Chul Kim, Michael I. Proctor, Shrikanth S. Narayanan, and Krishna S. Nayak. Dynamic 3-D visualization of vocal tract shaping during speech. IEEE Transactions on Medical Imaging, 32(5):838–848, may 2013.
[44] Michael Burdumy, Louisa Traser, Fabian Burk, Bernhard Richter, Matthias Echternach, Jan G. Korvink, Jürgen Hennig, and Maxim Zaitsev. One-second MRI of a three-dimensional vocal tract to measure dynamic articulator modifications. Journal of Magnetic Resonance Imaging, 46(1):94–101, jul 2017.
[45] Maojing Fu, Marissa S. Barlaz, Joseph L. Holtrop, Jamie L. Perry, David P. Kuehn, Ryan K. Shosted, Zhi-Pei Pei Liang, and Bradley P. Sutton. High-frame-rate full-vocal-tract 3D dynamic speech imaging. Magnetic Resonance in Medicine, 77(4):1619–1629, apr 2017.
[46] Yongwan Lim, Yinghua Zhu, Sajan Goud Lingala, Dani Byrd, Shrikanth Narayanan, and Krishna Shrinivas Nayak. 3D dynamic MRI of the vocal tract during natural speech. Magnetic Resonance in Medicine, 81(3):1511–1520, nov 2019.
[47] Aaron Niebergall, Shuo Zhang, Esther Kunay, Götz Keydana, Michael Job, Martin Uecker, and Jens Frahm. Real-time MRI of speaking at a resolution of 33 ms: Undersampled radial FLASH with nonlinear inverse reconstruction. Magnetic Resonance in Medicine, 69(2):477–485, feb 2013.
[48] Martin Uecker, Shuo Zhang, Dirk Voit, Alexander Karaus, Klaus Dietmar Merboldt, and Jens Frahm. Real-time MRI at a resolution of 20 ms. NMR in Biomedicine, 23(8):986–994, 2010.
[49] Bradley P. Sutton, Charles A. Conway, Youkyung Bae, Ravi Seethamraju, and David P. Kuehn.
Faster dynamic imaging of speech with field inhomogeneity corrected spiral fast low angle shot (FLASH) at 3 T. Journal of Magnetic Resonance Imaging, 32(5):1228–1237, nov 2010.
[50] Yongwan Lim, Sajan Goud Lingala, Shrikanth S. Narayanan, and Krishna S. Nayak. Dynamic off-resonance correction for spiral real-time MRI of speech. Magnetic Resonance in Medicine, 81(1):234–246, jul 2019.
[51] William M. Kier and Kathleen K. Smith. Tongues, tentacles and trunks: the biomechanics of movement in muscular-hydrostats. Zoological Journal of the Linnean Society, 83(4):307–324, apr 1985.
[52] Karen M. Hiiemae and Jeffrey B. Palmer. Tongue movements in feeding and speech. Critical Reviews in Oral Biology and Medicine, 14(6):413–429, nov 2003.
[53] Jordan R. Green. Mouth Matters: Scientific and Clinical Applications of Speech Movement Analysis. Perspectives on Speech Science and Orofacial Disorders, 25(1):6, jul 2015.
[54] Jean-Michel Gerard, Reiner Wilhelms-Tricarico, Pascal Perrier, and Yohan Payan. A 3D dynamical biomechanical tongue model to study speech motor control. Recent Res Develop Biomech, 2006.
[55] Stéphanie Buchaillard, Pascal Perrier, and Yohan Payan. A biomechanical model of cardinal vowel production: Muscle activations and the impact of gravity on tongue positioning. Journal of the Acoustical Society of America, 126(4):2033, 2009.
[56] Asterios Toutios and Shrikanth S. Narayanan. Articulatory synthesis of french connected speech from EMA data. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, pages 2738–2742, 2013.
[57] Danny J. Eckert. Phenotypic approaches to obstructive sleep apnoea – New pathways for targeted therapy. Sleep Medicine Reviews, 37:45–59, 2018.
[58] S M Caples, A S Gami, and V K Somers. Obstructive sleep apnea. Annals of Internal Medicine, 142(3):187–197, feb 2005.
[59] Allan I. Pack.
Application of personalized, predictive, preventative, and par- ticipatory (P4) medicine to obstructive sleep apnea a roadmap for improving care? Annals of the American Thoracic Society, 13(9):1456–1467, jul 2016. 7, 8, 35, 102, 103 [60] Danny J. Eckert, David P. White, Amy S. Jordan, Atul Malhotra, and Andrew Wellman. Defining phenotypic causes of obstructive sleep apnea: Identification of novel therapeutic targets. American Journal of Respiratory and Critical Care Medicine, 188(8):996–1004, 2013. 7 [61] Yamini Subramani, Mandeep Singh, Jean Wong, Clete A. Kushida, Atul Malhotra, and Frances Chung. Understanding Phenotypes of Obstructive Sleep Apnea. Anesthesia & Analgesia, 124(1):179–191, 2017. 7, 8 [62] P Campo, F Rodríguez, S Sánchez-García, P Barranco, S Quirce, C Pérez- Francés, E Gómez-Torrijos, R Cárdenas, J M Olaguibel, J Delgado, Severe Asthma Workgroup, and SEAIC Asthma Committee. Phenotypes and endo- types of uncontrolled severe asthma: new treatments. Journal of Investiga- tional Allergology & Clinical Immunology, 23(2):76–88;, 2013. 7 [63] Scott A. Sands, Danny J. Eckert, Amy S. Jordan, Bradley A. Edwards, Robert L. Owens, James P. Butler, Richard J. Schwab, Stephen H. Loring, Atul Malhotra, David P. White, and Andrew Wellman. Enhanced upper- airway muscle responsiveness is a distinct feature of overweight/obese indi- viduals without sleep apnea. American Journal of Respiratory and Critical Care Medicine, 190(8):930–937, oct 2014. 8, 108 [64] Andrew Wellman, Danny J. Eckert, Amy S. Jordan, Bradley A. Edwards, Chris L. Passaglia, Andrew C. Jackson, Shiva Gautam, Robert L. Owens, Atul Malhotra, and David P. White. A method for measuring and modeling the physiological traits causing obstructive sleep apnea. Journal of Applied Physiology, 110(6):1627–1637, 2011. 9, 88 117 [65] Kelly Wilton and Clodagh M. Ryan. Edwards BA, et al. Obstructive Sleep Apnea in older adults is a distinctly different physiological phenotype. Sleep (6). 
American Journal of Respiratory and Critical Care Medicine, 192(9):1128, 2015. 9, 88, 90, 91, 92, 95, 101 [66] Elizabeth C. Brown, Shaokoon Cheng, David K. McKenzie, Jane E. Butler, SimonC.Gandevia,andLynneE.Bilston. TongueandLateralUpperAirway Movement with Mandibular Advancement. Sleep, 2013. 9, 37, 45, 68, 108 [67] LarsG.Hanson. Isquantummechanicsnecessaryforunderstandingmagnetic resonance? Concepts in Magnetic Resonance Part A, 32A(5):329–340, sep 2008. 12 [68] Dwight G. Nishimura. Principles of magnetic resonance imaging. Stanford University, 1.2 edition, 2010. 12, 26 [69] Zhi-Pei Liang, Paul C. Lauterbur, and IEEE Engineering in Medicine and Biology Society. Principles of magnetic resonance imaging : a signal pro- cessing perspective. SPIE Optical Engineering Press, 2000. 12, 22, 23 [70] RobertW.Brown, YuChungN.Cheng, E.MarkHaacke, MichaelR.Thomp- son, and Ramesh Venkatesan. Magnetic Resonance Imaging: Physical Prin- ciples and Sequence Design: Second Edition. John Wiley & Sons Ltd, Chich- ester, UK, apr 2014. 12 [71] Matt A. Bernstein, Kevin F. King, and Xiaohong Joe Zhou. Handbook of MRI Pulse Sequences. Academic Press, 1 edition, 2004. 12, 37 [72] Travis B. Smith and Krishna S. Nayak. MRI artifacts and correction strate- gies. Imaging in Medicine, 2(4):445–457, aug 2010. 24, 25, 26 [73] Dana C. Peters, Pratik Rohatgi, René M. Botnar, Susan B. Yeon, Kraig V. Kissinger,andWarrenJ.Manning. Characterizingradialundersamplingarti- facts for cardiac applications. Magnetic Resonance in Medicine, 55(2):396– 403, feb 2006. 25 [74] G H Glover and J M Pauly. Projection reconstruction techniques for reduc- tion of motion effects in MRI. Magnetic Resonance in Medicine, 28(2):275– 89, dec 1992. 25 [75] J R Liao, J M Pauly, T J Brosnan, and N J Pelc. Reduction of motion artifacts in cine MRI using variable-density spiral trajectories. Magnetic Resonance in Medicine, 37(4):569–75, apr 1997. 25 118 [76] Jacob A Bender, Rizwan Ahmad, and Orlando P Simonetti. 
The Importance of k-Space Trajectory on Off-Resonance Artifact in Segmented Echo-Planar Imaging. Concepts in magnetic resonance. Part A, Bridging education and research, 42A(2), mar 2013. 25 [77] Travis B. Smith and Krishna S. Nayak. Automatic off-resonance correction in spiral imaging with piecewise linear autofocus. Magnetic Resonance in Medicine, 69(1):82–90, jan 2013. 25, 26, 29, 30 [78] Qing-San Xiang and R. Mark Henkelman. K-space description for MR imag- ing of dynamic objects. Magnetic Resonance in Medicine, 29(3):422–428, mar 1993. 26, 28 [79] Zhi Pei Liang, Hong Jiang, Christopher P. Hess, and Paul C. Lauterbur. Dynamic imaging by model estimation. International Journal of Imaging Systems and Technology, 8(6):551–557, jan 1997. 26 [80] KatherineL.Wright,JesseI.Hamilton,MarkA.Griswold,VikasGulani,and Nicole Seiberlich. Non-Cartesian parallel imaging reconstruction. Journal of Magnetic Resonance Imaging, 40(5):1022–1040, nov 2014. 28 [81] Yoon-Chul Chul Kim, Shrikanth S. Narayanan, and Krishna S. Nayak. Flex- ible retrospective selection of temporal resolution in real-time speech MRI using a golden-ratio spiral view order. Magnetic Resonance in Medicine, 65(5):1365–1371, may 2011. 29, 32 [82] Markus Barth, Felix Breuer, Peter J. Koopmans, David G. Norris, and Benedikt A. Poser. Simultaneous multislice (SMS) imaging techniques. Mag- netic Resonance in Medicine, 75(1):63–81, jan 2016. 29 [83] Felix A. Breuer, Martin Blaimer, Robin M. Heidemann, Matthias F. Mueller, Mark A. Griswold, and Peter M. Jakob. Controlled aliasing in parallel imag- ing results in higher acceleration (CAIPIRINHA) for multi-slice imaging. Magnetic Resonance in Medicine, 53(3):684–691, mar 2005. 29 [84] John I Jackson, Craig H Meyer, Dwight G Nishimura, and Albert Macovski. Selection of a convolution function for Fourier inversion using gridding [com- puterised tomography application]. IEEE Transactions on Medical Imaging, 10(3):473–478, 1991. 30 [85] J G Pipe and P Menon. 
Sampling density compensation in MRI: ratio- nale and an iterative numerical solution. Magnetic Resonance in Medicine, 41(1):179–186, 1999. 30 119 [86] J Fessler and B Sutton. Nonuniform fast Fourier transforms using min- max interpolation. Signal Processing, IEEE Transactions on, 51(2):560–574, 2003. 30 [87] V. Rasche, R. Proksa, R. Sinkus, P. Bornert, and H. Eggers. Resampling of data between arbitrary grids using convolution interpolation. IEEE Trans- actions on Medical Imaging, 18(5):385–392, may 1999. 30 [88] Jeffrey A. Fessler. Michigan Image Reconstruction Toolbox. https://github.com/JeffFessler/mirt. 30 [89] GraemeF.Mason, ToddHarshbarger, HobyP.Hetherington, YantianZhang, Gerald M. Pohost, and Donald B. Twieg. A method to measure arbitrary k-space trajectories for rapid MR imaging. Magnetic Resonance in Medicine, 38(3):492–496, sep 1997. 30 [90] Jeff H. Duyn, Yihong Yang, Joseph A. Frank, and Jan Willem van der Veen. SimpleCorrectionMethodfork-SpaceTrajectoryDeviationsinMRI. Journal of Magnetic Resonance, 132(1):150–153, may 1998. 30 [91] P. B. Roemer, W. A. Edelstein, C. E. Hayes, S. P. Souza, and O. M. Mueller. The NMR phased array. Magnetic Resonance in Medicine, 16(2):192–225, nov 1990. 31 [92] Michael A. Ohliger and Daniel K. Sodickson. An introduction to coil array design for parallel MRI. NMR in Biomedicine, 19(3):300–315, may 2006. 31 [93] Klaas P. Pruessmann, Markus Weiger, Markus B. Scheidegger, and Peter Boesiger. SENSE: Sensitivity encoding for fast MRI. Magnetic Resonance in Medicine, 42(5):952–962, nov 1999. 32 [94] Mark A. Griswold, Peter M. Jakob, Robin M. Heidemann, Mathias Nittka, Vladimir Jellus, Jianmin Wang, Berthold Kiefer, and Axel Haase. Gener- alized Autocalibrating Partially Parallel Acquisitions (GRAPPA). Magnetic Resonance in Medicine, 47(6):1202–1210, jun 2002. 32 [95] D.L. Donoho. Compressed sensing. IEEE Transactions on Information The- ory, 52(4):1289–1306, apr 2006. 32 [96] E J Candes and T Tao. 
Near-Optimal Signal Recovery From Random Pro- jections: Universal Encoding Strategies? Information Theory, IEEE Trans- actions on, 52(12):5406–5425, 2006. 32 [97] Michael Lustig, David Donoho, and John M. Pauly. Sparse MRI: The appli- cation of compressed sensing for rapid MR imaging. Magnetic Resonance in Medicine, 58(6):1182–1195, dec 2007. 32 120 [98] Michael Lustig, David L Donoho, J M Santos, and John M Pauly. Com- pressed Sensing MRI. Signal Processing Magazine, IEEE, 25(2):72–82, 2008. 32 [99] Weiyi Chen, Emily Gillett, Michael C.K. K Khoo, Sally L. Davidson Ward, and Krishna S Nayak. Real-time Multi-slice MRI during Continuous Posi- tive Airway Pressure Reveals Upper Airway Response to Pressure Change. Journal of Magnetic Resonance Imaging, page 698, 2017. 32, 34 [100] Vikram Ramanarayanan, Sam Tilsen, Michael Proctor, Johannes Töger, Louis Goldstein, Krishna S. Nayak, and Shrikanth Narayanan. Analysis of speech production real-time MRI, 2018. 33 [101] Khrishna S. Nayak and Robert J. Fleck. Seeing sleep: Dynamic imaging of upper airway collapse and collapsibility in children. IEEE Pulse, 5(5):40–44, 2014. 33 [102] Yoon-Chul Kim, Ximing Wang, Winston Tran, Michael C K Khoo, and Krishna S Nayak. Measurement of upper airway compliance using dynamic MRI. In ISMRM 20th Scientific Sessions, Melbourne, volume 20, page 3688, 2012. 34, 91 [103] Brian Hargreaves. Rapid gradient-echo imaging. Journal of Magnetic Reso- nance Imaging, 36(6):1300–1313, dec 2012. 36 [104] Maureen Stone, Edward P. Davis, Andrew S. Douglas, Moriel NessAiver, Rao Gullapalli, William S. Levine, and Andrew Lundberg. Modeling the motion of the internal tongue from tagged cine-MRI images. Journal of the Acoustical Society of America, 109(6):2974–2982, jun 2001. 38, 46, 69, 107 [105] El-Sayed H Ibrahim. Myocardial tagging by cardiovascular magnetic res- onance: evolution of techniques–pulse sequences, analysis algorithms, and applications. 
Journal of Cardiovascular Magnetic Resonance, 13(1):36, jul 2011. 37, 39, 44, 69 [106] Kevin M. Moerman, Andre M. J. Sprengers, Ciaran K. Simms, Rolf M. Lamerichs, Jaap Stoker, and Aart J. Nederveen. Validation of continuously tagged MRI for the measurement of dynamic 3D skeletal muscle tissue defor- mation. Medical Physics, 39(4):1793–1810, mar 2012. 37, 46 [107] Marco Piccirelli, Roger Luechinger, Veit Sturm, Peter Boesiger, Klara Lan- dau, and Oliver Bergamin. Local Deformation of Extraocular Muscles during Eye Movement. Investigative Opthalmology & Visual Science, 50(11):5189, nov 2009. 37, 46 121 [108] Jonghye Woo, Maureen Stone, Yuanming Suo, Emi Z. Murano, and Jerry L. Prince. Tissue-Point Motion Tracking in the Tongue From Cine MRI and Tagged MRI. Journal of Speech, Language, and Hearing Research, 57(2):S626–36, 2014. 37, 66, 69, 107 [109] ArashA.Sabet,EftychiosChristoforou,BenjaminZatlin,GuyM.Genin,and Philip V. Bayly. Deformation of the human brain induced by mild angular head acceleration. Journal of Biomechanics, 41(2):307–315, jan 2008. 37, 46 [110] Anne Bazille, Michael A. Guttman, Elliot R. McVeigh, and Elias A. Zer- houni. Impact of semiautomated versus manual image segmentation errors on myocardial strain calculation by magnetic: Resonance tagging. Investiga- tive Radiology, 29(4):427–433, apr 1994. 37 [111] H. Azhari, M. Buchalter, S. Sideman, E. Shapiro, and R. Beyar. A conical model to describe the nonuniformity of the left ventricular twisting motion. Annals of Biomedical Engineering, 20(2):149–165, 1992. 37 [112] Sharmeen Masood, Guang Zhong Yang, Dudley J. Pennell, and David N. Firmin. Investigating intrinsic myocardial mechanics: The role of MR tag- ging, velocity phase mapping, and diffusion imaging. Journal of Magnetic Resonance Imaging, 12(6):873–883, dec 2000. 37 [113] Leon Axel. Biomechanical Dynamics of the Heart with MRI. Annual Review of Biomedical Engineering, 4(1):321–347, aug 2002. 
37, 38 [114] E A Zerhouni, D M Parish, W J Rogers, A Yang, and E P Shapiro. Human heart: tagging with MR imaging–a method for noninvasive assessment of myocardial motion. Radiology, 169(1):59–63, oct 1988. 39 [115] L Axel and L Dougherty. MR imaging of motion with spatial modulation of magnetization. Radiology, 171(3):841–845, jun 1989. 39, 49 [116] Timothy J. Mosher and Michael B. Smith. A DANTE tagging sequence for the evaluation of translational sample motion. Magnetic Resonance in Medicine, 15(2):334–339, aug 1990. 39 [117] S. E. Fischer, G. C. McKinnon, S. E. Maier, and P. Boesiger. Improved myocardial tagging contrast. Magnetic Resonance in Medicine, 30(2):191– 200, aug 1993. 39, 49, 51, 73 [118] Nael F. Osman, William S. Kerwin, Elliot R. McVeigh, and Jerry L. Prince. CardiacmotiontrackingusingCINEharmonicphase(HARP)magneticreso- nance imaging. Magnetic Resonance in Medicine, 42(6):1048–1060, dec 1999. 39, 75 122 [119] Anthony H. Aletras, Shujun Ding, Robert S. Balaban, and Han Wen. DENSE: Displacement Encoding with Stimulated Echoes in Cardiac Func- tional MRI. Journal of Magnetic Resonance, 137(1):247–252, 1999. 39, 69 [120] Nael F. Osman, Smita Sampath, Ergin Atalar, and Jerry L. Prince. Imaging longitudinal cardiac strain on short-axis images using strain-encoded MRI. Magnetic Resonance in Medicine, 46(2):324–334, aug 2001. 39 [121] A A Young, L Axel, L Dougherty, D K Bogen, and C S Parenteau. Validation of tagging with MR imaging to estimate material deformation. Radiology, 188(1):101–108, jul 1993. 39, 49 [122] Susan B Yeon, Nathaniel Reichek, Barbara A Tallant, João A.C Lima, Linda P Calhoun, Neil R Clark, Eric A Hoffman, Kalon K.L Ho, and Leon Axel. Validation of in vivo myocardial strain measurement by magnetic res- onance tagging with sonomicrometry. Journal of the American College of Cardiology, 38(2):555–561, aug 2001. 39 [123] L Axel and L Dougherty. Heart wall motion: improved method of spatial modulation of magnetization for MR imaging. 
Radiology, 172(2):349–350, aug 1989. 42, 49 [124] J. C. Waterton, J. P. R. Jenkins, X. P. Zhu, H. G. Love, I. Isherwood, and D. J. Rowlands. Magnetic resonance (MR) cine imaging of the human heart. The British Journal of Radiology, 58(692):711–716, aug 1985. 44 [125] M Kumada, M Niitsu, S Niimi, and H Hirose. A study on the inner struc- ture of the tongue in the production of the 5 Japanese vowels by tagging snapshot MRI. Annual Bulletin of the Research Institute of Logopedics and Phoniatrics, 26:1–11, 1992. 46 [126] M Niitsu, M Kumada, N G Campeau, S Niimi, S J Riederer, and Y Itai. Tongue displacement: visualization with rapid tagged magnetization- prepared MR imaging. Radiology, 191(2):578–580, 2014. 46 [127] Vitaly J. Napadow, Qun Chen, Van J. Wedeen, and Richard J. Gilbert. Intramural mechanics of the human tongue in association with physiological deformations. Journal of Biomechanics, 1999. 46 [128] Vijay Parthasarathy, Jerry L. Prince, Maureen Stone, Emi Z. Murano, and Moriel NessAiver. Measuring tongue motion from tagged cine-MRI using harmonic phase (HARP) processing. Journal of the Acoustical Society of America, 121(1):491–504, jan 2007. 46, 66 123 [129] Maureen Stone, Jonghye Woo, Jiachen Zhuo, Hegang Chen, and Jerry L. Prince. Patterns of variance in /s/ during normal and glossectomy speech. Computer Methods in Biomechanics and Biomedical Engineering: Imaging and Visualization, 2014. 46, 68 [130] Jonghye Woo, Junghoon Lee, Emi Z. Murano, Fangxu Xing, Meena Al- Talib, Maureen Stone, and Jerry L. Prince. A high-resolution atlas and statistical model of the vocal tract from structural MRI. Computer Methods in Biomechanics and Biomedical Engineering: Imaging and Visualization, 3(1):47–60, jan 2015. 46, 69, 107 [131] Jonghye Woo, Fangxu Xing, Maureen Stone, Jordan Green, Timothy G. Reese, Thomas J. Brady, Van J. Wedeen, Jerry L. Prince, and Georges El Fakhri. 
Speech Map: a statistical multimodal atlas of 4D tongue motion dur- ing speech from tagged and cine MR images. Computer Methods in Biome- chanics and Biomedical Engineering: Imaging and Visualization, pages 1–13, oct 2017. 46, 66, 69, 107, 108 [132] Johannes Töger, Tanner Sorensen, Krishna Somandepalli, Asterios Toutios, Sajan Goud Lingala, Shrikanth Narayanan, and Krishna Nayak. TestâĂŞretest repeatability of human speech biomarkers from static and real-time dynamic magnetic resonance imaging. Journal of the Acoustical Society of America, 141(5):3323–3336, may 2017. 46 [133] Elliot R. McVeigh and Fred Epstein. Myocardial tagging during real-time MRI. In Annual Reports of the Research Reactor Institute, Kyoto University, volume 3, pages 2284–2285. IEEE, 2001. 46, 72 [134] Smita Sampath, J. Andrew Derbyshire, Ergin Atalar, Nael F. Osman, and Jerry L. Prince. Real-time imaging of two-dimensional cardiac strain using a harmonic phase magnetic resonance imaging (HARP-MRI) pulse sequence. Magnetic Resonance in Medicine, 50(1):154–163, jul 2003. 46, 69, 72 [135] Li Pan, Matthias Stuber, Dara L. Kraitchman, Danielle L. Fritzges, Wes- ley D. Gilson, and Nael F. Osman. Real-time imaging of regional myocardial function using fast-SENC. Magnetic Resonance in Medicine, 55(2):386–395, 2006. 46, 47, 72 [136] El Sayed H Ibrahim, Matthias Stuber, Ahmed S. Fahmy, Khaled Z. Abd- Elmoniem, Tetsuo Sasano, M. Roselle Abraham, and Nael F. Osman. Real- time MR imaging of myocardial regional function using strain-encoding (SENC) with tissue through-plane motion tracking. Journal of Magnetic Resonance Imaging, 26(6):1461–1470, 2007. 46, 47, 72 124 [137] J.M. Santos, G.A. Wright, and J.M. Pauly. Flexible real-time magnetic resonance imaging framework. In The 26th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, volume 3, pages 1048–1051. IEEE, 2005. 49, 76 [138] David O. Walsh, Arthur F. Gmitro, and Michael W. Marcellin. 
Adap- tive reconstruction of phased array MR imagery. Magnetic Resonance in Medicine, 43(5):682–690, 2000. 50 [139] Kevin F. King, Alexander Ganin, Xiaohong Joe Zhou, and Matt A. Bern- stein. Concomitant gradient field effects in spiral scans. Magnetic Resonance in Medicine, 41(1):103–112, 1999. 50, 75 [140] Michael Markl, R. Bammer, M. T. Alley, C. J. Elkins, M. T. Draney, A. Bar- nett, M.E. Moseley, G.H.Glover, andN. J.Pelc. Generalizedreconstruction of phase contrast MRI: Analysis and correction of the effect of gradient field distortions. Magnetic Resonance in Medicine, 50(4):791–801, 2003. 50, 75 [141] Michael Markl, S. Scherer, A. Frydrychowicz, D. Burger, A. Geibel, and J. Hennig. Balanced left ventricular myocardial SSFP-tagging at 1.5T and 3T. Magnetic Resonance in Medicine, 60(3):631–639, sep 2008. 52, 70, 80 [142] Ilse Lehiste and Gordon E. Peterson. Transitions, Glides, and Diphthongs. Journal of the Acoustical Society of America, 33(3):268–277, 2005. 55, 67 [143] Fang Ying Hsieh, Louis Goldstein, Dani Byrd, and Shrikanth Narayanan. Truncation of pharyngeal gesture in English diphthong [aI]. In Proceedings of the Annual Conference of the International Speech Communication Asso- ciation, INTERSPEECH, pages 968–972, 2013. 55, 67 [144] Sungbok Lee, Alexandros Potamianos, and Shrikanth Narayanan. Devel- opmental acoustic study of American English diphthongs. Journal of the Acoustical Society of America, 136(4):1880–1894, 2014. 55 [145] V. Uma Valeti, Wookjin Chun, Donald D. Potter, Philip A. Araoz, Kiaran P. McGee, James F. Glockner, and Timothy F. Christian. Myocardial tagging and strain analysis at 3 Tesla: Comparison with 1.5 Tesla imaging. Journal of Magnetic Resonance Imaging, 23(4):477–480, apr 2006. 56 [146] Fangxu Xing, Jonghye Woo, Arnold D. Gomez, Dzung L. Pham, Philip V. Bayly, Maureen Stone, and Jerry L. Prince. Phase Vector Incompressible Registration Algorithm for Motion Estimation from Tagged Magnetic Res- onance Images. 
IEEE Transactions on Medical Imaging, 36(10):2116–2128, oct 2017. 66 125 [147] Euna Lee, Fangxu Xing, Sung Ahn, Timothy G. Reese, Ruopeng Wang, Jordan R. Green, Nazem Atassi, Van J. Wedeen, Georges El Fakhri, and Jonghye Woo. Magnetic resonance imaging based anatomical assessment of tongue impairment due to amyotrophic lateral sclerosis: A preliminary study. Journal of the Acoustical Society of America, 143(4):EL248–EL254, apr 2018. 66, 68, 69, 107 [148] Nael F. Osman, Elliot R. McVeigh, and Jerry L. Prince. Imaging heart motion using harmonic phase MRI. IEEE Transactions on Medical Imaging, 19(3):186–202, mar 2000. 69, 107 [149] Krishna S. Nayak, Jon Fredrik Nielsen, Matt A. Bernstein, Michael Markl, Peter D. Gatehouse, Rene M. Botnar, David Saloner, Christine Lorenz, Han Wen, Bob S. Hu, Frederick H. Epstein, John N. Oshinski, and Subha V. Raman. Cardiovascular magnetic resonance phase contrast imaging, 2015. 69, 108 [150] Valentina Mazzoli, Lukas M. Gottwald, Eva S. Peper, Martijn Froeling, Bram F. Coolen, Nico Verdonschot, Andre M. Sprengers, Pim van Ooij, Gustav J. Strijkers, and Aart J. Nederveen. Accelerated 4D phase con- trast MRI in skeletal muscle contraction. Magnetic Resonance in Medicine, 80(5):1799–1811, nov 2018. 69, 108 [151] J. P A Kuijer, M. B M Hofman, J. J M Zwanenburg, J. Tim Marcus, Albert C. Van Rossum, and Rob M. Heethaar. DENSE and HARP: Two views on the same technique of phase-based strain imaging. Journal of Mag- netic Resonance Imaging, 2006. 69, 108 [152] Henrik Haraldsson, Andreas Sigfridsson, Hajime Sakuma, Jan Engvall, and Tino Ebbers. Influence of the FID and off-resonance effects in dense MRI. Magnetic Resonance in Medicine, 2011. 69, 108 [153] Salome Ryf, Jeffrey Tsao, Juerg Schwitter, Anja Stuessi, and Peter Boesiger. Peak-combination HARP: A method to correct for phase errors in HARP. Journal of Magnetic Resonance Imaging, 2004. 
69, 108 [154] Salome Ryf, Kraig V Kissinger, Marcus A Spiegel, Peter Börnert, Warren J Manning, Peter Boesiger, and Matthias Stuber. Spiral MR Myocardial Tag- ging. Magnetic Resonance in Medicine, 51(2):237–242, 2004. 70 [155] John F. Schenck. The role of magnetic susceptibility in magnetic resonance imaging: MRI magnetic compatibility of the first and second kinds. Medical Physics, 23(6):815–850, jun 1996. 70 126 [156] Ianessa A. Humbert, Scott B. Reeder, Eva J. Porcaro, Stephanie A. Kays, Jean H. Brittain, and Joanne Robbins. Simultaneous estimation of tongue volume and fat fraction using IDEAL-FSE. Journal of Magnetic Resonance Imaging, 2008. 71 [157] Weiyi Chen, Dani Byrd, Shrikanth Narayanan, and Krishna S. Nayak. Inter- mittently tagged real-time MRI reveals internal tongue motion during speech production. Magnetic Resonance in Medicine, 82(2):600–613, aug 2019. 72, 73, 76, 77, 80, 84 [158] ChristinaHagedorn,MichaelProctor,andLouisGoldstein. Automaticanaly- sis of singleton and geminate consonant articulation using real-time magnetic resonance imaging. In Proceedings of the Annual Conference of the Interna- tional Speech Communication Association, INTERSPEECH, 2011. 72 [159] Chao Tang, Elliot R. Mcveigh, and Elias A. Zerhouni. MultiâĂŘShot EPI for Improvement of Myocardial Tag Contrast: Comparison with Segmented SPGR. Magnetic Resonance in Medicine, 33(3):443–447, mar 1995. 73 [160] Daniel A. Herzka, Michael A. Guttman, and Elliot R. McVeigh. Myocardial tagging with SSFP. Magnetic Resonance in Medicine, 49(2):329–340, feb 2003. 73 [161] J. Andrew Derbyshire, Smita Sampath, and Elliot R. McVeigh. Phase- sensitive cardiac tagging - REALTAG. Magnetic Resonance in Medicine, 58(1):206–210, jul 2007. 73, 84, 85, 109 [162] Martin Uecker, Peng Lai, Mark J. Murphy, Patrick Virtue, Michael Elad, John M. Pauly, Shreyas S. Vasanawala, and Michael Lustig. ESPIRiT - An eigenvalue approach to autocalibrating parallel MRI: Where SENSE meets GRAPPA. 
Magnetic Resonance in Medicine, 71(3):990–1001, mar 2014. 75 [163] Meir Shinnar and John S. Leigh. Inversion of the Bloch equation. The Journal of Chemical Physics, 98(8):6121, apr 1993. 75 [164] Peter Kellman and Elliot R. McVeigh. Image reconstruction in SNR units: A general method for SNR measurement. Magnetic Resonance in Medicine, 54(6):1439–1447, dec 2005. 76 [165] Erik Bresch, Jon Nielsen, Krishna Nayak, and Shrikanth Narayanan. Syn- chronized and noise-robust audio recordings during realtime magnetic res- onance imaging scans. Journal of the Acoustical Society of America, 2006. 77 127 [166] Ipshita Bhattacharya, Rajiv Ramasawmy, Matthew Restivo, and Adrienne Campbell-Washburn. Dynamic speech imaging at 0.55T using single shot spirals for 11ms temporal resolution. In ISMRM 27th Scientific Sessions, page 440, Montreal, 2019. 84, 108 [167] Matthieu Ruthven, Andreia C. Freitas, Redha Boubertakh, and Marc E. Miquel. Application of radial GRAPPA techniques to single- and multislice dynamic speech MRI using a 16-channel neurovascular coil. Magnetic Reso- nance in Medicine, 82(3):948–958, apr 2019. 84 [168] Sajan Goud Lingala, Yongwan Lim, Stanley Kruger, and Krishna Nayak. ImprovedspiraldynamicMRIofvocaltractshapingat3Teslausingdynamic off resonance artifact correction. In ISMRM 27th Scientific Sessions, page 441, Montreal, 2019. 84 [169] Adrienne Campbell-Washburn, Daniel Herzka, Peter Kellman, Alan Koret- sky, andRobertBalaban. Imagecontrastat0.55T. In ISMRM 27th Scientific Sessions, page 1214, Montreal, 2019. 84, 109 [170] Douglas C. Noll, Dwight G. Nishimura, and Albert Macovski. Homodyne Detection in Magnetic Resonance Imaging. IEEE Transactions on Medical Imaging, 10(2):154–163, jun 1991. 86 [171] Martin Uecker and Michael Lustig. Estimating absolute-phase maps using ESPIRiT and virtual conjugate coils. Magnetic Resonance in Medicine, 77(3):1201–1207, mar 2017. 
86, 107 [172] Clete A Kushida, Michael R Littner, Timothy Morgenthaler, Cathy A Alessi, Dennis Bailey, Jack Coleman, Leah Friedman, Max Hirshkowitz, Sheldon Kapen, Milton Kramer, Teofilo Lee-Chiong, Daniel L Loube, Judith Owens, Jeffrey P Pancer, and Merrill Wise. Practice parameters for the indications for polysomnography and related procedures: an update for 2005. Sleep, 28(4):499–521, apr 2005. 88 [173] David J. Terris, Matthew M. Hanasono, and Yung C. Liu. Reliability of the Muller maneuver and its association with sleep-disordered breathing. The Laryngoscope, 110(11):1819–1823, nov 2000. 88 [174] Kenny P. Pang and David J. Terris. Multilevel pharyngeal surgery for obstructive sleep apnea. In Michael Friedman, editor, Sleep Apnea and Snor- ing : Surgical and Non-surgical Therapy, pages 1–11. Elsevier Inc., 2014. 88 [175] S Fujita. Surgical treatment of OSA: UPPP and lingualplasty (laser midline glossectomy). In C Guilleminault and M Partinen, editors, Obstructive Sleep Apnea Syndrome: Clinical Research and Treatment., pages 129–151. Raven Press, New York, 1990. 88 128 [176] Michael Semelka, Jonathan Wilson, and Ryan Floyd. Diagnosis and Treat- ment of Obstructive Sleep Apnea in Adults. American family physician, 94(5):355–60, sep 2016. 88 [177] Richard J Schwab. Upper airway imaging. Clinics in chest medicine, 19(1):33–54, 1998. 88, 102 [178] F. G. Issa and C. E. Sullivan. Upper airway closing pressures in snorers. Journal of Applied Physiology, 57(2):528–535, 2017. 100 [179] Anan Salloum, James A. Rowley, Jason H. Mateika, Susmita Chowdhuri, Qasim Omran, and M. Safwan Badr. Increased propensity for central apnea inpatientswithobstructivesleepapneaeffectofnasalcontinuouspositiveair- way pressure. American Journal of Respiratory and Critical Care Medicine, 181(2):189–193, 2010. 101 [180] Murtuza M. Ahmed and Richard J. Schwab. Upper airway imaging in obstructivesleepapnea. Current Opinion in Pulmonary Medicine,12(6):397– 401, 2006. 
102 [181] Ahsan Javed, Yoon Chul Kim, Michael C.K. Khoo, Sally L.Davidson Ward, and Krishna S. Nayak. Dynamic 3-D MR visualization and detection of upper airway obstruction during sleep using region-growing segmentation, 2016. 103 [182] Lauriane Jugé, Fiona Knapman, Peter Burke, Brown Elizabeth, Jane Butler, Danny Eckert, Jo Ngiam, Kate Sutherland, Peter Cistulli, and Lynne Bil- ston. Tongue deformation during mandibular advancement, as determined using tagged-MRI, may help to predict mandibular advancement treatment outcome in Obstructive sleep apnoea. In ISMRM 26th Scientific Sessions, page 1268, Paris, 2018. 108 [183] D. L. Foxall. Frequency-modulated steady-state free precession imaging. Magnetic Resonance in Medicine, 48(3):502–508, sep 2002. 109 [184] Krishna S Nayak, Hsu-Lei Lee, Brian A Hargreaves, and Bob S Hu. Wide- band SSFP: alternating repetition time balanced steady state free precession with increased band spacing. Magnetic Resonance in Medicine, 58(5):931– 938, 2007. 109 [185] Volkert Roeloffs, Sebastian Rosenzweig, H. Christian M. Holme, Martin Uecker, and Jens Frahm. Frequency-modulated SSFP with radial sampling and subspace reconstruction: A time-efficient alternative to phase-cycled bSSFP. Magnetic Resonance in Medicine, 81(3):1566–1579, mar 2019. 109 129
Abstract
Magnetic Resonance Imaging (MRI) is the most promising modality for evaluating upper airway dynamics because it is non-invasive and involves no ionizing radiation. Over the past decade, real-time MRI (RT-MRI) has been used extensively, with significant improvements in spatiotemporal resolution, to time-resolve the dynamics of natural speech production and sleep disorders.

Existing techniques track tissue surfaces, such as the vocal tract and airway. However, they cannot measure upper airway functions, such as internal muscle movement and muscle tone variation across different airway sites. This dissertation introduces new techniques and novel experiment designs, compatible with RT-MRI, that reveal internal muscle motion and physiological traits of the upper airway.

I develop intermittent tagging during RT-MRI as a means to visualize internal tongue motion during speech production. This approach eliminates the need for re-binning data using multiple repetitions and is suitable for investigations of natural speech production. I demonstrate a framework for selecting imaging parameters that balances image quality against tag persistence, and achieve an imaging window of approximately 650–800 ms at 1.5 T. I demonstrate the ability to capture tongue motion patterns and their relative timing, as exemplified by internal tongue deformation during American English diphthong vowels and consonants.

Next, I demonstrate intermittent tagging with REALTAG to extend tag persistence. This approach provides a 2× improvement in contrast-to-noise ratio (CNR) and > 1.9× longer tag persistence, and is suitable for investigations of natural speech production. I develop an improved method for phase-sensitive reconstruction that provides superior image quality compared to prior approaches in the presence of time-varying background phase. I demonstrate an imaging window of 1250 ms at 1.5 T, which is adequate to capture internal tongue deformations during American English vowel-to-vowel transitions in separate words. This provides a powerful new tool for imaging muscle movement during natural speech production and in other applications where CINE imaging is not applicable or is suboptimal due to the natural variability of human action.

Finally, I present a novel RT-MRI-based experiment that measures two upper airway biomarkers relevant to the study of sleep-related breathing disorders: upper airway loop gain (UALG) and the fluctuation of airway area (FAA). I combine simultaneous multi-slice RT-MRI with continuous positive airway pressure (CPAP) and carefully designed pressure changes. I demonstrate that this new test can localize specific airway sites that are prone to collapse, whereas the conventional apnea-hypopnea index provides only an estimate of the overall severity of obstructive sleep apnea. This new experiment can directly measure location-specific active (UALG) and passive (FAA) physiological traits, and visually resolve airway dynamics.

In this dissertation, I introduce novel techniques and experiment designs while leveraging maturing fast imaging methods to provide the spatiotemporal resolution needed for upper airway RT-MRI. With the proposed methods, we can set our sights beyond the anatomical structures and onto the even more interesting, yet intrinsically complex, functions of the upper airway.
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
Seeing sleep: real-time MRI methods for the evaluation of sleep apnea
Fast upper airway MRI of speech
Visualizing and modeling vocal production dynamics
Fast flexible dynamic three-dimensional magnetic resonance imaging
Improved brain dynamic contrast enhanced MRI using model-based reconstruction
Correction, coregistration and connectivity analysis of multi-contrast brain MRI
Toward understanding speech planning by observing its execution—representations, modeling and analysis
Articulatory dynamics and stability in multi-gesture complexes
New methods for carotid MRI
Technology for improved 3D dynamic MRI
Shift-invariant autoregressive reconstruction for MRI
Measuring functional connectivity of the brain
Novel theoretical characterization and optimization of experimental efficiency for diffusion MRI
Emotional speech production: from data to computational models and applications
Model-based phenotyping of obstructive sleep apnea in overweight adolescents for personalized theranostics
Improving the sensitivity and spatial coverage of cardiac arterial spin labeling for assessment of coronary artery disease
Susceptibility-weighted MRI for the evaluation of brain oxygenation and brain iron in sickle cell disease
Improving sensitivity and spatial coverage of myocardial arterial spin labeling
Enhancing speech to speech translation through exploitation of bilingual resources and paralinguistic information
Behavior understanding from speech under constrained conditions: exploring sparse networks, transfer and unsupervised learning
Asset Metadata
Creator
Chen, Weiyi (author)
Core Title
Functional real-time MRI of the upper airway
School
Viterbi School of Engineering
Degree
Doctor of Philosophy
Degree Program
Electrical Engineering
Publication Date
10/28/2019
Defense Date
08/08/2019
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
dynamic MRI, MR tagging, MRI, muscle function, OAI-PMH Harvest, real-time MRI, sleep apnea, speech production, Tongue, upper airway
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Nayak, Krishna S. (committee chair), Byrd, Dani (committee member), Haldar, Justin P. (committee member), Narayanan, Shrikanth (committee member)
Creator Email
wayne.weiyi.chen@gmail.com,weiyic@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c89-227843
Unique identifier
UC11675384
Identifier
etd-ChenWeiyi-7888.pdf (filename),usctheses-c89-227843 (legacy record id)
Legacy Identifier
etd-ChenWeiyi-7888.pdf
Dmrecord
227843
Document Type
Dissertation
Rights
Chen, Weiyi
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA