University of Southern California

A Biomimetic Approach to Non-Linear Signal Processing in Ultra Low Power Analog Circuits

A dissertation submitted in partial satisfaction of the requirements for the degree Doctor of Philosophy in Electrical Engineering

by

Viviane S. Ghaderi

December 2014

© Copyright by Viviane S. Ghaderi, December 2014

To John Choma.

Table of Contents

1 Introduction
  1.1 Overview
  1.2 Implantable Electronics: Then and Now
  1.3 Categories of Neural Prostheses
    1.3.1 Sensory Prostheses
    1.3.2 Motor Prostheses
    1.3.3 Cognitive Prostheses
2 Signal Processing in the Brain
  2.1 Background on Neural Communication
  2.2 The Basis of Neural Coding
  2.3 Techniques for Measuring Neural Signals
3 Neural Hardware Systems
  3.1 Challenges in Implantable Electronics
    3.1.1 Biocompatibility and Scale
    3.1.2 Power
    3.1.3 Area
    3.1.4 Other Considerations
  3.2 Components of a Neural Electronic System
    3.2.1 Recording Electrode
    3.2.2 Analog Front End: Neural Amplifier and Bandpass Filter
    3.2.3 Analog-to-Digital Converter (ADC)
    3.2.4 Spike Sorting and Pre-Processing
    3.2.5 Telemetry
    3.2.6 Stimulation Electrode and Associated Circuits
  3.3 A Programmable Analog Implementation
4 A Nonlinear Neural System Model
  4.1 Choice of the Neural Model
  4.2 The Laguerre Expansion of Volterra Kernels (LEV) Model
    4.2.1 Laguerre Polynomials
    4.2.2 Laguerre Expansion of Volterra Kernels
    4.2.3 Experimental Setup for Neural Data Acquisition
  4.3 Model Optimization for Hardware Implementation
    4.3.1 System Specifications for Hardware
5 Novel Neural Signal Processing Hardware
  5.1 Overview of the System Architecture
  5.2 LEV System Synthesis
  5.3 Noise and Performance Limitations
    5.3.1 Noise in Electronic Circuits
    5.3.2 Maximizing the SNR in Presence of Noise
  5.4 The Power Efficient Weak-Inversion Regime and Its Limitations
    5.4.1 Derivation of Drain-Current Characteristics
    5.4.2 Transconductance Efficiency
    5.4.3 Leakage
    5.4.4 Parasitic Node Capacitances
    5.4.5 Mismatch
  5.5 Analog LEV Model Implementation
    5.5.1 First Order LEV Generation
    5.5.2 Second Order LEV Generation
    5.5.3 Weighting and Combining
  5.6 Digital Calibration and Programmability
  5.7 Simulation Results
  5.8 Design Considerations for Optimal SNR
    5.8.1 Noise in the Low Pass Filter Section
    5.8.2 Noise in the All Pass Filter Section
    5.8.3 SNR of LEV Components
6 Test Setup and Results
  6.1 ASIC Implementation
  6.2 Test Setup and Measurement Procedure
  6.3 Results
  6.4 Conclusion
Appendix
References

List of Figures

1.1 First fully implantable pacemaker (1958) [1]
1.2 Replacing body functions with implantable electronics: What is possible today and in the near future [2]
1.3 Examples of sensory prostheses: (a) Cochlear implant (b) Retinal implant
1.4 Examples of motor prostheses: (a) Motor arm prosthesis, (b) Foot-drop regulator
1.5 Example of cognitive prostheses: hippocampal prosthesis
2.1 Neuron-to-neuron communication via synapses [3]
2.2 Action potential progression
2.3 Different methods for neural recordings: (a) EEG, (b) ECoG, (c) Micro-electrode array
3.1 Common tradeoffs in hardware design
3.2 Glia activation before and after electrode insertion
3.3 Hardware architecture of a cognitive system: (a) Off-chip processing (b) On-chip digital processing (c) On-chip analog processing
3.4 Electrode tissue interface: (a) Double layer, (b) Equivalent circuit model
3.5 Different types of micro-electrodes: (a) Utah microelectrode array [4] (b) Micro-wire electrode array [5]
3.6 State-of-the-art front end: (a) Neural amplifier (LNA and BPF) (b) Incremental resistance due to pseudo-PMOS resistors (Ma-Md)
3.7 ADC input-output characteristic
3.8 Charge balancing for stimulation
3.9 Average power consumption per channel for a neural system
4.1 Multi-input/multi-output model to capture nonlinear spike-to-spike transformations of multiple neurons
4.2 LEV model to capture transformations of spike trains to excitatory post-synaptic potentials
4.3 Three Laguerre Functions for different p values
4.4 Experimental setup to obtain neural data
4.5 Percent normalized mean square error vs. number of Laguerre Functions for different orders of nonlinearity
4.6 Approximating neural data using the LEV model (a) All-or-none input (to CA3) (b) Measured output (at CA1) (c) Estimated output
4.7 Comparison of EPSP recordings and LEV approximation
4.8 First order kernel
4.9 Second order kernel
4.10 SISO model input spike train x(t) vs. output spike train y(t)
4.11 Pre-threshold potential u and noise signal (σ = 0.7)
5.1 Block diagram of frequency domain Laguerre Functions
5.2 Low-pass filter implementation
5.3 All-pass filter implementation
5.4 Thermal noise representations in a resistor and a MOS transistor
5.5 Thermal, flicker, and combined noise spectrum of a MOS transistor
5.6 Noise models (a) A generic Gm-C lowpass filter (b) Thevenin equivalent
5.7 Transistor level implementation of the Laguerre Filters (a) Low-pass filter (b) All-pass filter
5.8 AC response of the OTA including gm1 and gm2
5.9 AC output impedance of the OTA implementing gm1 and gm2
5.10 The effect of transistor size on the lowpass filter response
5.11 AC response of the lowpass filter
5.12 AC response of the allpass filter
5.13 Transistor sizing in the allpass filter
5.14 AC response of a cascade of three all-pass filters
5.15 Gilbert cell transistor level implementation
5.16 Common-mode feedback for Gilbert cell
5.17 Gilbert cell performance (a) Comparison to an ideal linear multiplier (b) rms error in % between the fit and the circuit output voltage
5.18 Open loop phase margin response of Gilbert cell with common-mode feedback
5.19 Gilbert weighting block
5.20 Gilbert weighting block AC response
5.21 Analog top level schematic
5.22 AC buffers and DC testing switches
5.23 Digital calibration system
5.24 Detailed representation of the digital system
5.25 Top level schematic of the digital subsystem
5.26 7-bit Gate-Select decoder which selects the gate under calibration
5.27 D-flip flop used in the register that stores the correct ladder value for each gate bias voltage
5.28 Simulated output of the D-flip flop: (a) Output Q (b) Input D (c) Clock signal CLK (d) Active low signal Resetb
5.29 Transmission gates in the tree mux used to select the appropriate resistive ladder tap
5.30 Simulated output of the ladder and mux
5.31 Simulated output of the 2nd order LEV model
5.32 Simulated Laguerre functions (L0-L3) for a double pulse input
5.33 Physical layout of the LEV Chip
5.34 AC model of the lowpass filter
5.35 AC model and noise model for the all pass filter section
5.36 Noise contributions per filter stage
6.1 Die photo of the programmable analog system
6.2 Main PCB with mini-PCB and socket attached
6.3 Functional diagram of the computer/micro-controller/PCB interface
6.4 Test Setup for ASIC calibration and measurement
6.5 Calibration algorithm
6.6 Zeroth order Laguerre Result (a) Measured (b) Simulated
6.7 First order Laguerre Result (a) Measured (b) Simulated
6.8 Second order Laguerre Result (a) Measured (b) Simulated
6.9 Third order Laguerre Result (a) Measured (b) Simulated
6.10 Final output result (a) Measured (b) Simulated
6.11 Measured circuit output and LFs along with hippocampal output data in response to a paired pulse input
A.1 Top copper layer of PCB
A.2 Bottom copper layer of PCB
A.3 68-pin QFN Bonding Diagram

List of Tables

3.1 State-of-the-art CMOS analog front end performance comparison
3.2 Performance of state-of-the-art ADCs for neural applications
3.3 Performance comparison of state-of-the-art CMOS spike processing systems
5.1 Power Consumption of the LEV System
5.2 Noise Summary
5.3 Noise Summary for improved SNR
6.1 Performance Summary
A.1 Pin Assignments from chip to package

Acknowledgments

This work and thesis would not have been possible without the support and encouragement of several people throughout the often arduous path of my PhD.
First and foremost, I would like to dedicate this work to Professor John Choma. Words cannot describe what he has done for me. I would not have pursued a PhD degree, and certainly not in electrical engineering, had I not met him during my undergraduate years. He inspired me because he believed in me, while he also challenged me to keep learning and to be rigorous and thorough. Whenever I was at a low point during my PhD he picked me up and told me that I could reach for the stars. Professor Choma will continue to live on in my memory and that of the thousands of students he taught throughout his lifetime. I will never forget him.

This work would not have been completed without the continuous advice, support and encouragement of Professor Theodore W. Berger. His vision of the future of biomedical engineering, along with his passion and dedication to his work, piqued my interest and inspired me to pursue the work presented in this thesis. Throughout my PhD, Professor Berger has been my mentor, advisor, and friend, and it was a pleasure to work with him.

I would also like to thank Professor Aluizio Prata, whose guidance and advice have helped me on many occasions. He is a great teacher and mentor and I appreciate all the interesting discussions I had with him throughout the years. Additionally, I am thankful for the numerous fruitful discussions and collaborations with Professor Vasilis Marmarelis. Professor Norberto Grzywacz has been someone I have looked up to for years and I am grateful and honored to have him as a member of my thesis committee.

I would like to thank Cypress Semiconductor, and especially Dr. Sam Geha, who provided fabrication for my chip, and Artur Balasinski and Dr. Bindu Madhavan for their hours of assistance and guidance in the setup of the design kit.

I am grateful for the help of the staff in Biomedical and Electrical Engineering, who always go out of their way to assist us students, especially Diane Demetras, Consuelo Correa, Karen Johnson and Ken Johnston.

A very important and enjoyable part of my PhD experience were the friends I made at USC. I greatly appreciate the laughs and joys of conference trips, lunches, Frisbee sessions and discussions with my labmates of the Berger lab. Particularly, I am thankful for the sisterhood of Rosa Chan, Sushmita Allam and Huijing Xu. I am also grateful for fruitful discussions with Farhan Baluch and Nadav Ivzan. I am extremely grateful to my friends Birgit Oettl, Christina Jeong, Atefeh Mirbagheri, Colleen O'Brien and Bambi von Logue Newth, who shared my joy and hardship throughout the past years.

I also would like to dedicate my thesis to my father, Dr. Mehdi Ghaderi, who inspired me with his passion for engineering and who always helped me to complete my last and hardest mile, and to my mother, Renate Ghaderi, who encouraged me to take on new challenges and taught me to persist. Only with their continuous support, motivation and love was I able to take on this challenging endeavor. I am also grateful for the support and love of my sister, Vanessa Ghaderi, whom I have looked up to as an inspiration for as long as I can remember.

Lastly, but most importantly, I would like to thank Dr. Shervin Moloudi for his unconditional patience, love and dedication to my success. His love of teaching and the endless hours of discussions with him have helped me to grow not only professionally, but also personally. This work is a testament to this, because it would not have been possible without him, and I am forever grateful to have him in my life.
Abstract of the Dissertation

A Biomimetic Approach to Non-Linear Signal Processing in Ultra Low Power Analog Circuits

by Viviane S. Ghaderi
Doctor of Philosophy in Electrical Engineering
University of Southern California, December 2014
Professor John Choma, Jr., Co-chair
Professor Theodore W. Berger, Co-chair

The human brain has perfected the task of performing complex functions, such as learning, memory, and cognition, in energy and area efficient manners, and therefore, it is attractive to model and replicate this performance in biomimetic systems. Applications for which such models are essential include implantable electronics that replace certain brain functions damaged by diseases or injury, and neuromorphic architectures for novel high-speed, low power computing. To properly mimic a neural function, it is important to determine the optimal level of abstraction because there are trade-offs between capturing biological complexities and the scalability and efficiency of the model. Additionally, the hardware implementation of that model must consume little power to minimize heat dissipation in order to avoid tissue damage and to allow for scaling to a large number of components. The Laguerre Expansion of Volterra (LEV) model is a promising approach since it can capture the spatio-temporal nonlinearities of a neural system at several abstraction levels. Similar to the brain, it uses basic processing strategies and operations like amplification, filtering, delay, and redundancy to produce complex functionalities. This doctoral research addresses the hardware challenges of a biomimetic system and proposes a novel method for its practical implementation. The LEV model is realized using analog subthreshold CMOS signal processing units, which are more power and area efficient than the digital counterparts. Several challenges in subthreshold analog design are addressed in this work. First, application-specific analog circuits are neither easily programmable nor flexible. Second, in the subthreshold regime mismatch and process variations lead to large differences in identically-sized transistors. The brain performs complex functions despite facing similar limitations by using redundancy and learning rules that adjust cell properties. Guided by these principles, the analog subthreshold circuits are designed to be digitally programmable for coefficient and time constant adjustments. Also, since Laguerre basis functions of different orders are not completely orthogonal, a redundancy is created that helps reproduce the composite waveform more accurately. Calibration and training techniques are proposed in this work to accomplish the required precision. These techniques are verified through numerical simulations, and physical implementation and measurement of a system fabricated in a 0.13 μm CMOS process. This work demonstrates the utility of these optimized low-power subthreshold analog circuits, aided by calibration, and uses them to implement the LEV model of a single-input, single-output spike system to replicate the signal transformations of a single neuron. A foundation is laid for an easily scalable system useful for multi-input, multi-output (MIMO) models, to be used in systems such as a hippocampus prosthesis for memory restoration or a large scale neuromorphic computer.
CHAPTER 1

Introduction

1.1 Overview

Scientists have often looked to biology for ideas on how to solve a multitude of problems which the biological world managed to tackle through millions of years of evolution. Since the human brain has perfected the task of performing complex functions, such as learning, memory, and cognition, in energy and area efficient manners, today's engineering community aspires to model and replicate this performance in biomimetic systems. While the brain's impressive capabilities serve as an inspiration, they also present a major challenge, since conventional computing methods have not yet allowed us to achieve such computational efficiency.

Building artificial systems in a biomimetic or neuromorphic way could allow us to capture the brain's energy efficiency, robustness to noise and errors, modularity, and flexibility. Applications for which such systems are essential are manifold: from implantable electronics, which replace certain brain functions in case of damage due to diseases or injury, to neuromorphic architectures for novel high-speed, low power computing.

To construct a biomimetic system, we have to determine the essential characteristics that allow the brain to perform specific tasks. Since it is still unknown how the collective activity of populations of neurons and synapses results in higher-level brain functions, choosing the right mechanisms to include and finding the proper level of abstraction is difficult. Hence, the biomimetic model has to be flexible and modular so that it can be adjusted to account for new findings, without requiring long redesign times.

To properly mimic a neural function, it is important to determine the optimal level of abstraction because there are trade-offs between capturing biological complexities and the scalability and efficiency of the model. Additionally, the hardware implementation of that model must consume little power to minimize heat dissipation in order to avoid tissue damage and to allow for scaling to a large number of components.

One essential feature of brain computing that is very different from traditional computing is that larger networks are made up of small modules (neurons with synapses), which are of similar composition but are not precisely the same. Large networks in the brain with parallel inputs and outputs are made up of many of these slightly different cells, with adjustable features that are activity-dependent (potentiation, depression, etc.). These mechanisms provide for correction in the presence of error or noise, which allows the individual components to be imprecise but adjustable. Therefore, the biomimetic model can use adaptive low-power imprecise modules instead of precise but power-hungry components and satisfy the criteria of compactness, modularity (for easy scaling and modification), low power, and adaptation (programmability).

The Laguerre Expansion of Volterra (LEV) model is a promising approach to neural modeling, since it can capture the spatio-temporal nonlinearities of a neural system at several abstraction levels. Similar to the brain, it uses basic processing strategies and operations like amplification, filtering, delay, and redundancy to produce complex functionalities. Furthermore, as demonstrated in this work, it can be implemented in hardware in a modular and programmable manner.

While software models are very flexible, they are neither low-power nor portable and therefore do not readily lend themselves to implantable applications. Hardware systems implemented on FPGA (field-programmable gate array)-like platforms face the same problems, because they still demand a large area and power. The advantages of an integrated solution are its compactness (maximum density per function as compared to FPGAs and software-based solutions) and power efficiency. The main disadvantage is that integrated circuits fulfill a specific task and, once fabricated, they cannot be modified. Furthermore, the design time can be long. These problems can be mitigated by including programmability, which comes at the price of additional area and power but provides more flexibility.

This doctoral research addresses the hardware challenges of an integrated biomimetic system on-chip and proposes a novel method for its practical implementation. The proposed analog biomimetic hardware comprises components that are optimized for low power, and a digital subsystem that provides calibration and programmability for the analog building blocks. The LEV model is realized using an analog subthreshold CMOS signal processing unit, which is demonstrated to be more power and area efficient than its digital counterpart, since subthreshold operation reduces the power consumption and maximizes the building block density on chip.

Several challenges to subthreshold analog design are addressed in this work. First, application-specific analog circuits are neither easily programmable nor flexible. Second, in the subthreshold regime mismatch and process variations lead to large differences in identically-sized transistor currents. The brain performs complex functions despite facing similar limitations by using redundancy and learning rules that adjust cell properties. Guided by these principles, the analog subthreshold circuits are digitally programmable for coefficient and time constant adjustments. Also, since Laguerre basis functions of different orders are not completely orthogonal, a redundancy is created that helps reproduce the composite waveform accurately. Calibration and training techniques are proposed in this work to accomplish the required precision. These techniques are verified through Matlab and transistor-level Cadence Spectre simulations, and through physical implementation and measurement of the system, which was fabricated in a 0.13 μm CMOS process. The goal of this work is to demonstrate the utility of these optimized low-power subthreshold analog circuits, aided by calibration, and to use them to implement the LEV model of a single-input, single-output spike system to replicate the signal transformations of a single neuron. A foundation is laid for an easily scalable system useful for multi-input, multi-output (MIMO) models, to be used in systems such as a hippocampus prosthesis for memory restoration or a large-scale neuromorphic computer. In this work, the proposed system is presented as part of an implantable cognitive prosthesis for memory restoration.
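Although the circuit details appear in later chapters, the signal-flow idea behind the LEV model can be previewed in software. The sketch below generates the first few continuous-time Laguerre basis functions using the same lowpass-plus-allpass cascade structure that the analog implementation employs; the transfer-function form, the pole value p, and all function names are illustrative assumptions rather than the thesis design.

```python
# A minimal sketch (not the thesis circuit): Laguerre basis functions have
# the Laplace-domain form L_k(s) = sqrt(2p) * (s - p)^k / (s + p)^(k + 1),
# i.e. one first-order lowpass stage followed by k allpass stages. The pole
# p and the time grid are arbitrary illustrative choices.
import numpy as np
from scipy import signal

def laguerre_basis(p, n_funcs, t):
    """Return impulse responses of the first n_funcs Laguerre functions."""
    num = np.array([np.sqrt(2.0 * p)])        # lowpass stage sqrt(2p)/(s+p)
    den = np.array([1.0, p])
    basis = []
    for _ in range(n_funcs):
        _, h = signal.impulse(signal.lti(num, den), T=t)
        basis.append(h)
        num = np.polymul(num, [1.0, -p])      # append allpass (s-p)/(s+p)
        den = np.polymul(den, [1.0, p])
    return np.array(basis)

t = np.linspace(0.0, 0.5, 2000)               # seconds
L = laguerre_basis(p=20.0, n_funcs=4, t=t)    # L0..L3

# In an LEV model, the input spike train is convolved with each basis
# function; first-order terms weight these outputs directly, and
# second-order terms weight their pairwise products.
```

The cascade structure is what makes the model hardware-friendly: each additional basis function costs only one more allpass stage rather than a new filter bank.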
The reason why systems that mimic the brain's computational capability pose so many challenges to the engineering community can partly be understood by looking at the historical development of implantable devices and the nearly exponentially growing demand for more processing power.

1.2 Implantable Electronics: Then and Now

In 1958, the cardiac pacemaker was the first electronic device that could be fully implanted inside the body of a patient (Figure 1.1). This technological breakthrough triggered a surge in the development of implantable microelectronics. Such devices, also called active implants because they are powered electronically, are introduced into the body to remain there after a surgical procedure [6].

Most active implants use a form of electrical stimulation of nerves to treat diseases. This technique has been used as early as 43 A.D. to soothe headaches with the touch of the electrically charged torpedo fish [7, p.87]. In the beginning of the 20th century, external brain stimulation using electroshock therapy was used to treat severe depression and other mental diseases. Today, electrical pulses are applied much more discriminately. Implanted electrodes with external or internal driving components deliver mild electrical charges to specific parts of the nervous system [7, p.88].

Figure 1.1: First fully implantable pacemaker (1958) [1]

Figure 1.2 shows some examples of common implantable devices available today: Artificial cardiac pacemakers use electrical impulses to maintain a patient's regular heartbeat. Foot drop regulators improve the walking ability of a patient suffering from foot drop (difficulty when stepping during walking) by activating nerves inside the patient's leg via electrical pulses. Insulin pumps regulate the glucose dosing of diabetics by subcutaneously injecting a controllable and precise amount of insulin into the pancreas. A gastric stimulator sends mild electrical pulses through electronic leads to stimulate the smooth muscles of the lower stomach to help control the chronic nausea and vomiting caused by gastroparesis [8]. Cochlear implants use electrical stimulation to aid or reconstruct a patient's hearing ability. Deep brain stimulation is a new type of treatment still under development for applications such as chronic pain, Parkinson's, and many more.

Figure 1.2: Replacing body functions with implantable electronics: What is possible today and in the near future [2]
Today, electronic implants are used extensively not only to extend and save lives, but also to improve the quality of human life. The impact of these devices becomes apparent when looking at the market analyses and projections for this biomedical engineering field. According to BCC Research, a publisher of technology market research reports, "the global market for microelectronic medical implants, accessories and supplies was worth an estimated US $15.4 billion in 2010 and is projected to grow" by over 60% to US $24.8 billion in 2016 [6].

Cardiac pacemakers, defibrillators, and cochlear implants make up the largest sector of the biomedical device industry. However, they are now considered mature technologies and are projected to contribute insignificantly to prospective market growth. Biomedical implants of the future will be the so-called smart neural implants, which are customizable to the patient and will be able to do much more than treat only medical conditions. Controlling electrical devices using our brain signals, "downloading" or "uploading" information to and from our brain, and partially replacing or augmenting cognitive functions are just a few examples of "smart" implants that scientists believe to be within reach soon [9], [10].

1.3 Categories of Neural Prostheses

Neural prostheses, which are designed to replace or improve certain functions of the central nervous system, constitute an important class of brain implants. To understand why some neural prostheses are already implanted in over 100,000 patients while others are not even past theoretical speculation, it is useful to classify them into three different categories: sensory, motor, and cognitive. This will allow us to understand what the design of prostheses for each category entails and what challenges should be overcome before they can be implemented.

1.3.1 Sensory Prostheses

Sensory prostheses collect information using electronic or other types of sensors whose outputs are directly delivered to the nervous system via electrical stimulation [11]. To construct sensory prostheses, it is important to understand the neural code that organs, such as the ear or the eyes, use to transmit information to the respective processing area of the brain (auditory cortex for inputs from the ear and visual cortex for inputs from the eyes). The next step is to identify a "surgically accessible site" to which a spatio-temporal pattern of stimulation can be applied [11].

Cochlear prostheses (Figure 1.3 (a) [12]) use electrical stimulation to bypass defective hair cells that normally transform sound into neural activity. They are the most advanced type of neural prostheses to date and have been successfully implanted in over 100,000 patients. If done at an early age, cochlear implantation often leads to almost full restoration of a patient's hearing ability. Despite the success of the cochlear prosthesis, there are still challenges that need to be addressed. The dynamic range of the cochlear implant is still orders of magnitude less than the natural human hearing range (20 dB dynamic range for the implant as compared to humans' natural 100 dB dynamic range). Furthermore, the number of independent input channels for cochlear implants is low compared to the input channels of a healthy ear, which is thought to be the reason why patients with cochlear implants complain about their performance in noisy environments. To scale up to a higher number of inputs and higher dynamic range, better methods have to be developed to provide power for the device. External parts visible from the outside are required to power existing cochlear prostheses.

Figure 1.3: Examples of sensory prostheses: (a) Cochlear implant (b) Retinal implant

Retinal implants (Figure 1.3 (b) [13]) had long been in a stage of development and clinical trials until, in February 2013, the Argus II became the first retinal implant to be approved by the FDA for sale in the US. There are a number of technical problems to be solved before vision can be restored sufficiently for blind patients suffering from retinitis pigmentosa and other retinal afflictions. As with the cochlear prosthesis, wireless powering is important for these implants. Furthermore, to properly process visual information, the brain needs a large number of inputs (usually received from many photoreceptors in the retina), which results in more stringent design criteria for the electronic sensors and the power consumption of the associated circuits.
Since sensory prostheses interface to the outside world through sensors, the input into the visual neural system which processes this information is known. It is possible to measure and decode the neural response to a given sensory input and hence to understand how the brain translates sensory information into neural code.

1.3.2 Motor Prostheses

Motor prostheses, another type of neural prostheses, are intended to directly couple the brain to machines to restore lost motor function. A motor prosthesis can be used to control a computer keyboard, to animate a robotic limb, or to stimulate muscles in a certain body part, such as the spinal cord or the foot (Figure 1.4) [14], [15].

Figure 1.4: Examples of motor prostheses: (a) Motor arm prosthesis, (b) Foot-drop regulator

Muscle stimulators, such as the WalkAide to correct foot drop, are motor prostheses which are comparatively easy to implement. More complex motor prostheses employ a large number of electrodes to record the electrical activity of neurons in the motor or pre-motor cortices to directly control a robotic limb or a computer cursor. Most motor prostheses are still limited in performance because of the invasiveness of the large number of electrodes required to gain adequate resolution to record from different motor regions in the brain. Recent efforts in brain-machine interfaces to control robotic arms have shown promising results by incorporating learning algorithms into the device, in addition to input-output response mapping [16].

As described in the previous section, sensory prostheses encode sensory information into spatio-temporal neural patterns. In contrast, motor prostheses aim to decode spatio-temporal patterns into trajectories or movements of objects or body parts. In both cases, one end of the prosthesis usually interfaces with the outside world. Motor prostheses transform spatio-temporal neural signals into a measurable output such as the movement or trajectory of a robotic arm.

1.3.3 Cognitive Prostheses

Cognitive prostheses aim to restore or augment brain functions that are limited due to damaged brain tissue, caused for example by a stroke, disease, or traumatic brain injury. Research in this area is growing, but there are currently no commercially available cognitive prostheses. The reason for this lack of progress, as compared to other types of neural prostheses, is that knowledge of the underlying neural mechanisms is insufficient to explain how the collective activity of neurons results in higher level processes such as language, abstract reasoning, and learning and memory. These cognitive functions are considered to be the most complex operations of the brain and are difficult to define in terms of the underlying neural mechanisms.

A contributing factor to this complexity in building cognitive prostheses is that, as opposed to the other types of neural prostheses, they do not have any external inputs or outputs, and it is difficult to correlate the neural modulation to variables in the environment. The transformation from input to output patterns performed by a healthy brain is highly non-linear and difficult to replicate. Cognitive prostheses have to decode the spatio-temporal patterns recorded at the input, process them, and then also properly encode the outputs into another set of spatio-temporal patterns. Additionally, without one side interfacing to the exterior of the body, it is difficult to measure the performance of the artificial nonlinear system.
Special experimental tasks are required to validate the system.

One of the most advanced cognitive prostheses to date is the hippocampus prosthesis developed by a team of researchers at USC and Wake Forest University (Figure 1.5 [17]). The goal of this project is to design a mathematical model capturing the nonlinear transformations performed by the healthy hippocampus, which can be implemented efficiently on a silicon VLSI chip [18].

Figure 1.5: Example of cognitive prostheses: hippocampal prosthesis (multi-site electrode arrays couple the hippocampal input and output to a VLSI biomimetic model)

CHAPTER 2

Signal Processing in the Brain

2.1 Background on Neural Communication

The brain has a highly parallel architecture performing complex tasks such as learning, memory and cognition. To successfully create devices which can communicate with the brain, or to mimic efficient brain-processing, it is essential to understand how the brain encodes such tasks. This chapter provides a background on some of the elements and mechanisms in the brain which contribute to its signal processing.

Two main classes of cells play a role in the signal transformations performed by the brain: neurons and glia. While glia are usually considered to be support cells only, neurons are known to be the fundamental processing units. The approximately 10^11 neurons in the human brain communicate using electrical and chemical signaling [19]. The general structure of a neuron is shown in Figure 2.1. Neurons consist of a cell body (soma), a protruding tree-like structure of several projecting processes (dendrites), and a single, long, cable-like extension (axon). The connection site between two neurons is called a synapse, a narrow gap of only 20 nm, at which information is transferred from the transmitting neuron, on the pre-synaptic side, to the receiving neuron, on the post-synaptic side, via neurotransmitters and other chemicals [20, p.20]. There are over 10^14 synapses in the human brain and they are considered the most important locations of information exchange [20], [19].

Figure 2.1: Neuron-to-neuron communication via synapses [3]

Neurons communicate using a combination of electrical and chemical signaling, which together enable complex nonlinear processing and rapid, long-distance transmission of time-sensitive data. Without any input from the dendrites, neuronal cells are unexcited and ion channels establish a resting (equilibrium) potential across the neural membrane (which is impermeable to ions and water). This potential arises due to the current flow of mainly potassium (K+) and sodium (Na+) ions across the membrane through their respective resting ion channels, which are open when the cell is at rest. The equilibrium potential of a cell depends on the concentration of ions inside the cell (intracellular) and outside the cell (extracellular) and is maintained by the resting ion channels and an active sodium/potassium pump, which extracts sodium from the cell and adds potassium, against their gradients. Typically, this potential is around -65 mV, referred to the extracellular fluid. When current flows into the cell across the membrane, it is depolarized, meaning its potential becomes less negative or possibly positive.
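As a concrete illustration of how an ion's equilibrium potential follows from its concentration gradient, the short sketch below evaluates the standard Nernst equation; the concentrations are generic textbook values for a mammalian neuron, not data from this dissertation.

```python
# Worked example of the standard Nernst equation
#   E_ion = (R*T)/(z*F) * ln([ion]_out / [ion]_in).
# Concentrations are generic textbook values, not measurements from this work.
import math

R = 8.314       # gas constant, J/(mol*K)
F = 96485.0     # Faraday constant, C/mol
T = 310.0       # body temperature, K

def nernst_mV(z, c_out_mM, c_in_mM):
    """Equilibrium potential (mV) for an ion of valence z."""
    return 1e3 * R * T / (z * F) * math.log(c_out_mM / c_in_mM)

print("E_K  = %.0f mV" % nernst_mV(+1, 5.0, 140.0))    # about -89 mV
print("E_Na = %.0f mV" % nernst_mV(+1, 145.0, 15.0))   # about +61 mV
```

The roughly -65 mV resting potential quoted above lies between these two single-ion values because the resting membrane is permeable to both ions, with potassium permeability dominating.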
When there is an input to a synapse from the axon of an adjacent neuron, chemical interactions produce post-synaptic currents which flow through the dendrites of the receiving neuron. If there are inputs from its other dendritic branches as well, the post-synaptic currents are integrated (nonlinearly summed) at the cell body of the neuron, and an electrical impulse called an action potential, or spike, is produced if the potential of the neuron's membrane (which acts like a capacitor) exceeds a threshold. Figure 2.2 illustrates the generation of an action potential due to the two main types of voltage-gated ion channels involved. Figure 2.2 (a) and (b) show the channel opening and closing probabilities and the resulting conductances for voltage-gated sodium and potassium channels, respectively, when the membrane potential is temporarily depolarized from the resting potential.¹

It can be seen in Figure 2.2 (a) that a temporary depolarization of the neural membrane causes a rapid increase in the opening probability and the conductance of the voltage-gated sodium channels. It can also be seen that these channels quickly desensitize, which means that the channels close again soon after the depolarization and the conductance decreases back to base level, even though the membrane potential is still depolarized. As can be seen in Figure 2.2 (b), voltage-gated potassium channels respond much more slowly to a large membrane depolarization than their sodium counterparts. Voltage-gated K+ channels open after a delay and, therefore, the potassium conductance slowly increases until most channels are open. Voltage-gated potassium channels remain open until the cell membrane is repolarized back to rest.

Figure 2.2: Action potential progression: (a) sodium channels (Na+) and (b) potassium channels (K+), showing channel opening and closing probabilities and resulting conductances; (c) progression of an action potential (excitation threshold, rapid Na+ channel opening, delayed K+ channel opening, rapid Na+ desensitization while K+ channels stay open, hyperpolarization with slow closing of K+ channels, return to rest)

Using this information, the generation of an action potential, as shown in Figure 2.2 (c), can be explained. When a pulse of electricity temporarily depolarizes the membrane of the cell, voltage-gated Na+ channels are activated and sodium ions start to flow in rapidly, due to the behavior shown in Figure 2.2 (a). A depolarization of 10 to 15 mV from the resting potential causes a large burst in the number of opening sodium channels, leading to further rapid depolarization. This process is accelerated by a positive feedback mechanism: more influx of sodium causes more depolarization, which in turn leads to more channels opening. The voltage at the onset of the positive feedback is called the threshold voltage of a neuron. The neuron is now in the excited state. A large number of gated sodium channels are open at once and depolarize the cell until the membrane voltage is close to the equilibrium potential of sodium (+55 mV) [20]. As the voltage remains high, sodium gated channels start to close, since they desensitize as shown in Figure 2.2 (a), while the slower potassium channels are still opening, as shown in Figure 2.2 (b), pulling the potential back down towards the negative resting potential.

¹ The channel probabilities and conductances are obtained using patch clamp recordings during a depolarization from -80 mV to -40 mV for the sodium channels and a depolarization from -100 mV to 0 mV for the potassium channels.
Since these channels open and close more slowly than their sodium counterparts, the membrane voltage undershoots for a period of time called the refractory period, until finally all voltage-gated channels are closed and the membrane is back at the equilibrium potential [20]. The amplitude of an action potential is approximately 100 mV (measured with respect to the resting potential) and the duration is around 1-2 ms, depending on the type of neuron and other biochemical factors.

Action potentials are often considered to be all-or-none events of equal amplitude and duration. Therefore, very similar to digital signals, neural spikes are robust to noise and ideal for information transmission over long distances. This property is critical for neural signals, which have to travel long distances along axons to reach adjacent synapses and neighboring neurons. Assuming information is encoded in the timing and not the amplitude of the spikes, two theories exist on how neurons encode information [21]. One theory assumes that the average spike rate of action potentials is important (rate coding), while the other theory assumes that the timing between spike events or the timing of their occurrences carries the neural message (temporal coding) [22], [23, p.35]. As the averaging time interval used to obtain the rate code gets smaller, these two theories become more similar.
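The threshold and all-or-none behavior just described is often summarized, at a much coarser level of abstraction than the conductance picture above, by a leaky integrate-and-fire model. The sketch below is such a toy model; all parameter values are illustrative assumptions, not values from this work.

```python
# Toy leaky integrate-and-fire neuron: a deliberately coarse abstraction of
# the threshold/positive-feedback behavior described above. No explicit
# Na+/K+ conductances are modeled; the reset crudely stands in for the
# refractory undershoot. All values are illustrative assumptions.
import numpy as np

dt = 1e-4                                        # time step, s
tau, R = 20e-3, 10e6                             # time constant (s), resistance (ohm)
v_rest, v_th, v_reset = -65e-3, -50e-3, -70e-3   # volts
t = np.arange(0.0, 0.3, dt)
i_in = np.where((t > 0.05) & (t < 0.25), 2e-9, 0.0)   # 2 nA current step

v, spike_times = v_rest, []
for k, i in enumerate(i_in):
    v += (dt / tau) * (v_rest - v + R * i)   # leaky integration of the input
    if v >= v_th:                            # threshold crossing -> spike
        spike_times.append(t[k])
        v = v_reset                          # reset after the all-or-none event
print("%d spikes, first at %.3f s" % (len(spike_times), spike_times[0]))
```

A model this simple deliberately discards the biophysical richness discussed in the next section, which is exactly the abstraction trade-off this thesis repeatedly weighs.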
2.2 The Basis of Neural Coding

To mimic neural processing in computational and hardware models, it is critical to understand how neurons communicate and process information, and specifically how they generate, transform and decode spike codes. Even though the exact underlying mechanisms are still unclear, it is known that neural coding is affected by several neural attributes, including cell morphology, biophysical and biochemical interactions, and plasticity. As explained below, these aspects generate nonlinearities in the membrane potentials and currents, which in turn generate complex spiking patterns and allow for a large signal processing capacity.

The shape, size and branching of the cells significantly affect their electrophysiological response. Over the past decades it has been found that the morphology of dendrites in particular plays an important role in the integration of cell inputs. Dendrites are intricate structures whose morphology and chemical channels affect neural spiking and bursting [24]. It has been found that dendrites perform parallel processing of inputs, including filtering, amplification, and nonlinear summation, to create complex functions such as coincidence detection and selectivity [25]. Furthermore, the location where the signal is received is important, and the presence or lack of myelination (ensheathment which allows for better signal conduction) affects a cell's signaling properties [23, p.203].

There are thousands of biophysical pathways and channels that generate and modify spike timing, in addition to the voltage-gated sodium and potassium channels mentioned in Section 2.1, which are involved in the generation of the action potential. Several important ones are based on the action of potassium and calcium in the cell, such as the A-type potassium conductance, the calcium-activated potassium conductance, and T-type calcium channels.

A-type potassium conductances have two important roles. Firstly, their activation allows for a neural spiking rate proportional to the total current in the soma, which basically means they allow for frequency modulation of the spike trains proportional to the signal strength of the input [23, p.198]. Secondly, when the A-type potassium conductance is inactive and then suddenly activated due to a positive membrane current, the onset of the action potential is delayed with respect to the input [23, p.199]. Calcium-activated potassium conductances play an important role in neural bursting (intervals of high frequency spiking). These conductances are activated when large amounts of calcium build up during high frequency spiking, and they prevent extended bursting [23]. T-type calcium conductances are inactive after long periods of hyperpolarization. Activating these channels with the injection of a subsequent positive current leads to a slow transient depolarization called a calcium spike, which causes the neuron to fire a burst of action potentials on top of the slow calcium depolarization [23, p.200]. These examples of biophysical channels illustrate some of the mechanisms neurons use to encode information in spike timing, and there are many more known and unknown pathways which contribute.

An important dynamic neural signal processing mechanism, which is based on both the cell's underlying structure and its biophysical channels, is short-term plasticity (STP). STP denotes a change in the efficacy of a synapse after repetitive activation [26]. With STP, the synaptic signal transmission is altered depending on the pre-synaptic activity over a time scale of milliseconds to less than several minutes [26]. This characteristic is considered a dynamic nonlinearity and encompasses short-term facilitation (STF) and short-term depression (STD). STF happens when previous presynaptic events cause an increased postsynaptic response, and STD happens when previous presynaptic events cause a decreased response at the postsynapse. STF is caused by the accumulation of residual calcium at the axon terminal after spike generation, which increases the release probability of neurotransmitters, while STD occurs due to the depletion of the neurotransmitter vesicle pool [27]. Potentiation significantly expands the signal processing and encoding capabilities of a neuron, by using dynamic coding techniques instead of static ones.

The mechanisms described in this section are just some examples of the many nonlinear processes that shape the neural temporal code and allow for the vast signal processing capabilities of individual neurons. However, cortical functions are generally encoded in the firing activity of not just a single neuron but of a population of neurons. Therefore, to understand how neurons process information, it is important to examine the relationships of firing patterns across neural populations and to consider the total spatio-temporal neural response [28] instead of the temporal firing activity of a single neuron.

Despite our knowledge of many of the effects that contribute to the temporal coding of individual neurons, it is still unknown how the collective activity of populations of neurons and synapses results in higher level brain functions. Therefore, the level of detail that has to be included in a model, in hardware or software, which mimics the information processing performed by the brain, is difficult to assess.
Furthermore, since it is impossible to include all these complex underlying neural mechanisms, the right level of abstraction for a model mimicking a neural function is often difficult to determine. There are trade-offs between capturing biological complexities on one hand and the scalability and efficiency of the model on the other. To construct a biomimetic system, we have to try to determine and implement only the essential characteristics that allow the brain to perform specific tasks.

2.3 Techniques for Measuring Neural Signals

To decode, augment and emulate the behavior of neurons, it is essential to interface with the brain and to measure their signaling activity. Several techniques are currently used to observe brain activity, but recording signals in vivo, meaning in behaving, freely moving subjects, is challenging and there are still many bottlenecks to overcome [29].

The method used to measure neural activity depends on the required spatial and temporal resolution of the application and the acceptable level of invasiveness. From lowest to highest level of invasiveness, the three most commonly used methods for neural recording are shown in Figure 2.3: EEG (electroencephalography), ECoG (electrocorticography) and micro-electrode recordings.

EEG is applied on the outside of the skull and provides "macroscopic" information based on the synchronous aligned activity of neurons in the "neocortex that produce modulations in brain oscillations" [30]. ECoG recordings measure neural activity right below the skull, from the surface of the cortex. This method allows measurement of local field potentials, which contain the synchronized post-synaptic potential activity of neurons in the cortex. Micro-electrodes allow for the measurement of single unit spiking activity (activity from individual neurons), as they are directly inserted into the brain tissue.

Figure 2.3: Different methods for neural recordings: (a) EEG, (b) ECoG, (c) Micro-electrode array

Depending on the method of recording neural activity, there are several options for decoding these signals, including frequency analysis, rate coding and spike timing. Frequency techniques, such as Fourier analysis, are the main tools used on EEG as well as ECoG data. Rate coding is another analysis method, which assumes that the spiking activity over a certain time interval is important, as opposed to the precise time of a spike occurrence. This method is applied to ECoG and micro-electrode recordings. Spike time analysis can only be applied to micro-electrode recordings, which provide spiking data from a single cell. This information contains the precise timing of individual spikes (and the time between spikes, or inter-spike interval). When high resolution information is required, such as for neural implants and prostheses, micro-electrodes are used to acquire neural data. Therefore, rate coding or spike timing analyses are generally applied for these applications.
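To make the distinction between the two decoding views concrete, the sketch below applies a windowed rate code and an inter-spike-interval (spike timing) description to one made-up spike train; the spike times and window length are illustrative assumptions.

```python
# One hypothetical spike train analyzed two ways: a windowed rate code
# versus the inter-spike intervals used by spike-timing analyses.
# Spike times and the window length are made up for illustration.
import numpy as np

spike_times = np.array([0.012, 0.030, 0.041, 0.155, 0.161, 0.240])  # s

window = 0.050                                   # 50 ms averaging window
edges = np.arange(0.0, 0.300 + window, window)
counts, _ = np.histogram(spike_times, bins=edges)
rate_hz = counts / window                        # rate code per window

isi = np.diff(spike_times)                       # temporal-code ingredient

print("rate per window (Hz):", rate_hz)          # [60. 0. 0. 40. 20. 0.]
print("inter-spike intervals (s):", isi)
# Shrinking `window` pushes the rate description toward the timing one,
# mirroring the convergence of the two theories noted in Section 2.1.
```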
CHAPTER 3

Neural Hardware Systems

Great progress has been made since the dawn of implantable electronics, which are now used to augment or replace functions in many different body parts. Since the first neural implant, investigating the brain has become more quantitative and information-driven. Improved experimental techniques for measuring and imaging electro-chemical information, as well as better computational tools to analyze and process neural data, have led to a deeper understanding of many complex neural mechanisms.

However, as demonstrated in the next section, challenging problems still lie ahead for the engineering and science community to solve. More input and output channels, higher power efficiency, less invasiveness, and complex processing are the desired characteristics of future implants. Additionally, a better understanding of the mechanisms involved in information processing in the nervous system is essential.

3.1 Challenges in Implantable Electronics

As neuro-prosthetic devices transition from experimental to clinical use, there is a need to realize electronic devices that can be implanted to amplify, process or transmit the data gathered by the recording sub-system [30]. Designing such electronic systems is a complex task and remains one of the principal challenges of neural prostheses.

The goal is often to gain maximum spatial electrode resolution and on-chip signal processing while minimizing surface area and power. Some of the principal design tradeoffs are shown in Figure 3.1. Power, area, cost, reliability, and speed are interdependent factors to be considered for hardware design, which can impose limitations on the complexity and performance of the implant.

Figure 3.1: Common tradeoffs in hardware design

3.1.1 Biocompatibility and Scale

When a device such as an electrode is inserted into the body, it damages the surrounding tissues. It is treated as a foreign object and an inflammatory response is initiated by the glial cells, which ensheath the electrode. Microglia and astrocytes migrate towards the damaged area and encapsulate the electrode in an attempt to heal the damage. However, this sheath of glia increases tissue impedance and therefore diminishes the ability to record from and stimulate the surrounding area [31]. To achieve larger scale systems that capture the complex neural activities of populations of neurons, it is important to maximize the density of electrodes in certain brain areas. However, the more electrodes, the larger the inflammatory response and the more likely it is to damage surrounding tissue.

Figure 3.2: Glia activation before and after electrode insertion

Recent discoveries have led to promising results in the field of biocompatibility of electrodes. Different coatings with anti-inflammatory materials, and smaller and thinner nanoscale electrodes with lower stiffness, have been demonstrated to be less invasive and to generate less of a glial response [31]. While this area is not the main thrust of this work, state-of-the-art electrode technology will be employed for optimal system performance in future work.

3.1.2 Power

Power consumption is critical in biomedical implant applications due to the risks of overheating and tissue burn. The implanted electronics must be designed to work in human tissue, a sensitive environment that absorbs heat and can get damaged. Hence, proper circuit design techniques must ensure that the maximum temperature increase of neural tissue due to the implant is smaller than 1 °C. A commonly cited rule for calculating the maximum power density to avoid tissue burn is 0.8 mW/mm² [32], [33].
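A quick back-of-the-envelope application of this rule shows how tight the budget becomes as channel counts grow; the implant footprint (the roughly 10 mm by 10 mm planar limit discussed in Section 3.1.3) and the channel count below are illustrative assumptions.

```python
# Back-of-the-envelope power budget from the 0.8 mW/mm^2 rule quoted above.
# Footprint and channel count are illustrative, not specifications.
P_DENSITY_MAX = 0.8            # mW/mm^2, tissue-safety rule of thumb
area_mm2 = 10.0 * 10.0         # ~10 mm x 10 mm planar implant (Section 3.1.3)
n_channels = 100               # hypothetical electrode/channel count

p_total_mW = P_DENSITY_MAX * area_mm2
print("total budget: %.0f mW" % p_total_mW)                 # 80 mW
print("per channel:  %.2f mW" % (p_total_mW / n_channels))  # 0.80 mW
```

With every channel needing amplification, filtering, and often digitization and telemetry, a sub-milliwatt per-channel ceiling is a central reason this work pursues subthreshold analog processing.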
Aside from tissue burn, it has been shown that even small fluctuations in the temperature of the neural tissue surrounding the implant can change the behavior of its cells. One example is the effect of temperature on N-methyl-D-aspartate (NMDA) receptors, excitatory neurotransmitter receptors crucial for signal processing in the synapse. The temperature-dependent changes of the receptor can alter the amplitude and time course of NMDA receptor mediated postsynaptic currents, which in turn influences synaptic plasticity, a form of learning inside the synapse [34].

Another reason why power is an important issue for neural implants is battery lifetime. High power consumption of the implant demands more frequent battery replacements. In the case of a fully implanted system, this often translates to occasional surgeries to replace the battery. An alternative is to include larger batteries, which is not an ideal option because of the restrictions on the area of the implant.

More recent implants are often powered by rechargeable batteries, which offer a more favorable solution. Even in this case, low power hardware is important because frequent charging of the implant becomes an inconvenience for the patient. When inductive power coupling and other wireless means are used to recharge batteries, the low efficiency of power transfer through the tissue can become a bottleneck, and the charging system can potentially lead to tissue damage if not carefully designed.

Another problem that underscores the importance of low power hardware design for implants is the demand for higher electrode resolution. More channels to be processed in the implant calls for amplification and filtering for each channel, and often digitization, processing and telemetry electronics. Hence, it becomes clear that designing the electronics at ultra low power levels is critical in achieving higher functionality and improved resolution for a neural implant. Therefore, one of the main focal points of this work is to find the most power efficient basic analog building blocks to construct parts of a neural system.

3.1.3 Area

Another factor which restricts the performance of a neural implant is its area, including the electrodes, electronics, and battery. These components should not cause discomfort for the patient and should not interfere with normal activities. The total area depends on the carrier of the implant but, generally speaking, for human subjects a planar implant should preferably be less than 10 mm by 10 mm [30]. The size of the final prototype is often limited by the battery or the charging mechanism [35].

3.1.4 Other Considerations

There are many other factors that should be considered when designing a brain implant, and their significance depends on the specific application. Reliability over the intended life-span is important, because every time an implant has to be removed due to malfunction, the patient has to undergo a surgery. Speed is crucial when real-time processing is required of a neural implant. As the system is scaled to higher resolution levels, it becomes difficult to process many channels at speeds fast enough to produce a real-time output. Often, the implant needs to include patient specific adjustments, and therefore there is a need for programmability on chip, in addition to telemetry to receive and transmit the required control data.

3.2 Components of a Neural Electronic System

Three possible hardware architectures for a closed loop VLSI system are shown in Figure 3.3. All three realizations have an analog front end (AFE), which is directly connected to the implanted recording electrodes at the tissue interface. The AFE amplifies and filters the recorded signals (either spikes or graded potentials, depending on the application) for further processing.
Figure 3.3: Hardware architecture of a cognitive system: (a) Off-chip processing (b) On-chip digital processing (c) On-chip analog processing

The main difference between these three alternative solutions is how neural processing, including spike sorting, is performed. (Spike sorting is a method to classify which neuron dominantly contributed to a spike detected in the composite waveform recorded by an electrode.) The back-end (neural stimulation) then encodes the calculated outputs into stimulation patterns, which are fed back into the tissue. The solution shown in Figure 3.3 (a) performs only minimal neural processing on-chip in the digital domain, and the data is sent off-chip via telemetry for the bulk of the processing. A fully-integrated digital solution is shown in Figure 3.3 (b), where signals are digitized after the AFE via an ADC. Figure 3.3 (c) presents the corresponding analog equivalent on-chip solution, which is the system proposed in this work. The individual components of the systems in Figure 3.3 are described in the following sections, and state-of-the-art performance of their hardware implementations is provided.

3.2.1 Recording Electrode

Micro-electrodes are used to record the single unit activity of a neuron, as opposed to the combined response of many neurons, which can be measured using EEG or ECoG. These electrodes are placed in close proximity to the target neuron, usually have a geometric surface area in the range of 1,000 to 4,000 μm², and typically have diameters of 13 to 80 μm [36].

When a metal electrode comes in contact with tissue (i.e. electrolyte), chemical reactions are triggered at the interface. These result in the exchange of charge between the electrode and the electrolyte until an equilibrium potential (also defined as the half-cell potential) is reached, at which point the electrical and chemical forces at the interface are equal. The chemical reactions result in an electrical potential which then attracts charge to the interface. As a result, charge separation occurs and a double layer of charge is formed (Figure 3.4 (a)). The equivalent electrical model of the interface, the electrode, and the tissue is shown in Figure 3.4 (b) [37].

Figure 3.4: Electrode tissue interface: (a) Double layer, (b) Equivalent circuit model

In this model, C_s is the distributed capacitance of the electrode and R_m represents its series resistance. C_e is the capacitance resulting from the double layer and R_e is the leakage resistance due to charge carriers crossing that layer. R_s represents the series resistance of the tissue, and e_n is the potential at the electrode tip when no current is flowing. The exact value of the equilibrium potential is a function of the material of the electrode and the composition of the tissue. This half-cell potential results in a DC voltage offset at the input of the amplifier connected to the recording electrode (Figure 3.4 (b)). The DC offset can be significant compared to the signal to be recorded. This situation places restrictions on the amplifier design and, as a result, the electrode material should be selected with that in mind. Toxicity, mechanical strength, and tissue irritation are among the other issues that impact the choice of electrode [38, p.127].
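To make the role of the interface elements concrete, the following sketch evaluates the impedance magnitude of a simplified version of the Figure 3.4 (b) model over frequency. The element values are illustrative assumptions, not measured parameters, and the distributed capacitance C_s and the half-cell source e_n are omitted for brevity.

```python
import numpy as np

# Simplified electrode-tissue impedance: R_s + R_m in series with the
# double layer (R_e in parallel with C_e). All values are assumptions.
R_s = 10e3       # tissue series resistance [ohm]
R_m = 50e3       # electrode series resistance [ohm]
R_e = 1e9        # double-layer leakage resistance [ohm]
C_e = 1e-9       # double-layer capacitance [F]

for f in (1.0, 10.0, 100.0, 1e3, 1e4):
    w = 2 * np.pi * f
    Z_dl = R_e / (1 + 1j * w * R_e * C_e)    # double layer: R_e || C_e
    Z = R_s + R_m + Z_dl
    print(f"{f:8.0f} Hz : |Z| = {abs(Z) / 1e3:10.1f} kOhm")
```

At 1 kHz this illustrative set of values lands inside the 50 kΩ to 1 MΩ range quoted below for typical recording electrodes.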
A wide range of materials is used for recording electrodes, including stainless steel, tungsten, platinum, platinum-iridium alloys, iridium oxide, titanium nitride, and poly(ethylene dioxythiophene). Two different types of micro-electrodes are shown in Figure 3.5. The impedance of recording electrodes (R_m) is typically characterized at 1 kHz, because the spike bandwidth is centered at this frequency, and ranges from approximately 50 kΩ to 1 MΩ, depending on the electrode material. The amplitude of the neural signals measured by the micro-electrodes is on the order of tens to hundreds of μV with respect to the equilibrium potential. The high frequency response of the electrode is limited by the large electrode impedance along with its distributed capacitance C_s [39]. Along with the intended recording, the electrode will also pick up electrical noise resulting from the summation of neural activity in the vicinity of the recording site. Neural recordings with a signal-to-noise ratio of 5:1 are regarded as successful measurements [39]. Some of this noise is outside the signal band and will be removed by subsequent filtering to improve the SNR.

Figure 3.5: Different types of micro-electrodes: (a) Utah microelectrode array [4] (b) Micro-wire electrode array [5]

3.2.2 Analog Front End: Neural Amplifier and Bandpass Filter

The neural amplifier is responsible for the amplification of neural signals to levels adequate for processing in subsequent stages of the system. There are two important categories of signals to record: local field potentials (LFPs), which typically have an amplitude of 2-3 mV, usually in the frequency range of 1 Hz to 300 Hz, and extracellular action potentials (APs), with an amplitude of 50-500 μV, in the range of 300 Hz to 5 kHz.

As described in the previous section, electro-chemical reactions at the electrode interface incur potentially large DC offsets that accompany the recorded neural signals. It is important to eliminate this offset before amplification, as it can be orders of magnitude larger than the neural signals and saturate the amplifier. Therefore, a bandpass filter is needed to eliminate the static offset as well as to attenuate high frequency noise. In many state-of-the-art systems, the bandpass filter is integrated with the neural amplifier [40,41].

3.2.2.1 Front End Noise and Power

Since the first stage of amplification dominates the noise level of the system, the main challenge in the design of the neural amplifier is to achieve the desired amplification while introducing only very low levels of noise into the signal [42]. The noise efficiency factor (NEF) is a figure of merit that relates the input-referred noise of the system to the current consumption and bandwidth. It is defined by [43]:

NEF = v_{ni,rms} \sqrt{\frac{2 I_{tot}}{\pi \cdot u_T \cdot 4kT \cdot BW}}    (3.1)

In this relationship, v_{ni,rms} is the total equivalent input noise, I_{tot} is the total current used in the amplifier, BW is the amplifier bandwidth, k is Boltzmann's constant, T is the temperature in Kelvin, and u_T is the thermal voltage [43]. (The notation V_T is used in some of the literature but is avoided here to prevent confusion with the transistor threshold voltage.) Given the same bandwidth, a system with a lower NEF accomplishes a given input-referred noise at a lower current consumption level.
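As a worked example of Equation 3.1, the snippet below evaluates the NEF for an illustrative amplifier operating point. The input noise, supply current and bandwidth are assumed values in the range typical of the designs compared later in Table 3.1, not data from a specific publication.

```python
import math

k = 1.380649e-23     # Boltzmann constant [J/K]
T = 310.0            # body temperature [K]
u_T = 0.0267         # thermal voltage kT/q at 310 K [V]

def nef(v_ni_rms, i_tot, bw):
    """Noise efficiency factor of Equation 3.1."""
    return v_ni_rms * math.sqrt(2.0 * i_tot / (math.pi * u_T * 4.0 * k * T * bw))

# Assumed example: 5 uVrms input noise, 3 uA total current, 10 kHz bandwidth.
print(f"NEF = {nef(5e-6, 3e-6, 10e3):.2f}")   # ~3.2 for these values
```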
3.2.2.2 Front End Area

Since each recording electrode requires its own amplifier, adding channels increases the area proportionally [41]. The bandpass filter must pass signals at frequencies as low as 1 Hz if LFP signals are recorded. A cut-off frequency below 1 Hz corresponds to a time constant on the order of seconds. If this time constant is implemented as a combination of a linear resistor and a capacitor, the required RC product is far too large for on-chip integration, and an external resistor or capacitor becomes necessary. Since each recording electrode needs an associated bandpass filter, this approach is not feasible for integrated large-scale systems. More compact solutions have been proposed and are described in the next section [40].

3.2.2.3 State-of-the-Art Front End Amplifiers

In [40] the bandpass filter is combined with the neural amplifier (Figure 3.6 (a)). In this realization, MOS (metal-oxide semiconductor)-bipolar pseudo-resistor elements, together with an operational transconductance amplifier (OTA) in feedback, are used to achieve the required large time constant, which corresponds to a low cut-off frequency (25 mHz).

Figure 3.6: State-of-the-art front end: (a) Neural amplifier (LNA and BPF) (b) Incremental resistance due to pseudo-PMOS resistors (Ma-Md)

The method used to generate large resistance values for small differential input voltages relies on p-channel MOS (PMOS) pseudo-resistors instead of linear resistors. The principle of operation of a pseudo-resistor is as follows. If the drain and gate terminals of a PMOS are tied together, it forms what is known as a diode-connected device. For this device, when the gate-source voltage is slightly negative, the transistor is in the subthreshold region and the effective impedance of 1/gm seen across the device is very large, because at low conduction levels the transistor transconductance is also low. Once the negative gate-source voltage increases in magnitude and exceeds the transistor threshold voltage, the transconductance rapidly increases and the effective impedance across the element sharply declines.

Alternatively, when the gate-source voltage is positive, the MOSFET is in cut-off and what appears across the element is the diode between the drain and the substrate. This diode will experience varying degrees of conduction depending on the bias across it, but as long as the gate-source voltage is kept below the diode's turn-on voltage, typically around a few hundred mV, the conduction is weak and the apparent resistance is high.

Therefore, for a range of voltages around 0 V (typically a few hundred mV in either direction), the diode-connected transistor acts as a large resistor. Figure 3.6 (b) demonstrates this, where the x-axis is the differential voltage and the y-axis is the corresponding incremental resistance. The scattered data points in the figure are the result of measurement imprecision at extremely high resistances [40]. Two series transistors are used to extend the region of high resistance, which in turn reduces distortion in the presence of larger signals. The noise efficiency factor achieved in this scheme is 4.0, and sizing techniques are used to optimize the noise level. PMOS transistors are used so that the bulk terminal can be connected to the source or drain of the transistors without requiring a special process. Furthermore, PMOS transistors exhibit less flicker noise, which becomes important for low frequency signals such as neural recordings [44].
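The pseudo-resistor behavior described above can be illustrated with a crude back-to-back exponential model: subthreshold MOS conduction in one polarity and the drain-bulk diode in the other. The parameters below are rough assumptions for illustration only; a real device would be characterized from the process.

```python
import numpy as np

u_T = 0.026                    # thermal voltage [V]
I0_mos, n_mos = 1e-16, 1.5     # subthreshold scale current, slope factor
I0_dio, n_dio = 1e-15, 1.0     # parasitic diode saturation current, ideality

V = np.linspace(-0.3, 0.3, 601)
I = I0_mos * (np.exp(V / (n_mos * u_T)) - 1) \
    - I0_dio * (np.exp(-V / (n_dio * u_T)) - 1)

r_inc = np.gradient(V) / np.gradient(I)       # incremental resistance dV/dI
for v in (-0.3, -0.1, 0.0, 0.1, 0.3):
    i = np.argmin(np.abs(V - v))
    print(f"V = {v:+.1f} V : r_inc ~ {r_inc[i]:.1e} Ohm")
```

Even in this toy model, the incremental resistance peaks above 10¹² Ω near zero bias and collapses by several orders of magnitude a few hundred mV away, qualitatively reproducing the bell shape of Figure 3.6 (b).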
Table 3.1: State-of-the-art CMOS analog front end performance comparison

                            Power/Ch. [μW]  Area/Ch. [mm²]  NEF   Channels  Technology [nm]
  Harrison 2003 [40]        0.45            0.16            4.0   1         1500
  Al-Ashmouny 2012 [41]     3.3             0.05            2.9   16        250
  Wattanapanitch 2011 [45]  20              0.03            4.5   32        180
  Azin 2011 [46]            27              1.76            2.9   8         350
  Muller 2012 [47]          4.79            0.0082          5.99  2         65
  Mollazadeh 2009 [48]      100             0.75            3.2   1         500

This neural amplifier has been replicated often since its first publication, because of its optimal noise performance for a given power limit and the fact that it does not require large off-chip components. In [45] this neural amplifier is combined with a variable bandpass filter and a programmable gain amplifier to obtain an efficient front end which adjusts its power consumption based on the background noise level. The performance in terms of power, area and NEF is shown in Table 3.1. The average power consumption of the amplifier is on the order of 25 μW.

3.2.3 Analog-to-Digital Converter (ADC)

After amplification and filtering, processing of neural data is often performed in the digital domain. Therefore, analog-to-digital converters (ADCs) are required to convert continuous-time signals into discrete digital numbers. The input-output characteristic of a 3-bit ADC is shown in Figure 3.7, where the x-axis is the normalized input voltage and B_out is the output bit code. Each continuous voltage has a corresponding digital representation. The error between the input voltage and its representation depends on the number of ADC bits. For a full voltage range V_FS and a precision of N bits, the voltage difference between two subsequent quantization levels (also referred to as the LSB) is:

\Delta = \frac{V_{FS}}{2^N}    (3.2)

The maximum quantization error (the maximum error between the continuous voltage value and its digital representation) is ±Δ/2.

Figure 3.7: ADC input-output characteristic

The quantization error is a nonlinearity which is folded so many times onto itself that (under most typical circumstances) it looks like wide-band noise added to the signal. It can be shown [49] that the mean square value of this error is:

\epsilon_{rms}^2 = \frac{\Delta^2}{12}    (3.3)

This is the quantization noise of an ADC. If the ADC noise is primarily dominated by this noise, the SNR at the ADC output can be determined in terms of the number of bits [49] as:

SNR = 6.02\,\mathrm{dB} \times N + 1.76\,\mathrm{dB}    (3.4)

In practice, other factors (clock jitter, thermal noise, etc.) deteriorate the SNR further, and the above formula is modified by replacing N with the effective number of bits (ENOB). The ENOB is always smaller than N. In 1999, Walden conducted a survey of existing ADCs and showed that over a wide range of ADCs the power consumption increases by a factor of two for every added bit or for every doubling of the ADC bandwidth [50]. Therefore, Walden introduced the following figure of merit (FOM):

FOM = \frac{P}{2^{ENOB} \times 2\,BW}    (3.5)

This FOM quickly became a measure to compare the power efficiency of various ADCs with different resolutions and bandwidths. In this relationship, P is the power dissipation, ENOB is the effective number of bits, and BW is the bandwidth of the signal. BW is ideally equal to the Nyquist bandwidth, i.e. half of the sampling frequency of the ADC. This FOM expresses efficiency in Joules per conversion-step; a smaller value corresponds to more efficient usage of power.
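Both figures of merit are easy to evaluate numerically. The sketch below computes the ideal quantization-limited SNR of Equation 3.4 and then reproduces the Walden FOM of the first entry of Table 3.2 below (Harrison 2009: 4.80 μW at 15.7 kS/s with ENOB = 10, taking BW as half the sampling rate).

```python
def adc_snr_db(n_bits):
    """Ideal quantization-noise-limited SNR (Equation 3.4)."""
    return 6.02 * n_bits + 1.76

def walden_fom(power_w, enob, bw_hz):
    """Walden figure of merit (Equation 3.5), in J per conversion-step."""
    return power_w / (2 ** enob * 2 * bw_hz)

print(f"Ideal 7-bit SNR : {adc_snr_db(7):.1f} dB")          # ~43.9 dB
fom = walden_fom(power_w=4.80e-6, enob=10, bw_hz=15.7e3 / 2)
print(f"Walden FOM      : {fom * 1e15:.0f} fJ/conv-step")    # ~299
```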
The Walden FOM is suitable for ADCs where the SNR is dominated by quantization noise. In such systems, an improvement of 6 dB in SNR is accomplished when the number of bits is increased by one (Equation 3.4). Taking a flash ADC as a general representative, each additional bit requires a doubling in the number of comparators and therefore doubles the power consumption. Alternatively, one could keep the number of comparators the same and switch them at twice the speed to double the bandwidth. Since the switching activity per unit time has then increased by a factor of two, obtaining twice the bandwidth doubles the power consumption as well. This explains the proportions in the Walden FOM. While many secondary effects, including thermal noise and other circuit limitations, are ignored by this simple justification, the Walden FOM has provided a relatively solid way to compare a wide range of ADCs over the past decade.

Table 3.2: Performance of state-of-the-art ADCs for neural applications

                            Power [μW]  Sampling Rate [kHz]  ENOB  Channels  FOM [fJ/conv-step]
  Harrison 2009 [53]        4.80        15.7                 10    100       299
  Al-Ashmouny 2012 [41]     0.99        20.0                 7     16        387
  Wattanapanitch 2011 [45]  6.47        31.0                 7.65  32        1039
  Azin 2011 [46]            26.9        36.0                 7.65  32        1039
  Muller 2012 [47]          5.04        20.0                 7.16  2         1762
  Mollazadeh 2009 [48]      26.4        16.0                 7     1         12891

Compared to other state-of-the-art applications, the requirements on ADC performance for biomedical applications are not very strict. For neural signals, an SNR of approximately 40 dB [ref] is needed, which calls for an ADC with at least 7 bits. Since the frequency band of interest for neural signals is below 10 kHz [42], conversion has to be performed at a frequency larger than 20 kHz to satisfy the Nyquist criterion [51]. 40 kHz, which is a conservative figure, is the rate often used in current neural processing systems [52].

Table 3.2 catalogs the performance of state-of-the-art ADCs for biomedical applications. The best FOM achieved is 299 fJ/conv-step, and the power consumption of these ADCs is on average on the order of tens of μW. If each channel is to be digitized, this power can make up a large percentage of the total system power, along with the neural front end. Employing ultra-low-power analog signal processing techniques that eliminate the ADC can therefore potentially lead to great power savings.

3.2.4 Spike Sorting and Pre-Processing

It is often difficult to measure extracellular action potentials from one neuron only. Micro-electrode arrays are too large to be placed close enough to a single neuron with precision. Since multiple neurons are in the vicinity of the electrode, they each generate individual action potentials, and the measured signal is the composite effect of the neural activities of a handful of neurons near the electrode. However, as previously mentioned, in order to decode the signal it is important to know the single-unit activity of neurons in certain brain regions. A technique called spike sorting is applied to the recorded signal, which separates spikes based on their temporal characteristics into single-unit activities of the neurons surrounding the electrode [36].

In neural prosthetics applications it is often important that spike sorting be performed in real-time. As with other parts of the neural hardware system, there is a trade-off between complexity, accuracy, and efficiency: the more complex the spike sorting algorithm, the better its accuracy, but such algorithms are also power hungry, as more hardware components are generally required. Hence the need for ultra low power hardware implementation.
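Most of the systems compared in Table 3.3 below perform detection by thresholding. A minimal software sketch of such a detector is given here; the robust noise estimator, the threshold multiplier k, and the refractory window are common but assumed choices, not the specific algorithm of any cited chip.

```python
import numpy as np

def detect_spikes(x, fs, k=4.5, refractory_ms=1.0):
    """Absolute-threshold spike detection. The threshold is k times a
    median-absolute-deviation estimate of the background noise."""
    sigma = np.median(np.abs(x)) / 0.6745
    thr = k * sigma
    hold = int(refractory_ms * 1e-3 * fs)
    spikes, last = [], -hold
    for i in np.flatnonzero(np.abs(x) > thr):
        if i - last >= hold:
            spikes.append(i)
            last = i
    return np.array(spikes), thr

fs = 25_000
rng = np.random.default_rng(0)
x = 5e-6 * rng.standard_normal(fs)            # 1 s of 5 uV rms noise
for t0 in (0.2, 0.5, 0.8):                    # three injected 100 uV events
    x[int(t0 * fs)] += 100e-6
idx, thr = detect_spikes(x, fs)
print(f"threshold = {thr * 1e6:.1f} uV, spikes at t = {idx / fs} s")
```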
The performance of state-of-the-art spike pre-processing systems is shown in Table 3.3; on average they consume 50 μW per channel [54].

Table 3.3: Performance comparison of state-of-the-art CMOS spike processing systems

                      Power/Ch. [μW]  Method                          Channels  Technology [nm]
  Harrison 2009 [53]  0.99            Detection (Thresholding)        100       600
  Rizk 2008 [55]      104             Detection (Thresholding)        96        N/A (FPGA)
  Olsson 2005 [56]    75              Detection (Thresholding)        32        500
  Karkare 2011 [54]   2.03            Detection (Thresholding)        64        350
  Chae 2008 [57]      100             Detection + Feature Extraction  1         350
  Chen 2009 [58]      14.6            Detection + Feature Extraction  128       90

3.2.5 Telemetry

When amplifying neural signals, a bandwidth of up to 10 kHz is typically needed to observe individual spikes [42]. Assuming an ADC sampling rate of 20 kS/s and a resolution of 8 bits, each ADC generates data at a rate of 160 kb/s [40]. State-of-the-art power levels needed to transmit neural data are on the order of 3 nJ/bit [59]. Therefore, transmission of each channel will require 480 μW. For large scale systems which record from 32 channels or more, the transmit data rate would commensurately increase to around 5 Mb/s, which requires power levels on the order of tens of mW. The inefficiency of such a transmitter results in heat dissipation, which can cause tissue damage. Attempts have been made to compress the data, such as spike detection, which results in a loss of information, and compressed sensing, which is not as lossy but still does not reduce the data rate by more than an order of magnitude [60].

3.2.6 Stimulation Electrode and Associated Circuits

Nerve stimulation is often used to communicate information to the brain by applying a voltage or a current to a stimulating electrode to trigger neural spiking. This process entails making the potential of the extracellular fluid more negative so that the cell is depolarized, meaning it becomes more positive with respect to the outside potential. If the depolarization is large enough, spiking is induced. The problem with voltage stimulation is that the exact interface impedance is not known a priori and the current may become dangerously high. Therefore it is preferable to drive the electrode using a well-defined current. Both methods require two electrodes to form a return path for the current. The amount of current that can safely be provided to the tissue depends on several factors, including electrode surface area and material.

As previously described, once an electrode is inserted into neural tissue, charge is distributed until a resting potential is reached. The interface was represented in the electrical model of Figure 3.4 as the parallel combination of a large resistor R_e and C_e. Charge transfer at the electrode-tissue interface occurs in two forms: 1) non-Faradaic reactions, where no electrons are transferred between the electrode and tissue and a redistribution of charged chemical species in the tissue takes place; 2) Faradaic reactions, in which electrons are transferred between the electrode and tissue, resulting in reduction or oxidation of chemical species along with harmful reactants, which can lead to tissue damage [61]. Safe stimulation requires low stimulation levels at which R_e does not appear in the model and the charge transfer is therefore non-Faradaic, also referred to as capacitive.
In this case, the stimulation results only in a charge separation at the interface. This stimulation is too small to inflict damage, but also too small to elicit spiking.

When stimulation levels are high enough, Faradaic reactions occur and chemical products are formed that may diffuse away from the electrode before they can be recovered. This results in a charge imbalance. All chemical reactions at the electrode-tissue interface should be reversible, so that no net charge is transferred from the electrode to the tissue. If the electrochemical products resulting from Faradaic reactions do not immediately drift away from the surface, they can be recovered to their initial form by reversing the direction of the applied current. In this manner, one can apply a stimulus strong enough to elicit a response and at the same time remedy the transient damage by immediately applying a counter measure. Providing current to the electrode in two successive phases, one with negative current and one with positive current, is called biphasic stimulation. It is often used to achieve charge balance and safe stimulation despite Faradaic reactions at the interface [62, p.558]. Biphasic stimulation balances the charge by ensuring that the product of the absolute value of the stimulus amplitude and the duration of the pulse are matched during positive and negative current stimulation (see Figure 3.8). The absolute value of the amplitude of the first pulse is a, with duration W. The amplitude and width of the second pulse are scaled by the scaling factor m, so that the second pulse has amplitude a/m and width mW.

Figure 3.8: Charge balancing for stimulation

The minimum amount of current, I_th, required to elicit a neural response for a given pulse duration W is described by the strength-duration curve:

I_{th} = \frac{I_{rh}}{1 - \exp(-W/\tau_m)}    (3.6)

where I_rh is the rheobase current, i.e. the minimum current required for stimulation with an infinitely long pulse, and τ_m is the membrane time constant. I_rh and τ_m are determined empirically and depend upon factors such as the distance between the neuron population of interest and the electrode [61].

The typical range of stimulation currents is 10 μA to several mA [63-66], depending on the tissue, the electrode surface area and its impedance. Since most of these stimulator implementations are Class A or Class AB amplifiers, the power efficiency is between 25-60%, based on the theoretical maximum power efficiency of such amplifiers [67]. The total power consumption of the stimulator depends on the required current, the power efficiency, and the quiescent current of the driving circuits. Assuming a 3 V power supply and a 100 μA stimulation current, the approximate average power consumption of a state-of-the-art system can be estimated to be 500 μW at the highest efficiency. The quiescent power is on the order of tens of μW. Efficiency improvement is the subject of ongoing research and is not within the scope of this work.
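Both the strength-duration relationship of Equation 3.6 and the charge-balance rule of Figure 3.8 reduce to one-line computations. In the sketch below, the rheobase current and membrane time constant are assumed example values chosen only to show the shape of the curve.

```python
import math

def i_threshold_ua(w_ms, i_rh_ua=20.0, tau_m_ms=1.0):
    """Strength-duration curve of Equation 3.6 (assumed I_rh, tau_m)."""
    return i_rh_ua / (1.0 - math.exp(-w_ms / tau_m_ms))

def recovery_phase(a_ua, w_ms, m):
    """Charge-balanced second phase of Figure 3.8: amplitude a/m and
    width m*W, so both phases carry the same |a|*W charge."""
    return a_ua / m, m * w_ms

for w in (0.1, 0.5, 1.0, 5.0):
    print(f"W = {w:4.1f} ms -> I_th = {i_threshold_ua(w):6.1f} uA")

amp2, w2 = recovery_phase(a_ua=-100.0, w_ms=0.2, m=4.0)
print(f"recovery: {abs(amp2):.0f} uA for {w2:.1f} ms "
      f"(both phases carry {abs(amp2) * w2:.0f} nC)")
```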
3.3 A Programmable Analog Implementation

In the context of an integrated circuit biomimetic signal processing system, minimizing power and area are the most important challenges to overcome. Application-specific integrated solutions optimize area, and recent advances have made 3D stacking of chips, flip-chip bonding to microelectrode arrays, and multi-chip distributed processing possible. Power consumption, however, remains a performance-limiting factor for a given silicon area.

An analysis of the state-of-the-art performance of each component of the systems of Figure 3.3 was provided in Sections 3.2.1-3.2.6. The relative average power consumption of each of these components is illustrated in Figure 3.9. It can be seen that the majority of the power is consumed in the stimulation circuitry. As mentioned in Section 3.2.6, it is difficult to lower this power because of the minimum current required to elicit a neural response through stimulation with electrodes.

Figure 3.9: Average power consumption per channel for a neural system

Based on the comparison chart of Figure 3.9, a fully analog architecture for a closed loop cognitive system, as shown in Figure 3.3 (c), has several advantages over the digital and mixed-signal solutions in power- and area-restricted hardware applications [68]. First of all, since all major processing is performed on-chip, there is no need for telemetry circuits, which often take up the most power and area in an implant (after the stimulator). As explained in Section 3.2.5, at a data rate of 160 kb/s, state-of-the-art implementations consume 0.5 mW/channel to transmit neural data [59]. Furthermore, analog processing obviates A/D conversion, leading to significant power savings. With careful design, computations such as addition and multiplication can be implemented more efficiently in the analog domain than in digital when high sampling rates are required [68] [40].

Despite these advantages, a fully analog system similar to Figure 3.3 (c) has not yet been presented in the literature. Few digital and mixed signal systems implement the whole signal processing chain from recording to stimulation on-chip. Off-line processing is often preferred since it provides benefits such as reliable spike sorting and the freedom to choose different algorithms, and does not pose strict limits on processing power. However, the demand for functionality of neural implants, especially in cognitive prosthesis applications, is on the rise, and that is where on-chip signal processing becomes indispensable.

Even though various solutions for different components of such neural hardware systems have been published, only a few have accomplished on-chip processing of neural signals. Those solutions are generally application-specific and cannot be readily reused in a wide range of neural systems [69].

The main novelty of this work is the ultra low power implementation of a neural model in analog CMOS. In this work the Laguerre Expansion of Volterra (LEV) model is used, but as will be demonstrated, the circuit design techniques are general and modular enough that they would easily lend themselves to other models as well. The model is divided into basic analog building blocks that can be built in standard CMOS technology. Analog hardware provides small area, high speed, and very low power [68, p.5]. In many cases, to perform moderately precise computations, an analog solution offers significant area savings. One example is addition: in the analog domain, connecting the drains of two transistors sums their currents, while in digital this computation requires a few hundred transistors to build a typical 7-bit full adder. High speeds can be achieved using analog circuits because the signal swing can be smaller than in digital circuits. When a circuit produces smaller voltage swings, it also requires smaller currents to charge and discharge the node capacitances. Hence higher speeds are achieved for a given current budget.
As explained in the following chapters, in the subthreshold region of transistor operation, bias currents on the order of a few picoamperes can be used, which is comparable to a transistor's leakage current (the current it conducts when turned off). Biased at this level, the transistor can still act as a transconductance cell, and the summing operation described in the example above requires minute currents in the same pA range. A digital counterpart, however, consumes a few hundred times more current in leakage alone, even when it is not clocked, and additionally consumes dynamic power during clock transitions.

We will show in what follows that despite the superficial shortcomings of an analog implementation, it is in fact a more power and area efficient approach when tight precision in signal generation and manipulation is not a must. We will demonstrate that in biological neural systems the signal is often buried in a significant amount of noise, which is not dissimilar to how an analog subthreshold circuit handles the signal. One could in fact argue that neural signals are inherently analog, and as such a careful analog implementation is a natural choice for electronically generating and processing such signals.

CHAPTER 4

A Nonlinear Neural System Model

4.1 Choice of the Neural Model

Quantitative mathematical models have proven to be an indispensable tool in pursuing the goal of understanding and replicating the brain's impressive information processing capabilities [70]. To create such models it is important to find a balance between the detail to be captured and the level of abstraction. As described in Chapter 2, many intricate biophysical and structural elements underlie neural processing mechanisms. Furthermore, large systems of neurons encode higher level brain functions, with innumerable connections of different strengths to thousands of other neurons [71]. Creating biomimetic models which accurately account for each membrane potential and neural element in a large neural network is therefore not feasible given today's computational processing capabilities and the experimental tools available for obtaining proper data for model validation [71].

The most basic neuron representation is the simple integrate-and-fire model, which can capture the encoding of stimulus amplitude into spike frequency. However, it cannot accurately describe a neuron's nonlinear computations, which arise from its morphology, ionic conductances and dendritic input processing [70]. State-of-the-art models which can capture nonlinear computations include, ordered from most detailed to most abstract: complete and reduced compartmental models, which consist of anatomical reconstructions of neural morphology and capture its influence on neural dynamics (they often require up to thousands of compartments); single-compartment models, which capture neural dynamics based on the Hodgkin-Huxley model [72]; cascade models, which combine filtering and a nonlinearity to generate time-dependent firing responses; and black-box (or input-output) models, which are data-driven but neglect biophysical mechanisms and produce spike responses based on conditional probabilities for given stimuli [70]. Ultimately, the appropriate level of detail and abstraction depends on the particular goal of the model, and finding the balance between them is often key to the model's success.

Compartmental modeling methods, such as Hodgkin-Huxley, accurately interpret the detailed physiological mechanisms and processes of a neuron.
However, the structure and parameters of such a model are specific to the type of neuron and often not applicable to others. Furthermore, the model is computationally complex, with a large number of open parameters. Large-scale neuromorphic brain simulators as well as implantable prostheses both have tight restrictions on the power and area of their fundamental processing unit. Hence, detailed compartmental models are not ideal for such applications.

A combined multi-input/multi-output (MIMO) model which is data-driven and contains biomimetic elements is shown in Figure 4.1 [73] [74]. This model is relatively computationally efficient, and it avoids modeling errors due to biased or incomplete knowledge of underlying mechanisms by capturing neural spike transformations instead of biological details. The MIMO model, which is described in more detail in the following sections, contains multiple neurons which receive input spike trains and transform them nonlinearly into multiple output spike trains. The biomimetic features are the nonlinear transformation of spike inputs into a graded potential u (similar to a postsynaptic potential), a threshold, and intrinsic noise, which are all considered critical components of neural processing. The MIMO model hence captures the nonlinear spatio-temporal transformation of a population of neurons, which is critical in a neuromimetic model.

Figure 4.1: Multi-input/multi-output model to capture nonlinear spike-to-spike transformations of multiple neurons

The main advantage of the MIMO system of Figure 4.1 is that it can be implemented very efficiently in hardware. Similarly to the information processing which occurs in single neurons, it uses basic computations, including addition, subtraction, multiplication, high and low pass filtering, thresholding, and noise [70], which are combined to achieve more complex functionality.

In this work, the model is introduced in the context of a prosthesis to restore a damaged part of the hippocampus, a brain region involved in memory processing (Figure 1.5). The prosthesis is intended to restore a patient's memory after damage to the hippocampus. It consists of a multi-site electrode array to record neural signals from the CA3 region of the hippocampus, a VLSI biomimetic model which emulates memory processing, and a multi-site stimulation array to stimulate neurons in the CA1 output region. To describe the memory processing, a combined experimental-theoretical model has been developed in [22]. In this model, neurons are considered to be the fundamental processing units of the brain. These units communicate using spikes of equal potential, and information is encoded in the timing between the spike events [22]. The temporal information of an incoming spike train is transformed by a neuron into a different outgoing spike pattern. Since cortical functions are encoded in the firing activity of not just a single neuron but of a population of neurons, the model captures the spatio-temporal neural response [28]. The VLSI chip in Figure 1.5 implements this model.

4.2 The Laguerre Expansion of Volterra Kernels (LEV) Model

The processing of cognitive functions occurs largely in the four lobes of the human cerebral cortex.
The hippocampus, deeply embedded in the temporal lobe, plays an important role in tasks related to learning and memory [20]. In particular, it has been shown that the hippocampus encodes long-term memories, and that damage to this structure can prevent the formation of new ones [75], [20, p.32]. A theoretical MIMO model is being developed to capture the nonlinear dynamics underlying the transformation of hippocampal CA3 (input) to CA1 (output) spike trains [28] [75].

The MIMO model is made up of several single-input/single-output (SISO) spike transformations, as shown in Figure 4.1. Here, the system K represents the feedforward kernel, which transforms the input spikes x into a continuous pre-threshold potential u. Then, pre-threshold Gaussian noise ε is added to u and the resulting signal is subjected to a spike generating threshold θ. To capture local feedback effects, the feedback kernel H is included, which produces a hidden feedback variable a. A spike is produced at the output y if the pre-threshold potential w (the sum of u, a and the noise ε) exceeds the threshold θ.

The feedforward kernel K constitutes the most important module in the system, because it computes the nonlinear transformation of an input spike train x into a continuous-time output u. A second order representation of the LEV model which implements K is shown in Figure 4.2. The Volterra kernel k_1 captures the linear part of the transformation and k_2 captures the second order nonlinearities. The LEV kernels contain Laguerre basis functions which are convolved with the input spike trains.

Figure 4.2: LEV model to capture transformations of spike trains to excitatory post-synaptic potentials

4.2.1 Laguerre Polynomials

In order to understand the LEV model, it is important to discuss the Laguerre polynomial expansion, which allows close approximation of linear, exponentially decaying signals. Laguerre polynomials are defined as:

L_n(x) = \frac{e^x}{n!} \frac{d^n}{dx^n}\left(e^{-x} x^n\right)    (4.1)

They are orthogonal functions on the interval from 0 to infinity with weight function e^{-x}:

\int_0^{\infty} e^{-x} L_n(x) L_m(x)\, dx = \begin{cases} 1 & \text{if } m = n \\ 0 & \text{if } m \neq n \end{cases}    (4.2)

Equation 4.2 suggests that the set of functions l_n(x) = e^{-x/2} L_n(x) forms an orthonormal basis. Such functions are linearly independent, perpendicular unit-length vectors spanning a vector space [76]. For time domain signals, x/2 is replaced with p·t (p acting as a time scaling factor) and a proper normalization factor is applied, yielding the following orthonormal basis, hereafter referred to as the Laguerre Functions (LF):

l_0(t) = \sqrt{2p}\, e^{-pt}
l_1(t) = \sqrt{2p}\,(2pt - 1)\, e^{-pt}
l_2(t) = \sqrt{2p}\,(2p^2 t^2 - 4pt + 1)\, e^{-pt}
\ldots
l_n(t) = \sqrt{2p}\left(\frac{(2p)^n}{n!} t^n - \frac{n(2p)^{n-1}}{(n-1)!} t^{n-1} + \frac{n(n-1)(2p)^{n-2}}{2!\,(n-2)!} t^{n-2} - \ldots + (-1)^n\right) e^{-pt}    (4.3)

The first three Laguerre basis functions for three different p values are shown in Figure 4.3. A linear combination of these functions can be used to approximate signals (such as the impulse response of a linear causal system) that asymptotically decay with time:

y(t) = c_0 l_0(t) + c_1 l_1(t) + c_2 l_2(t) + \ldots    (4.4)

Figure 4.3: Three Laguerre Functions for different p values

If the time domain signals are sampled, the sequence can be represented as a matrix equation:

Y = LF \times C    (4.5)

where Y is a vector of the samples of the signal to be approximated, LF is a matrix whose columns are samples of the LFs, and C is the vector containing the expansion coefficients c_k. The Moore-Penrose pseudoinverse LF⁺ provides a least squares solution for c_k:

C = LF^{+} \times Y    (4.6)
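The construction of Equations 4.3-4.6 can be verified numerically. The sketch below builds the first four LFs from the equivalent closed form l_n(t) = √(2p)·(−1)ⁿ·L_n(2pt)·e^(−pt), checks their orthonormality, and recovers expansion coefficients by least squares; the target waveform and sampling choices are assumed for illustration.

```python
import numpy as np
from scipy.special import eval_laguerre

def laguerre_basis(n_funcs, p, t):
    """Columns are the orthonormal Laguerre functions of Equation 4.3,
    written via the classical polynomials:
    l_n(t) = sqrt(2p) * (-1)^n * L_n(2pt) * exp(-pt)."""
    return np.stack([np.sqrt(2 * p) * (-1) ** n
                     * eval_laguerre(n, 2 * p * t) * np.exp(-p * t)
                     for n in range(n_funcs)], axis=1)

p, fs = 50.0, 1000.0                 # decay parameter [1/s], sampling rate
t = np.arange(0.0, 0.5, 1.0 / fs)    # 0.5 s support (25 time constants)
LF = laguerre_basis(4, p, t)

# Discrete check of the orthonormality in Equation 4.2
# (close to the 4x4 identity, up to discretization error ~ p/fs):
print(np.round(LF.T @ LF / fs, 2))

# Least-squares coefficients of Equation 4.6 for an assumed target;
# for this waveform the exact answer is ~[0.35, 0.15, 0, 0].
y = np.exp(-p * t) * (2.0 + 3.0 * p * t)
c, *_ = np.linalg.lstsq(LF, y, rcond=None)
print("coefficients:", np.round(c, 4))
```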
Using least squares estimation provides increased accuracy in the presence of noise and reduces the requirements on the length of experimental data records [77]. Additionally, the parameter p is used to adjust the decay rate that best matches the signal.

4.2.2 Laguerre Expansion of Volterra Kernels

The above linear approximation is insufficient for capturing the nonlinearities of complex neural systems, which occur in the presence of paired spike pulses, triplets, etc. Therefore, the approximation is extended to a Laguerre Expansion of Volterra kernels scheme [22]. The time-domain estimation of the response of a system to an arbitrary input x(t) is described by:

y(t) = \sum_{n=0}^{L-1} c_n v_n(t)
     + \sum_{n_1=0}^{L-1} \sum_{n_2=0}^{n_1} c_{n_1,n_2}\, v_{n_1}(t)\, v_{n_2}(t)
     + \sum_{n_1=0}^{L-1} \sum_{n_2=0}^{n_1} \sum_{n_3=0}^{n_2} c_{n_1,n_2,n_3}\, v_{n_1}(t)\, v_{n_2}(t)\, v_{n_3}(t) + \ldots    (4.7)

where

v_n(t) = \int_0^{\infty} l_n(\tau)\, x(t-\tau)\, d\tau    (4.8)

is the convolution of the nth order LF with the input x(t). L is the number of Laguerre basis functions employed, c_n are the weighting coefficients of the first order terms, c_{n_1,n_2} those of the second order terms, and so on. Note that Equation 4.7 simplifies to Equation 4.4 if x(t) is an impulse function and the higher order terms are removed. The LEV coefficients can be found by training the system with a sufficiently long input and output training dataset [22] and applying linear or other approximation techniques.

The correspondence between the parameter p used here and the parameter α presented previously in the context of the hippocampus prosthesis [28], [75], can be found as follows. The Laplace transform of the zeroth order LF is:

l_0(s) = \sqrt{2p} \cdot \frac{1}{p+s}    (4.9)

The equivalent discrete-time z-transform of the zeroth order Laguerre filter is:

\sqrt{1-\alpha^2} \cdot \frac{z^{-1}}{1 - \alpha z^{-1}}    (4.10)

which can be written as

\sqrt{1-\alpha^2} \cdot \frac{1}{z - \alpha}    (4.11)

The relationship between α and p follows from the locations of the poles. The discrete-time filter of Equation 4.11 has its pole at

z = \alpha    (4.12)

Since z ≡ e^{sT}, where T is the sampling period, the continuous-time pole s = -p of Equation 4.9 maps to

z = e^{sT}\big|_{s=-p} = e^{-pT}    (4.13)

Equating the two pole locations gives

e^{-pT} = \alpha    (4.14)

and, hence,

\ln(\alpha) = -pT    (4.15)
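A direct discrete-time rendering of Equations 4.7 and 4.8 (restricted to second order, as used in this work) is sketched below, together with the α that Equation 4.14 assigns to p = 50 s⁻¹ at a 1 kHz sampling rate. The coefficients here are random placeholders; in practice they come from training on recorded data.

```python
import numpy as np
from scipy.special import eval_laguerre

def laguerre_basis(n_funcs, p, t):
    # Same closed form as in the previous sketch (Equation 4.3).
    return np.stack([np.sqrt(2 * p) * (-1) ** n
                     * eval_laguerre(n, 2 * p * t) * np.exp(-p * t)
                     for n in range(n_funcs)], axis=1)

def lev_output(x, LF, c1, c2, fs):
    """Second order LEV response (Equation 4.7). v_n are the discrete
    convolutions of Equation 4.8; c2[n1, n2] is used for n2 <= n1 only."""
    L = LF.shape[1]
    v = np.stack([np.convolve(x, LF[:, n])[:len(x)] / fs for n in range(L)])
    y = c1 @ v
    for n1 in range(L):
        for n2 in range(n1 + 1):
            y += c2[n1, n2] * v[n1] * v[n2]
    return y

p, fs, L = 50.0, 1000.0, 4
alpha = np.exp(-p / fs)                      # Equation 4.14 with T = 1/fs
print(f"alpha = {alpha:.4f}")

t = np.arange(0.0, 0.5, 1.0 / fs)
LF = laguerre_basis(L, p, t)
rng = np.random.default_rng(1)
x = (rng.random(10 * int(fs)) < 2.0 / fs) * fs   # ~2 Hz unit-area impulses
c1 = rng.normal(size=L)                          # placeholder coefficients
c2 = 0.1 * rng.normal(size=(L, L))
u = lev_output(x, LF, c1, c2, fs)
print("u range:", u.min().round(3), u.max().round(3))
```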
4.2.3 Experimental Setup for Neural Data Acquisition

To validate the functionality of the model in Figure 4.2, both in theory and for the hardware implementation, experimental data is used. Two data sets are required, where each set consists of a long input spike train sequence and its corresponding post-synaptic output potentials, as in Figure 4.2. The first set is used to train the model and to obtain its adjustable parameters, and the second set is used to test the performance of the model. For this, the input of the second set is applied to the model and the resulting output potentials are compared to the experimental output.

The experimental setup for obtaining these recordings from the hippocampus is shown in Figure 4.4 [78]. The nonlinear interactions between the CA3 input region and the CA1 output region represent this encoding function, which is to be modeled by the LEV kernel model.

Figure 4.4: Experimental setup to obtain neural data

CA1 pyramidal neurons in the hippocampus were synaptically stimulated with Poisson random interval trains (RITs) through the Schaffer collaterals to mimic the spiking behavior observed in CA3 hippocampal neurons. The mean frequency of stimulation was 2 Hz and the stimulation intensity was adjusted so that no action potentials were induced. The EPSPs were then recorded at the soma of the cell. Hence, the inputs to the model are spikes (action potentials) and the outputs are continuous analog voltages.

4.3 Model Optimization for Hardware Implementation

To optimize this model for hardware implementation, it is important to minimize complexity while maintaining precision in estimation and prediction. Since the complexity of the system grows with the number of LFs employed and the degree of nonlinearities included, there are practical limitations to implementing an nth order LEV system with a large number of Laguerre basis functions.

Figure 4.5 shows the variation of the normalized mean square error (NMSE), in percent, versus the number of Laguerre basis functions employed to estimate the EPSP output data. The top (blue) curve is a linear combination of LFs (first order LEV). The green curve also includes the cross products (second order) [79]. The red curve, which almost coincides with the second one, includes third order products. This clearly shows that for this data set, while including the second order nonlinear terms has an advantage over first order linear estimation, the third order nonlinear terms provide negligible improvement. One explanation is that the average time interval of the spikes used to stimulate the input CA3 region is too large to allow triplets (three consecutive output spikes, which would demonstrate third order nonlinearity) to occur at the output. Here, the average inter-spike interval is 500 ms (for a 2 Hz random interval train), which generally leaves too short a time window to observe three spikes.

Figure 4.5: Percent normalized mean square error vs. number of Laguerre Functions for different orders of nonlinearity

Finally, the bottom (turquoise) curve shows the second order estimation when two sets of LEV models are used, one with a slowly and one with a rapidly decaying time constant. Including two sets of LFs can significantly improve model performance. This result is expected considering the underlying biological mechanisms: the transformation performed by glutamatergic synapses from action potentials to post-synaptic potentials involves fast-acting ionotropic receptors, such as AMPA and NMDA, and slow-acting metabotropic receptors, such as mGluR.

In this work a second order estimation is implemented with one time constant (p = 50 s⁻¹). The extension to two time constants is straightforward: the LEV model is replicated with a different p parameter [80]. For this application, a second order LEV model with four basis functions was found optimal, given the drastic increase in complexity for only a small gain in performance.
Figure 4.6: Approximating neural data using the LEV model: (a) All-or-none input (to CA3) (b) Measured output (at CA1) (c) Estimated output

Figure 4.6 shows a sample sequence of the applied input, a random spike train with Poisson distribution and a mean spiking frequency of 2 Hz (Figure 4.6 (a)), the experimentally recorded output (Figure 4.6 (b)), and the approximation of the output using the optimized second order LEV model (Figure 4.6 (c)). The effect of the nonlinear interactions that occur at the synaptic level is captured by the model, as can be seen in the response to the paired pulse at around 1.75 s. The decay parameter p used here is 50 s⁻¹, which corresponds to a time constant of 20 ms for the zeroth order LF. The normalized mean square error between the estimate and the output is approximately 9% over an interval of 200 s.

Figure 4.7: Comparison of EPSP recordings and LEV approximation

The top panel of Figure 4.7 shows a snapshot of recorded EPSP data during a longer 25 second interval. The middle panel shows the approximation to the recording using the LEV model. The bottom panel illustrates the absolute value of the difference between the recorded data and the approximation. The normalized mean square error between data and approximation is 9.23% for this data set.

The first and second order LEV kernels are shown in Figure 4.8 and Figure 4.9, respectively. These kernel plots demonstrate the time dynamics of the linear and nonlinear components of the model.

Figure 4.8: First order kernel

Figure 4.9: Second order kernel

4.3.1 System Specifications for Hardware

When designing hardware for the SISO or MIMO system of Figure 4.1, proper values for the noise standard deviation and the threshold have to be chosen. These parameters are dictated by the properties of the neural data and, specifically, by the spontaneous firing rate of a neuron when no input is applied.

The system's signal-to-noise ratio (SNR) is defined as:

SNR = \frac{v_{rms}^2}{v_n^2}    (4.16)

where v_{rms}^2 is the rms signal power at the output and v_n^2 is the output-referred noise power; it is determined using the model of Figure 4.1. In this model, Gaussian white noise (ε) is added to the pre-threshold potential u to account for intrinsic neuronal noise as well as the contribution of unobserved inputs [81].

Based on the SISO model of Figure 4.1, the input spike train x(t) and the output spike train y(t) produced in response are shown in Figure 4.10. Here, the standard deviation (σ) of the added noise ε is 0.7 and the threshold is 4.2. The noise level and threshold were chosen to achieve the correct spontaneous firing rate over a long time interval (over 200 s). A snapshot of the noise in relation to the pre-threshold potential u during a sample interval of 165 ms is shown in Figure 4.11. Based on this simulation, the minimum SNR is found to be approximately 2. To stay sufficiently above this limit and to be able to observe waveforms during testing, the hardware should be designed to satisfy an SNR of approximately 10.
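The roles of σ = 0.7 and θ = 4.2 can be reproduced in a few lines: Gaussian noise is added to the pre-threshold potential and the sum is compared to the threshold. The EPSP-like test waveform below and the omission of the feedback kernel H are simplifications for illustration only.

```python
import numpy as np

def siso_spikes(u, theta=4.2, sigma=0.7, seed=0):
    """Spike generation stage of the SISO model of Figure 4.1
    (the feedback kernel H is omitted for brevity)."""
    rng = np.random.default_rng(seed)
    w = u + sigma * rng.standard_normal(len(u))
    return (w > theta).astype(int)

fs = 1000.0
t = np.arange(0.0, 1.0, 1.0 / fs)
u = 4.0 * np.exp(-t / 0.02)      # assumed EPSP-like transient, peak 4.0
y = siso_spikes(u)
print(f"{y.sum()} supra-threshold samples near the transient; quiescent "
      f"input gives {siso_spikes(np.zeros(len(t))).sum()} crossings "
      f"(near-zero spontaneous rate)")
```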
Since the input x(t) to the SISO unit is an all-or-none spike, the input to the hardware should be similar to an impulse of 1 ms width (the approximate duration of an action potential) with an amplitude equal to the maximum input swing of the first stage.

Figure 4.10: SISO model input spike train x(t) vs. output spike train y(t)

Figure 4.11: Pre-threshold potential u and noise signal (σ = 0.7)

Since the energy of neural signals is concentrated between a few Hz and 5-10 kHz [82], this sets the bandwidth requirement for the hardware system.

CHAPTER 5

Novel Neural Signal Processing Hardware

This chapter describes the novel hardware implementation of a second order LEV model which can capture the nonlinear spike transformations performed by individual neurons. The novelty of this work is that the LEV model is realized using programmable analog circuits which operate close to the limit of low power operation, while achieving a form factor small enough for potential implantation into human tissue. Furthermore, the circuits were designed in a modular manner to allow for future expansion of the model to multiple inputs and outputs and, ultimately, to become part of a cognitive neural prosthesis. This chapter outlines the design steps of the hardware implementation to achieve optimal performance of the analog systems, along with the digital calibration circuits for parameter adjustment after fabrication and for compensating pronounced process-dependent effects.

5.1 Overview of the System Architecture

The Laplace transform of an LF of order n can be found from Equation 4.8 as:

H(s) = H_{LP}(s)\, H_{AP}^n(s)    (5.1)

where

H_{LP}(s) = \frac{1}{p+s}    (5.2)

H_{AP}(s) = \frac{p-s}{p+s}    (5.3)

These forms lend themselves readily to a continuous-time frequency domain implementation of the LFs v_n(t) of Equation 4.8. Therefore, a frequency domain implementation of H(s), as shown in the top row of Figure 5.1, is suggested, in which the impulse response at each node is an LF and p determines the LF decay rate. The filters H_LP and H_AP implement Equations 5.2 and 5.3. Therefore, v_0 to v_3 in this diagram are direct implementations of Equation 4.8. Figure 5.1 shows the full block diagram of a second order LEV model with four LFs. The second order nonlinear LF terms are generated by cross-multiplying the first order LF terms. Along with the weighting block, which contains the model coefficients c_n and c_{n_1,n_2}, this system completely implements the LEV model of Equation 4.7.

Figure 5.1: Block diagram of frequency domain Laguerre Functions
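Equations 5.1-5.3 can be checked directly in the Laplace domain: cascading one low-pass 1/(s+p) with n all-pass sections (p−s)/(p+s) and scaling by √(2p) reproduces the Laguerre functions of Equation 4.3. The sketch below prints a few impulse-response samples for p = 50 s⁻¹; up to the √(2p) factor, these are exactly the v_n nodes of Figure 5.1 for an impulse input.

```python
import numpy as np
from scipy import signal

p = 50.0
t = np.linspace(0.0, 0.5, 2001)

num, den = np.array([1.0]), np.array([1.0, p])   # H_LP = 1 / (s + p)
for n in range(4):
    _, h = signal.impulse((num, den), T=t)
    l_n = np.sqrt(2 * p) * h                     # normalization of Eq. 4.3
    i20 = np.searchsorted(t, 0.02)               # sample at one time constant
    print(f"l_{n}: value at t=0 is {l_n[0]:+.2f}, at 20 ms is {l_n[i20]:+.2f}")
    num = np.polymul(num, [-1.0, p])             # append one H_AP = (p-s)/(p+s)
    den = np.polymul(den, [1.0, p])
```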
5.2 LEV System Synthesis

As shown in Figure 5.1, a filter chain comprising one low-pass filter followed by three all-pass filters can generate the four LFs of the proposed second order LEV model. While different methods can be used to implement the filter stages, such as passive RC, switched-capacitor or opamp-RC filters, in this work a Gm-C implementation is proposed.

Passive RC filters set the theoretical limit for the lowest power per pole per bandwidth for a given dynamic range [83]. However, such filters are prohibitively bulky for implantable hardware with time constants of several milliseconds. For example, to achieve a time constant of 10 ms with a very large on-chip resistance of 1 MΩ, the capacitance would have to be 10 nF. In a typical process this capacitance alone occupies an area on the order of 15 mm², which is not economically integrable in today's technologies. Besides, passive filters normally do not provide signal gain and, therefore, the SNR in a chain of passive filters progressively degrades.

Switched-capacitor filters are discrete-time filters that process continuous signals via sampling. To achieve filtering, resistors are replaced by switched-capacitor equivalent networks, where

R = \frac{1}{f_{sw} C}    (5.4)

The advantage of a switched-capacitor filter implementation is that, by minimizing capacitances and using low switching frequencies, a very small form factor can be achieved. However, it must be ensured that the sampling rate, or switching frequency, satisfies the Nyquist criterion in order to avoid frequency folding (aliasing). Since neural signal processing entails signal frequencies up to 10 kHz, the switching frequency has to be at least 20 kHz. For a single-pole lowpass filter such as H_LP, the cut-off frequency of the switched-capacitor filter is given by:

f_{-3dB} = \frac{1}{2\pi} f_{sw} \frac{C_s}{C_l}    (5.5)

where f_{-3dB} is the cut-off frequency and f_sw is the switching frequency. For example, for a cut-off frequency of about 16 Hz (corresponding to a time constant of 10 ms), the required ratio of the switching capacitor (C_s) to the filter load capacitor (C_l) is:

\frac{C_s}{C_l} = \frac{f_{-3dB}}{f_{sw}} \cdot 2\pi \approx 0.005    (5.6)

If the smallest capacitor that can be reliably implemented is 10 fF, the filter load capacitor will be around 2 pF, which still enables a very compact implementation. Another advantage of switched capacitor implementations is that the filter cut-off frequency can be adjusted via the clock frequency.

However, a switched-capacitor filter requires two non-overlapping clocks which constantly consume dynamic and static power, even in the absence of an input signal. Furthermore, tuning individual filters requires additional digital hardware to implement non-overlapping clocks for each filter. More importantly, an anti-aliasing pre-filtering stage is needed: in the absence of such a filter, any noise or interference outside the signal band will fold back and corrupt the signal. This causes a significant circuit overhead for the implementation of low-order transfer functions. Therefore, a switched-capacitor implementation is normally reserved for complex filter characteristics in which the overhead is justified [83].

Active RC filters use opamps in feedback along with passive resistors and capacitors. Feedback provides linearity and precise positioning of complex poles, which in turn requires high gain amplifiers with low output impedance, often creating significant overhead in complexity and power [84]. Furthermore, the reduction in size of the passive elements is still not enough to allow on-chip time constants on the order of milliseconds. Additionally, the time constants cannot easily be adjusted via programmability, because they are set by the size of the passive components.
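For reference, the switched-capacitor sizing of Equations 5.5-5.6 is a one-line computation, shown here with the frequencies and minimum capacitor quoted above.

```python
import math

f_3db = 16.0        # target cut-off [Hz] (tau ~ 10 ms)
f_sw = 20e3         # switching frequency [Hz], satisfying Nyquist for 10 kHz
C_s = 10e-15        # smallest reliably manufacturable capacitor (10 fF)

ratio = 2.0 * math.pi * f_3db / f_sw     # C_s / C_l from Equation 5.5
C_l = C_s / ratio
print(f"C_s/C_l = {ratio:.4f}  ->  C_l = {C_l * 1e12:.1f} pF")   # ~2 pF
```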
The drawback of using a Gm-C solution is the inherent nonlinearity of the transconductance devices when the signal swing is large. In a low power application, however, the signal is inherently small and this will cause minimal error. A fully differential structure was chosen for several reasons. First of all, a differentialarrangementallowsahigherswingandSNRatthepriceofanincrease in power consumption. Secondly, it improves immunity to the common mode noiseandinterferenceintheenvironmentandthecircuitismorelinear, sinceeven order nonlinearities are eliminated. Furthermore, for a given transconductance, the capacitor size can be halved if connected differentially across a cell’s output. + – + – + – + – gm 1 gm 2 v b C v b + v out - v in + - Figure 5.2: Low-pass filter implementation H LP was synthesized as shown in Figure 5.2. The transfer function for H LP is given by: v out v in = gm 1 C 1 s+ gm 2 C 1 (5.7) where p = gm 2 /C 1 based on Eq. 5.2. Once gm 2 is determined, gm 1 sets the 68 passband gain of the filter. The effect of bottom-plate capacitance was balanced by cross-connecting two capacitors between the output terminals but in opposite polarities [83]. The allpass filters H AP were synthesized as shown in Figure 5.3. The transfer + – + – gm 5 + – + – + – + – gm 3 v b3 v b4 v b5 gm 4 + v out - v in + - Low-Pass Filter Figure 5.3: All-pass filter implementation function of H AP can be derived as: v out v in = gm 5 gm 4 gm 1 gm 3 gm 5 C − gm 2 C −s gm 2 C +s (5.8) To match this transfer function to Eq. 5.3, we set gm 2 /C 1 = p, gm 3 =2gm 5 and gm 1 = gm 2 . gm 4 adjusts the passband gain of the filter. In both filters, the transconductances of the Gm-cells can be adjusted via v b . The all-pass filter topology is modular and employs the low-pass filter of Figure 5.2. The second order LEV kernels are generated using analog multipliers to cross- multiply the first order LEV kernels obtained from the filters. The weighting blockisimplementedusingprogrammableanalogmultiplierstoallowadjustments of the model coefficients. 69 5.3 Noise and Performance Limitations The goal of this work is to implement the analog signal processing system of Figure 5.1 with the minimum power possible in 0.13 μm CMOS technology along with a reasonable size. Neural hardware has relatively relaxed requirements on speed and dynamic range and the fundamental limit to the analog processing circuitry is set by the required SNR. As explained in the following section, in the Laguerre filters, noise restricts the size of the filter capacitors, and hence, sets the minimum current to be used. 5.3.1 Noise in Electronic Circuits Integratedcircuitelements, suchasCMOStransistorsandresistors, produceelec- trical noise. The power of the noise limits the minimum signal level a circuit can effectively process. Improving noise performance for a given input signal ampli- tude affects the power, speed, and linearity performance of a circuit. Therefore, to accurately predict the performance of a circuit, it is critical to account for noise during the design stage. Since noise is a random process, its instantaneous value is not known but it is possible to predict the average noise power generated in a circuit. In the time domain: P av = lim T→inf 1 T T/2 −T/2 n 2 (t)dt (5.9) where n(t) is the instantaneous value of the noise [85, p.203]. It is common to perform noise analysis in the frequency domain using power spectral density for ease of calculation [85]. 
The noise power spectral density allows us to determine how much noise power the signal carries at each frequency. The power spectral densities of different circuit noise sources are predictable and mostly uncorrelated. Hence, the powers of uncorrelated noise sources can be added, and superposition applies. Once the power spectral density is found, the total average noise power can be obtained by integrating over the operating frequency range. The two main noise sources of interest in this work are thermal and flicker noise.

5.3.1.1 Thermal Noise

Thermal noise arises from the random motion of electrons in a conductor, including resistors and CMOS transistors, and is proportional to absolute temperature.

Figure 5.4: Thermal noise representations in a resistor and a MOS transistor

Figure 5.4 (a) and (b) show how thermal noise in a resistor can be modeled, and Figure 5.4 (c) shows the thermal noise model of a CMOS transistor. In a resistor, thermal noise is accounted for as a voltage in series with a noiseless resistor or, equivalently, a current in parallel with a noiseless resistor. The noise voltage has a power spectral density (PSD) described by:

S_v(f) = 4kTR, \quad f \geq 0   (5.10)

measured in V²/Hz, where k = 1.38×10⁻²³ J/K is Boltzmann's constant and T is the operating temperature in Kelvin. This is a one-sided representation of the PSD. Since the PSD is flat with frequency, the variance, on a per-Hertz basis, is:

\sigma^2 = \overline{v_n^2} = \int_{f_0}^{f_0 + 1\,\mathrm{Hz}} S_v(f)\,df = 4kTR \ \mathrm{V^2/Hz}   (5.11)

The equivalent resistor noise current in A²/Hz is given by:

\overline{i_n^2} = 4kT/R   (5.12)

In a transistor, the thermal noise element is a current source placed between the drain and source terminals, and its noise current in units of A²/Hz is given by:

\overline{i_n^2} = 4kT\gamma\,gm   (5.13)

where gm is the transistor transconductance and γ is the noise coefficient, which depends on the transistor's channel length and is process-dependent. Thermal noise is considered "white" noise with a flat power spectral density and an almost infinitely large bandwidth compared to the operating bandwidth of most integrated circuits. In a linear system, the noise is shaped by the circuit's power transfer function, which limits its bandwidth [85, p. 206].

5.3.1.2 Flicker Noise

In addition to thermal noise, CMOS transistors exhibit flicker noise. This type of noise occurs due to dangling bonds at the interface between the gate oxide and the substrate, which randomly trap and de-trap charge carriers and hence cause voltage fluctuations. Unlike thermal noise, which has a flat frequency spectrum, flicker noise is inversely proportional to frequency, which is why it is often called 1/f noise. It can be modeled as a voltage source in series with the gate terminal of a transistor with an approximate power spectral density:

S_v(f) = \frac{K_f}{C_{ox} W L} \cdot \frac{1}{f}   (5.14)

where K_f is a process-dependent constant, W and L are the transistor width and length, respectively, and C_ox is the gate-oxide capacitance per unit area [85]. Figure 5.5 illustrates why it is important to consider flicker noise, especially when designing biomimetic systems, which operate at relatively low frequencies. Flicker noise power dominates at low frequencies but decays rapidly as frequency increases, whereas thermal noise remains constant.
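The frequency at which the two contributions are equal, the corner frequency, follows from equating Eq. 5.14 with the input-referred thermal PSD 4kTγ/gm. The minimal C sketch below evaluates it; K_f and C_ox are illustrative assumptions rather than the actual process parameters, while the device size and gm are the filter-scale values of Section 5.5.

/* Corner-frequency estimate: flicker PSD (Eq. 5.14) = thermal PSD 4kT*gamma/gm.
   Kf and Cox are assumed values, for illustration only. */
#include <stdio.h>

int main(void) {
    double k = 1.38e-23;          /* Boltzmann constant [J/K] */
    double T = 300.0;             /* temperature [K] */
    double gamma = 2.0 / 3.0;     /* noise coefficient (value used in 5.8) */
    double gm = 250e-12;          /* transconductance [S] */
    double Kf = 1e-25;            /* flicker constant [J] -- assumed */
    double Cox = 8e-3;            /* oxide cap per area [F/m^2] -- assumed */
    double W = 3e-6, L = 0.5e-6;  /* device size from Section 5.5 */

    double Sth = 4.0 * k * T * gamma / gm;  /* thermal PSD [V^2/Hz] */
    double fc = Kf / (Cox * W * L) / Sth;   /* corner frequency [Hz] */
    printf("thermal PSD = %.3g V^2/Hz, corner = %.3g Hz\n", Sth, fc);
    return 0;
}

With these assumed values the corner lands well below the signal band, consistent with the design goal stated next.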
To optimize noise performance, it is therefore desirable to reduce flicker noise as much as possible, so that the circuit's operating frequencies of interest lie above this so-called "corner frequency". As shown in the combined noise spectrum in Figure 5.5 (c), at frequencies larger than the corner frequency, thermal noise dominates.

Figure 5.5: Thermal, flicker, and combined noise spectrum of a MOS transistor

In this work, mostly PMOS transistors were used in the circuits implementing the LEV model, because it has been shown that PMOS transistors have better mismatch performance [44]. All circuits were designed so that the frequency band of interest lies above the corner frequency. Therefore, flicker noise is not significant compared to thermal noise.

5.3.2 Maximizing the SNR in the Presence of Noise

The total amount of noise that appears at the output of a gm-C filter is set by the amount of capacitance used: more capacitance corresponds to less noise. To illustrate this trade-off, a simple noise model for a first order low pass filter is shown in Figure 5.6. The filter consists of a Gm cell, an equivalent resistor (a combination of the output resistance of the transconductance cell and any other external resistance loading it), and a capacitor. The noise is modeled as a current with a power spectral density of 4kTΓGm, where Γ is a constant that increases with the complexity of the transconductance cell and the filter as a whole. In its simplest form, the Gm cell can be a single transistor, in which case Γ = γ.

Figure 5.6: Noise models (a) A generic Gm-C lowpass filter (b) Thevenin equivalent

To simplify calculations, the Thevenin equivalent of the current noise source (Figure 5.6 (b)) is used:

\overline{v_n^2} = 4kT\Gamma\,gm\,R_{out}^2   (5.15)

The transfer function that the noise experiences is given by:

\frac{v_{n,out}}{v_{in}} = \frac{1}{1 + sCR_{out}}   (5.16)

The output voltage noise power spectral density is then:

\overline{v_{n,out}^2} = \overline{v_n^2} \cdot \left| \frac{1}{1 + j\omega C R_{out}} \right|^2 = 4kT\Gamma\,gm\,R_{out}^2 \cdot \frac{1}{4\pi^2 R_{out}^2 C^2 f^2 + 1}   (5.17)

The total noise power can then be found by integrating this expression:

P_{n,out} = \Gamma\,gm\,R_{out} \int_0^{\infty} \frac{4kT R_{out}}{4\pi^2 R_{out}^2 C^2 f^2 + 1}\,df = \Gamma\,gm\,R_{out} \cdot \frac{2kT}{\pi C} \tan^{-1}(2\pi R_{out} C f)\Big|_{f=0}^{f=\infty} = \Gamma\,gm\,R_{out} \cdot \frac{kT}{C}   (5.18)

From this result, three observations can be made about the limits of power consumption. First of all, the total noise is inversely proportional to C. Therefore, the lower limit for C is determined by the required SNR of the system (assuming the signal swing is already maximized). Secondly, Γ should be as low as possible, which means the circuit implementing the transconductance in the Gm-C filter should have the least amount of complexity. Finally, the output noise power is proportional to the passband gain gm·R_out. However, the passband gain also sets the output signal amplitude. Since the output signal power is proportional to the square of the passband gain, while the noise power is only proportional to the passband gain, the SNR actually increases with higher passband gain.

From these observations, the design strategy for the Laguerre filter stages is to pick the smallest capacitor value that satisfies the SNR, for an economical on-chip implementation of the model; this in turn determines gm (set by gm/C = p). Then the topology that maximizes gm while minimizing complexity should be picked.
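A minimal C sketch of this sizing rule, assuming the Γ = 4γ low-pass noise factor derived later in Section 5.8.1 and the simulated L0 signal power from Table 5.2:

/* Smallest filter capacitor meeting the SNR target, from P_n = Gamma*kT/C. */
#include <stdio.h>

int main(void) {
    double k = 1.38e-23;              /* Boltzmann constant [J/K] */
    double T = 300.0;                 /* temperature [K] */
    double Gamma = 4.0 * (2.0 / 3.0); /* low-pass noise factor, 4*gamma */
    double Psig = 232e-9;             /* L0 signal power [V^2] (Table 5.2) */
    double snr = 2.0;                 /* required SNR (Section 4.3.1) */

    /* Gamma*k*T/C <= Psig/snr  =>  C >= Gamma*k*T*snr/Psig */
    double Cmin = Gamma * k * T * snr / Psig;
    printf("C_min = %.0f fF\n", Cmin * 1e15);
    return 0;
}

The result of roughly 0.1 pF shows that the 5 pF chosen for the low-pass stage leaves a comfortable margin; much of that margin is consumed by the noisier all-pass stages analyzed in Section 5.8.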
The next sections of this chapter introduce why transistors operating in weak inversion are optimal for implementing this model, and present the transistor implementations of the filters using the least circuit complexity.

5.4 The Power Efficient Weak-Inversion Regime and Its Limitations

Operating transistors in the weak-inversion (subthreshold) regime is ideal for low bandwidth, slow processing, ultra-low power hardware. In this region, transistors are biased below their threshold voltage and, therefore, currents are inherently low. Since circuit dynamics are determined by the speed at which capacitors charge and discharge, low currents allow implementing the large time constants often needed for prostheses which operate in real time, while keeping capacitor area within reasonable limits.

5.4.1 Derivation of Drain-Current Characteristics

For an n-type transistor, the source terminal is always biased at a lower potential than the drain. Therefore, the Fermi potential of the drain is always lower than that of the source by qv_ds, and the barrier from source to channel, φ_s, is lower than that from drain to channel, φ_d. As a result, more charge will be at the end of the channel closer to the source compared to the end at the drain. Charge carriers will now flow from source to drain by diffusion, as a result of the charge concentration difference between source and drain. According to the Boltzmann distribution, the carrier densities at the source and drain ends are [86]:

N_s = N_0\,e^{-\phi_s/kT}   (5.19)

and

N_d = N_0\,e^{-\phi_d/kT}   (5.20)

The transistor is fabricated to have a built-in potential φ_0 between source and channel and between drain and channel. The barrier from source to channel is then found as:

\phi_s = \phi_0 + q(v_g - v_s)   (5.21)

and the barrier from drain to channel as:

\phi_d = \phi_0 + q(v_g - v_d)   (5.22)

Combining the above equations, the carrier densities can be written in terms of the gate, drain and source voltages [86]:

N_s = N_0\,e^{-\frac{\phi_0 + q(v_g - v_s)}{kT}}   (5.23)

and

N_d = N_0\,e^{-\frac{\phi_0 + q(v_g - v_d)}{kT}}   (5.24)

The charge density gradient at a given distance z along the channel is:

\frac{dN}{dz} = \frac{N_d - N_s}{l} = \frac{N_0}{l}\,e^{-\phi_0/kT}\,e^{-qv_g/kT}\left(e^{qv_d/kT} - e^{qv_s/kT}\right)   (5.25)

From this, the current per unit channel width can be found based on the relation between current density and diffusion velocity:

J = qNv_{diff}   (5.26)

and

v_{diff} = -D\,\frac{dN}{dz}   (5.27)

The current per unit width is given by

\frac{I}{w} = qNv_{diff} = -qD\,\frac{dN}{dz}   (5.28)

where D = kT t_f/(2m) is the diffusion constant of the particle. The diffusion current inside a transistor can then be found as:

I = qD\,\frac{w}{l}\,N_0\,e^{-\phi_0/kT}\,e^{-qv_g/kT}\left(e^{qv_s/kT} - e^{qv_d/kT}\right)   (5.29)

This equation holds when the mobile charge per unit area in the channel is much smaller than the depletion charge in the substrate. It assumes that the substrate consists of intrinsic silicon. However, the substrate is usually lightly doped: p-type for NMOS transistors and n-type for PMOS transistors. Therefore, the charges in the substrate reduce the effectiveness of the gate at controlling the barrier energy [86, p. 38]. A factor κ has to be introduced to account for this effect. The transistor current in weak inversion is then adjusted to:

I = qD\,\frac{w}{l}\,N_0\,e^{-\phi_0/kT}\,e^{-q\kappa v_g/kT}\left(e^{qv_s/kT} - e^{qv_d/kT}\right)   (5.30)

This equation can be rewritten as:

I = I_0\,e^{-q\kappa v_g/kT}\left(e^{qv_s/kT} - e^{qv_d/kT}\right) = I_0\,e^{-q(\kappa v_g - v_s)/kT}\left(1 - e^{qv_{ds}/kT}\right)   (5.31)

An adjusted version of this model for the drain current in weak inversion, called the EKV (Enz-Krummenacher-Vittoz) model, has been proposed in [87].
Based on the EKV model, the drain current I_D in weak inversion, referenced to the transistor's source potential for a given gate-source voltage V_GS, is described by:

I_D = 2n\mu C'_{OX} U_T^2\,\frac{W}{L}\,e^{\frac{V_G - V_{T0}}{nU_T}}\left(e^{-V_S/U_T} - e^{-V_D/U_T}\right)   (5.32)

Here μ is the channel carrier mobility and C'_OX is the gate-oxide capacitance per unit area (in the EKV notation, C'_OX represents the capacitance per unit area and C_OX = W·L·C'_OX is the total gate-oxide capacitance; in mainstream CMOS literature, C_OX represents the gate-oxide capacitance per unit area). U_T is the thermal voltage, V_T0 the threshold voltage, and W and L the effective width and length of the transistor, respectively. n is the substrate factor, which represents a loss in coupling efficiency between the gate and the channel, and is typically around 1.4 [88, p. 13].

When V_DS is sufficiently large, the drain current does not change with the drain-source voltage, as is generally desired for saturation region operation, which is the relevant case for analog design. I_D is then described by

I_D = 2n\mu C'_{OX} U_T^2\,\frac{W}{L}\,e^{\frac{V_G - V_{T0}}{nU_T} - \frac{V_S}{U_T}} = I_0\,e^{\frac{V_G}{nU_T} - \frac{V_S}{U_T}}   (5.33)

In weak inversion, the desired saturation region is achieved for V_DS > 4U_T ≈ 100 mV.

5.4.2 Transconductance Efficiency

The transconductance of a transistor in subthreshold (with v_s = 0) can be calculated as:

gm = \frac{\partial I_D}{\partial V_{GS}} = \frac{I_D}{nU_T}   (5.34)

An important measure for evaluating the transconductance efficiency of a transistor at a given drain current is gm/I_D, which is independent of transistor size in weak and strong inversion. In weak inversion, this factor is maximized:

\left.\frac{gm}{I_D}\right|_{WI} = \frac{1}{nU_T}   (5.35)

As a comparison, in strong inversion, where the effective voltage (V_eff = V_GS − V_T) is generally at least 100 mV, the transconductance efficiency is:

\left.\frac{gm}{I_D}\right|_{SI} = \frac{2}{V_{eff}}   (5.36)

Therefore, using transistors operating in the weak-inversion regime is the most power-efficient choice. However, it is important to consider several non-idealities which can potentially limit the signal processing capabilities of subthreshold circuits, such as leakage, parasitic capacitances, and mismatch, which are described in the following sections.

5.4.3 Leakage

Transistor leakage currents can limit the power performance of an analog circuit if the transistor bias current is not significantly larger. These leakage mechanisms include source/drain leakage, which occurs because the source and drain regions act as reverse-biased junctions; thin-oxide gate tunneling, which occurs due to electrons tunneling through the gate into the channel; and sub-threshold leakage, which is the current that flows from drain to source when the gate-source voltage is zero [89]. In the 0.13 μm technology used here, the total leakage is below about 1 pA for the typical transistor sizes employed (3 μm/500 nm). Therefore, to ensure proper circuit operation, all transistor bias currents are at least an order of magnitude larger than this leakage.

5.4.4 Parasitic Node Capacitances

Parasitic node capacitances limit the bandwidth of a circuit, while larger currents increase it. In this work, transistor sizes were carefully chosen as large as possible to satisfy the flicker noise requirements while still allowing operation up to the 5 kHz required for neural signal processing, as discussed in Section 5.5.

5.4.5 Mismatch

Mismatch in the static voltage-current characteristics of transistors is mainly due to variations in the transfer parameter (also called the current factor) β = μC_OX(W/L) and the transistor threshold voltage V_T0.
These variations arise from time-independent variations in physical quantities of identically designed devices, such as edge effects, implantation and surface-state charge, oxide effects, and mobility effects, and several methods have been proposed to model them [90] [91].

There are two different types of mismatch that cause variations in the static voltage-current characteristics of identical transistor pairs: gradient and random mismatch. Gradient mismatch occurs due to spatial processing gradients. This type of mismatch can be avoided using proper layout techniques, such as common-centroid arrangements. Random mismatch occurs when the current or voltage fluctuations are not spatially correlated. This type of mismatch can be mitigated by increasing the area of the transistor [91]. However, as explained previously, increasing area for better accuracy reduces the circuit's bandwidth.

It can be shown that the relative mismatch of the drain currents of two identically sized transistors with the same gate voltage (such as in a current mirror) is:

\left(\frac{\sigma(\Delta I_D)}{I_D}\right)^2 = \left(\frac{gm}{I_D}\right)^2 \sigma^2(\Delta V_T) + \left(\frac{\sigma(\Delta\beta)}{\beta}\right)^2   (5.37)

Similarly, the relative mismatch of the gate-source voltages of two identically sized transistors carrying the same drain current (such as in a differential pair in which an input DC voltage is applied to null the offset) is:

\sigma^2(\Delta V_{GS}) = \sigma^2(\Delta V_T) + \left(\frac{I_D}{gm}\right)^2 \left(\frac{\sigma(\Delta\beta)}{\beta}\right)^2   (5.38)

For current mode operation, where the signals of interest are the input and output currents, the value and accuracy of the gate-source voltage that develops is not consequential. Since gm/I_D is maximized in the weak-inversion region, the current mismatch of Equation 5.37 is more significant in weak inversion than in strong inversion [68]. For voltage mode operation, in which circuits process voltage signals, the drain current value and accuracy are not of direct interest. The voltage mismatch of Equation 5.38 is minimized in subthreshold, because gm/I_D is maximized. This hints that in subthreshold circuit design, special attention must be paid to the adjustment of offset in nominally matched current sources.

There are several design options to improve mismatch performance. One method is to operate current mode blocks in strong inversion and voltage mode blocks in subthreshold. However, this potentially results in larger area or power consumption for a given circuit. Another strategy is to operate all transistors in subthreshold and to compensate for mismatch in the worst case. We have chosen the latter in this work.

One possibility offered by the use of PMOS transistors in the LEV circuit is to adjust and remove mismatch via the bulk terminal, which acts similarly to the gate terminal: its potential affects the transistor's threshold voltage. The bulks of PMOS transistors can be adaptively biased to reduce voltage mismatch. In a standard process, PMOS transistors are placed in their own wells. This means their bulk terminals can be connected to the source so that the bulk-source voltage is zero and mismatch is minimized.

For the 0.13 μm CMOS process used in this work, the slope factor n in Equation 5.32 is approximately 1.33. It can be shown, using Eq. 5.32, that for approximately every 80 mV change in gate voltage, a transistor's drain current changes by a factor of 10. Therefore, we adjust the gate voltage to compensate for the often significant current mismatch in the weak-inversion region using a digital calibration subsystem.
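A minimal numeric C check of these figures, assuming room temperature (the exact step depends slightly on the value taken for nU_T):

/* Gate-voltage sensitivity of the subthreshold drain current: mV per decade
   and the gate step needed for a 5% current change, I_D ~ exp(V_G/(n*U_T)). */
#include <stdio.h>
#include <math.h>

int main(void) {
    double n = 1.33;                    /* slope factor */
    double UT = 25.9e-3;                /* thermal voltage [V] at ~300 K */
    double decade = n * UT * log(10.0); /* ~80 mV per decade of current */
    double step = n * UT * log(1.05);   /* gate step for a +5% current change */
    printf("%.0f mV/decade, 5%% step = %.2f mV\n", decade * 1e3, step * 1e3);
    return 0;
}

This reproduces the roughly 80 mV/decade slope and gives a step of about 1.7 mV, in line with the resolution requirement stated next.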
Consistent with this, it can be calculated that to keep the current mismatch below 5%, the gate voltage must be programmable with a resolution better than 1.76 mV.

5.5 Analog LEV Model Implementation

Because the transconductance efficiency is maximized in weak inversion, all circuit blocks were designed using CMOS transistors operating in this region. The circuit components were implemented in 0.13 μm Cypress Semiconductor CMOS technology and mostly use PMOS transistors because of their mismatch and noise advantages.

5.5.1 First Order LEV Generation

Figure 5.7: Transistor level implementation of the Laguerre Filters (a) Low-pass filter (b) All-pass filter

Figure 5.7 (a) and (b) show the transistor level implementations of the low-pass and all-pass filters, respectively. These topologies use the minimum number of components to implement the filters, which minimizes circuit noise. The proposed circuits also have the minimum number of parasitic poles, since the number of internal nodes is minimized, which maximizes bandwidth performance.

In the low-pass filter of Figure 5.7 (a), M_1a,b implement gm_1 and transistors M_2a,b implement gm_2. The AC response of the OTA comprising gm_1 and gm_2 is shown in Figure 5.8. The 3 dB bandwidth is approximately 6.6 kHz, which is above the 5 kHz bandwidth required to process neural signals. The transconductances gm_1 and gm_2 of the low-pass filter are set to 250 pS via vb, and C is 5 pF, to achieve the desired inverse time constant p = 50 s⁻¹. The output impedance R_out of the OTA is shown in Figure 5.9. Since the differential output impedance is approximately R_out = 1/gm_2, and gm_2 = gm_1, it can be found from this graph that gm_1 = gm_2 = 1/(4 GΩ) = 250 pS. This is expected, as MOS devices carrying the same current should have the same gm.

Figure 5.8: AC response of the OTA including gm_1 and gm_2

In this CMOS process, the area of a square 100 fF metal capacitor is approximately 120 μm². The size of the bias transistor M_b in the OTA was chosen as W_b/L_b = 0.42 μm/4 μm to ensure a large output impedance of the current source. For the differential input pair and the diode-connected pair, the size was maximized to achieve better flicker noise and matching performance, while ensuring that the right-half-plane zero, which is generated by the gate-drain capacitance of the differential input pair, remains well above the 5 kHz operating frequency. Figure 5.10 shows how varying the sizes of the differential pairs in the lowpass filter changes the location of this parasitic zero. Consequently, the differential pair size was chosen to be W/L = 3 μm/0.5 μm.

Figure 5.9: AC output impedance of the OTA implementing gm_1 and gm_2

Figure 5.10: The effect of transistor size on the low pass filter response

The AC magnitude and phase response of the lowpass filter with these sizes is shown in Figure 5.11. The pole of the lowpass filter is at approximately 8 Hz, since f_{-3dB} = 50/(2π) = 7.96 Hz.

Figure 5.11: AC response of the lowpass filter

In the all-pass filter of Figure 5.7 (b), M_3a−d implement gm_3 = 2gm_5, while M_4a,b and M_5a,b correspond to gm_4 and gm_5, respectively. The transconductance values can be adjusted via vb_3 and vb_5.
The bleeder NMOS transistor pair M_6a,b steers current away from the PMOS transistor pair M_4a,b to ensure the proper transconductance value; therefore, the transconductance gm_4 can be adjusted via vb_6a,b. The NMOS pair M_6a,b also allows removal of the output mismatch with the aid of the digital programmability discussed in Section 5.6. The AC response of the allpass filter is shown in Figure 5.12. It can be seen that the pole and zero of the allpass filter overlap at around 8 Hz and that the first parasitic pole is located above 5 kHz to ensure the required bandwidth of the system. As shown in Figure 5.13, choosing the correct sizes for the allpass filter is critical, because it has to be ensured that the parasitic pole at the output of the allpass filter is at a high enough frequency to satisfy the bandwidth requirement.

Figure 5.12: AC response of the allpass filter

Figure 5.13: Transistor sizing in the allpass filter

To meet the bandwidth requirement in a cascade of the filters, the parasitic poles throughout the filter chain must not become a limiting factor. The AC response through a cascade of three all-pass filters is shown in Figure 5.14 for three different transistor sizes. This AP filter uses the LP filter of Figure 5.7 (a) as a sub-block. The low-pass filter and three all-pass filters are cascaded as shown in Figure 5.1 to realize the first order part of the LEV model.

Figure 5.14: AC response of a cascade of three all-pass filters

5.5.2 Second Order LEV Generation

To generate the second order LEV terms, Gilbert cells are used for cross-multiplication, as shown in Figure 5.15. One Laguerre polynomial filter output is applied differentially to the gates of M_1a,b (Δv_1) and the other filter output to the gates of M_2a−d (Δv_2). The differential output current ΔI_out for this Gilbert cell operating in weak inversion can be derived as:

I_{out}^{+} - I_{out}^{-} = \Delta I_{out} = I_{bias}\,\tanh\!\left(\frac{\Delta v_1}{nu_T}\right) \cdot \tanh\!\left(\frac{\Delta v_2}{nu_T}\right)   (5.39)

Figure 5.15: Gilbert cell transistor level implementation

where I_bias is the tail current, given by:

I_{bias} = 2n\mu_p C'_{OX} u_T^2\,\frac{W}{L}\,e^{\frac{v_{biasp} - v_T}{nu_T}}   (5.40)

For small variations in Δv_1 and Δv_2, the output current ΔI_out can be approximated as:

\Delta I_{out} \approx I_{bias}\,\frac{\Delta v_1}{nu_T} \cdot \frac{\Delta v_2}{nu_T}   (5.41)

which holds if Δv_1 ≪ nu_T and Δv_2 ≪ nu_T. For the 0.13 μm CMOS process used in this work, nu_T = 34 mV at room temperature (where n = 1.33 for the given bias current). Therefore, the differential output current is the product of the two differential inputs for small variations. Injecting this current into the NMOS transistor pair generates the differential output voltage Δv_out.

The transistors in the Gilbert cell were sized to ensure the bandwidth requirement and proper DC operation of the circuit. For modularity and easy calibration (as explained in Section 5.6), the size of the bias transistor was kept the same as that of the filters (W_b = 0.42 μm and L_b = 4 μm). However, to ensure a sufficiently large linear range of operation of the Gilbert cell, the total DC current is larger than that of the OTA (11×).

Figure 5.16: Common-mode feedback for Gilbert cell

Figure 5.17 (a) shows the performance of the Gilbert cell as a linear multiplier in comparison to an ideal linear multiplier.
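A behavioral C sketch of Eqs. 5.39 and 5.41, showing how the small-signal product approximation degrades as the inputs approach nu_T (the tail current value is illustrative, not the actual bias):

/* Weak-inversion Gilbert multiplier: exact tanh model vs. linear product. */
#include <stdio.h>
#include <math.h>

int main(void) {
    double nuT = 34e-3;     /* n*u_T at room temperature [V] */
    double Ibias = 1e-9;    /* tail current [A] -- illustrative value */
    double dv;

    for (dv = 2e-3; dv <= 10e-3; dv += 4e-3) {  /* both inputs set to dv */
        double exact = Ibias * tanh(dv / nuT) * tanh(dv / nuT); /* Eq. 5.39 */
        double lin = Ibias * (dv / nuT) * (dv / nuT);           /* Eq. 5.41 */
        printf("dv = %4.1f mV: approximation error = %.1f%%\n",
               dv * 1e3, 100.0 * (lin - exact) / exact);
    }
    return 0;
}

The on-chip linearity additionally depends on the output loading, so the simulated rms error quoted next is evaluated on the complete circuit rather than on this idealized model.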
The rms error (shown in Figure 5.17 (b)) between the ideal and circuit outputs is below 10% for a differential input range of 100 mV. Since the filter outputs all have a similar DC output level, the differential pair transistor sizes of M_1a,b and M_2a−d had to be chosen carefully to ensure operation in the saturation region. The drain voltage of the bias transistor has to be pushed to a minimum of 100-200 mV. Therefore, M_1a,b was chosen with a smaller W/L ratio (W_1 = 0.42 μm and L_1 = 4 μm) than M_2a−d (W_2 = 3 μm and L_2 = 0.5 μm), so that the gate-source voltage of M_1a,b is larger, pushing the drain of the current source to lower voltages. The NMOS transistor sizes were chosen as W_3 = 1 μm and L_3 = 4 μm; since these devices act as a current bleeder, they were designed like a current source.

Figure 5.17: Gilbert cell performance (a) Comparison to an ideal linear multiplier (b) rms error in % between the fit and the circuit output voltage

To stabilize the DC output common-mode voltage (as needed in every differential circuit), a common-mode feedback (CMFB) circuit, shown in Figure 5.16, is used. This CMFB circuit senses the common-mode voltage at the output and uses feedback to adjust the bias of the bottom NMOS pair of the Gilbert cell. Since feedback is applied around the Gilbert cell, stability has to be ensured. Therefore, open-loop AC simulations of the Gilbert cell and CMFB were conducted; the result is shown in Figure 5.18. The phase margin is approximately 52 degrees.

Figure 5.18: Open loop phase margin response of Gilbert cell with common-mode feedback

5.5.3 Weighting and Combining

Like the filters, the weighting block is designed in a modular manner: for modularity, easy calibration, and reduced design and testing time, the weighting block shown in Figure 5.19 utilizes the multiplier core in the dashed box of Figure 5.15. Each LEV coefficient c_k is applied as a differential DC voltage at v_1 and is multiplied by its corresponding LEV signal applied at v_2 using the multiplier core. To combine the first and second order LEV polynomials after multiplication with c_n and c_{n1,n2}, the current outputs of the multiplier cores are summed in transistors M_4a,b in Figure 5.19, producing the final output voltage as the weighted sum of the Laguerre polynomials and the higher order terms. The output NMOS devices are biased at the gate by v_cmfb, generated by a single common-mode feedback circuit for the weighting block.

Figure 5.19: Gilbert weighting block
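Behaviorally, the weighting block simply forms u = Σ_k c_k·L_k over the four first order and ten second order terms, one multiplier core per term. A minimal C sketch of this operation; the signal and coefficient values are placeholders, not the trained LEV weights:

/* Weighted combination of Laguerre terms: u = sum_k c_k * L_k. */
#include <stdio.h>

#define NTERMS 14   /* 4 first order + 10 second order terms */

int main(void) {
    /* Instantaneous term voltages [V] and coefficients -- placeholders. */
    double L[NTERMS] = {1.2e-3, 0.8e-3, 0.5e-3, 0.3e-3,
                        0.6e-3, 0.4e-3, 0.3e-3, 0.2e-3, 0.3e-3,
                        0.2e-3, 0.15e-3, 0.1e-3, 0.08e-3, 0.05e-3};
    double c[NTERMS] = {0.9, -0.4, 0.2, -0.1,
                        0.3, -0.2, 0.1, -0.05, 0.1,
                        -0.05, 0.03, -0.02, 0.01, -0.01};
    double u = 0.0;
    int k;

    for (k = 0; k < NTERMS; k++)   /* one multiplier core per term */
        u += c[k] * L[k];
    printf("u = %.3g V\n", u);
    return 0;
}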
The common-mode feedback is identical to that of the Gilbert cell in Figure 5.16. As shown in the AC open-loop simulation of Figure 5.20, the Gilbert weighting block and its CMFB are stable with a phase margin of 56 degrees.

Figure 5.20: Gilbert weighting block AC response

The top-level schematic of the analog system is shown in Figure 5.21. The filter outputs are buffered before cross-multiplication to ensure the proper DC levels at the inputs to the Gilbert multipliers. The buffers consist of an OTA exactly like the one in Figure 5.2. The AC and DC voltages of each first and second order LF, as well as the final chip output, are available as output test pins.

Figure 5.21: Analog top level schematic

To ensure testability of these analog outputs, they are buffered using the open-drain buffer shown in Figure 5.22. This prevents any loading of the analog circuits by the measurement equipment or PCB components. To measure the DC output bias of the analog blocks, each analog output has its own transmission gate (as shown in Figure 5.22), which can be enabled for DC testing via the digital calibration circuits (described in Section 5.6). When the transmission gate of an output is enabled, its DC level can be observed at the analog output, which is especially useful for compensating the DC offset of each analog block.

Figure 5.22: AC buffers and DC testing switches

5.6 Digital Calibration and Programmability

An overview of the digital subsystem used in this work to adjust the gate voltage of each analog current source independently, and thereby calibrate out random variations on the chip, is shown in Figure 5.23. This subsystem also provides the right voltage levels to set the time constant p and the weighting coefficients c_{n,m} of the LEV model.

Figure 5.23: Digital calibration system

The bias voltages needed on this chip were designed to be at one of three nominal levels: 275 mV for NMOS transistors, 600 mV for common-mode feedback references and weighting voltages, and 900 mV for PMOS transistors. For each voltage level, a resistive ladder is implemented that provides a range of about ±100 mV around the nominal bias value. Each ladder has 127 resistors and generates 128 voltage levels between v_high and v_low. The digital calibration circuits are used to assign the proper bias v_b1-v_bn to each transistor.

The detailed top-level representation of the digital circuits is shown in Figure 5.24, and the corresponding Cadence schematic is shown in Figure 5.25. The calibration circuitry consists of a gate-select decoder, the three resistive ladders, a tree mux for each gate to be biased with its corresponding register, and on-chip digital buffers for each digital signal.

Figure 5.24: Detailed representation of the digital system

Figure 5.25: Top level schematic of the digital subsystem

To program a certain transistor gate to a desired voltage v_bn, the digital subsystem first selects the register corresponding to this gate through a 7-bit Gate Select (GS) decoder, shown in Figure 5.26. Associated with each programmable gate is a 7-bit D-register, built using the D-flip-flop shown in Figure 5.27. Figure 5.28 shows the input/output signals of the D-flip-flop in this register. As shown in Figure 5.28, Resetb is an active-low signal which can be used to reset all registers asynchronously (output Q goes low). When signal D, shown in blue, is high and the clock signal (clk), shown in magenta, goes high, the output Q goes high. After calibration, clk stays low and Q remains at the previously programmed value. A value between 0 and 127, matching the desired voltage level, is then written into the register through another 7-bit Select (S) bus.
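The mapping from a register word to a bias voltage is simply a tap selection on the corresponding ladder. A minimal C sketch of this mapping; the helper function is a hypothetical model of what the on-chip ladder plus tree mux realize:

/* Register-word-to-bias-voltage mapping of one calibration ladder. */
#include <stdio.h>

/* Hypothetical model of the ladder + tree mux. */
double ladder_tap(double v_low, double v_high, int S /* 0..127 */) {
    return v_low + (v_high - v_low) * S / 127.0;
}

int main(void) {
    double v_low = 0.8, v_high = 1.0;  /* PMOS ladder: 900 mV +/- 100 mV */
    printf("step = %.2f mV\n", (v_high - v_low) / 127.0 * 1e3);
    printf("S = 64 -> %.4f V\n", ladder_tap(v_low, v_high, 64));
    return 0;
}

The resulting step of about 1.57 mV is finer than the 1.76 mV resolution requirement derived in Section 5.4.5.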
This programming phase for each register is relatively quick, as the digital circuit can be clocked at tens of megahertz. After the programming phase is completed, the digital circuit is turned off and consumes almost zero power. The words saved in the registers, together with the tree mux associated with each gate, then connect the gates to the right voltage levels. A 7-bit analog tree mux, made up of transmission gates as shown in Figure 5.29, selects the desired voltage from the ladder. This voltage is then applied directly to the gate of the transistor under calibration, to the v_adj input of a transconductance cell to set the correct time constant p, or to the input of a weighting multiplier core to set c_k.

Figure 5.30 demonstrates the functionality of the digital calibration system. At each clock cycle, a bias voltage is written to a gate (v_b1 to v_b127) and stored in the respective register. In this figure, the ladder voltage was set to a range of 0-1.8 V, and at the beginning of each clock cycle the gate bias voltage is increased by one step size (in this functionality test the resolution is approximately 14 mV).

Figure 5.26: 7-bit Gate-Select decoder which selects the gate under calibration

Figure 5.27: D-flip flop used in the register that stores the correct ladder value for each gate bias voltage

Figure 5.28: Simulated output of the D-flip flop: (a) Output Q (b) Input D (c) Clock signal CLK (d) Active low signal Resetb

Figure 5.29: Transmission gates in the tree mux used to select the appropriate resistive ladder tap

Figure 5.30: Simulated output of the ladder and mux

5.7 Simulation Results

To evaluate the performance of the hardware LEV model, the LEV coefficients and time constants are determined through Matlab simulations. The system is trained using the post-synaptic potential recorded at the soma of a hippocampal neuron in response to a 2 Hz Poisson random input spike train (Figure 5.31 (a)). The output of the circuit implementation simulated in the Cadence Analog Design Environment is shown in Figure 5.31 (b). For comparison, the output of the ideal LEV model simulated in Matlab is also shown in Figure 5.31 (c). The normalized mean square error between the ideal model and the subthreshold hardware implementation is 8.15%.

Each simulated LEV function (L0 to L33) and the final output (out) in response to a doublet input is shown in Figure 5.32, to demonstrate the second order nonlinear interaction captured by the LEV model.

Figure 5.31: Simulated output of the 2nd order LEV model: (a) 2 Hz RIT input, (b) ideal Matlab LEV output (PSP, mV), (c) LEV circuit output (PSP, mV)

The power consumption of each component of the system is shown in Table 5.1. The total power consumption of the system, which includes one lowpass filter, three allpass filters, ten Gilbert multipliers (each with its own common-mode feedback), and one weighting block (comprising fourteen sub-multipliers and one common-mode feedback), is 32.3 nW. The digital subsystem is only on during calibration and, therefore, does not contribute to the overall power consumption of the system.

The chip was laid out and fabricated in 5-metal Cypress Semiconductor 0.13 μm CMOS technology. The chip layout is shown in Figure 5.33. As shown, there are 68 input/output pins, with 20 digital lines, 6 ground/power lines, 6 ladder voltage lines and 28 analog test output lines.
The total chip area is 1 mm × 1 mm, of which the analog part occupies 0.3 mm × 0.3 mm. The analog circuitry is in the top center of the chip, the digital PMOS bias calibration (which constitutes the majority of the gates to be adjusted) is on the left, and the other two ladders are on the right side of the chip. The decoder is located at the bottom center.

Figure 5.32: Simulated Laguerre functions (L0-L33) for a double pulse input

Table 5.1: Power Consumption of the LEV System

Hardware Component      Power Consumption [nW]
OTA                     0.062
Low-Pass Filter         0.062
All-Pass Filter         0.248
Gilbert Multiplier      0.682
Gilbert CMFB            0.620
Weighting Block         9.461
Weighting Block CMFB    8.768
Total LEV System        32.3

Figure 5.33: Physical layout of the LEV Chip

5.8 Design Considerations for Optimal SNR

The signal-to-noise ratio is the main factor limiting how low the power consumption of a circuit can be. As described in Section 4.3.1, an SNR of at least 2 is required for the neural MIMO application. To design circuits which satisfy this requirement, the noise of each block in the system has to be estimated.

5.8.1 Noise in the Low Pass Filter Section

The AC equivalent model of the low pass filter is shown in Figure 5.34 (a). Using half-circuit analysis, it can be simplified to the low pass filter half circuit of Figure 5.34 (b) for easier analysis. Neglecting channel length modulation, and assuming equal transconductances (gm) and noise powers (i_n²) in M1 and M2, the half-circuit noise model is derived as shown in Figure 5.34 (c). With the aid of the Thevenin equivalent of the noise model, the average output noise power can be calculated via the model of Figure 5.34 (d). The load resistance of the circuit is approximately 1/gm, and the half-circuit input noise current is given by:

2\overline{i_n^2} = 2 \times 4kT\gamma\,gm   (5.42)

To simplify calculations, the Thevenin equivalent of the noise model is used, with an equivalent input noise voltage of:

2\overline{v_n^2} = 2 \times 4kT\gamma\,gm \cdot \frac{1}{gm^2}   (5.43)

Since this is the half-circuit noise voltage, the equivalent noise voltage of the lowpass filter is:

\overline{v_{n,LP}^2} = 4 \times 4kT\gamma\,gm \cdot \frac{1}{gm^2}   (5.44)

Comparing this result to the noise derivation for a general gm-C filter (Equation 5.15 and Equation 5.18), and noting that R_out = 1/gm, the total integrated noise power of the differential low pass filter of Figure 5.7 (a) can be calculated as:

P_{n,LP} = \Gamma \cdot \frac{kT}{C}   (5.45)

where Γ = 4γ.

Figure 5.34: AC model of the low pass filter: (a) low pass filter AC model, (b) low pass half circuit, (c) half circuit noise model, (d) Thevenin equivalent noise

In the transistor models of the design kit used in this work, the noise parameter γ is 2/3. The calculated total output noise P_{n,LP}, assuming operation at room temperature (300 Kelvin), is therefore:

P_{n,LP} = 4 \cdot \frac{2}{3} \cdot \frac{1.38 \times 10^{-23} \cdot 300}{2 \times 5 \times 10^{-12}}\ \mathrm{V^2} = 1.1\ \mathrm{nV^2}   (5.46)

For comparison, the simulated noise of the low pass filter is 1.18 nV², very close to the calculated value.
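A one-line numeric check of Eqs. 5.45/5.46 in C:

/* Low-pass output noise check: P_n = Gamma*k*T/C, Gamma = 4*gamma = 8/3. */
#include <stdio.h>

int main(void) {
    double k = 1.38e-23;              /* Boltzmann constant [J/K] */
    double T = 300.0;                 /* temperature [K] */
    double Gamma = 4.0 * (2.0 / 3.0); /* noise factor */
    double C = 2.0 * 5e-12;           /* differential filter capacitance [F] */
    printf("P_nLP = %.2f nV^2\n", Gamma * k * T / C * 1e9);  /* ~1.10 */
    return 0;
}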
5.8.2 Noise in the All Pass Filter Section

The derivation of the noise in the all pass section is similar to that of the lowpass filter. The allpass filter uses the lowpass filter as a subcomponent; therefore, the contribution of the lowpass section is the same as derived in the previous section and is initially not part of the following calculation of the allpass noise.

The AC half-circuit model of the all pass filter (without the lowpass subcomponent) is shown in Figure 5.35 (a). The noise model is shown in Figure 5.35 (b).

Figure 5.35: AC model and noise model for the all pass filter section: (a) partial all pass filter half circuit, (b) half circuit noise model

As for the lowpass filter, channel length modulation effects are neglected and the transconductances of transistors M1-M6 are taken as equal. Since twice the current flows through the NMOS transistors M_6a,b compared to the PMOS transistors M_4a,b, the NMOS transistors' noise power is also twice that of the PMOS transistors. Hence, the total noise current in the part of the allpass filter shown in Figure 5.35 is 6i_n². The parasitic capacitance at the output node of the all-pass filter is denoted C_par. This capacitance is found from simulation and has a value of 9.7 fF. It can be derived that the noise power of the double-ended all pass filter section is given by:

P_{n,AP_{part}} = \Gamma_{AP} \cdot \frac{kT}{C_{par}} = 3.4\ \mu\mathrm{V^2}   (5.47)

where Γ_AP = 12γ. Comparing this noise power to the total noise of the low pass filter, the noise in the all-pass section presented in Figure 5.35 clearly dominates. Hence, the lowpass section of the all pass filter can be neglected for noise analysis purposes. For comparison and verification of these results, the simulated noise power of the complete all-pass filter is found to be 1.1 μV². The noise in the all-pass filter is around 1000 times larger than that of the low pass filter, which agrees with the fact that the bandwidth of the all-pass filter is also about 1000 times larger.

Using the above results, an approximate value for the noise power at each point in the filter chain can be calculated, assuming the gain of each block is flat within the frequency band of interest. For example, the noise at the output of the third all pass filter in Figure 5.1 can be found with:

P_{n,AP3,tot} = \big((P_{n,LP}\,A_{AP1} + P_{n,AP1})\,A_{AP2} + P_{n,AP2}\big)\,A_{AP3} + P_{n,AP3}   (5.48)

where P_n denotes the output noise power and A denotes the power gain of each filter in the chain. Each all-pass filter was designed for a gain of 1, and from simulations the gain was found to be 0.95.
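Eq. 5.48 is easiest to apply iteratively; the following C sketch propagates the calculated per-stage noise through the chain and approximately reproduces the "calculated" column of Table 5.2 below:

/* Noise propagation through the LP + 3x AP Laguerre filter chain (Eq. 5.48). */
#include <stdio.h>

int main(void) {
    double PnLP = 1.1e-9;   /* low-pass output noise [V^2] */
    double PnAP = 3.4e-6;   /* per all-pass-stage noise [V^2], Eq. 5.47 */
    double A = 0.95;        /* simulated power gain per all-pass stage */
    double P = PnLP;        /* noise at the L0 output */
    int stage;

    for (stage = 1; stage <= 3; stage++) {
        P = P * A + PnAP;   /* previous noise through the stage + own noise */
        printf("L%d output noise: %.2f uV^2\n", stage, P * 1e6);
    }
    return 0;
}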
5.8.3 SNR of LEV Components

Knowledge of the total noise power alone is not sufficient to quantify the noise performance of a circuit: the rms value of the signal has to be known in order to calculate the signal-to-noise ratio (SNR) of the system. The rms value of the output voltage of each Laguerre filter was found from Matlab simulations of the ideal model and is given in Table 5.2, along with the simulated and calculated (based on Eq. 5.48) noise powers. The simulated SNR at the output of each stage is calculated using

SNR = \frac{v_{rms}^2}{\overline{v_n^2}}   (5.49)

It can be seen that the simulated noise is on the same order of magnitude as the calculated noise. The simulated SNR at output L0 is high and sufficient for the system's requirements, but at outputs L1-L3 the signal is buried in noise for this design.

Table 5.2: Noise Summary

Stage   Calculated Noise   Simulated Noise   Simulated Signal   Simulated SNR
        Power [nV²]        Power [nV²]       Power [nV²]
L0      1.10               1.18              232                197
L1      3400               1100              158                0.144
L2      6470               1640              119                0.072
L3      9230               2740              84.2               0.031

Figure 5.36 aims to provide a better understanding of the distribution of the circuit noise among the different filter stages with outputs L0 to L3. These charts illustrate the noise contributions of each Laguerre polynomial processing stage from the filter, its respective OTA buffer, and the open-drain buffer which connects to an off-chip 1 MΩ resistor. In the L0 stage, the OTA buffer dominates the noise. This agrees with the fact that the bandwidth of this buffer is orders of magnitude larger than that of the low-pass filter, and also larger than that of the output open-drain buffer. In stages L1 to L3, the all-pass filter noise dominates, because each allpass filter has a large bandwidth and many noise-contributing components. These charts provide insight into the most critical system components in which noise can be reduced for better performance.

Figure 5.36: Noise contributions per filter stage (per-stage breakdown among the low-pass/all-pass filters, OTA buffers, and open-drain output buffers for L0-L3)

There are several methods to increase the SNR of a system. If it is an option, the input amplitude can be increased, which in turn increases the output amplitude without increasing the noise in the system. However, since this is a differential topology, increasing the differential input amplitude is only useful until one branch of the circuit carries the entire bias current and the other branch is turned off, i.e., the system becomes nonlinear.

In strong inversion, this maximum differential input range of a basic differential pair can be calculated as [85]:

\Delta v_{in} = \sqrt{\frac{2 I_{bias}}{\mu C_{OX}\,\frac{W}{L}}}   (5.50)

where I_bias is the bias current of the differential pair, μ is the channel carrier mobility, C_OX is the gate-oxide capacitance per unit area, and W and L are the transistor's width and length. In weak inversion, the maximum differential input range can be calculated as:

\Delta v_{in} = \ln\!\left(\frac{I_{bias}}{I_0}\right)\frac{u_T}{\kappa}   (5.51)

where I_0 is the subthreshold process parameter given in Equation 5.31.

Another option is to reduce the noise in the circuit. Since P_noise ∝ 1/C, this can be achieved either by increasing the parasitic load capacitances at different nodes of the circuit (by increasing transistor sizes) or by adding physical capacitance to these nodes. Increasing the parasitic transistor capacitance by a factor of N decreases the total noise. However, increasing node capacitances lowers the bandwidth of the circuit (since BW ∝ 1/C), so to maintain a constant bandwidth, the transconductances in the circuit, and consequently its current, also have to be increased (BW ∝ gm/C and gm ∝ I).

Because of the nature of the model implemented in this work, the SNR requirement is lenient compared to most other systems. As explained in Section 4.3.1, the required SNR for this implementation is SNR = 1.7 = 2.3 dB.

As explained in Section 3.2.2.1, there is a fundamental trade-off between bandwidth, noise and power, which can be evaluated using the noise efficiency factor (NEF):

NEF = v_{ni,rms}\,\sqrt{\frac{2 I_{tot}}{\pi \cdot u_T \cdot 4kT \cdot BW}}   (5.52)

where a lower NEF indicates better noise-power-bandwidth performance. Therefore, to obtain equivalent NEF performance while decreasing the noise level for better SNR, a decrease in noise by a factor of k requires an increase in power by a factor of k, while keeping the bandwidth constant. In this work, the achieved NEFs of the low-pass and all-pass filters are competitive with state-of-the-art designs, which have demonstrated NEFs below 10 [48] [42]. The low-pass filter NEF calculated using Equation 5.52 is 2.7 and that of the all-pass filter is 7.5.
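A C sketch of Eq. 5.52, using the noise powers from this section and per-block currents inferred from Table 5.1 at the 1.8 V supply (an assumption about how the quoted figures were obtained; the all-pass result also depends on the noise bandwidth assumed):

/* Noise efficiency factor (Eq. 5.52) for the two filter types. */
#include <stdio.h>
#include <math.h>

static double nef(double vn_rms, double Itot, double BW) {
    double k = 1.38e-23, T = 300.0, uT = 25.9e-3, pi = 3.141592653589793;
    return vn_rms * sqrt(2.0 * Itot / (pi * uT * 4.0 * k * T * BW));
}

int main(void) {
    /* Low-pass: 1.1 nV^2 noise, 0.062 nW / 1.8 V, ~8 Hz bandwidth */
    printf("LP NEF = %.1f\n", nef(sqrt(1.1e-9), 0.062e-9 / 1.8, 8.0));
    /* All-pass: 1.1 uV^2 simulated noise, 0.248 nW / 1.8 V, ~5 kHz band */
    printf("AP NEF = %.1f\n", nef(sqrt(1.1e-6), 0.248e-9 / 1.8, 5e3));
    return 0;
}

Under these assumptions the results come out near 2.7 and 7, in line with the values quoted above.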
Due to the programmability that is implemented, it is possible to increase the current in each all pass stage of the chip even after fabrication. However, it is not possible to increase the parasitic capacitances of the different stages to keep the bandwidth the same, since these nodes are not accessible. To decrease the output noise in the band of interest, the current in the all pass section can be increased by a factor of N via the programmable current sources. This increases the transconductance of its transistors, and hence the bandwidth, by a factor of N as well. The in-band noise power at each frequency is then lower, but the noise bandwidth is larger, leaving an equal integrated noise of P_noise ∝ kT/C_par. A capacitor can then be added externally at the outputs of the open-drain buffers to filter out high-frequency noise outside the band of interest. Using this method, the in-band noise measured at the output is reduced to P_noise ∝ kT/(C×N).

Increasing the current in each block using the adjustable current sources thus allows an increase in SNR. Because the filter transconductances determine the inverse time constant p, programming the chip for higher SNR leads to an increase in processing speed. Therefore, the adjusted LEV model will operate faster than real time compared to the experimental data. The model of Figure 4.2 has been retrained with p = 200 s⁻¹, which allows an increase in current by a factor of 4. This increase in bias current by a factor of 4 also allows an increase in the input signal swing of Equation 5.51, creating a larger output signal amplitude.

Another method to increase the SNR of the all-pass filters is to increase their gain by adjusting the ratio gm_5/gm_4 in Equation 5.8. Increasing the voltage gain in a certain block of a system amplifies the signal as well as the noise from previous stages through that stage. However, it can be shown that the noise generated in that particular block does not get amplified by the same amount as the signal and the noise from previous stages (for example, for a single transistor, the noise voltage power is proportional to 1/gm, whereas the signal power is proportional to gm²; consequently, the SNR increases with higher gain or gm).

The programmable structure allows gm_4 to be adjusted by changing the amount of current that is branched into the differential NMOS pair of Figure 5.7 (b). Changing the ratio of the current in the NMOS branch to that in the M4 PMOS branch from 4:2 to 5:1 increases the all-pass filter gain by a factor of 2. The simulated noise power, signal power, and calculated SNR at the output of the chip for each filter are shown in Table 5.3.

Table 5.3: Noise Summary for improved SNR

Stage   Simulated Noise   Simulated Signal   Simulated SNR
        Power [nV²]       Power [nV²]
L0      17.3              1130               65
L1      251               1931               7.7
L2      932               4082               4.4
L3      3005              12420              4.2

CHAPTER 6

Test Setup and Results

6.1 ASIC Implementation

The chip has been laid out and fabricated in Cypress Semiconductor's 0.13 μm NVM CMOS technology. The die photo of the chip is shown in Figure 6.1. It was designed to operate at 1.8 V.

Figure 6.1: Die photo of the programmable analog system (analog system and decoder at the top; digital calibration ladders with registers and tree muxes along the sides)

The total chip area is less than 1 mm², of which the analog circuitry occupies 0.09 mm². To interface with the PCB test board, on-chip open-drain buffers were added to each analog output for AC signal amplification and buffering.
The chip was packaged in a 68-pin QFN package; the bonding diagram for the chip is shown in Figure A.3.

6.2 Test Setup and Measurement Procedure

For easier testing of multiple samples, the packaged chip was placed in a Plastronics 68QN40S18080 through-hole socket soldered to a mini-PCB. The mini-PCB mounts on the main test PCB via four 17-pin 50-mil male-to-female headers, as shown in Figure 6.2. The functional test PCB schematic is shown in Figure 6.3.

Figure 6.2: Main PCB with mini-PCB and socket attached

Figure 6.3: Functional diagram of the computer/micro-controller/PCB interface (regulators, digital buffers, the LEV ASIC, analog buffers B1-B4, instrumentation amplifiers INA1/INA2, protection circuitry, and the micro-controller)

The board schematic consists of four parts: power management (regulators for the ladder and supply voltages); the digital programming interface (digital buffers); the DC measurement subsystems (analog output buffers B3 and B4, instrumentation amplifier INA1, and resistors R1 (10 MΩ) and R2 (100 kΩ) for measuring the DC bias current of the analog system); and analog I/O (analog input buffers B1 and B2, output amplification resistors R3 and R4 (both 1 MΩ), output instrumentation amplifier INA2, and the analog output switches). As shown in Figure 6.4, a micro-controller board (Arduino Due) provides the interface between the software on a PC and the ASIC test board. The micro-controller provides all digital signals, namely the 7-bit Select (S, red dashed box) and Gate Select (GS, blue dashed box) buses, the clock and resetb, as well as the analog input spikes. The code for setting the digital bits is provided in the Appendix. First, the digital subsystem is reset. Then the digital calibration circuits are clocked; in every clock cycle, GS increases by one and the corresponding S word is written to the registers. The register map which sets the different values of S is programmed and adjusted on the PC.

Analog inputs are buffered, and extra protection was added to prevent voltage surges above 2.2 V at the input to the chip. Regulators provide the supplies for the analog and digital circuits and the six ladder voltages (high and low voltages for each ladder voltage range). The chip has 32 analog AC outputs (16 differential outputs).

Figure 6.4: Test setup for ASIC calibration and measurement

Two 16-position mechanical switches (green dashed box) are used to select the desired open-drain on-chip output, which is then amplified on the PCB via the 1 MΩ resistors and an instrumentation amplifier with a gain of 100.

The DC current consumed by the analog circuitry is found via the instrumentation amplifier setup on the top right of Figure 6.3, since it is difficult to directly measure currents in the pA range using standard testing equipment. The DC bias current is forced through a resistor (either 100 kΩ or 10 MΩ) and the resulting voltage drop is then amplified by 1000× using the instrumentation amplifier. For smaller currents, the larger resistor is used. For other measurements, the two inputs to instrumentation amplifier 1 are shorted and connected directly to the vddA regulator output.
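Converting the amplifier reading back to a bias current is a one-line calculation; a minimal C sketch with an assumed example reading:

/* Bias current from the INA reading: I = V_meas / (gain * R_sense). */
#include <stdio.h>

int main(void) {
    double gain = 1000.0;   /* instrumentation amplifier gain */
    double Rsense = 10e6;   /* 10 MOhm sense resistor for pA-range currents */
    double Vmeas = 0.5;     /* example amplifier output [V] -- assumed */
    printf("I_bias = %.0f pA\n", Vmeas / (gain * Rsense) * 1e12);  /* 50 pA */
    return 0;
}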
The high and low voltages of the ladders can be adjusted via the on-board regulators to obtain the desired resolution around each bias point. A voltage range of 200 mV can change the current by more than two decades (100×), which is more than sufficient for calibration purposes.

The calibration algorithm is shown in Figure 6.5. First, a nominal bias voltage is applied to the first gate under calibration. The block's current is then measured and the setting adjusted until the correct bias point is reached. The next step is to observe the DC output and null any DC offsets. The AC output of the block is then visually inspected for the proper time decay and amplitude ratio. This is repeated for all blocks.

Figure 6.5: Calibration algorithm (flow chart: apply nominal bias setting, measure current, adjust setting until correct, null DC offset, visually inspect the block output, repeat for all blocks)

During the design phase, the total on-chip capacitance was kept below 20 pF and the filter currents were adjusted to accomplish the desired time constants. The design satisfied the minimum SNR of 2, which meets the specifications of the SISO model of Figure 4.1. To allow better observation of the signals, the current was increased by a factor of 4 during the measurement phase through the on-chip programmable means, as explained in the previous chapter. The effect is a higher measured SNR and a p value scaled from 50 to 200 s⁻¹ (time constant scaled from 20 ms to 5 ms).

6.3 Results

The measured Laguerre filter outputs, as well as the combined final output u, are shown in Figures 6.6-6.10 for p = 200 s⁻¹, along with the ideal curves below each measurement. The performance of the chip is summarized in Table 6.1. The total power consumption of the chip after calibration is 120 nW. The mean square error between the measured and ideal outputs is computed after applying a moving-average filter to reduce the effect of high-frequency noise. Notably, the final output has a small mean square error of 6.0% compared to the ideal signal.

The second order measurement results in response to a paired pulse with an inter-spike interval of 5 ms are shown in Figure 6.11. L0 to L33 denote the LFs in response to the paired pulse. The chip measurement output and the real output data from the hippocampus measurement (red) are also shown.

Figure 6.6: Zeroth order Laguerre Result (a) Measured (b) Simulated

Figure 6.7: First order Laguerre Result (a) Measured (b) Simulated

Figure 6.8: Second order Laguerre Result (a) Measured (b) Simulated

Figure 6.9: Third order Laguerre Result (a) Measured (b) Simulated

Figure 6.10: Final output result (a) Measured (b) Simulated

Table 6.1: Performance Summary

Stage                    v0       v1      v2      v3      u
Power [nW]               0.55     2.1     4.5     5.8     120
SNR [dB]                 18       10      7.8     6.4     11
Mean square error [%]    7.0      14      15      14      6.0
Area [mm²]               0.0075   0.015   0.023   0.030   0.09

6.4 Conclusion

A second order LEV analog neural signal processing system with digital calibration for coefficient programming and mismatch compensation has been presented. The design is modular for ease of scaling to higher orders of nonlinearity and larger numbers of inputs in future implementations. The digital circuits are primarily designed for flexibility and testability, with the possibility of considerable reduction in area. The total analog area is only 0.09 mm².
We have demonstrated that analog signal processing is a viable approach for fully implanted closed-loop systems with low power, small area, and high-fidelity reproduction of biomimetic signals.

From the measurement results of this chip implementation, several conclusions can be drawn. Using an input-output modeling approach with kernels and basis functions to approximate neural signals is ideal for analog hardware implementation for several reasons. First of all, neural signals are inherently noisy, due to the intrinsic noise produced by the biophysical and biochemical processes which occur in neurons. Therefore, a low power hardware implementation with relaxed SNR requirements can be used. Secondly, the basis functions often entail a certain amount of redundancy, and there are often several possible choices of basis function weights, especially in the case of a second order model. Therefore, implementing the basis functions along with flexible weighting provides more leeway in precise system identification and approximation.

Figure 6.11: Measured circuit output and LFs along with hippocampal output data in response to a paired pulse input

Another conclusion reached from the measurements and chip implementation is that the target SNR determines the size of the chip in a low power real-time neural application. Since subthreshold operation allows the most efficient system implementation, it is the ideal choice for the LEV implementation. However, calibration has to be included for mismatch compensation, for SNR adjustments, and to account for system changes from a biological perspective which require re-programming of the model coefficients. Since the programming and calibration subsystem in this work was designed for ultimate flexibility, it occupies a large area compared to the analog circuits. In any specific application, this subsystem can be customized, optimized, and significantly reduced in size.

The system implementation and the results of this work show that, with careful design, analog subthreshold circuits can successfully implement a nonlinear neural processing system. Along with an analog front end which amplifies and filters spikes recorded from neurons in the input region, analog spike-sorter hardware, and a stimulation circuit, a closed-loop system to replace neural function can be built with minimum power consumption. This is important because it allows for smaller batteries to power the hardware and reduces the number of invasive surgeries required to replace the batteries of an implant. For systems with external powering mechanisms and rechargeability, it reduces the charging time and offers more comfort to the patient.

Several improvements are suggested for a future implementation of this prototype. First of all, the capacitor size can be increased on-chip, along with the power, to achieve better SNR performance for systems requiring a time constant longer than 5 ms. Furthermore, the digital calibration circuits can be improved in terms of area and ease of testing. Future calibration should automatically sense and measure the performance of each block and then adjust for process variations via an automated closed-loop feedback system. As technologies shrink, even transistor circuits operating above the device threshold need calibration to compensate for process variations.
Future system implementations will include a first order feedback term and a thresholding block in order to complete a single spike-input/spike-output system. It is also possible to expand the system to include multiple inputs and outputs to accommodate the MIMO system needed for a closed-loop hippocampus prosthesis. Since the circuits and system are modular, this is expected to be a straightforward scaling effort.

This work impacts the state of implantable electronics by demonstrating that analog subthreshold circuits are a viable option for such systems. Over the past few decades, subthreshold analog design has mainly been considered an academic effort and has not often been used in practice. What this work hopes to have accomplished is to show a path for the transition from theory to practice.

Appendix

Sample Code for Digital Calibration Programming

#include "L0 L1 L2 L3 combined.txt"   // data file with S_array and the DAC bias words (reproduced below)

int GS_array[124];      // array to initialize for GS
int x = 0;
int delay_var = 500;    // 500 is for clk = 1 kHz
int w = 1;
int w1 = 1;

void setup()
{
  for (int i = 22; i < 43; i++)    // this loop sets the pins to use as outputs
    pinMode(i, OUTPUT);
  pinMode(50, OUTPUT);

  for (int j = 0; j < 124; j++)    // this loop initializes GS_array with 0 to 123
    GS_array[j] = j;

  digitalWrite(34, HIGH);          // this initializes the resetb pin

  for (int m = 22; m < 34; m++)
    digitalWrite(m, LOW);          // initialize GS, S and CLK to LOW
  for (int n = 35; n < 40; n++)
    digitalWrite(n, LOW);          // initialize GS, S and CLK to LOW

  analogWriteResolution(12);       // set the analog output resolution to 12 bits
  digitalWrite(50, LOW);           // testing LED
}

void loop()
{
  analogWrite(DAC1, 0x90);         // write the square waveform on DAC1 (amp = 2.16 V)
  analogWrite(DAC0, 0x60);
  delayMicroseconds(delay_var);

  // start reset
  digitalWrite(34, LOW);
  delayMicroseconds(2 * delay_var);
  digitalWrite(34, HIGH);          // end of reset
  delayMicroseconds(2 * delay_var);

  for (int x = 0; x < 124; x++)    // for each row, write bit 0 to bit 7
  {
    digitalWrite(31, GS_array[x] & 1);
    digitalWrite(29, GS_array[x] & 2);
    digitalWrite(22, GS_array[x] & 4);
    digitalWrite(24, GS_array[x] & 8);
    digitalWrite(26, GS_array[x] & 16);
    digitalWrite(27, GS_array[x] & 32);
    digitalWrite(25, GS_array[x] & 64);
    digitalWrite(23, GS_array[x] & 128);

    digitalWrite(33, S_array[x] & 1);
    digitalWrite(32, S_array[x] & 2);
    digitalWrite(30, S_array[x] & 4);
    digitalWrite(28, S_array[x] & 8);
    digitalWrite(35, S_array[x] & 16);
    digitalWrite(37, S_array[x] & 32);
    digitalWrite(39, S_array[x] & 64);
    digitalWrite(38, S_array[x] & 128);

    delayMicroseconds(30 * delay_var);  // pause
    digitalWrite(36, HIGH);   // clk goes up after settling of the data;
                              // S and GS are written at the clk edge
    delayMicroseconds(1 * delay_var);
    digitalWrite(36, LOW);    // clk pin
    delayMicroseconds(delay_var * 1);
  }

  digitalWrite(50, HIGH);

  //int dac0bias = 0x90;
  //int dac1bias = 0xC0;
  int rampdelay = 1;
  int dacamp = 14.4;        // note: truncates to 14 (int initializer)
  int dacampmult = 7.5;     // note: truncates to 7 (int initializer)
  int hicycle = 300;
  int locycle = 200000;
  int dac0word = 0;
  int dac1word = 0;
  int k;
  int applypulse = 1;
  int countnum = 200;
  int closedelay = 10000;

  analogWrite(DAC0, dac0bias);
  analogWrite(DAC1, dac1bias);

  while (w == 1)            // pulse generator
  {
    dac0word = dac0bias;
    dac1word = dac1bias;
    analogWrite(DAC0, dac0word);
    analogWrite(DAC1, dac1word);

    // rising edge of the first pulse
    for (k = 0; k < dacampmult * dacamp + 1; k = k + dacampmult)
    {
      //delayMicroseconds(rampdelay);
      analogWrite(DAC0, dac0word);
      analogWrite(DAC1, dac1word);
      dac0word = dac0word + applypulse * dacampmult;
      dac1word = dac1word - applypulse * dacampmult;
    }
    delayMicroseconds(hicycle);   // hold at the pulse amplitude
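    // Falling edge: the same ramp run in reverse. DAC0 steps back down
    // while DAC1 steps back up, so the differential drive (DAC0 - DAC1)
    // returns to baseline while the common mode of the two DAC words
    // stays fixed. A second, identical pulse is then issued after
    // closedelay microseconds, forming the paired-pulse stimulus used
    // for the second order measurements.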
    for (k = 0; k < dacampmult * dacamp + 1; k = k + dacampmult)
    {
      dac0word = dac0word - applypulse * dacampmult;
      dac1word = dac1word + applypulse * dacampmult;
      //delayMicroseconds(rampdelay);
      analogWrite(DAC0, dac0word);
      analogWrite(DAC1, dac1word);
    }

    // Second pulse added, start here
    delayMicroseconds(closedelay);

    for (k = 0; k < dacampmult * dacamp + 1; k = k + dacampmult)
    {
      analogWrite(DAC0, dac0word);
      analogWrite(DAC1, dac1word);
      dac0word = dac0word + applypulse * dacampmult;
      dac1word = dac1word - applypulse * dacampmult;
    }
    delayMicroseconds(hicycle);

    for (k = 0; k < dacampmult * dacamp + 1; k = k + dacampmult)
    {
      dac0word = dac0word - applypulse * dacampmult;
      dac1word = dac1word + applypulse * dacampmult;
      analogWrite(DAC0, dac0word);
      analogWrite(DAC1, dac1word);
    }
    // second pulse added, end here

    delayMicroseconds(locycle);
  }
}

Data File to Store S Values

// This is the file with the scaled tau
// The vddA for these values is 1.83
// v3 high = 1.295 (J22), v3 low = 1.048 (J23)
// v2 high = 0.358 (J26), v2 low = 0.181 (J25)
// v1 high = 0.625 (J28), v1 low = 0.53 (J27)
int S_array[124] = {
  127,              /* register zero is dummy */
  50,               /* L0 */
  50, 50, 50,
  50,               /* L1 */
  50, 50, 50,
  40,               /* L2 */
  50, 40, 40,
  55,               /* L3 */
  50,               /* buffer L0 */
  50,               /* buffer L1 */
  50,               /* buffer L2 */
  50,               /* buffer L3 */
  127, 127, 127,    /* Gilbert L00: bias, cmfbbias1, cmfbbias2 */
  127, 127, 127,    /* Gilbert L01 */
  127, 127, 127,    /* Gilbert L02 */
  127, 127, 127,    /* Gilbert L03 */
  127, 127, 127,    /* Gilbert L11 */
  127, 127, 127,    /* Gilbert L12 */
  127, 127, 127,    /* Gilbert L13 */
  127, 127, 127,    /* Gilbert L22 */
  127, 127, 127,    /* Gilbert L23 */
  127, 127, 127,    /* Gilbert L33 */
  50, 50, 50, 127, 127, 127, 127,
  127, 127, 127, 127, 127, 127, 127,  /* Gilbert-Weight biases */
  50, 50,           /* Gilbert-Weight CMFB biasing 1, 2 */
  50, 50,           /* Gilbert L00 vref1, 2 */
  50, 50,           /* Gilbert L01 vref1, 2 */
  50, 50,           /* Gilbert L02 vref1, 2 */
  50, 50,           /* Gilbert L03 vref1, 2 */
  50, 50,           /* Gilbert L11 vref1, 2 */
  50, 50,           /* Gilbert L12 vref1, 2 */
  50, 50,           /* Gilbert L13 vref1, 2 */
  50, 50,           /* Gilbert L22 vref1, 2 */
  50, 50,           /* Gilbert L23 vref1, 2 */
  50, 50,           /* Gilbert L33 vref1, 2 */
  64, 0,            /* Gilbert-Weight vgain0 p, n */
  54, 74,           /* Gilbert-Weight vgain1 p, n */
  0, 100,           /* Gilbert-Weight vgain2 p, n */
  64, 0,            /* Gilbert-Weight vgain3 p, n */
  67, 43,           /* Gilbert-Weight vgain00 p, n */
  45, 64,           /* Gilbert-Weight vgain01 p, n */
  45, 64,           /* Gilbert-Weight vgain02 p, n */
  45, 64,           /* Gilbert-Weight vgain03 p, n */
  45, 64,           /* Gilbert-Weight vgain11 p, n */
  45, 64,           /* Gilbert-Weight vgain12 p, n */
  45, 64,           /* Gilbert-Weight vgain13 p, n */
  45, 64,           /* Gilbert-Weight vgain22 p, n */
  45, 64,           /* Gilbert-Weight vgain23 p, n */
  45, 64,           /* Gilbert-Weight vgain33 p, n */
  80, 80,           /* Gilbert-Weight CMFB vref1, 2 */
  89, 55,           /* NMOS L1 */
  90, 90,           /* NMOS L2 */
  54, 108,          /* NMOS L3 */
  0,                /* Switches, 1st word: 3 -> L0p and L0n,
                       12 -> L1p and L1n, 48 -> L2p and L2n,
                       192 -> L3p and L3n */
  0,                /* 2nd word: 3 -> L00p and L00n */
  0,                /* 3rd word */
  0                 /* 4th word: 48 -> Outp and Outn */
};

int dac0bias = 0x90;
int dac1bias = 0xC0;
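To make the bias words above concrete, the following sketch shows how an 8-bit S code might map to a ladder voltage and, in weak inversion, to a bias-current scaling. The linear-ladder model, the helper names, and the 100 mV/decade slope (consistent with the earlier statement that 200 mV spans roughly two decades of current) are assumptions rather than measured chip behavior.

// Minimal sketch, not part of the chip firmware: map an 8-bit ladder
// code to a gate voltage and estimate the weak-inversion current ratio.
// The linear ladder and the ~100 mV/decade slope are assumptions.
#include <math.h>

// gate voltage for a code in 0..255 between the ladder rails
float ladderVoltage(int code, float vLow, float vHigh) {
  return vLow + (vHigh - vLow) * code / 255.0f;
}

// weak-inversion current relative to the current at vLow, assuming
// roughly 100 mV of gate swing per decade of drain current
float currentRatio(float vGate, float vLow) {
  const float mvPerDecade = 100.0f;   // assumed subthreshold slope
  return powf(10.0f, (vGate - vLow) * 1000.0f / mvPerDecade);
}

// Example with the v1 ladder rails listed in the header above:
// ladderVoltage(50, 0.53f, 0.625f) is about 0.549 V, which under these
// assumptions gives roughly 1.5x the current at the low rail.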
PCB Layout

[Figure A.1: Top copper layer of PCB]
[Figure A.2: Bottom copper layer of PCB]
[Figure A.3: 68-pin QFN bonding diagram]

Chip Pin Assignments

The pin assignments are shown in Table A.1.

Table A.1: Pin assignments from chip to package

  Pin  Name    Pin  Name   Pin  Name     Pin  Name
  1    L12p    18   GS7    35   vddD     52   L33n
  2    L00n    19   GS6    36   clk      53   L23n
  3    L00p    20   GS5    37   resetb   54   L23p
  4    L0n     21   GS4    38   gndD     55   L03n
  5    vin     22   GS3    39   gndD     56   gndD
  6    vip     23   GS2    40   L3bot    57   L03p
  7    L0p     24   GS1    41   L3top    58   L22n
  8    L1n     25   GS0    42   L2bot    59   L22p
  9    L1p     26   S0     43   L2top    60   L02n
  10   L2n     27   S1     44   DCn      61   L02p
  11   L2p     28   S2     45   DCp      62   L13n
  12   L3n     29   S3     46   L11n     63   vddA
  13   L3p     30   S4     47   L11p     64   L13p
  14   L1top   31   S5     48   outn     65   L01n
  15   L1bot   32   S6     49   outp     66   L01p
  16   gndD    33   S7     50   L33p     67   L12n
  17   NC      34   NC     51   NC       68   NC
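For bench software driving the test board, a few of the Table A.1 assignments can be captured in a lookup table along these lines; the structure is illustrative and not part of the thesis code.

// Illustrative only, not thesis code: a lookup table for some of the
// Table A.1 signals commonly touched by bench software.
struct PinAssignment {
  int pin;            // package pin number from Table A.1
  const char* name;   // signal name on the chip
};

const PinAssignment kPins[] = {
  { 5, "vin"},  { 6, "vip"},     // differential input pair
  {36, "clk"},  {37, "resetb"},  // programming clock and reset
  {44, "DCn"},  {45, "DCp"},     // DCn/DCp pair
  {48, "outn"}, {49, "outp"},    // combined output pair
  {35, "vddD"}, {63, "vddA"},    // digital and analog supplies
};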
Abstract
The human brain has perfected the task of performing complex functions, such as learning, memory, and cognition, in an energy- and area-efficient manner. It is therefore attractive to model and replicate this performance in biomimetic systems. Applications for such models include implantable electronics to replace certain brain functions damaged by disease or injury, and neuromorphic architectures for novel high-speed, low-power computing. To properly mimic a neural function, it is important to determine the optimal level of model abstraction, because there are trade-offs between capturing biological complexity and maintaining scalability and efficiency. Additionally, the hardware implementation of the model must consume little power, to minimize heat dissipation that would damage tissue and to allow scaling to a large number of components. This doctoral research addresses the hardware challenges of a biomimetic system and proposes a novel method for its practical implementation. A nonlinear neural model is realized using analog subthreshold CMOS signal processing units, which are more power- and area-efficient than their digital counterparts. Several challenges in subthreshold analog design are addressed in this work. First, application-specific analog circuits are neither easily programmable nor flexible. Second, in the subthreshold regime, mismatch and process variations lead to large differences between identically sized transistors. The brain performs complex functions despite facing similar limitations by using redundancy and learning rules that adjust cell properties. Guided by these principles, the analog subthreshold circuits are designed to be digitally programmable for coefficient and time-constant adjustments. Calibration and training techniques are proposed in this work to accomplish the required precision. These techniques are verified through numerical simulations and through physical implementation and measurement of a system fabricated in a 0.13 μm CMOS process. This work demonstrates the utility of these optimized low-power subthreshold analog circuits, aided by calibration, and uses them to implement the LEV model of a single-input, single-output spike system to replicate the signal transformations of a single neuron. A foundation is laid for an easily scalable system useful for multi-input, multi-output (MIMO) models, to be used in systems such as a hippocampus prosthesis for memory restoration or a large-scale neuromorphic computer.