Low-Dimensional Material Based Devices for Neuromorphic Computing and Other Applications

by Xiaodong Yan

A Dissertation Presented to the FACULTY OF THE GRADUATE SCHOOL, UNIVERSITY OF SOUTHERN CALIFORNIA, In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (Electrical Engineering)

December 2020

Copyright 2020 Xiaodong Yan

Acknowledgements

I feel truly fortunate to have Prof. Han Wang as my Ph.D. advisor. I would like to thank Prof. Wang for giving me the precious opportunity to do research in his lab with high flexibility and ample resources. I appreciate his guidance throughout my Ph.D., especially when I faced challenges, whether in research or in life. He always found time for me and gave me plentiful advice no matter how busy he was. Through his influence, I learned how to think and work as a Ph.D. student with efficiency, rigor, confidence and optimism.

I would also like to thank Prof. Wei Wu and Prof. Jayakanth Ravichandran for serving as my thesis committee members. I want to thank Prof. Wu for his constant support. I also want to thank Prof. Ravichandran for his help. I am grateful for the MHI Ph.D. Scholar title awarded by the Ming Hsieh Institute in the Ming Hsieh Department of Electrical and Computer Engineering, USC.

I would like to thank Prof. Ivan Sanchez Esqueda from Arizona State University, who is talented, patient, efficient and hard-working. We have worked together on multiple research topics for more than three years. I thank him for patiently guiding me. I also treasure the enthusiasm and encouragement I felt from him every time we talked. He is also good at surfing, triathlon, guitar playing etc., which is admirable.

I feel proud to have worked with all the lab members in Prof. Han Wang's group. I want to thank Dr. Huan Zhao.
We spent lots of time together building the labs, exploring new experimental techniques, chatting about our future and also winning the championship together in the first 3 vs. 3 basketball tournament in the EE department. I treasure the good memories of these moments. I want to thank Dr. He Tian and Dr. Jiangbin Wu for helping me with various research topics. I want to thank Jiahui Ma and Nan Wang for the successful collaborations. I also want to thank Hefei Liu, Zhonghao Du and Hung-Yu Chen for their support.

I feel fortunate to have collaborated with many excellent research groups outside the Wang group. I want to thank Prof. Paul Asimow from Caltech for his help in synthesizing large-area single-crystalline black phosphorus and for his knowledgeable advice along the research path. I appreciate his strong support. I also want to thank Prof. Lihong V. Wang from Caltech for his help in detecting ultra-fast optical chaos phenomena in his lab. I want to thank Prof. Mike Chen and Dr. Aoyang Zhang from USC for supporting me in designing and building PCB circuits. I want to thank Prof. Jing Guo and Tong Wu from the University of Florida for providing us with strong theoretical analysis support. I want to thank Prof. Linran Fan from the University of Arizona and Prof. Yuhao Zhang from Virginia Tech for our fruitful collaborations and publications. Finally, I want to thank staff members Dr. Donghai Zhu, Kim Reid, Diana Vuong, Kalief Washington and Susan Zarate for their help and support.

Last but not least, I want to thank my family most of all. I am grateful to my parents for raising me, educating me, and always giving me their strongest support. Without their unconditional love, I would not be who I am now. I want to especially thank my wife Jingru Zhou, who loves me and has sacrificed a lot for me. I thank her for coming into my life and making my life amazing.
She understands me the most and helps me see that a bright future is waiting for me. Her love always encourages me to fight on! I also want to thank Jingru's parents. They always support me with a good sense of humor, which makes me notice the beautiful aspects of life that I might otherwise neglect. I feel grateful for my family!

Table of Contents

Acknowledgements
List Of Tables
List Of Figures
Abstract
Chapter 1: Introduction
  1.1 Introduction to Neuromorphic Computing
  1.2 Introduction to Biological Neuron and Synapse
  1.3 Low Dimensional Materials for Neuromorphic Computing
    1.3.1 Two-Dimensional Materials for Neuromorphic Computing
    1.3.2 One-Dimensional Materials for Neuromorphic Computing
    1.3.3 Zero-Dimensional Materials for Neuromorphic Computing
    1.3.4 Mixed-Dimensional Heterostructures for Neuromorphic Computing
  1.4 Artificial Synapses and Neurons: From Devices to Circuits
    1.4.1 Desirable Features of Artificial Synapses and Neurons
    1.4.2 Crossbar Array for Neural Networks
    1.4.3 Circuit-level Artificial Neural Networks
  1.5 Thesis Outline
Chapter 2: Two Dimensional Material Based Device Transport Property
  2.1 Black Phosphorus on h-BN FET Transport
  2.2 Temperature Dependent Black Phosphorus FET Transport
  2.3 Landauer Transport of BP SB-MOSFETs
Chapter 3: Low Dimensional Material Based Synaptic Devices
  3.1 Carbon Nanotube Synaptic Device and Network
  3.2 Reconfigurable BP/SnSe Synaptic Device
Chapter 4: Two Dimensional Material Based Circuit Level Applications for Neuromorphic Computing
  4.1 Reconfigurable Stochastic Neuron Based on Tin Oxide/MoS2 Hetero-memristor
  4.2 Boltzmann Machine Implementation Using Tin Oxide/MoS2 Based Stochastic Neuron
  4.3 Implementing "Cooling" Strategies In Simulated Annealing
  4.4 Probability Distribution Associated With Tin Oxide/MoS2 Hetero-memristor
Chapter 5: Other Two Dimensional Material Based Devices and Applications
  5.1 β-Ga2O3/graphene Barristor
  5.2 High Breakdown Electrical Field in β-Ga2O3/graphene Barristor
Chapter 6: Conclusion, Challenges and Future Work
  6.1 Summary
  6.2 Challenges and Future Work
    6.2.1 Material Growth
    6.2.2 Mixed Dimensional Devices and Circuits
    6.2.3 New Research Capability
References

List Of Tables

1.1 Categories of various neuro-inspired hardware chips
1.2 Viable technologies reported by industry
1.3 Desirable metrics for synaptic devices

List Of Figures

1.1 The traditional von-Neumann computing suffers several issues which make it unsuitable to deal with data-intensive applications.
1.2 Traditional von-Neumann computing architecture and the neuro-inspired computing architecture.
1.3 A simple model shows an individual neuron. The input axon, dendrite, synapse, output axon are shown.
1.4 Several simple activation functions are shown. (a) linear function (b) a deformed linear function (c) step function (d) sigmoidal function. The slope of the sigmoidal function can be steep or less steep.
1.5 A schematic shows how neurotransmitter affects AMPA and NMDA receptors. The NMDA receptor gets activated due to neurotransmitter and will further lead to LTP.
1.6 Schematics showing four different cases associated with LTP and LTD. Two synapses A and B connect to one neuron.
(a) A provides a weak input and causes no postsynaptic activation. No synaptic weight has been strengthened. (b) Both A and B provide weak inputs and cause postsynaptic activation. Associated LTP happens for both synapses. (c) A provides a strong input and causes postsynaptic activation. A has LTP. B, however, may have no change or may have heterosynaptic LTD. (d) A provides medium input and causes intermediate firing in postsynaptic neuron. A also has homosynaptic LTD and B has no change.
1.7 (a) various types of two-dimensional materials (b) a typical one-dimensional material is shown: single walled carbon nanotube (c) a typical type of zero-dimensional material is shown: a metal nanoparticle.
1.8 (a) Image shows 12 x 12 crossbar array in M. Prezioso et al.'s work. (b) Three types of characters with different writing versions that are used in training algorithms. (c) The input image has nine pixels. (d) Single layer perceptron diagram. These plots are reproduced with permission from [10], Springer Nature Ltd.
1.9 (a) Illustration of Jing Pei et al.'s work [62]. The neuromorphic chip is integrated with an automatic bike and enables multiple functions. (b) Voice recognition function test result. (c) Tracking function test result. (d) Self-balance function test result. These plots are reproduced with permission from [62], Springer Nature Ltd.
2.1 (a) Schematic and (b) optical image of the BP SB-MOSFET and BP/h-BN SB-MOSFET
2.2 AFM of the BP SB-MOSFET
2.3 AFM of the BP/h-BN SB-MOSFET
2.4 Transfer characteristics (Id-Vgs) for increasing channel lengths (L) (left), and the transfer characteristics with gate voltage axis offset by voltage at the minimum current (Vmin) (right) for (a) SiO2/BP and (b) SiO2/h-BN/BP devices.
(c) Extracted ON-state current (ION) as a function of channel length indicating transport improvement in devices with h-BN insulating layer. (d) Dual gate sweep transfer characteristics (offset by Vmin) demonstrating a reduction in gate hysteresis for SiO2/h-BN/BP devices (h = hysteresis width).
2.5 (a) Electronic band diagram of the SB MOSFET indicating the charge transport mechanisms. (b) Typical Gaussian distribution of acceptor and donor-like interface traps used in model calculations. (c) Self-consistent calculation of the channel potential as a function of gate bias with and without interface traps (top); density of ionized traps based on self-consistent solution of channel potential (bottom). (d) Calculation of drain current components (i.e., electron and hole currents) for both ballistic and scattering-limited transport in the channel. (e) Ballistic and scattering-limited transfer characteristics.
2.6 Fits of model calculations to the experimental Id-Vgs characteristics of (a) SiO2/BP and (b) SiO2/h-BN/BP SB-MOSFETs with increasing channel length L. (c) Energy distribution of acceptor- and donor-like interface traps used to fit experimental data. (d) Model calculation and experimental extractions of the ON-current ION as a function of L for both the SiO2/BP and SiO2/h-BN/BP SB MOSFETs showing good agreement. (e) Calculations of the energy-averaged mean free path as a function of hole sheet density based on the experimentally verified model for both type of devices. Results indicate transport improvement due to larger mean free path for charged impurity and phonon scattering.
2.7 Extractions of (hole) channel mobility as a function of (hole) carrier density for SiO2/BP and SiO2/h-BN/BP devices based on fits to experimental results shown in Figure 2.6.
Extractions are from devices with L = 250 nm.
2.8 (a) 3-D schematic of the black phosphorus device with channel lengths L1 = 250 nm, L2 = 500 nm, L3 = 1000 nm, L4 = 2000 nm, L5 = 4000 nm. (b) AFM image of BP device. (c) AFM height profile indicating a thickness of ~13 nm for device with L1 = 250 nm, and ~9 nm for all other devices. Channel width is approximately 200 nm for all devices (extracted at midpoint).
2.9 (a) Dual gate sweeps of transfer characteristics (Id-Vbg) for the device with L/W = 1000/200 nm at various temperatures. (b) Negative gate sweep of Id-Vbg for the device with L = 1000 nm. (c) Temperature dependence of change in Vmin and density of charged traps NT. (d) Negative gate sweep of Id-Vbg with gate voltage axis offset by Vmin. (e) Temperature dependence of Ion at a fixed bias of Vbg - Vmin = -80 V for devices with increasing channel lengths. (f) Temperature dependence of Ioff at a fixed bias of Vbg - Vmin = -8 V for devices with increasing channel lengths.
2.10 (a) Band diagram shows barrier for holes. (b) Fermi-Dirac distributions of holes injected from the contact at different temperatures. (c) Arrhenius-type plot of Ioff vs 1/T at different Vbg - Vmin. (d) Extracted barrier heights for devices with increasing channel lengths. Inset plot shows SB heights extracted for devices with various channel lengths.
2.11 (a) Fits of model calculations to the experimental Id-Vbg for the device with L = 2000 nm. (b) Dit for donors and acceptors. (c) Model calculation and experimental extractions of temperature dependence of Ion for devices with increasing channel lengths. (d) Model calculation and experimental extractions of channel lengths dependence of Ion at increasing temperatures.
(e) Temperature dependence of extracted hole mobility in BP SB-MOSFETs.
3.1 (a) Aligned CNT FET wafer fabricated by Carbonics. (b) Top-gated aligned CNT FET test structures (inset is the zoomed-in view of the channel region from a 10-finger device; each channel finger is 20 µm wide). (c) Scanning electron microscope (SEM) image of the aligned CNT FET active region. (d) SEM of the aligned CNT channel. (e) Cross-sectional schematic of the aligned CNT FET. (f) Top view of the aligned CNT FET including t-shaped top-gate and self-aligned source/drain regions. (g) Conceptual back-end-of-line (BEOL) integration of aligned CNT FETs for artificial neural network implementation in crossbar configuration.
3.2 (a) Dual-sweep Id-Vgs characteristics of aligned CNT FETs for Vds = -1.0 and -0.05 V revealing large gate hysteresis. (b) Dependence of hysteresis window on the voltage sweep range of dual-sweep Id-Vgs measurements. (c) Id-Vds characteristics for increasing Vgs. (d) Multiple cycles of dual-sweep Id-Vgs indicating repeatability of hysteresis effects. (e) Distribution of hysteresis plotted as a function of the on/off ratio. (f) Energy band diagram illustrating charge-trapping effects in aligned CNT FETs.
3.3 (a) Biasing configuration for pulsed measurements of synaptic properties of aligned CNT FETs. (b) Diagram of the pulsed measurements for long-term potentiation and long-term depression. (c) Measured synaptic characteristics of an aligned CNT FET. (d, e) Tuning the synaptic properties of aligned CNT FETs with adjustment of the potentiating/depressing voltage pulse amplitudes. (f) Reduced pulse amplitude improves linearity and stability of the synaptic response with a slight reduction in dynamic range.
3.4 (a-d) Multiple cycles of synaptic properties characterized with repeated (1000) pulsed measurements. Each graph is for a different amplitude of the potentiating voltage pulse ranging from Vpot = 1.6 to 1.0 V. Top: Id vs pulse number for all 1000 cycles; bottom: extraction of Id at four different levels (i.e., after four different number of pulses) vs cycle number.
3.5 (a) Retention test showing samples of the programmed Id as a function of time immediately following the pulsed programming. (b) Collection of all data from (a)-(d) and model calculations indicating the impact of Vpot on the abruptness and dynamic range of the aligned CNT FET conductance modulation.
3.6 (a) Diagram illustrating the implementation of unsupervised learning for pattern recognition in a spiking neural network with aligned CNT synaptic devices. (b) Simulated time-dependent current in the postsynaptic (output) neurons. (c) Characteristics of output neuron potentials as simulated by an integrate and fire function, indicating the firing of the postsynaptic neuron spike as well as lateral inhibition. (d) Experimental data and model calculations of aligned CNT FET synaptic response used in the simulations of MNIST data set pattern recognition.
3.7 (a) Recognition rate as a function of training number for arrays with increasing number of output neurons. (b) Recognition rate after 60,000 training cycles as a function of the number of output neurons. (c) Conductance map of 20 output neurons after training.
3.8 Improvement in recognition rate as a function of training number in aligned CNT FET spiking neural networks with increasing amplitude of potentiating voltage pulses. Learning rate can be optimized to achieve improvements in recognition with reduced number of training cycles.
3.9 BP-SnSe junction synaptic device. (a) Schematic of the BP-SnSe heterojunction synaptic device. The presynaptic input is applied at the silicon bottom gate terminal. The electrode contacting SnSe is grounded. The postsynaptic output is measured at the electrode contacting BP. Vbias is applied between BP and SnSe, and the voltage Vg is applied between the input terminal and SnSe. (b) Schematic of a biological synapse that can co-release excitatory and inhibitory neurotransmitters. (c) STEM image, EDX line profile, and EDX mapping of the BP-SnSe junction device cross-section.
3.10 Tunable characteristics of the BP-SnSe heterojunction. (a) Schematic of the BP-SnSe junction. (b) Simulated band profiles at the junction between the BP and SnSe layers along the vertical direction indicated by the letter B for Vg = -20, 0, and +20 V, respectively. (c) Simulated band profile at the junction between the BP under the BP-SnSe junction and that outside the junction along the lateral direction indicated by the letter A for Vg = -20, 0, and +20 V, respectively. (d) Id-Vbias characteristics of the device at different Vg, showing the rectifying characteristics of the BP-SnSe heterojunction that is reconfigurable between p-p, i-p, and n-p junction types depending on the bias condition.
3.11 Excitatory and inhibitory responses reconfigurable at both the presynaptic and postsynaptic terminals. (a) The magnitude of the current measured at the electrode contacting BP for different Vbias and Vg, plotted in logarithmic scale. Positive pulses applied at the input terminal will lead to the injection of electrons into the phosphorus oxide layer. For regions with the magnitude of the current increasing with Vg, the synaptic response will be excitatory.
For regions with the magnitude of the current decreasing with Vg, the synaptic response will be inhibitory. The Id-Vg characteristics can be classified into three regimes based on the different junction types, i.e., p-p, i-p, and n-p. The current map can also be divided into four operation quadrants based on the horizontal ridge of zero Vbias and the diagonal ridge joining the points of minimum conductivity, both marked with the yellow dashed lines. (b) PSC in response to a 20 V input pulse at the input terminal for four different bias conditions corresponding to the points 1-4 in (a).
3.12 Potentiation, depression, and STDP for both the excitatory and inhibitory synaptic response modes. The weight change of the BP-SnSe synapse under positive (10 ms, 20 V pulses spaced at 90 ms apart) and negative (10 ms, -20 V pulses spaced at 90 ms apart) input pulse trains for (a) the excitatory response at Vg = 10 V and Vbias = 2 V and (b) the inhibitory response at Vg = 10 V and Vbias = -4 V. STDP characteristics for (c) the excitatory response mode and (d) the inhibitory response mode at the corresponding bias conditions in (a) and (b), respectively.
3.13 Tuning the synaptic responses. (a) Synaptic weight changes in response to a 20 V input pulse at different Vg bias conditions for Vbias = +2 and -4 V. (b) Tunable strengths of the excitatory and inhibitory synaptic responses mimicked by the device with different bias conditions at the presynaptic and postsynaptic terminals.
4.1 Device structure and electrical characteristics. (a) Schematic of the gate tunable memristive device. (b) The HR-STEM image of the fabricated device cross section. The scale bar is 5 nm. (c) EDX scan indicates the elemental composition. (d) Raman spectra for the SnSe sample before and after oxidation. The missing signature modes after oxidation indicates the full oxidation and amorphization of the SnSe sample.
(e) Unipolar electrical switching characteristics of the device at Vg = 0 V. The set and reset voltages in positive scan are 3.2 V and 2.8 V, and in negative scan are -3.4 V and -3 V. (f) Modulation of the set voltage by the gate bias. When Vg decreases from 30 V to -20 V, the set voltage increases.
4.2 (a) The SET process under different VTE. The initial state is reset to high resistance state and a bias VTE is applied to the device for 2 seconds. (b) The experimentally extracted probability distribution of the bias time until SET occurrence for VTE = 3 V, 4 V, 5 V and 6 V, respectively.
4.3 (a) Pss,t<2s as a function of the VTE under different gate voltages, showing exponential class sigmoidal distribution function. Experimental results are shown as data symbols, and the analytical model fit is shown in lines. (b) Experimental results (dots) and model fit (line) showing the relation between Teff and the gate bias Vg.
4.4 Flow chart showing the steps in mapping a MAX-SAT problem to an equivalent form solvable using the Boltzmann machine.
4.5 (a) The PCB evaluation board of BM integrated system including the packaged tin oxide/MoS2 memristive units and CMOS peripheral circuits. (b) Schematic of the BM circuit blocks.
4.6 The experimentally obtained evolution of state vector and total energy when the BM was started from three different initial states, resulting in the same optimal solution.
4.7 (a) Experimentally obtained energy evolution in the BM optimization process with Vg = -20 V, 0 V, 20 V, respectively. (b) The success rate of the BM optimization process under different Vg.
4.8 Implementing the simulated annealing in tin oxide/MoS2 based BM. (a) Conceptual schematic illustrating the evolution of the solution and energy states during the optimization process employing the 4 different variation strategies. (b) Experimentally obtained energy evolution in the BM optimization process for the 4 different strategies. (c) (d) Experimentally extracted success rate of the BM in achieving the most optimal solution using 4 different strategies of Teff variation during the optimization process: HT-to-LT, LT-to-HT, LT and HT. Different initial states are used in (c) and (d). Teff = 50 for HT and Teff = 5 for LT in (b), (c) and (d).
4.9 Russell-Rao similarity matrix underlying the clauses employing different "cooling" strategy in a MAX-SAT problem. (a) Schematic showing the correlation among the five clauses in the MAX-SAT problem as a result of the variable and clause constraints. As an illustration, if the variables in orange circles are assumed to be true, then the variables in blue circles would be false. (b), (c), (d) Russell-Rao similarity matrix between the five clauses when BM runs the optimization process under Teff = 50, 20 and 5, respectively.
4.10 The evolution of the Russell-Rao similarity matrix in a BM optimization process when Teff is decreased linearly with each iteration step.
5.1 (a) Schematic of a β-Ga2O3/graphene vertical device. (b) Optical image of a fabricated device. (c) AFM image scan in the area marked in (b). (d) AFM height profile along the red dashed line marked in (c).
5.2 (a) Energy band diagram of a β-Ga2O3/graphene heterostructure. (b) Id-Vg (symbol) and analytical calculation fitting (red line). The inset shows the barristor-type output characteristics of the device.
(c) Cross-sectional TEM image of the β-Ga2O3/graphene heterostructure on the SiO2/Si substrate. (d) Cross-sectional TEM image of highly crystalline strain-free β-Ga2O3. The inset shows the corresponding SAED pattern. (e) EDS mapping of the cross-section of β-Ga2O3/graphene/SiO2.
5.3 (a) Off-state breakdown characterization. (b) Simulated electrical potential distribution. (c) Simulated electric field along the vertical direction. (d) Analytical calculation of the breakdown voltages perpendicular and parallel to the (100) plane assuming similar junction structures.
5.4 Comparisons of breakdown electric fields measured on various semiconductor materials for power device applications. Red stars mark the results from this work.
6.1 (a) SEM image showing BP film after growth. The substrate area is also shown for comparison. (b) BP film under optical microscope. (c) Raman mapping of Ag/B2g peak intensity ratio of the BP film for the same area in (b). (d) Raman spectra of BP film from a small area in (b).
6.2 A schematic showing the CUP setup. The objectives can be neural systems, semiconductor devices etc.

Abstract

The computing demand of data-intensive applications far exceeds the computing capabilities of von-Neumann computing systems, which suffer from issues such as the "von-Neumann bottleneck", the "Moore's Law" scaling limit, high energy cost and low computing efficiency. Neuromorphic computing takes inspiration from the human brain to invent new types of hardware with advantages such as "massive parallelism", "low power consumption" and "stochasticity". Low-dimensional materials offer rich physical properties and are superior building blocks for developing advanced devices for neuromorphic computing.
This thesis focuses on building novel electronic devices and circuits for neuromorphic computing applications by addressing these aspects: (1) studying fundamental properties of low-dimensional materials and devices; (2) making and studying single devices that can mimic single biological components, like synapses and neurons; (3) studying the networking of these components; (4) building novel circuits to implement the networking and demonstrating computing algorithms. By engineering low-dimensional material properties and device fabrication, synaptic devices with high tunability are presented, which simplifies their implementation in next-level applications. Stochastic neurons are demonstrated, which enable the temperature-dependent Boltzmann machine and simulated annealing. This paves the way for research on mimicking the stochastic behavior of decision making in the human brain. In the end, the future potential and challenges of low-dimensional material based devices for neuromorphic computing are also discussed.

Chapter 1

Introduction

1.1 Introduction to Neuromorphic Computing

Nowadays, the computing demand of data-intensive applications far exceeds the computing capabilities of von-Neumann computing systems. In the 30 seconds we typically wait at a red traffic light, 1.5 gigabytes of image data is generated worldwide, including 0.6 million pieces of surveillance footage, 50 hours of online video, and millions of new photos uploaded to the Internet [1]. One study shows that, of all the information generated this way, more than 23% is valuable and worth further analysis, but less than 0.5% has actually been analyzed [1]. This indicates that we need an efficient way to tag and extract useful information in the big-data era. The von-Neumann computing system suffers from several issues which make it unsuitable for data-intensive applications (Figure 1.1).
The first issue is the so-called "von-Neumann bottleneck". The computing part and the memory part are separated, which drags down computing performance as the workload of data transmission and interaction increases. The second issue is the so-called "Moore's Law" limitation. "Moore's Law" scaling has slowed as complementary metal-oxide-semiconductor (CMOS) technology approaches its physical scaling limits. The third issue is high power consumption and low efficiency. For example, in order to train a computer to identify the faces of cats from around 10 million images taken from online videos, Google's stacked autoencoder algorithm was run on a cluster with 16,000 processor cores, which consumed more than 100 kW of power and took more than 3 days of training time [2]. A breakthrough in hardware-level acceleration is essential to meet the speed and energy-efficiency requirements of future computing tasks. The fourth issue is related to the von-Neumann architecture itself. A new architecture that can handle massively parallel data computations at the same time is highly needed. The fast development and huge success of graphics processing units (GPUs) has shown that many new functions are easily enabled by new architectures [3].

[Figure 1.1 panel labels: von Neumann "bottleneck"; "Moore's Law" scaling; 1.4 MW (>500 times average household power consumption), 8192 processors, 10^9 synaptic events/sec (10^9 flops); complexity is not "free": size, weight, power; limitation in implementing "new functions".]
Figure 1.1: The traditional von-Neumann computing suffers several issues which make it unsuitable to deal with data-intensive applications.

[Figure 1.2 panels: (a) von-Neumann computing architecture, with CPU and memory; (b) neuromorphic computing architecture, with interconnected neurons and synapses.]
Figure 1.2: Traditional von-Neumann computing architecture and the neuro-inspired computing architecture.
Neuromorphic computing takes inspiration from the human brain to invent new types of hardware that share the brain's advantages, including: (1) low power; (2) massive parallelism; (3) sparse distribution and massive capacity; (4) noise tolerance; (5) stochastic components for stochastic computing; (6) self-teaching and self-learning; (7) quick decision making despite incomplete information; etc. In a neuro-inspired architecture, data-centric computation can be realized by leveraging the relation between well-distributed neurons and synapses, which handle data computing and data storage respectively (Figure 1.2). Neuromorphic computing can be a great supplement to the current von-Neumann computing system. Together, they can help realize artificial intelligence and enable applications like image recognition, speech recognition, self-driving cars etc.

Many efforts have been put into developing new hardware platforms to support massive parallelism in neuro-inspired computing. GPUs and field programmable gate arrays (FPGAs) [4] are now popular candidates for applications like deep learning. CMOS-based application specific integrated circuit accelerators [5] and the custom-designed tensor processing unit (TPU) have been developed and reported [6]. In these hardware platforms, the implementation of neuro-inspired algorithms focuses on increasing parallelism but not on neural-spiking capabilities. There are also other types of hardware platforms which focus on mimicking entire neural-spiking functions for computing.

Table 1.1: Categories of various neuro-inspired hardware chips
Categories | CMOS technologies | CMOS ASIC technologies | Emerging synaptic technologies
Non-spiking computing | GPUs, FPGAs | TPU, CNN accelerators | UCSB crossbar array [10]
Spiking computing | SpiNNaker | HICANN, TrueNorth | IBM STDP neuron circuits [11]
These hardware platforms include SpiNNaker from the University of Manchester [7], HICANN from the University of Heidelberg [8], and TrueNorth from IBM [9]. These chips contain millions of artificial neurons, synapses, and transistors and have demonstrated good performance in real-time image recognition with low power consumption. Table 1.1 shows how the various neuro-inspired hardware chips are categorized. Technologies in this field are still developing rapidly. Emerging resistive non-volatile memory devices are being implemented in neuro-inspired chips of this kind. Arrays of these devices enable data storage and data computation in parallel. The static random-access memory (SRAM) arrays in von Neumann computing systems suffer from large array size, low capacity, high leakage current, and high power consumption. Compared with SRAM arrays, resistive memory arrays can offer higher capacity with smaller device size. Such an array has no standby leakage current, which reduces energy consumption. A resistive memory device can have fast access speed and multiple memory states, and it occupies much less area than an SRAM cell. Furthermore, a resistive memory array can perform parallel computing and weighted-sum computing, which are difficult to realize in SRAM arrays. Resistive memory devices have already achieved great progress in both research and industry. Many materials and device technologies have been studied, including: (1) WOx- [12], (2) TiOx/HfOx- [13], and (3) TiO2/TaOx- [14] based resistive random-access memory (RRAM); (4) Ge2Sb2Te5- [15] based phase-change memory (PCM); (5) Ag/a-Si- [16], (6) Ag/Ag2S- [17], and (7) Ag/GeS2- [18] based conductive-bridge random-access memory (CBRAM). Table 1.2 shows viable resistive memory technologies reported by industry.
Table 1.2: Viable technologies reported by industry
Industry | Device technology | Memory capacity | Bandwidth
Samsung [19] | PCM | 8 Gb | 40 MB/s
SanDisk/Toshiba [20] | RRAM | 32 Gb | NA
Micron/Sony [21] | CBRAM | 16 Gb | 200 MB/s

Neuromorphic computing can be truly realized only if interdisciplinary collaboration across devices, architecture, and algorithms is fully developed. Breakthroughs in this research field will benefit the implementation of future computing techniques.

1.2 Introduction to Biological Neuron and Synapse

A biological neuron consists of a cell body, dendrites, and synapses. The dendrites connect one neuron to other neurons through connections called synapses. Signal transmission proceeds as follows: a synapse receives a signal from the pre-synaptic neuron and releases neurotransmitter, which causes either depolarization or hyperpolarization in the post-synaptic neuron. After receiving sufficient summed input within 15-25 milliseconds, the neuron becomes depolarized and fires an action potential. If the input signals cause the neuron to hyperpolarize, the neuron is less likely to fire an action potential. A neuron can therefore be treated as an activation element: when the excitatory input minus the inhibitory input exceeds a threshold level, the neuron is activated and sends an action signal to the next stage.

Figure 1.3: A simple model of an individual neuron. The input axon, dendrite, synapse, and output axon are shown.

A single neuron in a neural network can be formalized [22] as shown in Figure 1.3. A neuron i receives and sums the input signals $x_j$ from its input axons. The total summation $s_i$ is

$s_i = \sum_j x_j w_{ij}$    (1.1)

where $w_{ij}$ is the synaptic strength. The first subscript i denotes the neuron that receives the signals and the second subscript j denotes the input axon.
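The weighted sum of Eq. 1.1 can be illustrated with a minimal Python sketch. The function names and the numerical values below are hypothetical and chosen only for illustration; they do not come from measured data.

```python
def weighted_sum(x, w_i):
    """Return s_i = sum_j x_j * w_ij for one neuron i (Eq. 1.1).

    x   : list of input-axon signals x_j
    w_i : list of synaptic strengths w_ij for neuron i
    """
    if len(x) != len(w_i):
        raise ValueError("x and w_i must have the same length")
    return sum(x_j * w_ij for x_j, w_ij in zip(x, w_i))


def layer_sums(x, w):
    """Apply Eq. 1.1 to every neuron; w[i][j] is the weight from axon j to neuron i."""
    return [weighted_sum(x, w_i) for w_i in w]


# Illustrative values (assumed): three input axons feeding two neurons.
x = [1.0, 0.0, 0.5]
w = [[0.2, 0.4, 0.6],
     [-0.5, 0.1, 1.0]]  # a negative weight stands in for an inhibitory synapse
sums = layer_sums(x, w)
```

Each row of w plays the role of the synapses attached to one neuron, so the same inputs produce a different summation for each neuron.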
The summation $s_i$ then enters the neuron cell body and may lead to neuron firing, which is described by the firing rate $y_i$. The activation function f gives $y_i = f(s_i)$ and can take different forms, as shown in Figure 1.4. It determines whether the neuron will fire or not.

Figure 1.4: Several simple activation functions. (a) Linear function. (b) A deformed linear function. (c) Step function. (d) Sigmoidal function. The slope of the sigmoidal function can be steep or less steep.

For a neural network to perform computation tasks, the synaptic weights have to be updated during learning. Donald Hebb proposed in 1949 [23] that a synapse updates its weight when presynaptic and postsynaptic activity occur together. Analytically, the Hebb rule can be expressed as

$\delta w_{ij} = \alpha \, y_i \, x_j$    (1.2)

Here a synapse $w_{ij}$ updates its weight by $\delta w_{ij}$ when presynaptic activity and postsynaptic activity happen simultaneously or conjunctively within 100-500 milliseconds. The presynaptic firing is $x_j$ and the postsynaptic firing is $y_i$. The constant $\alpha$ specifies the rate of the weight update. The Hebb rule indicates that presynaptic and postsynaptic activity must coincide in order to update the synaptic weight. It also indicates that the synaptic weight is strengthened more when the firing rates are higher. Consistent with the Hebb rule, associative long-term potentiation (LTP) and long-term depression (LTD) are found in the brain. LTP and LTD may not be the most basic events during learning in the brain, but they are needed in some synaptic systems. In LTP, the synaptic weight shows a sustained increase. LTP can be induced rapidly, in less than 1 minute, and lasts for hours or longer.

Figure 1.5: Schematic showing how neurotransmitter affects AMPA and NMDA receptors. The NMDA receptor is activated by the neurotransmitter, which further leads to LTP.
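As a brief aside on the activation functions of Figure 1.4 and the Hebb rule of Eq. 1.2, the following Python sketch shows one way to write them down. The logistic form of the sigmoid, the slope and threshold parameters, and all numerical values are assumptions made for illustration only.

```python
import math


def step(s, threshold=0.0):
    """Step activation: the neuron fires (1.0) only above the threshold (assumed 0)."""
    return 1.0 if s > threshold else 0.0


def sigmoid(s, slope=1.0):
    """Sigmoidal activation; the logistic form is an assumed example, slope sets steepness."""
    return 1.0 / (1.0 + math.exp(-slope * s))


def hebb_update(w, x, y, alpha):
    """Return new weights w_ij + alpha * y_i * x_j (Eq. 1.2).

    w[i][j] : weight from input axon j to neuron i
    x[j]    : presynaptic firing
    y[i]    : postsynaptic firing
    alpha   : learning-rate constant
    """
    return [[w_ij + alpha * y_i * x_j for w_ij, x_j in zip(w_i, x)]
            for w_i, y_i in zip(w, y)]


# Illustrative case (assumed values): only axon 0 and neuron 0 are active together.
w = [[0.0, 0.0],
     [0.0, 0.0]]
x = [1.0, 0.0]
y = [1.0, 0.0]
new_w = hebb_update(w, x, y, alpha=0.1)
```

In this example only the synapse whose presynaptic and postsynaptic sides are both active is strengthened; every other weight is left unchanged, which is the coincidence requirement stated by the Hebb rule.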
LTP is associative, synapse-specific, and requires temporal contiguity. In a postsynaptic cell in the hippocampus, the neurotransmitter L-glutamate released from the presynapse acts upon the NMDA (N-methyl-D-aspartate) and AMPA (alpha-amino-3-hydroxy-5-methyl-isoxazole-4-propionic acid) receptors on the postsynapse [24]. These receptors determine the opening and closing of ion channels. Normally, the AMPA channel is open and the NMDA channel is closed. When LTP happens, the NMDA channel also opens and causes a sustained potential increase in the postsynapse. Figure 1.5 shows a synapse and illustrates how the NMDA and AMPA receptors function. LTD can also be associative and includes two types: heterosynaptic LTD and homosynaptic LTD. When multiple synapses are connected to one cell body, a synapse may experience heterosynaptic LTD while the postsynaptic neuron is activated by another synapse. Homosynaptic LTD occurs when the presynapse is activated but the postsynaptic neuron is not. Figure 1.6 illustrates several cases associated with LTP and LTD [22].

Figure 1.6: Schematics showing four different cases associated with LTP and LTD. Two synapses A and B connect to one neuron. (a) A provides a weak input and causes no postsynaptic activation. No synaptic weight is strengthened. (b) Both A and B provide weak inputs and cause postsynaptic activation. Associative LTP happens for both synapses. (c) A provides a strong input and causes postsynaptic activation. A undergoes LTP. B, however, may show no change or may undergo heterosynaptic LTD. (d) A provides a medium input and causes intermediate firing in the postsynaptic neuron. A undergoes homosynaptic LTD and B shows no change.

Due to the temporal properties of LTP and LTD, spike-timing-dependent plasticity (STDP) is observed. LTP occurs when the presynaptic signal precedes the postsynaptic activation by milliseconds.
LTD occurs when the postsynaptic signal precedes the presynaptic activation by milliseconds. STDP is an asymmetric Hebbian learning rule and has been shown to help synapses learn sequences and predict the future based on knowledge of past events. Different types of STDP have been studied, including antisymmetric Hebbian STDP, antisymmetric anti-Hebbian STDP, symmetric Hebbian STDP, and symmetric anti-Hebbian STDP [25]. Neuroscience and brain functions go well beyond the above description, and the understanding of brain functions and neural systems is still very limited. By studying the functions and representations of neural networks in the brain, researchers are inspired to develop novel materials and hardware for neuromorphic computing.

1.3 Low Dimensional Materials for Neuromorphic Computing

Low-dimensional materials offer rich physical properties and are superior building blocks for developing advanced devices for neuromorphic computing (Figure 1.7). The rich physical properties of low-dimensional materials are discussed in detail in [26], [27]. Owing to their geometry and unique physical properties, low-dimensional material based devices show superior performance and various novel functions compared with silicon technology in neuromorphic computing. In the following parts, many of the published research works using low-dimensional materials for neuromorphic computing are discussed.

1.3.1 Two-Dimensional Materials for Neuromorphic Computing

Research on two-dimensional (2D) materials in the past decades has enabled many achievements in fundamental materials science and physics, semiconductor device engineering, and potentially life-changing novel applications. For neuromorphic computing, 2D materials offer rich properties and serve as a platform for realizing neuromorphic computing applications. The operating mechanisms of these devices vary.
One prominent advantage of 2D materials is that they have atomically smooth surfaces and good scaling and integration capability with traditional wafer-scale technology. Several types of 2D material based synaptic or memristive devices are discussed below.

Figure 1.7: (a) Various types of two-dimensional materials: atomic crystals (graphene, h-BN), transition metal oxides (TMOs: MoO3, LiCoO2), transition metal dichalcogenides (TMDCs, MX2: MoS2, WS2, NbSe2), and black phosphorus. (b) A typical one-dimensional material: a single-walled carbon nanotube. (c) A typical zero-dimensional material: a metal nanoparticle.

Vertical 2D material based memristors have achieved great success. One family of 2D materials is the transition metal dichalcogenides (TMDCs) [26], [28]. TMDCs have the form MX2 (M = Mo, W; X = S, Se). Vertical Au/MoS2/Au memristors show a high on-off ratio (>10^4), low on-state resistance (<10 Ω), and fast switching speed (~50 GHz) [29]. Graphene/MoS2/graphene memristors show a high operating temperature of 340°C [30]. Cu/MoS2/Au memristors show low turn-on voltages of around 0.2 V. They also show STDP behavior [31]. Memristors based on an oxidized h-BN layer (BNOx) show ultra-low energy consumption (~fJ) and low switching voltages [32]. Vertical h-BN based memristors also show memristive switching behavior [33]. BP/SnSe devices show tunable synaptic modes and STDP behavior under various biases [34]. Lateral 2D material based synaptic devices have also achieved great success. Graphene based memristive devices show tunable Hebbian learning capability [35]. Low energy consumption (<500 fJ) is demonstrated in graphene based lateral memristive devices that rely on Li+ ion diffusion [36]. Lateral WSe2 devices show low energy consumption of around 30 fJ [37].
Dielectric layers made of oxide materials, polymer electrolytes, and similar materials have been added to enable novel functions that mimic various Hebbian rules [38]. Grain boundaries in CVD-grown h-BN and MoS2 have been shown to be beneficial for memristive characteristics [33], [39]. The device performance relies on the grain boundary geometry and thus suffers from non-uniformity when scaled up. With defect engineering by electron-beam bombardment, MoS2 memristive devices show improved performance. Beyond the mechanisms discussed above, such as oxygen-vacancy conductance, defect engineering, ion diffusion engineering, and grain boundary engineering, some 2D materials offer new physical mechanisms such as phase transitions. 1T-TaS2 changes phase back and forth between an incommensurate charge density wave (CDW) state and a commensurate CDW state at 350 K [40]. Between 100 K and 200 K, 1T-TaS2 can also change to a Mott state of the commensurate CDW state. Ion intercalation can switch MoS2 between the 2H and 1T' phases, which enables memristive behavior [41]. MoTe2 can change between the 2H phase and the distorted 2Hd and orthorhombic Td phases, and the fabricated devices also show good memristive performance [42]. With further development of 2D material growth techniques, 2D material based devices will become suitable for commercial applications.

1.3.2 One-Dimensional Materials for Neuromorphic Computing

One-dimensional (1D) materials are geometrically similar to biological axons and offer the capability to realize hyper-connected biological neural networks. Beyond this geometric advantage, 1D materials also offer rich physical properties that enable devices with novel functions. The carbon nanotube (CNT) is the most well-studied 1D material. With the development of CNT growth techniques, the chirality and number of walls of CNTs can be controlled. CNTs with mixed alignment or uniform alignment can also be realized.
Because of the tubular geometry of the CNT, a high electric field exists between the CNT and the dielectric layer beneath it, which helps the CNT trap charges in deep energy levels. CNTs can also show either p-type or n-type conductance depending on doping, which facilitates device and circuit engineering. Aligned CNTs have been fabricated into synaptic transistors with T-shaped gates, and these demonstrate tunable and robust synaptic behavior enabled by a charge-trapping mechanism between the CNT and the gate dielectric layer [43]. Mixed CNTs have also been demonstrated with synaptic characteristics [44]. CNTs have also been integrated into 3D chips [45]. Nanowires of many other materials are also 1D and show memristive behavior. These materials include ZnO, TiO2, CuOx, Co3O4, Ga2O3, Ag, and Cu nanowires [28]. An organic nanowire with a polyethylene oxide (PEO) sheath wrapped around a poly(3-hexylthiophene-2,5-diyl) (P3HT) core shows 1.23 fJ energy consumption together with LTD and LTP behavior [46]. Polymer organic electrochemical transistors also demonstrate neuromorphic computing functions. An optical 1D system has also been invented, which includes light-emitting diodes, superconducting-nanowire single-photon detectors, and 1D waveguides [28]. The entire system contains neurons and synapses and shows neuromorphic computing capability with low energy consumption (~20 aJ) and fast switching speed (20 MHz). The spatial distribution of 1D materials can be engineered to mimic biological neural networks. There is not yet much research in this area.

1.3.3 Zero-Dimensional Materials for Neuromorphic Computing

Zero-dimensional (0D) materials include organic molecules, semiconducting quantum dots, metal nanoparticles, and quantum-confined electron gases. The optical properties of 0D materials can be engineered for neuromorphic computing applications. InAs/InGaAs quantum dot (QD) based lasers have been demonstrated with reconfigurable synaptic responses [47].
GaAs/AlGaAs based QDs have been demonstrated with electrical and optical memristive behavior [48]. CsPbBr3 perovskite QDs coupled with organic transistors show synaptic behavior and accept input signals through both an electrical path and an optical path [49]. Black phosphorus QDs surrounded by poly(methyl methacrylate) bilayers show memristive behavior due to a charge-trapping mechanism [50]. Gold nanoparticles on pentacene film have also been demonstrated with synaptic plasticity [51]. Silver nanoparticles in SiOx film show similar performance [52]. Small chemical molecules that are close to 0D have been used to fabricate synaptic devices [53]. 0D QDs have also been used in Josephson junctions to build quantum neural networks [54].

1.3.4 Mixed-Dimensional Heterostructures for Neuromorphic Computing

Different low-dimensional materials can be integrated together by van der Waals forces. Devices with mixed-dimensional p-n junctions (0D with 2D, 1D with 2D, 1D with 3D) have been demonstrated to generate Gaussian curves, which can be used to generate synaptic spikes and to provide both positive and negative feedback in a neural network. Low-dimensional materials can also be integrated into compact 3D chips, which can greatly enhance computing efficiency and improve device density. In conclusion, low-dimensional materials possess desirable properties for neuromorphic computing. Integrating them together and choosing the most suitable material and device structure for a specific application can enable complex neuromorphic functions and achieve desirable performance. Device and circuit technologies based on these materials show great potential for emulating biological neural networks and realizing real artificial intelligence.
1.4 Artificial Synapses and Neurons: From Devices to Circuits

1.4.1 Desirable Features of Artificial Synapses and Neurons

To meet the application requirements of neuromorphic computing, synaptic devices need desirable features such as small device size, robustness, and reliability. The following briefly discusses the desirable features of synaptic devices [55]. In order to mimic a neural network, the size of a single device should be compatible with planar wafer technology. Synaptic devices that can be scaled down to the sub-10 nm regime have been demonstrated. Devices with optimized dimensions have been integrated into crossbar array structures. A synaptic device should have multiple synaptic states to mimic real biological synaptic plasticity. Devices with good retention and endurance can improve learning efficiency and long-term memory capability. A synaptic device should have a reasonable dynamic range. The dynamic range is the on/off ratio between the device's low-resistance state and high-resistance state. A dynamic range >100 is preferable for neuromorphic computing algorithms to run efficiently. A synaptic device can potentiate or depress by updating its conductance across multiple levels. In many published works, the resulting conductance-update curve shows a certain curvature. Ideally, the curve should be linear to facilitate the mapping between synaptic weights and computing algorithms. This issue is called the linearity of the synaptic weight update. A linear, equally spaced distribution of the multiple states reduces computing error. Since the number of synaptic devices needed in a neural network is huge, device uniformity and variability determine the computing accuracy at the network scale. They also determine whether an algorithm can be realized at all. Though noise tolerance is a good property of neural networks, device uniformity and variability should still be engineered to be as good as possible.
Last but not least, the energy consumption per synaptic event is also important. Multiple research works have shown devices with pJ-level or fJ-level energy consumption per synaptic event. In a biological synapse, the spike voltage is ~10 mV, the related current is ~1 nA, and the duration is ~1 millisecond, which leads to an energy consumption of ~10 fJ per event. A synaptic device should have similar or lower energy consumption. Table 1.3 summarizes the desirable metrics for synaptic devices [55].

Table 1.3: Desirable metrics for synaptic devices
Desirable metrics | Performance target value
Device size | <10 nm
Multilevel states | >100
Retention | >10 years
Endurance | >10^9 updates
Energy consumption | <10 fJ per synaptic event
Dynamic range | >100

Integrating multiple synaptic devices in a parallel fashion leads to a crossbar array. The information inside the crossbar array is updated in an analog way, while the information outside the crossbar array is handled in a digital way. Optimized array architectures, or new methods to access individual devices, rows, or columns while avoiding the sneak-path problem, are desirable for realizing neural networks. The interconnect resistance should be minimized. Also, a boost of instantaneous power is needed if many of these devices are accessed at the same time, which challenges the capabilities of the peripheral supporting circuits. As the complexity of practical computing problems increases, the crossbar array size may need to increase dramatically too. Common circuit design issues such as device variability and parasitic effects may appear and become detrimental to circuit performance. 3D chips have been proposed to integrate different technologies into one compact chip in order to greatly increase computing efficiency with high chip capacity. Overall, optimization of each step and of the overall system is desirable for realizing neural network systems.
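Two quantities from this subsection can be checked with a short Python sketch: the energy per biological synaptic event (voltage × current × duration) and the weighted sum that an idealized crossbar array computes through Ohm's law and Kirchhoff's current law. The function names are hypothetical, the crossbar is assumed to be ideal (no interconnect resistance, no sneak paths), and the conductance and voltage values are illustrative rather than measured.

```python
def synaptic_event_energy(voltage_v, current_a, duration_s):
    """Energy of one synaptic event, E = V * I * t, in joules."""
    return voltage_v * current_a * duration_s


def crossbar_column_currents(voltages, conductances):
    """Column currents of an ideal crossbar: I_k = sum_j V_j * G[j][k].

    voltages[j]        : voltage applied to row j (volts)
    conductances[j][k] : device conductance at row j, column k (siemens)
    Assumes zero wire resistance and no sneak-path currents.
    """
    n_cols = len(conductances[0])
    return [sum(v_j * g_row[k] for v_j, g_row in zip(voltages, conductances))
            for k in range(n_cols)]


# Biological synapse estimate from the text: ~10 mV, ~1 nA, ~1 ms.
energy_j = synaptic_event_energy(10e-3, 1e-9, 1e-3)
energy_fj = energy_j * 1e15  # convert joules to femtojoules

# Illustrative 2 x 2 crossbar (assumed values).
voltages = [0.1, 0.2]                 # volts
conductances = [[1e-6, 2e-6],
                [3e-6, 4e-6]]         # siemens
currents = crossbar_column_currents(voltages, conductances)
```

Every column current is a weighted sum of the row voltages with the device conductances as weights, which is why the array evaluates all columns in parallel in a single read step.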
1.4.2 Crossbar Array for Neural Networks

Some published research works that study crossbar arrays for neuromorphic computing are discussed below to show the achievements and challenges in this research area. Sukru et al. in 2013 demonstrated a 10 × 10 crossbar array with phase-change memory and a 1T1R structure [56]. Each device has two terminals accessed by a column bitline or a row wordline. A recurrent Hopfield network is implemented to demonstrate Hebbian learning. Ten neurons are implemented in software to connect the 100 synaptic devices. Neurons are computing elements and synaptic devices are memory elements. During the training session, the information associated with specific neurons is applied to those neurons and the related synaptic devices, which reduces the resistance of these synaptic devices and leads to firing of those specific neurons. Then, in the recall session, a pattern that is 90% similar to the trained one is applied to all neurons. Some of the specific neurons fire as usual, and some do not fire because of the incorrect information. After several iterations, however, these neurons fire again, which means the neural network still remembers the information it learned during the training session. This learning demonstration is a form of the Hebbian learning rule. It is due to the memory effect of the synaptic devices during the training session. In this work, the effect of device-to-device variation on the learning performance is also studied.

Figure 1.8: (a) Image of the 12 × 12 crossbar array in M. Prezioso et al.'s work. (b) Three types of characters ("z", "v", "n") with different written versions that are used in the training algorithms. (c) The input image has nine pixels. (d) Single-layer perceptron diagram. These plots are reproduced with permission from [10], Springer Nature Ltd.

M. Prezioso et al.
in 2015 demonstrated training and operation of a crossbar array network based on metal-oxide memristors [10]. The single-device structure is based on Al2O3/TiO2-x. A 12 × 12 crossbar array was built, as shown in Figure 1.8 (a). Each device measures 200 × 200 nm^2. Device variation and dynamic behavior are studied using this crossbar array. A single-layer perceptron was demonstrated through supervised learning on a 10 × 6 crossbar array, which mimics a neural network with 10 inputs and 3 outputs. The training data set contains three types of characters, "z", "v", and "n", and each character has 30 different versions based on 3 × 3 pixels (Figure 1.8 (b), (c)). The diagrams in Figure 1.8 (c), (d) illustrate the training algorithm. After training for around 20 epochs, the misclassification rate decreased greatly. To broaden the range of applications, a multilayer perceptron is demonstrated by concatenating single-layer perceptrons in series. In this work, two 20 × 20 crossbar arrays are used to mimic the multilayer perceptron by connecting the input layer to the hidden layer and the hidden layer to the output layer. Experimental results of this work can be found in [10]. Sparse coding is a method inspired by the human visual neural network, which encodes visual images into sparse information. Patrick M. Sheridan et al. in 2017 demonstrated sparse coding and formed a dictionary of basic information using a 32 × 32 WOx memristor array [57]. It is used to decode complex data, reduce complexity, and ultimately improve data processing efficiency. Challenges remain before the crossbar array technique can be broadly used in neuromorphic computing systems. First is the power consumption issue with crossbar array.
It will affect the design of the peripheral CMOS circuits. Second, it is difficult to reliably validate neural networks built on crossbar arrays because of features that are hard to control, such as device variation, malfunction of individual devices, and disturbance arising from the devices' intrinsic stochasticity. Though simulation can help in understanding these issues, more solid experiments are required to keep improving on them. Nonetheless, many successful experimental demonstrations in the published literature have shown the feasibility of crossbar arrays for neuromorphic computing based on neural networks.

1.4.3 Circuit-level Artificial Neural Networks

Complex circuits can be implemented to realize multifunctional artificial neurons. In neural networks, neurons can generally be classified into two models. One is the leaky integrate-and-fire (LIF) model. The other is the Hodgkin-Huxley (HH) model. The LIF model mimics biological neuron characteristics such as spike generation, integration, and threshold activation. Z. Wang et al. in 2018 demonstrated LIF neuron circuits based on phase-change memristors [58]. In this circuit, the neuron consists of a memristor and a capacitor connected in parallel. When the voltage drop across the capacitor exceeds the switching voltage of the memristor, the neuron fires. In the HH model, the artificial neuron tries to mimic the sophisticated biological ion dynamics in neuron cells. M. D. Pickett et al. in 2013 demonstrated HH neuron circuits based on Mott memristors [59]. H. M. Huang et al. in 2019 introduced a quasi-HH circuit that combines the advantages of the LIF model and the HH model [60]. It combines the LIF model with a comparator and a timer. Whenever the memristive device exceeds the threshold, it triggers the comparator to generate a falling edge, and the timer then records this firing event.
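The integrate, leak, and threshold-fire behavior of the LIF model can be sketched in a few lines of Python. This is a generic discrete-time LIF model, not the memristor circuit of [58]: the leak resistance, membrane capacitance, threshold, time step, and reset-to-zero behavior are all assumed values chosen for illustration.

```python
def simulate_lif(input_current, dt, r_leak, c_mem, v_threshold, v_reset=0.0):
    """Forward-Euler simulation of a leaky integrate-and-fire neuron.

    The membrane voltage follows C * dv/dt = I_in - v / R.
    When v reaches v_threshold the neuron fires and v is reset to v_reset.
    Returns (spike_steps, voltage_trace).
    """
    v = v_reset
    spike_steps = []
    trace = []
    for n, i_in in enumerate(input_current):
        v += (i_in - v / r_leak) * dt / c_mem  # integrate input, subtract leak
        if v >= v_threshold:                   # threshold activation
            spike_steps.append(n)
            v = v_reset                        # return to rest after firing
        trace.append(v)
    return spike_steps, trace


# Assumed parameters: R = 1 MOhm, C = 1 nF (time constant 1 ms), threshold 0.5 V.
params = dict(dt=1e-5, r_leak=1e6, c_mem=1e-9, v_threshold=0.5)
n_steps = 1000  # 10 ms of simulated time

# Strong input: steady-state voltage I*R = 1 V is above threshold, so the neuron spikes.
spikes_strong, _ = simulate_lif([1e-6] * n_steps, **params)

# Weak input: steady-state voltage I*R = 0.4 V stays below threshold, so it never fires.
spikes_weak, trace_weak = simulate_lif([0.4e-6] * n_steps, **params)
```

With the weak input the leak balances the input current before the threshold is reached, while the strong input produces a regular spike train, which mirrors the threshold activation described for the LIF model.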
System-level applications that demonstrate computing algorithms can also be implemented using circuit-level neural networks such as spiking neural networks (SNNs). J. Feldmann et al. in 2019 demonstrated an SNN using optical methods to show self-learning capabilities [61]. Information for computing is first encoded into spikes with different spatial-temporal properties. The neurons in the SNN fire when the threshold condition is reached and then recover to the rest condition, which mimics how a neuronal membrane in the human brain works. Jing Pei et al. in 2019 demonstrated a hybrid SNN/ANN neural network chip. The comparison between SNNs and artificial neural networks (ANNs) is described in [62]. This hybrid chip is used to build an autonomous bicycle with multiple functions such as real-time object detection, route tracking, and automatic self-balancing (Figure 1.9 (a)). The experimental results are shown in Figure 1.9 (b), (c), (d).

Figure 1.9: (a) Illustration of Jing Pei et al.'s work [62]. The neuromorphic chip is integrated with an autonomous bicycle and enables multiple functions. (b) Voice recognition function test result. (c) Tracking function test result. (d) Self-balance function test result. These plots are reproduced with permission from [62], Springer Nature Ltd.

1.5 Thesis Outline

This thesis summarizes the main part of the research work during my PhD time at USC. It discusses device transport properties, low-dimensional material based synaptic devices, circuit-level neuromorphic computing implementations, and some other applications based on low-dimensional materials and devices.
In order to transfer the inspiration learned from the human brain into circuit-level applications with real artificial intelligence, my research has focused on: (1) studying fundamental properties of low-dimensional materials and devices; (2) making and studying single devices that can mimic single biological components, such as synapses and neurons; (3) studying the networking of these components; (4) building novel circuits to implement the networking and demonstrating computing algorithms. The thesis addresses these aspects as follows:

In chapter 2, the basic properties of an electronic device, the field-effect transistor (FET), are discussed. Black phosphorus based FETs and black phosphorus/h-BN based FETs are fabricated and compared. The transport study of these devices helps in understanding the relation between material engineering and device performance optimization. A temperature-dependent transport study of black phosphorus Schottky-barrier FETs with various channel lengths reveals the physical factors that may dominate the devices' characteristics. A theoretical study based on the Landauer transport model verifies the experimental observations.

In chapter 3, low-dimensional material based synaptic devices are discussed. The carbon nanotube (CNT) is a one-dimensional material with superior conductance properties. Aligned CNT FETs are demonstrated to behave like non-volatile flash memory and have multiple memory states. They can mimic biological synaptic responses with tunability. A crossbar array of CNT FETs can perform image recognition under an unsupervised learning algorithm. Next, a black phosphorus/SnSe based reconfigurable synaptic device is demonstrated with tunable synaptic responses. The device can mimic the excitatory mode or the inhibitory mode of a biological synapse.

In chapter 4, a stochastic neuron based on a tin oxide/MoS2 hetero-memristor is discussed. The stochastic nature of the tin oxide/MoS2 device, which can generate tunable sigmoidal curves, is studied.
This property is applied in making stochastic neurons and a temperature-based Boltzmann machine. A circuit-level implementation is shown. The simulated annealing effect of the Boltzmann machine is carefully studied. Parameters that may greatly affect the optimization process of the Boltzmann machine are discussed. This work focuses on utilizing the stochastic properties of a memristive device, a novel aspect that has rarely been studied in previously published work.

In chapter 5, other electronic applications based on low-dimensional materials are discussed. Exfoliated β-Ga2O3 and graphene are combined to make a barristor-type electronic device. Due to the ultra-wide bandgap of β-Ga2O3 and its anisotropic properties along different crystal directions, an ultra-high breakdown field of up to 5.2 MV/cm is experimentally observed in this device system. A theoretical study of the breakdown field is also provided.

In chapter 6, conclusions, challenges, and future work are presented. The device and circuit techniques developed in this thesis can serve as building blocks for future computing chips. Future research directions of neuromorphic computing based on low-dimensional materials are discussed.

Chapter 2 Two Dimensional Material Based Device Transport Property

2.1 Black Phosphorus on h-BN FET Transport

The electronic and transport properties of layered nano-materials are currently under extensive investigation for electronic [63], optoelectronic [64], thermoelectric [65], and other potential device applications [26], [66]. Black phosphorus (BP) is a layered material that can be fabricated as an ultra-thin (i.e., few-layer) conducting channel and has gained significant interest for next-generation nanoelectronic devices because of its high mobility and tunable bandgap [67], [68]. BP nanoelectronic devices are typically constructed as Schottky-barrier (SB) MOSFETs with metallic source/drain contacts and an insulated gate electrode [69].
In general, the transport properties of layered channel materials are affected by the underlying substrate (and the quality of their interfaces), as these introduce disorder and scattering due to charged impurities and other mechanisms, reducing the intrinsic performance of the channel [70]. Accordingly, recent studies have demonstrated a significant improvement in the transport properties of BP channels when using hexagonal boron nitride (h-BN) insulation or encapsulation [67]. Having an atomically smooth surface with nearly negligible dangling bonds and charge traps [71], h-BN can be used to insulate the BP channel from the roughness and impurities at the SiO2 surface, thus achieving improved transport characteristics. This improvement is typically characterized using extractions of mobility based on the empirical relationship between conductivity and carrier density. However, this does not provide insight into the transport mechanisms, nor does it allow accounting for differences in intrinsic (e.g., bandgap and effective mass) and extrinsic (e.g., trap density/distribution and Schottky barrier heights) properties in the analysis. In this section, we present a modeling approach for Schottky-barrier MOSFETs with low-dimensional channel materials based on the Landauer theory. To analyze the transport improvement in h-BN-insulated BP channels, we fabricate and measure BP Schottky-barrier MOSFETs with and without the h-BN insulating layer. Our analysis demonstrates an ~80% improvement in low-field effective channel mobility and an (energy-averaged) scattering mean free path that is >5 times larger for BP devices with an underlying h-BN layer compared to devices with BP directly on SiO2.

Figure 2.1: (a) Schematic and (b) optical image of the BP SB-MOSFET and BP/h-BN SB-MOSFET.

Figure 2.1 shows schematic images of the devices used in this work. These consist of a thin-film BP channel on SiO2/Si substrates with and without an insulating layer of h-BN.
In Figures 2.1, 2.2, and 2.3 we show optical and AFM images of the devices. Devices were fabricated by mechanical exfoliation of BP thin films on PDMS followed by dry transfer onto 300-nm SiO2/Si substrates. For devices with BP on h-BN, the h-BN was first exfoliated and transferred onto 300-nm SiO2, followed by BP exfoliation and dry transfer to form the SiO2/h-BN/BP heterostructure. BP samples with similar thickness (~18 nm) were carefully selected for both types of devices by visualizing the optical contrast of BP flakes on PDMS. Moreover, BP flakes with rectangular shapes over a length of more than ~20 µm were selected to allow fabricating FETs with various channel lengths on the same sample. Exfoliation and transfer were performed in an argon-filled glovebox (MBraun Inc.) with oxygen and water concentrations well below 0.1 part per million to ensure high-quality samples. The samples were subsequently coated with poly(methyl methacrylate) resist and patterned for metallization using a Raith 20-kV electron beam lithography system. Cr/Au (5/30 nm) contacts were formed by thermal evaporation using a Kurt J. Lesker Nano 36 system and a lift-off process. Following electrical characterization, AFM (Bruker Dimension Icon) was used to measure the thickness of the BP and h-BN layers. AFM surface profiles are shown in Figures 2.2 and 2.3. We note that the nonuniformity of the BP flake in Figure 2.3 is an edge effect that is expected for mechanically exfoliated samples. The overall surface of the BP channel (away from the edge) is smoother.

Figure 2.2: AFM of the BP SB-MOSFET (height [nm] versus distance [µm]).

Figure 2.3: AFM of the BP/h-BN SB-MOSFET (height [nm] versus distance [µm]).
Figure 2.4 (a) plots the transfer characteristics ($I_d$-$V_{gs}$) of BP on SiO2 SB-MOSFETs (labeled as SiO2/BP) with increasing channel lengths ($L$). These are room-temperature, low-field measurements with $V_{ds}$ = 10 mV. The SB-MOSFETs are constructed on the same BP sample, but are still vulnerable to device-to-device variation, as evidenced by the different turn-ON voltages. This can be attributed to variation in the impact of trapped charge in the SiO2 and of adsorbed contaminants on the surface of the BP [69]. The electrostatic effect of these charged impurities is a positive voltage shift in the $I_d$-$V_{gs}$ characteristics, typically described as p-type doping of the channel [72], and easily identified by the voltage at which the drain current reaches a minimum value (denoted as $V_{min}$). We note that the large positive values of $V_{min}$ in SiO2/BP devices indicate a significant impact of charged impurities located near the channel. To obtain a better comparison of the transfer characteristics as a function of $L$, we can offset the voltage axis as shown in Figure 2.4 (a) by plotting $I_d$ as a function of $V_{gs} - V_{min}$.

Figure 2.4: Transfer characteristics ($I_d$-$V_{gs}$) for increasing channel lengths ($L$ = 250, 500, 1000, 2000, and 4000 nm) (left), and the transfer characteristics with the gate voltage axis offset by the voltage at the minimum current ($V_{min}$) (right) for (a) SiO2/BP and (b) SiO2/h-BN/BP devices. (c) Extracted ON-state current ($I_{ON}$) as a function of channel length, indicating transport improvement in devices with the h-BN insulating layer. (d) Dual gate sweep transfer characteristics (offset by $V_{min}$) demonstrating a reduction in gate hysteresis for SiO2/h-BN/BP devices (h = hysteresis width).

In Figure 2.4 (b), we plot $I_d$-$V_{gs}$ curves for devices with the h-BN layer underneath the BP channel (labeled as SiO2/h-BN/BP), as well as the characteristics offset by $V_{min}$. SiO2/h-BN/BP devices have smaller $V_{min}$ and less device-to-device variation, indicating a reduced effect of fixed charged impurities. A smaller ON/OFF ratio is attributed to the slightly thicker BP channel in the SiO2/h-BN/BP devices. This is consistent with previous studies that show a strong dependence of the ON/OFF ratio on BP thickness [68]. We can extract the drain current at an equivalent ON-state biasing condition ($I_{ON}$) for increasing $L$. In Figure 2.4 (c), we plot $I_{ON}$ as a function of $L$ for SiO2/h-BN/BP and SiO2/BP devices. $I_{ON}$ is extracted at $V_{gs} - V_{min}$ = -60 V for SiO2/BP devices and -90 V for SiO2/h-BN/BP devices to account for the difference in effective oxide thickness (EOT). Here, both types of devices reveal an ~$1/L$ (i.e., ohmic) dependence of $I_{ON}$, indicating a scattering-limited transport regime. The results in Figure 2.4 (c) suggest an improvement in transport efficiency for devices with h-BN, as indicated by a larger $I_{ON}$. The improvement in transport is in agreement with the observation of reduced variation and lower $V_{min}$ in devices with h-BN insulating the channel from the effects of charged impurities at the interface with SiO2. Thus, the transport improvement can be attributed in part to a reduction in scattering due to the h-BN layer screening the Coulomb potential of charged impurities in SiO2 [70].
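The two data-reduction steps described above (offsetting the gate-voltage axis by $V_{min}$ and checking for an ohmic $1/L$ dependence of $I_{ON}$) can be sketched as follows. This is a minimal illustration on synthetic data, not the analysis code used for Figure 2.4; the function names `offset_by_vmin` and `loglog_slope` and all numerical values are hypothetical.

```python
import numpy as np

def offset_by_vmin(v_gs, i_d):
    """Return (v_gs - V_min, V_min); V_min is the gate voltage of minimum |I_d|."""
    v_gs = np.asarray(v_gs, dtype=float)
    i_d = np.asarray(i_d, dtype=float)
    v_min = v_gs[np.argmin(np.abs(i_d))]
    return v_gs - v_min, v_min

def loglog_slope(lengths, currents):
    """Least-squares slope of log(I_ON) versus log(L); -1 means I_ON ~ 1/L."""
    slope, _ = np.polyfit(np.log(lengths), np.log(currents), 1)
    return slope

# Synthetic ambipolar curve with its current minimum at +30 V (illustrative only).
v = np.linspace(-60.0, 60.0, 121)
i = 1e-3 + 1e-2 * np.abs(v - 30.0)
v_shifted, v_min = offset_by_vmin(v, i)

# Synthetic ON currents that follow an ideal 1/L law for the five channel lengths.
L_nm = np.array([250.0, 500.0, 1000.0, 2000.0, 4000.0])
i_on = 500.0 / L_nm
slope = loglog_slope(L_nm, i_on)
```

For measured data, a fitted slope close to -1 would be consistent with the scattering-limited regime described above, while a slope closer to 0 would point toward contact-limited or ballistic behavior.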
Additionally, a reduction in scattering from surface phonons is expected, as the h-BN layer significantly lowers surface roughness at the channel interface, as verified by the AFM surface profile characterization [see Figure 2.3] [73]. Further evidence of the h-BN layer screening the effects of charged impurities is provided in Figure 2.4 (d). Here, we show transfer characteristics obtained using dual voltage sweeps for devices with $L$ = 1000 nm. A significant reduction in gate hysteresis is measured for devices with h-BN. Gate hysteresis is attributed to a dynamic screening of the gate electric field due to charge trapping near the interface of the channel and the gate dielectric [74]. Here, the h-BN layer separates the BP channel from traps in the SiO2, effectively diminishing their dynamic charge contribution and resulting in reduced hysteresis. We note that while the impact of slow traps in SiO2 that contribute to hysteresis can be eliminated with the h-BN layer, interface traps that respond much faster to changes in gate bias are still present. These are presumably located in a thin (~1-2 nm) native phosphorus oxide (POx) layer directly adjacent to the BP channel [34]. These interface traps will not significantly contribute to gate hysteresis, as their occupancy responds immediately to changes in bias (i.e., over the timescales of interest) [75]. However, they have an electrostatic effect on the subthreshold swing (SS) and contribute to scattering when in their charged state [76]. As the native POx layer results from ambient exposure of BP samples, it exists on both the SiO2/BP and the SiO2/h-BN/BP devices. Thus, their electrostatic and scattering contributions must be considered in our analysis of transport in both types of BP SB-MOSFETs.
We note that the slightly larger SS in the SiO2/h-BN/BP data is due to a lower ON/OFF ratio resulting from a thicker BP channel, a larger effective oxide thickness (because of the additional h-BN layer), and a slightly larger density of traps with mid-gap energies in the native POx layer.

Figure 2.5: (a) Electronic band diagram of the SB-MOSFET indicating the charge transport mechanisms. (b) Typical Gaussian distribution of acceptor- and donor-like interface traps used in model calculations. (c) Self-consistent calculation of the channel potential as a function of gate bias with and without interface traps (top); density of ionized traps based on the self-consistent solution of the channel potential (bottom). (d) Calculation of drain current components (i.e., electron and hole currents) for both ballistic and scattering-limited transport in the channel. (e) Ballistic and scattering-limited transfer characteristics.

The analysis presented above provides a good qualitative description of the differences in the electrical characteristics of SiO2/BP and SiO2/h-BN/BP devices. A modeling approach that allows a better understanding and a quantitative analysis of the transport properties in BP SB-MOSFETs is presented next.
Applying the model to examine our experimental results enables a direct comparison of transport performance in both types of devices as a function of carrier density, while considering differences in channel thickness, energy bandgap, electron and hole SB heights at the source/drain contacts, density and energy distribution of interface traps, and scattering mechanisms in the channel.

The transport model is discussed next. In SB-MOSFETs, charge transport consists of thermionic and tunneling currents across the Schottky barriers, limited by scattering across the channel. Ambipolar transfer characteristics result from the combination of electron and hole current branches, flowing in the conduction and valence bands, respectively. The relative strength of the electron and hole branches is determined by the alignment of the Fermi levels in the source/drain contacts and the electronic bands in the channel. Figure 2.5 (a) illustrates the charge transport mechanisms in the SB-MOSFET using an electronic band diagram. The contacts are considered to be large reservoirs of electrons maintaining near-equilibrium conditions described using Fermi functions with Fermi levels $E_{Fs}$ and $E_{Fd} = E_{Fs} - qV_{ds}$ at the source and drain, respectively. The role of the gate voltage is to shift the energy level of the bands in the channel region, effectively modulating the tunneling barriers at the source/drain Schottky contacts. Equivalently, we can define a channel potential ($qV_c$) to describe the shift in $E_{Fs}$ with respect to fixed bands in the channel of the device. For the biasing condition illustrated in Figure 2.5 (a), the source/drain Fermi levels align with the top of the valence band ($E_V$), allowing a large hole current. This consists of tunneling and thermionic components, subjected to scattering in the channel.
Transmission probabilities across the source, channel, and drain regions are denoted as $T_S$, $T_C$, and $T_D$, respectively. The transport mechanisms described above are modeled using the Landauer formalism [77], [78], where the (hole) current is expressed as

$$I = \frac{2q}{h}\int_{-\infty}^{E_V} T(E)\,M(E)\,[f(E,E_{Fs}) - f(E,E_{Fd})]\,dE. \qquad (2.1)$$

Here, $f$ is the Fermi function and $M(E) = (g_v/\pi\hbar)\,[2m_h(E_V - E)]^{1/2}$ is the number of modes inside the valence band of the 2-D channel (i.e., for $E < E_V$), where $g_v$ is the valley degeneracy and $m_h$ is the hole effective mass in the valence band. The transmission coefficient $T(E)$ is obtained based on the series combination of scatterers and is given by Eq. (2.2) [78]. For energies between $E_V$ and the peak of the barrier for holes, $T_S$ and $T_D$ are calculated using the WKB approximation for the tunneling probability across a triangular barrier as [69]

$$T_{WKB} = \exp\left\{-\frac{2}{\hbar}\int_{0}^{x_0}\sqrt{2m_h\,[E - E_V(x)]}\;dx\right\}. \qquad (2.3)$$

The WKB approximation is generally accepted for tunneling across Schottky barrier contacts to thin-body and 2-D channels [79]. Scattering in the channel is modeled using an energy-dependent backscattering mean free path $\lambda(E)$, from which we obtain the transmission across the channel as

$$T_C = \frac{\lambda(E)}{\lambda(E) + L}. \qquad (2.4)$$

Here, we use a power law $\lambda(E) = \lambda_0\,[(E_V - E)/(k_BT_L)]^{r}$ that is valid for common scattering mechanisms and allows simple analytical modeling while providing general insight about transport and device operation [80]. Other techniques for modeling the intrinsic transport properties of nanostructures based on first-principles calculations of electron-phonon interactions can be found in [81]. The Fermi levels at the source and drain are given by $E_{Fs} = qV_c + qV_{ds}/2$ and $E_{Fd} = qV_c - qV_{ds}/2$, respectively.
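As an illustration of how the transmission terms combine, the sketch below evaluates the power-law mean free path and the channel transmission of Eq. (2.4). The body of Eq. (2.2) is not reproduced in this transcript, so `series_transmission` uses the textbook incoherent series rule $1/T = 1/T_S + 1/T_C + 1/T_D - 2$ as an assumption rather than as the confirmed form used in this work. All function names and numerical values ($\lambda_0$ = 10 nm, contact transmissions of 0.2) are hypothetical.

```python
import numpy as np

K_B = 8.617333e-5  # Boltzmann constant in eV/K

def mean_free_path(e_below_ev, lam0_nm, r, temp_k=300.0):
    """Power law lambda(E) = lam0 * [(E_V - E)/(k_B T_L)]**r, with e_below_ev = E_V - E in eV."""
    return lam0_nm * (np.asarray(e_below_ev, dtype=float) / (K_B * temp_k)) ** r

def channel_transmission(lam_nm, length_nm):
    """Eq. (2.4): T_C = lambda / (lambda + L)."""
    return lam_nm / (lam_nm + length_nm)

def series_transmission(t_s, t_c, t_d):
    """ASSUMED form of Eq. (2.2): incoherent series rule 1/T = 1/T_S + 1/T_C + 1/T_D - 2."""
    return 1.0 / (1.0 / t_s + 1.0 / t_c + 1.0 / t_d - 2.0)

# Hypothetical numbers: lam0 = 10 nm, charged-impurity exponent r = 3/2.
lam = mean_free_path(K_B * 300.0, lam0_nm=10.0, r=1.5)  # equals lam0 when E_V - E = k_B*T_L
t_short = channel_transmission(lam, 250.0)    # shortest channel in this work
t_long = channel_transmission(lam, 4000.0)    # longest channel in this work
t_total = series_transmission(0.2, t_short, 0.2)
```

Under the assumed series rule, perfectly transparent contacts ($T_S = T_D = 1$) reduce the total transmission to $T_C$, which matches the scattering-limited ON-state picture, and $T_C$ approaches 1 when $L \ll \lambda$.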
The relationship between $V_c$ and $V_{gs}$ is determined by capacitive coupling of the gate to the channel [82] and is calculated using Eq. (2.5), where $C_q$ is the quantum capacitance of the channel given by

$$C_q(V_c) = \frac{q^2}{4k_BT_L}\int_{-\infty}^{+\infty} D(E)\,\mathrm{sech}^2\!\left(\frac{E - qV_c}{2k_BT_L}\right)dE \qquad (2.6)$$

and $D(E)$ is the density of states in the channel (containing both conduction and valence bands) [82]. In (2.5), the flat-band voltage $V_{FB}$ accounts for the work-function difference between the gate and the channel ($\Phi_{MS}$) and contains the charge contribution from ionized (i.e., charged) interface traps. Interface traps can be acceptor-like (charged when occupied by electrons) or donor-like (charged when empty), and trap occupancy is calculated using Fermi functions [83]. The resulting expression for $V_{FB}$ is given by

$$V_{FB}(V_c) = \Phi_{MS} - \frac{q}{C_{ox}}\left\{\int_{-\infty}^{+\infty} D_{it,a}(E)\,f(E,qV_c)\,dE - \int_{-\infty}^{+\infty} D_{it,d}(E)\,[1 - f(E,qV_c)]\,dE\right\} \qquad (2.7)$$

where $D_{it,a}(E)$ and $D_{it,d}(E)$ are the acceptor- and donor-like interface trap densities, respectively. Equation (2.5) is a transcendental equation, since both $C_q$ and $V_{FB}$ are functions of the channel potential, and must be solved numerically to obtain a self-consistent solution of $V_c$. Figure 2.5 (b)-(e) illustrates an example of calculations based on the proposed modeling approach for a general SB-MOSFET with a 2-D nanomaterial channel. In Figure 2.5 (b), we plot the $D_{it,a}(E)$ and $D_{it,d}(E)$ used in the calculations, where energy is shown on the vertical axis to align with the energy band diagram in Figure 2.5 (a). Here, we use a U-shaped distribution modeled by the combination of two Gaussian components, one for acceptor-like traps centered near $E_C$, and one for donor-like traps centered near $E_V$.
This is a well-known distribution of dangling bonds in conventional Si technologies [84] that also captures the characteristics of traps (interface and/or near-interfacial border traps) in III-V semiconductors [85], carbon nanotubes [74], MoS2 [86], and recently BP transistors [87] on SiO2 and/or high-k dielectrics [88]. However, we note that the modeling approach does not require a specific trap distribution. Also, the transport analysis based on this modeling technique is very robust to uncertainties in the distribution as long as the total density of ionized traps ($N_{it,a}$ and $N_{it,d}$) contributing to charged-impurity scattering [see Figure 2.5 (c)] is physically reasonable. Here, we use trap distributions that result in $N_{it} \approx 10^{12}$ cm$^{-2}$, which is a commonly reported and physically reasonable trap density. In Figure 2.5 (c), we show the self-consistent solution of $V_c$ as a function of $V_g$ obtained using equation (2.6), as well as the density of ionized traps corresponding to the two integral terms in equation (2.7). Here, we use SiO2 as the gate dielectric with a thickness of $t_{ox}$ = 100 nm. As the magnitude of $V_g$ increases, interface traps are charged, resulting in a stretch-out of the $V_c$-$V_g$ characteristics (the case of $D_{it} = 0$ is also shown for reference). The impact of interface traps on the $V_c$-$V_g$ characteristics is most visible in the OFF-state region, where $C_q \ll C_{ox}$ and $V_c \approx V_g$. At larger gate voltages $C_q$ increases rapidly, pinning $V_c$. Figure 2.5 (d) plots $I_d$ per unit width as a function of $qV_c$, showing the electron (circles) and hole (diamonds) components, as well as the total current (lines), for both ballistic and scattering-limited transport in the channel. These calculations are for $V_{ds}$ = -50 mV, for a device having $W$ = 1000 nm and $L$ = 250 nm. A larger hole current is achieved since the Schottky barrier heights ($\Phi_{SB,n}$ and $\Phi_{SB,p}$) are set such that the Fermi levels at the contacts align closer to $E_V$.
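The claim that $C_q \ll C_{ox}$ in the OFF-state while $C_q$ grows rapidly once the Fermi level enters a band can be checked numerically from Eq. (2.6). The sketch below is only an illustration under stated assumptions: a step-like 2-D parabolic-band density of states with spin degeneracy 2 and valley degeneracy 1, the example parameters quoted in this section ($E_g$ = 0.5 eV, $m$ = 0.15 $m_0$, 100 nm SiO2), and energy measured from midgap. The function names are hypothetical.

```python
import numpy as np

Q = 1.602176634e-19        # elementary charge, C
HBAR = 1.054571817e-34     # reduced Planck constant, J*s
M0 = 9.1093837015e-31      # electron rest mass, kg
K_B_EV = 8.617333262e-5    # Boltzmann constant, eV/K
EPS0 = 8.8541878128e-12    # vacuum permittivity, F/m

E_GAP = 0.5                # bandgap in eV (example value from this section)
M_EFF = 0.15 * M0          # same effective mass assumed for electrons and holes

# Constant 2-D DOS per band, m/(pi*hbar^2), converted from 1/(J m^2) to 1/(eV m^2).
D0 = M_EFF / (np.pi * HBAR**2) * Q

def dos(e_ev):
    """Step-like DOS: D0 inside the conduction/valence band, zero in the gap (midgap = 0)."""
    return np.where(np.abs(e_ev) >= E_GAP / 2, D0, 0.0)

def quantum_capacitance(qvc_ev, temp_k=300.0):
    """Eq. (2.6) evaluated with a trapezoidal rule; returns C_q in F/m^2."""
    kt = K_B_EV * temp_k
    e = np.linspace(-2.0, 2.0, 40001)
    y = dos(e) / (4.0 * kt) / np.cosh((e - qvc_ev) / (2.0 * kt)) ** 2
    integral = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(e))  # units: 1/(eV m^2)
    return Q * integral  # q^2 times 1/(J m^2) equals q times 1/(eV m^2)

C_OX = 3.9 * EPS0 / 100e-9          # 100 nm SiO2, F/m^2
cq_off = quantum_capacitance(0.0)   # Fermi level at midgap (OFF-state)
cq_on = quantum_capacitance(0.75)   # Fermi level 0.5 eV inside the conduction band
```

With these assumptions the midgap value falls well below $C_{ox}$, whereas deep inside a band $C_q$ saturates near $q^2 D_0$, which exceeds $C_{ox}$ by orders of magnitude and is what pins $V_c$.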
Here we use $\Phi_{SB,n}$ = 0.35 eV and $\Phi_{SB,p}$ = 0.15 eV, such that $E_g = \Phi_{SB,n} + \Phi_{SB,p}$ = 0.5 eV. The electron and hole effective masses are set to $m = 0.15m_0$. For the case of scattering-limited transport, transmission across the channel is given by (2.4) using a power-law $\lambda(E)$ corresponding to charged-impurity scattering. Here, $\lambda_0 \sim N_0/N_{it}$, i.e., the mean free path is inversely proportional to the density of charged (i.e., ionized) impurities ($N_0$ is a constant), and $r = 3/2$ [80]. Calculations using power-law models can lead to discrepancies at large impurity concentrations [80], due to assumptions about isolated scattering events. In this work, a power-law model is valid as we consider relatively low densities of ionized traps. In Figure 2.5 (e), we plot the transfer characteristics ($I_d$-$V_{gs}$) for both the ballistic and scattering cases. A reduction in $I_d$ is obtained at large negative $V_g$ when scattering is included in the calculations. Compared to the OFF-state response, a larger effect of scattering on $I_d$ is calculated in the ON-state (e.g., for $V_g \lesssim -5$ V). This is due to having a scattering-limited transmission (i.e., $T \approx T_C$) in the ON-state, while in the OFF-state transmission is dominated by tunneling across the source/drain Schottky barriers. Additionally, $N_{it}$ increases with $-V_g$ (as more traps become ionized), resulting in a reduction of the charged-impurity scattering mean free path and a corresponding lowering of $T_C$. We note that $T_C$ allows us to treat both diffusive transport (i.e., when $L \gg \lambda$ and $T < 1$) and ballistic transport (i.e., when $L \ll \lambda$ and $T \approx 1$) [80].

We now apply the modeling approach to analyze the electrical characteristics and transport properties of SiO2/BP and SiO2/h-BN/BP devices. Fits of model calculations to the transfer characteristics of SiO2/BP devices are shown in Figure 2.6 (a). We model electron and hole conduction at energy levels near $E_C$ and $E_V$ using parabolic bands.
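Since the OFF-state transmission is dominated by tunneling through the Schottky barriers, it may help to see the WKB integral of Eq. (2.3) evaluated for an idealized triangular barrier. The sketch below assumes a linear band profile that drops from a height $\Phi$ at the contact to zero at $x_0$, for which the integral has the closed form $\exp[-(4/3)\,x_0\sqrt{2m\Phi}/\hbar]$. The barrier width of 5 nm is a hypothetical value chosen for illustration, and the function names are not from this work.

```python
import numpy as np

Q = 1.602176634e-19      # elementary charge, C (also J per eV)
HBAR = 1.054571817e-34   # reduced Planck constant, J*s
M0 = 9.1093837015e-31    # electron rest mass, kg

def wkb_numeric(phi_ev, width_nm, m_rel, n=20001):
    """Evaluate Eq. (2.3) numerically for a linear (triangular) barrier profile."""
    x = np.linspace(0.0, width_nm * 1e-9, n)
    barrier_j = phi_ev * Q * (1.0 - x / x[-1])       # local barrier energy in joules
    kappa = np.sqrt(2.0 * m_rel * M0 * np.clip(barrier_j, 0.0, None)) / HBAR
    integral = np.sum(0.5 * (kappa[1:] + kappa[:-1]) * np.diff(x))
    return float(np.exp(-2.0 * integral))

def wkb_closed_form(phi_ev, width_nm, m_rel):
    """Closed form for the same triangular barrier: exp(-(4/3) * kappa0 * x0)."""
    kappa0 = np.sqrt(2.0 * m_rel * M0 * phi_ev * Q) / HBAR
    return float(np.exp(-(4.0 / 3.0) * kappa0 * width_nm * 1e-9))

# Hole barrier of 0.15 eV and m = 0.15 m0 (values from the example above);
# the 5 nm tunneling width is a hypothetical choice.
t_num = wkb_numeric(0.15, 5.0, 0.15)
t_cf = wkb_closed_form(0.15, 5.0, 0.15)
t_thin = wkb_closed_form(0.15, 2.0, 0.15)
```

In a full device model the barrier width shrinks as the gate bends the bands more strongly, so the tunneling probability rises with gate bias, which is the modulation of the tunneling barriers described earlier in this section.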
The electron and hole effective masses are set to $m_e = 0.15m_0$ and $m_h = 0.14m_0$, respectively, and the Schottky barrier heights for electrons and holes are set to $\Phi_{SB,n}$ = 0.27 eV and $\Phi_{SB,p}$ = 0.08 eV, respectively, based on BP thickness-dependent values reported in [69] and [89]. The model calculations in Figure 2.6 (a) are obtained by adjusting $D_{it,a}(E)$, $D_{it,d}(E)$, and $\lambda(E)$ (as explained below) to simultaneously fit experimental data from all devices with different $L$. Here, rather than changing model parameters to account for device-to-device variation and obtain a better match to data from individual devices, we use a unique set of parameters and a unique distribution of interface traps. This allows us to use calculations based on the experimental fit to analyze the transport properties as a function of $L$. We note that BP transport is highly anisotropic, and we use the light electron and hole effective masses since these are expected to dominate injection across the SB in wide BP samples such as the ones used in this work. Injected carriers are expected to subsequently scatter as they traverse the channel and respond with an average effective mass [90]. Moreover, the exact value of $m_h$ does not have a significant effect on the tunneling probability due to the small SB height for holes, which is consistent with our calculations. Therefore, the analysis presented below, comparing transport in devices from two separate BP flakes, is robust to discrepancies resulting from the orientation of the BP layer. Nonetheless, the modeling approach allows for a careful treatment of the BP orientation as required for a general transport analysis, especially in devices with narrow channels where transport can be highly directional.

Figure 2.6: Fits of model calculations to the experimental $I_d$-$V_{gs}$ characteristics of (a) SiO2/BP and (b) SiO2/h-BN/BP SB-MOSFETs with increasing channel length $L$. (c) Energy distribution of acceptor- and donor-like interface traps used to fit experimental data. (d) Model calculation and experimental extractions of the ON-current $I_{ON}$ as a function of $L$ for both the SiO2/BP and SiO2/h-BN/BP SB-MOSFETs, showing good agreement. (e) Calculations of the energy-averaged mean free path as a function of hole sheet density based on the experimentally verified model for both types of devices. Results indicate transport improvement due to a larger mean free path for charged-impurity and phonon scattering.

In Figure 2.6 (b), we show fits of model calculations to the transfer characteristics of SiO2/h-BN/BP devices with increasing $L$. In this case, we use the same values for the effective masses, but set the Schottky barriers to $\Phi_{SB,n}$ = 0.21 eV and $\Phi_{SB,p}$ = 0.11 eV due to a slightly thicker BP channel [69], resulting in smaller ON/OFF ratios [68] and a slightly larger experimentally observed contact resistance (i.e., 1.89 kΩ·µm in SiO2/h-BN/BP compared to 1.49 kΩ·µm in SiO2/BP devices). Following the same procedure as above, model calculations are obtained by adjusting $D_{it,a}(E)$, $D_{it,d}(E)$, and $\lambda(E)$ to fit the experiments. Figure 2.6 (c) plots the $D_{it,a}(E)$ and $D_{it,d}(E)$ used in model calculations for both the SiO2/BP and SiO2/h-BN/BP devices. Interface traps are modeled using Gaussian distributions that peak near $E_C$ and $E_V$ for acceptor- and donor-like traps, respectively.
The similarity in the shape of the distributions used for the fits is expected, as these are associated with traps in the native POx layer that is present in both cases. In our calculations, two separate carrier scattering mechanisms are considered: charged-impurity and phonon scattering. We use $1/\lambda(E) = 1/\lambda_{ci}(E) + 1/\lambda_{ph}(E)$ with power-law models for the individual mechanisms, where $r = 3/2$ for charged-impurity scattering and $r = 1/2$ for phonon scattering [80], [91]. For charged-impurity scattering, $\lambda_0$ depends on the density of ionized interface traps ($N_{it}$) and is modeled as $\lambda_{0ci} = N_{0ci}/N_{it}$, while phonon scattering is modeled using a constant $\lambda_{0ph}$. Thus, $N_{0ci}$ and $\lambda_{0ph}$ are the fitting parameters associated with scattering. Figure 2.6 (d) shows calculations of $I_{ON}$ as a function of $L$ in good agreement with experiments. The larger $\lambda_{0ci}$ and $\lambda_{0ph}$ required to fit the SiO2/h-BN/BP data indicate transport improvement (i.e., a longer mean free path). To obtain a quantitative determination of the improvement in the transport properties, we extract the energy-averaged mean free path as

$$\langle\langle\lambda(E)\rangle\rangle = \frac{\int \lambda(E)\,M(E)\,(f_s - f_d)\,dE}{\int M(E)\,(f_s - f_d)\,dE}, \qquad (2.8)$$

where $f_s = f(E, E_{Fs})$ and $f_d = f(E, E_{Fd})$. $\langle\langle\lambda(E)\rangle\rangle$ represents a weighted average of $\lambda(E)$ over the energy range where most of the current flows for a given bias (as determined by the density of modes and the difference in the Fermi functions, i.e., the Fermi window). Additionally, calculating the carrier (hole) sheet density as $p_s = \int_{-\infty}^{E_V} D(E)\,[1 - f(E, qV_c)]\,dE$ allows a direct comparison of $\langle\langle\lambda(E)\rangle\rangle$ between both types of devices. In other words, by comparing $\langle\langle\lambda(E)\rangle\rangle$ as a function of $p_s$ we eliminate discrepancies due to differences in device characteristics (e.g., geometry, gate dielectric thickness, BP channel thickness, and trap density/distribution), allowing a fundamental comparison of the transport properties. In Figure 2.6 (e), we plot $\langle\langle\lambda(E)\rangle\rangle$ as a function of $p_s$ for both SiO2/BP and SiO2/h-BN/BP devices.
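A minimal numerical sketch of Eq. (2.8) is given below. It combines the two power-law mechanisms with Matthiessen's rule and weights $\lambda(E)$ by the number of modes and the Fermi window. Energies are measured from $E_V = 0$, the prefactor of $M(E)$ is dropped because it cancels in the ratio, and the values of $\lambda_{0ci}$, $\lambda_{0ph}$, $qV_c$, and $qV_{ds}$ are hypothetical; the function names are not from this work.

```python
import numpy as np

K_B = 8.617333e-5  # Boltzmann constant in eV/K

def fermi(e, mu, temp_k):
    """Fermi function f(E, mu) with energies in eV."""
    return 1.0 / (1.0 + np.exp((e - mu) / (K_B * temp_k)))

def trapezoid(y, x):
    """Plain trapezoidal rule on a monotonically increasing grid."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def power_law(e, lam0_nm, r, temp_k=300.0):
    """lambda(E) = lam0 * [(E_V - E)/(k_B T_L)]**r with E_V = 0."""
    return lam0_nm * (-e / (K_B * temp_k)) ** r

def averaged_mfp(lam, e, qvc, qvds, temp_k=300.0):
    """Eq. (2.8): lambda(E) weighted by M(E) ~ sqrt(E_V - E) and the Fermi window."""
    modes = np.sqrt(-e)
    window = fermi(e, qvc + qvds / 2, temp_k) - fermi(e, qvc - qvds / 2, temp_k)
    return trapezoid(lam * modes * window, e) / trapezoid(modes * window, e)

e = np.linspace(-1.0, -1e-4, 20000)          # valence-band energies in eV
lam_ci = power_law(e, 5.0, 1.5)              # hypothetical lam0_ci = 5 nm, r = 3/2
lam_ph = power_law(e, 20.0, 0.5)             # hypothetical lam0_ph = 20 nm, r = 1/2
lam_tot = 1.0 / (1.0 / lam_ci + 1.0 / lam_ph)

avg_ci = averaged_mfp(lam_ci, e, qvc=-0.05, qvds=0.01)
avg_ph = averaged_mfp(lam_ph, e, qvc=-0.05, qvds=0.01)
avg_tot = averaged_mfp(lam_tot, e, qvc=-0.05, qvds=0.01)
```

Because the combined $\lambda(E)$ is smaller than either individual mechanism at every energy, the averaged value is always below both single-mechanism averages, so lengthening either $\lambda_{0ci}$ or $\lambda_{0ph}$ raises $\langle\langle\lambda(E)\rangle\rangle$.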
In both cases, $\langle\langle\lambda(E)\rangle\rangle$ peaks at $p_s \approx 10^{9}$ cm$^{-2}$ (i.e., in the OFF-state), where the density of ionized traps is small. Increasing $p_s$ above ~$10^{9}$ cm$^{-2}$ correlates with an increase in the density of ionized donor-like traps, resulting in a reduction of $\langle\langle\lambda(E)\rangle\rangle$. Similarly, $p_s \lesssim 10^{9}$ cm$^{-2}$ correlates with an increase in the density of ionized acceptor-like traps, also lowering $\langle\langle\lambda(E)\rangle\rangle$ due to enhanced charged-impurity scattering. A significant transport improvement in SiO2/h-BN/BP devices corresponds to a larger mean free path, as indicated in Figure 2.6 (e) (i.e., >5x in the ON-state). This is due to a reduction in both charged-impurity and phonon scattering mechanisms, as determined by fits to experimental $I_d$-$V_{gs}$ data as a function of $L$.

Figure 2.7: Extractions of (hole) channel mobility as a function of (hole) carrier density for SiO2/BP and SiO2/h-BN/BP devices at $T$ = 300 K, based on fits to the experimental results shown in Figure 2.6. Extractions are from devices with $L$ = 250 nm.

We also extract the ON-state channel mobility using $\mu_p = \sigma/(qp_s)$, where

$$\sigma = \left(\frac{I}{V_{ds}}\right)\left(\frac{L}{W}\right) = \frac{2qL}{hV_{ds}W}\int_{-\infty}^{E_V} T(E)\,M(E)\,(f_s - f_d)\,dE. \qquad (2.9)$$

Figure 2.7 plots $\mu_p$ as a function of $p_s$. At $p_s = 10^{12}$ cm$^{-2}$ we obtain $\mu_p \approx$ 350 and 630 cm$^2$/Vs for SiO2/BP and SiO2/h-BN/BP devices with $L$ = 250 nm, respectively. The values obtained for SiO2/BP devices are comparable to values reported in recent works [68], [92], and we observe an ~80% enhancement in mobility when insulating with h-BN. The improvement increases at lower carrier densities, where charged-impurity scattering dominates. The mean carrier velocity can be estimated using $\langle v \rangle = I/(Wqp_s)$ (e.g., $1 \times 10^{5}$ cm/s for SiO2/BP versus $2.5 \times 10^{5}$ cm/s for SiO2/h-BN/BP at $p_s = 10^{12}$ cm$^{-2}$). We note that previous works reported an enhancement in mobility with increased BP thickness [63], [93].
However, this enhancement appears negligible in the range of BP thicknesses of our samples compared to the improvements we observed with h-BN insulation. Thus, we attribute the superior transport properties of SiO2/h-BN/BP devices to a reduction in scattering.

In summary, using a comprehensive modeling approach based on the Landauer formalism, we analyze the transport properties of SB-MOSFETs with BP channels on SiO2 with and without an insulating h-BN layer. The modeling approach uses a self-consistent solution of the channel potential based on the quantum capacitance of the channel, and incorporates the electrostatic and scattering impact of interface traps having nonuniform energy distributions. It also includes a component to account for roughness at the BP channel interface and its contribution to surface phonon scattering. A comparison of model calculations and experimental data allows analyzing the transport properties of SiO2/h-BN/BP and SiO2/BP SB-MOSFETs as a function of $L$. Based on the experimentally validated model, we present a direct comparison of the energy-averaged mean free path $\langle\langle\lambda(E)\rangle\rangle$ as a function of carrier density for both types of devices. The comparison reveals a mean free path that is >5x larger for devices with h-BN insulation. This is attributed to screening of the scattering potential from charged impurities in the SiO2, as well as a reduction in phonon scattering due to reduced surface roughness at the BP channel interface. Extractions of channel mobility reveal a significant improvement in SiO2/h-BN/BP devices compared to SiO2/BP and to previously reported MoS2 devices (another 2-D channel material of interest for beyond-Si devices) [94].
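The headline numbers of this section can be cross-checked with simple arithmetic. The sketch below recomputes the mobility enhancement from the two extracted mobilities and estimates a drift velocity as $\mu V_{ds}/L$. The uniform-field assumption (all of $V_{ds}$ dropping across the channel, none at the contacts) is a simplification introduced here and is not how $\langle v \rangle$ is extracted in this work, so the velocities should only agree in order of magnitude. The function names are hypothetical.

```python
Q = 1.602176634e-19  # elementary charge, C

def sheet_conductivity(mu_cm2_per_vs, p_s_per_cm2):
    """sigma = q * p_s * mu, in siemens (per square)."""
    return Q * p_s_per_cm2 * mu_cm2_per_vs

def drift_velocity(mu_cm2_per_vs, v_ds, length_cm):
    """Uniform-field estimate v = mu * V_ds / L in cm/s (ignores contact voltage drops)."""
    return mu_cm2_per_vs * v_ds / length_cm

MU_SIO2 = 350.0      # cm^2/Vs, SiO2/BP at p_s = 1e12 cm^-2
MU_HBN = 630.0       # cm^2/Vs, SiO2/h-BN/BP at p_s = 1e12 cm^-2
V_DS = 0.010         # V
L_CM = 250e-7        # 250 nm expressed in cm

enhancement = (MU_HBN - MU_SIO2) / MU_SIO2           # fractional mobility gain
sigma_sio2 = sheet_conductivity(MU_SIO2, 1e12)
v_sio2 = drift_velocity(MU_SIO2, V_DS, L_CM)
v_hbn = drift_velocity(MU_HBN, V_DS, L_CM)
```

The enhancement evaluates to 0.8, matching the ~80% figure, and the uniform-field velocities come out at about $1.4 \times 10^{5}$ and $2.5 \times 10^{5}$ cm/s, the same order as the values quoted above.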
2.2 Temperature Dependent Black Phosphorus FET Transport

Since the rediscovery of black phosphorus (BP) as a promising two-dimensional (2-D) layered material for electronic and optoelectronic applications [26], [63], [68], several studies have explored the transport properties of BP [89], [92], [94] and the electrical characteristics of nanoscale devices with BP channels [66], [67], [93], [95]. The operation of Schottky barrier metal-oxide-semiconductor field-effect transistors (SB-MOSFETs) is of significant technological importance for electronic applications with 2-D channel materials and metal source/drain contacts [63], [69], [96]. In 2013, Das et al. utilized conventional thermionic emission theory to extract the SB height at the interface between thin MoS2 flakes and metal contacts and to demonstrate the impact of the metal work function on the transport properties of SB-MOSFETs with 2-D channels [97]. Later, Penumatcha et al. presented an analytical model based on Landauer's formalism to analyze the transport properties of BP SB-MOSFETs [69]. This model neglected scattering in the channel but was successfully applied to fit experimental data in the device off-state region of operation and to extract SB heights as a function of BP channel thickness. Recently, Esqueda et al. presented an improved analytical SB-MOSFET model that incorporates channel scattering as well as the effects of interface traps and quantum capacitance through a self-consistent calculation of the channel potential [98]. This model was used to analyze transport properties in ultrathin BP devices with and without hexagonal boron nitride (h-BN) insulation and to demonstrate improvements in mobility due to screening of charged-impurity and roughness-induced phonon scattering. In this section, a new analysis of transport in BP is presented, based on measurements of the temperature- and channel-length ($L$)-dependent electrical characteristics of SB-MOSFETs with ultrathin BP channels.
Based on this characterization, we demonstrate the importance of scattering effects on analyzing transport across the SB at the interface between metal contacts and 2-D channel materials (previously neglected). We also derive a more appropriate expression for the thermionic emission current in SB-MOSFETs using the Landauer formalism, which should replace the conventional expression used by Das et al. [97] when extracting the SB height in nanoscale devices with 2-D channels. We apply this new expression to extract the SB height at the metal/channel interface of BP devices. We also demonstrate the impact of channel scattering on the extraction of the SB height by applying this technique to devices with increasing channel lengths. Then, using the improved model of Esqueda et al. [98], we analyze the BP SB-MOSFET on-state current as a function of temperature. We use model fits to the temperature-dependent data to extract the field-effect mobility at a fixed carrier density, and demonstrate the contribution of charged-impurity and phonon scattering to the transport properties of BP SB-MOSFETs. The modeling approach accounts for device parameters and material properties (e.g., dimensionality, density of states, quantum capacitance, etc.) as well as extrinsic effects (e.g., interface states and charge trapping).

Figure 2.8: (a) 3-D schematic of the black phosphorus device with channel lengths $L_1$ = 250 nm, $L_2$ = 500 nm, $L_3$ = 1000 nm, $L_4$ = 2000 nm, and $L_5$ = 4000 nm. (b) AFM image of the BP device. (c) AFM height profile indicating a thickness of ~13 nm for the device with $L_1$ = 250 nm, and ~9 nm for all other devices. The channel width is approximately 200 nm for all devices (extracted at the midpoint).

For this work, we fabricated BP SB-MOSFETs through the mechanical exfoliation and dry transfer of BP onto a 300 nm SiO2/Si substrate.
A narrow, rectangular BP flake with uniform thickness was carefully selected through visual inspection of the optical contrast of BP flakes on the substrate. Exfoliation and transfer were performed in an ultrahigh purity (99.999%) argon-filled glovebox (MBraun Inc.) with low oxygen and water concentration (i.e., <0.1 ppm), to ensure the quality of the BP samples. The samples were then coated with PMMA (poly(methyl methacrylate)) resist, and contact regions with increasing separation were patterned using a Raith 20 kV electron beam lithography (EBL) system. Cr/Au (5/30 nm) contacts were formed through thermal evaporation using a Kurt J. Lesker Nano 36 system and a lift-off process. Figure 2.8 (a) is a schematic of the sample used in this study. The sample was loaded into a vacuum probe station ($< 5 \times 10^{-6}$ Torr) and annealed at 350 K for 1 h. Using the Si substrate as a common back-gate, we measured the electrical characteristics of SB-MOSFETs with increasing channel lengths, labeled in Figure 2.8 (a) as L1 through L5. The electrical characterization was performed at various temperatures from room temperature to 77 K. After the electrical characterization, atomic force microscopy (AFM) was used to measure the thickness, surface roughness, and geometry of the BP channel in the sample devices. Figure 2.8 (b) is the AFM image of the sample, and Figure 2.8 (c) is a plot of the height profile of the BP flake extracted from AFM measurements across the center of the channel region for the various SB-MOSFETs. The AFM measurements reveal the nanoscale uniformity of the BP flake thickness and width. The channel width (W) for all SB-MOSFETs is ~200 nm (extracted at the midpoint of the BP flake thickness). The thickness is approximately 13 nm for the SB-MOSFET with L1 = 250 nm, and approximately 9 nm for all other devices.

Figure 2.9: (a) Dual gate sweeps of transfer characteristics (Id - Vbg) for the device with L/W = 1000/200 nm at various temperatures. (b) Negative gate sweep of Id - Vbg for the device with L = 1000 nm. (c) Temperature dependence of change in Vmin and density of charged traps NT. (d) Negative gate sweep of Id - Vbg with gate voltage axis offset by Vmin. (e) Temperature dependence of Ion at a fixed bias of Vbg - Vmin = -80 V for devices with increasing channel lengths. (f) Temperature dependence of Ioff at a fixed bias of Vbg - Vmin = -8 V for devices with increasing channel lengths. (Panel annotations: Vds = 10 mV; L/W = 1000 nm/200 nm; $t_{ox}$ = 285 nm, $\varepsilon_{ox}$ = 3.9 (SiO2); curves at 77, 100, 150, 200, 250, and 300 K.)

Figure 2.9 (a) plots measurements of drain current (Id) per unit width (W) as a function of back-gate voltage (Vbg) from the BP SB-MOSFET with L = 1000 nm, for all temperatures (77, 100, 150, 200, 250, and 300 K). These are dual gate sweeps that reveal gate hysteresis in the Id - Vbg characteristics. With increasing temperature, the effect is enhanced, indicating a wider contribution of traps to the charge trapping effects that cause gate hysteresis. With increasing temperature we also observe a reduction of the on/off ratio, due to a wider range of energies contributing to carrier conduction in the channel.
This results in a larger overlap of the hole and electron branches, effectively raising Id at its minimum, which corresponds to the gate voltage at which the branches intersect. For easier visualization, in Figure 2.9 (b) we plot the Id - Vbg characteristics from only the negative gate sweep direction (positive to negative Vbg). The measurements in Figure 2.9 (b) reveal that the gate voltage at the minimum Id (i.e., Vmin) shifts to larger positive values with increasing temperature, indicating a net negative change in the charge contribution from traps. This can be attributed to an increased occupancy of acceptor-like traps, as more electrons can be captured at higher temperatures under the initial condition of the sweep with a large positive Vbg bias. Figure 2.9 (c) plots the change in Vmin (and the corresponding change in density of charged acceptor-like traps) as a function of temperature. In Figure 2.9 (d) we plot Id as a function of Vbg - Vmin, to effectively offset the electrostatic effect (voltage shift) due to traps contributing to gate hysteresis. The resulting Id vs Vbg - Vmin plot indicates that current flow increases with temperature in the off-state, and decreases with temperature in the on-state. In Figure 2.9 (e) we plot the on-state drain current (Ion) at a fixed bias of Vbg - Vmin = -80 V, which corresponds to an approximate hole density of $p_s \approx -(C_{ox}/q)(V_{bg} - V_{min}) = 5 \times 10^{12}$ cm$^{-2}$, as a function of temperature for SB-MOSFETs with increasing channel lengths. The error bars indicate a ±20% error tolerance in the extraction of Ion resulting from uncertainty in W (determined from AFM measurements) and uncertainty in Vmin extracted from the I-V characteristics. We note that the largest uncertainty in Vmin was determined to be approximately ±2.5 V at 77 K (i.e., due to noise at low Id), and has only a small effect on extractions of Ion. The 20% tolerance safely accounts for miscalculation in W from AFM measurements.
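The conversion between a gate-voltage offset and a sheet charge density used above can be sketched with a parallel-plate estimate. This is only an illustration: the 300 nm SiO2 thickness and relative permittivity of 3.9 are taken from the device description, quantum capacitance and trapped charge are ignored, and the function names and the 10 V shift of Vmin are hypothetical.

```python
# Parallel-plate estimate of gate-induced sheet density (illustrative only).
Q_E = 1.602176634e-19     # elementary charge [C]
EPS0 = 8.8541878128e-12   # vacuum permittivity [F/m]

def oxide_capacitance(t_ox_m, eps_r=3.9):
    """Areal capacitance of the gate oxide in F/m^2."""
    return eps_r * EPS0 / t_ox_m

def sheet_density_cm2(delta_v, t_ox_m, eps_r=3.9):
    """|C_ox * delta_v / q|, converted from m^-2 to cm^-2."""
    return abs(oxide_capacitance(t_ox_m, eps_r) * delta_v / Q_E) * 1e-4

# Gate overdrive used for I_on in the text, with the nominal 300 nm SiO2.
p_s = sheet_density_cm2(-80.0, 300e-9)
# Hypothetical 10 V shift of V_min, expressed as a change in charged-trap density.
delta_n_t = sheet_density_cm2(10.0, 300e-9)
print(f"p_s ~ {p_s:.2e} cm^-2, delta N_T ~ {delta_n_t:.2e} cm^-2")
```

With these inputs the estimate is of the same order as the approximate hole density quoted in the text; the same relation converts the measured shift in Vmin into the change in charged-trap density plotted in Figure 2.9 (c).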
The results in Figure 2.9 (e) show Ion decreasing with L, and a transition from an initially temperature-independent current to a $T^{-\gamma}$ dependence with $\gamma$ between approximately 0.8 and 1.4. These observations are consistent with scattering in the BP channel because Ion is proportional to $\lambda/L$, where $\lambda$ is the carrier backscattering mean free path, which captures the scattering mechanisms in the channel (and their temperature dependence) [80], [99]. We will show that the transition in the temperature dependence of Ion results from a transition of the dominant scattering mechanism from charged-impurity to phonon scattering. In Figure 2.9 (f) we plot the off-state drain current (Ioff) as a function of temperature for BP SB-MOSFETs with increasing channel length at a fixed bias of Vbg - Vmin = -8 V. In the off-state, current flow is limited by the Schottky barrier at the interface between the source/drain metal contacts and the BP channel, and increases with temperature due to thermionic emission of carriers over the barrier. With increasing L, the role of scattering becomes more significant and can become the limiting factor toward lower Vbg - Vmin bias (less negative bias), reversing the temperature dependence of Ioff. As a result, the rate at which Ioff increases with temperature is significantly reduced with L, as shown in Figure 2.9 (f). In the following analysis we demonstrate how this affects the commonly used techniques for extracting SB heights.

We next turn to the SB analysis. As indicated by Das et al. [97], the metal/semiconductor contact interface is a major performance-limiting factor for devices constructed with low-dimensional materials. Thus, understanding its impact on carrier transport is important for analyzing SB-MOSFETs with 2-D channels. SB heights can be extracted based on the temperature-dependent study of thermionic emission and thermally assisted tunneling.
Earlier work used an equation for thermionic emission current in bulk devices, Eq. (2.10) [97]. It is useful to have a closed-form expression for current due to thermionic emission over the barrier at the interface between metals and 2-D channels (see Figure 2.10 (a) for thermionic emission of holes). A derivation based on Landauer's formalism is presented in Section 2.3, resulting in the (hole) current expressed as

$$ I = V_{ds}\, W K \sqrt{k_B T}\, \exp\!\left(-\frac{\Phi_B}{k_B T}\right) \qquad (2.11) $$

where $K = (q^2/h)(g_v/\hbar)\sqrt{2 m_h/\pi}$. As illustrated by the band diagram in Figure 2.10 (a), when the device is biased in the off-state, the barrier height ($\Phi_B$) is determined by the energy difference between the top of the valence band in the channel and the Fermi level at the contact. At flat-band, $\Phi_B$ corresponds to the SB height for holes ($\Phi_{SB,p}$). As plotted in Figure 2.10 (b), holes injected from the contact follow the Fermi-Dirac distribution. The probability of holes occupying energy levels available for conduction in the valence band (i.e., above the barrier) increases exponentially with temperature [78]. Thus, thermionic current increases exponentially with temperature.

Figure 2.10: (a) Band diagram shows barrier for holes. (b) Fermi-Dirac distributions of holes injected from the contact at different temperatures. (c) Arrhenius-type plot of Ioff vs 1/T at different Vbg - Vmin. (d) Extracted barrier heights for devices with increasing channel lengths. Inset plot shows SB heights extracted for devices with various channel lengths.
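A minimal sketch of an Arrhenius-type barrier extraction follows, assuming the thermionic current has the form I = Vds W K sqrt(kB T) exp(-Phi_B/(kB T)) obtained from the Landauer derivation in Section 2.3. Under that assumption, ln[I/(Vds W K sqrt(kB T))] is linear in 1/T with slope -Phi_B/kB, and the constant prefactor only shifts the intercept. The function names are hypothetical and the currents are synthetic, generated from the same expression rather than taken from measurements.

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant [eV/K]

def thermionic_current(temp_k, phi_b_ev, prefactor=1.0):
    """I = prefactor * sqrt(kB*T) * exp(-phi_B/(kB*T)); prefactor stands in for V_ds*W*K."""
    kt = K_B * temp_k
    return prefactor * math.sqrt(kt) * math.exp(-phi_b_ev / kt)

def extract_barrier(temps_k, currents, prefactor=1.0):
    """Least-squares slope of ln[I/(prefactor*sqrt(kB*T))] vs 1/T, returned as phi_B in eV."""
    xs = [1.0 / t for t in temps_k]
    ys = [math.log(i / (prefactor * math.sqrt(K_B * t)))
          for t, i in zip(temps_k, currents)]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return -(num / den) * K_B

temps = [77.0, 100.0, 150.0, 200.0, 250.0, 300.0]          # temperatures used in the text
synthetic = [thermionic_current(t, 0.035) for t in temps]  # synthetic data, not measurements
phi_fit = extract_barrier(temps, synthetic)
print(f"recovered barrier: {phi_fit * 1e3:.1f} meV")
```

On real data the fit would be repeated at each Vbg - Vmin bias, and any contribution from channel scattering or tunneling would show up as a deviation from the straight line.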
The barrier height can be extracted experimentally from the slope of $\log[I_{off}/(V_{ds} W K \sqrt{k_B T})]$ plotted vs 1/T (Figure 2.10 (c)). In Figure 2.10 (d) we plot the extracted barrier height as a function of Vbg - Vmin for SB-MOSFETs with L = 500, 1000, 2000, and 4000 nm (symbols). Above flat-band, the barrier at the interface no longer changes, but the extracted $\Phi_B$ continues to drop, although at a reduced rate, due to the increasing contribution of thermally assisted tunneling. The flat-band condition is identified as the point where $\Phi_B$ deviates from its linear dependence on Vbg - Vmin. Figure 2.10 (e) plots the extracted SB height ($\Phi_{SB}$) as a function of L. The extracted value drops from ~30 meV for L = 500 nm to ~15 meV for L = 4000 nm, due to the more significant contribution of scattering to the temperature dependence of Ioff. If we extrapolate the extracted SB height back to small L, where scattering is negligible, we obtain $\Phi_{SB} \approx 35$ meV. Such a small energy barrier for holes is consistent with the dominant p-type conduction that is achieved in the BP SB-MOSFETs used in this work. The error bars in Figure 2.10 (e) indicate a ±10 meV uncertainty in the extraction of $\Phi_{SB}$ that may result from discrepancies in the fits of Ioff vs T caused by errors in the extraction of Vmin.

Figure 2.11: (a) Fits of model calculations to the experimental Id - Vbg for the device with L = 2000 nm. (b) $D_{it}$ for donors and acceptors. (c) Model calculation and experimental extractions of temperature dependence of Ion for devices with increasing channel lengths. (d) Model calculation and experimental extractions of channel length dependence of Ion at increasing temperatures. (e) Temperature dependence of extracted hole mobility in BP SB-MOSFETs. (In all panels, symbols are data and lines are model.)

Next, we analyze scattering effects. We measured the temperature and channel-length dependence of Ion in SB-MOSFETs, which indicates a transition of the dominant scattering mechanism as a function of temperature. We now analyze the transport properties of thin and narrow BP channels by fitting these experimental results with our previously demonstrated model for SB-MOSFETs. Shown in Figure 2.11 (a) are model fits to the experimental Id vs Vbg characteristics for the device with L = 2000 nm at increasing temperatures. For the model calculations we have used the SB height for holes extracted experimentally (i.e., $\Phi_{SB,p}$ = 0.035 eV), and $\Phi_{SB,n}$ = 0.265 eV for electrons, resulting in a bandgap of $E_G = \Phi_{SB,n} + \Phi_{SB,p}$ = 0.3 eV. The fit to the experimental results required adjusting the density and energy distribution of interface traps (acceptor- and donor-like traps, $D_{it,a}$ and $D_{it,d}$), used in a self-consistent calculation of channel potential as given by $V_c = (V_g - V_{FB})[C_{ox}/(C_{ox} + C_q)]$ [74], [82]. $C_q(V_c)$ and $V_{FB}(V_c)$ are the quantum capacitance and flat-band voltage, and both are functions of $V_c$. $V_{FB}$ accounts for the work-function difference between the gate and channel, as well as the charge contribution from ionized (i.e., charged) interface traps. We use simple Gaussian distributions for both $D_{it,a}(E)$ and $D_{it,d}(E)$, plotted in Figure 2.11 (b), that result in reasonable agreement with the experimental I-V characteristics and help capture the observed trends as a function of L and temperature.
In these calculations, the channel potential represents the amount of band bending in the channel with respect to the Fermi levels in the source/drain contacts. Thus, the modeling approach self-consistently accounts for the bias-dependent impact of trap occupancy on band bending as a function of $V_g$. The mean free path, $\lambda(E)$, is modeled by incorporating charged-impurity ($\lambda_{ci}$) and phonon ($\lambda_{ph}$) scattering components as $1/\lambda(E) = 1/\lambda_{ci}(E) + 1/\lambda_{ph}(E)$ [80]. Each component is described by a power-law function of energy, $\lambda(E) = \lambda_0 [(E_v - E)/(k_B T)]^r$, where r = 3/2 for charged-impurity scattering and r = 1/2 for phonon scattering [80], [91], [100]. For charged-impurity scattering, $\lambda_0$ depends on the density of charged traps ($N_{it}$) and is modeled as $\lambda_{0ci} = N_{0ci}/N_{it}$. The larger temperature dependence of Ion for T ≳ 150 K is captured by the phonon scattering component using $\lambda_{0ph} = N_{0ph}/T^n$ [80]. $N_{0ci}$, $N_{0ph}$, and n are the model fitting parameters used to capture the temperature and channel length dependence of Ion plotted respectively in Figures 2.11 (c) and 2.11 (d). For these model fits we have used $N_{0ci} = 8 \times 10^{13}$ cm$^{-2}$, $N_{0ph} = 3.8 \times 10^{7}$, and n = 2.4. The model calculations result in good fits with experimental data for Ion as a function of L and temperature. A better understanding of the scattering mechanisms and their temperature dependence can be obtained by calculating hole mobility as a function of temperature at a fixed carrier concentration ($p_s$). These calculations are shown in Figure 2.11 (e) for $p_s = 10^{12}$ cm$^{-2}$, using the same model parameters used to fit Ion. The results in Figure 2.11 (e) indicate a mobility limited by charged-impurity scattering with negligible temperature dependence below T ≈ 100 K. For temperatures between 100 K and room temperature, calculations indicate a transition to a phonon-scattering-limited mobility with a large temperature dependence approximating a power-law $T^{-\gamma}$ dependence.
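The way the two scattering components combine can be illustrated with a short sketch. It assumes the inverse-sum rule 1/lambda = 1/lambda_ci + 1/lambda_ph and the power-law energy dependence with r = 3/2 (charged impurity) and r = 1/2 (phonon). The temperature law N0ph/T^n uses the fitted values n = 2.4 and N0ph = 3.8e7, but the unit of the result (treated here as nm), the charged-impurity prefactor of 100, and the function names are hypothetical choices made only to show the crossover.

```python
def power_law_mfp(lam0, energy_over_kt, r):
    """lambda(E) = lam0 * [(E_v - E)/(kB*T)]**r for one scattering mechanism."""
    return lam0 * energy_over_kt ** r

def combined_mfp(lam_ci, lam_ph):
    """Inverse-sum rule: 1/lambda = 1/lambda_ci + 1/lambda_ph."""
    return 1.0 / (1.0 / lam_ci + 1.0 / lam_ph)

def mfp_at(temp_k, energy_over_kt=1.0, lam0_ci=100.0, n0_ph=3.8e7, n=2.4):
    """Total mean free path; lam0_ci and the length unit are hypothetical."""
    lam_ci = power_law_mfp(lam0_ci, energy_over_kt, 1.5)           # temperature-independent prefactor
    lam_ph = power_law_mfp(n0_ph / temp_k ** n, energy_over_kt, 0.5)  # prefactor falls as T^-n
    return combined_mfp(lam_ci, lam_ph)

for t in (77, 100, 150, 200, 300):
    print(t, round(mfp_at(t), 1))
```

At low temperature the phonon term is long and the total is pinned near the charged-impurity value; as temperature rises the phonon term shortens and takes over, which mirrors the transition described for Ion and mobility.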
This large temperature dependence is consistent with previous studies [67], and has been attributed to a large contribution of acoustic phonons.

In conclusion, we present a new analysis of transport in ultrathin BP SB-MOSFETs with various channel lengths using temperature-dependent measurements from 300 to 77 K. From a device physics point of view, these results confirm our general understanding of SB-MOSFETs with 2-D channels and provide new observations that indicate the impact of scattering on transport in BP devices as a function of channel length and the effects on SB height analysis/extraction techniques. Our measurements show that Id increases with temperature in the off-state but decreases with temperature in the on-state. This is explained by the charge conduction limiting mechanism transitioning from thermionic emission of charges over the SB at the contact/channel interface in the off-state to channel carrier scattering in the on-state. Based on Landauer's formalism, we present a generalized technique for analyzing thermionic emission current in SB-MOSFETs with 2-D materials. We derive a closed-form expression for BP SB-MOSFETs and apply it to extract the SB height at the source/drain contacts using standard methods based on the slope of the off-state current vs temperature. Our results indicate the impact of scattering, observed as a gradual decrease of extracted SB height with increasing channel length. This is attributed to a more significant role of channel scattering, compared to thermionic emission at the SB, in devices with longer channels. The temperature dependence of hole mobility is extracted from model fits to the on-state current. Our results indicate a transition of the dominant scattering mechanism from charged-impurity to phonon scattering, as observed from a transition in the temperature dependence of mobility.
The characterization and modeling techniques presented in this work were used to investigate the operation and transport properties of SB-MOSFETs with 2-D channel materials and allow consideration of intrinsic material and device properties (e.g., density of states, channel material bandgap, contact work functions, etc.) as well as extrinsic effects (e.g., interface trap density and distribution). These considerations eliminate discrepancies that may arise due to the electrostatic effect of charged traps and enable true calculations of mobility as a function of temperature at fixed carrier densities. The results show that mobility can exceed 1000 cm$^2$/Vs and remain fairly independent of temperature below 100 K. With temperature increasing above 100 K, hole mobility is reduced following a power-law dependence to about 300 cm$^2$/Vs at room temperature.

2.3 Landauer Transport of BP SB-MOSFETs

In this section, a generalized technique is presented for analyzing thermionic emission current in SB-MOSFETs with 2-D materials, which is applied in the work presented in Section 2.2. The Landauer equation for hole conduction in the valence band of BP SB-MOSFETs is given by

$$ I_d = \frac{2q}{h} \int_{-\infty}^{E_v} T(E)\, M(E)\, [f_s - f_d]\, dE \qquad (2.12) $$

where q is the electronic charge, h is Planck's constant, T(E) and M(E) are the transmission coefficient and density of modes, respectively, and $f_s$ and $f_d$ are the Fermi functions at the source and drain, respectively. For small Vds we can approximate

$$ f_s - f_d \approx \frac{\partial f_0}{\partial E}\, q V_{ds} \qquad (2.13) $$

and under the Boltzmann approximation

$$ \frac{\partial f_0}{\partial E} \approx \frac{1}{k_B T} \exp\!\left[\frac{E - E_{F0}}{k_B T}\right]. \qquad (2.14) $$

In the off-state,

$$ T(E) = 0 \ \text{for} \ E > E_v \ \text{and} \ 1 \ \text{for} \ E < E_v \ \text{(ignoring scattering)}. \qquad (2.15) $$

Because the limits of integration are for $E < E_v$, we can simply set T(E) = 1. For 2-D BP, the density of modes is given by

$$ M(E) = W \frac{g_v}{\pi \hbar} \sqrt{2 m_h (E_v - E)}, \qquad (2.16) $$

where $g_v$ is the valley degeneracy and $m_h$ is the hole effective mass in the valence band. Overall we have

$$ I = V_{ds} W \frac{2q^2}{h} \frac{g_v \sqrt{2 m_h}}{\pi \hbar\, k_B T} \int_{-\infty}^{E_v} \sqrt{E_v - E}\; e^{-\frac{E_{F0} - E}{k_B T}}\, dE. \qquad (2.17) $$

By expressing the SB height as

$$ \Phi_{SB} = E_{F0} - E_v \qquad (2.18) $$

and changing variable

$$ x = \sqrt{E_v - E} \qquad (2.19) $$

we obtain

$$ I = V_{ds} W \frac{2q^2}{h} \frac{g_v \sqrt{2 m_h}}{\pi \hbar\, k_B T}\; e^{-\frac{\Phi_{SB}}{k_B T}} \int_{0}^{\infty} 2x^2 e^{-x^2/k_B T}\, dx. \qquad (2.20) $$

Finally, using

$$ \int_{0}^{\infty} 2x^2 e^{-x^2/k_B T}\, dx = \frac{\sqrt{\pi}}{2} (k_B T)^{3/2} \qquad (2.21) $$

we obtain

$$ I = V_{ds} W K \sqrt{k_B T}\; e^{-\frac{\Phi_{SB}}{k_B T}} \qquad (2.22) $$

where

$$ K = \frac{q^2}{h} \frac{g_v}{\hbar} \sqrt{\frac{2 m_h}{\pi}}. \qquad (2.23) $$

Chapter 3 Low Dimensional Material Based Synaptic Devices

3.1 Carbon Nanotube Synaptic Device and Network

As complementary metal-oxide-semiconductor (CMOS) technology approaches the fundamental limits of process scaling, significant material and device research is under way, aimed at a more efficient and better performing replacement for MOS field-effect transistors (MOSFETs) [101], [102]. Low-dimensional (e.g., 2-D and 1-D) materials such as graphene and carbon nanotubes (CNTs) are promising candidates with excellent scalability and desirable electronic transport properties under low-voltage operation [103], [104]. Moreover, recent developments in the functionality of CNT devices [105], as well as their compatibility with three-dimensional (3-D) integration [45], may enable the implementation of non-von-Neumann architectures that eliminate the separation of memory and logic, thus reducing power consumption and heat generation resulting from costly data transfer (i.e., the von-Neumann "bottleneck") [106]. The time- and power-efficient computing offered by these architectures is especially valuable for low-power mobile electronic systems. Moreover, the increasing deployment of mobile, data-gathering devices for the Internet of Things (IoT) presents a significant need for efficient and high-throughput data preprocessing at the edge of the network (i.e., edge computing) [107]. Neuromorphic architectures, inspired by the human brain, emulate the structure and functionality
By utilizing the synaptic properties of resistive switching (i.e., memristive) devices, artificial neural networks can be fabricated in a crossbar configuration offering the density, parallelism, and 3-D integration compatibility desired for the efficient hardware implementation of machine learning algorithms and neuro-inspired computing architectures [109], [110]. This approach has been used to demonstrate tasks such as recognition, classification, learning, and decision making [10], [107], [111]. The most widely studied memristive device for the hardware implementation of artificial neural networks in a crossbar configuration is the filamentary type (e.g., RRAM) [16], [112], [113]. This technology offers attractive features including a simple two-terminal structure, low-power operation, and good endurance and retention [114]. However, it suffers from significant device-to-device and cycle-to-cycle variability [115], as well as from abruptness of resistance modulation [116], due to the inherent filamentary operation [117]. This abruptness is undesirable for neuromorphic systems and can be eliminated from the characterization of RRAM synaptic behavior through the application of a forming step (i.e., initial generation of the conductive filament). Nonetheless, the dynamic range (i.e., resistance modulation range) after the formation of the conductive filament is limited, as further changes in resistance result only from modulation of the filament cross-sectional area [118]. Moreover, the requirement of a forming step and the significant variation of the forming/set voltage [119] introduce additional system-level complexity (and associated cost), unwanted for the efficient hardware implementation of artificial neural networks. Because of these limitations, synaptic devices with alternative resistive switching mechanisms are desirable.
Recently, charge-trapping synaptic transistors have been proposed as an alternative for the hardware implementation of artificial neural networks. Devices with Si [120] as well as random-network CNT channels [105] have been demonstrated with promising preliminary results. It is well established that CNT-based devices have exceptional scaling properties that extend beyond the Si roadmap, and they are considered a primary candidate for next-generation computing systems that vertically integrate logic and memory [121]. Importantly, achieving superior device uniformity and stability requires controlled placement of CNTs (i.e., alignment) as well as controlled semiconducting purity [122]. In this section, we present a wafer-scale aligned CNT synaptic transistor technology for large-scale neuromorphic systems. An advantage of CNTs for the development of charge-trapping synaptic transistors is their large sensitivity to charged defect scattering. Because of their small physical dimensions, CNT conductivity can change significantly as a result of changes in the charge state of nearby defects [123]. As we will show, the sensitivity of individual CNTs translates into measurable changes of CNT FET conductance, especially for FETs with aligned CNTs where transport and scattering effects are isolated to 1-D [99], resulting in a robust synaptic behavior with large dynamic range. Here we present a thorough analysis of the robust synaptic behavior in aligned CNT transistors based on DC and pulsed electrical characterization, discuss the implementation of aligned CNT-based artificial neural networks, and present system-level simulations of unsupervised learning for pattern recognition applications. Additionally, we discuss the synaptic tuning capability of an aligned CNT FET and its application to adaptive learning schemes for artificial neural networks and/or to the implementation of homeostatic regulation of neuron firing rates.
Single-walled carbon nanotubes (SWCNTs) and SWCNT FETs have exceptional 1-D electronic transport properties, making them an excellent candidate for various applications including high-speed logic devices [124], radio frequency (RF) transistors [125], and nonvolatile memory [126]. However, for most of these applications, the organized assembly (i.e., alignment) of SWCNTs with controlled semiconducting purity is critical for optimizing device performance and for developing practical, reliable, and scalable technologies [45]. In this work, a recently improved evaporation-driven process, named floating evaporative self-assembly (FESA) [127], has been utilized by Carbonics Inc. to fabricate highly aligned SWCNT devices at the wafer level (Figure 3.1 (a)). Figure 3.1 (b) is the optical image of a multifingered top-gated aligned CNT FET. The scanning electron microscope (SEM) images of a gate region and of the aligned SWCNTs are shown in Figure 3.1 (c) and (d), respectively. A schematic of the self-aligned T-gate transistor structure is illustrated in Figure 3.1 (e), and the SEM image of the finalized device including self-aligned source/drain regions is shown in Figure 3.1 (f). The T-gate structure is characteristic of the RF applications for which these devices were initially designed.

Figure 3.1: (a) Aligned CNT FET wafer fabricated by Carbonics. (b) Top-gated aligned CNT FET test structures (inset is the zoomed-in view of the channel region from a 10-finger device; each channel finger is 20 µm wide). (c) Scanning electron microscope (SEM) image of the aligned CNT FET active region. (d) SEM of the aligned CNT channel. (e) Cross-sectional schematic of the aligned CNT FET. (f) Top view of the aligned CNT FET including T-shaped top-gate and self-aligned source/drain regions. (g) Conceptual back-end-of-line (BEOL) integration of aligned CNT FETs for artificial neural network implementation in crossbar configuration.
It enhances gate control, helps scale down the channel length, and reduces parasitic capacitance [128]. Thus, it can enhance the dynamic behavior of gate-bias-dependent charge-trapping mechanisms and improve the performance of aligned CNT synaptic transistors, enabling faster operation. This wafer-level process is fully compatible with CMOS, owing to the low-temperature fabrication of aligned CNT devices. Thus, it is feasible to achieve 3-D integration of aligned CNT devices and CMOS circuits to enable non-von-Neumann architectures, such as neuromorphic topologies, that overcome the communication bottleneck between memory and logic. Figure 3.1 (g) illustrates the conceptual back-end-of-line (BEOL) 3-D integration of aligned CNT and CMOS for neuromorphic computing systems. In this architecture, aligned CNT FETs are connected in a crossbar configuration and operate as the synaptic elements of an artificial neural network, and neural circuits are implemented with CMOS.

Figure 3.2: (a) Dual-sweep Id - Vgs characteristics of aligned CNT FETs for Vds = -1.0 and -0.05 V revealing large gate hysteresis. (b) Dependence of hysteresis window on the voltage sweep range of dual-sweep Id - Vgs measurements. (c) Id - Vds characteristics for increasing Vgs. (d) Multiple cycles of dual-sweep Id - Vgs indicating repeatability of hysteresis effects. (e) Distribution of hysteresis plotted as a function of the on/off ratio (data from 691 devices). (f) Energy band diagram illustrating charge trapping effects in aligned CNT FETs.

We first present the electrical characteristics of the aligned CNT FETs, followed by the demonstration and analysis of their synaptic properties. Later, we describe the aligned CNT-based neuromorphic crossbar configuration as well as the implementation and performance of artificial spiking neural networks for pattern recognition based on unsupervised learning. Figure 3.2 (a) plots the drain current (Id) as a function of the gate-to-source voltage (Vgs) for a drain bias (Vd) of -1.0 V and -50 mV from a CNT FET with a channel width to length ratio of W/L ≈ 60 µm/1 µm. These data are from a six-finger top-gated CNT FET with 20 µm channel width per finger (only three of the six channels, which are divided between two drain electrodes, were measured, for a total of 60 µm). P-type operation in the CNT FET is indicated by the exponentially increasing (negative) current with increasing -Vgs, resulting from hole conduction in the valence band of the CNTs. The dual sweeps in Figure 3.2 (a) also indicate large counterclockwise gate hysteresis, attributed to a dynamic screening of the electric field due to charge injection/emission (i.e., trapping/detrapping) near and/or at the interface between the CNTs and the 4.6 nm thick HfO2 gate dielectric. Figure 3.2 (b) plots the dual-sweep transfer characteristics with Vd = -0.05 V, measured with increasing gate sweep range from ±0.5 to ±2.0 V. Increasing the gate sweep range allows accessing a wider range of energetically distributed traps and enhances the field-driven tunneling mechanisms that allow charge trapping/detrapping. As discussed below, this voltage control of trap occupancy allows gradually modulating charge-induced electrostatic and scattering effects, resulting in a robust synaptic device operation.
For completeness, in Figure 3.2 (c) we plot the family of Id-Vds curves obtained with increasing Vgs from 0 V to -1.0 V in steps of -0.5 V. We plot multiple cycles of dual-sweep Id-Vgs measurements from three different devices in Figure 3.2 (d), to demonstrate the repeatability of the charge-trapping effects and their impact on the hysteresis and electrical characteristics of the CNT FETs. Having a sufficiently large on/off ratio is important for achieving synaptic operation with a large dynamic range (i.e., a large range of conductance modulation). We experimentally verify that the aligned top-gate CNT FETs can simultaneously provide a sufficiently large on/off ratio (e.g., >10) and hysteresis window (e.g., >0.4 V) by extracting these parameters from a large set of 691 measured devices. Figure 3.2 (e) plots the resulting distribution, which concentrates around an on/off ratio of ~40 and a hysteresis window of ~0.9 V, with a long tail spreading to larger on/off ratios. The tightness of the hysteresis window distribution is a good indicator of uniformity in the charge-trapping dynamics. In short-channel CNT FETs, electronic transport is quasi-ballistic (near-ballistic) [129]. This also helps achieve a robust synaptic behavior and large dynamic range, as it enhances the sensitivity of CNT FET conductance to charged defects near the channel. When carriers can travel without being scattered by other channel impurities, Coulomb scattering induced by changes in the charge state of nearby defects can have a large impact on conductance [123]. In Figure 3.2 (f), the energy band diagram across the gate/HfO2/CNT regions of the device illustrates the trapping mechanisms responsible for hysteresis and for the synaptic behavior of the aligned CNT transistors.
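One way the two figures of merit above could be pulled from a dual-sweep Id-Vgs measurement is sketched below. The definitions are assumptions, not the procedure used for the 691 devices: the hysteresis window is taken as the gate-voltage separation of the two sweep branches at a fixed reference current, and the on/off ratio as the largest over the smallest current magnitude. The function names and the sigmoidal p-type branches are synthetic stand-ins chosen so that the result lands near the typical values quoted in the text.

```python
import math

def crossing_voltage(v, i, i_ref):
    """Gate voltage where |I_d| crosses i_ref, by linear interpolation in log10|I_d|."""
    target = math.log10(i_ref)
    for k in range(len(v) - 1):
        y0, y1 = math.log10(abs(i[k])), math.log10(abs(i[k + 1]))
        if (y0 - target) * (y1 - target) <= 0 and y0 != y1:
            return v[k] + (target - y0) * (v[k + 1] - v[k]) / (y1 - y0)
    raise ValueError("reference current not crossed")

def hysteresis_and_ratio(v, i_fwd, i_rev, i_ref):
    """Return (hysteresis window in V, on/off ratio) for one dual sweep."""
    window = abs(crossing_voltage(v, i_rev, i_ref) - crossing_voltage(v, i_fwd, i_ref))
    mags = [abs(x) for x in i_fwd + i_rev]
    return window, max(mags) / min(mags)

def toy_branch(v, v_shift):
    """Synthetic p-type transfer curve: 1 nA floor plus a 40 nA sigmoidal on-current."""
    return [1e-9 + 4e-8 / (1.0 + math.exp((x - v_shift) / 0.1)) for x in v]

v_gs = [-2.0 + 0.05 * k for k in range(81)]   # -2 V to +2 V sweep grid
i_fwd = toy_branch(v_gs, -0.45)               # synthetic branch for one sweep direction
i_rev = toy_branch(v_gs, 0.45)                # same curve shifted by the trapped charge
window, ratio = hysteresis_and_ratio(v_gs, i_fwd, i_rev, 1e-8)
print(f"hysteresis window ~ {window:.2f} V, on/off ratio ~ {ratio:.0f}")
```

Applied per device, a routine of this kind would yield one point in a scatter plot of hysteresis window against on/off ratio.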
With a negative bias applied at the gate, the energy level of near-interfacial traps in the HfO2 dielectric layer is shifted upward (due to band bending), and a fraction of the traps that were initially located below the Fermi level in the CNT channel (EF) will now be located above EF. These traps gradually change their occupancy, since at this biasing condition there is a large hole population in the CNT channel that can occupy the trap energy levels ET (i.e., hole trapping), resulting in a net positive change in the charge contribution due to traps. Similarly, when the gate bias is positive, the bands bend in the opposite direction, resulting in a net negative change in the trap charge contribution.

We note that the trap energy distribution (relative to EF) depends on bias (i.e., due to band bending) as well as position. Traps located further away from the interface see a larger shift in their energy level as a function of bias, but are also less likely to change their occupancy, as the tunneling probability of carriers from the CNT channel decreases exponentially with distance from the interface [74]. Thus, only a fraction of near-interfacial traps, those having energy levels centered around EF, will dynamically change their charge state as a function of bias and affect the electrostatic and transport properties of the device. In CNT FETs, traps along the surface of the dielectric but not directly in contact with the CNT (i.e., surface traps) may also contribute to charging effects [130]. The trap charge state transitions are not instantaneous and can have long-term effects, which result in the gate hysteresis and memory effects responsible for the synaptic behavior of the aligned CNT FETs.

The synaptic properties of the aligned CNT FETs are experimentally analyzed using pulsed electrical measurements.
As illustrated in Figure 3.3 (a), the source terminal of the aligned CNT FETs is connected to a ground reference, while a series of gate-to-source (Vgs) and drain-to-source (Vds) voltage pulses are applied to the device under test during the experiment. To characterize synaptic potentiation, a short positive Vgs pulse with amplitude Vpot and width tw is applied as indicated in Figure 3.3 (b).

Figure 3.3: (a) Biasing configuration for pulsed measurements of synaptic properties of aligned CNT FETs. (b) Diagram of the pulsed measurements for long-term potentiation and long-term depression. (c) Measured synaptic characteristics of an aligned CNT FET. (d, e) Tuning the synaptic properties of aligned CNT FETs with adjustment of the potentiating/depressing voltage pulse amplitudes. (f) Reduced pulse amplitude improves linearity and stability of the synaptic response with a slight reduction in dynamic range.

Following the application of the Vgs = Vpot pulse, a small bias of Vdsm is applied between drain and source to measure Id (at Vgs = 0 V), and the process is repeated for a specified number of potentiating pulses. Similarly, synaptic depression is characterized by applying a short negative gate-to-source voltage pulse with amplitude Vdep and width tw, followed by a small Vdsm bias to measure Id. In Figure 3.3 (c) we plot the synaptic characteristics of an aligned CNT FET measured with 20 potentiating and 20 depressing voltage pulses having amplitudes Vpot = 2 V and Vdep = -1.4 V, respectively, and tw = 10 µs. For the measurements of Id, Vdsm = -1.0 V was applied for ~0.1 s. The same device is measured 10 times (gray solid lines), and the mean is extracted (solid blue line with circles).
The results in Figure 3.3 (c) reveal good repeatability of the synaptic characteristics, a large dynamic range evident from the >1 order of magnitude modulation of Id, and good analog programmability (i.e., fine synaptic resolution). We note that each potentiating and depressing pulse has the same amplitude and width; we emphasize this because previous works have used pulse trains with incremental amplitudes and/or widths to improve the synaptic response [131]. However, it is not clear how these incremental pre/postsynaptic pulse features can be practically implemented in neuromorphic systems.

Compared to these previous works, which are mostly based on filamentary resistive-switching devices (e.g., RRAM), the aligned CNT devices have improved synaptic properties due to the inherent charge-trapping mechanisms responsible for conductance modulation. Filamentary devices generally exhibit an abrupt transition in conductance through a "forming step", during which the creation of the conductive filament is initiated. Following the creation of the conductive filamentary path, only small changes in conductance attributed to the widening of the filament are typically achieved, resulting in a limited dynamic range.

In recent work [132], it was determined that conductance modulation of ~100% in random network CNT FETs enables better-performing neuromorphic system operation compared to conventional memristors, which typically achieve <30%. That work also indicated that CNT devices with higher semiconducting purity and isolated nanotubes may provide improvements in synaptic performance. Here, we demonstrate that in aligned CNT FETs, where transport is isolated to individual nanotubes with high semiconducting purity, we can achieve >1 order of magnitude conductance modulation, a significant improvement over random network CNT FETs.
Charge trapping in aligned CNT FETs not only eliminates the need for a forming step but also enables gradual changes in the conductance, resulting in a robust and stable synaptic response. However, in some cases we can still observe a sharp transition after the first depressing pulse (e.g., Figure 3.3 (c)). To eliminate this abruptness, we explore tuning of the synaptic characteristics through adjustment of the pulse amplitudes. In Figure 3.3 (d), we show independent tuning of synaptic depression based on measurements with a fixed Vpot = 2 V and Vdep = -2.0, -1.3, and -1.0 V (same device). We note that adjusting only Vdep results in asymmetric synaptic characteristics. It is not yet clear how this asymmetry may affect the implementation of specific neuromorphic systems or machine learning algorithms. Nevertheless, it is possible to avoid the asymmetry by simultaneously adjusting Vpot and Vdep, with a slight trade-off in dynamic range, as shown in Figure 3.3 (e). Figure 3.3 (f) shows the synaptic characteristics from 10 repeated measurements (gray lines) of the same device using Vpot = 1.4 V and Vdep = -1.4 V, as well as the mean (solid blue line with circles). The results in Figure 3.3 (f) reveal better linearity (less abruptness) in conductance modulation while maintaining a large (~1 order of magnitude) dynamic range.

To further explore the endurance, robustness, and tuning of the synaptic properties of aligned CNT FETs, we tested a large number of potentiation/depression cycles in a single device. In Figure 3.4 (a) (top) we plot Id measurements from 1000 consecutive synaptic characterization cycles using Vpot = 1.6 V, Vdep = -1.6 V, tw = 2 µs, and Vdsm = -50 mV. From each cycle we extract Id after 0, 2, 4, and 16 potentiating pulses and plot them as a function of the cycle number (bottom). The results in Figure 3.4 (a) illustrate the endurance and robustness of the charge-trapping-based synaptic behavior of aligned CNT FETs.
For the same device, we repeat the measurement of 1000 consecutive cycles with Vpot = 1.4, 1.2, and 1.0 V, using Vdep = -1.6 V in all cases, as plotted in Figure 3.4 (b), (c), (d) (top). Similarly, we plot the extractions of Id after 0, 2, 4, and 16 potentiating pulses for all cases as a function of the cycle number in Figure 3.4 (b), (c), (d) (bottom), respectively. These results verify the stability and robustness of the synaptic performance of aligned CNT FETs, as well as the precise tuning capability based on adjusting the potentiating voltage pulse amplitude.

We also verify the long-term retention of synaptic weights in aligned CNT FETs through the time-dependent sampling of Id following the programming of various states (i.e., after various numbers of potentiating/depressing pulses). Figure 3.5 (a) plots Id vs time over approximately 4 decades of time (up to 1 ks), indicating only a small loss of the extreme states that correspond to the largest/smallest programmed channel conductance. However, in many neuromorphic computing applications and machine learning algorithm implementations, synaptic weight updating may occur at much faster rates than the time scale over which we measure this slight degradation in retention. Moreover, even with this reduction in the window of allowed programmed states, the dynamic range is still sufficiently large (3x) to enable adequate analog programmability of the synaptic weights with high resolution. Nonetheless, we expect that this issue may be readily resolved by engineering the high-k dielectric trapping layer and/or introducing alternative layers with better trapping characteristics.

Figure 3.4: (a-d) Multiple cycles of synaptic properties characterized with repeated (1000) pulsed measurements (tw = 2 µs, Vdsm = -50 mV, Vdep = -1.6 V). Each graph is for a different amplitude of the potentiating voltage pulse, ranging from Vpot = 1.6 to 1.0 V. Top: Id vs pulse number for all 1000 cycles; bottom: extraction of Id at four different levels (i.e., after 0, 2, 4, and 16 pulses) vs cycle number.

Figure 3.5: (a) Retention test showing samples of the programmed Id as a function of time immediately following the pulsed programming (measured at Vds = -50 mV and Vgs = 0 V, for a decreasing number of potentiating pulses prior to the retention test). (b) Collection of all data from Figure 3.4 (a)-(d) and model calculations indicating the impact of Vpot on the abruptness and dynamic range of the aligned CNT FET conductance modulation.

Finally, in Figure 3.5 (b) we show a combined plot with all 1000 cycles from each case of Vpot (i.e., 1.6, 1.4, 1.2, and 1.0 V) to better illustrate the repeatability of the measurements and to clearly indicate the impact of Vpot on the synaptic response. As shown, a higher Vpot results in a larger dynamic range but also increases the abruptness of the pulse-induced conductance modulation (i.e., the conductance changes more with each pulse).
In Figure 3.5 (b) the color-coded solid lines are experimental data and the solid black lines with circles are calculations based on a recursive model for the aligned CNT FET synaptic characteristics. In the following section we provide more details on the model and describe the impact of conductance modulation abruptness and dynamic range on the unsupervised learning pattern recognition function of spiking neural networks.

Figure 3.6: (a) Diagram illustrating the implementation of unsupervised learning for pattern recognition in a spiking neural network with aligned CNT synaptic devices. (b) Simulated time-dependent current in the postsynaptic (output) neurons. (c) Characteristics of output neuron potentials as simulated by an integrate and fire function, indicating the firing of the postsynaptic neuron spike as well as lateral inhibition. (d) Experimental data and model calculations of the aligned CNT FET synaptic response used in the simulations of MNIST data set pattern recognition.

Synaptic devices such as the charge-trapping aligned CNT FETs are of great interest for the hardware implementation of large-scale neural networks for neuromorphic computing systems. A popular demonstration of the type of functions that can be efficiently implemented on neuromorphic systems is that of pattern recognition based on unsupervised learning in artificial spiking neural networks. Here, we present simulations of pattern recognition using the MNIST handwritten digit data set based on a simplified spike-timing-dependent plasticity scheme modeled on large arrays of aligned CNT synaptic transistors [133].
We utilize an experimentally verified model of the synaptic characteristics of aligned CNT FETs and investigate the impact of the conductance modulation abruptness and dynamic range on the recognition rate and on the learning dynamics of the network. The implementation is illustrated in Figure 3.6 (a). The data set consists of 60000 training images and 10000 test images. The training images are presented to the network as input voltage pulses, which are applied at the rows of the implemented crossbar array architecture. Here, the input vector represents the intensity of all 28 x 28 pixels from the training image, translated into voltage pulses having a width directly proportional to the intensity of the corresponding pixel. At each cross point of the array, the gate of an aligned CNT synaptic transistor is connected to the input row (presynaptic neuron), and the source is connected to the output column (postsynaptic neuron). The drain is biased to a small negative voltage with respect to the source to enable current flow in the channel of the aligned CNT FETs. The currents flowing through all of the synaptic devices connected to each column are summed at the postsynaptic neurons. Mathematically, the current in column j can be expressed using Kirchhoff's current law as

$$I_j(t) = \sum_i x_i(t)\, G_{ij},$$

where xi(t) is the input voltage pulse at row i and Gij is the conductance of the aligned CNT FET connecting row i to column j. Figure 3.6 (b) illustrates the output currents from an array with 10 output neurons during the 100 ms that a training image is presented to the network.

In the spiking neural network implementation, Gij is updated based on a simplified spike-timing-dependent plasticity (STDP) scheme [134]. In this STDP scheme, a leaky integrate and fire operation is executed at each output to obtain the neuron potential Xj(t), expressed as

$$\frac{dX_j(t)}{dt} - \frac{X_j(t)}{\tau} = \frac{I_j(t)}{\tau}.$$
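The input encoding and the column-current summation described above can be sketched in a few lines. This is an illustrative sketch only: the 8-bit intensity scale (`full_scale=255`), the pulse amplitude `v_in`, and all function names are assumptions introduced here, while the 100 ms presentation window is taken from the text.

```python
def pulse_width(intensity, t_present=100e-3, full_scale=255):
    """Pulse width proportional to pixel intensity (0..full_scale)."""
    return t_present * intensity / full_scale


def input_vector(widths, t, v_in=1.0):
    """x_i(t): v_in while the pulse on row i is still high, otherwise 0."""
    return [v_in if t < w else 0.0 for w in widths]


def column_currents(x, G):
    """Kirchhoff's current law per column: I_j = sum_i x_i * G[i][j]."""
    return [sum(x[i] * G[i][j] for i in range(len(x)))
            for j in range(len(G[0]))]


# Two input rows, two output columns; conductances are placeholder values.
widths = [pulse_width(255), pulse_width(64)]
G = [[1e-6, 2e-6],
     [3e-6, 4e-6]]
print(column_currents(input_vector(widths, t=0.01), G))  # both pulses high
print(column_currents(input_vector(widths, t=0.05), G))  # only row 0 high
```

Because the pulse amplitude is fixed and only the width carries the pixel intensity, the instantaneous column current depends on which rows are still high at time t, which is what makes the relative timing of input pulses and output spikes meaningful.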
When any of the output neuron potentials exceeds a specified threshold (Xth), a postsynaptic spike is triggered, firing the application of a voltage pulse at the corresponding column and resetting all Xj(t) to zero. Figure 3.6 (c) plots Xj(t) corresponding to Ij(t) in Figure 3.6 (b), illustrating the triggering of the postsynaptic voltage pulse from the output neuron that first reaches Xth (neuron 2 in this case). Also indicated in Figure 3.6 (c) is the implementation of lateral inhibition, which consists of holding Xj(t) at zero for a short period of time (10 ms in this case) for neurons other than the one that has last fired, to prevent a different neuron from firing in response to the same stimulus (i.e., a winner-takes-all approach).

The firing of the postsynaptic spike delivers a Vgs voltage pulse across the aligned CNT FETs connected to the postsynaptic neuron that has fired, resulting in a charge-trapping-induced update of their channel conductance. This change in the conductance (ΔG) is positive or negative depending on the relative timing of the pre- and postsynaptic spikes. In this implementation, synaptic potentiation (i.e., positive ΔG) occurs for all aligned CNT FETs whose input pulse width (tin) exceeds the time at which the postsynaptic spike is triggered (tout), and depression occurs for devices with tin < tout. In other words, devices that have an input voltage during the arrival of the output spike will have a small increase in their conductance, and devices without an input voltage during the arrival of the output spike will have a small decrease in their conductance.
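The firing, reset, lateral inhibition, and sign-only update rule just described can be put together in a short simulation of a single image presentation. This is a sketch under stated assumptions rather than the simulator used in this work: it assumes the conventional leaky form in which the potential relaxes toward the input current with time constant `tau` (so the sign convention of the leak term is an assumption), a forward-Euler time step `dt`, and arbitrary units for `G` and `x_th`. The function name and default values other than the 100 ms window and 10 ms inhibition are hypothetical.

```python
def present_image(widths, G, x_th, tau=10e-3, t_present=100e-3,
                  dt=1e-3, t_inhibit=10e-3, v_in=1.0):
    """Simulate one presentation; return a list of (t_out, winner, signs).

    signs[i] is +1 (potentiate) if the input pulse on row i is still high
    when the output spike occurs (t_in > t_out), and -1 (depress) otherwise.
    """
    n_rows, n_cols = len(G), len(G[0])
    X = [0.0] * n_cols                  # neuron potentials
    inhibited_until = [0.0] * n_cols    # end time of lateral inhibition
    events = []
    for k in range(int(round(t_present / dt))):
        t = k * dt
        x = [v_in if t < w else 0.0 for w in widths]
        for j in range(n_cols):
            if t < inhibited_until[j]:
                X[j] = 0.0              # held at zero while inhibited
                continue
            I_j = sum(x[i] * G[i][j] for i in range(n_rows))
            X[j] += dt * (I_j - X[j]) / tau   # leaky integration (assumed form)
        winner = max(range(n_cols), key=lambda j: X[j])
        if X[winner] >= x_th:
            signs = [+1 if widths[i] > t else -1 for i in range(n_rows)]
            events.append((t, winner, signs))
            X = [0.0] * n_cols          # reset all potentials on a spike
            for j in range(n_cols):
                if j != winner:
                    inhibited_until[j] = t + t_inhibit
    return events


# Row 0 is active for the whole window, row 1 is silent (placeholder values).
events = present_image(widths=[0.1, 0.0],
                       G=[[1.0, 0.2], [0.2, 1.0]], x_th=0.5)
print(events[0])
```

Only the devices in the winning column would receive the update indicated by `signs`; the magnitude of each update is left to a separate device model.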
We note that other implementations aimed at realizing a biologically plausible STDP scheme attempt to achieve a ΔG that is proportional to Δt = tin - tout. Instead, we adopt a simplified scheme for practical hardware implementation, where ΔG depends only on the sign of Δt and can be realized with CMOS IC processes using only square pre- and postsynaptic voltage pulses.

Figure 3.7: (a) Recognition rate as a function of training number for arrays with increasing number of output neurons. (b) Recognition rate after 60,000 training cycles as a function of the number of output neurons. (c) Conductance map of 20 output neurons after training.

Figure 3.6 (d) shows experimental data and model calculations for ΔG resulting from consecutive potentiating and depressing Vgs voltage pulses applied to an aligned CNT synaptic transistor. Calculations are from a recursive model required for spiking neural network simulations, where updates in conductance are obtained as

$$\text{potentiation:}\quad \Delta G = pt\,(R_{pt})^{pte} \qquad (3.1)$$

$$\text{depression:}\quad \Delta G = dp\,(R_{dp})^{dpe} \qquad (3.2)$$

where

$$R_{pt} = \frac{G_{max} - G}{G_{max} - G_{min}} \qquad (3.3)$$

$$R_{dp} = \frac{G - G_{min}}{G_{max} - G_{min}} \qquad (3.4)$$

In eqs 3.1 and 3.2, pt, pte, and Rpt are obtained from fitting the synaptic potentiation characteristics, and dp, dpe, and Rdp are obtained from fitting the synaptic depression characteristics. Rpt and Rdp are associated with the rates at which the conductance is increased/decreased, given the current state of the device as determined by the difference between the conductance and the max/min values. This modeling approach can be used to explore the effects of the synaptic device characteristics on the neuromorphic system-level performance (e.g., in this case the recognition rate).
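A recursive update of this kind is straightforward to iterate pulse by pulse. The sketch below assumes that Rpt and Rdp are the normalized distances of G from Gmax and Gmin, that dp is a positive magnitude subtracted on depression, and that results are clamped to [Gmin, Gmax]; these conventions and all parameter values are illustrative assumptions, not fitted values from the measurements.

```python
def _clamp01(r):
    """Keep the normalized distance inside [0, 1] against rounding error."""
    return max(0.0, min(1.0, r))


def potentiate(g, g_min, g_max, pt, pte):
    """One potentiating pulse: G grows by pt * Rpt**pte, bounded by g_max."""
    r_pt = _clamp01((g_max - g) / (g_max - g_min))
    return min(g_max, g + pt * r_pt ** pte)


def depress(g, g_min, g_max, dp, dpe):
    """One depressing pulse: G shrinks by dp * Rdp**dpe, bounded by g_min."""
    r_dp = _clamp01((g - g_min) / (g_max - g_min))
    return max(g_min, g - dp * r_dp ** dpe)


def pulse_train(g0, n_pot, n_dep, g_min, g_max, pt, pte, dp, dpe):
    """Conductance after each of n_pot potentiating then n_dep depressing pulses."""
    trace = [g0]
    for _ in range(n_pot):
        trace.append(potentiate(trace[-1], g_min, g_max, pt, pte))
    for _ in range(n_dep):
        trace.append(depress(trace[-1], g_min, g_max, dp, dpe))
    return trace


# Normalized conductance with placeholder parameters (not fitted values).
trace = pulse_train(g0=0.0, n_pot=20, n_dep=20, g_min=0.0, g_max=1.0,
                    pt=0.2, pte=1.0, dp=0.2, dpe=1.0)
print([round(g, 3) for g in trace[:5]])
```

In this form, larger pt or dp makes each step more abrupt, while the exponents shape how quickly the steps shrink as G approaches its bounds, which mirrors the trade-off between abruptness and dynamic range discussed in the text.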
The recursive model is similar to those used in previous modeling work, but is formulated for easier interpretation and to better bound the conductance to the specified Gmin and Gmax, resulting in improved simulation stability.

In Figure 3.7 (a), we present the results of the pattern recognition simulations using the experimentally verified model of the aligned CNT synaptic transistors. The results show the recognition rate as a function of the training number for arrays with 10, 20, 40, and 80 output neurons. Each case is simulated five times, and we plot the mean value with error bars of one standard deviation. For each simulation we present a fraction of the training set images and then perform a recognition test using all 10000 test images. During the test we keep track of spiking activity and determine the recognition rate a posteriori by assigning each neuron to the digit for which it spiked the most and calculating the ratio of occurrences in which the assigned neuron spiked to the total number of spikes for a given digit. The results presented in Figure 3.7 (a) are the average over all digits. Clearly, the recognition rate improves with training and also improves with an increasing number of output neurons, as these provide specialization to different styles of handwriting for the same digits, resulting in improved accuracy of the algorithm. Figure 3.7 (b) plots the recognition rate as a function of the number of output neurons after 60000 training steps. In Figure 3.7 (c), we plot conductance maps for the case of 20 output neurons, which correspond to the conductance of all of the aligned CNT FETs connected to each column in the network (again after all 60000 training steps).

Figure 3.8: Improvement in recognition rate as a function of training number in aligned CNT FET spiking neural networks with increasing amplitude of potentiating voltage pulses. Learning rate can be optimized to achieve improvements in recognition with reduced number of training cycles. (Plot annotations: Vpot = 1.0, 1.6, and 2.0 V with Vdep = -1.6 V; tw = 2 µs; Vdsm = -50 mV; 20 output neurons.)

In Figure 3.5 (b) we presented experimental data and model calculations demonstrating the impact of the amplitude of the potentiating voltage pulses (Vpot) on the synaptic characteristics of aligned CNT FETs. We showed that increasing Vpot resulted in a larger dynamic range but also increased the abruptness of the conductance modulation. In Figure 3.8 we now show simulation results from the unsupervised learning pattern recognition using model fits to experimental data from aligned CNT synaptic transistors with increasing Vpot. We calculate the improvement (Δ) in recognition rate as a function of training number for the cases of Vpot = 1.0 and 1.6 V and also extrapolate the model to the case of Vpot = 2.0 V. The results in Figure 3.8 show that the increased dynamic range and abruptness in modulation that result from increasing Vpot enhance the initial learning rate of the network (i.e., a larger slope during the initial training steps). This enhancement can provide a better recognition rate with a smaller number of training steps (e.g., in the case of Vpot = 1.6 V). However, the detrimental effects of an excessively abrupt conductance modulation resulting from further increasing Vpot can quickly saturate the improvement in recognition rate (i.e., it levels off at a smaller training number), limiting the accuracy achieved in the simulation for Vpot = 2.0 V. We note that the tuning of pre- and postsynaptic pulses can be applied globally or selectively on the network to enhance/decrease the learning rate of specific neurons and/or input patterns.
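The recognition rates compared in these simulations are determined a posteriori from spike counts recorded during the test phase. A minimal sketch of that bookkeeping is given below; the layout of `spike_counts` (one row per output neuron, one column per digit) and the choice to skip digits that produced no spikes are assumptions made for illustration.

```python
def recognition_rate(spike_counts):
    """Average per-digit recognition rate from a neurons x digits count table.

    spike_counts[n][d] is how often neuron n spiked while digit d was shown.
    Each neuron is assigned to the digit for which it spiked the most; the
    rate for digit d is the share of its spikes that came from neurons
    assigned to d.
    """
    n_digits = len(spike_counts[0])
    assigned = [max(range(n_digits), key=lambda d: row[d])
                for row in spike_counts]
    rates = []
    for d in range(n_digits):
        total = sum(row[d] for row in spike_counts)
        if total == 0:
            continue  # assumption: digits with no spikes are left out
        correct = sum(row[d] for row, a in zip(spike_counts, assigned)
                      if a == d)
        rates.append(correct / total)
    return sum(rates) / len(rates) if rates else 0.0


# Two neurons, two digits; the counts are made-up illustration values.
counts = [[8, 2],
          [1, 9]]
print(recognition_rate(counts))
```

With more output neurons than digits, several neurons can be assigned to the same digit, which is how specialization to different handwriting styles enters this metric.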
Thus, the tuning of the synaptic characteristics of aligned CNT FETs presents opportunities for developing neuromorphic systems and unsupervised learning algorithms with adaptive and/or selective learning properties enabled by control of the pre- and postsynaptic pulses.

In conclusion, a synaptic transistor technology for the implementation of large-scale neuromorphic systems is presented, based on the wafer-scale CMOS-compatible processing of CNT FETs with highly aligned nanotubes of high semiconducting purity and density. The charge-trapping mechanisms responsible for the synaptic properties of aligned CNT FETs are studied, and a detailed characterization based on DC and pulsed measurements is provided. A large dynamic range (i.e., >10x) with gradual long-term analog programmability of conductance using potentiating and depressing voltage pulses is characterized. The robustness of the device operation and the stability of the synaptic behavior are demonstrated with multiple cycles of consecutive potentiating/depressing voltage pulses and extraction of programmed conductance states. Tuning of the synaptic characteristics of aligned CNT FETs is shown, and trade-offs between the abruptness and stability of conductance modulation and the dynamic range are established. On the basis of the demonstrated robustness of the aligned CNT synaptic transistor, the hardware implementation of unsupervised learning for pattern recognition in spiking neural networks is simulated. The simulation is validated with experimental data from measurements of aligned CNT FET synaptic transistors and used to analyze the recognition rate of handwritten digits from the MNIST database. On the basis of the experimentally demonstrated tuning of the aligned CNT FET synaptic response, we show the impact of conductance modulation dynamic range and abruptness on the learning rate.
We show that tuning of the CNT synaptic characteristics enables optimization of the learning rate and achieves a higher recognition rate with a lower training number. We also discuss the tuning of aligned CNT synaptic behavior for developing neuromorphic algorithms with adaptive and/or selective learning characteristics.

3.2 Reconfigurable BP/SnSe Synaptic Device

In this section, a synaptic device is discussed that is distinctively different from the previously demonstrated heterosynaptic devices [35], [135] in terms of both operational characteristics and biological equivalence. Such heterosynaptic devices typically rely on a third active terminal to modulate the synaptic responses and resemble a biological synapse influenced by an external neuromodulator.

In neuroscience, an excitatory postsynaptic potential is a temporary depolarization of the postsynaptic membrane that makes the postsynaptic neuron more likely to fire an action potential [136]. In contrast, inhibitory postsynaptic potentials counteract the excitatory actions and make the postsynaptic neuron less likely to fire [137]. The type of synaptic effect, whether excitatory or inhibitory, is determined by the type of ion channels in the postsynaptic neuron activated by the specific neurotransmitter [138] released from the presynaptic neuron. A recent study showed the co-release of glutamate and γ-aminobutyric acid (GABA), the excitatory and inhibitory fast neurotransmitters, from a single axon terminal in neurons of the ventral tegmental area that project to the lateral habenula [139], allowing both excitatory and inhibitory postsynaptic potentials to be produced at the same synapse depending on the state of the presynaptic and postsynaptic neurons. Other studies have also shown that during mammalian brain development, synaptic activities in certain neurons with GABA neurotransmitters can switch from being excitatory to inhibitory [140].
In artificial synapses that mimic the operation of their biological counterparts, it is often desirable to reconfigure the same synapse between excitatory and inhibitory operations. The ability to reconfigure the synaptic effects in a single synaptic unit can offer desirable flexibility and versatility for artificial neural network and neuromorphic system design. However, such reconfigurability of synaptic effects has been difficult to realize in a single solid-state device. Traditional methods of building an artificial synapse typically rely on a circuit-based approach [141] that requires 10-20 transistors to realize one synapse. The conventional memristor-type [16] and transistor-type [105] artificial synapses can realize synaptic functions in a single semiconductor device but lack the ability to dynamically reconfigure between excitatory and inhibitory responses without the addition of a modulating terminal. In this work, we propose to mimic the biosynapse that can co-release both excitatory and inhibitory neurotransmitters using a tunable heterojunction formed between black phosphorus (BP) and tin selenide (SnSe), and to realize such reconfigurability between the excitatory and inhibitory synaptic effects. The synaptic behavior can be dynamically tuned by the electrostatic bias at both the input and output terminals of the device.

Figure 3.9 (a) shows the schematic of the proposed synaptic device structure and the bias conditions, and Figure 3.9 (b) shows the schematic of a biosynapse that can co-release both excitatory and inhibitory neurotransmitters. The device consists of the heterojunction between BP and SnSe. The BP thin films were first exfoliated from bulk crystals onto the SiO2 (90 nm)/Si substrate in a glovebox environment (oxygen <0.1 ppm, H2O <0.1 ppm). A single-crystalline SnSe flake was then exfoliated and transferred onto the BP flake to form BP-SnSe heterojunctions in the glovebox.
Subsequently, the sample was coated with a poly(methyl methacrylate) (PMMA) resist layer and patterned for metallization using a Raith 20 kV electron-beam lithography system. A 5 nm Cr adhesion layer followed by a 30 nm Au layer was then deposited by thermal evaporation using a Kurt J. Lesker Nano 36 system to form the contacts to the BP and SnSe flakes through a liftoff process. A ~2 nm thin layer of POx is formed between BP and SiO2 due to the oxidation of phosphorus, as confirmed by the EDX mapping of the device cross-section in Figure 3.9 (c).

Figure 3.9: BP-SnSe junction synaptic device. (a) Schematic of the BP-SnSe heterojunction synaptic device. The presynaptic input is applied at the silicon bottom gate terminal. The electrode contacting SnSe is grounded. The postsynaptic output is measured at the electrode contacting BP. Vbias is applied between BP and SnSe, and the voltage Vg is applied between the input terminal and SnSe. (b) Schematic of a biological synapse that can co-release excitatory and inhibitory neurotransmitters. (c) STEM image, EDX line profile (normalized intensity, a.u.), and EDX mapping (Sn, Se, P, O, Si) of the BP-SnSe junction device cross-section.

Both BP and SnSe are layered semiconductors with an orthorhombic crystal structure and bandgaps of 0.3 and 0.8 eV, respectively. The presynaptic signal is applied at the bottom silicon gate of the device, i.e., the presynaptic input terminal. The postsynaptic current (PSC) flows through the vertical junction between BP and SnSe. Depending on the bias condition on the silicon gate (Vg) and the bias voltage (Vbias) between the electrodes contacting BP and SnSe, the postsynaptic response of the device can be tuned over a wide range of characteristics.
Between the BP layer and the SiO2 gate dielectric, a thin layer of phosphorus oxide (POx), the native oxide of BP, is intentionally formed through exposure to oxygen. This POx layer functions as the charge-trapping layer that enables synaptic behavior in the device [34].

The cross-section of the device is characterized by transmission electron microscopy (TEM) and scanning TEM (STEM). The high-resolution TEM image in Figure 3.9 (c) clearly shows the highly crystalline layered lattice of BP, which has a thickness of ~6 nm. It is also clear that the junction between the SnSe layer and the BP layer is of high crystalline quality. We note that there is an amorphous layer between SiO2 and BP. To identify this amorphous layer and the material composition of each layer in the device, energy-dispersive X-ray spectroscopy (EDX) mapping was performed on the same region shown in the TEM image. As shown in Figure 3.9 (c), the crystalline SnSe layer shows the strong presence of both Sn and Se. The BP region shows the highly crystalline layered structure of the phosphorus (P) element. Oxygen (O) and silicon (Si) were observed in the SiO2 region, where phosphorus is absent. In the interface region between BP and SiO2, both P and O are observed, confirming that the interfacial layer is a phosphorus oxide with a thickness of ~2 nm. The presence of such a POx layer is also consistent with our previous BP TEM results [34]. The EDX line profile of the device cross-section clearly shows the presence of the POx and SiO2 layers. The POx layer is a key functional layer in this synaptic device: it traps and detraps electrons in response to the pulses at the input terminal, enabling the postsynaptic current change at the output of the device. The thickness of the SnSe layer is ~100 nm.
The BP-SnSe junction is fundamental to enabling the synaptic characteristics of the device and its reconfigurability between the excitatory and inhibitory responses. Device simulations are first developed to elucidate the operation mechanism that results in the tunable electronic characteristics of the BP-SnSe junction shown in Figure 3.10 (a), and hence its synaptic behavior. The simulated band profiles, as shown in Figure 3.10 (b), (c), indicate that this junction-based device can be configured into different operation regions by the bias Vg applied at the input terminal, i.e., the gate.

Figure 3.10: Tunable characteristics of the BP-SnSe heterojunction. (a) Schematic of the BP-SnSe junction. (b) Simulated band profiles at the junction between the BP and SnSe layers along the vertical direction indicated by the letter B for Vg = -20, 0, and +20 V, respectively. (c) Simulated band profile at the junction between the BP under the BP-SnSe junction and that outside the junction along the lateral direction indicated by the letter A for Vg = -20, 0, and +20 V, respectively. (d) Id-Vbias characteristics of the device at different Vg, showing the rectifying characteristics of the BP-SnSe heterojunction that is reconfigurable between p-p, i-p, and n-p junction types depending on the bias condition.
Both a vertical BP-SnSe heterojunction, whose band profiles are shown in Figure 3.10 (b), and a lateral homojunction in BP, whose band profiles are shown in Figure 3.10 (c), are formed in the current flow path. While the top SnSe layer is p-type [142] and insensitive to the applied bias Vg due to charge screening, the bottom BP layer can be effectively modulated. The relatively moderate 0.3 eV bandgap of BP facilitates the modulation of the BP layer by Vg from p-type to intrinsic and then to n-type in the nonstacking region, as shown in Figure 3.10 (c). A lateral homojunction forms at the boundary of the stacking (BP-SnSe overlap) and nonstacking (only BP) regions because the gate modulation of the bands is more effective in the nonstacking region of the BP layer than in the BP region under the SnSe layer. In the presence of a negative gate voltage of Vg = -20 V, the BP layer is modulated to p-type, and a p-p vertical heterojunction is formed between BP and SnSe, as shown by the simulated band profile in Figure 3.10 (b). With a positive gate voltage of Vg = 20 V, the BP layer can be modulated to n-type, and an n-p vertical heterojunction with rectifying I-V characteristics is expected. At zero gate bias, the BP layer in the nonstacking region is close to intrinsic and an i-p junction is formed in the current flow path. The intrinsic BP nonstacking region has a low charge density, which results in a large resistance and a low current when the gate is biased near zero in the transfer I-V characteristics. The tunable junction characteristics between the BP and SnSe layers predicted by the simulation are confirmed by the electrical measurement of the fabricated device. Figure 3.10 (d) shows the measured reconfigurable rectifying characteristics of the fabricated device.
The three regimes of different junction types are clearly observed, i.e., an n-p rectifying junction behavior at Vg = 20 V, an i-p rectifying junction behavior at Vg = 0 V, and a p-p nonrectifying junction behavior at Vg = -20 V.

Figure 3.11: Excitatory and inhibitory responses reconfigurable at both the presynaptic and postsynaptic terminals. (a) The magnitude of the current measured at the electrode contacting BP for different Vbias and Vg, plotted in logarithmic scale. Positive pulses applied at the input terminal will lead to the injection of electrons into the phosphorus oxide layer. For regions with the magnitude of the current increasing with Vg, the synaptic response will be excitatory. For regions with the magnitude of the current decreasing with Vg, the synaptic response will be inhibitory. The Id-Vg characteristics can be classified into three regimes based on the different junction types, i.e., p-p, i-p, and n-p. The current map can also be divided into four operation quadrants based on the horizontal ridge of zero Vbias and the diagonal ridge joining the points of minimum conductivity, both marked with the yellow dashed lines. (b) PSC in response to a 20 V input pulse at the input terminal for four different bias conditions corresponding to the points 1-4 in (a).

Figure 3.11 (a) shows the current in this ambipolar junction synaptic device as a function of both Vg and Vbias. The absolute value of the current is plotted as a color map in logarithmic scale. Based on the doping types of the junction between BP and SnSe, the plot can be divided into three regions, i.e., the n-p, i-p, and p-p junction regions, as predicted by the theoretical simulations. Furthermore, based on the horizontal ridge of minimum conductivity resulting from the zero bias condition of Vbias and the diagonal ridge joining the points of minimum conductivity as the junction changes from n-p to i-p (both marked with yellow dashed lines), the current map can be divided into four operation quadrants. When a positive input pulse is applied at the presynaptic input terminal, electrons will be injected and trapped inside the POx layer. This will shift the Fermi level in BP away from the conduction band and closer toward the valence band. For bias conditions in the two left quadrants of the current map in Figure 3.11 (a), this Fermi level shift will lead to an increase in the PSC and result in an excitatory postsynaptic potential. For bias conditions in the two right quadrants of the current map, this Fermi level shift will lead to a decrease in the PSC and, hence, an inhibitory postsynaptic potential. Figure 3.11 (b) shows the postsynaptic current measured at the output terminal of the device for four selected bias points 1-4 as indicated on the current map. The PSC is excitatory for bias points 2 (Vg = 10 V, Vbias = 2 V) and 3 (Vg = -5 V, Vbias = -5 V): when the positive input pulse is applied at the presynaptic terminal, the shift in the Fermi level in BP caused by electron injection into the POx layer leads to an increase in the PSC in both of these operation quadrants.
The response is inhibitory at bias point 4 (Vg = 10 V, Vbias = -2 V) since the PSC decreases in this operation quadrant when electrons are injected into the POx layer in response to the positive input pulse. Finally, the response is low at bias point 1 (Vg = -5 V, Vbias = 3 V) due to the relatively constant current as a function of Vg and Vbias in the vicinity of this bias condition. As a result, this synaptic device can be reconfigured between generating an excitatory PSC and an inhibitory PSC either by changing Vbias at the postsynaptic output from positive to negative, or by changing the baseline bias Vg at the presynaptic input from negative to positive, or vice versa. The device, hence, provides versatile reconfigurability with control knobs at either its input or output terminals. It also differentiates itself from the conventional heterosynaptic device [35], since the latter requires a third modulating terminal to adjust the synaptic responses, which resembles a biological synapse influenced by an external neuromodulator [35]. Typically, a positive gate baseline can maintain the synaptic state for a longer time (Figure 3.11 (b), case 2), while a negative gate baseline reduces that time (Figure 3.11 (b), case 3).

Figure 3.12: Potentiation, depression, and STDP for both the excitatory and inhibitory synaptic response modes. The weight change of the BP-SnSe synapse under positive (10 ms, 20 V pulses spaced 90 ms apart) and negative (10 ms, -20 V pulses spaced 90 ms apart) input pulse trains for (a) the excitatory response at Vg = 10 V and Vbias = 2 V and (b) the inhibitory response at Vg = 10 V and Vbias = -4 V. STDP characteristics for (c) the excitatory response mode and (d) the inhibitory response mode at the corresponding bias conditions in (a) and (b), respectively.

Figure 3.12 shows the key characteristics of biological synapses, namely potentiation, depression, and spike-timing-dependent plasticity (STDP), mimicked in this junction based artificial synaptic device for both the excitatory and inhibitory operation modes. Here, we show the characteristics of the device at both positive and negative Vbias across the BP-SnSe junction, which give excitatory and inhibitory responses, respectively, with the input pulse biased at a baseline of 10 V. As shown in Figure 3.12 (a) and Figure 3.12 (b), 20 positive pulses (V = 20 V, W = 10 ms) are consecutively applied at the presynaptic input of the device, followed by 20 negative pulses (V = -20 V, W = 10 ms). For a positive Vbias = 2 V (Figure 3.12 (a)), the device is biased in the excitatory mode.
The PSC increases rapidly in response to the first few positive input pulses before saturating at about 37% weight change, demonstrating potentiation behavior. The subsequent negative input spikes cause the PSC to decrease, resulting in depression behavior. For a negative Vbias = -4 V (Figure 3.12 (b)), the postsynaptic response of the device becomes inhibitory. The PSC decreases rapidly in response to the first few positive input spikes before saturating at about -46% weight change, resulting in depression behavior. The subsequent negative input spikes cause the PSC to increase, giving rise to synaptic potentiation. STDP is a key empirically observed characteristic of biological synapses, believed to be fundamental for many functions of the brain from learning to memory [143]. The STDP of the BP-SnSe junction synaptic device operating in the excitatory and inhibitory modes, at the same bias conditions as those in Figure 3.12 (a), (b), is shown in Figure 3.12 (c), (d), respectively. As shown in Figure 3.12 (c), for the synaptic device operating in the excitatory mode, when the presynaptic input pulse arrives before the postsynaptic action, it leads to the enhancement of the synaptic connectivity with positive weight change, i.e., potentiation. A longer positive time interval results in less potentiation. In contrast, when the postsynaptic signal arrives before the presynaptic signal, the result is the suppression of the synaptic connectivity, i.e., depression. A longer negative time interval reduces the negative weight change in the synapse. The STDP behavior is reversed for the same junction synaptic device operating in the inhibitory mode. As shown in Figure 3.12 (d), for the synaptic device operating in the inhibitory mode, when the presynaptic input pulse arrives before the postsynaptic action, it results in the weakening of the synaptic connection (depression) with negative weight change.
A longer positive time interval results in less depression. Conversely, when the presynaptic spike arrives after the postsynaptic spike, the synaptic connection is strengthened (potentiation). The measured behavior of the BP-SnSe junction synaptic devices agrees well with the STDP in both excitatory and inhibitory biological synapses [143]. The correlation time constants of the STDP behavior in both operation modes can be extracted by fitting the data with an exponential function. On the basis of the empirical relation of spike-timing-dependent plasticity, the dependence of the synaptic weight change on the time interval Δt between the presynaptic and postsynaptic pulses can be described as

\[
G(\Delta t) =
\begin{cases}
A_{+} \exp\left(-\Delta t/\tau_{+}\right), & \Delta t > 0 \\
-A_{-} \exp\left(\Delta t/\tau_{-}\right), & \Delta t < 0
\end{cases}
\qquad (3.5)
\]

The coefficients A+ and A- are positive for the excitatory synaptic operation and negative for the inhibitory synaptic operation of the device. The ranges of pre-to-postsynaptic interspike intervals over which the strengthening and weakening of synaptic connections occur are given by τ+ and τ-. As shown in Figure 3.12 (c), (d), both τ+ and τ- are around 20 ms, which is similar to the time scale of the STDP in typical biological systems.

Figure 3.13: Tuning the synaptic responses. (a) Synaptic weight changes in response to a 20 V input pulse at different Vg bias conditions for Vbias = +2 and -4 V. (b) Tunable strengths of the excitatory and inhibitory synaptic responses mimicked by the device with different bias conditions at the presynaptic and postsynaptic terminals.

The versatile tunability of the device synaptic characteristics, with control at both the presynaptic terminal and the postsynaptic terminal, is shown in Figure 3.13. For applications such as pattern recognition by offline training via software, more continuous tuning of the response is desirable. The versatility of the device to tune the response at both input and output terminals provides additional freedom to set the synaptic weight to a precise value. Figure 3.13 (a) summarizes the synaptic weight change of the device in response to a 20 V input pulse with different biases at the presynaptic and postsynaptic terminals. The weight change can be continuously tuned by either Vg or Vbias, applied at the presynaptic and postsynaptic terminals, respectively, and each of the terminals has the capability to reconfigure the synaptic device between excitatory and inhibitory responses. The detailed PSC changes at the different bias voltages are shown in Figure 3.13 (b). For positive bias (Vbias = 2 V) at the output terminal and an input baseline voltage equal to or lower than 0 V, the weight change is around or below 1%, which resembles two isolated neurons without significant synaptic connection. When the input baseline increases to 5, 10, and 15 V, the same device can be reconfigured to be a weakly excitatory, strongly excitatory, and weakly inhibitory synapse, respectively. In this way, both the inhibitory and excitatory synapses can be realized in the same device by simply changing the input baseline voltage. Moreover, by changing the bias voltage to negative (Vbias = -4 V), the weight profile under the different input baselines can be reconfigured.
When the input Vg baseline increases beyond -5 V, the characteristics of the device can be reconfigured from being a weakly excitatory synapse at Vg = -15, -10, and -5 V to a strongly excitatory synapse at 0 V. The response becomes strongly inhibitory in the vicinity of Vg = 10 V and weakly inhibitory at Vg = 15 V. Hence, this junction based synaptic device allows bilingual (both excitatory and inhibitory) responses with tunable strengths in the same artificial synapse, which can be reconfigured by either the input or the output terminal of the device without the need for a third modulating terminal. In summary, a junction based artificial synaptic device concept is proposed and experimentally demonstrated utilizing the BP-SnSe heterostructure for mimicking a biological synapse at which a single presynaptic neuron can release both excitatory and inhibitory neurotransmitters. The junction between the moderate bandgap material BP and SnSe gives rise to tunable rectifying electrical characteristics. Furthermore, the charge transfer between the native oxide of BP and the BP channel is utilized to achieve the synaptic behavior. The resulting device offers useful synaptic characteristics that are reconfigurable between the excitatory and inhibitory responses, resembling the biological synaptic activities of a single axon-dendritic synaptic junction that can co-release glutamate (excitatory) and GABA (inhibitory) neurotransmitters. With highly tunable and reconfigurable synaptic characteristics enabled at the single device level, this reconfigurable artificial synaptic device may simplify the design and enable useful functionalities in emerging neuromorphic computing systems.
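The exponential STDP relation of Eq. (3.5) discussed above can be sketched in a few lines of Python. This is a minimal illustration, not the fitting procedure used for Figure 3.12: the function name and the parameter values are assumptions chosen only to reflect the statements that A+ and A- are positive in the excitatory mode, negative in the inhibitory mode, and that both time constants are around 20 ms.

```python
import math


def stdp_weight_change(dt_ms, a_plus, a_minus, tau_plus_ms, tau_minus_ms):
    """Weight change for a pre-to-post spike interval dt_ms, following Eq. (3.5)."""
    if dt_ms > 0:
        # Presynaptic pulse arrives before the postsynaptic one.
        return a_plus * math.exp(-dt_ms / tau_plus_ms)
    if dt_ms < 0:
        # Postsynaptic pulse arrives before the presynaptic one.
        return -a_minus * math.exp(dt_ms / tau_minus_ms)
    return 0.0  # convention chosen here; Eq. (3.5) does not define dt = 0


# Illustrative (not fitted) parameters; amplitudes are in percent weight change.
EXCITATORY = dict(a_plus=15.0, a_minus=15.0, tau_plus_ms=20.0, tau_minus_ms=20.0)
INHIBITORY = dict(a_plus=-45.0, a_minus=-45.0, tau_plus_ms=20.0, tau_minus_ms=20.0)

if __name__ == "__main__":
    for dt in (-40.0, -10.0, 10.0, 40.0):
        print(dt,
              stdp_weight_change(dt, **EXCITATORY),
              stdp_weight_change(dt, **INHIBITORY))
```

With these sign conventions, the excitatory mode gives potentiation for positive intervals and depression for negative ones, and the inhibitory mode reverses both, with the magnitude decaying as the interval grows.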
Chapter 4
Two Dimensional Material Based Circuit Level Applications for Neuromorphic Computing

4.1 Reconfigurable Stochastic Neuron Based on Tin Oxide/MoS2 Hetero-memristor

Stochastic devices with exponential-class distribution are essential for implementing the Boltzmann machine (BM) and the associated simulated annealing that can represent and solve combinatorial optimization problems. To allow more efficient convergence and a better chance of reaching more optimal solutions, it is desirable that the simulated annealing process in the BM can follow quantitatively designed "cooling" strategies, which requires the statistical parameters of the stochastic device output to be dynamically tunable. However, there has been very limited research on stochastic semiconductor devices with controllable statistical distribution, and experimental study of their advantages in computing applications is currently missing. Here, we demonstrate a gate tunable tin oxide (SnOx)/molybdenum disulfide (MoS2) heterogeneous memristive device that can realize tunable stochastic dynamics in its output sampling characteristics. The inherent stochastic characteristics in these devices arise from the intrinsic randomness and energy distribution in the ionic motions. The device can sample exponential-class sigmoidal distributions analogous to the Fermi-Dirac distribution of physical systems with a quantitatively defined tunable "temperature" effect. The computing application of these tunable stochastic memristive devices is demonstrated in the implementation of a Boltzmann machine, which can enable simulated annealing with designated "cooling" strategies. Quantitative insights into the effect of different "cooling" strategies on improving the BM optimization process efficiency are also provided.
Electron devices with stochastic features are essential for the hardware realization of key emerging non-von-Neumann computing concepts such as the Boltzmann machine, a recurrent neural network with stochastic features analogous to the thermodynamics of real-world physical systems. The BM can be used to solve a broad range of combinatorial optimization problems with applications in classification, pattern recognition, feature learning and other emerging computing systems [144], [145]. Deriving its name from the Boltzmann distribution of statistical mechanics, the BM possesses an artificial notion of "temperature", and the controlled evolution of this "temperature" parameter during the optimization process [146], i.e. the "cooling" strategy, can profoundly impact the convergence efficiency of the BM and its chance of reaching a better cost energy minimization (or maximization, depending on the problem definition). To realize a hardware implementation of the BM that also allows "temperature" control, and hence the precise execution of a desired "cooling" strategy, it is essential to have electronic devices that can generate exponential-class stochastic sampling with dynamically tunable distribution parameters. The deterministic behavior of memristors has been commonly used in applications such as multiply-and-accumulate matrix calculation [10] and resistor-logic demultiplexers [147]. Their stochastic behavior is often intentionally suppressed [148] in such applications for the purpose of achieving accurate and reproducible computational results [149]. On the other hand, the rich stochastic behavior of memristors, which relies on ensembles of random movements of atoms and ions, offers new opportunities in energy-efficient computing applications [150]. With this stochastic property, one can generate random numbers [151] to encrypt information, implement physical unclonable functions [152], and realize artificial neurons with integrate-and-fire activations [58].
Furthermore, emerging computing schemes can use stochastic memristive devices as building blocks to mimic biological neural networks [153], whose functions, such as decision-making, can leverage the stochastic dynamics of neurons and synapses. However, a common challenge with previous stochastic memristors is the lack of means to precisely control and modulate the probability distribution associated with their randomness. Realizing such devices has been difficult because many device-generated random features in stochastic memristors or oscillators lack a stable probability distribution, which limits the chance of controlling it experimentally [154]. Additionally, with only two terminals in a common memristor, the probability distribution can only be influenced through the two-terminal bias, so the probability distribution of the device output cannot be tuned flexibly and precisely. In this work we overcome this challenge with a new three-terminal stochastic memristive device based on a tin oxide/MoS2 heterostructure, which demonstrates tunable statistical distributions enabled by the gate modulation. The inherent exponential-class stochastic characteristics of the device, arising from the intrinsic randomness and energy distribution in its ionic motions, are explored to realize sampling of exponential-class sigmoidal distributions that resemble the Fermi-Dirac distribution in physical systems. The device incorporates gate modulation that allows the efficient control of the stochastic features in the device output characteristics. The device enables the realization of a Boltzmann machine in which the reconfigurable statistics of the device allow different "cooling" strategies to be implemented during the optimization process. The effect of different "cooling" strategies on improving the optimization process efficiency of the BM is demonstrated experimentally.
Device Structure and Electrical Characteristics

Figure 4.1 (a) shows the schematic of this reconfigurable memristive device. A thin MoS2 layer is first mechanically exfoliated onto a Si wafer with a 285 nm thermally grown SiO2 layer on top. The sample is then treated in an Ar/H2 mixed gas environment at 350°C to clean the MoS2 surface. Subsequently, a thin tin oxide layer oxidized from SnSe is deposited on MoS2 and serves as the filament switching layer. Electron beam lithography is then used to transfer the patterns, followed by the evaporation of a 10 nm/40 nm Cr/Au metal stack, which forms the top electrode (TE). The Si substrate serves as a modulating gate, whose bias (Vg) can influence the filament formation dynamics in the tin oxide layer. The high-resolution scanning transmission electron microscopy (HR-STEM) image in Figure 4.1 (b) shows the cross section of the fabricated device and reveals that the tin oxide layer is amorphous. An energy-dispersive X-ray spectroscopy (EDX) scan in Figure 4.1 (c) indicates the elemental composition.

Figure 4.1: Device structure and electrical characteristics. (a) Schematic of the gate tunable memristive device. (b) The HR-STEM image of the fabricated device cross section. The scale bar is 5 nm. (c) EDX scan indicates the elemental composition. (d) Raman spectra for the SnSe sample before and after oxidation. The missing signature modes after oxidation indicate the full oxidation and amorphization of the SnSe sample. (e) Unipolar electrical switching characteristics of the device at Vg = 0 V. The set and reset voltages in the positive scan are 3.2 V and 2.8 V, and in the negative scan are -3.4 V and -3 V. (f) Modulation of the set voltage by the gate bias. When Vg decreases from 30 V to -20 V, the set voltage increases.
Figure 4.1 (d) plots the Raman spectra for the SnSe sample before and after oxidation, which leads to the formation of the SnOx layer. All signature modes of SnSe observed before oxidation, including the shear mode Ag1, the in-plane modes Ag2 and B3g, and the out-of-plane mode Ag3, are not detected after oxidation, indicating the full oxidation and amorphization of the SnSe sample [155]. The tin oxide film can also be synthesized using atomic layer deposition (ALD) [156], which produces films of similar quality to the direct oxidation method. Unipolar electrical switching characteristics of the device at Vg = 0 V are shown in Figure 4.1 (e). The device sets and resets at around 3.2 V and 2.8 V, respectively, in the positive bias, and at -3.4 V and -3 V, respectively, in the negative bias. The resistive switching behavior of this device is due to the field-assisted drift of oxygen ions, which leads to the formation and rupture of oxygen vacancy-type conductive filaments [111]. The insertion of the MoS2 layer in the device makes it possible to adjust the electron energy level in MoS2 by externally modulating the gate bias Vg, which can modulate both the contact energy barrier between the MoS2 and SnOx and the conductivity of the MoS2 sheet itself. Hence, as shown in Figure 4.1 (f), as the gate bias decreases from 30 V to -20 V, the electrostatic doping in MoS2 and the associated energy level decrease, leading to a reduction in the series conductivity and hence a gradual increase in the set voltage.

Sampling of exponential-class sigmoidal distribution

The filament formation process is stochastic due to the inherent random motion of oxygen ions. To extract this stochastic property quantitatively, a statistical study is carried out on the SET process. As shown in Figure 4.2 (a), the device is initially reset to the high resistance state and a bias VTE is applied to the device for up to 2 seconds.
During each set process, it takes a certain amount of time t (t ≤ 2 s) after the bias voltage is applied for the device to be set. This required bias time until set is stochastic in each trial. Furthermore, there is a certain chance that the device may still remain in the high resistance state after 2 seconds. Figure 4.2 (a) plots the device current characteristics as a function of time when this reset and set process was repeated 30 times at VTE = 6 V, 5 V, 4 V and 3 V, respectively, with Vg fixed at 0 V. At VTE = 6 V, the device is successfully set within the first 2 seconds in all 30 trials. At VTE = 5 V, 4 V, and 3 V, the device failed to set within the first 2 seconds in certain cases. Figure 4.2 (b) shows the histogram probability distribution, extracted from 30 trials, of the time required until the device becomes set.

Figure 4.2: (a) The SET process under different VTE. The initial state is reset to the high resistance state and a bias VTE is applied to the device for 2 seconds. (b) The experimentally extracted probability distribution of the bias time until SET occurrence for VTE = 3 V, 4 V, 5 V and 6 V, respectively.

If we consider t as a random variable, the probability that the SET will occur within an infinitesimal interval Δt at time t can be described by an exponential-class distribution function [157] (Equation 4.1), with the wait time t following the statistics of a Poisson process; this description fits the experimental data well (red lines, Figure 4.2 (b)). This experimental observation, resembling the Poisson random wait time underlying the filament formation process in the tin oxide memristive device, is indicative of its exponential-class stochastic nature.
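The wait-time picture above can be illustrated with a short simulation. This is a sketch under a stated assumption: the wait time until SET is modeled as exponentially distributed with a single characteristic time `tau_s`, which is the usual waiting-time law of a Poisson process. The symbol `tau_s`, the function names, and the numerical values are hypothetical and are not taken from the measurements.

```python
import math
import random


def prob_set_within(window_s, tau_s):
    """Probability that an exponentially distributed wait time is below window_s."""
    return 1.0 - math.exp(-window_s / tau_s)


def simulate_set_trials(tau_s, window_s=2.0, trials=30, seed=0):
    """Draw wait times for repeated SET trials and count those within the window."""
    rng = random.Random(seed)
    # expovariate takes the rate, i.e. the inverse of the mean wait time.
    waits = [rng.expovariate(1.0 / tau_s) for _ in range(trials)]
    n_set = sum(1 for t in waits if t <= window_s)
    return waits, n_set


if __name__ == "__main__":
    # A short tau_s mimics a high bias (almost always set within 2 s);
    # a long tau_s mimics a low bias (many trials fail to set in time).
    for tau in (0.2, 1.0, 5.0):
        _, n_set = simulate_set_trials(tau)
        print(tau, n_set, round(prob_set_within(2.0, tau), 3))
```

In this picture, increasing the applied bias corresponds to shortening the characteristic wait time, which raises the fraction of trials that set within the 2 second window.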
Figure 4.3 (a) plots Pss,t<2s as a function of VTE - VTE0 under different gate voltages, which shows an exponential-class sigmoidal distribution function. Here, Pss,t<2s is the probability that the device will successfully set within 2 seconds and VTE0 is the 50% probability bias voltage point, i.e. Pss,t<2s (VTE = VTE0) = 0.5. With the gate voltage fixed, the chance of the device being set within t < 2 s becomes higher with increasing VTE, following a sigmoidal distribution. This shows that VTE can tune the stochastic property of the set event in the device when Vg is fixed. Microscopically, VTE tunes the filament formation process by modulating the vacancy hopping barrier height and thus the ion hopping rate. Thus, the device is understandably easier to set at high VTE than at low VTE. Under different gate voltages, Pss,t<2s shows a sharper 1-to-0 transition when Vg is 30 V and a wider spread in its 1-to-0 transition when Vg decreases. Here Vg tunes the Fermi level and charge density in the MoS2 layer, which modulates the potential distribution between the MoS2 and tin oxide layers under VTE bias. VTE is more effective in modulating the device when Vg is higher, i.e. when the MoS2 layer has a higher electron energy level and higher conductivity, and thus leads to a sharper 1-to-0 transition in the sigmoidal distribution curve. The SET process is achieved by filament formation through stochastic vacancy generation and hopping transport processes. Applying a voltage can reduce the generation and hopping barrier heights and exponentially enhance the generation and hopping rates. Analytically, the SET probability Pss,t<2s can be derived as $P_{ss,t<2s} = 1 - \exp\left(-\beta e^{\alpha (V_{TE} - V_{TE0})}\right)$, where α and β are parameters related to the material and device structure.
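A short numerical sketch can make the shape of this expression concrete. Because VTE0 is defined as the 50% point, setting VTE = VTE0 gives 1 - exp(-β) = 0.5, so β = ln 2 in this normalization. The code below evaluates the double-exponential form and compares it with a logistic (Fermi-Dirac-like) sigmoid whose slope at VTE0 is matched, which gives Teff = 1/(2 α ln 2). That slope matching is an assumption made here for illustration and is not necessarily the derivation used in the text; the value of `alpha` is hypothetical.

```python
import math

BETA = math.log(2.0)  # follows from P(V_TE = V_TE0) = 0.5


def p_set_double_exp(dv, alpha):
    """SET probability 1 - exp(-beta * exp(alpha * dv)), with dv = V_TE - V_TE0."""
    return 1.0 - math.exp(-BETA * math.exp(alpha * dv))


def p_set_logistic(dv, t_eff):
    """Fermi-Dirac-like sigmoid with an effective 'temperature' t_eff (in volts)."""
    return 1.0 / (1.0 + math.exp(-dv / t_eff))


def matched_t_eff(alpha):
    """t_eff that gives both curves the same slope at dv = 0 (assumption)."""
    return 1.0 / (2.0 * alpha * BETA)


if __name__ == "__main__":
    alpha = 1.5  # hypothetical value, in 1/V
    t_eff = matched_t_eff(alpha)
    for dv in (-2.0, -1.0, 0.0, 1.0, 2.0):
        print(dv, round(p_set_double_exp(dv, alpha), 3),
              round(p_set_logistic(dv, t_eff), 3))
```

A larger α gives a smaller matched Teff and therefore a sharper transition, in line with the observation that the transition sharpens at higher gate voltage.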
After further derivation, Pss,t<2s is simplified to a distribution function that resembles the Fermi-Dirac distribution:

\[
P_{ss,t<2s} \approx \frac{1}{1 + \exp\left(-\dfrac{V_{TE} - V_{TE0}}{T_{eff}}\right)} \qquad (4.2)
\]

where Teff is an effective "temperature" term that can be tuned by the gate bias. This expression fits the experimental data in Figure 4.3 (a) very well. The above analytical description is also in agreement with kinetic Monte Carlo simulations, which describe the microscopic stochastic processes of vacancy generation, hopping, and recombination in filament formation. The values of Teff corresponding to various gate voltages are extracted from the fitting, and Figure 4.3 (b) plots Teff versus gate voltage Vg. A behavioral model is developed to understand the dependence of Teff on the gate bias voltage. The device is modeled as a memristor in series with a MoS2 layer whose resistance (both the sheet resistance and its contact property with the memristive filament) can be modulated by the gate electric field. As a result, Teff can be expressed as in Equation (4.3), where TV0 and Z are constants and VT is the threshold voltage.

Figure 4.3: (a) Pss,t<2s as a function of VTE under different gate voltages, showing an exponential-class sigmoidal distribution function. Experimental results are shown as data symbols, and the analytical model fit is shown as lines. (b) Experimental results (dots) and model fit (line) showing the relation between Teff and the gate bias Vg.

As shown in Figure 4.3 (b), this model fits the experimental data well and describes the modulation of Teff by Vg. We would like to note that the value of Teff has the unit of volt.
However, to avoid confusion with the actual electrical bias voltages applied to the device, the unit of T_eff will be omitted in the subsequent discussions. The stochastic process of filament formation discussed above, together with the gate voltage dependent "temperature" effect, can be used to construct exponential-class distribution sampling that has broad applications in statistical modeling and computing, with the Boltzmann machine as a typical example.

4.2 Boltzmann Machine Implementation Using Tin Oxide/MoS2 Based Stochastic Neuron

To demonstrate the unique advantages of these tunable exponential-class stochastic memristive devices in computing applications, a version of the Boltzmann machine (BM) that contains a network of two-state units [158] x_i, i = 1, 2, 3, ... is implemented, whose states flip between the 0 state and the 1 state through activation by the stochastic units. The stochastic unit affects the probability of whether x_i flips and thus determines the stochastic dynamics of the BM. The BM iterates over all units to search for the best solution by minimizing the system energy function. Hardware implementation of such a BM is challenging with conventional transistors and would require a large number of devices and complex circuitry. Here we build a BM where each stochastic unit is based on a single tin oxide/MoS2 memristive device and simple peripheral circuitry.

Figure 4.4: Flow chart showing the steps in mapping a MAX-SAT problem to an equivalent form solvable using the Boltzmann machine. The chart states the MAX-SAT problem (maximize the number of true clauses in the set of Boolean clauses {(x ∨ y ∨ z), (x' ∨ y ∨ z), (x' ∨ y' ∨ z), (x ∨ y' ∨ z'), (x' ∨ y ∨ z')}) and the Boltzmann machine task (minimize the equivalent system energy E = X^T W X, with state vector X = (x_1, ..., x_6) = (x, y, z, x', y', z') and a 6 × 6 weight matrix W with entries w_ij).
This implemented BM is used to solve a maximum satisfiability problem (MAX-SAT), which is an NP-complete combinatorial optimization problem underlying a wide range of key applications including Max Clique [159], correlation clustering, treewidth computation, Bayesian network structure learning and argumentation dynamics [160]. Given a set of Boolean clauses, where each clause is a disjunction of Boolean variables and their negations, the MAX-SAT problem [161] aims to maximize the number of clauses that can be true when truth values are assigned to the Boolean variables. Without loss of generality, the set of Boolean clauses to be solved in this work is selected to be {C_i | i = 1, 2, ..., 5}, where the clause C1 is (x ∨ y ∨ z), C2 is (x' ∨ y ∨ z), C3 is (x' ∨ y' ∨ z), C4 is (x ∨ y' ∨ z') and C5 is (x' ∨ y ∨ z') (shown in Figure 4.4; the Boolean variable x' is the negation of the Boolean variable x).

Figure 4.5: (a) The PCB evaluation board of the BM integrated system including the packaged tin oxide/MoS2 memristive units and CMOS peripheral circuits. (b) Schematic of the BM circuit blocks.

Figure 4.6: The experimentally obtained evolution of the state vector and total energy versus iteration number when the BM was started from three different initial states ('111111', '000000' and '111101'), resulting in the same optimal solution ('011100').

The optimization task here is to find a state vector X = (x_1, ..., x_6) = (x, y, z, x', y', z') that maximizes the number of true clauses. A MAX-SAT problem can be converted equivalently to a problem that is solvable by the BM [162]. Six stochastic units are used in the BM to realize the activation functions for each Boolean variable in the state vector X = (x_1, ..., x_6). Then we build a weight matrix W.
The weight W_ij between every two Boolean variables is assigned based on the formula E. Solving the MAX-SAT problem is equivalent to minimizing the total energy E = X^T W X of the BM, where X^T is the transpose of X. The BM constructed using the tin oxide/MoS2 devices is shown in Figure 4.5 (a), and the schematic of the circuit blocks is shown in Figure 4.5 (b). In every iteration when the BM runs, only one of the tin oxide/MoS2 devices goes through a SET process. Depending on whether the stochastic device sets or not, the unit updates its value accordingly. If the device sets, the stochastic unit flips its Boolean value. If the device does not set, the stochastic unit keeps the same value. The stochastic units are repeatedly updated in sequence until the BM reaches the optimal solution. In Figure 4.6, we experimentally demonstrate the evolution of the state vector and total energy when the BM started from three different initial states and found the same optimal solution for the expression E, which is X = (x, y, z, x', y', z') = (0, 1, 1, 1, 0, 0).

As previously shown in Figure 4.3 (b), V_g can tune the tin oxide/MoS2 device to give different T_eff during the BM optimization process. The T_eff of the BM describes the average behavior of all the stochastic units, in close analogy to the temperature parameter in the Boltzmann distribution that describes the average behavior of particles under different thermal equilibrium states in physical systems. Thus, by controlling T_eff during the optimization process, which can be achieved by tuning V_g, it is possible to avoid premature convergence and improve the convergence efficiency of the BM. Figure 4.7 shows the effect of different V_g bias on the BM optimization process. During these three different runs of the BM, all the tin oxide/MoS2 stochastic devices are biased at V_g = -20 V, 0 V and 20 V, respectively.
The energy evolved differently in each of these runs. The BM is at T_eff = 7 when V_g = 20 V and converges easily for this particular problem. On the other hand, the BM is at T_eff = 50 when V_g = -20 V and is less efficient in reaching convergence. For V_g = 0 V, the BM is at T_eff = 10 and converges at an intermediate rate among the three cases. By counting how many times the BM reaches the global optimal solution out of 50 trial runs, the success rate as a function of V_g and T_eff is statistically obtained, as shown in Figure 4.7 (b). It indicates that V_g, and hence T_eff, can substantially affect the performance of the BM.

Figure 4.7: (a) Experimentally obtained energy evolution versus iteration number in the BM optimization process with V_g = -20 V, 0 V and 20 V, respectively. (b) The success rate of the BM optimization process under different V_g.

4.3 Implementing "Cooling" Strategies In Simulated Annealing

Simulated annealing [Wegener:2005] can be implemented with our BM, where T_eff can be changed gradually during the optimization process to emulate different "cooling" strategies. It is an important approach for efficiently reaching better optimization solutions and for avoiding premature convergence. Using the gate tunable tin oxide/MoS2 device, such "cooling" procedures can be quantitatively implemented during simulated annealing by translating the designated sequential evolution of T_eff into the corresponding series of gate voltage bias conditions, mapping T_eff to the gate voltage based on the relation in Figure 4.3 (b). To study the effect of different "cooling" strategies on the efficiency of the BM, four different T_eff variation strategies were experimentally applied to the BM.
Strategy 1: high T_eff in the first three iteration steps followed by low T_eff for the remaining iterations in one optimization process (HT-to-LT); Strategy 2: low T_eff in the first three iterations followed by high T_eff for the remaining iterations (LT-to-HT); Strategy 3: maintaining a low T_eff in the entire optimization process (LT); and Strategy 4: maintaining a high T_eff in the entire optimization process (HT). Figure 4.8 (a) shows a qualitative schematic of how the system energy (color dots) would evolve in the process of searching for optimal solutions among multiple possible energy minima (grey line).

Figure 4.8: Implementing simulated annealing in the tin oxide/MoS2 based BM. (a) Conceptual schematic illustrating the evolution of the solution and energy states during the optimization process employing the 4 different variation strategies. (b) Experimentally obtained energy evolution versus iteration number in the BM optimization process for the 4 different strategies. (c), (d) Experimentally extracted success rate of the BM in achieving the optimal solution using 4 different strategies of T_eff variation during the optimization process: HT-to-LT, LT-to-HT, LT and HT. Different initial states are used in (c) (initial state 111111) and (d) (initial state 111101). T_eff = 50 for HT and T_eff = 5 for LT in (b), (c) and (d).

To analyze the effect of these "cooling" strategies, typical evolutions of the energy (cost function) during the BM optimization process for the four different strategies were experimentally obtained.
As shown in Figure 4.8 (b), using the HT strategy (T_eff = 50), the BM is highly active but loses the selectivity needed for reaching proper convergence. Using the LT strategy (T_eff = 5), the BM is significantly less active but possesses higher selectivity, which facilitates its convergence to a premature state. Finally, simulated annealing using a "cooling" strategy (HT-to-LT) enables active initial searches at HT (T_eff = 50) and then steady convergence to the minimum energy state at LT (T_eff = 5), as shown in the experimental results. Furthermore, Figure 4.8 (c) and 4.8 (d) show the experimentally obtained statistics of the success rate in finding the global optimal solution when the different "cooling" strategies are used. Different initial values for the state vectors are used in Figure 4.8 (c) and 4.8 (d) to show the effect of different initial conditions. Both figures indicate that the HT-to-LT strategy has the highest success rate for reaching the global optimal solution for this particular problem, while the LT strategy has the lowest success rate.

Understanding the effect of T_eff and simulated annealing

To quantitatively understand why T_eff can make such a significant difference in the BM optimization process, we analyze the Russell-Rao (RR) similarity [163] between all the clauses for this particular MAX-SAT problem. This is because, as illustrated in Figure 4.9 (a), all five clauses C1-C5 bear inherent similarity to each other due to the following two constraints: the variable constraint and the clause constraint. On the variable side, a Boolean variable and its negation (two variables connected by red lines) are always logically opposite. For example, x and x' will always have opposite values. On the clause side, the chance of two clauses both being true is lower if they contain more complementary Boolean variables.
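The clause constraint can be made concrete by counting complementary variables between each pair of the five clauses, and the RR similarity can be computed from repeated runs. This is a minimal sketch: the function names are illustrative, the RR coefficient is taken in its standard form (fraction of all cases in which both items are true), and the truth lists in the usage lines are synthetic, not experimental data.

```python
from itertools import combinations

# The five clauses of the MAX-SAT problem; a trailing apostrophe marks negation.
CLAUSES = {
    "C1": {"x", "y", "z"},
    "C2": {"x'", "y", "z"},
    "C3": {"x'", "y'", "z"},
    "C4": {"x", "y'", "z'"},
    "C5": {"x'", "y", "z'"},
}


def negate(literal):
    """Return the complementary literal (x -> x', x' -> x)."""
    return literal[:-1] if literal.endswith("'") else literal + "'"


def complementary_pairs(clause_a, clause_b):
    """Count literals in clause_a whose negation appears in clause_b."""
    return sum(1 for lit in clause_a if negate(lit) in clause_b)


def russell_rao(truth_a, truth_b):
    """Fraction of runs in which both clauses are true."""
    both_true = sum(1 for a, b in zip(truth_a, truth_b) if a and b)
    return both_true / len(truth_a)


counts = {
    (p, q): complementary_pairs(CLAUSES[p], CLAUSES[q])
    for p, q in combinations(sorted(CLAUSES), 2)
}
for pair, n in counts.items():
    print(pair, n)

# Synthetic example: two clauses both true in 3 of 4 runs.
print(russell_rao([True, True, True, False], [True, True, True, True]))
```

In this count, C2 and C4 are the only pair with three complementary variables.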
By assigning true values to the variables x, y' and z' (yellow circle), the difference in the number of complementary variables (blue circle) between clauses can be easily observed. Counting the number of complementary variables directly reflects the inner connection and constraint of the clauses. In Figure 4.9 (a), for example, if the clause C4: (x ∨ y' ∨ z') is true, then the probability that the clause C2: (x' ∨ y ∨ z) is also true is much smaller than for the other three clauses, because C4 and C2 contain three pairs of complementary variables. With the BM set to different T_eff, the RR similarity matrix among the five clauses is constructed from the experimental data in Figure 4.9 (b), (c) and (d). The color and number in each block quantify the similarity between each pair of clauses indexed by the row and column. It represents the probability that both clauses are true among all cases. For example, an RR similarity of 0.84 between C1 and C2 in Figure 4.9 (b) means that by repeatedly running the BM 50 times at T_eff = 50, C1 and C2 were both true at the end of 42 (out of 50) runs.

Figure 4.9: Russell-Rao similarity matrix underlying the clauses employing different "cooling" strategies in a MAX-SAT problem. (a) Schematic showing the correlation among the five clauses in the MAX-SAT problem as a result of the variable and clause constraints. As an illustration, if the variables in orange circles are assumed to be true, then the variables in blue circles would be false. (b), (c), (d) Russell-Rao similarity matrix between the five clauses when the BM runs the optimization process under T_eff = 50, 20 and 5, respectively.

The effect of T_eff can be explained as follows.
We view the RR similarity as a distance measurement of the statistical relationship between each two clauses (distance = 1 - RR coefficient) in solution space [164]. In other words, clauses with RR similarity close to 1 are seen as closely clustered, while clauses with RR similarity close to 0 are the furthest separated. When T_eff is tuned to 50 (Figure 4.9 (b)), all the clauses have similar distances in the solution space, since they show close RR similarity between all pairs. As a consequence, the BM tends to search widely in the solution space with high robustness, high stochasticity and low selectivity, since choosing any solution would look the same to the BM. When T_eff is 20 (Figure 4.9 (c)), clauses with small distances are closely clustered, giving high RR similarity close to unity for pairs of clauses that can be easily satisfied simultaneously, such as C1 and C2, and a low RR similarity for pairs of clauses that can hardly be satisfied at the same time, such as C1 and C4. At this T_eff = 20, the BM gains more selectivity in solution space. When T_eff is 5 (Figure 4.9 (d)), all the clauses are either strongly clustered or separated in distance, with distinct RR similarity of either 1 or 0. The BM behaves more like a deterministic "machine". This tends to cause premature convergence, as the BM is significantly less active.

Figure 4.10: The evolution of the Russell-Rao similarity matrix in a BM optimization process when T_eff is decreased linearly with each iteration step (panels at iteration 1, 25 and 50, from T_eff = 50 to T_eff = 1).

Next, a simulated annealing process in the BM with linear cooling is simulated in Figure 4.10. The evolution of the RR similarity matrix indicates that the BM evolves through all the cases discussed above, from being fully stochastic towards nearly deterministic, as T_eff decreases linearly.
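A linear-cooling run of this kind can be sketched in software as follows. This is an illustration under stated assumptions, not the simulation used for Figure 4.10: the logistic flip probability 1/(1 + exp(ΔE/T_eff)), the 3-unit weight matrix `W_DEMO`, and all function names are hypothetical choices; only the sequential one-unit-per-iteration update, the energy E = X^T W X, and the linear decrease of T_eff from 50 to 1 over 50 iterations follow the text.

```python
import math
import random


def energy(x, w):
    """System energy E = x^T W x for a binary state vector x."""
    n = len(x)
    return sum(x[i] * w[i][j] * x[j] for i in range(n) for j in range(n))


def flip_probability(delta_e, t_eff):
    """Assumed logistic rule: flips that lower the energy are more likely."""
    z = delta_e / t_eff
    if z > 700.0:  # guard against overflow in exp
        return 0.0
    return 1.0 / (1.0 + math.exp(z))


def linear_schedule(t_start, t_end, steps):
    """T_eff values decreasing linearly from t_start to t_end."""
    return [t_start + (t_end - t_start) * k / (steps - 1) for k in range(steps)]


def anneal(w, x0, schedule, rng):
    """Update one unit per iteration, in sequence, at the scheduled T_eff."""
    x = list(x0)
    e = energy(x, w)
    best_x, best_e = list(x), e
    for step, t_eff in enumerate(schedule):
        i = step % len(x)
        trial = list(x)
        trial[i] = 1 - trial[i]
        e_trial = energy(trial, w)
        if rng.random() < flip_probability(e_trial - e, t_eff):
            x, e = trial, e_trial
            if e < best_e:
                best_x, best_e = list(x), e
    return x, e, best_x, best_e


# Hypothetical symmetric weight matrix for a 3-unit toy problem.
W_DEMO = [[-2, 1, 0], [1, -3, 1], [0, 1, -1]]
schedule = linear_schedule(50.0, 1.0, 50)
final_x, final_e, best_x, best_e = anneal(W_DEMO, [0, 0, 0], schedule, random.Random(0))
print(final_x, final_e, best_x, best_e)
```

At high T_eff the flip probability stays near 0.5 regardless of ΔE, and as T_eff falls it approaches a step function, which mirrors the stochastic-to-deterministic evolution described above.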
Thus the simulated annealing process of a BM can be understood as follows: at high T_eff, the BM searches the solution space globally with high robustness and low selectivity, for the sake of large gradient descent; as the BM cools down, it gains selectivity towards some solutions and can possibly jump out of local minima, since T_eff still provides enough perturbation; as the BM cools down to the limit, the BM exhibits stronger selectivity than robustness, preventing itself from jumping out of the optimal zone. Hence, more efficient performance in the BM can be achieved with an appropriate "cooling" strategy.

In summary, tunable stochastic behavior is demonstrated in the tin oxide/MoS2 memristive device, showing inherent exponential-class statistical characteristics. The device can sample exponential-class sigmoidal distributions resembling the Fermi-Dirac distribution in physical systems, with tunable distribution parameters to emulate the "temperature" effects. Simulated annealing with control of the "cooling" strategies is demonstrated in the implemented Boltzmann machine for solving combinatorial optimization with respect to a MAX-SAT problem. These new stochastic devices with reconfigurable statistical behavior pave the way for implementing selected "cooling" strategies in the BM to reach optimal convergence efficiency and can find broad applications in energy-efficient computing for learning, clustering and classification.

4.4 Probability Distribution Associated With Tin Oxide/MoS2 Hetero-memristor

Derivation of the probability distribution that the first SET will occur within a small interval Δt at time t

We define t as the amount of time for which the bias voltage is applied to the top electrode until the device sets. Ignoring the device reset time, we can repeat the SET process as defined in the main text and count the set events as arrivals.
We can then define a Poisson process in terms of a sequence of inter-arrival times, X_1, X_2, X_3, ..., which are independent and identically distributed (IID) random variables. The counting random variable N(0, t) is defined as the number of sets that happen in the time interval (0, t]. N(t) = 0 means the device does not set within (0, t]. N(t) = k means the device sets k times within (0, t]. For the Poisson distribution,

$$P(N(t) = k) = \frac{e^{-t/\tau}}{k!}\left(\frac{t}{\tau}\right)^{k} \tag{4.4}$$

The probability that the first set will occur within an infinitesimal interval Δt at time t can be derived as:

$$P = P\big(N(0, t - \Delta t) = 0,\ N(t - \Delta t, t) = 1\big) \tag{4.5}$$

which means that there is no set event in (0, t - Δt) and the first set happens in (t - Δt, t).

$$P = P\big(N(0, t - \Delta t) = 0\big) \cdot P\big(N(t - \Delta t, t) = 1\big) \tag{4.6}$$

$$= e^{-\frac{t - \Delta t}{\tau}} \cdot e^{-\frac{\Delta t}{\tau}} \cdot \frac{\Delta t}{\tau} \tag{4.7}$$

$$P = e^{-\frac{t}{\tau}} \frac{\Delta t}{\tau} \tag{4.8}$$

Here, τ is the average time needed for the device to set.

Derivation of the probability distribution that the device will set successfully in a SET process within t_0 seconds, i.e. P_ss,t<t0

In our device, the SET process is determined by filament formation through ion hopping. In stochastic ion hopping transport, the average hopping rate is exponentially dependent on the barrier height, which can be reduced by applying a voltage. Therefore, τ is exponentially dependent on the bias and gate voltage, and it can be described as

$$\tau = \tau_0 e^{-\alpha V_{TE}} \tag{4.9}$$

where τ_0 is a fitting parameter in equation (4.9). In each trial, the device either sets or not within t_0 seconds, thus P_ss,t<t0 can be derived analytically as follows:

$$P_{ss,t<t_0} = 1 - P(N(t_0) = 0) \tag{4.10}$$

$$= 1 - e^{-\frac{t_0}{\tau}} \tag{4.11}$$

Replacing τ with equation (4.9),

$$P_{ss,t<t_0} = 1 - e^{-\frac{t_0}{\tau_0} e^{\alpha V_{TE}}} \tag{4.12}$$

$$= 1 - e^{-\beta e^{\alpha V_{TE}}} \tag{4.13}$$

where β = t_0/τ_0 and α = 1/V_g can be treated as fitting parameters that are related to the material and device structure. In this work, we mainly discuss the condition of a time limit t_0 = 2.
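The relation between the Poisson picture and the success probability within t_0 can be checked with a small Monte Carlo experiment. This is a minimal sketch: the mean set time `TAU`, the trial count and the function names are assumptions for illustration, and the waiting time to the first set is drawn as an exponential random variable, which is the inter-arrival distribution of a Poisson process.

```python
import math
import random


def set_probability_within(t0, tau):
    """Analytic probability that the first set occurs before t0: 1 - exp(-t0/tau)."""
    return 1.0 - math.exp(-t0 / tau)


def simulate_set_fraction(t0, tau, trials, rng):
    """Fraction of trials whose exponential waiting time is shorter than t0."""
    hits = sum(1 for _ in range(trials) if rng.expovariate(1.0 / tau) < t0)
    return hits / trials


T0 = 2.0   # time limit in seconds, as used in the text
TAU = 2.0  # hypothetical mean time to set, in seconds

analytic = set_probability_within(T0, TAU)
simulated = simulate_set_fraction(T0, TAU, 20000, random.Random(1))
print(f"analytic = {analytic:.4f}, simulated = {simulated:.4f}")
```

Because τ shrinks exponentially with the applied bias, sweeping `TAU` in this sketch traces out the same sigmoidal dependence on V_TE as the measured set probability.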
Asymptotic behaviors of the probability function

The probability function of the stochastic device characteristics is described in equation (4.13). This function has the limits P_ss,t<2s(V_TE → -∞) = 0 and P_ss,t<2s(V_TE → ∞) = 1. For a sufficiently small V_TE satisfying βe^{αV_TE} < 0.1, equation (4.13) is simplified by using the first order Taylor expansion,

$$P_{ss,t<2s} \approx \beta e^{\alpha V_{TE}} \approx \frac{1}{1 + \left(\beta e^{\alpha V_{TE}}\right)^{-1}} = \frac{1}{1 + e^{-\frac{V_{TE} - V_{TE0}}{T_{eff}}}} \tag{4.14}$$

where V_TE0 = -ln(β)/α is the 50% probability bias voltage point, and T_eff = 1/α is the effective temperature. For a sufficiently large V_TE satisfying βe^{αV_TE} > 10, equation (4.13) is simplified to

$$P_{ss,t<2s} \approx \frac{1}{1 + e^{-\beta e^{\alpha V_{TE}}}} \tag{4.15}$$

which indicates that the function asymptotically approaches the Fermi-Dirac distribution function or sigmoid function. This feature makes the tin oxide/MoS2 device well suited to realizing exponential-class distribution sampling, especially for the Boltzmann machine, whose probability of turning on a unit i has a similar form:

$$P_{s_i = 1} = \frac{1}{1 + e^{-z_i}} \tag{4.16}$$

where z_i is the total input of the Boltzmann machine at unit i.

Dependence of the "temperature" effect on applied gate voltage

A behavioral model is developed to understand the dependence of the effective "temperature" on gate voltage. The device is modeled as a memristor in series with a MoS2 layer, whose resistance is modulated by the gate voltage. The resistance of the MoS2 layer is modulated by the gate voltage as

$$R_{MS}(V_g) = \frac{R_c}{V_g - V_T} \tag{4.17}$$

where R_c and V_T are constants. The voltage on the switching layer is

$$V_{sw} = \frac{R_{sw}}{R_{sw} + R_{MS}(V_g)} V_{TE} \tag{4.18}$$

where R_sw is the average resistance of the switching layer. As a result, the gate dependent "temperature" effect of the device can be expressed as

$$T_{eff}(V_g) = T_{V0}\left[\frac{R_{sw} + R_{MS}(V_g)}{R_{sw}}\right] = T_{V0}\left[1 + \frac{Z}{V_g - V_T}\right] \tag{4.19}$$

where Z = R_c/R_sw and T_V0 are constants.
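The behavioral model of equation (4.19) can also be inverted to translate a designated T_eff sequence into gate voltages, which is how a "cooling" schedule would be mapped onto bias conditions. The sketch below is illustrative only: the constants `T_V0`, `Z` and `V_T` are hypothetical values chosen so that the model roughly reproduces T_eff ≈ 7, 10 and 50 at V_g = 20 V, 0 V and -20 V as quoted in Section 4.2; they are not the fitted parameters of the device.

```python
# Hypothetical constants for illustration, not fitted values.
T_V0 = 3.5    # "temperature" prefactor
Z = 43.0      # ratio R_c / R_sw, in volts
V_T = -23.2   # threshold voltage, in volts


def t_eff(v_g):
    """Effective temperature T_V0 * (1 + Z / (V_g - V_T)), valid for V_g > V_T."""
    if v_g <= V_T:
        raise ValueError("model is only defined for V_g above V_T")
    return T_V0 * (1.0 + Z / (v_g - V_T))


def gate_voltage_for(target_t_eff):
    """Invert the model: gate voltage that yields the requested T_eff."""
    if target_t_eff <= T_V0:
        raise ValueError("T_eff must exceed T_V0 in this model")
    return V_T + Z / (target_t_eff / T_V0 - 1.0)


# Translate a designated cooling sequence of T_eff into gate voltages.
cooling = [50.0, 20.0, 10.0, 7.0]
for t in cooling:
    print(f"T_eff = {t:5.1f} -> V_g = {gate_voltage_for(t):7.2f} V")
```

In this model T_eff falls monotonically as V_g rises above V_T, so a cooling schedule corresponds to a sequence of increasing gate voltages.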
Chapter 5

Other Two Dimensional Material Based Devices and Applications

5.1 β-Ga2O3/graphene Barristor

β-Ga2O3 is widely recognized as a promising candidate for high-voltage devices in power electronics due to its wide bandgap of 4.6-4.9 eV and a large theoretical critical breakdown field of 8 MV/cm [165], [166]. Measurements of β-Ga2O3 field-effect transistors (FETs) have revealed an electron mobility of 150 cm²/(V·s), approaching the theoretically estimated value of 200 cm²/(V·s) [166] [168]. Furthermore, high quality wafer-scale growth of β-Ga2O3 has been demonstrated using Czochralski, metalorganic chemical vapour deposition (MOCVD) [169], hydride vapor phase epitaxy (HVPE) [170], and molecular beam epitaxy (MBE) methods [171]. Recent demonstrations of β-Ga2O3 in power electronics applications include diodes with breakdown voltages (V_BD) exceeding 1 kV [172], as well as FETs with a breakdown voltage of 755 V [173]. Other potential applications of β-Ga2O3 currently under investigation include logic (e.g., FinFETs) and radio frequency devices [174], [175].

The crystal structure of β-Ga2O3 is monoclinic with lattice constants a = 1.220 nm along the [100] direction, b = 0.304 nm along the [010] direction, and c = 0.580 nm along the [001] direction. Due to the large lattice constant along the [100] direction, the ensemble of longest Ga-O atom bonds is arranged periodically along the {100} crystal plane, which facilitates the cleavage of the β-Ga2O3 crystal into thin flakes or nano-membranes along this plane [166], [176].

Figure 5.1: (a) Schematic of a β-Ga2O3/graphene vertical device. (b) Optical image of a fabricated device. (c) AFM image scan of the area marked in (b). (d) AFM height profile (height in nm versus distance in µm) along the red dashed line marked in (c), giving a step of 146 nm.
Based on these techniques, exfoliated β-Ga2O3 samples have been fabricated into lateral FETs built within the (100) plane, demonstrating decent current-voltage (I-V) characteristics and high breakdown voltages [166]. However, the critical electric field in the lateral devices remains much lower than the theoretical critical electric field. Moreover, not many studies have shown the breakdown characteristics of β-Ga2O3 as a function of the crystal orientation. In this work, we present a β-Ga2O3/graphene vertical heterostructure that enables the characterization of the critical electric field at breakdown in the direction perpendicular to the (100) crystal plane. Measurements on the β-Ga2O3/graphene heterostructures indicate a breakdown field that is significantly larger than values reported for β-Ga2O3 FETs built within the (100) crystal plane and is closer to the theoretical critical field of 8 MV/cm. The graphene contacted heterostructure shows barristor type gate tunable I-V characteristics in this vertical device [177]. The theoretical and experimental analysis of the critical electric field reveals the promising potential of the material for high voltage electronics applications.

The schematic of the β-Ga2O3/graphene vertical structure is illustrated in Figure 5.1 (a). Graphene is first mechanically exfoliated and transferred onto a SiO2/Si wafer with a SiO2 thickness of 285 nm. β-Ga2O3, which is not intentionally doped in order to provide a conservative estimate for the breakdown field [178], is then exfoliated and transferred onto the wafer, overlapping the graphene flake. Contact regions are defined by e-beam lithography followed by Ti/Au (5/45 nm) metal evaporation. Shown in Figure 5.1 (b) is the optical microscopy image of a fabricated β-Ga2O3/graphene heterostructure. We use atomic force microscopy (AFM) to characterize the surface profile of the device.
Figure 5.1 (c) shows the AFM image of this device, and Figure 5.1 (d) shows the thickness profile along the red dashed line in Figure 5.1 (c). From the AFM measurements, we obtain a thickness of 146 nm for this β-Ga2O3 layer. Shown in Figure 5.2 (a) is the cross-sectional energy band diagram of the heterostructure, illustrating the charge transport mechanisms at different biasing conditions. As shown in Figure 5.2 (a), the graphene layer enables the injection of electrons while allowing the electric field due to the back-gate bias to penetrate through and modulate the energy bands in the β-Ga2O3 channel. The Si substrate is used as the back-gate. Cross-sectional high-resolution TEM images of the fabricated β-Ga2O3/graphene heterostructure are shown in Figure 5.2 (c) and (d). The formation of atomically smooth interfaces at the β-Ga2O3/graphene and graphene/SiO2 interfaces is clearly demonstrated in Figure 5.2 (c). A highly crystalline strain-free β-Ga2O3 layer is revealed in Figure 5.2 (d), from which the lattice symmetry can be identified and the lattice parameters can be obtained. The selected area electron diffraction (SAED) pattern in the inset of Figure 5.2 (d) shows the (200) lattice plane with a d-spacing of 0.6037 nm, which closely matches calculations based on d = (a sin β)/2 = 0.5927 nm, where a = 1.2202 nm is the lattice constant and β = 103.7° is the angle between the [100] and [001] directions. Both the TEM image and the SAED patterns confirm that β-Ga2O3 was exfoliated along the [100] direction [179], [180]. A high-voltage energy-dispersive X-ray spectroscopy (EDS) scan indicates the elemental composition of the device, as shown in Figure 5.2 (e). The mapping of the elemental compositions in the device cross-section clearly shows the different material layers in the heterostructure and the interface between the β-Ga2O3 and SiO2 layers. The C element mapping indicates the presence of graphene.

Figure 5.2: (a) Energy band diagram of a β-Ga2O3/graphene heterostructure. (b) I_d - V_bg (symbols) and analytical calculation fitting (red line). The inset shows the barristor type output characteristics of the device. (c) Cross-sectional TEM image of the β-Ga2O3/graphene heterostructure on the SiO2/Si substrate. (d) Cross-sectional TEM image of highly crystalline strain-free β-Ga2O3. The inset shows the corresponding SAED pattern. (e) EDS mapping of the cross-section of β-Ga2O3/graphene/SiO2.

The β-Ga2O3/graphene heterostructure devices were electrically characterized using a Keysight B1500 semiconductor parameter analyzer. Current flows from the top electrode (labeled as the drain electrode with voltage V_d) to the bottom electrode on graphene (labeled as the source electrode with voltage V_s). Therefore, current flow is perpendicular to the (100) crystal plane. All I-V measurements are conducted under static vacuum at ~5 × 10^-5 Torr and at room temperature. Figure 5.2 (b) plots measurements (symbols) of the drain (I_d) and back-gate (I_bg) currents as a function of the back-gate voltage (V_bg) for a drain-to-source bias of V_ds = 15 V. The measurements indicate modulation of I_d, as indicated by the > 10^4 on/off ratio in the I_d - V_bg characteristics. This is correlated with the modulation of the energy bands in β-Ga2O3 as a function of V_bg, facilitating carrier injection across the Schottky barrier (SB) formed at the β-Ga2O3/graphene interface [see Figure 5.2 (a)]. The inset in Figure 5.2 (b) shows the barristor type output characteristics of the device [177].
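The (200) d-spacing quoted above for the SAED analysis follows from the monoclinic lattice relation d = (a sin β)/2 and can be checked in a few lines. This is a small worked sketch; the function name is illustrative, and the inputs are the lattice constant, angle and measured spacing given in the text.

```python
import math


def d_spacing_h00(a_nm, beta_deg, h):
    """Spacing of (h00) planes in a monoclinic lattice: a * sin(beta) / h."""
    return a_nm * math.sin(math.radians(beta_deg)) / h


A_NM = 1.2202          # lattice constant a from the text
BETA_DEG = 103.7       # angle between [100] and [001] from the text
D_MEASURED_NM = 0.6037 # SAED value from the text

d_calc = d_spacing_h00(A_NM, BETA_DEG, 2)
mismatch = (D_MEASURED_NM - d_calc) / d_calc
print(f"calculated d(200) = {d_calc:.4f} nm, mismatch = {mismatch:.1%}")
```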
To obtain better insight into the transport properties of the device, we fit analytical calculations to the experimental I_d - V_gs data using a top-of-the-barrier model based on the Landauer equation. We use a bandgap of E_g = 4.6 eV and a parabolic approximation to model the bottom of the conduction band in β-Ga2O3 with an effective mass of m*_e = 0.27 m_0. We treat both the top and bottom electrodes as Schottky contacts and assume that the Fermi levels align 0.5 eV above the midgap, resulting in a SB of 1.7 eV for electrons and 3.0 eV for holes. This assumption is experimentally justified given the measurable electron current but negligible hole current. Current is calculated as

$$I = \frac{2q}{h}\int T(E)\, M(E)\, \left[f(\mu_s, E) - f(\mu_d, E)\right] dE \tag{5.1}$$

where

$$M(E) = A\,\frac{m^{*}}{2\pi\hbar^{2}}\,(E - E_c) \tag{5.2}$$

is the density of modes in the channel, T(E) is the transmission coefficient, and f(µ, E) is the Fermi function. The conduction band edge, relative to the Fermi levels at the source and drain (µ_s, µ_d), is related to V_bg using the channel potential given by (5.3), where V_fb is the flat-band voltage, C_ox and C_q are the oxide and quantum capacitance per unit area, and N_it is the ionized (i.e., charged) near-interfacial trap density (i.e., near the graphene/SiO2 interface). Trap occupancy is determined based on the trap energy level relative to the Fermi level and is a function of V_bg. By adjusting the density and energy distribution of the traps (i.e., D_it), we adjust the slope of the calculated I-V characteristics (i.e., the subthreshold slope) to fit the experimental data [74], [98], [99]. From the fit shown in Figure 5.2 (b), we obtain an interface trap density of D_it = 6 × 10^12 cm^-2 eV^-1. For simplicity, we assume that the transmission coefficient is fixed at ~10^-5 for energies below the top of the barrier (i.e., the conduction band edge in β-Ga2O3), given the low probability for electrons to tunnel across the thick SB.
For energies above the top of the barrier, the transmission coefficient is fixed at ~10^-2 to account for scattering in the β-Ga2O3 channel. For the biasing conditions under consideration (i.e., weak and moderate inversion), thermionic emission is the dominant charge-injection mechanism, and a low tunneling probability is generally accepted in the analysis of SB-MOSFETs, which is applicable to our device [69], [98]. We note that in these heterostructure devices, weak electrostatic control of the β-Ga2O3 channel (i.e., small modulation of the bands) results in a limited electron current compared to SB-MOSFETs with better electrostatics. Nonetheless, for determining the breakdown characteristics and for obtaining the critical fields in the vertical β-Ga2O3/graphene heterostructure, the currently obtained device performance is sufficient, as these parameters are typically measured in the off-state mode of operation [165], [174].

5.2 High Breakdown Electrical Field in β-Ga2O3/graphene Barristor

To obtain the breakdown electric field perpendicular to the (100) plane, off-state (i.e., V_bg = 0 V) I_d - V_ds electrical characterization of the β-Ga2O3/graphene vertical devices is performed. Optical and AFM images (with extraction of the β-Ga2O3 thickness) of the device under test (DUT) are shown as an inset in Figure 5.3 (a). For this DUT, the β-Ga2O3 thickness is ~181 nm. Device breakdown is measured at room temperature and at 5 × 10^-5 Torr. Here, V_s and V_bg are fixed at 0 V, and V_d is swept up to the breakdown voltage (V_BD). The I_d - V_ds characteristics are plotted in Figure 5.3 (a), where the initially low I_d increases dramatically as V_d approaches V_BD. After breakdown, the device is permanently destroyed as a result of the large current that flows across the β-Ga2O3/graphene heterostructure. A breakdown voltage of 94 V is measured for this DUT, and the corresponding breakdown electric field is F_BD = 5.2 MV/cm.
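The quoted breakdown field is the breakdown voltage divided by the β-Ga2O3 thickness, which can be verified with a short unit conversion. This is a worked arithmetic sketch; the function name is illustrative, and the inputs are the 94 V breakdown voltage, the ~181 nm thickness and the 8 MV/cm theoretical field stated in the text.

```python
def breakdown_field_mv_per_cm(v_bd_volts, thickness_nm):
    """Average field across the layer, V_BD / thickness, in MV/cm."""
    thickness_cm = thickness_nm * 1e-7  # 1 nm = 1e-7 cm
    return v_bd_volts / thickness_cm / 1e6


V_BD = 94.0              # measured breakdown voltage in volts
THICKNESS_NM = 181.0     # measured layer thickness in nm
THEORETICAL_FIELD = 8.0  # theoretical critical field in MV/cm

f_bd = breakdown_field_mv_per_cm(V_BD, THICKNESS_NM)
print(f"F_BD = {f_bd:.2f} MV/cm ({f_bd / THEORETICAL_FIELD:.0%} of the theoretical value)")
```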
Two-dimensional technology computer-aided design (TCAD) simulations are performed to better understand the device operation near the breakdown biasing conditions. Figure 5.3 (b) plots equipotential contours for $V_d = V_{BD}$, from which an average vertical electric field of 5 MV/cm is obtained. The vertical electric field distribution inside the β-Ga2O3 layer, perpendicular to the (100) plane, is plotted in Figure 5.3 (c) for different values of $V_{ds}$.

Figure 5.3: (a) Off-state breakdown characterization. (b) Simulated electrical potential distribution. (c) Simulated electric field along the vertical direction. (d) Analytical calculation of the breakdown voltages perpendicular and parallel to the (100) plane assuming similar junction structures.

Using a linear vertical electric field approximation, we calculate $V_{BD}$ as a function of the β-Ga2O3 thickness ($W$) based on the avalanche breakdown criterion [178], [181] of

$$\int_0^W \alpha \, dx = 1. \qquad (5.4)$$

Here, we use the Chynoweth model for the impact ionization rate, $\alpha = a\, e^{-b/|F(x)|}$, where $a$ and $b$ are the model parameters and $F(x)$ is the electric field. We use values of $a$ and $b$ that were recently obtained by fitting the Chynoweth model to first-principles calculations in β-Ga2O3 along the different crystal directions [182]. Figure 5.3 (d) plots model calculation fits to the experimental data. The solid blue line corresponds to calculations of breakdown perpendicular to the (100) plane, and circles are data from the vertical devices in this work. The dashed red line shows calculations of breakdown parallel to the (100) plane, and the diamond is data from lateral devices in Ref. 3.
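The breakdown criterion of Eq. (5.4) can be solved for $V_{BD}$ numerically. The sketch below is illustrative only: it combines the Chynoweth rate with the linear field profile of Eq. (5.5) and the fitting constants quoted there (2 MV/cm and 50 V), and finds $V_{BD}$ by bisection. The Chynoweth parameters in the example are placeholders chosen so that the script runs; they are not the fitted values of [182], so the resulting number carries no physical meaning. The function names and the voltage bracket are assumptions of this sketch.

```python
import math


def field(x_cm, v_bd, w_cm, f0=2.0e6, v0=50.0):
    """Eq. (5.5): F(x) = F_p (1 - x/W) + F_0 in V/cm, with F_p = 2 (V_BD + V_0) / W."""
    f_peak = 2.0 * (v_bd + v0) / w_cm
    return f_peak * (1.0 - x_cm / w_cm) + f0


def ionization_integral(v_bd, w_cm, a, b, n=2000):
    """Midpoint estimate of the integral of alpha = a exp(-b/|F(x)|) over 0..W."""
    dx = w_cm / n
    total = 0.0
    for i in range(n):
        f = field((i + 0.5) * dx, v_bd, w_cm)
        total += a * math.exp(-b / abs(f)) * dx
    return total


def breakdown_voltage(w_cm, a, b, v_lo=0.0, v_hi=1000.0, tol=1e-6):
    """Bisection for the voltage at which the ionization integral reaches 1 (Eq. 5.4)."""
    if ionization_integral(v_lo, w_cm, a, b) >= 1.0:
        raise ValueError("criterion already met at v_lo")
    if ionization_integral(v_hi, w_cm, a, b) < 1.0:
        raise ValueError("criterion not met at v_hi; widen the bracket")
    while v_hi - v_lo > tol:
        v_mid = 0.5 * (v_lo + v_hi)
        if ionization_integral(v_mid, w_cm, a, b) < 1.0:
            v_lo = v_mid
        else:
            v_hi = v_mid
    return 0.5 * (v_lo + v_hi)


# Placeholder Chynoweth parameters, NOT the fitted values of [182].
A_PLACEHOLDER = 1.0e6   # cm^-1
B_PLACEHOLDER = 3.0e7   # V/cm
W_CM = 181e-7           # 181 nm, the DUT thickness quoted in the text
v_bd = breakdown_voltage(W_CM, A_PLACEHOLDER, B_PLACEHOLDER)
```

Bisection is sufficient here because the field, and therefore the ionization integral, increases monotonically with the applied voltage in this model.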
For the model calculations,

$$F(x) = F_p \left(1 - \frac{x}{W}\right) + F_0, \qquad (5.5)$$

where we express the peak electric field as a function of the breakdown voltage as $F_p = 2(V_{BD} + V_0)/W$. $F_0$ and $V_0$ are fitting parameters that account for the non-linearity of the electric field and the voltage drop in the source/drain regions. The fitting parameters are kept constant at 2 MV/cm and 50 V for breakdown along both crystal directions. Thus, the higher breakdown voltages that can be attained in the vertical devices are attributed to a lower impact ionization rate perpendicular to the (100) plane. This results from anisotropy in electron-electron interactions due to anisotropy of the high-energy conduction bands in β-Ga2O3, as demonstrated from first principles (i.e., density functional theory and full-band Monte Carlo) [182].

Figure 5.4 plots the breakdown electric field of various current-conducting channel materials [183]-[192]. In theory, semiconductor materials with a larger bandgap can sustain a larger critical breakdown electric field. We compare our results with other materials such as Si [186], 6H-SiC [187], [188], diamond [189], [190], and GaN [191], [192] and demonstrate that β-Ga2O3 achieves larger breakdown fields. Moreover, a larger breakdown field in our vertical β-Ga2O3/graphene heterostructure, compared to lateral β-Ga2O3 devices, is possible due to a lower impact ionization rate perpendicular to the (100) plane. Further studies of the transport properties of β-Ga2O3 along different crystal directions can help to better understand the anisotropic breakdown behavior.

Figure 5.4: Comparisons of breakdown electric fields measured on various semiconductor materials for power device applications. Red stars mark the results from this work.

In summary, a high breakdown electric field of 5.2 MV/cm is observed in the β-Ga2O3/graphene vertical heterostructure along the direction perpendicular to the (100) plane, in off-state breakdown measurements at room temperature in low-pressure vacuum. The gate bias can switch the current through this vertical heterostructure with an on/off ratio above $10^4$. The graphene-contacted device structure is appealing and advantageous for the study of β-Ga2O3 devices and transport properties, and it is also applicable to other bulk semiconductor materials.

Chapter 6 Conclusion, Challenges and Future Work

6.1 Summary

In summary, I have presented the research carried out during my Ph.D. studies. Low-dimensional-material-based electronic devices are studied for neuromorphic computing and other applications. In chapter 2, the experimental and theoretical study of black phosphorus based FETs helps clarify some fundamental device physics concepts, including channel scattering, interface traps, quantum capacitance, and field-effect mobility. The contribution of charged-impurity and phonon scattering to the transport properties of such FETs is studied and verified by a Landauer modeling approach. In chapter 3, aligned CNT FETs are demonstrated as non-volatile memory devices with synaptic responses. The application of aligned CNT FETs to adaptive online learning schemes and to homeostatic regulation of artificial neuron firing rates is discussed. We simulate the implementation of unsupervised learning for pattern recognition using a spike-timing-dependent plasticity scheme, show system-level performance with high recognition accuracy, and demonstrate improvements in the learning rate resulting from tuning the synaptic characteristics of aligned CNT devices. Then a reconfigurable black phosphorus/SnSe based synaptic device is discussed. It differs distinctly from conventional heterosynaptic devices in terms of both its operational characteristics and biological equivalence.
In chapter 4, the stochastic properties of a tin oxide/MoS2 hetero-memristor are studied. These stochastic properties are intentionally exploited to implement a stochastic neuron, which distinguishes this work from other research. A temperature-based Boltzmann machine is built from the stochastic neurons and circuits, and it demonstrates the capability of solving combinatorial optimization problems under different simulated annealing strategies. The demonstrated system can serve as a fundamental building block for deep learning neural networks. In chapter 5, a β-Ga2O3/graphene barristor is fabricated. An ultra-high breakdown field of up to 5.2 MV/cm is observed in this device.

6.2 Challenges and Future Work

Building neuro-inspired computing systems will continue to raise new problems and challenges, which call for interdisciplinary collaboration across in-depth biological study, material growth, innovation in device structure and function, integrated circuit design optimization, computing algorithm exploration, etc. I discuss the challenges and future research directions in the following three aspects.

6.2.1 Material Growth

To realize artificial neural networks, scaling low-dimensional-material-based devices and circuits up to planar wafer scale is an unavoidable challenge that researchers continue to work on. For 2D materials, mechanical exfoliation from the bulk form gives excellent material quality and device performance, but with limited flake sizes and scaling capability. 1D and 0D materials suffer from similar scaling issues. Therefore, it is critical to develop reliable bottom-up growth techniques that can precisely control flake thickness, size, composition, doping, etc., and that offer good scaling capability. The current progress in growing 2D materials using CVD, MBE, and MOCVD can be found in the literature [193], [194]. Growth conditions still need to be optimized for many low-dimensional materials. For example, the CVD growth of MoS2 film usually requires a furnace temperature of up to 650 °C, which makes it difficult to grow MoS2 film directly on most plastic substrates. Exploring the growth of MoS2 film in a low-temperature window (below 400 °C) would therefore be beneficial.

Despite the success in growing some kinds of 2D materials, there are still many materials that cannot be grown easily, such as h-BN and black phosphorus. Because black phosphorus (BP) film is experimentally difficult to realize yet is widely studied in various electronic and optical devices, during my Ph.D. we started exploring the growth of wafer-scale, single-crystalline black phosphorus under high-temperature, high-pressure conditions in a cylindrical piston. A red phosphorus (RP) film of tens of nanometers is first deposited on a sapphire substrate by CVD under 400 °C. The RP film is then treated under 700 °C and 1.5 GPa in the cylindrical piston, where it converts to BP. Subsequent characterization shows that a BP area of ~1000 µm x 300 µm is single crystalline, with a uniform $A_g$/$B_{2g}$ peak intensity ratio. Figure 6.1 shows the results. Further optimization of this method is ongoing.

Figure 6.1: (a) SEM image showing the BP film after growth. The substrate area is also shown for comparison. (b) BP film under the optical microscope. (c) Raman mapping of the $A_g$/$B_{2g}$ peak intensity ratio of the BP film for the same area as in (b). (d) Raman spectra of the BP film from a small area in (b).

If growth techniques for various low-dimensional materials can be realized, they will facilitate in-situ growth of mixed dimensional heterostructures with good material quality and scalability.
6.2.2 Mixed Dimensional Devices and Circuits

Mixed dimensional devices and circuits break the limit of using only one material, or materials of only one dimensionality, throughout an entire chip. Assembling 3D, 2D, 1D, and 0D materials together can enable novel physics and device functions. It is also a possible way to overcome the Moore's law scaling limit experienced by CMOS technology [28]. The three-dimensional integration of multiple nanotechnologies into a single chip was demonstrated by Max Shulaker et al. in 2017. That approach stacks many kinds of devices and circuits vertically and connects them through vias. The overall system can store a large amount of data and perform in-situ computing at the same time. Mixed dimensional devices and circuits can be incorporated into this platform easily. Together they may become candidate technologies for breaking the von Neumann bottleneck. Low-dimensional materials usually have no dangling bonds and can be integrated with each other or with 3D materials through van der Waals forces. Multiple combinations, such as 0D-1D, 2D-2D, 0D-2D, 1D-2D, 2D-3D, 0D-3D, and 0D-2D-3D, can be studied, which will keep broadening the impact of research based on low-dimensional materials.

6.2.3 New Research Capability

Without advanced research capability, much fundamental research today would come to a halt. To realize artificial neural networks with real intelligence, a deep understanding of the underlying biology, materials, devices and circuits, computer science, etc. is essential, which calls for new research capabilities.

The compressed ultrafast photography (CUP) technique is a newly developed imaging method that can take single-shot two-dimensional videos at a speed of up to one billion frames per second [195]. The setup of this system is shown schematically in Figure 6.2. The CUP system can be used to study signal transport in neural network systems by capturing spatial-temporal neural signal transmission in a single shot. It can help reveal subtle changes that happen within picoseconds. It can also be used to study non-repeatable phenomena in biological brain systems, which cannot be done with traditional methods that capture only time-averaged effects by integrating over multiple events. As one of the early demonstrations of the CUP system, we used it to study optical chaos in two-dimensional optical cavities. We were the first to visualize deterministic optical chaos while a laser was propagating inside the optical cavities. Moreover, semiconductor devices with stochastic properties can also be studied using the CUP system. Non-repeatable stochastic behavior, such as the formation or rupture of memristive filaments, can be visualized in this system. This will give more insight into how memristive devices actually work and how to further optimize them.

Figure 6.2: A schematic showing the CUP setup. The objects can be neural systems, semiconductor devices, etc.

The integration of existing research instruments can also generate new research capability. A traditional AFM can be integrated with a custom-built (DIY) setup so that it can pick up a single-layer 2D flake, rotate it by an arbitrary angle, and put it down on another single-layer 2D flake, producing a double-layer 2D structure with moiré patterns. ARPES can be integrated with MBE and STM, metal deposition, and a glovebox to form a production line that runs from material growth and in-situ material characterization all the way to device fabrication and characterization. Researchers can also combine Raman spectroscopy with the previously mentioned instruments. The broadened research capability will be rewarding.
Besides instrumental improvements, computer-assisted research can also be useful. Techniques such as machine learning have already been developed for applications in semiconductor research, such as material discovery, growth parameter optimization, etch anomaly analysis, and lithography hotspot detection. Recently, we have started combining machine learning with TCAD simulation to predict device performance [196]. In the first step, parameters calibrated from fabricated Ga2O3 devices were extracted and coded in TCAD. In the second step, the TCAD tool generated a dataset of over 150,000 simulations. In the third step, a machine learning model was trained on this dataset to achieve accurate predictions. In the fourth step, the trained machine learning model was used to predict device performance. More development in this research direction is needed, and computer-assisted research will become more helpful in the future.

In summary of this chapter, with a deep understanding of biological neural systems, good control of material growth, and efficient hardware and algorithm co-design, it is foreseeable that neuro-inspired computing will achieve fascinating breakthroughs and have significant impact in the future.

References

[1] J. Gantz and D. Reinsel, "The Digital Universe in 2020: Big Data, Bigger Digital Shadows, and Biggest Growth in the Far East," IDC Bulletin: IDC iView, 2012.
[2] Q. Le, M. Ranzato, R. Monga, M. Devin, K. Chen, G. Corrado, J. Dean, and A. Ng, "Building high-level features using large scale unsupervised learning," International Conference in Machine Learning (ICML), 2012.
[3] S. Chetlur, C. Woolley, P. Vandermersch, J. Cohen, and J. Tran, "cuDNN: Efficient Primitives for Deep Learning," arXiv:1410.0759, 2014.
[4] G. Lacey, G. Taylor, and S. Areibi, "Deep Learning on FPGAs: Past, Present, and Future," arXiv:1602.04283, 2016.
[5] Y.-H. Chen, T. Krishna, J. Emer, and V.
Sze, "Eyeriss: an energy-efficient reconfigurable accelerator for deep convolutional neural networks," IEEE International Solid-State Circuits Conference (ISSCC), 2016.
[6] N. Jouppi, "Google supercharges machine learning tasks with TPU custom chip," online, 2016.
[7] S. Furber, F. Galluppi, S. Temple, and L. Plana, "The SpiNNaker project," Proc. IEEE, vol. 102, no. 5, pp. 652-665, 2014.
[8] J. Schemmel, D. Bruderle, A. Grubl, M. Hock, K. Meier, and S. Millner, "A wafer scale neuromorphic hardware system for large-scale neural modeling," IEEE International Symposium on Circuits and Systems (ISCAS), 2014.
[9] P. Merolla, J. Arthur, R. Alvarez-Icaza, A. Cassidy, J. Sawada, F. Akopyan, B. Jackson, N. Imam, C. Guo, Y. Nakamura, B. Brezzo, I. Vo, S. Esser, R. Appuswamy, B. Taba, A. Amir, M. Flickner, W. Risk, R. Manohar, and D. Modha, "A million spiking-neuron integrated circuit with a scalable communication network and interface," Science, vol. 345, no. 6197, pp. 668-673, 2014.
[10] M. Prezioso, F. Merrikh-Bayat, B. Hoskins, G. Adam, K. Likharev, and D. Strukov, "Training and operation of an integrated neuromorphic network based on metal-oxide memristors," Nature, vol. 521, pp. 61-64, 2015.
[11] S. Kim, M. Ishii, S. Lewis, T. Perri, M. BrightSky, W. Kim, R. Jordan, G. Burr, N. Sosa, A. Ray, J. Han, C. Miller, K. Hosokawa, and C. Lam, "NVM neuromorphic core with 64 k-cell (256-by-256) phase change memory synaptic array with on-chip neuron circuits for continuous in-situ learning," IEEE International Electron Devices Meeting (IEDM), 2015.
[12] T. Chang, S. Jo, and W. Lu, "Short-term memory to long-term memory transition in a nanoscale memristor," ACS Nano, vol. 5, no. 9, pp. 7669-7676, 2011.
[13] S. Yu, B. Gao, Z. Fang, H. Yu, J. Kang, and H. Wong, "A low energy oxide-based electronic synaptic device for neuromorphic visual system with tolerance to device variation," Adv. Mater., vol. 25, no. 12, pp. 1774-1779, 2013.
[14] I. Wang, Y. Lin, Y. Wang, C. Hsu, and T.
Hou, "3D synaptic architecture with ultralow sub-10 fJ energy per spike for neuromorphic computation," IEEE International Electron Devices Meeting (IEDM), 2013.
[15] D. Kuzum, R. Jeyasingh, B. Lee, and H. Wong, "Nanoelectronic programmable synapses based on phase change materials for brain-inspired computing," Nano Lett., vol. 12, no. 5, pp. 2179-2186, 2012.
[16] S. Jo, T. Chang, I. Ebong, B. Bhadviya, P. Mazumder, and W. Lu, "Nanoscale memristor device as synapse in neuromorphic system," Nano Lett., vol. 10, no. 2, p. 12, 2010.
[17] T. Ohno, T. Hasegawa, T. Tsuruoka, K. Terabe, J. Gimzewski, and M. Aono, "Short-term plasticity and long-term potentiation mimicked in single inorganic synapses," Nat. Mater., vol. 10, no. 8, pp. 591-595, 2011.
[18] S. Manan, O. Bichler, D. Querlioz, G. Palma, E. Vianello, D. Vuillaume, C. Gamrat, and B. DeSalvo, "CBRAM devices as binary synapses for low-power stochastic neuromorphic systems: auditory (cochlea) and visual (retina) cognitive processing applications," IEEE International Electron Devices Meeting (IEDM), 2013.
[19] Y. Choi, I. Song, and Y. L. et al., "A 20 nm 1.8V 8Gb PRAM with 40MB/s program bandwidth," IEEE International Solid-State Circuits Conference (ISSCC), 2012.
[20] T. Liu, T. Yan, and R. S. et al., "A 130.7mm² 2-layer 32Gb ReRAM memory device in 24nm technology," IEEE International Solid-State Circuits Conference (ISSCC), 2013.
[21] A. Kawahara, R. Axuma, and K. A. et al., "An 8 Mb multilayered cross-point ReRAM macro with 443 MB/s write throughput," IEEE International Solid-State Circuits Conference (ISSCC), 2013.
[22] E. Rolls and A. Treves, Neural Networks and Brain Function. New York: Oxford University Press, 1998.
[23] D. Hebb, The Organization of Behavior. New York: Wiley, 1949.
[24] R. Morris, "Does synaptic plasticity play a role in information storage in the vertebrate brain?" Parallel Distributed Processing: Implications for Psychology and Neurobiology, pp. 248-285, 1989.
[25] H.
Shouval, S. Wang, and G. Wittenberg, "Spike timing dependent plasticity: a consequence of more fundamental learning rules," Front. Comput. Neurosci., vol. 4, no. 19, 2010.
[26] H. Tian, J. Tice, R. Fei, V. Tran, X. Yan, L. Yang, and H. Wang, "Low-symmetry two-dimensional materials for electronic and photonic applications," Nano Today, vol. 11, no. 6, pp. 763-777, 2016.
[27] J. Wu, N. Wang, X. Yan, and H. Wang, "Emerging low-dimensional materials for mid-infrared detection," Nano Research, pp. 1-15, 2020.
[28] V. K. Sangwan and M. C. Hersam, "Neuromorphic nanoelectronic materials," Nat. Nanotechnol., vol. 15, pp. 517-528, 2020.
[29] M. K. et al., "Zero-static power radio-frequency switches based on MoS2 atomristors," Nat. Commun., vol. 9, p. 2524, 2018.
[30] M. W. et al., "Robust memristors based on layered two-dimensional materials," Nat. Electron., vol. 1, pp. 130-136, 2018.
[31] R. X. et al., "Vertical MoS2 double-layer memristor with electrochemical metallization as an atomic-scale synapse with switching thresholds approaching 100 mV," Nano Lett., vol. 19, pp. 2411-2417, 2019.
[32] H. Z. et al., "Atomically thin femtojoule memristive device," Adv. Mater., vol. 29, p. 1703232, 2017.
[33] Y. S. et al., "Electronic synapses made of layered two-dimensional materials," Nat. Electron., vol. 1, pp. 458-465, 2018.
[34] H. T. et al., "Anisotropic black phosphorus synaptic device for neuromorphic applications," Adv. Mater., vol. 28, pp. 4991-4997, 2016.
[35] --, "Graphene dynamic synapse with modulatable plasticity," Nano Lett., vol. 15, pp. 8013-8019, 2015.
[36] M. T. S. et al., "Low-power, electrochemically tunable graphene synapses for neuromorphic computing," Adv. Mater., vol. 30, p. 1802353, 2018.
[37] J. Z. et al., "Ion gated synaptic transistors based on 2D van der Waals crystals with tunable dynamics," Adv. Mater., vol. 30, p. 1800195, 2018.
[38] R. J.
et al., "Synergistic gating of electro-iono-photoactive 2D chalcogenide neuristors: Coexistence of Hebbian and homeostatic synaptic metaplasticity," Adv. Mater., vol. 30, p. 1800220, 2018.
[39] V. S. et al., "Gate-tunable memristive phenomena mediated by grain boundaries in single-layer MoS2," Nat. Nanotechnol., vol. 10, pp. 403-406, 2015.
[40] M. Y. et al., "Memristive phase switching in two-dimensional 1T-TaS2 crystals," Sci. Adv., vol. 1, p. 1500606, 2015.
[41] X. Z. et al., "Ionic modulation and ionic coupling effects in MoS2 devices for neuromorphic computing," Nat. Mater., vol. 18, pp. 141-148, 2018.
[42] F. Z. et al., "Electric-field induced structural transition in vertical MoTe2 and Mo1-xWxTe2 based resistive memories," Nat. Mater., vol. 18, pp. 55-61, 2018.
[43] I. S. et al., "Aligned carbon nanotube synaptic transistors for large-scale neuromorphic computing," ACS Nano, vol. 12, pp. 7352-7361, 2018.
[44] S. K. et al., "Pattern recognition using carbon nanotube synaptic transistors with an adjustable weight update protocol," ACS Nano, vol. 11, pp. 2814-2822, 2017.
[45] M. S. et al., "Three-dimensional integration of nanotechnologies for computing and data storage on a single chip," Nature, vol. 547, pp. 74-78, 2017.
[46] W. Xu, S. Hwang, and T. Lee, "Organic core-sheath nanowire artificial synapses with femtojoule energy consumption," Sci. Adv., vol. 2, e1501326, 2016.
[47] C. M. et al., "Artificial neuron based on integrated semiconductor quantum dot mode-locked lasers," Sci. Rep., vol. 6, p. 39317, 2016.
[48] P. M. et al., "Electro-photo-sensitive memristor for neuromorphic and arithmetic computing," Phys. Rev. Appl., vol. 5, p. 054011, 2016.
[49] Y. W. et al., "Photonic synapses based on inorganic perovskite quantum dots for neuromorphic computing," Adv. Mater., vol. 30, p. 1802883, 2018.
[50] S. H.
et al., "Black phosphorus quantum dots with tunable memory properties and multilevel resistive switching characteristics," Adv. Sci., vol. 4, p. 1600435, 2017.
[51] F. A. et al., "An organic nanoparticle transistor behaving as a biological spiking synapse," Adv. Funct. Mater., vol. 20, pp. 330-337, 2010.
[52] Z. W. et al., "Memristors with diffusive dynamics as synaptic emulators for neuromorphic computing," Nat. Mater., vol. 16, pp. 101-108, 2017.
[53] S. G. et al., "Robust resistive memory devices using solution-processable metal-coordinated azo aromatics," Nat. Mater., vol. 16, pp. 1216-1224, 2017.
[54] L. C. et al., "Nonlinear circuit foundations for nanodevices. I. The four-element torus," Proc. IEEE, vol. 91, pp. 1830-1859, 2003.
[55] S. Yu, Neuro-inspired computing using resistive synaptic devices. Switzerland: Springer International Publishing AG, 2017.
[56] S. E. et al., "Nonlinear circuit foundations for nanodevices. I. The four-element torus," IEEE International Electron Devices Meeting (IEDM), 2013.
[57] P. M. S. et al., "Sparse coding with memristor networks," Nat. Nanotechnol., vol. 12, no. 784, 2017.
[58] Z. W. et al., "Fully memristive neural networks for pattern classification with unsupervised learning," Nat. Electron., vol. 1, no. 137, 2018.
[59] M. D. P. et al., "A scalable neuristor built with Mott memristors," Nat. Mater., vol. 12, no. 2, pp. 114-117, 2013.
[60] H. H. et al., "Quasi-Hodgkin-Huxley neurons with leaky integrate-and-fire functions physically realized with memristive devices," Adv. Mater., vol. 31, no. 3, 2019.
[61] J. F. et al., "All-optical spiking neurosynaptic networks with self-learning capabilities," Nature, vol. 569, no. 208, 2019.
[62] J. P. et al., "Towards artificial general intelligence with hybrid Tianjic chip architecture," Nature, vol. 572, no. 106, 2019.
[63] L. L. et al., "Black phosphorus field-effect transistors," Nat. Nanotechnol., vol. 9, no. 5, pp. 372-377, 2014.
[64] Q. H. W. et al., "Electronics and optoelectronics of two-dimensional transition metal dichalcogenides," Nat. Nanotechnol., vol. 7, no. 11, pp. 699-712, 2012.
[65] Y. S. et al., "Gate-tuned thermoelectric power in black phosphorus," Nano Lett., vol. 16, no. 8, pp. 4819-4824, 2016.
[66] H. W. et al., "Black phosphorus radio-frequency transistors," Nano Lett., vol. 14, no. 11, pp. 6424-6439, 2014.
[67] G. L. et al., "Achieving ultrahigh carrier mobility in two-dimensional hole gas of black phosphorus," Nano Lett., vol. 16, no. 12, pp. 7768-7773, 2016.
[68] F. X. et al., "Rediscovering black phosphorus as an anisotropic layered material for optoelectronics and electronics," Nat. Commun., vol. 5, no. 4458, 2016.
[69] A. V. P. et al., "Analysing black phosphorus transistors using an analytic Schottky barrier MOSFET model," Nat. Commun., vol. 6, no. 8948, 2015.
[70] R. A. D. et al., "Transport properties of ultrathin black phosphorus on hexagonal boron nitride," Appl. Phys. Lett., vol. 106, no. 8, 2015.
[71] C. R. D. et al., "Boron nitride substrates for high-quality graphene electronics," Nat. Nanotechnol., vol. 5, no. 10, 2010.
[72] L. L. et al., "High-performance p-type black phosphorus transistor with scandium contact," ACS Nano, vol. 10, no. 4, 2016.
[73] Z. L. et al., "Anisotropic in-plane thermal conductivity observed in few-layer black phosphorus," Nat. Commun., vol. 6, p. 8572, 2015.
[74] I. S. E. et al., "Charge trapping in aligned single-walled carbon nanotube arrays induced by ionizing radiation exposure," J. Appl. Phys., vol. 115, no. 5, p. 054506, 2014.
[75] Y. Y. I. et al., "The role of charge trapping in MoS2/SiO2 and MoS2/hBN field-effect transistors," 2D Mater., vol. 3, no. 3, p. 035004, 2016.
[76] Y. L. et al., "Mobility anisotropy in monolayer black phosphorus due to scattering by charged impurities," Phys. Rev. B, Condens. Matter, vol. 93, no. 16, p. 165402, 2016.
[77] R. L.
et al., "Spatial variation of currents and fields due to localized scatterers in metallic conduction," IBM J. Res. Develop., vol. 1, no. 3, pp. 223-231, 1957.
[78] S. Datta, Electronic Transport in Mesoscopic Systems. U.K.: Cambridge Univ. Press, 1995.
[79] K. M. et al., "A unified simulation of Schottky and ohmic contacts," IEEE Trans. Electron Devices, vol. 47, no. 1, pp. 103-108, 2000.
[80] M. S. Lundstrom, Fundamentals of carrier transport, 2nd ed. U.K.: Cambridge Univ. Press, 1990.
[81] B. L. et al., "Ab initio study of electron-phonon interaction in phosphorene," Phys. Rev. B, Condens. Matter, vol. 91, p. 235419, 2015.
[82] S. Datta, Quantum Transport: Atom to Transistor. U.K.: Cambridge Univ. Press, 2005.
[83] I. S. E. et al., "Modeling the non-uniform distribution of radiation-induced traps," IEEE Trans. Nuc. Sci., vol. 59, no. 4, pp. 723-727, 2012.
[84] N. H. T. et al., "Interface state energy distribution and Pb defects at Si(110)/SiO2," J. Appl. Phys., vol. 109, p. 13710, 2011.
[85] C. W. C. et al., "The effect of interface processing on the distribution of interfacial defect states and the C-V characteristics of III-V metal-oxide-semiconductor field effect transistors," J. Appl. Phys., vol. 109, p. 023714, 2011.
[86] W. Z. et al., "Electronic transport and device prospects of monolayer molybdenum disulphide grown by chemical vapour deposition," Nat. Commun., vol. 5, no. 3087, 2014.
[87] Y. Y. I. et al., "Highly-stable black phosphorus field-effect transistors with low density of oxide traps," NPJ 2D Mater. Appl., vol. 1, pp. 1-7, 2017.
[88] R. D. et al., "Trap spectroscopy by charge injection and sensing (TSCIS): A quantitative electrical technique for studying defects in dielectric stacks," IEDM Tech. Dig., pp. 1-4, 2008.
[89] J. Q. et al., "High-mobility transport anisotropy and linear dichroism in few-layer black phosphorus," Nat. Commun., vol. 5, no. 4475, 2014.
[90] D. J. P. et al., "High-performance n-type black phosphorus transistors with type control via thickness and contact-metal engineering," Nat. Commun., vol. 6, p. 7809, 2015.
[91] Y. K. et al., "Landauer-Datta-Lundstrom generalized transport model for nanoelectronics," J. Nanosci., p. 725420, 2014.
[92] H. L. et al., "Semiconducting black phosphorus: Synthesis, transport properties and electronic applications," Chem. Soc. Rev., vol. 44, no. 9, pp. 2732-2743, 2015.
[93] J. N. et al., "Few-layer black phosphorus field-effect transistors with reduced current fluctuation," ACS Nano, vol. 8, no. 11, pp. 11753-11762, 2014.
[94] X. L. et al., "The renaissance of black phosphorus," Proc. Nat. Acad. Sci., vol. 112, no. 15, pp. 4523-4530, 2015.
[95] H. T. et al., "Dynamically reconfigurable ambipolar black phosphorus memory device," ACS Nano, vol. 10, pp. 10428-10435, 2016.
[96] T. L. et al., "High field transport of high performance black phosphorus transistors," Appl. Phys. Lett., vol. 110, pp. 2-5, 2017.
[97] S. D. et al., "High performance multilayer MoS2 transistors with scandium contacts," Nano Lett., vol. 13, pp. 100-105, 2013.
[98] I. S. E. et al., "Transport properties and device prospects of ultrathin black phosphorus on hexagonal boron nitride," IEEE Trans. Electron Devices, vol. 64, pp. 5163-5171, 2017.
[99] I. E. et al., "The impact of defect scattering on the quasi-ballistic transport of nanoscale conductors," J. Appl. Phys., vol. 117, p. 84319, 2015.
[100] R. Kim, S. Datta, and M. S. Lundstrom, "Influence of dimensionality on thermoelectric device performance," J. Appl. Phys., vol. 105, p. 34506, 2009.
[101] D. J. F. et al., "Device scaling limits of Si MOSFETs and their application dependencies," Proc. IEEE, vol. 89, pp. 259-288, 2001.
[102] T. King, "FinFETs for nanoscale CMOS digital integrated circuits," IEEE ICCAD, pp. 207-210, 2005.
[103] G. F. et al., "Electronics based on two-dimensional materials," Nat. Nanotechnol., vol. 9, pp.
768-779, 2014.
[104] C. Q. et al., "Scaling carbon nanotube complementary transistors to 5-nm gate lengths," Science, vol. 355, pp. 271-276, 2017.
[105] K. Kim and C. C. et al., "Carbon nanotube synapse with dynamic logic and learning," Adv. Mater., vol. 25, pp. 1693-1698, 2013.
[106] M. Shulaker and T. W. et al., "Monolithic 3D integration: a path from concept to reality," Design Automation and Test in Europe Conference and Exhibition, pp. 1197-1202, 2015.
[107] C. Li and M. H. et al., "Analogue signal and image processing with large memristor crossbars," Nat. Electron., vol. 1, pp. 52-59, 2018.
[108] B. Indiveri and S. Liu, "Memory and information processing in neuromorphic systems," Proc. IEEE, vol. 103, pp. 1379-1397, 2015.
[109] K. Kim and S. G. et al., "A functional hybrid memristor crossbar-array/CMOS system for data storage and neuromorphic applications," Nano Lett., vol. 12, pp. 389-395, 2012.
[110] Q. Luo and X. X. et al., "Super non-linear RRAM with ultra-low power for 3D vertical nano-crossbar arrays," Nanoscale, vol. 8, pp. 15629-15636, 2016.
[111] J. Yang and D. S. et al., "Memristive devices for computing," Nat. Nanotechnol., vol. 8, pp. 13-24, 2013.
[112] S. Park and J. N. et al., "Nanoscale RRAM-based synaptic electronics: toward a neuromorphic computing device," Nanotechnol., vol. 24, p. 384009, 2013.
[113] T. Chang and S. K. et al., "Synaptic behaviors and modeling of a metal oxide memristive device," Appl. Phys. A: Mater. Sci. Process., vol. 102, pp. 857-863, 2011.
[114] H. Wong and H. L. et al., "Metal-oxide RRAM," Proc. IEEE, vol. 100, pp. 1951-1970, 2012.
[115] A. Fantini and L. G. et al., "Intrinsic switching variability in HfO2 RRAM," IEEE International Memory Workshop, pp. 1-4, 2013.
[116] S. Yu and B. G. et al., "Stochastic learning in oxide binary synaptic device for neuromorphic computing," Front. Neurosci., vol. 7, pp. 1-9, 2013.
[117] Y. Li and M. Z.
et al., "Investigation on the conductive filament growth dynamics in resistive switching memory via a universal Monte Carlo simulator," Sci. Rep., vol. 7, pp. 1-11 , 2017. [118] S. Kim and S. C. et al., "Comperhensive physical model of dynamic resistive switching in an oxide memristor," ACS Nano., vol. 8, pp. 2369-2376, 2014. [119] D. Acharyya and A. H. et al., "Microelectronics reliability a journey towards reliability improvement of TiO 2 based resistive random access memory: a review," Microelectron. Reliab., vol. 54, pp. 541- 560, 2014. [120] X. Gu and S. Iyer., "Unsupervised learning using charge-trap transistors," IEEE Electron Device Lett., vol. 38, pp. 1204-1207, 2017. [121] M. Shulaker and G. H. et al., "Carbon nanotubes computer," Nature, vol. 501, pp. 526- 530, 2013. [122] M. Shulaker and H. W. et al., "Carbon nanotubes for monolithic 3D ICs," Carbon Nanotubes for Interconnects, pp. 315- 333, 2017. [123] F. Liu and S. M. et al., "Study of random telegraph signals in single-walled carbon nanotube field effect transistors," IEEE Trans. Nanotechnol., vol. 5, pp. 441- 445, 2006. [124] S. Han and J. T. et al. , "High-speed logic integrated circuits with solution-processed self assembled carbon nanotubes," Nat. Nanotechnol., vol. 12, pp. 861- 865, 2017. [125] Y. Che and Y. L. et al. , "T-gate aligned nanotube radio frequency transistors and circuits with superior performance," ACS Nano., vol. 7, pp. 4343- 4350, 2013. [126] T. Rueckes and K. K. et al. , "Carbon nanotube-based nonvolatile random access memory for molecular computing," Science, vol. 289, pp. 94- 98, 2000. [127] K. Jinkins and J. C. et al., "Nanotube alignment mechanism in floating evaporative self assembly, " Langmuir, vol. 33, pp. 13407-13414, 2017. [128] Y. Che and A. B. et al. , "Self-aligned T-gate high-purity semiconducting carbon nanotube RF transistors operated in quasi-ballistic transport and quantum capacitance regime, " ACS Nano, vol. 6, pp. 6936- 6943, 2012. [129] C. Rutherglen and D. 
J. et al., "Nanotube electronics for radiofrequency applications," Nat. Nanotechnol. , vol. 4, pp. 811- 819, 2009. [130] R. Park and M. S. et al., "Hysteresis in carbon nanotube transistors: measurement and analysis of trap density, energy level and spatial distribution, " ACS Nano, vol. 10, pp. 4599- 4608, 2016. [131] B. Sarkar and B. L. et al., "Understanding the gradual reset in Pt/ A1 2O 3/Ni RRAM for synaptic applications," Semicond. Sci. Technol. , vol. 30, p. 105 014, 2015. [132] S. Kim and J. Y. et al., "Carbon nanotube synaptic transistor network for pattern recognition, " ACS Appl. Mater. Interfaces, vol. 7, pp. 25479-25 486, 2015. [133] Y. Lecun and L. B. et al. , "Gradient-based learning applied to document recognition," Proc. IEEE, vol. 86, pp. 1- 46, 1998. [134] D. Querlioz and 0. B. et al., "Immunity to device variations in a spiking neural network with memristive nanodevice," IEEE Trans. Nanotechnol., vol. 12, pp. 288-295, 2013. [135] Y. Yang and B. C. et al. , "Memristive physically evolving networks enabling the emulation of heterosynaptic plasticity," Adv. Mater., vol. 27, pp. 7720-7727, 2015. 123 [136] A. Thomson and J. D. et al., "Single axon excitatory postsynaptic potentials in neocortical interneurons exhibit pronounced paired pulse facilitation," Neuroscience, vol. 54, pp. 347- 360, 1993. [137] E. Buhl and K. H. et al., "Diverse sources of hippocampal unitary inhibitory postsynaptic potentials and the number of synaptic release sites," Nature, vol. 368, pp. 823- 828, 1994. [138] M. Mattson and S. K. et al. , "Excitatory and inhibitory neurotransmitters in the generation and degeneration of hippocampal neuroarchitecture," Brain Res. , vol. 478, pp. 337- 348, 1989. [139] D. Root and C. M.-A. et al., "Single rodent mesohabenular axons release glutamate and GABA," Nat. Neurosci., vol. 17, pp. 1543- 1551, 2014. [140] K. Ganguly and A. S. 
et al., "GABA itself promotes the developmental switch of neuronal gabaergic responses from excitation to inhibition," Cell, vol. 105, pp. 521- 532, 2001. [141] G. Indiveri and E. C. et al., "A VLSI array of low-power spiking neurons and bistable synapses with spike-timing dependent plasticity," IEEE Trans. Neural Netw., vol. 17, pp. 211- 221, 2006. [142] L. Li and Z. C. et al., "Slngle-layer single-crystalline SnSe nanosheets," J. Am. Chem. Soc., vol. 135, pp. 1213-1216, 2013. [143] L. Abbott and S. S. et al. , "Competitive hebbian learning through spike-timing-dependent synaptic plasticity," Nat. Neurosci., vol. 3, pp. 919- 926, 2000. [144] S. Kirkpatrick and C. G. et al. , "Optimization by simulated annealing," Science, vol. 220, pp. 671- 680, 1983. [145] H. Larochelle and M. M. et al., "Learning algorithms for the classification restricted boltzmann machine," The Journal of Machine Learning Research, vol. 13, pp. 643-669, 2012. [146] G. L. et al. , "Temperature based restricted boltzmann machines," Sci. Rep., vol. 6, pp. 19133, 2016. [147] P. K. et al., "Resistor-logic demultiplexers for nanoelectronics based on constant-weight codes," Nanotechnology, vol. 17, pp. 1052, 2006. [148] S. A. et al. , "Equibalent-accuracy accelerated neural-network training using analogue memory, " Nature, vol. 558, pp. 60- 67, 2018. [149] H. vVong and S. Salahuddin, "Memory leads the way to better computing," Nat. Nanotechnol., vol. 10, pp. 191-194, 2015. [150] M. Hu and Y. "\¥. et al. , "Leveraging stochastic memoristor devices in neuromorphic hardware systems," IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol. 6, pp. 235- 246, 2016. [151] H. J. et al. , "A novel true random number generator based on a stochastic diffusive memristor," Nature Communication, vol. 8, pp. 882, 2017. [152] R. Z. et al., "Nanoscale diffusive memristor crossbars as physical unclonable functions, " Nanoscale, vol. 10, pp. 2721-2726, 2018. [153] W. Z. 
et al., "Neuro-inspired computing chips," Nat. Electronics, vol. 3, pp. 371- 382, 2020. [154] C. Huang and C. S. et al., "A contact-resistive random-access-memory-based true random number generator," IEEE Electron Device Letters, vol. 33, pp. 1108- 1110, 2012. [155] S. Z. et al. , "Controlled synthesis of single-crystal SnSe nanoplates," Nano Research, vol. 8, pp. 288-295, 2015. [156] B. P. et al., "Phase-controlled synthesis of SnOx thin films by atomic layer deposition and post-treatment ," Applied Surface Science, vol. 480, pp. 472- 477, 2019. [157] S. J. et al., "Programmable resistance switching in nanoscale two-terminal devices," Nano Lett. , vol. 9, pp. 495-500, 2009. [158] E. Aarts and J. Korst, Simulated annealing and boltzmann machines. United States, 1988. [159] F. Heras and J. Larrosa, "A max-sat inference-based pre-processing for max-clique," International Conference on THeory and Applications of Satisfiability Testing, pp. 139- 152, 2008. 124 [160] J. \¥allner and A. N. et al. , "Complexity results and algorithms for extension enforcement in abstract argumentation," Journal of Artificial Intelligence Research, vol. 60, pp. 1-40, 2017. [161] C. Ansotegui and M. B. et al., "SAT-based MAXSAT algorithms," Artificial Intelligence, vol. 196, pp. 77- 105, 2013. [162] A. dAnjou and M. G. et al., "Solving stisfiability via boltzmann machines," IEEE Transaction on pattern analysis and machine intelligence, vol. 15, pp. 514- 521, 1993. [163] B. Zhang and S.S. et al., "Binary vector dissimilarity measures for handwriting identification," Proc. SPIE, vol. 5010, 2003. [164] H. F. et al. , "Comparison of distance measures in cluster analysis with dichotomous data," Journal of Data Science, vol. 3, pp. 85-100, 2005. [165] M. Higashiwaki and K. S. et al., "Depletion-mode Ga2O3 metal-oxide-semiconductor field effect transistors on ,B-Ga2O3 (010) substrates and temperature dependence of their device characteristics," Appl. Phys. Lett. , vol. 103, pp. 123511, 2005. 
[166] W. Hwang and A. V. et a l. , "High-voltage field effect transistors with wide-bandgap ,B Ga2O3 nano-membranes," Appl. Phys. Lett. , vol. 104, pp. 203111 , 2014. [167] Z. Galazka and R. U. et al., "Czochralsi growth and characterization of ,B-Ga2O3 single crystals," Cryst. Res. Technol. , vol. 45, pp. 1229-1236, 2010. [168] N. Ma and N. T. et al. , "Intrinsic electron mobility limits in ,B-Ga 2O3," Appl. Phys. Lett., vol. 109, pp. 212101 , 2016. [169] D. Shinohara and S. Fujita, "Heteroepitaxy of corundum-structured a-Ga2O 3 thin films on a -Al2O3 substrats by ultrasonic mist chemical vapor deposition," Jpn. J. Appl. Phys. , Part I, vol. 47, pp. 7311, 2008. [170] H. Murakami and K. N. et al. , "Homoepitaxial growth of ,B-Ga2O3 layers by halide vapor phase epitaxy," Appl. Phys. Express, vol. 8, pp. 015503, 2015. [171] K. Sasaki and A. K. et al. , "Device-quality ,B-Ga 2O 3 expitaxial films fabricated by ozone molecular beam epitaxy," Appl. Phys. Express, vol. 5, pp. 035502, 2012. [172] K. Konishi and K. G. et al., "1-kV vertical ,B-Ga2O3 field-plated Schottky barrier diodes," Appl. Phys. Lett. , vol. 110, pp. 103506, 2017. [173] M. Wong and K. S. et al. , "Field-plated Ga2O3 MOSFETs with a breakdown voltage of over 750V," IEEE Elec. Dev. Lett., vol. 37, pp. 212- 215, 2016. [174] K. Chabak and N. M. et al. , "Enhancement-mode Ga2O3 wrap-gate fin field-effect transistors on native (100) ,B-Ga2O3 substrate with high breakdown voltage," Appl. Phys. Lett., vol. 109, pp. 213501 , 2016. [175] A. Green and K. C. et al. , ",B-Ga2O3 MOSFETs for radio frequency operation," IEEE Elec. Dev. Lett., vol. 38, pp. 790- 793, 2017. [176] V. Bermudez, "The structure of low-index surfaces of ,B-Ga2O 3," Chem. Phys., vol. 323, pp. 193-203, 2006. [177] H. Yang and J. H. et al., "Graphene barristor, a triode device with a gate-controlled Schottky barrier," Science, vol. 336, pp. 1140- 1143, 2012. [178] E. Napoli and H. W. et al. 
, "Analytical calculation of the breakdown voltage for balanced, symmetrical superjunction power devices," 2010 22nd International Symposium on Power Semiconductor Devices and !C's, pp. 205-208, 2010. [179] S. Ahn and F. R. et al., "Effect of front and back gates on ,B-Ga2O3 nano-belt field-effect transistors," Appl. Phys. Lett. , vol. 109, pp. 062102, 2016. [180] J. Kim and M. M. et al. , "Quasi-two-dimensional h-BN / ,B-Ga2O3 heterostructure metal insulator-semiconductor field-effect transistors," ACS Appl. Mater. Interfaces, vol. 9, pp. 21322, 2017. [181] S. M. Sze and K. K. Ng, Physics of Semiconductor Devices. Wiley, 2006. [182] K. Ghosh and U. Singisetti, "Impact ionization in ,B-Ga2O3," Journal of Appl. Phys., vol. 124, 2018. 125 [183] H. Zhou and M. S. et al., "High-performance depletion/enhancement-mode /3-Ga 20 3 on insulator (GOOI) field-effect transistors with record drain currents of 600/450 mA/mm," IEEE Electron Device Lett. , vol. 38, pp. 103- 106, 2017. [184] M. Higashiwaki and K. S. et al. , "Gallium oxide (Ga20 3 ) metal-semiconductor field-effect transistors on single-crystal /3-Ga 20 3 (010) substrates," Appl. Phys. Lett., vol. 100, pp. 013504, 2012. [185] A. Green and K. C. et al., "3.8MV /cm breakdown strength of MOVPE-grown Sn-doped /3-Ga 20 3 MOSFETs," IEEE Electron Dev. Lett. , vol. 37, pp. 902- 905, 2016. [186] A. Chynoweth, "Uniform silicon p-n junctions. II. Ionization rates for electrons," J. Appl. Phys. , vol. 31, pp. 1161- 1165, 1960. [187] T. Kimoto and T. U. et al. , "High-voltage (>lkV) SiC Schottky barrier diodes with low on-resistances," IEEE Electron Device Lett. , vol. 14, pp. 548-550, 1993. [188] D. Alok and B. B. et al., "A simple edge termination for silicon carbide devices with nearly ideal breakdown voltage," IEEE Electron Device Lett., vol. 15, pp. 394- 395, 1994. [189] M. Syamsul and Y. K. et a l. , "High voltage breakdown (1.8kV) of hydrogenated black diamond field effect transistor, " Appl. Phys. Lett. , vol. 109, pp. 
203504, 2016. [190] H. Umezawa and T. M. et al., "Diamond metal-semiconductor field-effect transistor with breakdown voltage over 1.5kV," IEEE Electron Device Lett., vol. 35, pp. 1112-1114, 2014. [191] M. Zhu and B. S. et al. , "1.9-kV AlGaN/GaN lateral Schottky barrier diodes on silicon," IEEE Electron Device Lett. , vol. 36, pp. 375- 377, 2015. [192] C. Tsou and K. W. et al. , "2.07-kV AlGaN/GaN Schottky barrier diodes on silicon with high Baliga's figure-of-merit, " IEEE Electron Device Lett., vol. 37, pp. 70-73, 2016. [193] A. Mannix and B. K. et al. , "Synthesis and chemistry of elemental 2D materials," Nat. Rev. Chem., vol. 1, 2017. [194] P. Shen and Y. L. et al. , "CVD technology for 2-D materials," IEEE Transactions on Electron Devices, vol. 65, pp. 4040- 4052, 2018. [195] L. Gao and J. L. et al., "Single-shot compressed ultrafast photography at one hundred billion frames per second," Nature, vol. 516, pp. 74-77, 2014. [196] H. Wong and M. X. et al., "TCAD-machine learning framework for device variation and operating temperature analysis with experimental demonstration," IEEE Journal of the Electron Devices Society, vol. 8, pp. 992- 1000, 2020. 126
Abstract
The computing demand of data-intensive applications far exceeds the capabilities of von Neumann computing systems, which suffer from the “von Neumann bottleneck”, the “Moore’s Law” scaling limit, high energy cost and low computing efficiency. Neuromorphic computing takes inspiration from the human brain to create new types of hardware with advantages such as “massive parallelism”, “low power consumption” and “stochasticity”. Low-dimensional materials offer rich physical properties and are superior building blocks for developing advanced devices for neuromorphic computing. This thesis focuses on building novel electronic devices and circuits for neuromorphic computing applications by addressing these aspects: (1) studying fundamental properties of low-dimensional materials and devices
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
2D layered materials: fundamental properties and device applications
III-V semiconductor heterogeneous integration platform and devices for neuromorphic computing
Semiconductor devices for vacuum electronics, electrochemical reactions, and ultra-low power in-sensor computing
Building blocks for 3D integrated circuits: single crystal compound semiconductor growth and device fabrication on amorphous substrates
Optoelectronic, thermoelectric, and photocatalytic properties of low dimensional materials
Printed electronics based on carbon nanotubes and two-dimensional transition metal dichalcogenides
Light Emission from Carbon Nanotubes and Two-Dimensional Materials
Nanomaterials for energy storage devices and electronic/optoelectronic devices
Power-efficient biomimetic neural circuits
Synthesis, characterization, and device application of two-dimensional materials beyond graphene
Memristor device engineering and memristor-based analog computers for mobile robotics
Light-matter interactions in engineered microstructures: hybrid opto-thermal devices and infrared thermal emission control
GaN power devices with innovative structures and great performance
Low-dimensional asymmetric crystals: fundamental properties and applications
Nonlinear optical nanomaterials in integrated photonic devices
Memristive device and architecture for analog computing with high precision and programmability
Silicon-based RF/mm-wave power amplifiers and transmitters for future energy efficient communication systems
Development of novel optical materials for whispering gallery mode resonator for nonlinear optics
Cathode and anode materials for sodium ion batteries
Energy-efficient computing: Datacenters, mobile devices, and mobile clouds
Asset Metadata
Creator
Yan, Xiaodong (author)
Core Title
Low-dimensional material based devices for neuromorphic computing and other applications
School
Viterbi School of Engineering
Degree
Doctor of Philosophy
Degree Program
Electrical Engineering
Publication Date
11/27/2020
Defense Date
10/16/2020
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
low-dimensional material, neuromorphic computing, OAI-PMH Harvest, semiconductor device
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Wang, Han (committee chair), Ravichandran, Jayakanth (committee member), Wu, Wei (committee member)
Creator Email
xiaodony@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c89-395339
Unique identifier
UC11668217
Identifier
etd-YanXiaodon-9150.pdf (filename), usctheses-c89-395339 (legacy record id)
Legacy Identifier
etd-YanXiaodon-9150.pdf
Dmrecord
395339
Document Type
Dissertation
Rights
Yan, Xiaodong
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA