Theoretical and computational foundations for cyber‐physical systems design
(USC Thesis Other)
Theoretical and Computational Foundations for Cyber-Physical Systems Design
by Yuankun Xue

A Dissertation Presented to the FACULTY OF THE GRADUATE SCHOOL UNIVERSITY OF SOUTHERN CALIFORNIA In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (Electrical Engineering)

May 2018

Copyright 2018 Yuankun Xue

Acknowledgements

First and foremost, I would like to express my sincere gratitude to my thesis advisor, mentor and friend Paul Bogdan for his constant and powerful support throughout the course of my Ph.D. work at the University of Southern California. He has been perpetually supportive since the days I began working on ambitious large-scale manycore systems as a graduate student at Fudan University in China. Thanks to his guidance and inspirational advisement, I have been exposed not only to the most rigorous academic training to be a qualified researcher of strong ambition, but also to the mentorship and help that navigated me through my most difficult times and turned me into a stronger person in every possible aspect. I enjoyed every moment of working under his supervision, with the full freedom that I needed to dream big and make great things happen. The countless and extremely challenging research discussions I have had with Paul have prepared me to embrace research problems across a vast range of topics, from electrical engineering, computer science, statistics and machine learning to systems biology. His intellectual curiosity in research and sincere friendship in life have inspired me to treat him as a role model and follow a path that brings much greater impact to humanity.

Secondly, I consider myself fortunate to have received feedback on my research from my thesis committee members: Professor Aiichiro Nakano, Professor Bhaskar Krishnamachari and Professor Edmond Jonckheere.
I am very thankful to Professor Aiichiro Nakano for his captivating perspective in physics, which related my work in multi-fractal analysis to domains beyond my own vision. I also very much enjoyed all the feedback and discussion related to statistical inference and data science with Professor Bhaskar Krishnamachari during my qualifying exam and thesis defense. His challenging and interesting questions have lifted my ambition to extend my research to a broader class of domains. I am particularly grateful to Professor Edmond Jonckheere for numerous discussions over the course of my study in complex networks, dynamical systems and multifractal geometry. His wise suggestions and research advisement have proven extremely valuable.

I also want to present my sincere gratitude to Professor Sergio Pequito at Rensselaer Polytechnic Institute for his invaluable help and guidance throughout the stages of my research endeavors. I am very thankful to him for taking the time to teach me to approach research problems in an organized and principled way. His crucial suggestions for improving my research methodology and presentation skills have been perpetually helpful. I have been privileged to benefit from the collaborations with Professor Yanzhi Wang and Professor Xue Lin at Northeastern University. Their academic aggression and piloting spirit have always been the beacon in my life path. I also want to express my special thanks to Dr. Zhiliang (Toby) Qian at Shanghai Jiaotong University for his close guidance at the early stage of my PhD study. Without his stimulating suggestions and countless help during my infancy as a PhD student, my research would have been much more difficult. Last but not least, I would also like to thank Professor Partha Pratim Pande at Washington State University for his wisdom and enlightening research advice.
I want to thank Professor Shahin Nazarian at the University of Southern California for all the inspirational discussion and input in formulating some of my most interesting research problems in embedded systems.

My life as a graduate student at the University of Southern California would have been much harder without the heartfelt support of many friends. I would like to extend my warm appreciation to the colleagues and friends who have made my Ph.D. life more fruitful and exciting. I am thankful to my roommate and friend Luhao Wang for his friendship and help during the most difficult period of my life in Los Angeles. My gratitude goes to Dr. Ji Li for his shared wisdom and enjoyable friendship. I also thank Gaurav Gupta for all the insightful suggestions and the best times we have spent together during our road trips to Lake Michigan. My special thanks go to Dr. Guopeng Wei for his research input, friendship and invitation to serve on the technical board of his great company. I would also like to thank my current and former colleagues Hana Koorehdavoudi, Mahboobeh Ghorbani, Valeriu Balaban, Mohamed ridha Zenaidi, Panagiotis Kyriakis and Yao Xiao. Their enthusiasm for research and hard work have been highly encouraging. Last but not least, I want to present my heartfelt gratitude to all the USC staff members (Diane, Ted, Tim, Jeniffer, Tracy, Estela, Annie) who have ensured that I had all the necessary support and advisement during my PhD studies.

Above all, I want to share my deepest thoughts with my grandparents for their unconditional love, support, nurturing and companionship during the darkest period of my childhood after my parents passed away. Throughout my formative years, they held my hands as I walked out of all the miseries and fears, provided me love in every possible way an ordinary child would receive from his parents and guided me to become an independent and loving person. I miss them so much and hope they are watching me now.
I would also like to thank all my family members for their love, caring and support. Without these, I could not imagine how hard my life would be. I am extremely blessed to have the deepest love from my wife, my best friend and my dearest treasure Yijing Zhang. My gratitude is beyond any word. I want to thank you for the road we walked, the vision we shared and the future you guide me to. I would also like to thank all my dearest friends Rui Shi, Bing Tian, Sha Li, Bo Nan, Xin Gao, Yang Li, Shiyi Zhang, Jialiang (Paul) Xu, Dr. Chixiao Chen, Huifeng Tang, Xiaofeng Gong, Weng Fang and Yaopeng Gao, who made my life full of love, joy, friendship and new adventures. Last but not least, my special thanks go to our lovely cat and irreplaceable family member ZaiZai (Dorothy). Her silent yet loving companionship has constantly saved me from all the negative feelings and thoughts.

Finally, I would like to express my gratitude to the agencies who have contributed to the funding of my research, namely the National Science Foundation, the Defense Advanced Research Projects Agency and the University of Southern California.

Table of Contents

Acknowledgements
List Of Tables
List Of Figures
Abstract
Chapter 1: Introduction
Chapter 2: Constructing Compact Causal Mathematical Models for Complex Dynamics in Cyber-physical Systems
    2.1 Compact Mathematical Modeling of Complex Dynamics
        2.1.1 Microscopic Dynamics Dictates the Governing Mathematical Equation
        2.1.2 Non-Markovian Probabilistic Description
        2.1.3 Dynamical State Equations
    2.2 Experiments
        2.2.1 Experiment Setup
        2.2.2 Investigation of Non-Exponential System Dynamics
        2.2.3 Efficacy Evaluation of FDSE for Physiological Processes
    2.3 Summary
Chapter 3: Spatio-Temporal Fractal Model for a CPS Approach to Physiological Systems
    3.1 Related Work and Novel Contribution
    3.2 Spatio-temporal Fractal (STF) Modeling
        3.2.1 Premises and Vision for Constructing the STF Model
        3.2.2 Data-driven Spatio-Temporal Fractal Model
        3.2.3 STF Model Identification Algorithm
    3.3 Experiment Setup and Results
        3.3.1 Effectiveness of Spatio-Temporal Fractal Modeling
        3.3.2 Model Validation in Realistic Clinical Experiments
        3.3.3 Statistical Analysis of Fractal Connectivity
    3.4 Summary
Chapter 4: Minimum Number of Sensors to Ensure Observability of Physical Systems: Case Studies
    4.1 Problem Statement
    4.2 CDFODS Observability and Submodularity
    4.3 Minimum Sensor Placement
    4.4 Simulation Results
    4.5 Summary
Chapter 5: Multifractal Geometry and Characterization of Networked Cyber-Physical System: Algorithms and Implications
    5.1 Multi-fractal Geometry of Complex Systems
    5.2 Multi-fractal Analysis of Sierpinski Fractal Network
    5.3 Analysis of Finite Resolution and Link Weight Distribution
        5.3.1 Estimation error analysis and stairway effect
        5.3.2 Finite resolution and compatible growth rule
    5.4 Comparative Analysis of Estimation Methods
        5.4.1 Intrinsic estimation bias of BCANw and SBw
        5.4.2 FBCw and FSBw for weighted complex network of finite resolution
    5.5 Multi-fractal analysis of real networks
        5.5.1 Vision and objectives of the multi-fractal analysis
        5.5.2 Space-localized multi-fractal scaling
        5.5.3 Scale-localized inconsistent multi-fractality of weighted real networks
        5.5.4 Link weight to dictate the multifractality
        5.5.5 Localized scaling based network characterization and community detection
    5.6 Discussion
    5.7 Detailed Implementation of FBCw and FSBw
Chapter 6: CPS Application Learning and Profiling based on Scalable Model of Computation
    6.1 Related Work and Novel contribution
    6.2 MoC-based Application Profiling and Benchmarking Framework
        6.2.1 Overview of the Problem
        6.2.2 Application Traffic model
            6.2.2.1 Vision of the Model
            6.2.2.2 Model Description
        6.2.3 Benchmark Workloads synthesis
    6.3 Evolvable Benchmark Synthesis
        6.3.1 Overview
        6.3.2 Measuring the Graph Similarity
        6.3.3 A Complex Network-inspired Benchmark Scaling
    6.4 Experimental Results
    6.5 Summary
Chapter 7: Application-driven Runtime Reconfigurable Communication System Synthesis for CPS Design: A General Mathematical Framework
    7.1 Mathematical Modeling and Optimization Framework
        7.1.1 Architectural Modeling Framework
        7.1.2 Application Modeling Framework
        7.1.3 Runtime Reconfiguration Problem Formulation
        7.1.4 Complexity and Algorithm Analysis
    7.2 Architectural Case Study
    7.3 Experimental Results
    7.4 Summary
Bibliography

List Of Tables

2.1 KS-test and ML best-fitting parameters of α-stable, exponential and Pareto distributions for selected channels
3.1 Average goodness-of-fit in NRMSE (-inf: worst; 1: best)

List Of Figures

2.1 The overview of the proposed framework to construct compact mathematical models of complex dynamics and its application to the development of closed-loop cyber-physical systems. The monitored physical process is interfaced with a data-driven learning framework that investigates the system dynamics from a microscopic perspective (e.g., magnitude and inter-event time increments). A non-Markovian probabilistic description is constructed based on the generalized master equation and the principle of maximum causal entropy to encapsulate the exhibited power-law behaviors (Section 2.1 and Section 2.2). This master equation based formalism allows the derivation of the nonlinear dynamical state equations with minimal postulation to capture system dynamics (Section 2.3). By retrieving the system model parameters from a specific physical process from which the model is derived, the proposed framework can serve as the intelligent core of closed-loop intelligent cyber-physical systems for a wide spectrum of CPS applications (e.g., decoding neuron activities to generate stimulation signals for the control of prostheses or muscles to help amputees or paralyzed patients regain body functionality).
2.2 EMG setup: Intramuscular EMG signals are measured at 6 muscles (i.e., 2 flexor muscles, 2 extensor muscles, 1 pronator muscle and 1 supinator muscle). Fine wire electrodes are inserted into the subject for measurement purposes. All channels are considered and the channels highlighted in red are selected as the case study.
2.3 The 64-channel geodesic sensor distribution for measurement of EEG.
2.4 The deployment of the 12-lead ECG system used in the experiment.
2.5 Empirical survival cumulative distribution functions and maximum likelihood best-fitting α-stable, exponential and Pareto distributions for magnitude increments in selected EEG, ECG and EMG channels. Figures (a)-(c) show the probabilities of the positive increments in magnitude exceeding a threshold value. The P-value is obtained by performing a two-sample Kolmogorov-Smirnov test with the null hypothesis that the measurements come from the same distribution as the postulated α-stable, exponential or Pareto distribution.
2.6 Fitting the model to physiological measurements of EEG, ECG and iEMG considering i) FDSE with no coupling matrix A, ii) FDSE with coupling matrix A and iii) Vector ARMA model with no fractal exponents.
3.1 Cyber-physical system for Brain-Machine-Body interfaces
3.2 a-b) The iEMGs from two muscles (ED and APL) are measured when the subject is abducting the thumb for 10 seconds after 6 seconds of relaxing. c-d) The SACF decays hyperbolically rather than exponentially, proving a long-range memory. e) The SACF log-log plot of ED iEMG measurements against lags shows a power-law behavior. f) The SXACF between the iEMG signals proves the spatial interdependence over time.
3.3 The clinical experiment settings. Three healthy subjects are implanted with fine wire electrodes that measure the iEMG signals while they perform: i) finger extension; ii) finger flexion; iii) pronation; iv) supination
3.4 The ANRMSE values (a-c) for partial observability of several signal lengths (1024, 2048, 4096).
d) Partial observability estimating 8-dimensional cross-correlated signals from only 2, 4 observed channels.
3.5 Fractal connectivity network inferred from 4 different movements. Three healthy subjects are asked to i) extend all fingers; ii) flex all fingers at a consistent strength for 10 seconds; iii) pronate the forearm; or iv) supinate the forearm in the experiments.
3.6 Comparison of first-order model-fitting to 2-channel (ED and APL) raw iEMG data collected from subject 3 under finger extension
4.1 Can the Fitbit sensor be used to retrieve the overall activity of the remaining sensors and, hence, assess whether one is about to have a cardiac arrest?
4.2 Are both technologies equivalent?
4.3 Sensor distribution in the 6-channel iEMG signal clinical measurement experiment. The sensors in blue represent the minimal deployment of sensors that ensures the global dynamics of iEMG in all 6 muscles can be retrieved. The sensors in grey represent the unused sensors, while the muscular activity at the location of the sensor in red is simulated based on the identified fractional order system.
4.4 The accumulated errors in the retrieved initial states of the system as a function of the number of sensors used and the observation length. Increasing the number of deployed sensors provides an information gain that reduces the overall deviation of the retrieved states from the actual measurements.
4.5 The retrieved initial states of the system using (4.7) across different signals are presented in the top row.
We show the simulated state evolution at sensors that were not considered in the experiments and compare it with the actual measurements at these sensors. In the first column, Figure 4.5(a-1) shows the initial states of unused sensors recovered by the proposed algorithm based on the measurements from the minimal subset of sensors (ED, APL and FDP). In Figure 4.5(a-2), the simulated data based on the retrieved initial states at the unused FPL sensor is compared against the actual measurements during the first 0.4 seconds of finger flexion. Similarly, Figure 4.5(b) and Figure 4.5(c) show the initial states retrieved for unused sensors and the simulated dynamics at one of these sensors during the experiments of EEG and ECG, respectively.
4.6 The 64-channel geodesic sensor distribution for measurement of EEG. The sensors in blue represent the minimum number of sensors and their deployment to guarantee that the CDFODS in (4.1), whose states correspond to the channel measurements, is observable. The sensor in red is used as a sanity check on the evolution of the identified CDFODS. The results are compared against the recorded activity in Figure 4.5(b-2)
4.7 The accumulated errors of the retrieved initial states of the neural system over 64 channels as the observation horizon changes. A phase-change phenomenon appears when the number of sensors deployed drops from 16 to 8, where the errors do not decrease as more observations are made.
4.8 The deployment of the 12-lead ECG system used in the experiment. The sensors in blue (I and II) are the identified minimal subset of sensors that ensures the observability of the fractional order system. We simulate the cardiac activity at aVR given the identified fractional order system.
4.9 The coefficients colormap of the spatio-coupling matrix A. The distribution of normalized coefficients shows that a subset of channels plays a dominant role in the cardiac dynamics.
4.10 The accumulated errors are plotted against the length of observations and different settings of sensors. Longer observations improve the overall accuracy of the system state estimates, and the results also suggest a lower bound on the number of sensors needed for observability. This observation aligns with our theoretical analysis on minimal observability.
5.1 Failure of a single (dominant) fractal dimension to capture the heterogeneity in the detailed configuration of fractal networks. A comparative example shows that A) the Sierpinski fractal network (s = 1/3, b = 6) and B) the (u,v)-flower fractal network (u = 3, v = 3) share the same fractal dimension (1.631) yet have distinct topological structures.
5.2 Relative error of estimate of 0 as a function of the insertion location of the staircase and the number of fake observations considered.
5.3 Relative error of estimate of 0 as a function of the width of the staircase and the number of fake observations considered.
5.4 A case study of the stairway effect. A linear relation y = 50 0.4x is observed on a set of input x. Two sets of unchanging y observations are manually inserted between two actual measurements to create "stairs". The introduction of such stairs biases the linear regression and causes estimation errors.
5.5 Observation of the staircase effect in the determination of the dominant fractal dimension of G_4 of the Sierpinski fractal network family using the box-covering method with an incompatible linear growth rule.
5.6 Observation of the staircase effect in the determination of the dominant fractal dimension of G_4 of the Sierpinski fractal network family using the sandbox method with an incompatible linear growth rule.
5.7 Observation of the staircase effect in the determination of the dominant fractal dimension of G_4 of the Sierpinski fractal network family using BCANw.
5.8 Observation of the staircase effect in the determination of the dominant fractal dimension of G_4 of the Sierpinski fractal network family using SBw.
5.9 Normalized estimation errors of the dominant fractal dimension of G_5 (b = 3, s = 1/2) under different i) numbers of repeated trials for box-covering-based methods (BCANw and FBCw) and ii) utilization of nodes as sandbox centers for sandbox-based methods (SBw and FSBw). Averaging the estimations over an increasing number of box-covering trials or nodes used as sandbox centers brings trivial improvement to the intrinsically biased estimation of BCANw and SBw. The proposed FBCw and FSBw provide better accuracy by addressing the finite resolution and the skewness of the link weight distribution of the weighted complex network.
5.10 Skewness of the link weight distribution of the Sierpinski fractal network. a) The skewness of the link weight distribution of G_5 as a function of scaling factor s and copy factor b = 2, 3, 4, 5, 6, 7, 8. b) The skewness of the link weight distribution of G_7 as a function of scaling factor s and copy factor b = 2, 3, 4, 5, 6, 7, 8.
5.11 Estimated dominant fractal dimension of the Sierpinski fractal network family (G_3 to G_8 with b = 3 and s = 1/2) using BCANw, SBw, FBCw and FSBw. As predicted by Eq.
(5.39) and (5.43), the estimation accuracy is improved as the numerical calculation of the limit by the linear regression is performed over a growing set of observations. However, the increased skewness of the link weight distribution prevents BCANw and SBw from approaching the theoretical value as quickly as the proposed FBCw and FSBw do.
5.12 Normalized estimation error of BCANw, SBw, FBCw and FSBw under different skewness of the link weight distribution, obtained by changing the copy factor b of G_5 from 2 to 8. i) The performance of BCANw and SBw degrades as the skewness grows. ii) BCANw and SBw tend to underestimate the dominant fractal dimension, which is aligned with our theoretical prediction in the analysis of the staircase effect. iii) The proposed FBCw and FSBw tend to be insensitive to the change of skewness and benefit from the increased size of the target network.
5.13 First-order phase transition of the free energy as a function of scaling factor s in the Sierpinski fractal network G_4 with copy factor b = 3. (a) The free energy of G_4 exhibits a discontinuous behavior between s = 0.7738 and s = 0.7351. (b) The observed possible discontinuity in the first-order derivative of the free energy.
5.14 Free energy (mass exponent) deviates from linear dependence to non-linear dependence on q as the scaling factor decreases below 0.7351.
5.15 Distribution of probability measure as a function of scale of multi-fractal analysis on the collaboration network.
5.16 Coexistence of multi-fractal and mono-fractal scaling in the collaboration network
5.17 The failure of BCANw to capture the localized fractal scaling of the collaboration network over a finite range of network scales. In the case of real world networks, the self-similar property does not hold at all scales of networks.
There might exist a finite range of scales where fractal scaling behavior dominates. Moreover, this phase transition phenomenon consistently holds under every distorting factor q, suggesting a localized multi-fractality.
5.18 Distribution of probability measure as a function of scale of multi-fractal analysis on the Budapest connectome network.
5.19 Coexistence of localized multi-fractal and mono-fractal scaling in the Budapest connectome network
5.20 The fundamental impact of link weights on the multi-fractality of a network. We keep exactly the same structure of the collaboration network but remove all its weights to transform the network into a binary network. We performed the proposed box-covering method to measure the scaling dependence of the number of boxes and the distribution of their associated measure. The figure shows the loss of multi-fractality as a result of the removal of link weights. Instead, we notice that the scaling dependence is best explained by an exponential law (a exp(bx); a = 3.75 10 4; b = 11.55), suggesting the unweighted collaboration network becomes a "small-world" network.
5.21 The fundamental impact of link weights on the multi-fractality of the Budapest connectome network. The figure shows a similar loss of multi-fractality by removing the weights on the links of the Budapest connectome. The scaling behavior is best explained by an exponential law (a exp(bx); a = 934.7; b = 0.664), indicating that the common brain connectome is a "small-world" network if no weights are considered.
5.22 An example application of the proposed localized scaling feature space for the characterization of weighted complex networks.
Interfaced with the unsupervised machine learning based clustering algorithm, the localized scaling based community detection is able to identify brain network communities consistent with the anatomical facts. The detected communities are not limited to neighboring nodes but are based on their relative spatial relations to the rest of the network, with potential functional implications.
5.23 An example application of the proposed data-driven filtering method. By applying the filter sliding through the observations, the peaks of the output correspond to the critical scales where a significant change of the accumulative measure P (B i (l)) q occurs.
6.1 A simple case where data dependencies can be known only at execution time, as user input determines both the data and the type of task to be performed.
6.2 Problem Overview. We propose a mathematical framework (Section 3) that constructs graphical models (Section 3.2) that are able to capture the spatio-temporal inter-task dependencies on which traffic can be synthesized (Section 3.3). The model can be learned by running the instrumented LLVM intermediate representation of the application of interest and collecting the execution trace. We also propose a benchmark scaling algorithm (Section 4) to scale the constructed model while preserving key structural features of the original application model.
6.3 An example of how a data dependency has an impact on traffic behaviors.
6.4 Example graphs visualizing assortativity, betweenness and clustering
6.5 The evolution of the genetic algorithm at runtime.
6.6 The overview of the benchmark scaling algorithm. The procedure follows a V-cycle of coarsening and refining operations.
By preserving nodes under proper scales, it is able to protect the structural features of interest.
6.7 Measuring the distribution of injection strength over different processors under three application benchmarks, blackscholes, canneal and freqmine, using both full-system simulation and synthesized traffic workloads based on the proposed model during ROI. The injection strength is calculated as the injection rate of a processor averaged over the execution time. In all three cases, the synthetic traffic workloads stress the target NoC to exhibit close injection distributions.
6.8 Comparison of average latency
6.9 Measuring structural similarities of graphical models scaled by a factor of 4, 8 and 16.
7.1 Overview of the proposed mathematical framework. The reconfigurable NoC system and its associated applications are characterized through the proposed system and application graphical models. Then the optimization of the network structural configuration is performed by exploiting the submodular property of the problem. In case a valid solution does not exist, we introduce a relaxation of the problem constraints to obtain a feasible solution while preserving the optimality bound.
7.2 Augmented connectivity sets are shown for n_0 and n_3. n_3 is configurable to set up a link with n_1 and with n_5, although both links cannot be established at the same time. A similar link to n_0 is invalid due to physical limitations.
7.3 A case study: low-level architecture details
7.4 Optimized network configuration for real world applications
7.5 Network latency and energy savings comparison between the baseline and optimized reconfigurable NoC under different traffic workloads. (a) Network latency measurements and (b) corresponding normalized energy consumptions.

Abstract

Recent astonishing advances in cognitive sensing modalities, data science, machine learning frameworks, and communication and computation design methodologies not only enable us to capture, analyze and impact multi-system interactions with remarkably improved accessibility, operability, resolution and sophisticated mathematical and processing frameworks, but also call for a radically novel integration of state-of-the-art techniques in both theoretical and computational domains to endow cyber-physical systems (CPS) with built-in intelligence and real-time processing capabilities. The intelligent CPS is driven by the urgent demand for a revolutionary shift from manually intervened analysis paradigms to self-aware, self-optimized and self-configured systems of extreme autonomy that facilitate adaptive sensing (i.e., data collection), learning (i.e., data processing) and decision making (i.e., control) processes with minimal human supervision over extended periods of operation. However, porting the embedded intelligence to devices with strict real-time processing constraints raises concerns with respect to the applicability of the well-established general-purpose computing paradigms (e.g., CPU/GPU arrays or clusters, data centers, cloud computing) to application scenarios that have i) exorbitantly large amounts of raw data, ii) limited or insufficient communication bandwidth and/or iii) strict power, thermal budget and timing constraints.
These objectives cannot be reached without an integrated mathematical and computational framework that is able to i) capture and understand the complex interdependent structure of CPS processes rich in stochasticity and ii) provide real-time processing capabilities that fit within on-board computing power dynamically optimized for the CPS applications of interest. In this thesis, we aim to address these challenges by embracing a novel mathematical framework in modeling and control theory and advocating for a top-down design methodology for computing and communication platforms. We propose a new mathematical strategy for constructing compact yet accurate models of complex systems dynamics that aims to scrutinize the causal effects and influences by analyzing the statistics of the magnitude increments and the inter-event times of stochastic processes. We derive a framework that incorporates knowledge about the causal dynamics of the magnitude increments and the inter-event times of stochastic processes into a multi-fractional order nonlinear partial differential equation for the probability to find the system in a specific state at one time. Rather than following the current trends in nonlinear system modeling, which postulate specific mathematical expressions, this mathematical framework enables us to connect the microscopic dependencies between the magnitude increments and the inter-event times of one stochastic process to other processes and justify the degree of nonlinearity. In addition, the proposed formalism allows us to investigate the appropriateness of using multi-fractional order dynamical models for various complex systems, which was overlooked in the literature. We validate the proposed modeling framework by showing that the multi-fractal dynamical equations learned from muscular, neural and vascular systems are consistent with the system dynamics and possess predictive power with minimized sensory efforts.
Moreover, learning the geometric principles underlying the organization of complex cyber-physical systems modeled by weighted networks facilitates the identification of their fundamental properties through an elegant structure-functionality translation. To further understand how heterogeneous agents in a CPS are mutually influenced via interactions of different intensities over multiple spatio-temporal scales, we have also developed a reliable multi-fractal analysis and characterization framework to investigate the underlying geometry of complex systems and its variations subject to the changing interactions of agents. More importantly, the primary benefit of the proposed characterization framework comes from its capability to quantify the variation of interaction intensities while no significant network structure mutation is present. In a different direction, the proposed framework also enables fine-grained similarity analysis among a set of nodes in the same network, which can be interfaced with state-of-the-art machine learning frameworks to perform novel community detection. In addition, it can be combined with domain knowledge (labels and attributes of nodes, e.g., functionality of a brain region) to drive an informed exploration (e.g., any functional similarity between brain regions that are topologically apart but share the same scaling law). These examples may constitute only a small portion of its potential applications, which necessitates our ongoing research efforts to extend the presented work to broader domains.
To encompass the above-mentioned built-in intelligence and real-time processing capabilities within efficient computing and communication resources, we advocate a shift from ad-hoc architectural optimization in a bottom-up fashion to an automated application-driven top-down optimization flow that learns the target applications and allows runtime configuration of computing and communication architectures. More precisely, we propose a model-of-computation (MoC) based application profiling framework to understand the computation and communication requirements of the applications. The MoC describes the identified CPS application behaviors and structures as a directed dynamical weighted graph in terms of required computation capabilities, data movement, storage requirements and timing characteristics to support the discovery of the optimal fine-grained parallelism for the design of processing elements. To tackle the dynamic and irregular nature of communication workloads in CPS applications and to facilitate a runtime optimization for a best-fit communication architecture that sustains the data processing with maximized efficiency, we first propose a novel hierarchical NoC architecture that exploits user-cooperated network coding (NC) concepts for improving system throughput, especially under heavy collective traffic (e.g., multicast, broadcast). Then we further take advantage of the MoC-based profiling framework and propose a general mathematical framework for reconfigurable Network-on-Chip (NoC) synthesis with guaranteed optimality. In summary, this PhD work explores the compact yet accurate construction of mathematical models that are able to capture the non-Markovian, non-stationary and non-linear system dynamics in a wide spectrum of CPS applications and advocates for a new top-down design methodology for CPS applications capable of processing and mining exascale data with optimal computation and communication architectures.
With the theoretical and computational contributions combined, this PhD work aims to serve as the basis for the intelligent core of future autonomous CPS with self-understanding, self-configuration and self-optimization capabilities.

Chapter 1
Introduction

The physical world is deeply rooted in complex interactions among synergetically coupled components of systems at different scales. Recent astonishing advances in cognitive sensing modalities, data science, machine learning frameworks, and communication and computation design methodologies not only enable us to capture, analyze and impact such multi-system interactions with remarkably improved accessibility, operability, resolution and sophisticated mathematical and processing frameworks, but also call for a radically novel integration of state-of-the-art techniques in both theoretical and computational domains to endow cyber-physical systems (CPS) with built-in intelligence and real-time processing capabilities. On one hand, the envisioned intelligent CPS is driven by the urgent demand for a revolutionary shift from manually intervened analysis paradigms to self-aware, self-optimized and self-configured systems of extreme autonomy that facilitate adaptive sensing (i.e., data collection), learning (i.e., data processing) and decision making (i.e., control) processes with minimal human supervision over extended periods of operation. On the other hand, porting the embedded intelligence to devices with strict real-time processing constraints raises concerns with respect to the applicability of the well-established general-purpose computing paradigms (e.g., CPU/GPU arrays or clusters, data centers, cloud computing) to application scenarios that have i) exorbitantly large amounts of raw data, ii) limited or insufficient communication bandwidth and/or iii) strict power, thermal budget and timing constraints.
These objectives cannot be reached without an integrated mathematical and computational framework that is able to i) capture and understand the complex interdependent structure of CPS processes rich in stochasticity and ii) provide real-time processing capabilities that fit within on-board computing power dynamically optimized for the CPS applications of interest. Towards this end, several key research challenges in both theoretical and computational domains have to be addressed by this PhD work.

From the statistical perspective, a systematic understanding of inter-coupling (e.g., causal influence, coupling directionality and structure, spatial and temporal distribution) among heterogeneous entities involved in the evolution of the investigated physical process is a prerequisite. From microbial communities, human physiological and biological systems to large scale intelligent networked systems (e.g., smart grid, metropolitan traffic management systems) and social networks, complex interdependent systems usually display multi-scale spatio-temporal dynamics that are frequently classified as non-linear, non-Gaussian, non-ergodic, and/or fractal [21, 22, 48, 150, 70]. Distinguishing between the sources of nonlinearity, identifying the nature of fractality (space versus time) and encapsulating the non-Gaussian characteristics into dynamic causal models remain a major challenge for studying complex systems. More specifically, a number of outstanding problems for constructing mathematical models of complex systems should be considered: (i) How can we distinguish between spatial and temporal nonlinearity and how can we construct mathematical models (i.e., dynamical equations) that capture the spatio-temporal statistical characteristics of complex systems?
(ii) How can we identify the mathematical expressions for the nonlinear models of complex systems and determine the degree of nonlinearity that should be accounted for without incorporating unnecessarily many nonlinear terms? (iii) How can the power law and non-Gaussian properties of the magnitudes observed in many time series impact the degree of nonlinearity? (iii) How can one interpret the asymmetry observed in the time series realization of various processes and estimate the amount of information gained from analyzing the spatio-temporal complexity present in time series? Failure to address these problems leads to models of system evolution that either suffer from a biased characterization of the dynamics (e.g., under-fitting) or fail to generalize, leaving the learned model without predictive power (e.g., over-fitting).

Moreover, the mathematical understanding, description, prediction and control of physical processes pose critical challenges in the design and optimization of CPS that are highly reliable and robust with bounded performance margins. The irregular and non-stationary dynamics of CPS applications contradict the traditional statistical memoryless assumption (e.g., Markovian processes, short-term memory) and cannot be captured by integer-order calculus. Instead, they exhibit long-term memory and can only be accurately modeled by fractional-order differential equations. For instance, inferring causal and higher-order complex relations in physiological and biological systems (e.g., cardiovascular, muscular or neural systems) from unstructured time-varying data requires a spatio-temporal fractal model that considers multivariate dependency and power-law temporal correlations.
Failure to capture these properties of CPS applications in a robust and real-time fashion not only prevents their practical deployment, but can also lead to irreversible consequences, especially in the presence of noisy and limited measurements, system anomalies, environmental uncertainties and malicious attacks.

From the geometrical perspective, learning the geometric principles underlying the organization of complex cyber-physical systems modeled by weighted networks facilitates the identification of their fundamental properties through an elegant structure-functionality translation. Complex systems consist of heterogeneous agents mutually influenced via interactions of different intensities over multiple spatio-temporal scales. This heterogeneity, encompassed in both the participating components and their varying interactions, makes complex systems difficult to decipher. To understand and control these complex systems, network theory provides an effective mathematical modeling framework that enables the encoding of the entities (nodes) of a complex system and their heterogeneous interactions (links) of different strength (weights) into a topological network configuration implicitly embedded in metric spaces, where the distance among nodes is decided both by the structural configuration of the system (topology) and the intrinsic nature of the inter-node couplings (e.g., social affinity, chemical bonds, traffic intensity or neural connectivity strength). In some cases, the properties of the inter-couplings among system components and the corresponding spatial embeddings even play a far more dominant role in regulating the overall system behaviors and dynamics. For instance, the atomic and molecular interactions among a chain of amino acids dictate not only the dynamical spatial conformation of the corresponding protein but also its biological functionality [33, 36].
The disturbance of normal protein interactions can lead to irreversible pathological consequences known as proteopathies, like Alzheimer's, Parkinson's [122] and Huntington's disease [73]. Therefore, the study of the structural organization, formation and dynamics of complex systems can benefit from studying their geometrical properties and discovering new relationships between geometrical characteristics and network problems (e.g., community structure identification).

From the computational perspective, all these challenges are further exacerbated by the fact that the current deployment of intelligent CPS relies heavily on large scale general-purpose computing platforms (data-center scale clustered CPUs, arrays of GPUs or GPGPUs). These traditional computing platforms are exorbitantly power consuming and require extensive maintenance, adjusting and monitoring efforts. The flexibility of these well-established platforms gives fast-moving CPS applications low development effort and highly active supporting communities. However, it also prevents their portability to power-limited devices and instruments with strict constraints (e.g., edge devices). Furthermore, advanced scientific and engineering investigations in CPS generate exorbitantly large amounts of data that are far beyond easily accessible computing power. Relaying large volumes of unstructured and under-explored raw data back to a processing fusion not only drains unnecessarily large amounts of energy and computing power, but also places extremely high demands on communication bandwidth, especially when a prompt decision and feedback are required. For instance, the data generated by genomics applications in precision medicine are projected to exceed those of any other Big Data application, reaching the zettabyte scale by 2025.
Processing such an astronomical volume of data requires between 2 and 1000 trillion CPU hours due to the excessive resource locking, data contention and poorly exploited fine-grained parallelism in well-established computing paradigms (e.g., CPU, GPGPU). The fact that CPS applications are rich in non-stationary behaviors and lack regular communication structures (e.g., physiological and biological system learning and control) exacerbates the drawbacks of parallelization techniques, the memory wall, and the need for multiple-instruction multiple-data (MIMD) execution. All these challenges sum up to the urgent need for a shift to design methodologies that optimize for target applications (rather than applications in general) and break the hardware-software boundary to provide remarkable improvements in data movement, energy efficiency, memory wall and concurrency under rich uncertainties with enhanced robustness.

To embrace the complexity in CPS design and tackle these challenges, this PhD work proposes to devote research efforts to the following two primary research topics:

(Task 1). Exploration for compact yet accurate construction of mathematical models and reliable algorithms that are able to capture the non-Markovian, non-stationary and non-linear system dynamics and heterogeneous spatial-temporal interactions in a wide spectrum of CPS applications.

(Task 2). A new top-down design methodology for CPS applications capable of processing and mining exascale data with optimal computation and communication architectures.

To address the challenges posed by Tasks 1 and 2, we embrace a novel mathematical framework in modeling and control theory and advocate for a top-down design methodology for computing and communication platforms.
More specifically:

(1) In Chapter 2, we propose a new mathematical strategy for constructing compact yet accurate models of complex systems dynamics that aims to scrutinize the causal effects and influences by analyzing the statistics of the magnitude increments and the inter-event times of stochastic processes. We derive a framework that incorporates knowledge about the causal dynamics of the magnitude increments and the inter-event times of stochastic processes into a multi-fractional order nonlinear partial differential equation for the probability to find the system in a specific state at one time. Rather than following the current trends in nonlinear system modeling, which postulate specific mathematical expressions, this mathematical framework enables us to connect the microscopic dependencies between the magnitude increments and the inter-event times of one stochastic process to other processes and justify the degree of nonlinearity. In addition, the newly presented formalism allows us to investigate the appropriateness of using multi-fractional order dynamical models for various complex systems, which was overlooked in the literature.

(2) To validate the framework discussed in Chapter 2, in Chapter 3 we pioneer a dynamical systems approach to the problem of tracking both muscular and neural activity, and demonstrate that the current modeling approaches (e.g., ARMA, ARFIMA) of muscular and neural activities ignore either the long-term dependency in system dynamics or the spatial coupling among system components. We propose to describe the evolution of biological systems by constructing discrete-time fractional-order systems and demonstrate much improved performance in terms of goodness-of-fit against modeling approaches with a short-term memory assumption.
(3) In Chapter 4, to allow optimized sensing strategies with minimized sensory data collection efforts in processes described by the proposed fractional-order dynamical system model, we study the problem of determining the minimum number of sensors such that the global dynamics described by the fractional-order differential equation can be recovered from the collected data (i.e., observability). We also show that the problem is NP-hard. We propose a polynomial-time algorithm that obtains suboptimal solutions with optimality guarantees and is robust in the presence of modeling errors and process and measurement noise. By taking advantage of the proposed approach, we investigate state-of-the-art wearables that monitor brain activity (e.g., Emotiv) and show that the current placement of EEG sensors on the scalp is not optimal under the current setup.

(4) In Chapter 5, we investigate the multifractal geometry of networked complex systems. We propose a reliable multi-fractal analysis and characterization framework to investigate the underlying geometry of complex systems and its variations subject to the changing interactions of agents. More importantly, the primary benefit of the proposed characterization framework comes from its capability to quantify the variation of interaction intensities while no significant network structure mutation is present. In a different direction, the proposed framework also enables fine-grained similarity analysis among a set of nodes in the same network, which can be interfaced with state-of-the-art machine learning frameworks to perform novel community detection. In addition, it can be combined with domain knowledge (labels and attributes of nodes, e.g., functionality of a brain region) to drive an informed exploration (e.g., any functional similarity between brain regions that are topologically apart but share the same scaling law).
These examples may constitute only a small portion of its potential applications, which necessitates our ongoing research efforts to extend the presented work to broader domains.

(5) In Chapter 6, we propose a model-of-computation (MoC) based application profiling framework to understand the computation and communication requirements of the applications, which enables a novel design methodology for exascale computation and communication architectures optimized for CPS at runtime. More specifically, we make the following contributions: (i) We develop an architecture-independent application profiling engine, based on the LLVM compiler framework, to learn the CPS application behaviors and inter-dependent structure in real time. (ii) We propose a novel model of computation (MoC) that describes the identified CPS application behaviors and structures as a directed dynamical weighted graph (DDWG) in terms of required computation capabilities, data movement, storage requirements and timing characteristics to support the discovery of the optimal fine-grained parallelism and enable a top-down synthesis of best-fit architectures with remarkable performance improvement given the same energy budget.

(6) In Chapter 7, we tackle the dynamic nature of CPS application traffic after learning the task structure and data movement properties in Chapter 6 to facilitate a runtime optimization for a best-fit communication architecture that sustains the data processing (computation). Power-efficient data movement strategies for CPS applications are required to shift more power from communication toward computation. Towards this end, we first propose a novel hierarchical NoC architecture that exploits user-cooperated network coding (NC) concepts for improving system throughput, especially under heavy collective traffic (e.g., multicast, broadcast).
To extend this technique with a runtime optimization capability that allows for the synthesis of a best-fit communication architecture in general cases, we propose a general reconfigurable Network-on-Chip (NoC) synthesis framework with guaranteed optimality given the target CPS application profiled by the DDWG. This framework, combined with the profiling technique discussed in Chapter 6, lays the foundation for an automated top-down computing and communication optimization engine that can serve as the cognitive core for future intelligent CPS.

Chapter 2
Constructing Compact Causal Mathematical Models for Complex Dynamics in Cyber-physical Systems

The behavior of complex systems is influenced over many space and time scales by multi-physics interactions. From cellular interactions within the microbiome-to-brain architecture, to organ interdependencies within the human body expressed via signature physiological processes, to animal swarms and food webs, to social groups and society, memory, interdependency and concurrency are fundamental characteristics. Although recent advances in sensing technology have contributed large datasets, the modeling and analysis of complex systems have mainly focused on modeling frameworks that ignore the exhibited long-range memory, spatio-temporal fractality, and non-linear, non-ergodic and non-Gaussian properties. More precisely, traditional mathematical modeling approaches in nonlinear dynamical systems, statistical machine learning, statistical signal processing and process control, system identification and econometrics have focused on describing the complex system dynamics via a set of integer-order ordinary differential equations (ODEs) (see eq.
(2.1)) and identifying the parameters of the model by minimizing a cost function that depends on a goodness-of-fit metric (i.e., the difference between the data and the postulated model class) and a model complexity penalty metric [79]:

$$\frac{dx_j(t)}{dt} = f_j(x_1, \ldots, x_n, u_1, \ldots, u_m, \theta_j) \qquad (2.1)$$

where the function $f_j$ encodes the linear/nonlinear interdependencies between the state variables $x_1, \ldots, x_n$ and the control variables $u_1, \ldots, u_m$, and $\theta_j$ represents the set of parameters to be estimated from measured time series.

For instance, with the aim of determining equations of motion from observations of time-dependent behavior, Crutchfield and McNamara developed an informational measure to quantify the modeling accuracy for eq. (2.1) from time series observations and identified the parameters of the ODEs by minimizing the distance between the data and the postulated model [29]. Relying on the assumption that the observed dynamical system is memoryless (i.e., the rate of change of its state variables obeys an integer order time derivative), the heuristic modeling strategy proposed in [103] first constructs a phase-space plot from the observations of a single variable (ignoring the true causal interdependency) and determines the dimensionality of the system's attractor. By postulating a particular integer order time derivative coupled with a specific nonlinear mathematical expression and using Ritt's algorithm of differential algebra, one can find the parameters of the model by solving a regression problem [78]. Using a bilinear approximation of the dynamics of interactions between system variables, the dynamic causal modeling in [62] proposes a Bayesian framework that determines the parameters of a multi-dimensional linear (integer order) dynamical system (the dynamics is assumed to be memoryless).
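As a minimal sketch of the parameter identification step described above, the following Python fragment fits the single parameter of the simplest instance of eq. (2.1), $dx/dt = -\theta x$, by least squares on finite-difference derivatives. The model choice, the function names (`simulate_decay`, `fit_linear_ode`, `penalized_cost`) and the additive penalty weighting are illustrative assumptions, not the procedures used in the cited works.

```python
import math


def simulate_decay(theta, x0, dt, n_steps):
    """Sample the exact solution x(t) = x0 * exp(-theta * t) on a uniform grid."""
    return [x0 * math.exp(-theta * k * dt) for k in range(n_steps)]


def fit_linear_ode(xs, dt):
    """Least-squares estimate of theta in dx/dt = -theta * x.

    Derivatives are approximated by forward differences, and the sum of
    squared residuals sum_k (dx_k/dt + theta * x_k)**2 is minimized in
    closed form.
    """
    derivs = [(xs[k + 1] - xs[k]) / dt for k in range(len(xs) - 1)]
    states = xs[:-1]
    theta = -sum(x * d for x, d in zip(states, derivs)) / sum(x * x for x in states)
    rss = sum((d + theta * x) ** 2 for x, d in zip(states, derivs))
    return theta, rss


def penalized_cost(rss, n_params, penalty_weight):
    """Goodness-of-fit term plus a complexity penalty (hypothetical weighting)."""
    return rss + penalty_weight * n_params


# Synthetic, noise-free data generated with theta = 0.5 (illustrative only).
xs = simulate_decay(theta=0.5, x0=1.0, dt=0.01, n_steps=500)
theta_hat, rss = fit_linear_ode(xs, dt=0.01)
print(theta_hat, penalized_cost(rss, n_params=1, penalty_weight=0.1))
```

The estimate is slightly biased by the forward-difference approximation, which shrinks as the sampling step decreases. The sketch postulates both the integer-order derivative and the form of $f_j$ in advance, which is the modeling assumption questioned in this chapter.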
Along the same lines, numerous other approaches (e.g., manifold learning [98], Bayesian networks [1], Q-learning [87], variational inference [102]) have been proposed in the literature. More recently, a kernel cross-spectral density based analysis of stationary time series was proposed for measuring the independence and the similarity between various time series [15]. Although we cannot exhaustively cover all previous work, the implicit assumption in all these prior approaches is that the dynamics is intrinsically governed by a first order time derivative, which implies that the inter-event times between successive changes in the magnitude of the processes are characterized by an exponential law. However, many complex systems exhibit long-range memory [133] and fractal dynamics that are characterized by power (non-exponential) law magnitudes and inter-event times, bringing into question whether the nonlinearity should be considered in the time or space domains or both [123, 13, 145, 19].

In addition, a number of outstanding challenges remain for constructing mathematical models of complex systems: (i) How can we distinguish between spatial and temporal nonlinearity and how can we construct mathematical models (i.e., dynamical equations) that capture the spatio-temporal statistical characteristics of complex systems? (ii) How can we identify the mathematical expressions of the functions $f_j$ for the nonlinear models of complex systems and determine the degree of nonlinearity that should be accounted for without incorporating unnecessarily many nonlinear terms? (iii) How can the power law and non-Gaussian properties of the magnitudes observed in many time series impact the degree of nonlinearity? (iii) How can one interpret the asymmetry observed in the time series realization of various processes and estimate the amount of information gained from analyzing the spatio-temporal complexity present in time series?
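The contrast drawn above between exponential and power-law inter-event times can be illustrated numerically. The sketch below draws synthetic inter-event times from an exponential law and from a Pareto (power-law) law, compares their empirical tail probabilities, and recovers the power-law exponent with the Hill (maximum-likelihood) estimator. All function names, parameter values and the synthetic data are assumptions made for illustration; none are taken from the thesis.

```python
import math
import random


def sample_exponential(rng, rate, n):
    """Inter-event times of a memoryless process: S(t) = exp(-rate * t)."""
    return [rng.expovariate(rate) for _ in range(n)]


def sample_pareto(rng, alpha, t_min, n):
    """Power-law inter-event times via inverse-transform sampling.

    The survival function is S(t) = (t_min / t) ** alpha for t >= t_min.
    """
    return [t_min * (1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]


def survival(samples, t):
    """Empirical probability that an inter-event time exceeds t."""
    return sum(1 for s in samples if s > t) / len(samples)


def hill_tail_exponent(samples, t_min):
    """Maximum-likelihood (Hill) estimate of the power-law exponent alpha."""
    tail = [s for s in samples if s >= t_min]
    return len(tail) / sum(math.log(s / t_min) for s in tail)


rng = random.Random(0)  # fixed seed so the run is repeatable
exp_times = sample_exponential(rng, rate=1.0, n=20000)
pow_times = sample_pareto(rng, alpha=1.5, t_min=1.0, n=20000)

for t in (2.0, 5.0, 10.0):
    print(t, survival(exp_times, t), survival(pow_times, t))
print("estimated alpha:", hill_tail_exponent(pow_times, t_min=1.0))
```

Long waiting times remain far more probable under the power law than under the exponential law, which is the signature of long-range memory that a first-order time derivative cannot reproduce.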
To overcome the afore-mentioned drawbacks, in this work, we seek to understand mathematically how various fundamental and essential components of complex systems interact and exchange information to influence the overall performance and behavior. To the best of our knowledge, this work is the first to investigate the impact of microscopic dynamics encapsulated in the ordered sequence of magnitude increments and inter-event times of the stochastic process. More precisely, we have made the following novel contributions in this work:

Firstly, we show that by adopting a causal-inference-like framework and combining it with probabilistic tools that were originally developed in the statistical physics context, we can develop mathematical strategies that enable us to distinguish whether a time series exhibits short-range or long-range dependence dynamics.

Secondly, we show how the analysis of a multi-point probability density function can be interpreted through the lens of the maximum entropy principle to distinguish between a memoryless and a complex time-dependency structure.

[Figure 2.1 diagram; recoverable panel labels: Physical Process, Sensing, Controller Synthesis, Feedback, Stimulation, Cyber domain, Physical domain, Microscopic Dynamics Capturing (Sec. 2.1), Maximum Causal Non-extensive Entropy Principle (Sec. 2.1), Non-Markovian Probabilistic Description (Sec. 2.2), Dynamical State Equation Derivation (Sec. 2.3).]

Figure 2.1: The overview of the proposed framework to construct compact mathematical models of complex dynamics and its application to the development of closed-loop cyber-physical systems. The monitored physical process is interfaced with a data-driven learning framework that investigates the system dynamics from a microscopic perspective (e.g., magnitude and inter-event time increments). A non-Markovian probabilistic description is constructed based on the generalized master equation and the principle of maximum causal entropy to encapsulate the exhibited power-law behaviors (Section 2.1 and Section 2.2). This master equation based formalism allows the derivation of the nonlinear dynamical state equations with minimal postulation to capture system dynamics (Section 2.3). By retrieving the system model parameters from the specific physical process from which the model is derived, the proposed framework can serve as the intelligent core of closed-loop intelligent cyber-physical systems for a wide spectrum of CPS applications (e.g., decoding neuron activities to generate stimulation signals for the control of a prosthesis or muscles to help amputees or paralyzed patients regain body functionality).
Thirdly, this analysis allows us to identify the conditions under which a complex system under investigation can be approximated through single-order fractional dynamics (i.e., the inter-event times can be characterized by a marginal power-law distribution reminiscent of mono-fractal dynamics) or multiple-order fractional dynamics (i.e., the inter-event times and the overall system dynamics require multiple interwoven power laws and multiple scaling exponents reminiscent of multi-fractal dynamics).

With respect to the aforementioned question (i), the proposed mathematical formalism allows us to determine when the space and time components of a stochastic process can be decoupled and its overall evolution described through a multi-fractional nonlinear partial differential equation (PDE) for the probability to reach a state at a particular point in time.

With respect to the aforementioned question (ii), we demonstrate how the power-law behavior exhibited in the space (magnitudes) and time (intervals of time between successive changes in the magnitude) domains can be encapsulated through a compact fractional calculus representation for the probability to reach a state at a particular point in time, and how this new formalism enables the derivation of the nonlinear dynamical equations with minimal postulation. Simply speaking, the statistical analysis of the magnitudes and the inter-event times dictates the mathematical expression (form) of the dynamical model.

With respect to the aforementioned question (iii), we illustrate how the asymmetry observed between the positive and negative increments in the magnitude of the stochastic process can be encoded through Riesz-Feller fractional order operators, leading to a new class of multi-fractional nonlinear PDEs that could be exploited for computing information-theoretic metrics regarding the modeling accuracy and overfitting. This is left for future work.
2.1 Compact Mathematical Modeling of Complex Dynamics

We show in Figure 2.1 the overview of the proposed framework, which enables the construction of compact mathematical causal models of complex dynamics for a wide spectrum of applications in the cyber-physical domain. By setting up sensors that monitor the physical process of interest, we are able to investigate the microscopic dynamics of the underlying system evolution in both space (magnitude increments) and time (inter-event times). We demonstrate that the interdependency between the jumps of the magnitude and the inter-event times i) dictates the mathematical characterization of the rate of change of the system states and ii) impacts the degree of nonlinearity of the couplings among system components. To capture these properties, we propose a non-Markovian probabilistic description of the physical process through a generalized master equation. By applying the principle of maximum causal entropy and considering the power-law behavior exhibited in the system dynamics, we are able to derive the nonlinear dynamical state equations with minimal postulation.

By leveraging system identification techniques to retrieve the model derived by learning the microscopic dynamics of the observed physical process, the proposed causal modeling of complex dynamics can be integrated as the core of intelligent cyber-physical systems in a wide range of CPS applications, such as the prediction of system dynamic behaviors, the detection of system anomalies and the synthesis of closed-loop controllers for autonomous systems. In what follows, we present a detailed discussion of all components of the proposed mathematical modeling framework.

2.1.1 Microscopic Dynamics Dictates the Governing Mathematical Equation

We consider a stochastic process $x(t)$ whose realization is described by a tuple sequence of magnitude and time increments: $\{(\Delta x_1, \Delta t_1), (\Delta x_2, \Delta t_2), \dots, (\Delta x_m, \Delta t_m)\}$ (see Figure 2.1).
More precisely, $\Delta t_j$ represents the waiting time after which the stochastic process makes a jump $\Delta x_j$ in the $j$-th iteration. To define the underlying stochastic process (sequence), we introduce the conditional probability density function

$$ w(\Delta x_m, \Delta t_m \mid \Delta x_{m-1}, \Delta t_{m-1}, \dots, \Delta x_1, \Delta t_1) = w(\Delta x_m, \Delta t_m \mid \Delta x_{1:m-1}, \Delta t_{1:m-1}) \tag{2.2} $$

and the joint probability density function (PDF):

$$ w(\Delta x_m, \Delta t_m, \dots, \Delta x_1, \Delta t_1) = w(\Delta x_{1:m}, \Delta t_{1:m}) = \prod_{j=1}^{m} w(\Delta t_j \mid \Delta t_{1:j-1}) \prod_{j=1}^{m} w(\Delta x_j \mid \Delta x_{1:j-1}, \Delta t_{1:m}) \tag{2.3} $$

where we have taken into account the chain rule and the causal dependency. The rationale for considering these probabilities is the need to understand how the so-called microscopic dynamics influences the overall evolution. Alternatively stated, we aim to understand: i) how the changes in the magnitude increments affect the degree of linearity or nonlinearity of the function $f_j$ in equation (2.1); ii) how the statistics of the inter-event times impact the overall dynamics and could dictate the mathematical operator characterizing the rate of change in the system.

One fundamental assumption we make is that the tuple $(\Delta x_j, \Delta t_j)$ is not random and independent of the previous increments $(\Delta x_i, \Delta t_i)$ with $1 \le i < j$. Instead, we assume that the ordered sequence $\{(\Delta x_1, \Delta t_1), (\Delta x_2, \Delta t_2), \dots, (\Delta x_m, \Delta t_m)\}$ possesses some form of directed information that is conveyed either from the space of magnitudes $\Delta x_j$ to that of time increments $\Delta t_j$, or from the space of time increments $\Delta t_j$ to that of the process magnitudes $\Delta x_j$, from one iteration (generation) to another (preserving the axis of time). This ordered dependency represents the source of the fractal behavior that could be observed in either the magnitudes or the waiting (inter-event) times. Consequently, in this work, we investigate how this directional information and its related statistics determine the type of dynamical equation that governs the evolution of the process $x(t)$.
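As an illustration of the tuple sequence of magnitude and time increments described above, the following Python sketch extracts such pairs from a sampled trajectory. It is a minimal sketch under stated assumptions: the function names, the `tol` parameter and the rule that an "event" occurs whenever the sampled value departs from the last held value are hypothetical choices made for illustration, not a procedure stated in the text.

```python
from typing import List, Sequence, Tuple


def increments(times: Sequence[float], values: Sequence[float],
               tol: float = 0.0) -> List[Tuple[float, float]]:
    """Return [(dx_1, dt_1), ..., (dx_m, dt_m)] for a sampled trajectory.

    Assumption (illustrative): an event occurs whenever the value moves by
    more than `tol` away from the value held since the previous event.
    dt_j is the waiting time since the previous event, dx_j the jump size.
    """
    pairs = []
    last_t, last_x = times[0], values[0]
    for t, x in zip(times[1:], values[1:]):
        dx = x - last_x
        if abs(dx) > tol:
            pairs.append((dx, t - last_t))
            last_t, last_x = t, x
    return pairs


def split_by_sign(pairs):
    """Separate positive jumps from the absolute values of negative jumps."""
    positive = [dx for dx, _ in pairs if dx > 0]
    negative = [-dx for dx, _ in pairs if dx < 0]
    return positive, negative


# Toy trajectory: the value is held for two samples, jumps up, is held, jumps down.
t = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
x = [0.0, 0.0, 2.0, 2.0, 2.0, -1.0]
pairs = increments(t, x)
positive, negative = split_by_sign(pairs)
```

Separating positive from negative jumps mirrors the distinction between positive and negative increments that the statistical analysis later relies on.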
An important problem in establishing a governing dynamical state equation (i.e., a differential equation describing the evolution of the $k$-th order statistical moments) is the need to elucidate the dependency structure between the process magnitudes and time increments and the mathematical form of the joint PDF $w(\Delta x_m, \Delta t_m, \dots, \Delta x_1, \Delta t_1)$. In order to describe a strategy for investigating the dependency structure between the process magnitudes and time increments, we assume that the stochastic process $x(t)$ corresponds to a non-extensive system¹ and introduce the following definitions:

Definition 1: ([137]) Given a continuous probability distribution $f(x, y)$, the non-extensive Tsallis entropy is defined by:

$$ S_q[f] = \frac{1}{q-1} \int_{x_{\min}}^{x_{\max}} \int_{y_{\min}}^{y_{\max}} \{ f(x,y) - f(x,y)^q \}\, dx\, dy \tag{2.4} $$

where $q$ is a real number and represents the entropic index. When $q \to 1$ the above formula reduces to the Boltzmann-Gibbs or Shannon entropy (up to a constant parameter).

¹ Systems that obey non-extensive statistical mechanics, where entropy is non-additive.

Definition 2: Given two continuous joint probability distributions $w(\Delta x_{1:m}, \Delta t_{1:m})$ and $w(\Delta x_{1:m-1}, \Delta t_{1:m-1})$, we define the causal non-extensive Tsallis entropy as follows:

$$ S_q(\Delta x_m, \Delta t_m \mid \Delta x_{1:m-1}, \Delta t_{1:m-1}) = \frac{1}{q-1} \int \cdots \int \left\{ \frac{w(\Delta x_{1:m}, \Delta t_{1:m})}{w(\Delta x_{1:m-1}, \Delta t_{1:m-1})} \left[ \frac{w(\Delta x_{1:m}, \Delta t_{1:m})^{1-q}}{w(\Delta x_{1:m-1}, \Delta t_{1:m-1})^{1-q}} - 1 \right] \right\} \tag{2.5} $$

As will be shown in the next subsection, the mathematical expression of the joint PDF $w(\Delta x_{1:m}, \Delta t_{1:m})$ plays a crucial role in establishing the dynamical equations of the $k$-th order statistical moments. Consequently, in order to elucidate the form of the joint PDF $w(\Delta x_{1:m}, \Delta t_{1:m})$, we employ the principle of maximum entropy, which describes a probability distribution estimator that best represents the state of knowledge or known properties of that distribution [91].
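Simply speaking, this estimator implies maximizing an entropic functional subject to constraints that reflect our knowledge about the distribution (e.g., normalization condition, fractional order statistical moments). In what follows, we describe a maximum-entropy-inspired estimator for the joint PDF $w(\Delta x_{1:m}, \Delta t_{1:m})$ and its related probabilistic components.

Definition 3: The principle of maximum causal non-extensive entropy describes the causal non-extensive entropy-maximizing probability distribution estimator, $\hat{w}(\Delta x_m, \Delta t_m \mid \Delta x_{1:m-1}, \Delta t_{1:m-1})$, by solving the following optimization problem:

$$ \max_{\hat{w}(\Delta x_{1:m}, \Delta t_{1:m})} \{ S_q[\hat{w}(\Delta x_{1:m-1}, \Delta t_{1:m-1})] + S_q(\Delta x_m, \Delta t_m \mid \Delta x_{1:m-1}, \Delta t_{1:m-1}) \} \tag{2.6} $$

subject to

$$ \int \cdots \int \hat{w}(\Delta x_{1:m}, \Delta t_{1:m})\, d(\Delta x_m) \cdots d(\Delta t_1) = 1 \tag{2.7} $$

$$ \int \cdots \int \hat{w}(\Delta x_{1:m-1}, \Delta t_{1:m-1})\, d(\Delta x_{m-1}) \cdots d(\Delta t_1) = 1 \tag{2.8} $$

$$ \int \cdots \int \Delta x_{m-1}^{\alpha_{m-1}}\, \hat{w}(\Delta x_{1:m-1}, \Delta t_{1:m-1})\, d(\Delta x_{m-1}) \cdots d(\Delta t_1) = I_{m-1} \tag{2.9} $$

$$ \int \cdots \int \Delta x_{m}^{\alpha_{m}}\, \hat{w}(\Delta x_{1:m}, \Delta t_{1:m})\, d(\Delta x_m) \cdots d(\Delta t_1) = I_m \tag{2.10} $$

Theorem 1: The causal non-extensive entropy-maximizing probability distribution estimator in optimization problem (2.6) leads to a solution for the conditional probability $\hat{w}(\Delta x_m, \Delta t_m \mid \Delta x_{1:m-1}, \Delta t_{1:m-1}) = \hat{w}(\Delta x_{1:m}, \Delta t_{1:m}) / \hat{w}(\Delta x_{1:m-1}, \Delta t_{1:m-1})$, where the joint PDFs $\hat{w}(\Delta x_{1:m}, \Delta t_{1:m}) = \hat{w}_m$ and $\hat{w}(\Delta x_{1:m-1}, \Delta t_{1:m-1}) = \hat{w}_{m-1}$ satisfy the following relations:

$$ \hat{w}_m = \hat{w}_{m-1}\, h(\Delta x_{m-1:m}, \Delta t_m), \quad \text{where} \tag{2.11} $$

$$ h(\Delta x_{m-1:m}, \Delta t_m) = \left\{ (\lambda_m - \mu_m \Delta x_m^{\alpha_m})\, l(\Delta x_{m-1})^{\frac{q-2}{1-q}} \right\}^{\frac{1}{q-1}} \quad \text{with} \quad l(\Delta x_{m-1}) = 1 + (q-1)(\lambda_{m-1} + \mu_{m-1} \Delta x_{m-1}^{\alpha_{m-1}}) $$

One can notice from eq. (2.11) that if we do not have knowledge about the fractional order statistical moments, then $w_m = w(\Delta x_{1:m}, \Delta t_{1:m})$ is proportional to the initial joint PDF $w_1 = w(\Delta x_1, \Delta t_1)$ (i.e., $w(\Delta x_{1:m}, \Delta t_{1:m}) \propto w(\Delta x_{1:m-1}, \Delta t_{1:m-1}) \propto \dots \propto w(\Delta x_1, \Delta t_1)$). It should be noted that $w(\Delta x_m, \Delta t_m \mid \Delta x_{1:m-1}, \Delta t_{1:m-1})$ connects to the concept of Neyman-Rubin causality [112] and preserves the causal dependence in space and time.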
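Definition 1 can be checked numerically. The sketch below approximates the Tsallis entropy of equation (2.4) by a Riemann sum over a uniform grid and compares it with the Shannon limit as the entropic index approaches 1. The function name, the grid size and the uniform test density are assumptions chosen for illustration; they do not come from the text.

```python
import math


def tsallis_entropy(pdf_values, cell_area, q):
    """Riemann-sum approximation of S_q[f] = 1/(q-1) * integral of (f - f^q).

    `pdf_values` holds the density sampled at the centres of equal-area grid
    cells (an illustrative discretization, not part of the definition).
    """
    if abs(q - 1.0) < 1e-12:
        # q -> 1 limit: Shannon (differential) entropy, -integral of f ln f.
        return -sum(f * math.log(f) for f in pdf_values if f > 0.0) * cell_area
    return sum(f - f ** q for f in pdf_values) * cell_area / (q - 1.0)


# Uniform density on [0, 2] x [0, 2]: f = 1/4 everywhere.
n = 20
cell = (2.0 / n) ** 2
grid = [0.25] * (n * n)

s_2 = tsallis_entropy(grid, cell, 2.0)            # closed form: 1 - 4**(1 - 2) = 0.75
s_near_1 = tsallis_entropy(grid, cell, 1.001)     # approaches ln 4 as q -> 1
s_shannon = tsallis_entropy(grid, cell, 1.0)      # ln 4
```

For a uniform density on a region of area 4 the closed form is $(1 - 4^{1-q})/(q-1)$, which gives 0.75 at $q = 2$ and tends to $\ln 4$ as $q \to 1$, so the sketch can be compared against known values.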
2.1.2 Non-Markovian Probabilistic Description

In the previous subsection, we observed that under some knowledge about the stochastic process magnitudes ($\Delta x$) and time increments ($\Delta t$), reflected as constraints in the maximum causal non-extensive entropy formulation, the joint distribution can take exponential, power-law or more complex mathematical expressions. In this section, we incorporate the results derived in subsection 2.1.1 and provide a probabilistic description of the stochastic process $x(t)$ through a generalized master equation (GME) for the conditional PDF $P(x, t \mid \alpha, \beta)$:

$$ P(x, t \mid \alpha, \beta) = P_0(x, t \mid \alpha, \beta) + \int_0^t d(\Delta t) \int_{-\infty}^{\infty} w(\Delta x, \Delta t \mid \alpha, \beta)\, P(x - \Delta x, t - \Delta t \mid \alpha, \beta)\, d(\Delta x) \tag{2.12} $$

where $P(x, t \mid \alpha, \beta)$ represents the conditional PDF that the stochastic process $x(t)$ attains value $x$ at time $t$ given that the process magnitude increments $\Delta x$ and the process inter-event times $\Delta t$ are characterized by fractal exponents $\alpha$ and $\beta$, $w(\Delta x, \Delta t)$ represents the joint PDF of the process magnitude increments $\Delta x$ and the inter-event times $\Delta t$ that is assumed to satisfy a fractal scaling relationship characterized by the fractal exponents $\alpha$ and $\beta$, and $P_0(x, t \mid \alpha, \beta)$ is the initial condition that the stochastic process was initiated from state $x$ at time $t = 0$. The conditioning on the fractal exponents $\alpha$ and $\beta$ in eq. (2.12) of the PDF $P(x, t \mid \alpha, \beta)$ is motivated by the fact that the joint PDF of the process magnitude increments and the inter-event times $w(\Delta x, \Delta t)$ can be expressed as a generalized fractional Taylor (power) series: $w(\Delta x, \Delta t) = \sum_k (A_k + B_k \Delta x^{\alpha_k})\, \Delta t^{\beta_k - 1}$. Consequently, we express the GME for the conditional PDF $P(x, t \mid \alpha, \beta)$ for a specific $\alpha$ and $\beta$, with $w(\Delta x, \Delta t \mid \alpha, \beta) \propto \Delta x^{\alpha} \Delta t^{\beta - 1}$, and integrate over the set of exhibited fractal exponents.
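Taking into account the Riemann-Liouville fractional order integral of order $\beta > 0$ for a function $f$ over the space of locally integrable functions [99]:

$$ I^{\beta} f(t) = \frac{1}{\Gamma(\beta)} \int_0^t \tau^{\beta - 1} f(t - \tau)\, d\tau \tag{2.13} $$

the relation between the fractional order integral and the fractional order derivative (for $0 < \beta < 1$)

$$ {}_0D_t^{\beta} f(t) = \frac{d^{\beta} f(t)}{dt^{\beta}} = {}_0I_t^{-\beta} f(t) = \frac{1}{\Gamma(1 - \beta)} \int_0^t \frac{1}{(t - \tau)^{\beta}} \frac{df(\tau)}{d\tau}\, d\tau \tag{2.14} $$

and the fractional Kramers-Moyal expansion, the GME (2.12) [56] can be rewritten as follows:

$$ {}_0D_t^{\beta} P(x, t \mid \alpha, \beta) = \int_{-\infty}^{x} d\xi\; l_{+}(x - \xi) [P(\xi, t \mid \alpha, \beta) - P(x, t \mid \alpha, \beta)] + \int_{x}^{\infty} l_{-}(x - \xi) [P(\xi, t \mid \alpha, \beta) - P(x, t \mid \alpha, \beta)]\, d\xi \tag{2.15} $$

The transition functionals $l_{+}(z)$ and $l_{-}(z)$ can take many different mathematical expressions. In what follows, we consider a generalized mathematical form motivated by the observed statistical asymmetry of several physiological time series:

$$ l_{+}(z) = a_1 \frac{1}{\pi} \Gamma(1 + \alpha) \sin\!\left( \frac{(\alpha + \theta)\pi}{2} \right) z^{-(1 + \alpha)} + a_2 z^{-1} \quad \text{for } z \ge 0 \tag{2.16} $$

$$ l_{-}(z) = a_1 \frac{1}{\pi} \Gamma(1 + \alpha) \sin\!\left( \frac{(\alpha - \theta)\pi}{2} \right) z^{-(1 + \alpha)} + a_2 z^{-1} \quad \text{for } z < 0 \tag{2.17} $$

Although other mathematical expressions can be retrieved from statistical analysis of the magnitude increments observed in the time series, the analysis of their implications on the overall mathematical form of the master equation is left for future work.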
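To make the Riemann-Liouville fractional order integral of equation (2.13) concrete, the following sketch evaluates it numerically and compares the result with two known closed forms. The quadrature scheme (the power kernel is integrated exactly on each cell and the function is sampled at the cell midpoint) and the names `rl_integral`, `order` and `n` are assumptions made for illustration only.

```python
import math


def rl_integral(f, t, order, n=2000):
    """Approximate the Riemann-Liouville integral of `f` of the given order at t.

    I f(t) = 1/Gamma(order) * integral_0^t tau^(order-1) f(t - tau) d tau.
    Illustrative scheme: the kernel tau^(order-1) is integrated exactly over
    each of n cells, which handles its singularity at tau = 0 for order < 1.
    """
    h = t / n
    total = 0.0
    for i in range(n):
        lo, hi = i * h, (i + 1) * h
        weight = (hi ** order - lo ** order) / order   # exact kernel integral
        total += weight * f(t - 0.5 * (lo + hi))       # midpoint sample of f
    return total / math.gamma(order)


# Known results: the integral of 1 is t^order / Gamma(1 + order), and the
# integral of f(t) = t is t^(1 + order) / Gamma(2 + order).
half_of_one = rl_integral(lambda s: 1.0, 2.0, 0.3)
half_of_t = rl_integral(lambda s: s, 1.0, 0.5)
ordinary = rl_integral(lambda s: s, 1.0, 1.0)          # order 1: plain integral, 1/2
```

Setting the order to 1 recovers the ordinary integral, which is one way to see the fractional operator as an interpolation between the identity and classical integration.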
For the above power-law asymmetric transition functionals, the GME in (2.15) takes the following form:

$$ {}_0D_t^{\beta} P(x, t \mid \alpha, \beta) = a_1 \frac{\partial^{\alpha} P(x, t \mid \alpha, \beta)}{\partial |x|^{\alpha}} + a_2 P(x, t \mid \alpha, \beta) \tag{2.18} $$

Taking into account that the magnitude increments and the inter-event times can be characterized by a distribution of fractional exponents, the above partial differential equation can be rewritten as follows:

$$ \int_{\beta_{\min}}^{\beta_{\max}} d(\beta)\; {}_0D_t^{\beta} P(x, t)\, \mathrm{d}\beta = \int_{\alpha_{\min}}^{\alpha_{\max}} a_2 P(x, t)\, e(\alpha)\, \mathrm{d}\alpha + \int_{\alpha_{\min}}^{\alpha_{\max}} a_1 \frac{\partial^{\alpha} P(x, t)}{\partial |x|^{\alpha}}\, e(\alpha)\, \mathrm{d}\alpha \tag{2.19} $$

where $d(\beta)$ denotes the distribution of fractional exponents characteristic of the statistics of the inter-event times, $\beta_{\min}$ and $\beta_{\max}$ are the lower and upper bounds on the exhibited fractional exponents for inter-event times, $e(\alpha)$ is the distribution of fractional exponents corresponding to the statistics of the magnitude increments of $x(t)$, and $\alpha_{\min}$ and $\alpha_{\max}$ are the lower and upper limits on the fractional exponents characterizing the magnitude increments.

Investigating and determining the dynamical equations governing the evolution of complex systems implies analyzing and characterizing the statistical interdependency of multi-dimensional signature processes. Towards this end, by assuming that the inter-event times between changes in all the signature processes are characterized by a fractal exponent $\beta$, a multi-dimensional generalization of eq. (2.15) reads:

$$ \begin{aligned} P(x_1, \dots, x_n, t \mid \alpha_1, \dots, \alpha_n, \beta_j) = {} & P_{t=0}(x_1, \dots, x_n \mid \alpha_1, \dots, \beta) \\ & + \int_0^t d\tau \int_{-\infty}^{x_1} d\xi_1 \cdots \int_{-\infty}^{x_n} d\xi_n\; l_{+}(x_1 - \xi_1, \dots, x_n - \xi_n, t - \tau) \\ & \quad \times [P(\xi_1, \dots, \xi_n, \tau \mid \alpha_1, \dots, \beta) - P(x_1, \dots, x_n, t \mid \alpha_1, \dots, \beta)] \\ & + \int_0^t d\tau \int_{x_1}^{\infty} d\xi_1 \cdots \int_{x_n}^{\infty} d\xi_n\; l_{-}(x_1 - \xi_1, \dots, x_n - \xi_n, t - \tau) \\ & \quad \times [P(\xi_1, \dots, \xi_n, \tau \mid \alpha_1, \dots, \beta) - P(x_1, \dots, x_n, t \mid \alpha_1, \dots, \beta)] \end{aligned} \tag{2.20} $$

where the transition probabilities $l_{+}(z_1, \dots, z_n, \tau)$ and $l_{-}(z_1, \dots, z_n, \tau)$ are introduced to capture the possible asymmetry in the evolution of the positive and negative increments of each process $x_j(t)$ and the interdependency between all processes $x_1(t), \dots, x_n(t)$.
In the master equation (2.20), the transition probabilities $l$ help capture both the causal structure (i.e., how one or a set of system variables influences a specific variable and with what degree or strength, and how system variables are impacted by external perturbations) and the causal dynamics (i.e., how much of the past evolution, and with what strength, influences the rate of change in one or multiple system variables). Alternatively stated, the master equation encodes the spatio-temporal interdependency and memory (changes in the patterns of interactions between system variables and variations in external inputs if needed) of a complex system. In the next subsection, we consider a particular symmetric case, in which $l_{+}(z_1, \dots, z_n, \tau)$ and $l_{-}(z_1, \dots, z_n, \tau)$ take the following form:

$$ l(z_1, \dots, z_n, \tau) = \sum_{j=1,\, k \ne j}^{n} [a_j + b_j z_j^{-1} + c_j z_j^{-1} z_k^{-1}]\, \tau^{\beta_j - 1} \tag{2.21} $$

Under these transition probabilities, the master equation (2.20) can be expressed as follows:

$$ P(x_1, \dots, x_n, t \mid \beta_j) = I^{\beta_j} \sum_{j=1,\, k \ne j}^{n} \left[ a_j P(x_1, \dots, x_n, t \mid \beta_j) + b_j \frac{\partial P(x_1, \dots, x_n, t \mid \beta_j)}{\partial x_j} + c_j \frac{\partial^2 P(x_1, \dots, x_n, t \mid \beta_j)}{\partial x_j\, \partial x_k} \right] \tag{2.22} $$

where the inter-event times are characterized by a fractal exponent $\beta_j$ that is determined by the intrinsic degree of memory associated with process $x_j$. We also express the conditional probability because in some cases the process $x_j(t)$ can be characterized by multiple fractal exponents, and so a distribution $d(\beta_j)$ is required to fully characterize its dynamics.

2.1.3 Dynamical State Equations

One-dimensional case: In what follows, we show how the aforementioned analysis helps determine the time dependence of the statistical moments associated with the stochastic process $x(t)$, defined as follows:

$$ E_k(t) = E[x^k] = \int_{-\infty}^{\infty} |x|^k P(x, t)\, dx \tag{2.23} $$

where $k$ denotes the order of the statistical moments and $P(x, t)$ is assumed to be described by equation (2.19).
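By multiplying both sides of equation (2.19) by $|x|^k$ and integrating over the $x$-magnitude space, we obtain the following relation for the $k$-th order moment of $x(t)$:

$$ \int_{\beta_{\min}}^{\beta_{\max}} d(\beta) \frac{\partial^{\beta} E_k(t)}{\partial t^{\beta}}\, \mathrm{d}\beta = \int_{\alpha_{\min}}^{\alpha_{\max}} a_2 E_k(t)\, e(\alpha)\, \mathrm{d}\alpha + \int_{\alpha_{\min}}^{\alpha_{\max}} a_1 e(\alpha) \int_{-\infty}^{\infty} |x|^k \frac{\partial^{\alpha} P(x, t)}{\partial |x|^{\alpha}}\, dx\, \mathrm{d}\alpha \tag{2.24} $$

Making use of the integration by parts formula leads to

$$ \int_{\beta_{\min}}^{\beta_{\max}} d(\beta) \frac{\partial^{\beta} E_k(t)}{\partial t^{\beta}}\, \mathrm{d}\beta = \int_{\alpha_{\min}}^{\alpha_{\max}} a_2 E_k(t)\, e(\alpha)\, \mathrm{d}\alpha + \int_{\alpha_{\min}}^{\alpha_{\max}} a_1 e(\alpha) \int_{-\infty}^{\infty} \frac{\partial^{\alpha} |x|^k}{\partial |x|^{\alpha}} P(x, t)\, dx\, \mathrm{d}\alpha \tag{2.25} $$

and using the following result from fractional calculus

$$ \frac{\partial^{\alpha}\, x^k (a + bx)^m}{\partial x^{\alpha}} = a^{\alpha} \frac{\Gamma(k + 1)\, x^{k - \alpha} (a + bx)^{m - \alpha}}{\Gamma(k + 1 - \alpha)} \tag{2.26} $$

equation (2.24) can be written as follows:

$$ \int_{\beta_{\min}}^{\beta_{\max}} d(\beta) \frac{\partial^{\beta} E_k(t)}{\partial t^{\beta}}\, \mathrm{d}\beta = \int_{\alpha_{\min}}^{\alpha_{\max}} a_2 E_k(t)\, e(\alpha)\, \mathrm{d}\alpha + \int_{\alpha_{\min}}^{\alpha_{\max}} a_1 \frac{\Gamma(k + 1)}{\Gamma(k + 1 - \alpha)} E_{k - \alpha}(t)\, e(\alpha)\, \mathrm{d}\alpha \tag{2.27} $$

Of note, for $d(\beta) = \delta(\beta - \beta_0)$ and $a_1 = 0$, considering noise terms and discretizing equation (2.27) using the Grünwald-Letnikov discrete fractional differential operator [113], we obtain the well-known autoregressive fractionally integrated moving average (ARFIMA) type of models [51][58]. Furthermore, equation (2.27) represents a new class of mathematical models that could be used to study time series.

Multi-dimensional case: Modeling complex dynamical systems and determining the governing differential equations for various state variables (e.g., mean, variance, skewness, kurtosis associated with a stochastic process) requires characterizing the statistical interdependency between several processes. While the transition probabilities can in general take many forms, in what follows we analyze the case in which the evolution of the magnitudes and inter-event times associated with a set of interdependent processes is described by a joint PDF of the form in (2.21).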
By multiplying both sides of equation (2.22) by $x_j$, integrating the equation over the entire space (all variables $x_1$ to $x_n$) and taking into account the inverse relationship between the fractional order integral and fractional order derivative operators, we obtain a dynamical equation for the mean $M_j(t)$ of the process $x_j$:

$$ {}_0D_t^{\beta_j} M_j(t) = \sum_{j=1,\, j \ne k}^{n} [a_j M_j(t) - b_j + c_j M_k(t)] \tag{2.28} $$

or in matrix format:

$$ \begin{bmatrix} {}_0D_t^{\beta_1} \\ \vdots \\ {}_0D_t^{\beta_n} \end{bmatrix} \begin{bmatrix} M_1(t) \\ \vdots \\ M_n(t) \end{bmatrix} = A \begin{bmatrix} M_1(t) \\ \vdots \\ M_n(t) \end{bmatrix} + E \tag{2.29} $$

where the matrices $A$ and $E$ contain the coefficients appearing in each fractional order differential equation for $M_j(t)$. Taking into account the Riemann-Liouville fractional order operation, one can notice the embedded capability of the proposed fractal dynamical state equation (FDSE) to capture the interdependency and memory effect through the construction of the master equation (2.19). The majority of prior works implicitly assume that the dynamics of the system is governed by a first-order derivative, which suggests that the successive changes (positive and negative increments) in the magnitude of the processes are characterized by an exponential law. In contrast, equation (2.29) considers systems where the magnitude of the variation over time contradicts the time-domain patterns of commonly employed Markovian processes (e.g., Poisson approaches). The dynamics of such systems usually exhibit non-exponential and fractal behaviors characterized by a power law.

2.2 Experiments

2.2.1 Experiment Setup

To make the discussion more concrete and exemplify the mathematical formalism, we investigated the statistical properties of a set of primary physiological processes (i.e., muscular, cardiac and neural processes). More specifically, we analyze the intramuscular EMG (iEMG), EEG and ECG signals from realistic clinical experiments.
The iEMG signals are collected at different sites of the forearm muscles, as shown in Figure 2.2: (i) extensor digitorum (ED); (ii) flexor digitorum profundus (FDP); (iii) abductor pollicis longus (APL); (iv) flexor pollicis longus (FPL); (v) pronator teres (PT) and (vi) supinator (SUP), while the subject is asked to relax for 6 seconds and then perform finger flexion at a consistent strength for 10 seconds. The ADInstruments data acquisition system sampled the iEMG at 4 kHz.

[Figure 2.2 labels: Extensor Digitorum; Supinator; Abductor Pollicis Longus; Flexor Digitorum Profundus; Pronator Teres; Flexor Pollicis Longus; fine wire electrodes; motor neurons; neural activities; ADI EMG recording system, sampling at 4 kHz; channels of interest.]

Figure 2.2: EMG setup: Intramuscular EMG signals are measured at 6 muscles (i.e., 2 flexor muscles, 2 extensor muscles, 1 pronator muscle and 1 supinator muscle). Fine wire electrodes are inserted into the subject's muscles for measurement purposes. All channels are considered, and the channels highlighted in red are selected as a case study.

The EEG signals are recorded by the 64-channel electroencephalogram acquisition system shown in Figure 2.3, which monitors the brain activity of 109 subjects while they perform motor and imagery tasks upon noticing objects appearing on the screen [118]. Each subject is asked to open and close the corresponding fists or feet as a function of where the target appears. Each individual performed 14 experimental runs consisting of one minute with eyes open, one minute with eyes closed, and three two-minute runs of interacting with the target. The data set is collected by the BCI2000 system with a sampling rate of 160 Hz.

The raw clinical ECG data was extracted from the PTB diagnostic ECG database [23]. Data on 52 healthy subjects (13 women, age 48 ± 19, and 39 men, age 42 ± 14) was obtained by the National Metrology Institute of Germany.
Each subject's record includes 15 different signals acquired simultaneously: the conventional 12 leads (I, II, III, aVR, aVL, aVF, V1, V2, V3, V4, V5, and V6) and 3 Frank orthogonal leads (VX, VY, and VZ), as shown in Figure 2.4. Each signal is digitized at 1000 Hz, with a signal bandwidth of 0 Hz to 1 kHz and with 1 µV LSB resolution.

[Figure 2.3 labels: 64 numbered EEG electrode positions; channels of interest.]

Figure 2.3: The 64-channel geodesic sensor distribution for measurement of EEG.

2.2.2 Investigation of Non-Exponential System Dynamics

In the first set of experiments, we study the stochastic nature of the magnitude increments (positive and negative) in all physiological processes considered, in order to distinguish between exponential and power-law behavior or between Gaussian and non-Gaussian cases. More specifically, we perform a statistical analysis to first estimate the empirical cumulative distribution function (CDF) and its support for the positive increments and the absolute value of the negative increments of the physiological process magnitude. By fitting the measurements to the postulated stochastic models, we estimate the parameters of i) an α-stable distribution, ii) an exponential distribution and iii) a Pareto distribution via the maximum likelihood method. Then the CDFs of the three stochastic models are generated based on the estimates of the model parameters.
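The fitting and comparison steps can be sketched for the simplest of the three candidate models. The Python sketch below fits an exponential distribution by maximum likelihood, generates a synthetic sample from the fitted model, and compares it with the data through a two-sample Kolmogorov-Smirnov statistic. Several points are assumptions made for illustration: the synthetic heavy-tailed data stands in for real measurements, the exponential parameter is taken to be the mean, the p-value uses the standard asymptotic Kolmogorov series, and all function names are hypothetical. The α-stable and Pareto fits used in the text have no equally short closed form and are not reproduced here.

```python
import math
import random


def fit_exponential_mean(samples):
    """Maximum-likelihood estimate of the exponential mean (the sample mean)."""
    return sum(samples) / len(samples)


def survival(samples, threshold):
    """Empirical survival function: fraction of samples above `threshold`."""
    return sum(1 for s in samples if s > threshold) / len(samples)


def ks_two_sample(a, b):
    """Two-sample Kolmogorov-Smirnov statistic D and asymptotic p-value."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:   # advance both ECDFs past x
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    lam = math.sqrt(len(a) * len(b) / (len(a) + len(b))) * d
    if lam < 0.2:
        return d, 1.0                     # series below is numerically 1 here
    series = sum((-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
                 for k in range(1, 101))
    return d, min(1.0, max(0.0, 2.0 * series))


# Illustrative data only: heavy-tailed increments drawn from a Pareto law.
rng = random.Random(0)
heavy = [rng.paretovariate(1.5) for _ in range(2000)]
mean_hat = fit_exponential_mean(heavy)
synthetic = [rng.expovariate(1.0 / mean_hat) for _ in range(2000)]
d_stat, p_value = ks_two_sample(heavy, synthetic)
reject_exponential = p_value < 0.05       # significance level used in the text
```

With heavy-tailed input the exponential hypothesis is rejected at the 0.05 level, which is the qualitative behavior the analysis of the physiological increments is designed to detect.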
To quantify the statistical confidence of the model fitting, we performed the two-sample Kolmogorov-Smirnov test between the measurements and the stochastic processes generated with the identified α-stable, exponential and Pareto distribution parameters. The null hypothesis assumes that the measurements come from the same distribution as the postulated distributions, with a significance level of 0.05. As a case study, we report the results of a selected set of physiological channels from the subjects that best illustrate their stochastic nature. Both positive and negative increments are studied, and we visualize the experiments considering the positive increments in Figure 2.5. Figure 2.5(a-c) shows how the three models fit the empirical survival function (SF) of a set of EEG (a), ECG (b) and iEMG (c) signal channels in the corresponding experiments, respectively. The channels of interest are highlighted in Figures 2.2-2.4. In all subfigures, the red squares correspond to the empirical SF, and the blue lines, the black lines and the green lines represent the best-fitted α-stable distribution, exponential distribution and Pareto distribution, respectively. The retrieved model parameters and p-values are reported in both the plot legends and Table 2.1.

[Figure 2.4 labels: limb leads (I, II, III, aVR, aVL, aVF); precordial leads (V1 to V6); electrode sites RA, LA, LL; channels of interest.]

Figure 2.4: The deployment of the 12-lead ECG system used in the experiment.

[Figure 2.5 panels: (a) EEG channels Fc1 and F1; (b) ECG leads II and V2; (c) iEMG channels ED and PT. Horizontal axis: positive increments. Vertical axis: Probability(positive increments > threshold). Legend entries: empirical SF of the positive increments; fitted α-stable SF; fitted exponential SF; fitted Pareto SF.]

Figure 2.5: Empirical survival functions and maximum likelihood best-fitting α-stable, exponential and Pareto distributions for magnitude increments in selected EEG, ECG and EMG channels. Figures (a)-(c) show the probabilities of the positive increments in magnitude exceeding a threshold value. The p-value is obtained by performing a two-sample Kolmogorov-Smirnov test with the null hypothesis that the measurements come from the same distribution as the postulated α-stable, exponential or Pareto distribution.

By examining the figures, we can make several important observations: i) The null hypothesis that the positive increments follow an exponential distribution is rejected in all physiological channels we considered, with p-values ranging from 0.001 to 0.043. This suggests that the magnitude variation over the time domain strongly contradicts an exponential law, which is a well-adopted assumption in previous work. ii) The p-values coincide well with our visual inspection that the exponential SF fitting (black lines) deviates from the empirical SF in all the channels considered. Instead, the Pareto distribution and the α-stable fitting better represent the stochastic properties of the signal variations over time in all channels. This suggests the existence of fractality, which is governed by a power-law distribution as postulated by equation (2.21), and which also coincides with our following observation. iii) For all channels that can be better characterized by an α-stable distribution or a Pareto distribution, the estimated stability parameters $\alpha$ are all smaller than 2, where $\alpha = 2$ corresponds to a Gaussian process.
[Figure 2.6 panels: (a) neural activity, EEG channels Fc1, F1 and P8; (b) cardiac activity, ECG leads II, V2 and V5; (c) muscular activity, iEMG channels ED, FDP and PT. Horizontal axis: sample ID. Legend entries: Measurement; FDSE@no coupling; FDSE@coupling; VARMA.]

Figure 2.6: Fitting the model to physiological measurements of EEG, ECG and iEMG considering i) FDSE with no coupling matrix $A$, ii) FDSE with coupling matrix $A$ and iii) Vector ARMA model with no fractal exponents.

For $\alpha < 2$, stable distributions have one tail (when $\alpha < 1$ and $\beta = \pm 1$) or both tails (all other cases) that are asymptotically power laws with heavy tails. This is best illustrated in the FDP and PT channels of the iEMG signals, where the empirical distribution fits the α-stable distribution well up to a certain transition point at which the empirical SF starts to obey the Pareto distribution (i.e., a power law).
This suggests the existence of fractality in these physiological processes, which is aligned with the analytical prediction made by Theorem 1, hence justifying the application of the proposed fractal dynamical state equation described in equation (2.29). A similar conclusion can also be reached by investigating the stochastic properties of the negative increments (see Table 2.1) in the physiological processes.

Table 2.1: KS-test p-values and ML best-fitting parameters of the α-stable, exponential and Pareto distributions for selected channels.

Positive increments
Channel   α, β, γ, δ             p_stbl   λ       p_exp    k, σ            p_par
Fc1       1.87, 1, 1, 6.05       0.15     12.14   0.0089   -0.35, 16.39    0.44
F1        1.85, 1, 1, 5.56       0.28     12.34   0.053    -0.47, 18.07    0.58
P8        1.57, 1, 1, 4.62       0.19     11.38   0.0025   -0.19, 13.56    0.16
II        1.56, 1, 1, 0.003      0.44     0.007   0.02     -0.22, 0.008    0.30
V2        1.57, 1, 1, 0.001      0.19     0.003   0.043    -0.23, 0.004    0.32
V5        1.4, 1, 1, 0.001       0.15     0.003   0.0017   -0.19, 0.004    0.33
ED        1.93, 1, 1, 0.003      0.56     0.006   0.001    -0.30, 0.008    0.15
FDP       1.09, 1, 0.59, 0.004   0.36     0.014   0.01     0.18, 0.011     0.22
PT        1.54, 1, 1, 0.001      0.4      0.003   0.006    0.25, 0.002     0.24

Negative increments
Channel   α, β, γ, δ             p_stbl   λ       p_exp    k, σ            p_par
Fc1       1.56, 1, 1, 5.56       0.08     14.47   0.071    0.39, 20.04     0.18
F1        1.89, 1, 1, 6.02       0.15     11.29   0.05     -0.43, 16.18    0.73
P8        1.88, 1, 1, 6.17       0.01     11.2    0.03     -0.25, 14.03    0.16
II        1.69, 1, 1, 0.003      0.24     0.006   0.0089   -0.20, 0.007    0.21
V2        1.8, 1, 1, 0.002       0.05     0.004   0.01     -0.40, 0.005    0.38
V5        1.93, 1, 1, 0.002      0.01     0.003   0.047    -0.42, 0.005    0.49
ED        1.92, 1, 1, 0.003      0.1      0.005   0.034    -0.29, 0.007    0.65
FDP       0.49, 1, 1, 0.003      0.15     0.021   0.0001   0.87, 0.007     0.47
PT        1.41, 1, 1, 0.001      0.37     0.002   0.004    0.051, 0.002    0.68

2.2.3 Efficacy Evaluation of FDSE for Physiological Processes

The investigation of the stochastic characteristics of the processes considered verified the existence of non-Gaussianity and fractality (nonlinearity) in the physiological systems. Therefore, the dynamical behaviors of these systems in the time and spatial domains cannot be accurately described by conventional methods that assume a stationary and Markovian system state equation with short-term memory. In what follows, we present the second set of experiments, where we are interested in the spatial (i.e., interdependency across multiple channels) and temporal (i.e., how the previous system state passes down its influence to the current system dynamics) dependency structure of the physiological processes. We evaluated the capability of the proposed FDSE to capture the complex dynamical behaviors of physiological processes. More precisely, we employed the least-square error estimator proposed in [150, 147] to identify the model described by equation (2.29). After the identification of the model, we evaluate the model adequacy by comparing the physiological measurements with the predicted model output as a goodness-of-fit metric. To compare with the Vector ARMA (VARMA) model and understand the significance of coupling the fractal exponents into the FDSE (i.e., considering the long-term memory in the system state dynamics), we report three experimental settings: i) only fractal exponents are considered (i.e., assuming an identity matrix for $A$ in equation (2.29)); ii) only the coupling matrix $A$ is considered (i.e., assuming $\beta_i = 1$, which reduces to a VARMA-type model) and iii) both the coupling matrix and the fractal exponents are considered (i.e., FDSE).

We show the comparison results for a selected set of channels from the EEG (a), ECG (b) and iEMG (c) measurements in Figure 2.6, respectively. The blue lines show the actual measurements. The orange lines represent the predicted model output where only fractal exponents are considered (i.e., setting i). The yellow lines and the magenta lines correspond to settings ii and iii, respectively. The estimated fractal exponents range from 0.30 to 0.66, 0.94 to 1.19 and 0.18 to 0.61 for the EEG, ECG and iEMG signals, respectively, hence verifying the existence of fractality.
Two key observations can be made for Figure 2.6: i) In all experiments we considered, the efficacy of incorporating the fractal exponents that capture the long-term memory effect in system dynamics is best illustrated by the comparison between the predicted model output and the actual measurements. In spite of the difference in magnitude, the predicted model output stays close to the actual measurement in terms of preserving critical system state transition behaviors. Intuitively, the predicted model output preserves the turning points and the envelope of the state dynamics of the actual physiological processes. This is of primary importance for the construction of a time-series model capable of characterizing the physiological processes. Vital changes in bio-markers of the physiological system usually correspond to infrequent anomalies (e.g., the abrupt decrease/increase in blood glucose/blood pressure or excessive brain activity caused by epilepsy). The failure of the model to capture such vital changes translates to false negative errors and might lead to irreversible undesired consequences. In contrast, the fitted VARMA models, which do not consider the long-term memory, tend to smooth away the sudden changes in model output in order to minimize least-square errors as a consequence of the regression process. ii) Comparing the goodness-of-fit between FDSE models with and without the coupling matrix A that encodes the interdependency across different channels leads to interesting findings. As shown in the figure, the FDSE without the coupling matrix A tends to overestimate the signal magnitude as a result of accumulating the influence of the previous system states over a long time course. In contrast, the FDSE model with coupling matrix A aligns well with the actual measurements in all experiments, suggesting its adequacy in characterizing the physiological processes.
The performance difference can be understood as follows: the FDSE assuming an identity matrix A has no knowledge of how coupled physiological processes contribute to the state transition dynamics of each other. As a result, given a specific channel, the estimation process tries to compensate for the contribution from other channels by assuming a long-lasting influence from the previous system states. By incorporating the matrix A, the predicted model output is well regulated and adequately fits the actual measurements by coupling the interdependency between different channels of the physiological signals.

2.3 Summary

Understanding the implications of the degree of nonlinearity and the nature of fractality (i.e., distinguishing between fractality in the magnitude increments (space) of one variable with respect to other variables and the fractality in the inter-event times) represents the motivation of this work. We generalize the linear / nonlinear dynamic causal approaches by adopting a statistical physics inspired probabilistic description of various processes representing the evolution of a complex system and incorporating the statistics of the magnitude increments and the inter-event times into the mathematical expression of the master equation. First, this new approach allows us to capture the power law and nonlinear interactions that exist between the magnitude increments and the inter-event times of one stochastic process on one hand, and the inter-dependencies between the magnitude increments and the inter-event times of one process and other processes, on the other hand. Second, it provides a mathematical strategy for modeling complex systems whose dynamics exhibit a mixture of Markovian and non-Markovian evolution.
Relying on a conditional probabilistic description of how ordered sequences of magnitude increments and inter-event times affect the overall system dynamics allows us to define new multivariate causal inference techniques that take into account the non-Markovian nature of the dynamics. This is left for future work. Moreover, it allows us to mathematically justify the adoption of a class of mathematical models that could potentially complement current Bayesian model selection strategies. The presented mathematical framework could be enriched by combining it with other techniques from statistical machine learning and signal processing for developing new modeling strategies for complex interdependent networks. Third, the proposed causal modeling of complex dynamics can be integrated as the core of cognizant cyber-physical systems in a wide range of CPS applications to understand, describe, predict and control the underlying physical processes.

Chapter 3 Spatio-Temporal Fractal Model for a CPS Approach to Physiological Systems

Advances in the physical sciences and engineering enable the development of cyber-physical systems (CPS) to understand, interface / interact with and engineer the physical (biological) world (systems). The CPS paradigm refers to a broad class of smart systems with deeply embedded cyber capabilities for sensing, monitoring, and communicating the accumulated large amounts of data about the physical world to computational nodes for real-time analysis, interpretation and determination of closed-loop control strategies [2][9][20]. Although the synergetic coupling of physical and cyber processes has a tremendous impact on a broad application domain (e.g., environmental, healthcare, avionics, smart transportation / buildings / cities), it also raises a few grand challenges: What is the appropriate modeling framework that captures the CPS characteristics and facilitates the analysis, design and optimization of CPS?
What compact yet accurate modeling techniques that account for spatio-temporal complexity and fractal properties should be developed to enable the design of large-scale autonomous (or semi-autonomous) CPS? What are the data mining strategies that meet the real-time CPS requirements? How can the cognitive control of the human brain be understood and enabled within CPS architectures?

To address these challenges, we turn our attention to biological systems and draw inspiration from understanding the dynamics and functionality of the human brain to enable a more efficient and safer human-to-cyber-to-physical interaction. From a healthcare perspective, understanding and mining brain activity can be beneficial for developing CPS-based therapies (e.g., brain-machine-body interfaces (BMBI)) for brain disorders. As stated in [7], we have rich neurotechnology to measure brain activity, but lack mathematical models, algorithms and computational tools for understanding neural-muscle data, deciphering disease indicators, explaining brain disorders, identifying clinical therapies and enabling complex human-to-cyber interactions. From a control perspective, understanding and modeling human brain cognition could help us define the principles of autonomous systems design. For manufacturing and smart things, thought-controlled robots that can interact with our collective cognitive efforts could contribute not only to higher yield/performance, but also to safer environments. Thought-control systems could prove essential for cleanup operations in dangerous environments (e.g., nuclear reactors) or for maintaining high ecological standards. Traditional approaches for developing decoding algorithms of brain dynamics (converting spiking neural activity from motor cortex into muscle activity and kinematics of a prosthetic arm) have mainly focused on determining a mapping function from some observations of neural and muscle activity (training data) by minimizing a specific error metric on testing data. Despite some successes, these approaches have neglected some important features of brain-muscle dynamics: (i) A cognitive operation may activate a brain region, but the converse operation of activating that same region does not imply that the cognitive process is actually occurring. This implies that assuming instantaneous activation may offer a simplistic view and that the spatial correlation structure and functional dependency between multiple brain regions must be taken into account. (ii) Brain activity is not random. Even though many studies assumed a high degree of randomness in the neural dynamics for the purpose of employing statistical averaging, the actual brain dynamics and its coupled physiological processes proved to possess fractal characteristics. Simply speaking, the neural and muscle activity cannot be modeled as short-range dependent processes; rather, their long-range dependence should be accounted for to improve modeling accuracy and prediction capabilities.

[Figure 3.1 diagram labels: muscular activities; conscious neural motor activities; Brain-Machine-Body Interface; physical domain; cyber domain; Brain-Machine-Body interface signals and modeling; cyber-physical boundary; myoelectric equipment, etc.; brain-brain communication; thought-control machinery; translation to control.]

Figure 3.1: Cyber-physical system for Brain-Machine-Body interfaces

3.1 Related Work and Novel Contribution

Probing the brain's activity for the purpose of decoding its cognitive process, explaining its interaction with the overall body (e.g., muscles) and controlling the movement of arms originated in the 1960's with the pioneering work of Evarts [43]. More recently, Donoghue and co-workers designed the BrainGate, which allowed a tetraplegic patient with spinal cord injury to perform simple movements [39].
[Figure 3.2 panels: a) ED: Extensor Digitorum and b) APL: Abductor Pollicis Longus, muscular activities (mV) vs. time (sec); c-d) sample ACF vs. lag; e) log(SACF) vs. log(Lag), slope = -1.28; f) sample XACF vs. lag for Xcorr(ED,APL), Xcorr(ED,AWGN) and Xcorr(APL,AWGN).]

Figure 3.2: a-b) The iEMGs from two muscles (ED and APL) are measured when the subject is abducting the thumb for 10 seconds after 6 seconds of relaxing. c-d) The SACF decays hyperbolically rather than exponentially, proving a long-range memory. e) The log-log plot of the SACF of the ED iEMG measurements against lags shows a power-law behavior. f) The SXACF between the iEMG signals proves the spatial interdependence over time.

Going beyond performing simple tasks requires advanced mathematical and algorithmic approaches for decoding brain-muscle activity. Current approaches for the analysis of electroencephalographic (EEG), electrocorticographic (ECoG) and electromyographic (EMG) signals can be classified into two classes: linear (e.g., linear estimator [38][114], population vector [140], Wiener filter [42], Kalman filter [40], recalibrated feedback intention-trained Kalman filter [44]) and nonlinear (e.g., artificial neural networks [117][37], particle filters [41]). To enhance the BMBI capabilities and performance, we must address the following challenges: (i) Real-time parameter identification of the decoder algorithm. We propose a closed-loop spatio-temporal fractal (STF) model that interacts with the BMBI user. This overcomes the suboptimal performance of algorithms that are based on offline cross-correlation validation. (ii) Use the neuromuscular spatial dependencies and their fractal dynamics to develop a model of the BMBI at runtime for the purpose of analysis, prediction and control. Many of the current approaches ignore these unique aspects.
To illustrate the discrepancies between the mathematical features of biological processes and the currently employed memoryless models, we analyze the intramuscular EMG (iEMG) signals from clinical experiments (iEMG recording study courtesy of The Alfred Mann Foundation) in which 3 healthy subjects are asked to perform forearm movements. The iEMG signals are recorded at different sites of the forearm muscles: (1) extensor digitorum (ED), (2) flexor digitorum profundus (FDP), (3) abductor pollicis longus (APL), (4) flexor pollicis longus (FPL), (5) pronator teres (Pro), and (6) supinator (Sup) (see Figure 3.3 and Section 3.3 for experimental setup details). Figure 3.2 shows 2 measured time series for 2 muscles (ED and APL) when the subject is asked to abduct the thumb at a consistent strength for 10 seconds before relaxing (a-b), the sample autocorrelation function (SACF) of the 2 signals (c-d), the log-log plot of the SACF of the signal collected from ED (e), and the sample cross-correlation function (SXACF) between the muscle time series (f). The individual analysis of iEMG signals via the SACF method (see Figure 3.2.c-e) shows that the SACF $\rho(k)$ decreases for higher lags $k$ as a power law rather than as an exponential, which would correspond to short-range memory or non-fractal models (including autoregressive (AR) models). This is best shown in Figure 3.2.(e), where the points corresponding to $\log(\rho(k))$ scatter around a straight line as a function of $\log(k)$, following a linear relation of the form $\log(\rho(k)) = \mathrm{const} + \mathrm{slope} \cdot \log(k)$. This demonstrates the existence of a correlated neural-muscular regulation over long temporal horizons. To probe the spatial cross-dependency among biological processes, we measure the sample cross-correlation functions (SXACF) (see Figure 3.2.(f)).
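The sample auto- and cross-correlation functions used in this kind of analysis can be sketched as follows. This is a minimal illustration, not the code used in the study: the function names `sample_xcorr` and `sample_acf` are hypothetical, and the normalization (full-series standard deviations in the denominator) is an assumption, since the text does not state which estimator was used.

```python
import math


def sample_xcorr(x, y, lag):
    """Sample cross-correlation between x(t) and y(t + lag).

    Both series are mean-centered and the sum of lagged products is divided
    by the product of the full-series root sums of squares, so the result
    lies in [-1, 1] (assumed normalization, see lead-in).
    """
    n = len(x)
    if len(y) != n:
        raise ValueError("x and y must have the same length")
    mx = sum(x) / n
    my = sum(y) / n
    sx = math.sqrt(sum((v - mx) ** 2 for v in x))
    sy = math.sqrt(sum((v - my) ** 2 for v in y))
    if sx == 0.0 or sy == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    if lag >= 0:
        pairs = zip(x[: n - lag], y[lag:])
    else:
        pairs = zip(x[-lag:], y[: n + lag])
    return sum((a - mx) * (b - my) for a, b in pairs) / (sx * sy)


def sample_acf(x, max_lag):
    """Sample autocorrelation for lags 0..max_lag (cross-correlation of x with itself)."""
    return [sample_xcorr(x, x, k) for k in range(max_lag + 1)]
```

Plotting the logarithm of `sample_acf` values against the logarithm of the lag is one way to check for the straight-line (power-law) behavior described for Figure 3.2.(e); an exponential decay would instead bend downward in such a plot.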
In contrast to uncorrelated signals (e.g., additive white Gaussian noise with zero SXACF), the SXACF analysis among iEMG signals demonstrates consistent interactive influence and reveals spatial long-term dependencies. The amplitudes of the SXACF coefficients fluctuate over a long temporal horizon, which is not expected for short-memory processes. Such long-term memory effects can also be found in other neural and muscular signals and are not discussed here due to limited space. In summary, the analysis in Figure 3.2 shows that even a simple movement of the thumb calls for a synergetic model of neural, muscular and cyber components as interdependent networks. Put differently, capturing such long-term cross-dependent behavior associated with BMBI calls for the development of a multivariate fractal mathematical model within CPS. In light of these mathematical observations, we make the following contributions:

First, we propose a data-driven multivariate fractal model to capture the long-range memory and spatial cross-dependencies that exist between biological (neural, muscular) and cyber processes. By exploiting the fractal properties of the biological processes, the model can be learned within a CPS infrastructure at run-time from fewer measurements.

Second, we develop an efficient algorithm for identifying the parameters of the proposed mathematical model and provide theoretical bounds concerning the minimum number of samples that lead to good identification performance.

Third, we investigate the effectiveness and accuracy of the proposed mathematical model, contrast it with prior memoryless approaches to highlight its benefits, and validate it against clinical measurements and known biological facts from the medical literature. This mathematical framework enables the understanding of cognitive control and the development of advanced control techniques for CPS.
3.2 Spatio-temporal Fractal (STF) Modeling

3.2.1 Premises and Vision for Constructing the STF Model

The complex interplay between the neural and muscle systems leads to a closed-loop networked control architecture that translates bioelectrical signals into motor commands and enables the human body to execute highly refined, high degree-of-freedom movements. Decoding how such translation takes place is essential for BMBI that can further enable brain-to-brain communication and thought-controlled robots. Towards this end, the CPS approach to BMBI shown in Figure 3.1 needs a mathematical modeling framework for describing brain-body-cyber dynamics and enabling human-cyber interactions.

To build a robust mathematical framework for the CPS approach to BMBI, we represent the interplay between neural, muscular and cyber processes as three highly dynamic interdependent networks. From a sensing perspective, the neural activity can be non-invasively sensed via EEG. To quantify the impact of neural commands on muscle activity, we can record the muscle electrical dynamics via EMG signals. Usually two types of EMG signals are recorded: (i) The surface EMG (sEMG) signals are used to assess the muscle function from skin-level measurements. The sEMGs are influenced by the depth of the subcutaneous tissue at the site of the recording, which can be highly variable depending on the weight of a patient, and cannot reliably discriminate between the discharges of adjacent muscles. (ii) The iEMG signals avoid the sEMG drawbacks and measure the muscle activity via monopolar or concentric needle electrodes inserted through the skin into the muscle tissue. As shown in Figure 3.2, these interdependent networks exhibit: i) cross-correlated (dependent) spatial patterns when executing different tasks and ii) a complex fractal (long-range memory) property. Consequently, in what follows, we will construct a mathematical model of spatio-temporal fractal interdependent networks.
3.2.2 Data-driven Spatio-Temporal Fractal Model

To build a mathematical model capturing the spatio-temporal fractality among BMBI signals, we denote by $X(t) = [X^1_{k_1}(t) \ldots X^n_1(t) \ldots X^n_{k_n}(t)]^T$ the $K$-th order STF state vector of $K$ biological and cyber processes representing $n$ different functional entities (e.g., different muscles as in iEMG). The biological and cyber processes interact with each other over time and their cardinality satisfies the relation $k_1 + k_2 + \ldots + k_n = K$; $X^m_{k_l}$ is the $k_l$-th channel of the $m$-th dimensional BMBI time series. As demonstrated by clinical and experimental investigations, the neural-to-body activities (e.g., body movement) imply a high degree of coordinated dependency among biological entities (e.g., neurons and muscles). To encode such dependencies, we introduce $A(L)_p = A_1 L + A_2 L^2 + A_3 L^3 + \ldots + A_p L^p$ as the cross-dependency matrix, where $A_i L^i$ is any $K \times K$ matrix for which $|A_i L^i|$ has all its roots outside the unit circle. Here, $p$ is the autoregressive order that models the short-term memory effect from events that are $p$ steps back in the past and $L$ is the backward operator such that $A_i L^i X_k(t) = A_i X_k(t-i)$. Let $D^{\alpha}(L)$ (with $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_K]^T$) be a $K \times K$ diagonal matrix with entries $(1-L)^{\alpha_1}, (1-L)^{\alpha_2}, \ldots, (1-L)^{\alpha_K}$, where each $\alpha_i \in [-0.5, 0.5]$ is the fractal differencing order for the $i$-th process. The operator $D^{\alpha}(L)$ integrates the long-range memory and can be expressed as a binomial expansion for the $k$-th process,

$(1-L)^{\alpha_k} = \sum_{j=0}^{\infty} \frac{\Gamma(j-\alpha_k)}{\Gamma(j+1)\,\Gamma(-\alpha_k)} L^j = \sum_{j=0}^{\infty} \psi(\alpha_k, j) L^j$,

where $\psi(\alpha_k, j)$ is the expansion coefficient that depends only on the fractal order of the process and the index $j$. With these definitions, the multivariate STF model reads:

$D^{\alpha}(L) X(t) = A(L)_p X(t) + E(t)$   (3.1)

where $E(t) \sim N(0, \Sigma)$ is a $K$-dimensional multivariate normal distribution with zero mean and cross-covariance matrix $\Sigma$.
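As an illustration of the fractal differencing operator, the sketch below computes the binomial expansion coefficients of $(1-L)^{\alpha}$ and applies a truncated version of the operator to a single time series. It is a hedged example rather than the author's implementation: the names `frac_diff_coeffs` and `frac_diff` and the truncation length `n_terms` are hypothetical. The coefficients are generated with the standard recursion $c_0 = 1$, $c_j = c_{j-1}(j - 1 - \alpha)/j$, which is equivalent to $\Gamma(j-\alpha) / (\Gamma(j+1)\Gamma(-\alpha))$ but avoids evaluating Gamma functions at large arguments.

```python
def frac_diff_coeffs(alpha, n_terms):
    """First n_terms binomial expansion coefficients of (1 - L)**alpha.

    Uses c_0 = 1 and c_j = c_{j-1} * (j - 1 - alpha) / j, which equals
    Gamma(j - alpha) / (Gamma(j + 1) * Gamma(-alpha)).
    """
    coeffs = [1.0]
    for j in range(1, n_terms):
        coeffs.append(coeffs[-1] * (j - 1 - alpha) / j)
    return coeffs


def frac_diff(series, alpha, n_terms):
    """Apply the truncated operator: Z(t) = sum_j c_j * X(t - j).

    Samples before the start of the series are treated as unavailable,
    so early outputs use fewer terms (an assumption of this sketch).
    """
    coeffs = frac_diff_coeffs(alpha, n_terms)
    out = []
    for t in range(len(series)):
        acc = 0.0
        for j in range(min(t + 1, n_terms)):
            acc += coeffs[j] * series[t - j]
        out.append(acc)
    return out
```

For $\alpha = 1$ the coefficients reduce to $[1, -1, 0, 0, \ldots]$, i.e., ordinary first differencing, while for a fractional order such as $\alpha = 0.3$ they decay slowly with $j$, which is how the operator encodes long-range memory.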
3.2.3 STF Model Identification Algorithm

To jointly estimate the matrices $A(L)_p$ and $D^{\alpha}(L)$ (i.e., the fractal differencing operator) of the STF model, we formulate the following optimization problem. Given the limited observations of $\{X^m(t)\}$ for all $m$ over a time horizon $[t, t+T-1]$ and the following notations: $a^m_i = [a^m_{i,1}\ a^m_{i,2} \ldots a^m_{i,K}]^T$ is the $m$-th row of matrix $A_i$, and $Z^m(t) = (1-L)^{\alpha_m} X^m(t)$ represents the fractal differenced time series that is a function of the observations $X^m(t)$ weighted by the binomial expansion coefficients. Find the parameters $[a^m_1\ a^m_2 \ldots a^m_p]$ and fractal exponents $\alpha_m$ for all $m$ that minimize the least square error (LSE):

$\min_{[a^m_1\ a^m_2 \ldots a^m_p]} \sum_{\tau=0}^{T-1} \Big( Z^m(t+\tau) - \sum_{i=1}^{p} a^{mT}_i X(t+\tau-i) \Big)^2$   (3.2)

Of note, the unknown fractal differencing order $\alpha_m$ of $Z^m(t)$ prevents us from solving the optimization problem in Eq. (3.2) by linear regression unless we have prior knowledge of $\alpha_m$. Unfortunately, we usually do not have any information about $\alpha_m$, and decoupling the estimation of the fractal order from that of $A(L)$ can cause misleading estimation [131]. To solve this problem, we rewrite Eq. (3.2) in the form of a finite binomial expansion of the fractally integrated terms (truncated at order $inf$) as:

$X^m(t) = \sum_{k \neq m}^{K} \sum_{i=1}^{p} a^m_{i,k} X^k(t-i) + \sum_{i=p+1}^{inf} \big(-\psi(\alpha_m, i)\big) X^m(t-i) + \sum_{i=1}^{p} \big(a^m_{i,m} - \psi(\alpha_m, i)\big) X^m(t-i)$   (3.3)

As we know, $\Gamma(j-\alpha_m)/\Gamma(j+1)$ is well approximated by $j^{-\alpha_m-1}$ when $j$ is large. So we have $\psi(\alpha_m, j) \approx C j^{-\alpha_m-1}$, where $C$ is a constant. Putting the relation between the index $j$ and the autoregressive coefficients of $X^m$ in a log-log plot leads to a linear function with slope $-\alpha_m - 1$. This power-law relation not only supports our previous argument that the proposed STF model captures the long-term dependencies, but also enables us to perform a multivariate regression over $(K-1)p + inf$ unknown coefficients to estimate $A(L)$ and the fractal order $\alpha_m$ at the same time. Thus we propose an iterative multi-regression algorithm to solve this optimization problem.
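The slope-based recovery of the fractal order described above can be sketched as follows. This is an illustrative example under stated assumptions, not the estimator used in the thesis: the helper names `fit_line` and `estimate_fractal_order` are hypothetical, an ordinary least-squares line fit stands in for the unspecified fitting routine, and the absolute value of each coefficient is taken before the logarithm because the sign convention of the regressed coefficients is an assumption here.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_fractal_order(coeffs, first_lag=1):
    """Estimate the fractal order from own-channel autoregressive coefficients.

    coeffs[i] is the coefficient at lag first_lag + i. If the coefficients
    follow C * j**(-order - 1), the log-log slope is -order - 1, so the
    order is recovered as -(slope + 1). Zero coefficients are not handled
    (math.log raises ValueError for them).
    """
    lags = range(first_lag, first_lag + len(coeffs))
    log_lags = [math.log(j) for j in lags]
    log_coeffs = [math.log(abs(c)) for c in coeffs]
    slope, _intercept = fit_line(log_lags, log_coeffs)
    return -(slope + 1.0)
```

Because the power-law approximation holds for large lags only, fitting the line over the higher-lag coefficients (a larger `first_lag`) would be expected to give a less biased estimate than including the first few lags.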
Letting $X^m(t,T)$ be the $T$-dimensional observed output over the interval $[t, t+T-1]$, we have

$X^m(t,T) = X(m,inf)\, A^m + e$   (3.4)

where $X(m,inf) = \{x^m(t-i)\}_{T \times ((K-1)p+inf)}$ is a $T \times ((K-1)p + inf)$ autoregressive observation matrix. The $i$-th row in $X(m,inf)$ represents all the $p$-order autoregressive terms of the $(K-1)$ other signals and the $inf$-order autoregressive terms of the $X^m$ channel at time $t+i$. $A^m = [a^m_{1,1}\ a^m_{2,1} \ldots a^m_{p,1} \ldots a^m_{1,m} \ldots a^m_{inf,m} \ldots a^m_{p,K}]^T$ is a vector that contains all the unknown dependency coefficients associated with the $m$-th signal, and $a^m_{i,j}$ is the element at position $(m,j)$ of $A_i$. Eq. (3.4) is a linear system, so we can derive a lower bound on the number of observations needed to estimate the coefficients in our model. To have a unique solution, the following condition must be satisfied: $T \geq \mathrm{Rank}(X(m,inf)) = (K-1)p + inf$. Our algorithm can reliably estimate the STF model from few observations as a function of the time series cardinality and the model order $p$. This translates into reduced complexity.

Algorithm 1 LSE Estimator for STF model
Require: Observations X(t); autoregressive order p
Ensure: Cross-dependency matrix set {A_i | i in [1, p]}; fractal differencing order α
for all X^m(t) in X(t) do
    Construct X(m, inf)
    Construct X^m(t, T)
    A^m = Multiregress(X^m(t, T), X)
    Y = log([a^m_{1,m} ... a^m_{inf,m}])
    [slope, intercept] = Fit(log([1 : inf]), Y)
    α_m = -(slope + 1)
    Calculate cross-dependency matrix
end for

[Figure 3.3 diagram labels: extend fingers; flex fingers; supinate; pronate; fine wire electrodes; motor neurons; neural activities; ADI EMG recording system, sampling at 4 kHz.]

Figure 3.3: The clinical experiments settings.
3 healthy subjects are implanted with fine wire electrodes measuring the iEMG signals when they are asked to do: i) finger extension; ii) finger flexion; iii) pronation; iv) supination.

3.3 Experiment Setup and Results

In this section, we evaluate the effectiveness of our model and study the mathematical characteristics of BMBI under diverse cerebral-muscular interplays through realistic clinical experiments on 3 healthy subjects. Figure 3.3 shows that 6 targeted muscles (i.e., 2 flexor muscles, 2 extensor muscles, 1 pronator muscle and 1 supinator muscle) of each subject are inserted with fine wire electrodes for measurement purposes. Subjects are asked to relax for 6 seconds, then do the following actions: i) finger extension; ii) finger flexion at a consistent strength for 10 seconds or iii) pronate; iv) supinate. The entire process is repeated twice for each movement. The ADInstruments data acquisition system sampled the iEMG at 4 kHz after applying a 2 kHz low pass filter and a 10 Hz high pass filter to minimize any motion artifacts from electrodes or leads. In addition, we compare the popular vector autoregressive moving average (VARMA) model with our fractal model in terms of sufficiency of capturing the cross-coupled dynamic behavior. We further show that the fractal connectivity could be statistically inferred by analyzing the dependency matrix $A(L)$ in our model. The connectivity extracted by our model can lead to better recognition, prediction and control in CPS architectures.

[Figure 3.4 panels: a) Estimation error@length=1024, b) Estimation error@length=2048, c) Estimation error@length=4096, each showing ANRMSE@dim=2, 4 and 8; d) Observability Penalties, showing ANRMSE with partial observability = 2, partial observability = 4 and full observability = 8; x-axis: utilization of time series; y-axis: average NRMSE.]

Figure 3.4: The ANRMSE values (a-c) for partial observability of several signal lengths (1024; 2048; 4096). d) Partial observability estimating 8-dimensional cross-correlated signals from only 2; 4 observed channels.

3.3.1 Effectiveness of Spatio-Temporal Fractal Modeling

One major challenge for the design of a CPS approach to BMBI is the need to enable a cyber platform that is able to work with real-time measurements, identify in real time a compact (few parameters) yet expressive mathematical model and rely on efficient parameter estimation algorithms (i.e., algorithms that determine the mathematical model describing the CPS dynamics from a minimum number of samples). This calls for robust and fast algorithms that work with short time series and consistently show good accuracy and fidelity compared to real-time measurements. To justify our proposed framework, we first measure the accuracy of the proposed mathematical identification algorithm in relation to the number of (observed) samples. Then, we show the importance of capturing the spatio-temporal interdependencies of BMBI activities by measuring the deviation in the estimated parameters for two cases: (i) The parameters of the model when all the time series are given / observed; (ii) The parameters of the model when only a subset of time series are considered.

[Figure 3.5 panels: Extend fingers; Flex fingers; Pronate; Supinate. Nodes: ED, FDP, APL, FPL, PT, SUP, annotated with their influencing factor; edge legend: strong / weak.]

Figure 3.5: Fractal connectivity network inferred from 4 different movements. 3 healthy subjects are asked to i) Extend all fingers; ii) Flex all fingers at a consistent strength for 10 seconds or iii) Pronate forearm; iv) Supinate forearm in the experiments.

[Figure 3.6 panels: ED: Extensor Digitorum and APL: Abductor Pollicis Longus, amplitude in mV; columns: raw iEMG data, proposed model, VARMA; NRMSE values shown: 0.5934 and 0.5124 (proposed), -1.164 and -3.435 (VARMA).]

Figure 3.6: Comparison of first-order model-fitting to 2-channel (ED and APL) raw iEMG data collected from subject 3 under finger extension

To investigate the accuracy of the STF modeling framework, we first simulate the proposed model with a random choice of the $A(L)$ matrix and the fractal differencing matrix $D^{\alpha}(L)$ under different signal cardinality (i.e., considering 2, 4 and 8 time series). For each experimental investigation that considers a specific number of time series, we generate 1000 time series of different lengths (i.e., 1024, 2048 and 4096, respectively) and measure the average normalized root mean square error (ANRMSE) between the true and identified parameters as a function of the considered number of samples from the initial time series. ANRMSE is calculated as $E\left[1 - \frac{\|x_{ref} - x\|}{\|x_{ref} - E[x_{ref}]\|}\right]$. To obtain the ANRMSE values in Figure 3.4.(a-c), we perform $10^5$ experiments for each choice of the number of considered time series, number of considered samples and length of the time series. Simply speaking, we aim to exploit the fractality of the signals and find the minimum number of samples that would lead to a model with good accuracy (i.e., the ANRMSE is smaller than a threshold). This would help the CPS to build confidence about the model as the samples are recorded. As shown in Figure 3.4.(a-c), the ANRMSE exhibits a phase transition from negligible to noticeable errors as a function of the number of considered samples. For instance, we could use only 30% of the recordings to learn a good CPS model.
In other words, when only part of the information about the state-space dynamics represented by the time series is considered (i.e., 10% or 30% of the total time series length), the ANRMSE remains almost the same as in the case in which more and more samples are included in the identification problem, contributing to higher computational complexity. This can translate into smaller model identification latency and power savings. For many CPS applications, it is important not only to identify a mathematical model with good accuracy from a minimum number of samples, but also to be able to retrieve a good model of the system when only a subset of state variables are observed (e.g., reduced order model). In order to study this problem, we consider that from a number of time series we only know a subset and identify a mathematical model of the form in equation (3.1). More precisely, we first simulate the proposed model under a cross-dependent model of 8 time series with a length of 1024, generating 1000 trajectories. Then we perform the parameter estimation assuming we only have partial observability (i.e., we only consider 2 and 4 channels out of all 8). The ANRMSE values are reported in Figure 3.4.(d) as a function of the number of considered samples. As shown in Figure 3.4.(d), when only 4 channels of an 8-dimensional signal representation are observed, the estimation error remains almost the same as in the case with full observability. It is also noticed that the ANRMSE values of the estimated parameters increase noticeably when only two time series out of 8 cross-correlated time series are considered.
This implies that failing to consider sufficient state variables when modeling a complex process involving multiple participating entities (e.g., the cerebral-muscular activities consisting of multiple functional muscles and neurons) will lead to model misfitting and biased implications about how these entities are dependent and/or correlated (e.g., the contribution of some muscles to a certain movement might be underestimated or overestimated).

3.3.2 Model Validation in Realistic Clinical Experiments

To investigate the benefits of our model, we compared the goodness-of-fit (GoF) provided by our model and the extensively utilized VARMA model with order $p = 1$ when applied to the raw iEMG signals of 6 channels collected from 3 subjects doing different movements. The GoF is quantified in terms of the normalized root mean square error (NRMSE) computed for each channel for the two models and averaged over 3 subjects (see Table 3.1). NRMSE is a measure of how well a model fits the data, ranging from $-\infty$ (worst) to 1 (best). To better illustrate the capability of our fractal modeling approach, Figure 3.6 compares, for 2 channels (namely ED and APL during finger extension): i) the raw iEMG signals (first column), ii) the time series generated by our fractal model (second column), and iii) the time series generated by VARMA (third column). As one can notice, the VARMA model fits the raw data poorly, with negative NRMSE values. In contrast, our fractal model fits better, with NRMSE values over 0.5 for all channels. Table 3.1 gives a comprehensive overview of the comparison across all 6 channels for the 4 movements. The results consistently show that the proposed model gives a better fit than the VARMA model. Consequently, our fractal model captures better than VARMA not only the long-range memory of the signals, but also the cross-dependencies between signals.
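The NRMSE-based goodness-of-fit described above can be sketched as below. This is a minimal illustration that assumes the per-channel NRMSE has the same form as the ANRMSE expression given in Section 3.3.1 without the outer expectation, i.e., $1 - \|x_{ref} - x\| / \|x_{ref} - \mathrm{mean}(x_{ref})\|$; the function name `nrmse_fit` is hypothetical.

```python
import math


def nrmse_fit(reference, predicted):
    """Goodness-of-fit 1 - ||ref - pred|| / ||ref - mean(ref)||.

    Returns 1 for a perfect fit, 0 when the prediction is no better than
    the mean of the reference, and negative values for worse fits.
    """
    if len(reference) != len(predicted):
        raise ValueError("series must have the same length")
    mean_ref = sum(reference) / len(reference)
    err = math.sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)))
    spread = math.sqrt(sum((r - mean_ref) ** 2 for r in reference))
    if spread == 0.0:
        raise ValueError("NRMSE is undefined for a constant reference")
    return 1.0 - err / spread
```

Under this definition, a negative score such as those reported for VARMA means that the model output deviates from the measurements more than a constant prediction at the signal mean would.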
Taken together, these results show that modeling fractality not only significantly improves the GoF of the mathematical model, opening the avenue for endowing CPS with built-in intelligence, but can also lead to a better understanding of the fractal properties expressed by biological systems and to the development of new, more efficient control strategies.

Table 3.1: Average goodness-of-fit in NRMSE (-inf: worst; 1: best). VA: VARMA model; Frac: proposed fractal model.

Channel   Extend         Flex           Pronate        Supinate
          VA     Frac    VA     Frac    VA     Frac    VA     Frac
ED        -1.2   0.52    -0.86  0.47    -0.15  0.30    -0.65  0.46
FDP       -2.4   0.57    -0.17  0.47    -1.4   0.49    -3.3   0.58
APL       -3.6   0.59    -0.22  0.32    -0.61  0.45    -0.44  0.40
FPL       -2.5   0.55    -1.2   0.51    -9.3   0.56    -2.3   0.52
PT        -5.9   0.59    -1.8   0.54    -1.0   0.50    -6.2   0.58
SUP       -2.7   0.57    -0.13  0.25    -0.33  0.36    -0.37  0.55

3.3.3 Statistical Analysis of Fractal Connectivity

Fitting the proposed model well to the real neuromuscular processes leads us to several follow-up research questions: (i) How can we infer correlations between the different participating entities (e.g., different muscles) jointly involved? (ii) How can we statistically associate such correlations with specific movements from different subjects over multiple trials to reliably extract patterns? To answer these questions, we exploit the interdependency matrix $A(L)$ and construct the connectivity network for pattern recognition. More precisely, we construct the directed graph $G$ from the $A(L)$ coefficients by comparing the off-diagonal coefficients in the symmetric positions $(i,j)$ and $(j,i)$ of the matrix with a predefined threshold and drawing a directed edge from $j$ to $i$ if muscle $j$ exerts much greater influence on $i$. To deal with variability and present reliable patterns, for each connection in $G$, the estimated $A(L)$ coefficients are used to perform a t-test with the null hypothesis that the coefficients come from a distribution with zero mean (i.e., different muscles are weakly correlated or uncorrelated if no movements are performed).
With significance level $\alpha = 0.05$, the connectivity network is then constructed by including all statistically significant connections, i.e., connections whose p-values are smaller than $\alpha$. The resulting connectivity graphs for the different movements are plotted in Figure 3.5. We quantify the influencing factor $f$ as the difference between the in-degree and the out-degree of a given node. A bigger $f$ is associated with a muscle that is more active in the corresponding movement and needs collective assistance from other muscles. The muscle with the biggest $f$ is the dominant muscle in a movement. As one can see from Figure 3.5, our fractal model accurately captures the major dominant muscles involved in all 4 movements and shows consistency with the clinical and anatomical observations [55].

3.4 Summary

In this work, we propose a data-driven fractal mathematical model capable of modeling multi-dimensional cross-dependent BMBI processes with long-term dependencies. We justify our model through realistic clinical experiments measuring multi-channel iEMG signals of different natural forearm movements. The comparison between the well-known VARMA model and our fractal model shows that our model gives a better fit throughout the experiments. We also statistically infer the fractal connectivity network from the fitted model and show agreement with anatomical observations, which lays the foundation for prediction and control of BMBI.

Chapter 4
Minimum Number of Sensors to Ensure Observability of Physical Systems: Case Studies

Advances in wearable technology allow us to measure all sorts of physiological signals. Fitbit is probably the device most familiar to the general public; it measures data such as the number of steps per day, quality of sleep, steps climbed, and other personal metrics. Emotiv, in turn, is a company that develops brain-computer interfaces based on electroencephalography (EEG) technology.
Nonetheless, these and other technologies are still in their infancy and are far from being able to predict a cardiac arrest or the onset of an epileptic seizure. Such a capability is expected to revolutionize healthcare, in the sense that it would allow pairing with a smartphone via wireless or Bluetooth, capable of placing an emergency call, or locally stimulating certain points of the human body to mitigate the effects that follow each occurrence.

Towards that auspicious future, we propose to explore a class of models that enables the characterization of certain physiological signal dynamics and allows us to retrieve the overall dynamics associated with certain technologies by considering only a subset of their signals. These models rely on the assumption that acquired signals of the same physiological phenomenon have spatial-temporal properties that can be captured by the model. This is the case for the electrocardiogram (ECG) signal, which records the electrical voltages generated by the heart over a period of time using electrodes placed across the patient's body (limbs and chest). Commonly, between twelve and sixteen ECG electrodes are placed on the surface of the patient's body. These sensors capture the potential differences between electrodes that arise from heart-muscle depolarization during each cardiac cycle, measured from different angles in a frame whose origin is set at the heart [54]. This emphasizes the spatial and temporal interdependence between physiological processes. Further, the Fitbit sensor matches one of the lead locations, so one could wonder whether it would suffice to predict abnormal cardiac activity (see Figure 4.1). Similarly, the EEG sensors measure electrical potentials (brain waves) resulting from ionic currents within the neurons of the brain.
Due to the nature of the electrodes used, different sensors may collect data associated with the same activity and thus capture interdependent temporal and spatial signals. The frequency of the brain waves recorded from the surface of the scalp can range from once every few seconds to more than 50 Hz. The appearance of these waves depends on the activity of the corresponding region of the cerebral cortex being recorded, and they can change drastically between states of wakefulness, sleep, and coma, or in brain diseases such as epilepsy [54]. Finally, an electromyography (EMG) signal detects the electrical potential generated by skeletal muscle cells when these cells are electrically or neurologically activated, hence resulting in a technique for evaluating and recording the electrical activity produced by skeletal muscles. These signals can contain dynamical fingerprints that are associated with a configuration of muscle activation leading to the execution of a specific task. Once again, due to the interdependence of the different biological systems (muscles) in the human body, it is easy to understand that spatial and temporal dependence is present. Furthermore, these signals can be analyzed to detect and predict medical abnormalities, activation level, or recruitment order, or to analyze the biomechanics of human or animal movement.

Figure 4.1: Can the Fitbit sensor be used to retrieve the overall activity of the remaining sensors and, hence, assess whether one is about to have a cardiac arrest?

Unlike current mathematical modeling approaches that rely on memoryless assumptions, the statistical analysis of physiological processes (ECG, EEG, blood glucose) demonstrates that they possess a significant degree of long-range memory and fractality. In particular, numerous recent studies show that physiological processes can be more accurately modeled via fractal-order dynamical systems [92, 82, 144, 138, 136].
Nonetheless, these efforts have either only demonstrated that the statistics of the physiological dynamics are of fractal nature (e.g., the autocorrelation function or the power spectrum exhibits power-law behavior) or only accounted for the temporal dependence of the signals (e.g., demonstrating that there is some form of persistence in time between changes in the magnitude of the physiological process). Consequently, the spatial dependence existing between various physiological processes cannot be leveraged to obtain information about the signals that are not being directly measured.

To overcome these limitations, we propose to use coupled discrete-time fractal-order dynamical systems (CDFODS) to capture the spatial-temporal characteristics existing between physiological processes. If we aim to retrieve the state of the system from the measurements alone, commonly referred to as static observability, we need to solve a set of $p$ measurement equations to recover an $n$-dimensional parameter, which in general requires at least as many measurements as unknowns, $p \ge n$. In addition, if the processes are not spatially dependent and the sensing technology only accesses one state variable at any given time, then such conditions cannot be satisfied. In contrast, the proposed model allows us to retrieve all of its states from the data collected by a small subcollection of sensors together with the model. This property is commonly referred to as observability of the system, and a system (i.e., dynamics and sensing capabilities) that possesses it is referred to as observable [16] (see Section 4.2 for details). Furthermore, in this work, we propose to extend the use of submodularity tools to find the minimum number of sensors in the CDFODS context, which, to the best of our knowledge, has not been previously explored.
Submodular functions are used across multiple fields of science, for instance, mathematics, economics, circuit theory, operations research and machine learning [8, 34, 80, 121]. In particular, examples of machine learning applications include static sensor selection [68, 69], where a dynamic model is not explicitly considered; natural language processing [66]; robotics applications [6, 88]; and spatio-temporal processes modeled as linear time-invariant models under uncertainty [120], just to name a few. The problem of determining the minimum number of sensors to ensure observability in linear time-invariant systems has been explored in [101], and its optimal solutions in [105]. In addition, we notice that the previous solutions [101, 105] considered continuous-time integer (non-fractional) order systems. It is also important to notice that although [101, 105] propose approximations that resemble the greedy algorithms known to approximate problems whose objective is given by a submodular function, they do not explicitly use these notions as we propose in this work.

The contributions of this paper are fourfold: (i) we explain how to cast the evolution of several spatial-temporal-related physiological signals into a CDFODS framework; (ii) we show that the problem of determining the minimum number of sensors required to ensure observability of CDFODS is NP-hard; (iii) we propose to use submodularity theory to approximate the results, which yields optimality guarantees; and (iv) we show its application in the context of three physiological signals, i.e., EEG, ECG and EMG.

The remainder of the paper is organized as follows: Section 4.1 introduces the CDFODS and the mathematical formulation of the problem of determining the minimum number of sensors to obtain an observable system. In Section 4.2, we revisit some of the properties of the CDFODS and submodularity theory.
The theoretical results of this paper are presented in Section 4.3, whereas in Section 4.4 we study several applications of the proposed framework.

4.1 Problem Statement

Consider a model dynamics described by a linear discrete-time fractional-order system as follows:

$$\begin{bmatrix} D^{\alpha_1} & & \\ & \ddots & \\ & & D^{\alpha_n} \end{bmatrix} x[k+1] = A x[k], \qquad (4.1)$$

where $k = 0, 1, \ldots, T$; $\alpha_i \in \mathbb{R}^+$ for $i = 1, \ldots, n$; $D^{\alpha_i}$ is the discretized fractional-order operator [16]; and $x[0] = x_0 \in \mathbb{R}^n$ is the initial condition. Further, for brevity, we refer to the dynamics in (4.1) as $\mathcal{F}(A, \alpha, K)$, where $\alpha = (\alpha_1, \ldots, \alpha_n)$. In addition, assume that a collection of sensors is deployed to collect data about the state of the system. If a sensor is able to capture a linear combination of state variables, then the collection of sensor measurements can be represented as follows:

$$y[k] = C x[k], \qquad (4.2)$$

where $C$ is a matrix with appropriate dimensions encoding the linear combination. In the static case, i.e., when we aim to retrieve the state of the system from the measurements alone, (static) observability requires the solvability of a set of $p$ measurement equations to recover an $n$-dimensional parameter, hence requiring at least as many measurements as unknowns, $p \ge n$, in general. Alternatively, as often occurs in several setups, each sensor captures only one state variable (instead of a linear combination of these), i.e., the collection of measurements is given by

$$y[k] = I_n^{\mathcal{J}} x[k], \qquad (4.3)$$

where $\mathcal{J}$ denotes the indices of the state variables being measured, and $I_n^{\mathcal{J}}$ denotes the matrix that consists of the rows of the $n \times n$ identity matrix with indices in $\mathcal{J}$. Hence, no solution exists that retrieves the state variables that are not being measured when only the collection of measurements is considered. To overcome the two issues mentioned, we propose to consider a model that captures the spatial-temporal relationship between the state variables and that, together with the measurements, enables the retrieval of the state.
A system $\mathcal{F}(A, \alpha, K)$ and a collection of measurements (4.3) are said to be (dynamically) observable at time $k = 0$ if and only if there exists some $T$ such that the state $x_0$ can be uniquely determined from the knowledge of the measurements $y[k]$ and $\mathcal{F}(A, \alpha, K)$. Subsequently, the goal of this paper is to determine the minimum $\mathcal{J}$ in (4.3) that, together with $\mathcal{F}(A, \alpha, K)$, ensures observability of the system at time $k = 0$. We have the following problem:

Minimum Sensor Placement Problem. Given $\mathcal{F}(A, \alpha, K)$, determine the minimum number of dedicated sensors $\mathcal{J}$ such that

$$\arg\min_{\mathcal{J} \subseteq \{1, \ldots, n\}} |\mathcal{J}| \quad \text{s.t.} \quad (\mathcal{F}(A, \alpha, K), I_n^{\mathcal{J}}) \text{ is observable.} \qquad (4.4)$$

Figure 4.2 illustrates one possible application of this problem, i.e., determining whether a subcollection of sensors allows us to retrieve the overall evolution of EEG signals. In particular, determining the smallest collection of such sensors will lead to the development of energy-efficient EEG wearables.

Figure 4.2: Are both technologies equivalent?

4.2 CDFODS Observability and Submodularity

In this section, we first review some properties of the CDFODS and the characterization of the feasibility space of (4.4), followed by a brief recap of submodularity properties. The closed-form solution to (4.1) is described as follows.

Lemma 1 ([16, 53]). The solution to (4.1) is given by

$$x[k+1] = G_{k+1} x[0], \qquad (4.5)$$

where

$$G_k = \begin{cases} I & \text{for } k = 0, \\ \sum_{j=0}^{k-1} A_j G_{k-1-j} & \text{for } k \ge 1, \end{cases}$$

with $A_0 = A$ and $A_j = \mathrm{diag}\left((-1)^{j+1}\binom{\alpha_1}{j+1}, \ldots, (-1)^{j+1}\binom{\alpha_n}{j+1}\right)$.

Please note that $\alpha$ is a non-integer number and the term $\binom{\alpha}{j} = \frac{\Gamma(\alpha+1)}{\Gamma(j+1)\,\Gamma(\alpha-j+1)}$ is expressed via the Gamma function $\Gamma(x) = \int_0^{\infty} t^{x-1} e^{-t}\, dt$ [10].

Subsequently, consider a sequence of measurements of the closed form described in Lemma 1. Since (4.5) gives $x[k] = G_k x[0]$, we have

$$y[0] = C x[0] = C G_0 x[0], \quad y[1] = C x[1] = C G_1 x[0], \quad \ldots, \quad y[K-1] = C x[K-1] = C G_{K-1} x[0].$$

We can rewrite it as

$$y_{0:K} = \underbrace{\left[ (C G_0)^\top \ (C G_1)^\top \ \cdots \ (C G_{K-1})^\top \right]^\top}_{\mathcal{O}_K(\mathcal{F}(A, \alpha, K), C)} x[0], \qquad (4.6)$$

where $y_{0:K} = [\, y[0]^\top \ y[1]^\top \ \cdots \ y[K-1]^\top \,]^\top$ stacks the $K$ measurements, and the matrix $\mathcal{O}_K(\mathcal{F}(A, \alpha, K), C)$ is commonly referred to as the observability matrix.
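The recursion of Lemma 1 and the stacking in (4.6) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this work: the helper names (`gen_binom`, `transition_matrices`, `observability_matrix`) are hypothetical, the sign convention for $A_j$ is the one read from Lemma 1 and should be checked against [16, 53], and the generalized binomial coefficient is computed with the product recurrence C(α, j) = C(α, j−1)(α − j + 1)/j, which agrees with the Gamma-function expression wherever the latter is defined. Sensors are assumed to be dedicated, so $C = I_n^{\mathcal{J}}$ simply selects rows.

```python
def gen_binom(alpha, j):
    """Generalized binomial coefficient C(alpha, j) for real alpha, integer j >= 0."""
    c = 1.0
    for m in range(1, j + 1):
        c *= (alpha - m + 1) / m
    return c


def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_add(a, b):
    """Entry-wise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def transition_matrices(A, alphas, K):
    """Return [G_0, ..., G_{K-1}] with G_0 = I and G_k = sum_{j<k} A_j G_{k-1-j}."""
    n = len(A)

    def A_j(j):
        # A_0 = A; for j >= 1, A_j = diag((-1)^(j+1) * C(alpha_i, j+1)),
        # as read from Lemma 1 (sign convention is an assumption).
        if j == 0:
            return A
        return [[(-1) ** (j + 1) * gen_binom(alphas[r], j + 1) if r == c else 0.0
                 for c in range(n)] for r in range(n)]

    G = [[[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]]
    for k in range(1, K):
        acc = [[0.0] * n for _ in range(n)]
        for j in range(k):
            acc = mat_add(acc, mat_mul(A_j(j), G[k - 1 - j]))
        G.append(acc)
    return G


def observability_matrix(A, alphas, K, sensors):
    """Stack the rows of G_0, ..., G_{K-1} selected by `sensors` (0-based indices)."""
    rows = []
    for Gk in transition_matrices(A, alphas, K):
        rows.extend(list(Gk[i]) for i in sensors)
    return rows
```

With all exponents equal to 1, every $A_j$ with $j \ge 1$ vanishes under this convention and the recursion collapses to $G_k = A^k$, which gives a quick sanity check of the code.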
In order to retrieve $x[0]$, we can first premultiply both sides of (4.6) by $\mathcal{O}^\top$, where $\mathcal{O} \equiv \mathcal{O}_K(\mathcal{F}(A, \alpha, K), C)$, which leads to

$$\mathcal{O}^\top y_{0:K} = \mathcal{O}^\top \mathcal{O}\, x[0].$$

Thus, if $\mathcal{O}^\top \mathcal{O}$ is invertible, then we obtain a closed-form solution for $x[0]$, i.e.,

$$x[0] = (\mathcal{O}^\top \mathcal{O})^{-1} \mathcal{O}^\top y_{0:K}. \qquad (4.7)$$

As a consequence, we obtain the following results:

Theorem 1 ([16, 53]). The system described by (4.1) and (4.2) is observable if and only if there exists a finite time $K$ such that $\mathrm{rank}(\mathcal{O}_K(\mathcal{F}(A, \alpha, K), C)) = n$.

Theorem 2 ([16, 53]). If the system described by (4.1) and (4.2) is observable, then the initial state $x_0$ can be retrieved as in (4.7).

Remark 1. Due to the nature of Theorem 1, it may occur that a specific set of measurements does not yield observability for a given time $K$. Notwithstanding, it is possible that there exists $K' > K$ such that, for the same collection of sensors, Theorem 1 holds. In other words, observability implicitly considers the tradeoff between how much data is collected with the same set of measurements (in particular, the time allowed to collect data) and the number of sensors acquiring it.

In Section 4.3, we show how the aforementioned characterization of the feasibility space of (4.4) can be used to obtain its solution. Briefly, we will use a submodular approach. Given a finite set $\mathcal{V}$ with $n$ objects, i.e., $|\mathcal{V}| = n$, we can define a function $f : 2^{\mathcal{V}} \to \mathbb{R}^+$ that associates a positive scalar with each subset of $\mathcal{V}$. This function is said to be submodular if it satisfies the so-called diminishing returns property, i.e., for all $\mathcal{X} \subseteq \mathcal{Y} \subseteq \mathcal{V}$ and $v \notin \mathcal{Y}$, we must have

$$f(\mathcal{X} \cup \{v\}) - f(\mathcal{X}) \ge f(\mathcal{Y} \cup \{v\}) - f(\mathcal{Y}).$$

In other words, the increment obtained by adding a new element to a smaller set is at least as high as that obtained by adding it to a superset of the latter. When $f$ is submodular, one can use greedy algorithms that yield approximate solutions that are at most 33% worse than the optimal solution [96].
This approximation guarantee does not depend on the size of $\mathcal{V}$, and it provides a worst-case scenario that is often not attained; in addition, some improved guarantees are available when submodular functions have additional properties, see [76] for details.

4.3 Minimum Sensor Placement

In this section, we present the main results of this paper. More precisely, in Theorem 3 we show our problem to be NP-hard. Notwithstanding, we present a heuristic algorithm (Algorithm 2) that has polynomial computational complexity and approximates the original problem with some optimality guarantees, see Theorem 4.

Theorem 3. The minimum sensor placement problem for CDFODS (4.4) is NP-hard.

Proof: The proof follows by noticing that there exists a set of coefficients that leads to $A_j = I_n$ for $j = 1, \ldots, K$. Therefore, (4.4) reduces to a linear time-invariant system, and the problem reduces to the minimum observability problem for linear time-invariant systems presented in [101], which is NP-hard. Thus, because (4.4) contains (as a subclass of problems) one that is NP-hard, it follows that (4.4) is also NP-hard.

Algorithm 2 Heuristic Algorithm to (4.4)
Input: CDFODS $\mathcal{F}(A, \alpha, K)$;
Output: $\mathcal{J}$ that is an approximate solution to (4.4);
Initialize $\mathcal{J} = \emptyset$;
repeat
    Initialize $\Delta r^\ast = 0$; $r_0 = \mathrm{rank}(\mathcal{O}_K(\mathcal{F}(A, \alpha, K), I_n^{\mathcal{J}}))$;
    for $i = 1$ to $n$ do
        $\mathcal{J}' = \mathcal{J} \cup \{i\}$;
        $\Delta r(i) = \mathrm{rank}(\mathcal{O}_K(\mathcal{F}(A, \alpha, K), I_n^{\mathcal{J}'})) - r_0$;
        if $\Delta r(i) > \Delta r^\ast$ then
            $\Delta r^\ast = \Delta r(i)$; $i^\ast = i$;
        end if
    end for
    if $\Delta r^\ast > 0$ then
        $\mathcal{J} = \mathcal{J} \cup \{i^\ast\}$;
    end if
until $\Delta r^\ast = 0$.

Next, we show that the feasibility space of (4.4) is given by a constraint, see Theorem 1, that is submodular.

Lemma 2. Given a CDFODS $\mathcal{F}(A, \alpha, K)$, the function $f(\mathcal{J}) = \mathrm{rank}(\mathcal{O}_K(\mathcal{F}(A, \alpha, K), I_n^{\mathcal{J}}))$ is submodular in $\mathcal{J} \subseteq \{1, \ldots, n\}$.

Proof: We define $\mathcal{V} = \{1, 2, \ldots, nK\}$ as the set of row vector indices for the observability matrix $\mathcal{O}_K(\mathcal{F}(A, \alpha, K), I_n)$. Let $\mathcal{V}_{\mathcal{J}_A} \subseteq \mathcal{V}$ be the set of row vector indices for $\mathcal{O}_K(\mathcal{F}(A, \alpha, K), I_n^{\mathcal{J}_A})$, where $\mathcal{J}_A \subseteq \{1, \ldots, n\}$. Let $r(\mathcal{V}_{\mathcal{J}_A})$ be the rank of the matrix that contains as rows the row vectors $\{g_i\}_{i \in \mathcal{V}_{\mathcal{J}_A}}$ indexed by $\mathcal{V}_{\mathcal{J}_A}$.
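Hence, $r : 2^{\mathcal{V}} \to \mathbb{Z}^+ \subset \mathbb{R}^+$ is the rank function of the observability matrix, and $r(\mathcal{V}_{\mathcal{J}_A})$ is the cardinality of a maximum independent subset of the vectors indexed by $\mathcal{V}_{\mathcal{J}_A}$. Therefore, $r(\mathcal{V}_{\mathcal{J}_A}) = \mathrm{rank}(\mathcal{O}_K(\mathcal{F}(A, \alpha, K), I_n^{\mathcal{J}_A}))$. Now consider two subsets $\mathcal{V}_{\mathcal{J}_A}, \mathcal{V}_{\mathcal{J}_B} \subseteq \mathcal{V}$. Let $I$, $I_{AB}$ and $I_A$ be maximum independent subsets of $\mathcal{V}_{\mathcal{J}_A} \cap \mathcal{V}_{\mathcal{J}_B}$, $\mathcal{V}_{\mathcal{J}_A} \cup \mathcal{V}_{\mathcal{J}_B}$ and $\mathcal{V}_{\mathcal{J}_A}$, respectively. By definition, we have $I \subseteq I_A \subseteq I_{AB}$. To prove the submodularity of the rank function $r$, we need to show

$$r(\mathcal{V}_{\mathcal{J}_A}) + r(\mathcal{V}_{\mathcal{J}_B}) \ge r(\mathcal{V}_{\mathcal{J}_A} \cup \mathcal{V}_{\mathcal{J}_B}) + r(\mathcal{V}_{\mathcal{J}_A} \cap \mathcal{V}_{\mathcal{J}_B}) \qquad (4.8)$$

or, equivalently,

$$r(\mathcal{V}_{\mathcal{J}_B}) \ge |I_{AB}| - |I_A| + |I|, \qquad (4.9)$$

following the fact that $|I| = r(\mathcal{V}_{\mathcal{J}_A} \cap \mathcal{V}_{\mathcal{J}_B})$, $|I_{AB}| = r(\mathcal{V}_{\mathcal{J}_A} \cup \mathcal{V}_{\mathcal{J}_B})$ and $|I_A| = r(\mathcal{V}_{\mathcal{J}_A})$. Observe that $I_{AB} \cap \mathcal{V}_{\mathcal{J}_B}$ is also a set of independent vectors and a subset of $\mathcal{V}_{\mathcal{J}_B}$. Therefore, we have

$$r(\mathcal{V}_{\mathcal{J}_B}) \ge |I_{AB} \cap \mathcal{V}_{\mathcal{J}_B}|, \qquad (4.10)$$

and since $I_{AB} \cap \mathcal{V}_{\mathcal{J}_B} = I_{AB} \setminus (\mathcal{V}_{\mathcal{J}_A} \setminus I)$, it follows that

$$r(\mathcal{V}_{\mathcal{J}_B}) \ge |I_{AB} \setminus (\mathcal{V}_{\mathcal{J}_A} \setminus I)| = |I_{AB}| - |I_A| + |I|. \qquad (4.11)$$

Hence, the function $f(\mathcal{J}) = r(\mathcal{V}_{\mathcal{J}})$ is submodular.

We now consider the minimum sensor placement problem for CDFODS in (4.4). Although we showed that the problem is NP-hard, Lemma 2 proves that the rank of the observability matrix is submodular. Thus, the problem can be approached with a greedy method that repeatedly adds a sensor that increases the rank until the conditions in Theorem 1 hold. Algorithm 2 implements such an approach. It starts with an empty set $\mathcal{J}$, which contains the indices of the column vectors to be added to $C$. In each iteration, the algorithm tentatively adds one index at a time to check whether the rank of the observability matrix increases, and it keeps the index that contributes the largest increase. The loop repeats until the rank stops increasing, in which case the algorithm stops and returns $\mathcal{J}$.

Theorem 4. Algorithm 2 approximates the minimum observability problem for discrete-time fractional-order systems (4.4) in polynomial time, i.e., with computational complexity $O(n^5)$.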
Further, it achieves a solution that is at most 33% worse than the optimal solution.

Proof: Algorithm 2 is a greedy method that progressively picks the index of a column vector from the identity matrix and adds it to $\mathcal{J}$ such that the maximal increase in the rank of the observability matrix $\mathcal{O}_K(\mathcal{F}(A, \alpha, K), I_n^{\mathcal{J}})$ is obtained in each iteration. The algorithm terminates when no further increase is possible. Notice that the outer loop is executed at most $n$ times, since $r(C G_0) = r(I_n^{\mathcal{J}}) = |\mathcal{J}|$ offers a lower bound for the rank of the observability matrix. No more than $n$ indices are added to $\mathcal{J}$ before the row rank of $\mathcal{O}_K(\mathcal{F}(A, \alpha, K), I_n^{\mathcal{J}})$ is $n$, i.e., before the system modeled by (4.1) is observable. In each iteration, the algorithm loops over all $n$ indices and checks their possible contribution to the rank increase when the corresponding column vector is added to $I_n^{\mathcal{J}}$. Therefore, any operation in the algorithm is performed no more than $n^2$ times before the observability matrix has full rank. Since the rank verification of the observability matrix is the most expensive operation and the remaining operations (i.e., comparison, set union and assignment) all have time complexity $O(1)$, the cost of the rank verification determines the overall time complexity of the algorithm. Checking the rank of a matrix can be done using rank-revealing QR factorization with a complexity of $O(n^3)$. Therefore, the overall complexity of the algorithm is $O(n^5)$. In addition, this algorithm is the greedy algorithm used for submodular functions. Hence, because the objective function considered is submodular by Lemma 2, it follows that the algorithm ensures similar optimality guarantees. In particular, it achieves values within the mentioned bound.
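The greedy loop of Algorithm 2 can be sketched as follows. This is a minimal sketch under stated assumptions rather than the implementation used in the experiments: the function names are hypothetical, the rank is computed by plain Gaussian elimination with a tolerance instead of the rank-revealing QR factorization mentioned in the proof, sensor indices are 0-based, and the matrices $G_0, \ldots, G_{K-1}$ of Lemma 1 are assumed to be precomputed and passed in. The toy example uses $G_k = A^k$, which is what the recursion of Lemma 1 yields when every $A_j$ with $j \ge 1$ vanishes.

```python
def matrix_rank(rows, tol=1e-9):
    """Numerical rank via Gaussian elimination with partial pivoting."""
    m = [list(map(float, r)) for r in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Pick the remaining row with the largest entry in this column.
        pivot = max(range(rank, len(m)), key=lambda r: abs(m[r][col]), default=None)
        if pivot is None or abs(m[pivot][col]) <= tol:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            for c in range(col, n_cols):
                m[r][c] -= factor * m[rank][c]
        rank += 1
    return rank


def greedy_sensor_selection(n, G_list):
    """Greedy loop of Algorithm 2 for dedicated sensors (C selects rows of G_k)."""
    def obs_rank(sensors):
        # Rows of the observability matrix for the sensor set `sensors`.
        rows = [G[i] for G in G_list for i in sensors]
        return matrix_rank(rows)

    selected = []
    while True:
        r0 = obs_rank(selected)
        best_gain, best_i = 0, None
        for i in range(n):
            if i in selected:
                continue  # re-adding a selected sensor cannot raise the rank
            gain = obs_rank(selected + [i]) - r0
            if gain > best_gain:
                best_gain, best_i = gain, i
        if best_i is None:  # no candidate increases the rank any further
            return selected
        selected.append(best_i)


def matrix_powers(A, K):
    """Toy stand-in for Lemma 1: returns [I, A, ..., A^(K-1)]."""
    n = len(A)
    G = [[[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]]
    for _ in range(1, K):
        prev = G[-1]
        G.append([[sum(A[i][k] * prev[k][j] for k in range(n))
                   for j in range(n)] for i in range(n)])
    return G
```

For the hypothetical chain `A = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]`, where each state feeds the next, `greedy_sensor_selection(3, matrix_powers(A, 3))` selects only the last state, since its measurements over three steps already span all three states.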
Figure 4.3: Sensor distribution in the 6-channel iEMG clinical measurement experiment (labeled muscles: Abductor Pollicis Longus, Flexor Digitorum Profundus, Flexor Pollicis Longus, Pronator Teres, Supinator, Extensor Digitorum). The sensors in blue represent the minimal deployment of sensors that ensures that the global dynamics of the iEMG in all 6 muscles can be retrieved. The sensors in grey represent the unused sensors, while the muscular activity at the location of the sensor in red is simulated based on the identified fractional-order system.

4.4 Simulation Results

A case-study investigation of EMG signals:

We analyze the intramuscular EMG (iEMG) signals from a clinical experiment comparing the muscle contractions of transradial amputees with those of non-amputated subjects. All subjects are asked to perform forearm movements. The iEMG signals are recorded at different sites of the forearm muscles, as shown in Figure 4.3: (1) extensor digitorum (ED); (2) flexor digitorum profundus (FDP); (3) abductor pollicis longus (APL); (4) flexor pollicis longus (FPL); (5) pronator teres (PT); and (6) supinator (SUP). Fine-wire electrodes are inserted in each subject for measurement purposes. Subjects are asked to relax for 6 seconds and then perform finger flexion at a consistent strength for 10 seconds. The entire process is repeated twice. The ADInstruments data acquisition system sampled the iEMG at 4 kHz after applying a 2 kHz low-pass filter and a 10 Hz high-pass filter to minimize any motion artifacts from electrodes or leads. As an example, we consider the iEMG signals measured when subject 3 is performing finger flexion at a constant strength. We use identification techniques to estimate the fractional-order parameters $(A, \alpha)$ that account for the spatial-temporal characterization of the signals.
More specifically, the coupling matrix captures the spatial dependencies among the different motor muscles involved in the forearm movement, and the fractional-order exponents of the different iEMG channels lie in $[0.14, 0.19]$, which supports the fractional nature of the dynamics. Based on the identified system $\mathcal{F}(A, \alpha, K)$, we apply the greedy method proposed in Algorithm 2 to retrieve a small collection of sensors such that the fractional-order system is observable. In Figure 4.3, we depict in blue the identified sensors that ensure retrieval of the initial states of the unobserved muscular activities. In Figure 4.5(a-1), we report the retrieved initial states over all 6 sensing channels compared against the actual measurements. These states were recovered using (4.7) when the sensors ED, FDP and APL are considered. In particular, we notice that we were able to successfully recover the initial states of sensors SUP, PT and FPL. To show the capability of retrieving the initial states, we simulate the muscular dynamics at FPL over the first 10-second session of finger flexion based on the retrieved system states and the innovation series inferred from our estimation. To give an intuition, we show the comparison between the recorded signal and the simulated activity in Figure 4.5(a-2). As we can see from the figure, the simulated dynamics at the non-measured FPL channel fit the actually recorded muscular activity well, thus validating the capability of the proposed algorithm to recover the global dynamics in the iEMG experiment. To explore the influence of the length of the time series, i.e., the number of measurement samples collected, on the accuracy of the retrieved global dynamics, we measured the accumulated errors of the retrieved initial states of the system, i.e., the sum of the deviations from the actual recordings over all sensing channels, and show them in Figure 4.4.
An important observation is that increasing the number of deployed sensors helps reduce the estimation errors, which are mainly due to approximation errors and small weights in the coupling between some of the sensors' signals. Similar conclusions can be drawn when we observe the physiological process for a longer period of time, given a fixed number of sensors. However, there exists a phase-change phenomenon as the number of sensors decreases: one can always retrieve the system's initial states with similar accuracy when at least 3 sensors are used and enough observations are made, whereas there is a jump in the accumulated errors that persists over the observation horizon if fewer than 3 sensors are deployed. This suggests the existence of a lower bound on the number of sensors needed to retrieve the global dynamics (i.e., minimal observability), as predicted by the theoretical setting explored in this work.

A case-study investigation of EEG signals:

To explore the minimal observability of a neural system, we apply the proposed algorithm to a 64-channel electroencephalogram data set that records the brain activity of 109 subjects while they perform motor and imagery tasks. In the experiment, each subject sits in front of a screen where targets might appear at the right, left, top or bottom side of the screen. Upon noticing the target, each subject is asked to open and close the corresponding fists or feet as a function of where the target appears. Each individual performed 14 experimental runs consisting of one minute with eyes open, one minute with eyes closed, and three two-minute runs of interacting with the target. The data set was collected by the BCI2000 system with a sampling rate of 160 Hz [118, 49].

Following experimental settings similar to those of the iEMG study, the spatial-temporal parameters were estimated; in particular, the fractal-order exponents range from 0.34 to 1.04.
Then, using the estimated model $\mathcal{F}(A, \alpha, K)$, we used Algorithm 2 to obtain a sensor placement that achieves observability of the system. The minimal set of sensors and their deployment are reported in Figure 4.6 and depicted in blue. More specifically, 30 out of 64 sensors are required to ensure observability. In addition, we retrieved the system's initial states of all 64 EEG channels by applying equation (4.7), as presented in Figure 4.5(b-1). Furthermore, as illustrated in Figure 4.5(b-2), the recovered initial states of the neural system closely follow the actual measurements. In particular, we simulate the neural dynamics at a distant PO8 channel, whose retrieved initial state differs by less than 4% from the recorded value, and compare the model response with the actual EEG recordings. We then performed an experiment to check how the length of observations affects the accuracy of the retrieved states under different neural sensor settings. The results are reported in Figure 4.7 and, as expected, show that the errors of the retrieved states decrease as we observe the system dynamics for a longer period of time given a fixed set of selected sensors, and as we choose more sensors given a fixed length of observations.

Figure 4.4: Accumulated errors in the retrieved initial states of the system as a function of the number of sensors used and the observation length (horizontal axis: length of observations; vertical axis: accumulated retrieval errors, logarithmic scale; one curve per number of deployed sensors, from 1 to 5). Increasing the number of deployed sensors provides an information gain that reduces the overall deviation of the retrieved states from the actual measurements.
A phase-change phenomenon similar to the one described for the EMG experiment is also observed: when fewer than 16 sensors are used, the accuracy does not improve even at the expense of a longer observation length. Further, there is a sudden increase in the errors over all sensing channels when the number of sensors drops from 16 to 8.

A case-study investigation of ECG signals:

To verify the efficacy of our theoretical analysis, we also apply the algorithm to the electrocardiogram (ECG). The raw clinical ECG data was extracted from the PTB diagnostic ECG database [50]. Data on 52 healthy subjects (13 women, age 48 ± 19, and 39 men, age 42 ± 14) was obtained by the National Metrology Institute of Germany. Each subject's record includes 15 simultaneously acquired signals: the conventional 12 leads (I, II, III, aVR, aVL, aVF, V1, V2, V3, V4, V5, and V6) and the 3 Frank orthogonal leads (VX, VY, and VZ). Each signal is digitized at 1000 Hz, with a signal bandwidth of 0 Hz to 1 kHz and with 1 μV LSB resolution [24]. As an example, we choose the ECG signal of subject S0010 from the data set during a 39-second measurement session. Similarly, we first obtain the estimate of $\{\alpha_j\}$ that corresponds to the fractional-order derivative in equation (4.1) and the spatio-coupling matrix $A$. Based on the identified system, the greedy algorithm shows that we do not need more than two sensors (i.e., I and II) to retrieve the dynamics of all 15 channels of cardiac activity. We show the system states recovered using the observations from only these two sensors in blue in Figure 4.5(c-1).
An important observation is that the obtained deployment of sensors aligns with the pattern distribution of coefficients in the coupling matrix $A$, see Figure 4.9.

Figure 4.5: The retrieved initial states of the system using (4.7) across different signals are presented in the top row (panels a-1, b-1 and c-1: muscular, neural and cardiac activity versus channel ID, showing measurements, retrieved initial states, and used and unused sensors). We show the simulated state evolution of sensors that were not considered in the experiments and compare it with the actual measurements at these sensors (panels a-2, b-2 and c-2: recorded versus simulated activity against sample ID at the FPL, PO8 and aVR channels, respectively). In the first column, Figure 4.5(a-1) shows the initial states of the unused sensors recovered by the proposed algorithm based on the measurements from the minimal subset of sensors (ED, APL and FDP). In Figure 4.5(a-2), the simulated data based on the retrieved initial states at the unused FPL sensor is compared against the actual measurements during the first 0.4 second of finger flexion. Similarly, Figure 4.5(b) and Figure 4.5(c) show the initial states retrieved for unused sensors and the simulated dynamics at one of these sensors during the EEG and ECG experiments, respectively.
A subset of channels plays the dominant role in the cardiac dynamics, and these channels are themselves tightly coupled, so that the global dynamics can be retrieved from observations made on a subset of them. This is verified by the output of the algorithm and validated by our simulation of the system state evolution at aVR, shown in Figure 4.5.(c-2). The simulated model response is strongly coherent with the actual measurement. We set up a similar experiment to show the relation between the observation length and the number of sensors deployed in the context of a cardiac system. The results are shown in Figure 4.10 and agree with our previous observations in the iEMG and EEG experiments. The lower bound on the number of sensors needed to reliably retrieve the global initial states can be identified by checking where the phase change appears. In the case of ECG, there is a sudden rise in the errors when only one sensor is considered. In contrast, a similar accuracy is achieved when 2 or more sensors are placed in the measurement system as described in Figure 4.8, given enough observations.

[Figure 4.6. Legend: used sensors, unused sensors, sensor of interest.]
Figure 4.6: The 64-channel geodesic sensor distribution for the measurement of EEG. The sensors in blue represent the minimum number of sensors, and their deployment, that guarantee that the CDFODS in (4.1), whose states correspond to the channel measurements, is observable. The sensor in red is used as a sanity check on the evolution of the identified CDFODS. The results are compared against the recorded activity in Figure 4.5.(b-2).

4.5 Summary

Coupled-fractional dynamical systems capture the evolution of different physiological signals, such as electroencephalogram, electromyogram or electrocardiogram signals. In this work, we proposed to determine the smallest collection of signals required to retrieve the overall evolution of the process modeled by the coupled-fractional dynamical system.
In particular, we show that the problem is NP-hard, but a submodular approach can be leveraged to obtain approximate solutions with optimality guarantees. Further, we illustrated the proposed mechanism in the context of the different physiological signals mentioned above.

[Figure 4.7. Accumulated retrieval errors (logarithmic axis) versus length of observations, one curve per number of deployed sensors: 63, 56, 48, 40, 32, 24, 16, 8 and 1.]
Figure 4.7: The accumulated errors of the retrieved initial states of the neural system over 64 channels change over the observation horizon. A phase-change phenomenon appears when the number of sensors deployed drops from 16 to 8, where the errors do not decrease as more observations are made.

[Figure 4.8. Placement of leads V1 to V6, aVR, aVL, aVF, I, II and III, with used and unused sensors marked.]
Figure 4.8: The deployment of the 12-lead ECG system used in the experiment. The sensors in blue (I and II) are the identified minimal subset of sensors that ensures the observability of the fractional order system. We simulate the cardiac activity at aVR given the identified fractional order system.

Figure 4.9: The coefficient colormap of the spatio-coupling matrix $A$. The distribution of normalized coefficients shows that a subset of channels plays dominant roles in the cardiac dynamics.

[Figure 4.10. Accumulated retrieval errors (logarithmic axis) versus length of observations, one curve per number of deployed sensors: 14, 12, 10, 8, 6, 4, 2 and 1.]
Figure 4.10: The accumulated errors are plotted against the length of observations for different numbers of sensors. Longer observations improve the overall accuracy of the system state estimates, while the results also suggest a lower bound on the number of sensors needed for observability. This observation aligns with our theoretical analysis on minimal observability.
Chapter 5 Multifractal Geometry and Characterization of Networked Cyber-Physical System: Algorithms and Implications

5.1 Multi-fractal Geometry of Complex Systems

Complex systems consist of heterogeneous agents that influence each other through interactions of different intensities over multiple spatio-temporal scales. This heterogeneity, present both in the participating components and in their varying interactions, makes complex systems difficult to decipher. To understand and control these complex systems, network theory provides an effective mathematical modeling framework. It encodes the entities (nodes) of a complex system and their heterogeneous interactions (links) of different strength (weights) into a topological network configuration implicitly embedded in a metric space, where the distance among nodes is decided both by the structural configuration of the system (topology) and by the intrinsic nature of the inter-node couplings (e.g., social affinity, chemical bonds, traffic intensity or neural connectivity strength). In some cases, the properties of the inter-couplings among system components and the corresponding spatial embeddings play a far more dominant role in regulating the overall system behaviors and dynamics. For instance, the atomic and molecular interactions among a chain of amino acids dictate not only the dynamical spatial conformation of the corresponding protein but also its biological functionality [33, 36]. The disturbance of normal protein interactions can lead to irreversible pathological consequences known as proteopathies, such as Alzheimer's, Parkinson's [122] and Huntington's disease [73]. Therefore, the study of the structural organization, formation and dynamics of complex systems can benefit from studying their geometrical properties and discovering new relationships between geometrical characteristics and network problems (e.g., community structure identification).
Learning the geometric principles underlying the organization of complex systems modeled by weighted networks facilitates the identification of their fundamental properties. Some complex networks have been found to be small-world or ultra-small-world. The small-world network model characterizes a graph of size $N$ whose average path length increases proportionally to the logarithm of the number of nodes, $\langle d \rangle \sim \log N$. In contrast, ultra-small-world networks are characterized by smaller shortest path distances that scale as $d_{\min} \sim \log\log N$. Albert-László Barabási and his colleagues found that the Erdős-Rényi random network model cannot explain the formation of densely interconnected hubs or clusters in a family of real networks whose degree distribution obeys a power law [11]. In contrast to the Erdős-Rényi random network model, which leads to a narrow normal degree distribution, the power-law degree distribution of these networks has such a long tail that we cannot reason about the interconnection density of the network given a randomly chosen sample; hence they are called scale-free.

The discovery of the small-world property led to the belief that complex networks are not invariant under a length-scale transformation, since an exponential dependence holds between the size of the network and its average path length. However, a variety of real networks were found to exhibit self-repeating patterns at all length scales when a renormalization procedure is applied [128, 127]. This illustrates the concept of self-similarity. The coexistence of self-similarity and the small-world property in a variety of complex networks was further verified [110]. These two seemingly contradictory properties call for further investigation of the appropriate mathematical model of complex networks and its main features. A phase transition between local self-similarity and the global small-world property was found by studying the stability of nodes with renormalization group theory [111].
The uncovered self-similarity in complex networks connects to the important domain of fractal and multi-fractal geometry, where a family of objects is distinguished by self-repeating patterns and invariance under scale-length operations. Such objects are known as fractal objects. A mono-fractal object obeys a perfect self-repeating law at all scales. When it is embedded in a Euclidean metric space and tiled by equally sized boxes at different scales, it becomes apparent that an important property of fractals is the power-law dependence between the mass distribution $M(r)$ (e.g., the number of points in a box) and the scale factor $r$:

$$M(r) \sim r^{D} \tag{5.1}$$

In Eq.(5.1), $D$ is the fractal dimension, a real-valued number, in contrast to the dimension of the embedding space, which is always an integer. The fractal dimension is the major tool for describing fractal geometry and the heterogeneity of irregular geometric objects that the dimension of the embedding space fails to capture. For instance, in Euclidean geometry, a straight line and a crooked line share the same geometrical dimension but have very distinct properties. Multi-fractals can be seen as an extension of fractals with increased complexity. They are invariant by translation, although a distortion factor $q$ needs to be considered to distinguish the details of different regions of the object as a consequence of an inhomogeneous mass distribution. Intuitively, multi-fractals are not perfect self-repetitions but are rich in localized variations of detailed geometric configurations. Consequently, a single fractal dimension is not sufficient to characterize the irregularity of the geometric shapes, as the scaling factor measured across the object can differ. As a result, multi-fractal analysis (MFA, see Methods for details) is proposed to capture the localized and heterogeneous self-similarity by learning a generalized fractal dimension $D(q)$ under different distortion factors $q$.
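As a small illustration of the power law in Eq.(5.1), the following sketch estimates $D$ for a textbook mono-fractal, the middle-third Cantor set, by regressing $\ln M(r)$ on $\ln r$. The Cantor set, the function names and the choice of scales are assumptions made for this example only; they are not part of the networks studied in this chapter.

```python
import math

def cantor_points(levels):
    """Integer coordinates of a finite Cantor set: base-3 numbers
    with `levels` digits that use only the digits 0 and 2."""
    points = [0]
    for i in range(levels):
        points += [p + 2 * 3 ** i for p in points]
    return points

def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

def mass_dimension(points, levels):
    """Estimate D in M(r) ~ r^D from the mass inside [0, r) for r = 3^m."""
    log_r, log_m = [], []
    for m in range(1, levels + 1):
        r = 3 ** m
        mass = sum(1 for p in points if p < r)  # points inside the box
        log_r.append(math.log(r))
        log_m.append(math.log(mass))
    return fit_slope(log_r, log_m)

levels = 8
D = mass_dimension(cantor_points(levels), levels)
print(D, math.log(2) / math.log(3))  # both values should agree closely
```

The estimate is a real number strictly between 0 and 1, which is the sense in which $D$ differs from the integer dimension of the embedding space.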
MFA has been applied to investigate the underlying geometrical principles in a wide spectrum of applications, including signal processing [94, 59, 60, 152], image processing [85, 65, 143], genomics [151, 93], geophysics [135, 81], turbulence analysis [89, 14, 94], network traffic modeling [45] and financial analysis [119, 63, 86, 25]. Despite the effectiveness of MFA in various domains, its application to the study of self-similarity in complex networks is not straightforward, as the Euclidean metric is not well defined on a topological object like a complex network. The box-covering method was introduced [126] to calculate the fractal dimension of unweighted complex networks, and the authors proved its reducibility to the well-known graph coloring problem, which is NP-hard. However, a single fractal dimension is not a sufficient characterization of the self-similarities embedded in a complex network. For instance, how can we distinguish fractal networks that share exactly the same fractal dimension but look entirely different? Figure 5.1 shows two fractal networks, namely the Sierpinski fractal network and the $(u,v)$-flower. Both networks have the exact same fractal dimension of $\ln(6)/\ln(3) \approx 1.631$ but show distinct structural properties. Clearly, relying on mono-fractal analysis does not allow us to distinguish between these two networks. Another relevant question is how link weights affect the fractality/multi-fractality of a weighted complex network.

[Figure 5.1. Generations $G_0$, $G_1$, $G_2$ of the Sierpinski network ($b = 6$, $s = 1/3$) and generations $n = 1, 2, 3$ of the $(u,v)$-flower ($u = 3$, $v = 3$).]
Figure 5.1: Failure of a single (dominant) fractal dimension to capture the heterogeneity in the detailed configuration of fractal networks. A comparative example shows that A) the Sierpinski fractal network ($s = 1/3$, $b = 6$) and B) the $(u,v)$-flower fractal network ($u = 3$, $v = 3$) share the same fractal dimension (1.631) yet have distinct topological structures.
Link weights play an important role in governing the dimension of the network, as there exists a mapping from a weighted network to a spatially embedded network in which weights translate to link lengths that affect its dimension [31, 35]. To address this problem, an alternative box-covering method (BCANw) was proposed for the numerical determination of the fractal dimension of weighted complex networks [142]. Its application has also been extended to study the inhomogeneity of weighted real-world networks through multi-fractal analysis. Along the same line, a similar study of the multi-fractality embedded in weighted networks using the modified sandbox method (SBw) was reported [129]. However, neither of the two methods considers the impact of the distribution of the link weights. Both algorithms are prone to intrinsic estimation bias as a consequence of i) ignoring the skewness of the link weight distribution and ii) the implicit assumption that a global fractal/multi-fractal scaling holds at all scales of the network. Moreover, there is no theoretical foundation to support the design and evaluation of either algorithm, which would be needed to analyze the factors that adversely impact their numerical accuracy. All these disadvantages leave room for an erroneous characterization of both the structural and the dynamical features of weighted complex networks.

To overcome these issues, we first analytically study the multi-fractal structure of the Sierpinski fractal network family to set up the theoretical ground for the evaluation and comparative analysis of our proposed algorithms. We find that the multi-fractality identified by SBw can be merely a side effect of the limited size of the network considered.
The analytical discussion of multi-fractality in the Sierpinski family $\mathcal{S}$ provides the theoretical basis on which we can not only reason quantitatively about the existence of multi-fractality/fractality from an asymptotic perspective, which numerical approaches cannot offer, but also shed some light on the design of numerical algorithms for reliable estimation of the multi-fractal spectrum of complex networks.

To motivate the design of a reliable algorithm that eliminates the disadvantages of BCANw and SBw, we analyze the source of the estimation bias of both algorithms through a set of numerical experiments. We show that a compatible growth rule is required to remove the bias, given weighted complex networks of finite resolution. We also present a detailed quantitative error analysis to investigate the source of the intrinsic estimation bias of both algorithms.

Based on both our theoretical findings and our numerical experimental results, we propose the finite box-covering algorithm for weighted networks (FBCw) and the finite sandbox algorithm for weighted networks (FSBw) with improved performance. We compare the accuracy of the estimates obtained by FBCw and FSBw with our analytical results for the Sierpinski fractal network as well as with those obtained by BCANw and SBw. The comparison shows that the proposed algorithms not only give reliable numerical estimates of fractality that are insensitive to the distribution of link weights, but are also capable of detecting the fractal scaling dependence when it holds only within a finite range of scales (i.e., scale-localized).

More importantly, we apply the proposed algorithms to learn the multi-fractal structure of a set of real-world weighted networks. We show that the link weights play a definitive role in governing the existence of fractality in the network. The investigated weighted networks exhibit a phase transition from self-similar networks to small-world networks when converted to binary networks.
Furthermore, we demonstrate that the fractal and multi-fractal scaling behaviors can be spatially localized and can co-exist in the same network. Learning from our observations on the locality of the scaling behavior of real-world weighted networks, we finally propose a network characterization framework based on the localized scaling feature space, learned by constructing a scaling feature vector for each node in the network. The proposed characterization is general and not limited to complex networks that are fractal or multi-fractal. It can be easily interfaced with subsequent analytical tools (e.g., machine learning algorithms) to unveil the intrinsic properties of the weighted complex network. To illustrate the benefits of our methodology, we apply our algorithms to network community detection in the human brain connectome. The identified communities are consistent with our biological knowledge.

The following discussion is organized in four parts. In the first and second parts, we present our analytical study of the multi-fractality of Sierpinski networks and of the estimation error of previous numerical algorithms. In the third part, we compare the performance of BCANw and SBw with the proposed FBCw and FSBw. Finally, we present the multi-fractal analysis of a set of weighted real-world complex networks and propose a localized-scaling-based approach for the characterization of weighted complex networks. We provide an illustrative application example in network community detection to show its effectiveness.

5.2 Multi-fractal Analysis of Sierpinski Fractal Network

Preliminaries: Before we present the analytical study of the multi-fractality of the Sierpinski network, we first formally introduce the Sierpinski family of interest. For succinctness, we introduce the following definitions: Let $b \in \mathbb{N}$ be a positive integer and $s \in (0, 1]$ be a real number.
Let $G = (V, E)$ be a network, where $V$ denotes the collection of nodes and $E$ denotes the collection of links. Let $D(v_i, G)$ denote the degree of $v_i$ in network $G$ and $W(e_{i,j})$ be the weight of link $e_{i,j}$. A tree is an undirected connected acyclic network $G$. Given that $G = (V, E)$ is a tree, let $L: V \to V$ be a mapping function such that $D(v_i, G) = 1$, $\forall v_i \in L(V) \subseteq V$. $L(V)$ is also known as the set of leaf nodes of $G$. Thus, we can formally introduce the definition as follows.

Definition 1 (Sierpinski network): A family of Sierpinski networks is an infinite set of trees $\mathcal{S} = \{G_0, G_1, \dots, G_n, \dots\}$ subject to the following constraints:
(1) $G_0 = (V_0, E_0)$ where $|V_0| = 1$ and $E_0 = \emptyset$.
(2) For any $k \ge 1$, $\Delta G_k := G_k \setminus G_{k-1} = (\Delta V_k, \Delta E_k)$, where $\Delta V_k = V_k \setminus V_{k-1}$ and $\Delta E_k = E_k \setminus E_{k-1}$, such that the following conditions are all met:
(2.1) $D(v_i, \Delta G_k) = 1$, $\forall v_i \in \Delta V_k$.
(2.2) $|\Delta V_k| = |\Delta E_k|$.
(2.3) $\forall v_i \in \Delta V_k$, $\exists v_j \in L(V_{k-1})$ such that $e_{i,j} \in \Delta E_k$.
(2.4) For $k = 1$, $D(v_i, G_k) = b$ and for $k > 1$, $D(v_i, G_k) = b + 1$, $\forall v_i \in L(G_{k-1})$.
(2.5) For $k = 1$, $W(e_{i,j}) = W_0$, $\forall e_{i,j} \in \Delta E_k$. For $k > 1$, $W(e_{i,j}) = s\,W(e_{i',j'})$, $\forall e_{i,j} \in \Delta E_k$ and $\forall e_{i',j'} \in \Delta E_{k-1}$.

As Definition 1 states, the family of Sierpinski networks is constructed in an iterative fashion, where the origin of the family is a single node $n_0$ in $G_0$ and any member $G_{k-1}$ is a subgraph of its successor $G_k$. By definition, a new family member $G_k$ is introduced through the insertion of the new nodes in $\Delta V_k$ and the new links in $\Delta E_k$ into its predecessor $G_{k-1}$. The construction of $\Delta V_k$ and $\Delta E_k$ is regulated by constraints (2.1)-(2.5). Constraints (2.1), (2.2) and (2.3) state that each newly added node has one and only one link to the leaf nodes of $G_{k-1}$. Constraint (2.4) defines the growth rule of the Sierpinski family, where each leaf node in $G_{k-1}$ creates links to exactly $b$ nodes in $G_k$. Constraint (2.5) decides the scaling factor of link weights between two generations. It is noted that we define the weights $W(e_{i,j})$ for an empty link set $E_1$ to maintain the consistency of the iterative definition.
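The iterative construction in Definition 1 can be sketched in a few lines. The code below is a minimal illustration, not the implementation used in this work: the function names are hypothetical, nodes are labeled by consecutive integers, and exact rational weights are used so that the node count and the weighted diameter can be compared with the geometric sums $1 + b + \dots + b^k$ and $2W_0(1 + s + \dots + s^{k-1})$ without rounding.

```python
from fractions import Fraction

def build_sierpinski_tree(b, s, k, w0=Fraction(1)):
    """Build G_k as an adjacency dict {node: [(neighbor, weight), ...]}.
    Each leaf of G_{j-1} receives b new children; the links added in
    generation j carry weight s**(j-1) * w0."""
    adj = {0: []}          # G_0: the single origin node n_0
    leaves = [0]
    weight = w0
    for _ in range(k):
        new_leaves = []
        for leaf in leaves:
            for _ in range(b):
                node = len(adj)
                adj[node] = [(leaf, weight)]
                adj[leaf].append((node, weight))
                new_leaves.append(node)
        leaves = new_leaves
        weight *= s        # constraint (2.5): weights shrink by s
    return adj

def farthest(adj, src):
    """Return the node farthest from src and its weighted distance."""
    dist = {src: 0}
    stack = [src]
    while stack:
        u = stack.pop()
        for v, w in adj[u]:
            if v not in dist:          # a tree has a unique path to v
                dist[v] = dist[u] + w
                stack.append(v)
    node = max(dist, key=dist.get)
    return node, dist[node]

def tree_diameter(adj):
    """Weighted diameter of a tree via two farthest-node sweeps."""
    end, _ = farthest(adj, 0)
    _, diameter = farthest(adj, end)
    return diameter

b, s, k = 3, Fraction(1, 3), 4
adj = build_sierpinski_tree(b, s, k)
size = len(adj)
diameter = tree_diameter(adj)
print(size, (b ** (k + 1) - 1) // (b - 1))   # node count vs. geometric sum
print(diameter, 2 * (1 - s ** k) / (1 - s))  # diameter vs. closed form (W_0 = 1)
```

The two printed pairs should match; the closed forms are the ones derived in Lemmas 1 and 2.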
To study the multi-fractality of the Sierpinski network family $\mathcal{S}$, we first need to map the network onto a metric space where the distance between a pair of nodes is defined as follows:

$$d_{i,j} = \min\{ w^{p}_{i,k_1} + w^{p}_{k_1,k_2} + \dots + w^{p}_{k_n,j} \} \tag{5.2}$$

where $w_{l,m} = W(e_{l,m})$ and the link set $\{e_{i,k_1}, e_{k_1,k_2}, \dots, e_{k_n,j}\}$ represents a path between $v_i$ and $v_j$. The distance between a pair of nodes is thus decided by the path that minimizes the summation in Eq.(5.2). The exponent $p$ is a constant decided based on the context of the network. Since $p$ is fixed for a specific type of network, it is always possible to define $w'_{l,m} = w^{p}_{l,m}$. Thus we will assume $p = 1$ without loss of generality.

Lemma 1 (Diameter of $G_k$): The diameter $D_{m,k}$ of $G_k \in \mathcal{S}$ is given by

$$D_{m,k} = 2W_0 \left( \frac{1 - s^{k}}{1 - s} \right) \tag{5.3}$$

Proof: Due to the symmetry of $G_k$ and constraint (2.5), the longest path length is twice the distance between $v_i \in L(V_k)$ and $n_0$. Thus,

$$D_{m,k} = 2d(v_i, n_0) = 2(W_0 + sW_0 + s^2 W_0 + \dots + s^{k-1} W_0) = 2W_0 \frac{1 - s^{k}}{1 - s} \tag{5.4}$$

Lemma 2 (Size of $G_k$): The network size $|V_k|$ of $G_k \in \mathcal{S}$ is given by

$$|V_k| = \frac{1 - b^{k+1}}{1 - b} \tag{5.5}$$

Proof: According to the growth rule (2.4), $b\,|L(V_{k-1})|$ nodes are newly introduced in the $k$th iteration. As a consequence, the size of $G_k$ can be calculated through the summation of a geometric progression,

$$|V_k| = 1 + b + b^2 + \dots + b^{k} = \frac{1 - b^{k+1}}{1 - b} \tag{5.6}$$

To analytically investigate the multi-fractality of the Sierpinski family $\mathcal{S}$, we consider a box covering method $\mathcal{M}$ that tiles $G_k \in \mathcal{S}$, $\forall k \in \mathbb{N}$, with boxes $\{B_i(l)\}$. For clarity, we formally introduce the following definitions:

Definition 2 (Box): A box $B(l)$ is a subset of $V$ such that for any pair of nodes $v_i$ and $v_j \in B(l)$, $d_{i,j} \le l$.

Definition 3 (Compact box): A box $B(l)$ is compact if and only if $d_{i,j} > l$ holds for $\forall v_i \in B(l)$ and $\forall v_j \in V \setminus B(l)$.

Definition 4 (Box covering method): A non-overlapping box covering strategy $\mathcal{M}$ is a partition of $G$ such that $\bigcup_i B_i(l) = V$ and $B_i(l) \cap B_j(l) = \emptyset$, $\forall i \ne j$.
Definition 5 (Optimal box covering): A non-overlapping box covering strategy $\mathcal{M}$ is optimal if there does not exist $\mathcal{M}'$ such that $|\mathcal{M}'| < |\mathcal{M}|$.

Theorem 1 (Optimal box covering): A box covering strategy $\mathcal{M}$ is optimal if $B_i(l)$ is compact for $\forall B_i(l) \in \mathcal{M}$.

Proof: Assume the optimal box covering is $\mathcal{M}'$ and $\mathcal{M}$ is compact. For an arbitrary choice of a node $v_j$, $\exists B_i(l) \in \mathcal{M}$ such that $v_j \in B_i(l)$. Since $\mathcal{M}'$ is a partition of $G$, $\exists B_{i'}(l)$ such that $v_j \in B_{i'}(l) \in \mathcal{M}'$. Given that $\mathcal{M}'$ is optimal, let us assume $B_{i'}(l) \setminus B_i(l) \ne \emptyset$ such that $\exists v_k \in B_{i'}(l)$ and $v_k \notin B_i(l)$. Equivalently, $\exists v_k \in V \setminus B_i(l)$ such that $d_{j,k} \le l$. Consequently, $B_i(l)$ is not compact. This contradicts our assumption. Therefore, $B_i(l) = B_{i'}(l)$. Since our choice of $v_j$ is arbitrary, we have $\mathcal{M} = \mathcal{M}'$.

It should be noted that a compact $\mathcal{M}$ is not a necessary condition for the optimality of $\mathcal{M}$. Theorem 1 states that an optimal solution is compact as long as such a compact $\mathcal{M}$ exists. To give some intuition, let us consider a network with 4 connected nodes. Assume the link weight is uniformly set to 1 and the box size $l$ used to cover the network is 2. Obviously, the optimal covering strategy $\mathcal{M}$ requires 2 boxes. One covers three nodes and the other covers the rest. No single box of size 2 can cover the entire network, as the longest path is 3. Therefore $\mathcal{M}$ is trivially optimal. However, the boxes are not compact according to Definition 3, as there always exists a node $v_i$ to which the shortest path from some $v_j \in G$ is smaller than $l$. Therefore, compactness is a stronger property of a covering method than optimality.

Theoretically, the generalized fractal dimension via the box-covering method is given by

$$D_{bc}(q) = \lim_{l \to 0} \frac{\ln\left( \sum_i (M_i(l)/M_0)^{q} \right)}{\ln(l/L)} \cdot \frac{1}{q - 1} \tag{5.7}$$

Eq.(5.39) holds only if the box covering strategy $\mathcal{M}$ is optimal. Interestingly, we are able to present a stronger box-covering strategy for the Sierpinski family $\mathcal{S}$ and prove its compactness (hence optimality), conditioned on the value of $s$, the scaling factor of link weights.
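The four-node example above can be checked by brute force. The sketch below assumes the four nodes form a chain with unit link weights (which gives the longest path of length 3 mentioned in the text), enumerates every partition of the nodes, keeps those whose blocks are boxes in the sense of Definition 2, and then tests the optimal ones for compactness in the sense of Definition 3. All function names are illustrative, and exhaustive enumeration is only feasible for very small networks.

```python
from itertools import combinations

def partitions(items):
    """Yield every partition of `items` into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        for i in range(len(part)):      # add `first` to an existing block
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        yield [[first]] + part          # or open a new block

def distance(u, v):
    """Shortest-path distance on a chain 0-1-2-3 with unit weights."""
    return abs(u - v)

def is_box(block, l):
    """Definition 2: all pairwise distances inside the block are <= l."""
    return all(distance(u, v) <= l for u, v in combinations(block, 2))

def is_compact(block, nodes, l):
    """Definition 3: every outside node is farther than l from the block."""
    outside = [v for v in nodes if v not in block]
    return all(distance(u, v) > l for u in block for v in outside)

nodes, l = [0, 1, 2, 3], 2
valid = [p for p in partitions(nodes) if all(is_box(blk, l) for blk in p)]
best = min(len(p) for p in valid)
optimal = [p for p in valid if len(p) == best]
any_compact = any(all(is_compact(blk, nodes, l) for blk in p) for p in optimal)
print(best, any_compact)
```

Under these assumptions the minimum number of boxes is 2 and none of the optimal coverings is compact, which matches the statement that compactness is sufficient but not necessary for optimality.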
Optimal box covering for Sierpinski $\mathcal{S}$: The covering strategy starts with choosing how the box size should be scaled in order to calculate the limit in Eq.(5.39) via a linear regression. In contrast to a geometric fractal object, which is well defined on Euclidean space, a complex network, once mapped to a metric space, has a limited resolution in the sense that we cannot learn how the measure distribution $\sum_i (M_i(l)/M_0)^{q}$ changes over continuously scaled box sizes $l$. To address this problem for the considered Sierpinski family $\mathcal{S}$, we choose to grow the box size $l$ by accumulating the unique link weights. Formally, let us consider $G_k = (V_k, E_k) \in \mathcal{S}$, where $W(E_k)$ denotes the universal set of all its link weights. Let us define a strictly ordered set $W_{>} = \{w_1, w_2, w_3, \dots, w_n\}$ on $G_k$ such that $w_i \in W(E_k)$, $\forall w_i \in W_{>}$, and $w_i < w_j$ if $i > j$. We define the box-size growth rule as the set $\mathcal{L} = \{ l_j \mid l_j = 2\sum_{i=j+1}^{n} w_i,\ j \le n - 1 \}$. Once the growth rule is set up, the subsequent covering procedure can be stated as:
1. Given $l_j$, cover the subgraph in $G_k$ rooted at a node $v_i \in \Delta V_j$ with a single box $B(l_j)$ of size $l_j$.
2. Repeat step 1 for all the nodes in $\Delta V_j$.
3. Cover every other node in $G_k$ with a box of size $l_j$.
4. Steps 1-3 yield a box-covering strategy $\mathcal{M}_{opt}$.

Lemma 3 (Optimality of box covering): $\mathcal{M}_{opt}$ is a compact optimal covering for $\mathcal{S}$ if $s < 1/3$.

Proof: First, let us prove that a box of size $l_j$ is able to cover the subgraph rooted at any node $v_i \in \Delta V_j$. Let us denote the diameter of the subgraph of $G_k$ rooted at $v_i$ as $D_{m,k_i}$. Given that $G_k$ is a tree, $D_{m,k_i}$ can be calculated as:

$$D_{m,k_i} = 2(s^{k-1} W_0 + \dots + s^{j} W_0) = 2\,\frac{s^{k} - s^{j}}{s - 1}\, W_0 \tag{5.8}$$

By definition, we know that for $G_k \in \mathcal{S}$

$$l_j = 2 \sum_{i=j+1}^{n} w_i \tag{5.9}$$

where $w_i = s^{i-1} W_0$ and $n = k$. Thus $l_j = D_{m,k_i}$, such that the subgraph rooted at $v_i \in \Delta V_j$ can be covered with a single box. Next, let us prove that any node $v_i \in \Delta V_p$ with $p < j$ requires one and only one box $B(l_j)$ to cover it.
We rewrite Eq.(5.9) as

$$l_j = \frac{2s}{1 - s} W_0 s^{j-1} - \frac{2 s^{k}}{1 - s} W_0 \tag{5.10}$$

The link weight between a node $v_i \in \Delta V_j$ and $v_{i'} \in \Delta V_{j-1}$ is equal to $w_j = W_0 s^{j-1}$. Given $s \in (0, 1]$, step 3 holds if $l_j < w_j$. A sufficient condition for that is

$$\frac{2s}{1 - s} < 1 \iff s < 1/3 \tag{5.11}$$

Given that $w_j < w_{j-1} < \dots < w_1 = W_0$, no two nodes $v_i$ and $v_{i'} \in V_{j-1}$ can be covered by a box $B(l_j)$ if $s < 1/3$. To finish our proof, we need to show that $\mathcal{M}_{opt}$ is compact. This follows from the fact that for any box $B_m(l_j) \in \mathcal{M}_{opt}$, the minimal distance $d_{i,i'}$ between a node $v_i \in B_m(l_j)$ and a node $v_{i'}$ outside the box is no less than $w_j$, while $w_j > l_j$. Thus, $B_m(l_j)$ is compact for any such box in $\mathcal{M}_{opt}$, so that $\mathcal{M}_{opt}$ is compact. By Theorem 1, the covering strategy $\mathcal{M}_{opt}$ is optimal.

Covering $\mathcal{S}$ by $\mathcal{M}_{opt}$ simply yields

$$N(B(l_j)) = |\Delta V_j| + |V_{j-1}| = b^{j} + b^{j-1} + \dots + b^{0} = \frac{b^{j+1} - 1}{b - 1} \tag{5.12}$$

This sets up the basis on which we can analytically derive the multi-fractal spectrum, which requires calculating the limit in Eq.(5.39). We consider the mass distribution $M_i(l_j)/M_0$ as our probabilistic measure. Note that we have two types of equally sized boxes $B(l_j)$. Box type I covers the subgraph rooted at a node in $\Delta V_j$, and the other type covers a single node in $V_{j-1}$. The number of nodes they cover is given by

$$M(l_j) = \begin{cases} \dfrac{b^{k-j+1} - 1}{b - 1}, & \text{Box type I} \\ 1, & \text{otherwise} \end{cases} \tag{5.13}$$

The probability measures $\mu(B_1)$ and $\mu(B_2)$ are calculated as

$$\mu(B_1) = \frac{M(l_j)}{M_0} = \frac{b^{k-j+1} - 1}{b^{k+1} - 1} \tag{5.14}$$

$$\mu(B_2) = \frac{1}{M_0} = \frac{b - 1}{b^{k+1} - 1} \tag{5.15}$$

Assume $1 \ll j < k$; then

$$\mu(B_1)^{q} \approx b^{-qj} \tag{5.16}$$

$$\mu(B_2)^{q} \approx b^{-qk} \tag{5.17}$$

Thus,

$$\sum_i \mu(B_i(l_j))^{q} = b^{(1-q)j} + b^{j - qk - 1} \tag{5.18}$$

If $\lim_{k \to \infty,\, j \to \infty} j/k = O(1)$, that is, if we grow the Sierpinski network at the same speed as we shrink the box size $l$, we can simply write Eq.(5.18) as $b^{(1-q)j}$ as $j \to \infty$, such that the partition function $\tau(q)$ is obtained by

$$\tau(q) = \lim_{j \to \infty} \frac{j(1 - q)\ln(b)}{(j - 1)\ln(s) + \ln(A)} = (1 - q)\frac{\ln(b)}{\ln(s)} \tag{5.19}$$

Eq.(5.19) suggests that the partition function $\tau(q)$ is a linear function of $q$.
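The sums behind Eq.(5.18) can also be evaluated exactly for a finite generation $k$, without the approximations in Eqs.(5.16) and (5.17). The sketch below does this for the covering $\mathcal{M}_{opt}$ and estimates the slope of $\ln \sum_i \mu(B_i)^q$ against $\ln l_j$ over a small range of $j$. The parameter values ($b = 3$, $s = 0.2$, $k = 12$, $j = 2, \dots, 6$) and all function names are assumptions chosen for illustration; the normalization by $L$ is omitted because it only shifts $\ln l_j$ by a constant.

```python
import math

def box_size(j, k, s, w0=1.0):
    """Growth rule l_j = 2 * sum_{i=j+1}^{k} s**(i-1) * W_0."""
    return 2.0 * w0 * sum(s ** (i - 1) for i in range(j + 1, k + 1))

def partition_sum(q, j, k, b):
    """Exact sum of mu(B)**q over the boxes of M_opt at scale l_j."""
    total_nodes = (b ** (k + 1) - 1) // (b - 1)
    subtree_nodes = (b ** (k - j + 1) - 1) // (b - 1)  # mass of a type-I box
    n_type1 = b ** j                    # one box per node in Delta V_j
    n_type2 = (b ** j - 1) // (b - 1)   # one box per node in V_{j-1}
    mu1 = subtree_nodes / total_nodes
    mu2 = 1.0 / total_nodes
    return n_type1 * mu1 ** q + n_type2 * mu2 ** q

def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

def tau_estimate(q, b, s, k, scales):
    """Finite-size slope of ln(partition sum) against ln(l_j)."""
    xs = [math.log(box_size(j, k, s)) for j in scales]
    ys = [math.log(partition_sum(q, j, k, b)) for j in scales]
    return fit_slope(xs, ys)

def tau_asymptotic(q, b, s):
    """Linear form of Eq. (5.19)."""
    return (1 - q) * math.log(b) / math.log(s)

b, s, k = 3, 0.2, 12
scales = range(2, 7)
for q in (2, -2):
    print(q, tau_estimate(q, b, s, k, scales), tau_asymptotic(q, b, s))
```

With these assumed values, the finite-size estimate should stay close to the linear form for $q = 2$ and depart from it clearly for $q = -2$, where the single-node boxes dominate the sum.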
In other words, the Sierpinski network $\mathcal{S}$ converges to a perfect mono-fractal if we were able to measure $G_\infty$, given the scaling factor $s < 1/3$. In such a case, the network could be characterized by a single fractal dimension $D = -\ln(b)/\ln(s)$. In practice, $k$ is always finite. Let us assume that we are learning the scaling dependence from a scaling range $j_{min} < j < j_{max}$ and that $k$ is big enough to approximate $G_\infty$ in the sense that $j \ll k$ and $k \gg 1$. If $q > 0$, we expect $\tau(q)$ to stay a linear function of $q$, as this is the same as the case discussed above. However, $\tau(q)$ will be affected by the value of $k$ when $q$ is negative. This translates to the fact that we will observe $\tau(q)$ as a non-linear function of $q$ that behaves differently when it crosses the point $q = 0$. Of particular note, such non-linearity of $\tau(q)$ has been reported in numerous prior works in which the studied Sierpinski network meets the mentioned conditions, but it was interpreted as multi-fractality. However, we argue that the observed non-linearity of $\tau(q)$ does not necessarily imply multi-fractality. The limited size of the considered graph together with a small scaling factor can also be its source, which does not mathematically link to multi-fractality.

5.3 Analysis of Finite Resolution and Link Weight Distribution

5.3.1 Estimation error analysis and stairway effect

The link weight distribution of complex networks largely depends on the growth rule and the weight allocation process. For instance, the distribution of link weights of the Sierpinski family is shaped by the scaling factor $s$ and the growth rule $b$. Interestingly, for a small scaling factor $s$, we proved that $G_k$ approaches a mono-fractal that has no explicit dependence on the weight distribution. Yet this is valid only for a complex network that has infinite resolution, in the sense that the box/sandbox can grow by infinitely small steps (but not continuously) in a network with an unbounded range of scales.
In most cases, this does not hold for complex networks and perfect fractals of limited size (e.g., a Sierpinski network of limited size). Therefore, when it comes to the numerical calculation of the limit in Eq.(5.39) and Eq.(5.43) using linear regression, which is shared by both box-covering and sandbox methods, we are able to show that the box/sandbox should grow in a regulated way that is compatible with the link weight distribution, such that the stairway effect is minimized.

The stairway effect is an immediate consequence of applying box-covering or sandbox methods to weighted complex networks of finite resolution in order to estimate the generalized fractal dimensions using linear regression. For a linear regression that minimizes the least square error (LSE) $\sum_i (y_i - \beta x_i)^2$, the stairway effect can be stated as stagnant changes in $y_i$ irrespective of the variation of $x_i$ up to a certain range. We show a simple example in Figure 5.4, where the output and input observations are made from the linear relation $y = 50 - 0.4x$. The solid line shows the perfect fit when no staircase is introduced. The two dashed lines correspond to the case where a set of unchanging observations (i.e., fake observations) is inserted between two actual $y$ observations at two different locations (denoted by $x_i$), creating two "staircases" in the plot. As a consequence of the minimization of the LSE, the fitted lines in the presence of staircases deviate from the perfect fit differently depending on the position of the staircases in the plot. We can show that the estimation error is proportional to the width of the staircase and the number of fake observations inserted.

To quantitatively perform the error analysis in a more general case, let us denote $Y = \{y_1, y_2, y_3, \dots, y_N\}$ and $X = \{x_1, x_2, x_3, \dots, x_N\}$ as observations of the output and input of a linear system. Let us assume the existence of a strictly linear dependency between $Y$ and $X$ and that the observations are perfect, i.e., error-free.
Trivially, linear regression gives a perfect estimate of the slope $\beta_0$ if only the observations in $Y$ and $X$ are considered. Since the observations are not made continuously, we can always insert fake observations as noise between any two actual observations $x_i$ and $x_{i+1}$. Let us introduce the fake observations $Y_f = \{ y'_{i_k} \mid y'_{i_k} = y_i,\ k \in [1, n] \}$ and $X_f = \{ x_{i_k} \mid x_{i_k} = k(x_{i+1} - x_i)/n + x_i,\ k \in [1, n] \}$. More precisely, we insert the fake observation pairs $(Y_f, X_f)$ between $x_i$ and $x_{i+1}$ such that a small staircase is created, because all fake $y$ observations are assumed to take the same value as $y_i$. Thus, the optimization problem can be written as:

$$\arg\min_{\beta} \sum_{k=1}^{i} (y_k - \beta x_k)^2 + \sum_{k=i_1}^{i_n} (y_i - \beta x_k)^2 + \sum_{k=i+1}^{N} (y_k - \beta x_k)^2 \tag{5.20}$$

For one-dimensional linear regression, it can be shown that the slope of the fitted line is given by

$$\beta = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2} \tag{5.21}$$

By introducing the fake observations, both $\bar{x}$ and $\bar{y}$ are affected and the new averages can be calculated as

$$\bar{x}' = \frac{N\bar{x} + \frac{n}{2}(x_i + x_{i+1})}{N + n} \tag{5.22}$$

$$\bar{y}' = \frac{N\bar{y} + n y_i}{N + n} \tag{5.23}$$

To simplify our analysis, we assume $1 \ll n \ll N$ such that $\bar{x}' \approx \bar{x}$ and $\bar{y}' \approx \bar{y}$. Therefore, we can write Eq.(5.21) as

$$\beta' = \frac{\sum_{j=1}^{N} (x_j - \bar{x})^2 + \sum_{k=1}^{n} (x_i - \bar{x} + k\Delta)(x_i - \bar{x})}{\sum_{j=1}^{N} (x_j - \bar{x})^2 + \sum_{k=1}^{n} (x_i - \bar{x} + k\Delta)^2}\, \beta_0 \tag{5.24}$$

where $\Delta = (x_{i+1} - x_i)/n$. Denote $\sigma_x^2 = \sum_{j=1}^{N} (x_j - \bar{x})^2$ and $\delta_i = x_i - \bar{x}$. We have

$$\beta' = \frac{\sigma_x^2 + \delta_i \sum_k (\delta_i + k\Delta)}{\sigma_x^2 + \sum_k (\delta_i + k\Delta)^2}\, \beta_0 = \frac{\sigma_x^2 + n\delta_i^2 + \delta_i \Delta\, n(n+1)/2}{\sigma_x^2 + n\delta_i^2 + \delta_i \Delta\, n(n+1) + \Delta^2 n(n+1)(2n+1)/6}\, \beta_0 \tag{5.25}$$

Given $n \gg 1$ and $\Delta = (x_{i+1} - x_i)/n$, we define $W_s = x_{i+1} - x_i$ as the width of the "staircase" such that

$$|\beta'| \approx \left| \frac{\sigma_x^2 + n\left(\delta_i^2 + \frac{1}{2}\delta_i W_s\right)}{\sigma_x^2 + n\left(\delta_i^2 + \delta_i W_s + \frac{1}{3} W_s^2\right)} \right| |\beta_0| = F(\delta_i, W_s, n)\,|\beta_0| \tag{5.26}$$

From Eq.(5.26), we can make the following observations: i) The slope changing factor $F(\delta_i, W_s, n)$ is a function of three factors: the deviation of $x_i$ from the average of the observations $\{x_i\}$, the width of the staircase and the number of fake observations inserted.
The impact of the staircase on the estimation of the slope is decided by the location of the staircase, i.e., the sign of $\Delta_i$. When $\Delta_i > 0$, i.e., the start of the staircase $x_i$ is greater than $\bar{x}$, $F(\Delta_i, W_s, n)$ is smaller than 1 and thus $|\beta'|$ is smaller than $|\beta|$. The introduction of fake observations, i.e., the staircase, will lead to an underestimated (overestimated) slope of the fitted line given that $\beta$ is positive (negative). However, when $\Delta_i < 0$, the influence is reversed. Of particular note, the stair effect does not affect the estimate if $\Delta_i = 0$. ii) When $\Delta_i > 0$, the biased estimate decreases (increases if $\Delta_i < 0$) as the width of the staircase $W_s$ grows. This is aligned with our intuition that if the values of the fake observations $\{y'_{i_k}\}$ get stuck at $y_i$ over a very long $x$ horizon, the fitted line tends to approach a line parallel to the x-axis. iii) Given fixed $W_s$ and $x_i$, the deviation of $\beta'$ from its actual value grows with the number of fake observations $n$, as the coefficient of the term linear in $n$ in the denominator of Eq.(5.26) is always greater than that in the numerator, rendering $F$ a monotonically decreasing (increasing) function of $n$ when $\Delta_i > 0$ ($\Delta_i < 0$).

Figure 5.2: Relative error of the estimate of $\beta$ as a function of the insertion location of the staircase and the number of fake observations considered (horizontal axis: start point of the staircase; vertical axis: relative error of the estimated slope; curves for n = 10 to n = 100).

Figure 5.3: Relative error of the estimate of $\beta$ as a function of the width of the staircase and the number of fake observations considered (horizontal axis: width of the staircase; vertical axis: relative error of the estimated slope; curves for n = 10 to n = 100).

To verify our theoretical findings, we use the same linear system $y = 50 - 0.4x$ and measure the estimation errors as a function of the staircase location, the number of fake observations and the width of the staircase.
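The staircase experiment on $y = 50 - 0.4x$ can be reproduced in a few lines. The following is a minimal sketch, not the code used for the figures: the sampling grid (21 points from 0 to 2000), the helper names `fit_slope` and `with_staircase`, and the choice of 50 fake observations are assumptions made for illustration, and the fit uses ordinary least squares with an intercept via `numpy.polyfit`.

```python
import numpy as np

def fit_slope(x, y):
    """Least-squares slope of y against x (degree-1 polynomial fit)."""
    slope, _intercept = np.polyfit(x, y, 1)
    return slope

def with_staircase(x, y, start_index, n_fake):
    """Insert n_fake flat observations between x[start_index] and x[start_index + 1]."""
    x_i, x_next, y_i = x[start_index], x[start_index + 1], y[start_index]
    k = np.arange(1, n_fake + 1)
    x_fake = x_i + k * (x_next - x_i) / n_fake   # X_f in the text
    y_fake = np.full(n_fake, y_i)                # Y_f: every fake y equals y_i
    return np.concatenate([x, x_fake]), np.concatenate([y, y_fake])

# Assumed sampling grid; the mean of x is 1000 and each gap has width 100.
x = np.arange(0.0, 2001.0, 100.0)
y = 50.0 - 0.4 * x

true_slope = fit_slope(x, y)
# Staircase starting left of the mean (x_i = 200) and right of the mean (x_i = 1400).
left_slope = fit_slope(*with_staircase(x, y, start_index=2, n_fake=50))
right_slope = fit_slope(*with_staircase(x, y, start_index=14, n_fake=50))

for name, slope in [("no staircase", true_slope),
                    ("staircase at x_i=200", left_slope),
                    ("staircase at x_i=1400", right_slope)]:
    print(f"{name}: slope={slope:.4f}, relative error={(abs(slope) - 0.4) / 0.4:+.4f}")
```

In this setting the sign of the relative error flips between the two staircase positions, which is the qualitative behavior discussed for the location of the staircase relative to the mean of the inputs.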
Figure 5.2 shows the variation of the estimation error against different insertion locations of a staircase, with the number of fake observations ranging from 10 to 100 given a fixed $W_s = 100$. As predicted by our analysis, the sign of $(|\beta'| - |\beta|)/|\beta|$ changes across $\bar{x} = 100$ and its amplitude exhibits an almost antisymmetric pattern where the error shrinks to zero as $x_i$ approaches $\bar{x}$ ($\Delta_i \to 0$) and becomes bigger otherwise. Given a fixed $x_i$, increasing the number of fake observations $n$ further biases the estimates. Figure 5.3 shows the influence of the width of the staircase, which coincides with our analytical prediction.

5.3.2 Finite resolution and compatible growth rule

The above-mentioned insertion of fake observations can be understood as an interpolation or oversampling process when a system with finite resolution is measured. For instance, in a linear system, except for the case wherein input and output are decoupled in the system of interest, a staircase will be created in the measurement if the sampling rate is not compatible with the changing rate of the output, e.g., the sampling rate is much greater than the rate at which the system actually changes its state. The changing rate of the system in this example intrinsically determines its "resolution".

Similarly, a fundamental difficulty in extending the use of the box-covering or sandbox method to the determination of multi-fractality in a complex network lies in its finite resolution. For a geometric multi-fractal/fractal object embedded in Euclidean space, the probability measure is well-defined on a continuous interval $(0, L]$ where $L$ is the length of the object. Alternatively stated, it is possible to grow the box size continuously and retrieve the estimates from a linear regression that considers all obtained measurements. However, the distance metric of a complex network is discrete and the probability measure is not defined on a continuously spanning $l$ horizon, i.e., for any $l_i$ and $l_{i+1} \in (0, L]$, there exists an infinite subset $L_i = \{l_{i_k} \mid l_{i_k} \in (l_i, l_{i+1})\}$ on which the probability measure is a constant. In other words, we cannot distinguish between the elements of this subset based on their associated probability measure, hence the resolution is finite. As a consequence, we observe staircases that correspond to the measurements on these subsets $L_i$ if we directly apply the box-covering or sandbox method with the box size (or equivalently the radius of a sandbox; we use only the term "box size" for short) growing continuously. The presence of staircases, as we discussed earlier, introduces bias into the estimates of the generalized fractal dimension.

Figure 5.4: A case study of the stairway effect. A linear relation $y = 50 - 0.4x$ is observed on a set of inputs $x$. Two sets of unchanging $y$ observations are manually inserted between two actual measurements to create "stairs" (staircase 1 at $x_i = 200$, staircase 2 at $x_i = 1400$). The introduction of such stairs biases the linear regression and causes estimation errors.

Obviously, we can minimize the stairway effect if we can aptly choose a growth strategy to scale the box size $l$ such that no element in $\cup L_i$ is chosen as the size of a box. We call such a strategy, if it exists, a compatible growth rule. For unweighted networks, a compatible growth rule can be easily found by increasing the box size in a discrete way, i.e., adding one each time. This is feasible because even though the resolution of an unweighted network is limited, it is homogeneous across the network, i.e., the distance between any directly connected pair of nodes is identical. However, such a property does not hold for a weighted complex network due to the distribution of link weights.
Searching for a compatible growth rule for box size scaling in a weighted network is much more difficult and sometimes impossible. To give some intuition, let us look at the sandbox method. Formally, for a weighted network $G = (V, E)$, let us assume $v_i \in V$ as the center of the sandbox and $v_j \in V$ as a random node. Denote $d_{i,j}$ as the shortest path length between $v_i$ and $v_j$. The shortest path distribution $F_{d_{i,j}}(l) = P\{d_{i,j} \le l\}$ has a discrete support set $L_i = \{l_k \mid F_{d_{i,j}}(l_k) \ne F_{d_{i,j}}(l_{k'}), \forall k \ne k'\}$ defined by $G$. A compatible growth rule of $v_i$ on $G$ is thus a strictly ordered set $L_{<,v_i} = \{l_1, l_2, \ldots, l_n\}$ where $l_{k+1} > l_k$. Therefore, the compatible sandbox growth rule $L(G)$ on $G$ is defined by $L(G) = \bigcap_i L_{<,v_i}$ where $v_i$ is the center of the sandbox. It should be noted that it is always possible to find a compatible growth rule for a sandbox centered at a fixed point by growing its size by its unique distances to all other nodes. However, there is no guarantee that this growth rule is compatible for another sandbox centered differently. A compatible growth rule requires that the support of the path length distribution is shared among all choices of sandbox centers. Apparently, if the choice of sandbox in Eq.(5.43) (see the FBCw and FSBw description in the Methods section) is randomized over all the nodes in an unweighted graph $G$, $L(G) = \{1, 2, 3, \ldots, d_{min}\}$ where $d_{min}$ is the shortest-path length between the most distant pair of nodes of $G$. However, it can be practically difficult to find, in a weighted network of rich heterogeneity, an $L(G)$ that i) is shared by a sufficiently large subset of $V$ to be mathematically consistent with the estimate obtained by the box-covering method and ii) provides a large set of samples of $l$ to numerically calculate the limit in Eq.(5.43).
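The definition $L(G) = \bigcap_i L_{<,v_i}$ can be made concrete with a short sketch. The code below is an illustration under stated assumptions rather than the FSBw procedure: the adjacency-dictionary graph format, the helper names `shortest_path_lengths`, `compatible_growth_rule` and `undirected`, and the two toy graphs are all hypothetical, and integer weights are used so that distances can be compared exactly (real-valued weights would need a tolerance).

```python
import heapq

def shortest_path_lengths(graph, source):
    """Dijkstra: shortest weighted path length from source to every reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def compatible_growth_rule(graph):
    """Sorted box sizes shared by the distance supports of all sandbox centers."""
    common = None
    for center in graph:
        lengths = shortest_path_lengths(graph, center)
        support = {d for node, d in lengths.items() if node != center}
        common = support if common is None else common & support
    return sorted(common or [])

def undirected(edges):
    """Build an adjacency dictionary from (u, v, weight) triples."""
    g = {}
    for u, v, w in edges:
        g.setdefault(u, {})[v] = w
        g.setdefault(v, {})[u] = w
    return g

# Unweighted 4-cycle: every center sees the distances {1, 2}.
cycle = undirected([("a", "b", 1), ("b", "c", 1), ("c", "d", 1), ("d", "a", 1)])
# Weighted path a-b-c: the centers see {1, 3}, {1, 2} and {2, 3}.
path = undirected([("a", "b", 1), ("b", "c", 2)])

print(compatible_growth_rule(cycle))
print(compatible_growth_rule(path))
```

The weighted path shows how heterogeneous link weights can leave the intersection empty even for a three-node graph, which is the situation where no compatible growth rule exists.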
Such heterogeneity connects tightly to the skewness of the underlying link weight distribution of $G$, which governs not only the existence of $L(G)$ but also the design of a reliable algorithm to determine the multi-fractality of $G$ when $L(G)$ does not exist. For instance, the unweighted network corresponds to the case where the link weight distribution is a symmetrical delta function, hence the existence of multi-fractality is solely determined by the topological properties of the network. When the link weights are uniformly distributed, the topology of the network again determines the multi-fractality of the network. Since the weights are uniformly distributed, $L(G) = \{E[w], 2E[w], \ldots, d_{min}E[w]\}$ is a statistically compatible growth rule. In other cases, especially when the distribution is highly skewed, a poorly designed algorithm without awareness of the link weight distribution leads to significant estimation errors in Eq.(5.39) and Eq.(5.43), rendering the basis of multi-fractal analysis questionable. These estimation errors are well explained by our analytical findings: an incompatible growth rule in the presence of a skewed link weight distribution leads to stagnant observations in spite of the growing box size $l$, i.e., the staircase. Wider staircases produce an underestimated slope, hence biased estimates of multi-fractality. Both BCANw and SBw, which were previously proposed for the numerical determination of multi-fractality in weighted networks, fall into this category as they both rely on an incompatible growth rule and do not consider the skewness of the link weight distribution. We show in the following discussion their disadvantages compared to our proposed algorithms. To make a fair comparison under the same experimental setting with a known ground truth about the multi-fractality of the study object, we choose the Sierpinski fractal network family as our target network.
We conduct a comparative error analysis across all four algorithms when they are used to estimate the dominant fractal dimension of the target networks, as the numerical determination of the dominant fractal dimension serves as the very basis for multi-fractal analysis.

5.4 Comparative Analysis of Estimation Methods

5.4.1 Intrinsic estimation bias of BCANw and SBw

To corroborate our argument, we first present two numerical experiments where a simple yet incompatible growth rule is applied to both the box-covering and sandbox methods when used for the determination of the dominant fractal dimension of the Sierpinski network $G_5 \in \mathcal{S}$ with $b = 3$, $s = 1/2$. The incompatible growth rule increases the size of the box linearly by accumulating a fixed step length. We show in Figure 5.5 the estimated fractal dimension using a box-covering method with this growth rule. The theoretical fractal dimension is $-\ln(b)/\ln(s) \approx 1.585$. As predicted, the staircases are present throughout the scales of $l$ considered. To show their impact on the estimates, we plot the fitted line given by the linear regression on the collected measurements and a reference line with the theoretical slope. As one notices in the figure, the staircases drive the estimates to deviate from the theoretical slope. We can make similar observations in Figure 5.6, which shows the estimated dominant fractal dimension of $G_5$ using the sandbox method following the same growth rule. The staircases correspond to incompatible choices of the box size $l$ from the set $\cup L_i$, for which the probability measure is not defined. As a result, the measure $\mu(B(l))$ remains stagnant irrespective of changes in $l$, introducing estimation errors when linear regression is performed in Eq.(5.39) and Eq.(5.43) to determine numerically the multi-fractal spectrum. These two simple examples verify our analytical prediction of the estimation error when an incompatible growth rule is enforced.

Figure 5.5: Observation of the staircase effect in the determination of the dominant fractal dimension of $G_4$ of the Sierpinski fractal network family using the box-covering method with an incompatible linear growth rule (axes: $\log \sum \mu(B_i(l))$ against $\log(l/L)$; slope of the line fitted by box-covering with the linear growth rule: -1.339; slope of the perfectly fitted line: -1.585).

Figure 5.6: Observation of the staircase effect in the determination of the dominant fractal dimension of $G_4$ of the Sierpinski fractal network family using the sandbox method with an incompatible linear growth rule (axes: $\log \sum \mu(B_i(l))$ against $\log(l/L)$; slope of the perfectly fitted line: -1.585).

In what follows, we show that neither BCANw nor SBw is immune to such an incompatible growth rule. Again, we use $G_5$ as a case study for comparison with the simple settings of our first set of experiments. In contrast to a linear growth rule, both BCANw and SBw employ a growth rule $L_< = \{w_1, w_1 + w_2, \ldots, \sum_i w_i, \forall w_i \in W(G)\}$ where $W(G) = \{w_1, w_2, w_3, \ldots, w_n\}$ is an ordered set of all the weights of $G$ such that $w_k \le w_{k+1}$ for all choices of $k$. We apply BCANw and SBw to estimate the dominant fractal dimension of $G_5$. To verify the existence of the staircase effect when applying BCANw and SBw, we show two case studies in Figure 5.7 and Figure 5.8.

Figure 5.7: Observation of the staircase effect in the determination of the dominant fractal dimension of $G_4$ of the Sierpinski fractal network family using BCANw (axes: $\log \sum \mu(B_i(l))$ against $\log(l/L)$; slope of the line fitted via BCANw: -1.375; slope of the perfectly fitted line: -1.585).
As one can notice, accumulating the link weights to grow either the box or the sandbox is still not compatible with the Sierpinski fractal network $G_5$. We observe in both experiments that the staircases introduce a large bias (1.375 and 1.327 compared to 1.585) in the numerical determination of the limits in Eq.(5.39) and Eq.(5.43). The failure to accurately calculate them translates directly to an unreliable estimation of the multi-fractal spectrum.

Figure 5.8: Observation of the staircase effect in the determination of the dominant fractal dimension of $G_4$ of the Sierpinski fractal network family using SBw (axes: $\log \sum \mu(B_i(l))$ against $\log(l/L)$; slope of the line fitted via SBw: -1.3269; slope of the perfectly fitted line: -1.585).

Moreover, we argue that the estimation errors recognized in Figure 5.7 and Figure 5.8 do not come from "infrequent anomalies" of the experiments. Given a fixed size of the network, repeating the experiments using box-covering or averaging the result over an increased set of sandbox centers does not fundamentally compensate for the error introduced by BCANw and SBw ignoring the finite resolution and link weight distribution of the target network. We performed BCANw with a random choice of node coloring order and repeated the experiment from 1000 to 11,000 times with a step length of 200 to obtain the averaged number of boxes needed to cover the graph, which avoids the bias introduced by a deterministic ordering. It should also be noted that we take the average number of boxes for a practical reason: it represents the average performance of BCANw when it is computationally impossible to repeat the experiments indefinitely to obtain the minimal number of boxes for large-scale networks. We performed SBw with a random choice of the center of the sandbox from 5% to 100% of the nodes in $G_5$ with a step length of 1.9%. For each step, the results are averaged over 1000 trials.
To illustrate the importance of awareness of the finite resolution and link weight distribution on the estimation algorithm, we also performed the proposed FBCw and FSBw with the same settings as BCANw and SBw. The results are plotted in Figure 5.9. We show the estimation errors of the four algorithms normalized against the theoretical dominant fractal dimension. For box-covering based algorithms (BCANw and FBCw, blue lines), we plot the error against different numbers of trials (1000 to 11000). For sandbox-based algorithms (SBw and FSBw, orange lines), the error is plotted against utilization of total number of nodes used as the candidates for the sandbox center.

Figure 5.9: Normalized estimation errors of dominant fractal dimension of $G_5$ ($b = 3$, $s = 1/2$) under different i) numbers of the repeated trials for box-covering-based methods (BCANw and FBCw) and ii) utilization of nodes as sandbox center for sandbox-based methods (SBw and FSBw). Averaging the estimations over an increasing number of box-covering trials or nodes used as sandbox centers brings trivial improvement to the intrinsically biased estimation of BCANw and SBw. The proposed FBCw and FSBw provide better accuracy by addressing the finite resolution and the skewness of the link weight distribution of the weighted complex network. (Axes: normalized estimation error against number of trials, 1000 to 11000, and utilization, 0 to 100%.)

Several key observations can be made for BCANw and SBw: i) As the averaging is performed over an increasing number of trials or sandbox centers, the estimation error is improved slightly. For BCANw, this can be understood as the randomization helps remove the bias of the ordering by which we check the nodes to assign box ID.
For SBw, the improvement in estimation error is more significant (from 19.3% to 17%) and the error eventually approaches that of BCANw. This is well aligned with Eq.(5.43) in that the randomized choice of the sandbox is the necessary condition for the equivalence of the sandbox method to the box-covering method. ii) Even though improvements in estimation accuracy are observed for both BCANw and SBw as the averaging helps remove the random bias, they are trivially small. The randomization is not able to fundamentally compensate for the estimation errors introduced by their incompatible growth rules. Therefore, the staircase effect from the case studies presented in Figure 5.7 and Figure 5.8 is the intrinsic estimation bias of both algorithms. iii) In contrast, the proposed FBCw and FSBw algorithms consistently outperform BCANw and SBw by a large margin. The worst-case normalized estimation error of FBCw is less than 4% and that of FSBw is less than 7%.

The experimental results show that the state-of-the-art BCANw and SBw methods are not immune to errors due to the influence of finite resolution and link weight distribution, hence they suffer from an intrinsic estimation bias. It should be noted that these biased estimations can be noticed only if we know the ground truth of the multi-fractality of the network of interest. Such ground truth can hardly be reached if i) we have no access to estimation approaches with an optimality guarantee (e.g., optimal box-covering or sandbox methods with a compatible growth rule) and/or ii) the underlying mechanism that regulates the growth of the network is unknown or changing over time (e.g., non-deterministic). For weighted fractal networks that extend themselves based on simple rules (e.g., the Sierpinski fractal family), our theoretical multi-fractal analysis shows that it is possible to develop an optimal approach based on which the ground truth (i.e., the theoretical multi-fractality) can be obtained.
However, for most real-world weighted complex networks there is no such ground truth against which we can compare our estimation of multi-fractality, and it is practically very difficult to develop algorithms with optimality guarantees. As a consequence, the bias, which is very likely to exist when BCANw and SBw are used, can hardly be identified. This leads to unreliable conclusions about the target networks and to an urgent need for reliable numerical estimation approaches. As a case study, Figure 5.9 already showed the advantage of the proposed algorithms in estimating the fractal dimension of $G_5$. In what follows, we present a more comprehensive set of comparative analyses of the proposed FBCw and FSBw against BCANw and SBw.

5.4.2 FBCw and FSBw for weighted complex network of finite resolution

To further validate the proposed FBCw and FSBw based on the known ground truth about the fractality of the target network, we consider the Sierpinski fractal family with variations in the size of the graph and the skewness of the link weight distribution. Formally, the skewness of a distribution is a measure of the asymmetry of the probability distribution of a real-valued random variable about its mean. We introduce Pearson's moment coefficient of skewness as the measure of the asymmetry in link weight distributions as follows:

$$\gamma = \frac{E[(X - \mu)^3]}{\left(E[(X - \mu)^2]\right)^{3/2}} = \frac{\kappa_3}{\kappa_2^{3/2}} \tag{5.27}$$

where $\kappa_t$ is the $t$-th cumulant. A probability distribution with positive skewness usually has a longer right tail, or the mass of the distribution is concentrated on the left of the distribution. We extensively measured the skewness of the link weight distribution of $G_5$ and $G_8$ under different copy factors ($b = 2$ to $8$) and scaling factors $s$ ranging from $0.95$ to $4.5 \times 10^{-4}$. We report the results in Figure 5.10. We can observe that: i) the skewness of the link weight distribution of the Sierpinski fractal network increases as the scaling factor decreases and the size of the network grows.
Figure 5.10-(a) shows the skewness of the link weight distribution of $G_5$. A smaller scaling factor leads to a more asymmetrical distribution. A large copy factor further amplifies this skewness by placing more mass on the left side of the distribution (i.e., links with small weights). Similarly, Figure 5.10-(b) shows a significantly increased skewness in large networks compared to Figure 5.10-(a). ii) For the same copy factor, the skewness of the link weight distribution tends to converge as the scaling factor decreases. The copy factor $b$ dominantly influences the skewness of the link weight distribution when the scaling factor is small. iii) The skewness of the link weight distribution does not increase linearly as the scaling factor decreases. A transition phenomenon can be observed as the scaling factor decreases. The skewness grows much more slowly at large $s$ and seems insensitive to the change of the copy factor. This observation suggests a potential underlying phase transition of the Sierpinski fractal networks as the scaling factor decreases.

To further understand the observed transitional phenomenon, we show a case study by associating the scaling factor with the free energy of a multi-fractal and studying its first-order discontinuity. We observe that there exists a critical scaling factor $s$ that describes a transition from a mono-fractal phase to a multi-fractal phase (when considering networks of limited size) of the Sierpinski fractal network. More specifically, the skewness of the link weight distribution rises quickly when the scaling factor $s$ decreases below a turning point, which suggests that a critical exponent can potentially be associated with this point to describe the transition from a mono-fractal phase to a multi-fractal phase (when considering networks of limited size) of the Sierpinski fractal network.
Based on the multi-fractal analysis of the Sierpinski fractal network, the network converges to a perfect mono-fractal if it grows at the same speed as we shrink the box size $l$. The growth rate of the Sierpinski network is fundamentally determined by the scaling factor $s$. Therefore, we expect that a proper choice of $s$ in a given network $G_i$ of limited size will serve as a phase transition point where the Sierpinski network family turns from a mono-fractal to a multi-fractal as a result of violating our assumptions, i.e., a growth rate identical to that of the box size in a graph of unbounded size.

In fact, we can analyze this transition from the perspective of thermodynamic free energy and investigate whether there exists a discontinuity in the derivatives of the free energy function associated with the multi-fractal spectra of Sierpinski fractal networks [1]. The Ehrenfest classification [3] labels a phase transition by the lowest derivative of the free energy that is discontinuous at the transition. For instance, a first-order phase transition exhibits a discontinuity in the first derivative of the free energy with respect to some thermodynamic variable. In analogy to this thermodynamic concept, the free energy of a multi-fractal is defined as the mass exponent $\tau(q, s)$ that characterizes the scaling dependence of the partition function $Z(l; q, s)$ on the size $l$ of each partition (see [1,2] for a detailed analysis).

$$Z(l; q, s) = \sum_i \mu(B_i(l); s)^q \sim \left(\frac{l}{L}\right)^{\tau(q, s)} \tag{5.28}$$

We include $s$ in the partition function and the mass exponent as they are determined by the scaling factor $s$ if we fix the size of the given Sierpinski fractal network. Just as thermodynamics classifies phase transitions based on the behavior of the thermodynamic free energy as a function of other thermodynamic variables, we introduce a set of experiments to investigate the first-order phase transition of the free energy of the Sierpinski fractal network as a function of $s$.
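The way a mass exponent is read off the scaling relation in Eq.(5.28) can be illustrated on an object whose exponent is known in closed form. The sketch below does not use a Sierpinski network: it swaps in a binomial multiplicative cascade on the unit interval (an assumption chosen because $\tau(q) = -\log_2(p^q + (1-p)^q)$ is known exactly), with $L = 1$, dyadic box sizes $l = 2^{-k}$, and hypothetical helper names `cascade_measure` and `mass_exponent`.

```python
import numpy as np

def cascade_measure(p, levels):
    """Box measures of a binomial cascade on [0, 1] after `levels` dyadic splits."""
    mu = np.array([1.0])
    for _ in range(levels):
        # Each box passes a fraction p of its mass to the left half, 1 - p to the right.
        mu = np.kron(mu, np.array([p, 1.0 - p]))
    return mu

def mass_exponent(p, q, max_level=10):
    """Slope of log Z(l; q) against log(l / L), with L = 1 and l = 2**-level."""
    log_l, log_z = [], []
    for level in range(1, max_level + 1):
        mu = cascade_measure(p, level)
        log_l.append(-level * np.log(2.0))
        log_z.append(np.log(np.sum(mu ** q)))   # partition function Z(l; q)
    slope, _intercept = np.polyfit(log_l, log_z, 1)
    return slope

qs = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
tau_mono = np.array([mass_exponent(0.5, q) for q in qs])    # uniform measure
tau_multi = np.array([mass_exponent(0.3, q) for q in qs])   # uneven measure

# Second differences over equally spaced q vanish only for a linear tau(q).
print("mono-fractal tau(q): ", np.round(tau_mono, 4))
print("multi-fractal tau(q):", np.round(tau_multi, 4))
print("max |second difference|:",
      np.abs(np.diff(tau_mono, 2)).max(), np.abs(np.diff(tau_multi, 2)).max())
```

For the uniform measure the fitted exponent is linear in $q$, whereas the uneven measure gives a curved $\tau(q)$; this linear versus nonlinear dependence on $q$ is the mono-fractal versus multi-fractal criterion.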
In the experimental setting, we choose $G_4$ from the Sierpinski fractal network family. $G_4$ has 364 nodes with a copy factor $b = 3$. We consider a series of decreasing scaling factors $s_i = 0.95^i$, where $i \ge 1$ is the index of the scaling factor. As a case study, we measured $\tau(q, s_i)$ for different $s_i$ with a fixed $q = 2$. We plot $\tau(q, s_i)$ against $s_i$ in Figure 5.13 to study the existence of a discontinuity. Figure 5.13.(a) suggests a sudden transition in the free energy of the Sierpinski fractal network $G_4$ when we decrease the scaling factor $s$ below $s = 0.7738$. This can be further verified in Figure 5.13.(b), where the first-order derivative of the free energy has a numerical singular point between $s = 0.7738$ and $s = 0.7351$. These two observations lead us to believe that a first-order discontinuity exists in the free energy function of the Sierpinski network $G_4$, i.e., a first-order phase transition.

To further corroborate our claim that this phase transition suggests a change from a mono-fractal phase to a multi-fractal phase of the Sierpinski fractal network, we investigate the dependence of the free energy (mass exponent) $\tau(q)$ on the distorting exponent $q$ over the range of $s$ of interest, from $0.9025$ to $0.7351$, and report it in Figure 5.14. By definition, the free energy (mass exponent) $\tau(q)$ of a mono-fractal exhibits a linear dependence on $q$, as opposed to a nonlinear dependence on $q$ for a multi-fractal. We can observe that the free energy $\tau(q)$ changes from a linear dependence on $q$ for $s$ greater than $0.7351$ to a nonlinear dependence on $q$ for $s = 0.7351$. This indicates that $G_4$ stays a mono-fractal before this critical point and then transits to a multi-fractal beyond it. This observation coincides with Figure 5.13, and altogether they experimentally verify our claim that there exists a critical scaling factor $s$ that leads to a phase transition.
It should also be noted that this critical point might not precisely coincide with the transition point of the skewness of the link weight distribution, and that it is also influenced by the size of the network (e.g., the critical $s$ for $G_4$ analyzed in the experiment might not coincide with that of $G_5$).

Figure 5.10 also shows that the skewness of the link weight distribution of the Sierpinski fractal network is affected by the size of the graph, the copy factor $b$ and the scaling factor $s$. In order to understand how this skewness impacts the numerical determination of the multi-fractality and to compare our proposed FBCw and FSBw with BCANw and SBw, we present two comparative experiments. In the first experiment, we consider a set of Sierpinski family members ranging from $G_3$ (39 nodes) to $G_8$ (9840 nodes) given the fixed copy factor $b = 3$ and scaling factor $s = 1/2$. The estimated fractal dimensions are reported in Figure 5.11 for BCANw (blue line), SBw (orange line), FBCw (yellow line) and FSBw (magenta line), respectively. For comparison, we also show the theoretical dominant fractal dimension of the target networks with the dashed line. Based on the results in Figure 5.11, we can make the following observations: i) The proposed FBCw and FSBw are much less sensitive to the size of the target graph than BCANw and SBw. The normalized estimation errors of FBCw and FSBw performed on $G_3$ with only 39 nodes are 5.24% (averaged estimated fractal dimension = 1.50) and 6.56% (averaged estimated fractal dimension = 1.48), respectively. In contrast, the estimation errors of BCANw and SBw are 25.6% (1.18) and 22.4% (1.23), respectively. This property of the proposed FBCw and FSBw is very important in practice when they are used as the basis of multi-fractality analysis on real networks, for which we have neither a ground truth to reason about the estimation error nor scaling methods to improve the accuracy.
It is critical to have algorithms that place no strict constraint on the target network and deliver reliable estimates in various settings. ii) As the size of the graph grows, the accuracy of all four algorithms improves as a consequence of more observations being available to perform the linear regression. This is aligned with Eq.(5.39) and Eq.(5.43) in that the numerical calculation of the limit in both equations is asymptotically equal to the theoretical value given that the linear regression is performed on a network member $G_\infty$ of the Sierpinski fractal network with unbounded size. However, it should also be noted that BCANw and SBw still suffer from significant estimation errors compared to the theoretical value in spite of a large-scale target network (e.g., $G_8$). Combined with Figure 5.10, one primary influencing factor is the increased skewness of the link weight distribution of a larger network, which worsens the performance of box-covering and sandbox algorithms with no compatible growth rule. In contrast, FBCw and FSBw quickly converge to the theoretical value with very small errors.

To further corroborate our discussion on the adverse impact of a skewed link weight distribution on the accuracy of BCANw and SBw, we present the second set of experiments. The experimental setup is motivated by the observation we made in Figure 5.10 that the skewness of the link weight distribution is dominantly affected by the copy factor $b$. Therefore, we choose a member network from the Sierpinski fractal network family as the seed network. We adopt different values for the copy factor $b$ with a fixed scaling factor $s = 1/3$ to generate an array of fractal networks. Then BCANw, SBw and the proposed FBCw and FSBw are employed to estimate the dominant fractal dimensions of all generated networks. Due to the constraint of computing power, we choose $G_5$ as the seed network and the copy factor ranges from 2 (62 nodes) to 8 (37448 nodes).
We report in Figure 5.12 the normalized estimation error against the corresponding skewness of the distribution for BCANw (blue line), SBw (orange line), FBCw (yellow line) and FSBw (magenta line), respectively. Figure 5.12 can be interpreted as follows. First, by increasing the copy factor one can notice a further skewed link weight distribution. As a result, the estimation accuracy of BCANw and SBw degrades accordingly. The normalized estimation error of BCANw grows from 6.79% ($\gamma = 4.0103$, $b = 2$) to 15.92% ($\gamma = 20.2235$, $b = 8$). Similarly, the degradation of estimation accuracy of SBw is worse than BCANw. The error increases from 10.01% to 17.2%. Second, the performance of FBCw and FSBw is not adversely impacted by the increased $\gamma$. Interestingly, the accuracy is improved as the copy factor increases. This improvement is discussed in our first set of experiments as a result of larger set of observations obtained for more reliable calculation of limit in Eq.(5.39) and (5.43). It is very important to note that both BCANw and SBw underestimate the dominant fractal dimension, which is predicted by our analytical findings in estimation errors analysis section. The incompatible growth rule of BCANw and SBw gives rise to the larger set of stagnant observations (i.e., wider staircase) when the skewness is positively higher. A network with higher positive skewness of link weight distribution has more links with smaller weights. SBw grows the sandbox by accumulating the link weights in an ascending order. In presence of highly (and positively) skewed link weight distribution, it might take a large number of iterations to grow from $l_i$ to $l'_i$ such that the probability measure $\mu(B_i(l_i))$ is not equal to $\mu(B_i(l'_i))$. All the observations generated between $l_i$ and $l'_i$ become stagnant observations or staircases.
The more positively skewed the distribution is, the wider the staircases become and the more likely SBw is to underestimate the fractal dimension of the network, which agrees with our observations in Figure 5.12. For a similar reason, even though BCANw grows the box size by accumulating the unique distances in ascending order, we have seen in Figure 5.7 that BCANw cannot eliminate the staircase effect. It is therefore prone to underestimating the fractal dimensions as the link weight distribution of the network becomes more skewed.

We have validated the proposed FBCw and FSBw by showing that they estimate fractality more accurately than the established BCANw and SBw for weighted complex networks. In the following discussion, we employ the proposed FSBw and FBCw for the numerical identification of multi-fractality in a set of real-world complex networks.

5.5 Multi-fractal analysis of real networks

5.5.1 Vision and objectives of the multi-fractal analysis

Multi-fractality is deeply rooted in the intrinsic heterogeneity of networks. More specifically, the non-uniformity of the network structure serves as a major source of a spectrum of distinct self-similarities embedded in different regions of the network at a variety of scales. These embedded heterogeneous self-similarities can be identified through multi-fractal analysis. Intuitively, multi-fractal analysis can be understood as a microscope with an array of distorting filters that pick up a set of distinct scaling behaviors from the corresponding parts of the network as the distortion factor $q$ changes. A perfect geometric or topological fractal (e.g., a fractal network) shares the same scaling behavior (i.e., the dominant fractal dimension) regardless of $q$, suggesting a consistent self-similarity across the network.
Such geometric or topological consistency in self-similarity is usually the result of a common underlying growth rule throughout the scales of the network considered (e.g., the Sierpinski fractal network). However, such a well-preserved growth rule is rarely found (e.g., non-fractal networks) or is inconsistent (e.g., coexistence of small-world and fractal properties with phase transitions) in real-world networks due to the complicated network formation process. This generation process gives rise to the intrinsic heterogeneity in both the structural (e.g., network clusters, communities, hubs) and dynamical (e.g., network control, robustness) aspects of real-world networks. In the following discussion, we focus on the structural aspects of a set of weighted real-world networks to answer three key research questions: i) Can multi-fractal scaling behaviors be observed in the target network, and how can they be exploited to improve our understanding of the structural properties of real networks? ii) What is the contribution of link weights to such scaling behaviors, if verified in (i), and how does a change of link weights fundamentally impact the observed multi-fractality? iii) How can the identification of multi-fractality be leveraged to supplement our characterization of real-world complex networks and provide a novel perspective and a practical probe for unveiling their under-explored structural organization?

To study these questions, we choose two weighted real-world networks. The first weighted network is a scientific collaboration network in astrophysics with 16705 nodes and 111252 edges. Each node represents an author, and an edge connects two nodes if the authors published one or more papers together. The weight between any pair of nodes is determined by

$$ w_{i,j} = \sum_k \frac{\delta_{i,k}\,\delta_{j,k}}{n_k} \qquad (5.29) $$

where $n_k$ is the number of authors of the $k$th paper, and $\delta_{i,k} = 1$ only if author $i$ co-authored the $k$th paper and is 0 otherwise.
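As a minimal sketch of how the collaboration weights of Eq. (5.29) could be computed, the following assumes that each paper is given as a list of author identifiers and that the denominator is the number of authors $n_k$ as stated in the text. The function name `collaboration_weights` and the input format are hypothetical and are not taken from the original work.

```python
from collections import defaultdict
from itertools import combinations


def collaboration_weights(papers):
    """Return {(i, j): w_ij}, where w_ij sums 1 / n_k over shared papers k.

    `papers` is an assumed input format: one list of author ids per paper.
    """
    weights = defaultdict(float)
    for authors in papers:
        unique_authors = sorted(set(authors))  # ignore a name listed twice
        n_k = len(unique_authors)
        # Each unordered pair of co-authors of paper k contributes 1 / n_k.
        for i, j in combinations(unique_authors, 2):
            weights[(i, j)] += 1.0 / n_k
    return dict(weights)


# Toy data: authors "a" and "b" share a 2-author and a 3-author paper.
papers = [["a", "b"], ["a", "b", "c"], ["c", "d"]]
weights = collaboration_weights(papers)
```

In this toy example the pair ("a", "b") receives 1/2 + 1/3, so frequent collaboration on papers with few authors yields a larger weight, while a pair that never co-authored a paper has no edge.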
The weight quantifies how frequently and closely two authors collaborate. The second weighted network comes from the Budapest Reference Connectome v3.0, which generates the common edges of connectomes with 1015 vertices, computed from the MRI of the 477 subjects of the Human Connectome Project's 500-subject release. For each edge $e_{i,j}$, the weight $w_{i,j}$ is based on the electrical connectivity of the two nodes and is calculated as the number of fibers $n$ divided by the average fiber length $l$.

5.5.2 Space-localized multi-fractal scaling

To address the three major research questions, we consider three sets of experiments. In the first set of experiments, we study whether the target weighted networks from two different domains show any fractal or multi-fractal scaling behavior. Towards this goal, we applied the proposed FBCw to both networks in order to learn the scaling dependence expected by Eq. (5.39) when the distorting factor $q$ is varied within a finite range from $-10$ to $10$ with a step of 0.1. A key observation about Eq. (5.39) is that the distorting factor $q$ primarily serves to identify the non-uniformity of the probability measure $\mu(B)$ defined on the support of the weighted network. The non-uniformity of the measure and its distinct scaling dependence over the scales $l$ of interest determine the existence and properties of multi-fractality in the target network. If instead the measure $\mu(B)$ is uniform at all scales, Eq. (5.39) is not affected by the choice of $q$ and therefore captures only the mono-fractality of the support. Motivated by these observations, we first look at the weighted collaboration network and report in Figure 5.15 the distribution of the measure $\mu(B(l))$ over the partitions (i.e., boxes) at different scales (i.e., box sizes) to give an undistorted overview of the non-uniformity of the measure.
Several key observations follow: i) The distribution of the measure $\mu(B(l))$ changes from a near-uniform distribution to a peaked shape as the scale increases. Stated differently, the probability of finding any node in a given box at a specific scale $l$ is also a function of the choice of that box. At almost every scale, there exists a partition that contains the dominant number of nodes. This scaling skewness of the measure strongly suggests structural heterogeneity of the target network and serves as a necessary condition for the emergence of multi-fractality. ii) The rightmost X-axis boundary of the measure distribution marks the minimal number of partitions of scale $l$ required to cover the target network. Learning the shrinking law of these boundaries gives a straightforward way to verify the existence of fractal scaling behavior. More precisely, the rightmost boundaries of the measure do not shrink as quickly as an exponential function but follow a power law (which is much slower, as indicated in Figure 5.15). Together, these two observations demonstrate the existence of multi-fractal behavior in the collaboration network.

To further investigate the multi-fractality of the collaboration network, we performed a follow-up experiment to report the scaling dependence of $\sum \mu(B)^q$ on the normalized box size $l/L$ in a log-log plot under different distorting factors $q$. For ease of visualization and readability, we construct the plots using only the cases where $q$ is integer-valued. The results are plotted in Figure 5.16. We can observe that the logarithm of the distorted accumulative measure, $\log(\sum \mu(B)^q)$, has a linear dependence on the normalized scale $\log(l/L)$ that is almost immune to changes of negative $q$, suggesting a mono-fractal scaling behavior. For positive $q$, the linear dependence still holds but changes remarkably as a function of $q$, which is an indicator of the existence of multi-fractality.
To understand this, we need to link this observation with Eq. (5.39). A negative distorting factor $q$ places greater weight on the partitions with smaller measures, whereas a positive $q$ does the opposite. In other words, we are able to learn the distinct scaling dependence of different regions of the measure distribution, which in turn correspond to different parts of the target network. In our case, when $q$ is positive, the observed multi-fractal scaling dependence corresponds to the partitions of the collaboration network with dominant probability measures. In contrast, the mono-fractal scaling behavior is strongly related to partitions with small probability measures. The two sets of distinct scaling behaviors not only verify the multi-fractal scaling dependence of the target network but also suggest a co-existence of multi-fractality and mono-fractality in the same network, each belonging to different parts of it. To understand this, we need to look at how the partition is done when tiling the target network with the box-covering method. Eq. (5.39) holds only if the covering is optimal (i.e., uses the minimal number of boxes, see Definition 5). To achieve this, each box has to be as compact as possible such that it covers the maximal possible number of nodes in a connected component. Such connected components might coincide with regions of the network that are highly clustered, such that their scaling follows power laws characterized by different exponents, hence exhibiting multi-fractal behavior. In contrast, the non-compact boxes cover nodes that cannot be reached from nodes in these connected components and demonstrate a shared mono-fractal scaling. In other words, it is the intrinsic variation of the network structure that contributes to the observed co-existence of distinct scaling dependences, such that the observed multi-fractal scaling dependence is space-localized.
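The role of the distorting factor $q$ discussed above can be illustrated numerically. The following is a minimal sketch, assuming the common partition-sum form in which $\log \sum \mu(B)^q$ is regressed on $\log(l/L)$. It does not reproduce FBCw or Eq. (5.39), and it uses a synthetic binomial cascade on an interval as a stand-in for box measures rather than a real network covering. All function names are hypothetical.

```python
import math


def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def mass_exponent(measures_by_scale, rel_scales, q):
    """Slope of log(sum_B mu(B)^q) against log(l / L) for one value of q.

    measures_by_scale[i] holds the box measures observed at rel_scales[i].
    Empty boxes (mu = 0) are skipped so that negative q stays finite.
    """
    xs = [math.log(r) for r in rel_scales]
    ys = [math.log(sum(mu ** q for mu in measures if mu > 0))
          for measures in measures_by_scale]
    return fit_slope(xs, ys)


def binomial_cascade(p, depth):
    """Synthetic measures: split each box in two with mass fractions p, 1 - p."""
    measures = [1.0]
    for _ in range(depth):
        measures = [m * f for m in measures for f in (p, 1.0 - p)]
    return measures


levels = range(1, 7)
rel_scales = [2.0 ** -j for j in levels]          # l / L halves at every level
uniform = [binomial_cascade(0.5, j) for j in levels]
skewed = [binomial_cascade(0.3, j) for j in levels]

tau_uniform = {q: mass_exponent(uniform, rel_scales, q) for q in (-2, 0, 2)}
tau_skewed = {q: mass_exponent(skewed, rel_scales, q) for q in (-2, 0, 2)}
```

For the uniform cascade the ratio of the slope to $(q - 1)$ equals 1 for both $q = -2$ and $q = 2$, which is the mono-fractal case in which the choice of $q$ has no effect. For the skewed cascade the same ratio differs between the two values of $q$, which is the signature of a non-uniform measure.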
The exact mathematical explanation for the coexistence of mono-fractality and multi-fractality calls for a more sophisticated understanding of the underlying network formation mechanism, which is beyond the scope of this work and remains a future extension.

5.5.3 Scale-localized inconsistent multi-fractality of weighted real networks

We not only observed inconsistent scaling behaviors in different regions of the network. We also noticed that such scaling is not consistent over the scales of interest, even when the network exhibits the same type of scaling (mono-fractal or multi-fractal) for a fixed $q$. More specifically, we observed a finite range of scales over which a localized scaling behavior holds. To illustrate this, we picked the cases $q = -6$, $0$, and $6$ and reported the scaling dependence in a log-log plot for each of them. The results are shown in Figure 5.17. We use the dashed blue lines to show the range of scales where the fractal scaling appears and the red dashed lines to show outliers. Consequently, we can make the following two observations:

i) Figure 5.17(a)-(c) consistently show that the self-similar property does not hold at all scales of the network and might only show up in a finite range of scales. Phase-transition behavior can be observed at the boundaries of this range. Moreover, this phase-transition phenomenon also holds under a variety of distorting factors $q$. In other words, the multi-fractality of the collaboration network is scale-localized. This finding resonates with our claim at the beginning of this section that there is no common underlying growth rule for the generation of real networks that produces simple self-repeating structures at all scales. As we observed in Figure 5.16 and Figure 5.17, the self-similarity is neither spatially consistent across the network nor well preserved at all scales of the network.
ii) In such cases, it is not sufficient to have an algorithm that can reliably estimate the scaling dependence when it exists. It is equally important for the algorithm to detect the boundaries of the scales between which the scaling dependence holds and to make a localized estimate accordingly. We argue that BCANw ignores such localized fractal scaling because of its implicit assumption that the scaling behavior holds at all scales of the complex network. In contrast, our proposed FBCw is able to locate the phase-transitional scales, based on which a reliable estimate is made. To demonstrate this, we ran BCANw and FBCw on the same network under identical experimental settings. We plotted the linear functions fitted by the two algorithms in Figure 5.17(a)-(c). Biased by the implicit assumption that fractal scaling holds at all scales, BCANw tends to average the estimates by considering all the observations, irrespective of their contribution to the fractal scaling behavior. In comparison, the proposed FBCw detected the locality of the multi-fractality, ignored the outliers among the observations that do not belong to the range where fractal scaling holds, and fit a linear function only to those within it. The difference between the two fitted lines shows the estimation bias that BCANw incurs by assuming that fractal scaling holds at all scales.

We have repeated the above experiments on the Budapest human connectome network and report the results in Figure 5.18 and Figure 5.19. Figure 5.18 depicts the distribution of the probability measure of the partitions for the Budapest brain network. Similar to our observations on the collaboration network, the distribution of the measure shifts from a near-uniform distribution to a peaked non-uniform distribution, suggesting the underlying structural heterogeneity of the brain network, which serves as the major source of multi-fractality.
By learning the shrinking behavior of the rightmost boundary of the measure distribution as the size of the box increases, one can observe a power-law dependence, which is verified by the subfigure in Figure 5.18 where we plot the scaling dependence between the accumulative measure $\sum \mu(B)^q$ and the normalized scale $l/L$ on a log-log scale. Figure 5.18 agrees with our findings for the collaboration network. The Budapest brain network also exhibits a localized fractal scaling over the range delimited by a pair of dashed blue lines, suggesting an inconsistent power-law scaling behavior that is valid only over a finite range of scales. The fractal organization of brain networks is well reported in the related literature. However, few prior efforts have identified localized fractal scaling in a weighted brain network. To study the multi-fractality of the Budapest brain network, we performed the multi-fractal analysis on it and reported the scaling dependence under different choices of the distorting factor $q$ in Figure 5.19. We can identify a co-existence of mono-fractal and multi-fractal scaling similar to the one found in the collaboration network. The network regions that correspond to partitions with small measure predominantly follow a near-invariant power-law scaling dependence on the scale of the box. In contrast, the collections of densely connected nodes (e.g., connected components) compactly covered by a box show a power-law dependence that varies as $q$ changes over positive values. However, a major distinction from the collaboration network is the range of scales where such a power law holds. As one can notice, there exists a scale around $-1$, depicted by the grey dashed line, beyond which the fractal scaling is no longer respected. No linear function can accurately explain the scaling dependence after this scale. This agrees with our finding in Figure 5.18, suggesting a scale-localized fractal scaling behavior that is not globally consistent.
5.5.4 Link weight to dictate the multi-fractality

In the second set of experiments, we investigate how the link weights can fundamentally influence the underlying multi-fractality in both networks. Towards this end, we transformed both networks into binary (i.e., unweighted) networks by removing the link weight between every pair of connected nodes. From the geometrical perspective, the link weights perform a scale transformation on the graph by increasing or decreasing the length of the links when spatially embedded, while keeping its topological features intact. By removing the link weights, we study the existence of multi-fractality from a purely topological perspective and understand the role of link weights via comparative analysis. More specifically, we measured the distribution of the probability measure and studied the scaling dependence between the minimal number of boxes covering the network and the scale of the box. Figure 5.20 and Figure 5.21 summarize the results for the collaboration network and the Budapest brain network, respectively. We make the following observations: i) The distribution of the probability measure follows a similar pattern of change as the scale of the box changes, i.e., from a near-uniform shape to a highly non-uniform shape. This agrees with our claim that such non-uniformity reflects structural heterogeneity determined mainly by the topology of the network, which stays intact during our transformation. ii) However, the scaling dependence is fundamentally changed, and we observed a total loss of fractal scaling behavior. Instead, the scaling is well explained by an exponential law, indicating that both the collaboration network and the brain network behave as the well-known small-world networks. Compared with their weighted versions, this shows that link weights play a decisive role in dictating the existence of multi-fractality in real networks.
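The distinction between a power-law (fractal) and an exponential (small-world) dependence of the minimal box count on the box scale can be checked with a simple diagnostic. The sketch below is an illustrative assumption and not the procedure used in this work: it compares how well a straight line explains the logarithm of the box count against the logarithm of the box size (power law) and against the box size itself (exponential law). The function names and the synthetic data are hypothetical, and a comparison of $R^2$ values is only a crude indicator.

```python
import math


def r_squared(xs, ys):
    """Coefficient of determination of the least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return (sxy * sxy) / (sxx * syy)


def compare_scaling_laws(box_sizes, box_counts):
    """Return the R^2 of a power-law fit and of an exponential fit."""
    log_counts = [math.log(n) for n in box_counts]
    # Power law: log N is linear in log l.
    power_law = r_squared([math.log(l) for l in box_sizes], log_counts)
    # Exponential law: log N is linear in l.
    exponential = r_squared([float(l) for l in box_sizes], log_counts)
    return {"power_law": power_law, "exponential": exponential}


# Synthetic box counts (assumed data, not measurements from the thesis).
sizes = [1, 2, 3, 4, 5, 6]
fractal_counts = [1000.0 * l ** -1.5 for l in sizes]
small_world_counts = [1000.0 * math.exp(-0.9 * l) for l in sizes]

fractal_fit = compare_scaling_laws(sizes, fractal_counts)
small_world_fit = compare_scaling_laws(sizes, small_world_counts)
```

On the synthetic power-law data the power-law fit attains the higher $R^2$, and on the synthetic exponential data the exponential fit does. On real box counts the two fits can be close, so the comparison should be read together with the plots rather than on its own.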
Consequently, this finding not only calls for developing new algorithms that reliably estimate the multi-fractal characteristics of weighted complex networks, but also highlights the importance of understanding the structural implications of the identified multi-fractality. This brings us to the following discussion of the third research question of this work.

5.5.5 Localized scaling based network characterization and community detection

In the third set of experiments, we study how the multi-fractal analysis framework can be employed to characterize the complexity of networks beyond simply reporting whether they follow a multi-fractal or fractal scaling, as many previous works did. We use the multi-scale analysis to quantify the global complexity of the network from a microscopic point of view. Based on this analysis, we propose a general network characterization framework built on the localized scaling feature space, which is constructed by learning the localized scaling feature vector of each node.

We noticed in the first two sets of experiments that a real-world network is complex in the sense that there exists no common growth rule that governs the evolution of the network generation process consistently in both the scale and space domains. As in our two target examples, when a certain variation or transformation is introduced into the network, its fundamental structural behaviors are subject to remarkable mutations. No single model or characterization is sufficient to fully understand the structural variations of the network and the resulting complexity, hence calling for a set of expressive characterizing strategies that supplement each other to give an unbiased and well-quantified overview of complex networks.
We strongly believe that multi-fractal analysis is one such framework and propose that the learned localized scaling behavior (not necessarily fractal or multi-fractal) can be leveraged to quantify the structural variations of the network.

On the one hand, at the microscopic level, the complexity of the network is embedded in the form of the different chemical environments (i.e., the outer environment surrounding a given node) with which each node interacts at different scales. More intuitively, the structural variations of the network can be understood as the distinct views that a node observes through a lens of variable focal length, ranging from the minimal path distance of the network up to its diameter. If all the nodes share an identical viewing experience through such a lens, then the network has no structural variations, like an unweighted lattice that is fully characterized by its dimension. Otherwise, such microscopic differences in views at a variety of scales, when integrated collectively, translate into the observed structural variations from a global perspective, which require multifaceted characterization.

On the other hand, the multi-fractal analysis framework is exactly one such multi-scale technique for studying and quantifying the microscopic proxy of network complexity in terms of structural variations. From the mathematical point of view, Eq. (5.39) and Eq. (5.43) suggest that the structural variations of the network (i.e., structural heterogeneity and link weight distribution), which are distributed inhomogeneously and repeat locally and imperfectly (i.e., space-localized) within a finite range of scales (i.e., scale-localized), are the major contributor to the observed scaling behaviors in complex networks. Conversely, this also provides a way to characterize such structural variation by identifying and quantifying the scaling behaviors (again, not necessarily fractal and/or multi-fractal).
More specifically, the proposed SBw method for weighted complex networks is one such quantifying tool: it measures the microscopic differences in the chemical environment of a given node at varied scales by learning the localized scaling dependence from where the node is located. This underlying connection between multi-fractal analysis and the microscopic view of network complexity leads to a straightforward implementation of our proposed localized scaling based network characterization approach. More precisely, we start with a given node $k$ of the network and perform SBw centered at it with $q = 0$ to learn the scaling dependence it experiences as we increase the scale $l$ of the sandbox up to the network diameter $L$. This results in a localized scaling feature vector of tuples $S_k = [s_{k,1}^\top, s_{k,2}^\top, \ldots, s_{k,n}^\top]^\top$, where $s_{k,i} = (\log N_k(B(l_i)), \log(l_i/L))$. $S_k$ is populated by the sampled logarithmic scaling dependence between the normalized box scale $l/L$ and the number of nodes covered by the sandbox centered at $k$, hence localized. We repeat the process for every node of the network to construct a localized scaling feature space $S(G) = \{S_k \mid k \in N\}$ for the given network $G$. The localized scaling feature space $S(G)$ is uniquely spanned by the localized scaling feature vectors of the different network nodes. Its structure and properties are determined by the original network. Therefore, $S(G)$ can be leveraged to characterize the network from a scaling dependence perspective. To the best of our knowledge, this is the first time that the localized scaling behavior of a network is proposed as a quantitative profiling approach to characterize the structure of complex networks. An immediate application of this profiling approach is an easy integration with unsupervised machine learning algorithms to perform label-free detection of network communities.
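The construction of the localized scaling feature vectors described above can be sketched as follows. This is a simplified illustration under stated assumptions rather than the SBw or FSBw implementation: link weights are used directly as link lengths (the mapping from weight to distance is not specified in this section), the sandbox radii are evenly spaced up to the weighted diameter $L$, and the node count $N_k$ includes the center node. The function names and the toy graph are hypothetical.

```python
import heapq
import math


def shortest_path_lengths(graph, source):
    """Dijkstra distances from `source`; graph[u] maps neighbor -> link length.

    Node ids must be mutually comparable (e.g., all strings) for heap ties.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry, a shorter path was already found
        for v, w in graph[u].items():
            nd = d + w
            if dist.get(v, math.inf) > nd:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def scaling_feature_space(graph, num_scales=5):
    """Map each node k to [(log N_k(l_i), log(l_i / L)) for each radius l_i]."""
    all_dist = {k: shortest_path_lengths(graph, k) for k in graph}
    diameter = max(max(d.values()) for d in all_dist.values())
    # Assumption: evenly spaced sandbox radii l_1, ..., l_n with l_n = L.
    radii = [diameter * (i + 1) / num_scales for i in range(num_scales)]
    features = {}
    for k, dist in all_dist.items():
        features[k] = [
            (math.log(sum(1 for d in dist.values() if r >= d)),
             math.log(r / diameter))
            for r in radii
        ]
    return features


# Toy weighted path graph a - b - c - d with link lengths 1, 1 and 2.
graph = {
    "a": {"b": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"b": 1.0, "d": 2.0},
    "d": {"c": 2.0},
}
features = scaling_feature_space(graph, num_scales=4)
```

In the toy graph the end nodes "a" and "d" receive different feature vectors because the sandbox centered at "a" already covers two nodes at the smallest radius while the one centered at "d" covers only itself. Each per-node list can be flattened into a numeric vector before it is passed to a clustering algorithm.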
The basic assumption is that nodes sharing the same or similar scaling dependence, localized to where they stand in the network, reside in similar chemical environments and therefore belong to the same network community. As a proof of concept, we applied the simple yet effective unsupervised $k$-means clustering algorithm with the elbow method for network community detection on the Budapest human connectome and visualized the result in Figure 5.22. This community detection approach identifies seven communities (colored differently in Figure 5.22). The detection process is entirely label-free, uses no prior knowledge of the functionality and locations of brain components, and is based solely on the localized scaling feature vector of each node. Several key observations follow: i) The brain network is symmetrical, and so are the communities detected by the proposed approach, which agrees with the anatomical structure of the human brain. ii) The dominant community detected (purple nodes) corresponds to the densely interconnected brain functional cluster formed by the left and right Putamen, left and right Caudate, and left and right Thalamus, together with the left and right Hippocampus. This community is consistent with the biological facts. The Putamen and Caudate are anatomically correlated and form the basal ganglia, which are well known to be strongly interconnected with the cerebral cortex, thalamus, and brainstem to perform the control of voluntary motor movements and procedural learning. The Thalamus is also connected in multiple ways to the Hippocampus via the mammillothalamic tract and serves as an important relay station that propagates sensory and motor signals to the cerebral cortex. This detected community serves as the major bridge between the two hemispheres and as a connectivity hub to other functional entities, and its nodes therefore share a similar chemical environment.
iii) The nodes belonging to the same community are not necessarily immediate neighbors, in contrast to conventional modularity-based communities. This is because the nodes are clustered based on their scaling dependence, which is determined by the surrounding chemical environment at different scales. Therefore, two nodes that are physically separated can share a similar chemical environment and be labeled as members of the same community. For the brain network, such a chemical environment is a consequence of the biological network evolution process and might have important functional implications that need to be explored in the future. In this sense, the concept of network community has been extended to characterize a node of the network by its relative spatial relation to the rest of the network. We hope the proposed localized scaling based network characterization and community detection can introduce a new research perspective that improves our understanding of real-world complex networks in different domains.

5.6 Discussion

Multi-fractal analysis has long been established as a way to describe physical phenomena and objects by studying statistical scaling laws. The major attraction of its application stems from its capability to characterize the spatial and temporal irregularities that Euclidean geometry fails to capture in real-world physical systems, through an elegant interpretation of power-law behaviors. Its demonstrated effectiveness in characterizing complex systems motivates us to extend its formalism to the analysis of complex networks. However, the multi-fractality of weighted complex networks, the role of interaction intensity, the influence of the underlying metric spaces and the design of reliable multi-fractality estimation algorithms are rarely discussed and remain an open challenge.
In this work, we provide strong theoretical and experimental evidence for the intrinsic estimation bias of the previously proposed algorithms, which is introduced by the incompatible growth of box scales and the implicit assumption that fractal scaling behaviors, if they exist, hold at all scales of the network. To overcome these disadvantages, we proposed two algorithms that reliably estimate the multi-fractality of the network based on the critical points that correspond to a power-law scaling, such that they i) avoid box scales that lead to stagnant probability measures (i.e., the staircase effect) and ii) identify the range of scales where a power-law scaling holds. In addition, we demonstrated that the estimation bias of the previously proposed algorithms deteriorates as a function of the skewness of the link weight distribution and cannot be compensated by either repeating the experiments or increasing the size of the networks (irrespective of the fact that it is practically difficult to scale real-world networks without changing their properties).

More importantly, our work showed that the estimation can hardly be trusted if it assumes the existence of a scaling law that rules the network formation process throughout the scales. We provide real-world weighted network examples where the observed distribution of scaling behaviors is localized both in space (e.g., co-existence of mono- and multi-fractal scaling dependency) and in scale (e.g., power-law scaling valid over a finite range of scales). Localized scaling behaviors reflect the fact that the formation process of these networks is governed neither by a self-repeating iterated function system (IFS) that produces simple mono-fractal networks (e.g., Sierpinski fractal networks) nor by a distribution of such IFSs throughout the scales of the network, which would lead to a multi-fractal scaling behavior described by Eq. (5.39) and Eq. (5.43).
Moreover, the parts of real-world complex networks that correspond to different scales are constructed at different time points during the network formation process. The discontinuity of a power-law scaling at different scales therefore suggests that the network formation process is dynamic and governed by heterogeneous forces, as opposed to stationary models in which either a fixed linking probability is assumed throughout the process (e.g., random graph theory) or a static linking policy (e.g., preferential attachment) governs.

Furthermore, the network formation process and the resulting heterogeneous characteristics of real-world networks cannot be fully explained from a purely structural and topological perspective. It is necessary to understand the role of the interaction intensity among the network components, the associated weight assignment process (e.g., how the weights change over time) and the metric space implied by the nature of these interactions (e.g., affinity relations, physical connectivities, causal dependences). We corroborate our argument with strong supporting evidence. More precisely, we identified, both theoretically and experimentally, two fundamental fractality transition phases that are governed by the intensity of network interactions (i.e., weights) and the embedded metric space defined on these networks:

i) The scaling of the weights in a network formation process governed by an IFS determines the fractality of the network at a given scale. We reported the theoretical upper bound for the scaling factors that transform the network into a mono-fractal network, given that a) the IFS leads to a family of Sierpinski fractal networks of unbounded size and b) the box size of an optimal covering method can shrink at the same rate as the network grows.
We argue that the multi-fractality of Sierpinski fractal networks observed in prior works does not necessarily come from a distribution of fractal scaling behaviors (i.e., multi-fractality) but can result from any deviation from these conditions (e.g., limited size and network growth rate). Even for a synthetic fractal network as simple as the Sierpinski fractal network, the weights and their distribution exhibit a surprisingly powerful impact on the fractality of the network. In a set of more realistic experiments, we further showed:

ii) The weights and the metric space defined on real-world networks determine the existence of fractality. By converting the collaboration network and the human brain connectome into binary networks, we decoupled the metric spaces defined on both networks from the link weights and made them a function of network topology alone. We demonstrated that, after the transformation of both networks, the observed localized fractal scaling behaviors disappear and an exponential scaling law (i.e., the small-world property) takes their place. While the topological configuration was kept intact, the redefinition of the metric space fundamentally altered the statistical scaling law of both networks. This observation is important not only for improving our understanding of the formation process of real networks, as the scaling behaviors reveal how the network grows. It is also a key to the network dynamics, as the scaling behavior of the network plays a key role in governing the flow of information across the network, such as rumor spreading in social networks or protein exchange in a gene regulatory network. In these real-world networks, the interaction intensity usually changes much more frequently than the network topology.
For instance, a traffic network might stay structurally unchanged for quite a long time, whereas the traffic volume (i.e., interaction intensity or weight) over its links varies constantly and strongly. Given the fundamental role of link weights and the metric space in determining the scaling law, the time-varying network interactions can consequently impact the dynamics of the network. As a result, failing to recognize the importance of link weight and metric space analysis will intrinsically limit our capability to characterize, predict and control the network behaviors.

Moreover, the variations of network scaling behaviors are closely connected to changes in network properties, which leads us to the reverse research problem: characterizing and quantifying the heterogeneity of weighted complex networks by learning the scaling variations from a microscopic perspective of the network. We provide a general network characterization framework motivated by the observed locality and phase transition behaviors of the network scaling dependency. This characterization framework interprets the weighted complex network through the construction of a scaling feature space spanned by localized scaling feature vectors, which are determined both by the surrounding environment of individual nodes and by the underlying metric space defined on the network. The proposed characterization is general and not limited to complex networks that are fractal or multi-fractal. It can easily be interfaced with subsequent analytical tools to unveil the intrinsic properties of the weighted complex network. As an important application, we showed that the proposed characterization framework can be employed to learn network communities that are consistent with our biological knowledge of human brain connectivity patterns.
A very important aspect to emphasize is that the proposed characterization framework provides a general similarity metric within and between networks, which can potentially be leveraged as a basis for both structural and dynamic analysis of networks in a wide spectrum of applications. For instance, it can be interfaced with brain connectivity networks constructed from real-time EEG measurements to identify tasks that correspond to different sets of scaling feature vectors, or to perform diagnosis and predictive analysis of brain-related pathological anomalies (e.g., traumatic damage, epilepsy) by learning the corresponding scaling feature subspace. The proposed characterization framework can also be employed as a self-similarity metric that enables the detection of anomalies or attacks by comparing, in real time, the learned scaling feature space to that of the normal operation mode. In such cases, the benefit of the proposed characterization framework comes from its capability to quantify the variation of interaction intensities (e.g., a change of transmitted power between grid nodes or maliciously injected traffic meant to overload a server) while no significant mutation of the network structure is present. In a different direction, the proposed framework also enables fine-grain similarity analysis among a set of nodes in the same network. Aside from the presented network community detection based on this fine-grain similarity analysis, it is also useful to combine it with domain knowledge (labels and attributes of nodes, e.g., functionality of a brain region) to drive an informed exploration (e.g., any functional similarity between brain regions that are topologically apart but share the same scaling law). These examples may constitute only a small portion of its potential applications, which motivates our ongoing research efforts to extend the presented work to broader domains.
5.7 Detailed Implementation of FBCw and FSBw

Multi-fractal analysis: Formally, let us consider a geometrical object tiled by boxes $B(l)$ of size $l$. Let us define $L$, $M_0$ and $M_i(l)$ as the linear length of the fractal, the total mass and the mass of the $i$-th box of size $l$, respectively. It is possible to determine $N(M)$, the number of boxes sharing the same mass $M$, given the object tiled by $B(l)$. The probability density function of mass can thus be estimated by histograms in a double logarithmic plot of $\ln(N(M))$ against $\ln(M/M_0)$ under different choices of box size $l$. The multi-fractal formalism [100] states that if these histograms fall onto the same universal curve after rescaling both coordinates by a factor $\ln(l/L)$, the object is a geometrical multi-fractal [134]. Alternatively stated, the above property holds if

$$M \sim M_0 \left(\frac{l}{L}\right)^{\alpha} \tag{5.30}$$

and

$$N(\alpha) \sim \left(\frac{l}{L}\right)^{-f(\alpha)} \tag{5.31}$$

as $l/L \to 0$, where $\alpha$ is the Hölder exponent, which can be determined by

$$\alpha = \frac{\ln(\mu(B))}{\ln(l/L)} = \frac{\ln(M/M_0)}{\ln(l/L)} \tag{5.32}$$

$\mu(B)$ is an arbitrary measure defined on the support; here it is equal to the probability of finding a point in a given box. $N(\alpha)$ is the number of boxes with Hölder exponent $\alpha$. $f(\alpha)$ is the singularity or multi-fractal spectrum if the multi-fractal formalism holds [84, 27]. The multi-fractal spectrum shows the distribution of fractal dimensions across different sets of points sharing the same Hölder exponent. Roughly speaking, it captures the variations in scaling behaviors of different subcomponents of the object. Equivalently, this variation can be captured by the generalized dimension $D(q)$,

$$\sum_i M_i(l)^q \sim M_0^q \left(\frac{l}{L}\right)^{\tau(q)} \tag{5.33}$$

$$\tau(q) = (q-1)\,D(q) \tag{5.34}$$

as we take the limit $l/L \to 0$. $\tau(q)$ is called the mass exponent. The distorting exponent $q$ can take any real value and serves to distinguish the irregularity in various regions of the object by magnifying measures that scale differently.

The equivalence between the pairs $(f(\alpha), \alpha)$ and $(\tau(q), q)$ is given by the Legendre transformation,

$$\alpha = \frac{d\tau(q)}{dq} \tag{5.35}$$

$$f(\alpha) = q\,\frac{d\tau(q)}{dq} - \tau(q) \tag{5.36}$$

For a fractal object that can be characterized by a single fractal dimension, Eq. (5.33) suggests a sufficiently minimal fluctuation in the measure $\mu(B_i(l))$ across all boxes of different sizes $l$. This directly translates to a narrowly distributed Hölder exponent $\alpha$ and a linear dependence between the mass exponent $\tau(q)$ and the distorting exponent $q$. In contrast, a multi-fractal is rich in fluctuations of the measure and has a spectrum $f(\alpha)$ spanning a wide range of $\alpha$ and a $\tau(q)$ that is non-linear in $q$. These fluctuations are captured and magnified via different choices of the distorting factor $q$. To give some intuition, when $\mu$ is a probability measure as in the box-covering process, bigger weights in the summation of Eq. (5.33) are placed on smaller probabilities if $q$ is negative and on greater probabilities if $q$ is positive. The generalized fractal analysis approach is also well known as multi-fractal analysis (MFA), which has wide applications due to its power to capture the heterogeneity underlying the structures of the objects.

FBCw and FSBw: Box-covering and sandbox methods form the basis for MFA on weighted complex networks with the following definitions. The box-covering method tiles the object of interest with boxes $B(l)$ of different sizes $l$. An arbitrary measure $\mu(B_i(l))$ is defined for each box $B_i$ that serves as support. Eq. (5.33) considers the case when $\mu$ is a probability measure such that

$$\sum_i \mu(B_i(l))^q \sim \left(\frac{l}{L}\right)^{\tau(q)} \tag{5.37}$$

$$\mu(B_i(l))^q = \left(\frac{M_i(l)}{M_0}\right)^q \tag{5.38}$$

when the limit $l \to 0$ is considered. Therefore, the generalized fractal dimension calculated by the box-covering method is given by

$$D_{bc}(q) = \lim_{l \to 0} \frac{\ln\left(\sum_i (M_i(l)/M_0)^q\right)}{\ln(l/L)} \cdot \frac{1}{q-1} \tag{5.39}$$

Eq. (5.39) determines $D_{bc}(q)$ asymptotically from the scaling of the sum over non-empty boxes of decreasing size $l$ (for $q = 0$ this sum reduces to the number of non-empty boxes).
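As an illustration of Eqs. (5.37)-(5.39), the following Python sketch estimates the mass exponent by least-squares regression of the logarithm of the box sum on $\ln(l/L)$ and divides the slope by $q-1$ to obtain the generalized dimension. It is not the implementation used in this work: the function names, the dictionary input format and the plain least-squares fit over all supplied scales are assumptions made for the example, and the box masses are taken as given (the covering itself is not performed).

```python
import math

def mass_exponent(masses_by_scale, q, L):
    """Estimate tau(q) as the slope of ln(sum_i mu_i^q) versus ln(l/L).

    masses_by_scale: hypothetical input format {box size l: [M_1(l), M_2(l), ...]}.
    Boxes with zero mass are skipped so that negative q stays well defined.
    """
    xs, ys = [], []
    for l, masses in masses_by_scale.items():
        total = sum(masses)                      # M_0, the total mass
        z = sum((m / total) ** q for m in masses if m > 0)
        xs.append(math.log(l / L))
        ys.append(math.log(z))
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx                             # ordinary least-squares slope

def generalized_dimension(masses_by_scale, q, L):
    """D(q) = tau(q) / (q - 1), cf. Eq. (5.34); not defined at q = 1."""
    if q == 1:
        raise ValueError("q = 1 needs a separate limit and is not handled here")
    return mass_exponent(masses_by_scale, q, L) / (q - 1)

# Uniform measure on a unit square: 4**k boxes of side 2**-k, all of equal mass.
uniform_square = {2.0 ** -k: [1.0] * 4 ** k for k in range(1, 6)}
```

For the uniform square, `generalized_dimension(uniform_square, q, 1.0)` evaluates to approximately 2 for every tested $q$, i.e., a flat $D(q)$, which is the mono-fractal case; a measure with strong fluctuations across boxes would instead give a $q$-dependent result.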
The sandbox method investigates the scaling of an arbitrary measure within a region embedded in a metric space, i.e., a sandbox centered at a certain point, as a function of its radius $l$. Formally, let $\mathcal{X}$ be the support of the measure and let $\mathcal{D}: \mathcal{X} \times \mathcal{X} \to \mathbb{R}$ be a metric defined on $\mathcal{X}$. For each $x_i \in \mathcal{X}$, we can define the probability measure $M_i(l)/M_0$ as the chance to find an element $x_k \in \mathcal{X}$ whose distance to $x_i$ under $\mathcal{D}$ is less than $l$,

$$\mu_i(l) = \frac{M_i(l)}{M_0} = \frac{1}{M_0} \sum_{k \neq i}^{M_0} H\big(l - \mathcal{D}(x_i, x_k)\big) \tag{5.40}$$

where $M_0 = |\mathcal{X}|$ and $H$ is the Heaviside function. However, it is known that the relation $\mu_i(l) \sim (l/L)^{D}$, where $D$ is the fractal dimension of the object, does not hold for all choices of sandbox centers as $l \to \infty$ unless the center is the origin of the fractal [134]. In fact, the sandbox method is equivalent to the box-covering method only if the choice of sandbox is randomized. We can rewrite Eq. (5.37) as

$$\sum_i \left(\frac{M_i(l)}{M_0}\right)^{q-1} \frac{M_i(l)}{M_0} \sim \left(\frac{l}{L}\right)^{\tau(q)} \tag{5.41}$$

Equivalently,

$$E\left[\left(\frac{M_i(l)}{M_0}\right)^{q-1}\right] \sim \left(\frac{l}{L}\right)^{\tau(q)} \tag{5.42}$$

Therefore, the box-covering method can be understood as a sandbox method in which the average is taken with respect to the measure distribution $M_i(l)/M_0$. Alternatively stated, the sandbox method is equivalent to box counting only if the choice of sandbox is randomized such that an estimate of $E[(M_i(l)/M_0)^{q-1}]$ can be obtained. Denote by $\langle \cdot \rangle$ the operation of taking the average. We have

$$D_{sb}(q) = \lim_{l \to 0} \frac{\ln\left(\left\langle (M_i(l)/M_0)^{q-1} \right\rangle\right)}{\ln(l/L)} \cdot \frac{1}{q-1} \tag{5.43}$$

We propose the finite box-covering method (FBCw) and the finite sandbox covering method for weighted networks (FSBw) to address the intrinsic estimation bias introduced by the incompatible growth rule of the box in the numerical determination of the multi-fractality of complex networks with finite resolution. Formally, a complex network with a finite resolution is defined as follows.

Definition 6 (finite resolution): For a given weighted complex network $G = (V, E)$ with distance metric $d_{i,j} = \min\{w^p_{i,k_1} + w^p_{k_1,k_2} + \dots + w^p_{k_n,j}\}$, the resolution of $G$ is finite if and only if the shortest path distribution $F_{d_{i,j}}(l) = P\{d_{i,j} \le l\}$ has a discrete support set $L(G) = \{l_k \mid F_{d_{i,j}}(l_k) \neq F_{d_{i,j}}(l_{k'}), \forall k \neq k'\}$.

The fundamental principle of FBCw and FSBw is to locate the box scales that correspond to the compatible growth rule, which is a function of the complex network $G$. For each node $v_i$, the local compatible growth rule on $G$ can be easily found as a strictly ordered set $L_{<,v_i} = \{l_1, l_2, \dots, l_n\}$ where $l_{k+1} > l_k$ and $L_{<,v_i} \subseteq L(G)$. However, it is usually difficult to find a shared compatible growth rule across the network for the sandbox method, or to analytically derive the optimal box-covering strategy as we did for the Sierpinski fractal network governed by a simple generation rule. In this context, we propose a data-driven filtering method to interface with the box-covering and sandbox methods for FBCw and FSBw. Both algorithms are two-step processes. In the first step, the accumulative measure $\sum_i \mu(B_i(l))^q$ for a given distorting factor $q$ is obtained by growing the box scale $l$ based on the unique path lengths of the network. Based on our discussion, this growth rule is generally incompatible. In the second step, we address this problem by a data-driven filtering procedure that obtains a subset $L_<$ of the discrete support $L(G)$ of $F_{d_{i,j}}(l)$ that is compatible with $G$. More precisely, FBCw and FSBw can be stated as follows.

Step 1 - Collecting the accumulative measure $\sum_i \mu(B_i(l))^q$:

Given the distance metric $d_{ij}$ on $G$, calculate all pairs of distances and encapsulate them into a matrix $D$, either by the Floyd-Warshall algorithm ($O(|V|^3)$) or by running the Dijkstra algorithm from every node ($O(|V|(|E| + |V|\log|V|))$). Practically, if the graph is sparse in the sense that $|E| \ll |V|^2$, it is recommended to use the Dijkstra algorithm, which outperforms the Floyd-Warshall algorithm by a significant margin.

Given the distance matrix $D$, derive the strictly ordered unique distance sequence $D_< = \{d_1, d_2, d_3, \dots, d_N\}$ where $d_i = d_j$ if and only if $i = j$. $d_N$ is the diameter of the network. $D_<$ serves as the tentative growth rule which, as discussed, is usually an incompatible growth rule that gives rise to the staircase effect; this effect is alleviated by the subsequent filtering step. It should also be noted that the cardinality of $D_<$ can be computationally prohibitive when the number of unique path lengths is very large (e.g., $|D_<|$ of the collaboration network is close to $10^6$). In such cases, a resampling function $S: D_< \to D'_<$ where $D'_< \subseteq D_<$ is useful to bring the computational overhead down to an acceptable level. The choice of the resampling function is not constrained and may change with the target network. In most cases, a linear resampling function should be satisfactory.

Iterate over $D_<$ in ascending order and perform the box-covering or sandbox covering procedure to obtain the accumulative measure $M(d_k) = \sum_i \mu(B_i(d_k))^q$ at each scale $d_k \in D_<$. No constraint is placed on the choice of a specific heuristic for this procedure. Practically, when repeating the randomized box-covering procedure for a large number of trials (to find the minimal number of boxes) is computationally impractical, the Welsh-Powell algorithm usually gives a satisfactory approximation after transforming the original network into its dual graph following the technique in [126].

Repeat the above procedure to obtain the accumulative measure sequence $M = \{M(d_1), M(d_2), \dots, M(d_N)\}$.

Step 2 - Data-driven filtering for critical scales:

As a consequence of growing the box scale based on an incompatible growth rule $D_<$ for a complex network $G$ of finite resolution, there exist $d_i$ and $d_{i'}$ such that $M(d_i) = M(d_{i'})$ (i.e., the staircase effect). In practice, this condition can usually be relaxed to $|M(d_i) - M(d_{i'})| \le \epsilon$, where $\epsilon$ is a tuning threshold that depends on the properties of the network. Therefore, for every $d_i \in D_<$, the major task of Step 2 is to filter out all the $d_{i'}$ for which $|M(d_i) - M(d_{i'})| \le \epsilon$ holds. Theoretically, it is possible to find a proper choice of $\epsilon$ such that the filtering can be done by exhaustively checking the condition for all choices of $d_i$. However, picking the proper $\epsilon$ can be a tedious manual process. In this context, we propose a simple yet effective variance-based sliding window filter to identify the critical $d_i$ that correspond to a remarkable change in $M(d_i)$. Formally, the sliding window filter $F(\mathbf{x}, t)$ is

$$F(\mathbf{x}, t) = \sum_{i=1}^{W} (\bar{x} - x_i)^2 \tag{5.44}$$

where $\mathbf{x} = [x_1, x_2, \dots, x_W]^\top$ is a $W$-dimensional vector of observations starting at $t$, $\bar{x}$ is their mean, and $W$ is the width of the sliding window. The data-driven filtering procedure can then be stated as follows.

Pick $d_i$ from $D_<$ in ascending order and calculate $\sigma_i = F(M, d_i)$. Repeat this for all choices of $d_i$ to obtain $\sigma = \{\sigma_1, \sigma_2, \dots, \sigma_{|M|-W+1}\}$.

Iterate over $\sigma$ to find the indices $i$ of the peaks in $\sigma$ that correspond to the critical scales $d_i$.

Perform the regression to Eq. (5.39) and (5.43) using the identified critical scales.

To better illustrate the efficacy of the proposed data-driven filtering method, we plot in Figure 5.23 the raw $M$ against the scale index obtained in Step 1 by applying the Welsh-Powell algorithm based box-covering strategy to the Sierpinski fractal network $G_4$ with $s = 1/2$ and $b = 3$. The peaks of $\sigma$ exactly correspond to the critical scales $d_i$ where a significant change of $M(d_i)$ appears. These scales, instead of all the scales, are identified and used for the numerical determination of the scaling behavior to avoid stagnant observations.
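The filtering step can be sketched in a few lines of Python. This is a minimal illustration of the sliding-window filter of Eq. (5.44) followed by a plain local-maximum search, not the implementation used in this work: the function names, the `min_height` parameter and the decision to report only strict local maxima are assumptions introduced for the example.

```python
def window_variance(measures, width):
    """Eq. (5.44): sum of squared deviations from the window mean,
    evaluated for every window of `width` consecutive observations."""
    out = []
    for t in range(len(measures) - width + 1):
        window = measures[t:t + width]
        mean = sum(window) / width
        out.append(sum((x - mean) ** 2 for x in window))
    return out

def peak_indices(values, min_height=0.0):
    """Indices of strict local maxima above `min_height`
    (a hypothetical noise floor); flat peaks are ignored."""
    peaks = []
    for i, v in enumerate(values):
        left = values[i - 1] if i > 0 else float("-inf")
        right = values[i + 1] if i + 1 < len(values) else float("-inf")
        if v > min_height and v > left and v > right:
            peaks.append(i)
    return peaks

# Staircase-like accumulative measure that changes at only two scales.
staircase = [5.0, 5.0, 5.0, 3.0, 3.0, 3.0, 1.0, 1.0, 1.0]
filter_output = window_variance(staircase, width=2)
critical = peak_indices(filter_output)
```

Here `critical` is `[2, 5]`: each peak marks the window that straddles a jump of the staircase, so only those scales would enter the regression, while the stagnant observations in between are dropped.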
Figure 5.10: Skewness of link weight distribution of Sierpinski fractal network. a) The skewness of the link weight distribution of $G_5$ as a function of scaling factor $s$ and copy factor $b = 2, 3, 4, 5, 6, 7, 8$. b) The skewness of the link weight distribution of $G_7$ as a function of scaling factor $s$ and copy factor $b = 2, 3, 4, 5, 6, 7, 8$. [Plots: skewness of link weight distribution versus scaling factor $s$, one curve per copy factor, with the transition point marked.]

Figure 5.11: Estimated dominant fractal dimension of the Sierpinski fractal network family ($G_3$ to $G_8$ with $b = 3$ and $s = 1/2$) using BCANw, SBw, FBCw and FSBw. As predicted by Eq. (5.39) and (5.43), the estimation accuracy improves as the numerical calculation of the limit by linear regression is performed over a growing set of observations. However, the increased skewness of the link weight distribution prevents BCANw and SBw from approaching the theoretical value as quickly as the proposed FBCw and FSBw do. [Plot: estimated dominant fractal dimension versus graph scale $G_3$ to $G_8$, with a reference level at 1.585.]

Figure 5.12: Normalized estimation error of BCANw, SBw, FBCw and FSBw under different skewness $\gamma$ of the link weight distribution, obtained by changing the copy factor $b$ of $G_5$ from 2 to 8. i) The performance of BCANw and SBw degrades as $\gamma$ grows. ii) BCANw and SBw tend to underestimate the dominant fractal dimension, which is aligned with our theoretical prediction in the analysis of the staircase effect. iii) The proposed FBCw and FSBw tend to be insensitive to the change of $\gamma$ and benefit from the increased size of the target network. [Plot: normalized estimation error versus skewness of link weight distribution.]

Figure 5.13: First-order phase transition of the free energy as a function of scaling factor $s$ in the Sierpinski fractal network $G_4$ with copy factor $b = 3$. (a) The free energy of $G_4$ exhibits a discontinuous behavior between $s = 0.7738$ and $s = 0.7351$. (b) The observed possible discontinuity in the first-order derivative of the free energy. [Plots: free energy and its first-order derivative versus scaling factor.]

Figure 5.14: Free energy (mass exponent) $\tau(q)$ deviates from linear dependence to non-linear dependence on $q$ as the scaling factor decreases below 0.7351. [Plot: free energy versus $q$ for $s$ = 0.9025, 0.8574, 0.8145, 0.7738 and 0.7351.]

Figure 5.15: Distribution of the probability measure $\mu$ as a function of the scale of multi-fractal analysis on the collaboration network. [Plot: measure $\mu(B_i(l))$ versus partition ID for scales 10 to 90, ranging from near-uniform to highly skewed.]

Figure 5.16: Coexistence of multi-fractal and mono-fractal scaling in the collaboration network. [Plot: $\log(\sum_i \mu(B_i(l))^q)$ versus $\log(l/L)$ for $q$ from $-10$ to $10$.]

Figure 5.17: The failure of BCANw to capture the localized fractal scaling of the collaboration network over a finite range of network scales. In the case of real-world networks, the self-similar property does not hold at all scales of the network. There might exist a finite range of scales where fractal scaling behavior dominates. Moreover, this phase transition phenomenon consistently holds under all distorting factors $q$, suggesting a localized multi-fractality. [Plots: $\log(\sum_i \mu(B_i(l))^q)$ versus $\log(l/L)$ for FBCw and BCANw at a) $q = -6$, b) $q = 0$, c) $q = 6$, with localized fractal and non-fractal scaling regions marked.]

Figure 5.18: Distribution of the probability measure $\mu$ as a function of the scale of multi-fractal analysis on the Budapest connectome network. [Plots: measure $\mu(B_i(l))$ versus partition ID for scales 15 to 90; power-law scaling of $N_{B(l)}$ at $q = 0$ with slope $-1.6616$, with localized fractal and non-fractal scaling regions marked.]

Figure 5.19: Coexistence of localized multi-fractal and mono-fractal scaling in the Budapest connectome network. [Plot: $\log(\sum_i \mu(B_i(l))^q)$ versus $\log(l/L)$ for $q$ from $-10$ to $10$.]

Figure 5.20: The fundamental impact of link weights on the multi-fractality of a network. We keep exactly the same structure of the collaboration network but remove all its weights to transform the network into a binary network. We performed the proposed box-covering method to measure the scaling dependence of the number of boxes and the distribution of their associated measure. The figure shows the loss of multi-fractality as a result of the removal of link weights. Instead, the scaling dependence is best explained by an exponential law ($a\exp(bx)$, $a = 3.75 \times 10^4$, $b = -11.55$), suggesting that the unweighted collaboration network becomes a "small-world" network. [Plots: measure $\mu(B_i(l))$ versus partition ID for scales 2 to 8; $N_{B(l)}$ versus $l/L$ with fitted curve.]

Figure 5.21: The fundamental impact of link weights on the multi-fractality of the Budapest connectome network. The figure shows a similar loss of multi-fractality after removing the weights on the links of the Budapest connectome. The scaling behavior is best explained by an exponential law ($a\exp(bx)$, $a = 934.7$, $b = 0.664$), indicating that the common brain connectome is a "small-world" network if no weights are considered. [Plots: measure $\mu(B_i(l))$ versus partition ID for scales 4 to 20; $N_{B(l)}$ versus $l/L$ with fitted curve.]
Figure 5.22: An example application of the proposed localized scaling feature space for the characterization of weighted complex networks. Interfaced with an unsupervised machine learning based clustering algorithm, the localized scaling based community detection is able to identify brain network communities consistent with the anatomical facts. The detected communities are not limited to neighboring nodes but are based on their relative spatial relations to the rest of the network, with potential functional implications.

Figure 5.23: An example application of the proposed data-driven filtering method. By sliding the filter through the observations, the peaks of the output correspond to the critical scales where a significant change of the accumulative measure $\sum_i \mu(B_i(l))^q$ occurs. [Plots: $\log(\sum_i \mu(B_i(l))^q)$ and filter output versus scale index.]

Chapter 6
CPS Application Learning and Profiling based on Scalable Model of Computation

To deliver the built-in intelligence and real-time processing capabilities of CPS with efficient computing and communication resources, it is of primary importance to understand the task structure of CPS applications, their computational and communication requirements and their data access patterns. While the exact structure and dynamic dependencies of CPS applications cannot be fully predicted, it is crucial to develop a novel application profiling framework that learns the inter-dependency among application tasks in order to capture the characteristics of computation and communication workloads, allowing maximal exploitation of fine-grain parallelism and concurrency in the design of an optimized NoC-based platform that provides on-board real-time processing capabilities. To ensure a fast and unbiased evaluation of NoC-based many-core designs for CPS applications, suitable benchmarks are needed.
These benchmarks must: (i) preserve the dependency patterns and traffic behavior of real applications; (ii) be scalable in terms of size, degree of spatio-temporal dependency, and amount of traffic load so that they provide a sufficient set of stress test cases for heterogeneous large-scale NoC architectures. Current application-based benchmark suites, synthetic task graphs, and trace-based benchmark suites do not simultaneously satisfy all of the above-mentioned properties.

Figure 6.1: A simple case where data dependencies can be known only at execution time as user input determines both data and the type of task to be performed.

Although certain application-based benchmark suites (e.g., Parsec [18], Splash-2 [146]) preserve high fidelity of the performance evaluation under a measuring framework with full architectural and operating system details, their applicability to large-scale NoCs presents the following limitations:

i) Application-based benchmarks may not always prove useful in measuring NoC performance, as they are typically built from representative sets of applications selected for parallelism exploration, i.e., applications with weak cross-task data dependencies. For instance, only a small portion of the applications collected in Parsec and Splash-2 exhibit significant inter-process data dependencies [12]. Therefore, they have limited effectiveness when testing the stress endurance of a NoC, which poses critical challenges to offering performance guarantees under extreme situations. Such a stress test is essential for a NoC design with predictable performance and ensured Quality-of-Service (QoS).

ii) Application-based benchmark suites are not portable to a wide spectrum of architectures with rich heterogeneities. They usually maintain a relatively fixed set of applications based on a specific machine model. For instance, Parsec assumes a homogeneous chip multiprocessor (CMP) system with shared memory while Splash-2 adopts a distributed shared memory (DSM) model [17].
Such assumptions limit their applicability to the evaluation of NoCs in emerging heterogeneous systems, e.g., multiprocessor SoC (MPSoC) or hybrid CPU/GPU/FPGA systems.

iii) Application-based benchmark suites require costly simulations. In spite of their good fidelity, these benchmarks require full-system simulations with extended simulation time, e.g., on the order of days or weeks, depending on the level of simulation detail, the architecture size, and the duration of the application region of interest. The long iteration cycle makes design-space exploration very difficult. Such an iteration could be even more time-consuming considering the non-deterministic impact on the full-system behavior (e.g., scheduling, synchronization or execution pathways [5]) caused by changes in NoC designs.

Synthetic benchmark suites are designed based on either task graphs that are statically extracted from applications (e.g., by source code analysis) [104] or stochastic models that assume a certain class of data generation processes (e.g., a Poisson process) [30]. In contrast with full-system simulations, the simulation time is greatly reduced due to simplified system details. Despite their speed, neither approach is able to mimic the spatio-temporal behaviors of real application communications. The stochastic model-based traffic synthesis assumes that each data generation process is independent and can be fully characterized by a set of parameters associated with the assumed stochastic model (e.g., the rate of a Poisson process). In this sense, synthetic benchmarks can be easily scaled to test NoCs of arbitrary size, topology and dimensionality, but they can lead to unrealistic or biased evaluations as a result of their disconnection from real applications. Static task graph-based benchmarks overcome this drawback of stochastic synthetic benchmarks by capturing some degree of the realistic spatio-temporal task dependencies.
Static task graphs are determined by analyzing the source code of the application at compilation time. Computation and synchronization tasks are identified and represented as nodes in the resulting graph. The inter-task dependencies are captured by constructing directed links between pairs of task nodes. Therefore, the task structure of the application is naturally encoded by the size, composition and topology of the task graph. However, a static task graph also places significant limitations on its applicability, as all tasks and dependencies must be known up front. In many cases, the inter-task data dependencies can only be fully known at execution time. To illustrate this, we show a simple segment of C-style pseudo code in Figure 6.1 where the type of task performed cannot be decided at compilation time but depends on the user input. As a result, a statically extracted task graph is incapable of handling problems where the task breakdown, i.e., the tasks and their dependencies, is only known at runtime; a task graph learned dynamically during execution is thus required.

Trace-driven benchmark suites collect inter-core communication traces during the application execution under a specific full-system setting. The traffic trace is then used as the input to drive the target NoC architecture for performance evaluation. This technique serves as a trade-off between the application-based benchmarks, which offer high fidelity at the expense of simulation cost, and the synthetic benchmarks. Recent trace-driven benchmarks like Netrace [57] also consider inter-task data dependencies for the preservation of real application behaviors, which improves their fidelity further. However, trace-driven benchmarks are useful only as long as the target architecture of interest coincides with that used for trace extraction. Otherwise, a trace recollection process through full-system simulation is required.
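The limitation of static extraction noted above can be made concrete with a small sketch. The Python snippet below is a hypothetical illustration (it is not the code of Figure 6.1, and the task names and the `run_and_trace` helper are invented for the example): which task runs, and therefore which dependency edges exist, is decided by a runtime argument, so the edges can only be recorded while the program executes.

```python
def run_and_trace(mode, payload):
    """Execute a tiny task pipeline and record the dependency edges
    (producer, consumer) that actually occur during this run."""
    edges = []

    def task(name, deps, fn, *args):
        # Record one edge per input dependency, then run the task body.
        edges.extend((dep, name) for dep in deps)
        return fn(*args)

    raw = task("read", [], list, payload)
    if mode == "sum":                      # branch chosen by user input
        result = task("reduce", ["read"], sum, raw)
        last = "reduce"
    else:
        result = task("sort", ["read"], sorted, raw)
        last = "sort"
    task("report", [last], lambda value: value, result)
    return result, edges

sum_result, sum_edges = run_and_trace("sum", [3, 1, 2])
sort_result, sort_edges = run_and_trace("sort", [3, 1, 2])
```

The two calls yield different edge lists (`read -> reduce -> report` versus `read -> sort -> report`). A compile-time analysis would have to keep both branches, whereas the recorded trace contains only the path taken for the given input.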
Based on these observations, we address the NoC benchmark synthesis problem for fast performance assessment by employing a complex network analysis of real applications. More specifically, we propose a dynamical complex network framework to characterize both the spatial (inter-task data dependencies) and temporal (timing dependencies) behavior of application workloads. We formulate the benchmark synthesis as an optimization problem and propose an efficient algorithm for generating large-scale benchmarks that preserve the structural features and inter-task dependencies of real applications. We believe that a good network generation model applied to NoC benchmark synthesis could help i) model the heterogeneous traffic structures of applications over the temporal and spatial domains; ii) offset the drawbacks of current NoC benchmark suites; iii) introduce a new research methodology for full-system exploration.

To summarize, our main contributions are as follows:
1) We propose a mathematical model for benchmark synthesis that is able to capture the dynamic characteristics of real-world application workloads.
2) We propose a set of complex network metrics for characterizing the correlations and spatio-temporal behavior of real applications. These metrics can be used to check the consistency of generated large-scale benchmarks in terms of their degree of spatio-temporal dependency.
3) We develop a benchmark synthesis algorithm for generating a large-scale dynamic application task graph while preserving the network characteristics of the application.
4) We validate the proposed algorithm by analyzing the statistical similarity between the synthesized benchmarks and real-world application traffic traces.

The paper is organized as follows: Section 2 provides an overview of prior research efforts. Section 3 describes the proposed framework and formulates the NoC benchmark synthesis as an optimization problem.
Section 4 introduces the complex-network inspired similarity metrics, analyzes their connection with the application traffic behaviors and proposes a scaling algorithm for realistic large-scale benchmark generation. In Section 5, we validate the algorithm through statistical comparison between the synthesized benchmarks and the real application traces. Section 6 provides the conclusions of our study.

6.1 Related Work and Novel contribution

Prior research endeavors addressing system design exploration in both algorithmic and architectural aspects have been largely directed towards profiling applications using graphical models. Since the computation of any parallel algorithm can be viewed as a task dependency graph [71], parallelization of multi-threaded programs could be most effectively solved via the extraction of such graphs directly from applications. Accordingly, in the exploration of conventional multiprocessor systems such as [72][106][4], task graphs serve as the central analytical models for evaluating a wide range of scheduling algorithms in terms of scheduling length, time complexity and power consumption [28]. Although the task graphs used in these works vary in representation and semantics (e.g., considering system heterogeneity or capturing the communication workload rather than pure data dependencies), there are close similarities between them. Weighted directed acyclic graphs (DAGs) have been extensively studied for scheduling a parallel program onto an array of homogeneous processors such that the completion time of the program is minimized [72]. A standard set of task graph based benchmarks has been proposed for the systematic evaluation of a wide spectrum of scheduling algorithms. The performance improvements obtained with graph models have inspired intensive research aimed at the extraction or synthetic generation of task graphs [64][3][139][47][95][32].
Task graph extraction from C source code was first addressed by [139], with an extraction tool open for academic use. It fails to address pointer-related structures due to the complexity of the task structure. [95] explores how to profile VHDL-based hardware descriptions using task graphs for high-level synthesis. In [3], a compiler technique is proposed to synthesize static task graphs (STGs) and derive dynamic runtime graph instances based on a previously structured STG. On one hand, extracted application task features act as effective benchmarks for the assessment of various design methodologies. On the other hand, the runtime stochasticity embedded in the architectural heterogeneity and the temporal task behaviors (e.g., time-varying input vectors) makes static profiling hardly informative for hardware and software co-optimization at design time. Especially for NoC-based platforms, the topological tuples of the network add a further degree of variation, making both the profiling and benchmarking approaches less trustworthy.

In this context, the NoC community initiated an open standard of benchmarking for underlying NoC architectures. In [52][115][116], communication-centred design is proposed and key benchmark characteristics are defined. Starting from this initiative, several works propose benchmarks derived from: i) real application traffic traces [77][57], ii) statistical models extracted from applications [130] and iii) communication task graphs [104][141]. Unlike the benchmarks for conventional parallel systems, they are neither abundantly available nor well maintained for broader research use. Application-based benchmarks like Parsec and Splash-2 are used as an alternative. However, as mentioned in the prior discussion, their applicability to NoC-based systems is limited.
Therefore, these benchmarks are unable to sufficiently stress the underlying NoC systems and do not generate the most interesting cases, in which network traffic approaches a transitional phase and demonstrates non-stationary behaviors.

Figure 6.2: Problem Overview. We propose a mathematical framework (Section 3) that constructs graphical models (Section 3.2) that are able to capture the spatio-temporal inter-task dependencies on which traffic can be synthesized (Section 3.3). The model can be learned by running the instrumented LLVM intermediate representation of the application of interest and collecting the execution trace. We also propose a benchmark scaling algorithm (Section 4) to scale the constructed model while preserving key structural features of the original application model.

To address this problem, we will first present a mathematical model for characterizing the application traffic. Then, we formulate the NoC benchmark synthesis as an optimization problem and propose a synthesis framework based on runtime architecture-independent model learning.
6.2 MoC-based Application Profiling and Benchmarking Framework

6.2.1 Overview of the Problem

The well-established benchmarking techniques are not perfect, as each of them has at least a subset of the following major weaknesses: i) expensive development efforts and simulation time, ii) failure to preserve realistic traffic characteristics and consider their runtime variations, iii) poor scalability when it comes to providing traffic workloads that are suitable for stress testing not only a wide spectrum of current NoC architectures, but also the emerging (future) large-scale NoCs. To overcome these challenges, we have to address the following critical research problems:

P1) Can we establish a rigorous mathematical model with good fidelity in profiling the application traffic characteristics (i.e., one that preserves its spatial patterns, such as the inter-task data and control dependencies, and its temporal dynamics, such as the traffic generation process)?

P2) Can we learn and use this mathematical model for NoC benchmark synthesis such that the newly generated large-scale benchmarks preserve the statistical properties and traffic characteristics of real applications? Alternatively, can we scale up this mathematical model and synthesize benchmark workloads that are able to test different NoCs while being spatially and temporally consistent with the original application traffic behavior in statistical terms?

P3) Can we modify or perturb this mathematical model to simulate the runtime traffic variation of applications?

In what follows, we present a novel framework to address all these research problems. More specifically, we address the first problem by introducing a mathematical model that characterizes the application traffic as a directed dynamical graph.
To address the second problem, we adopt an LLVM compiler-based task structure extraction approach to profile the application and propose a complex-networks-inspired traffic synthesis technique for generating traffic workloads at runtime, given the mathematical model of a profiled application. To tackle the third problem, we propose a scalable benchmark synthesis algorithm that can work with various statistical distributions.

6.2.2 Application Traffic model

Figure 6.3: An example of how a data dependency has an impact on traffic behaviors.

6.2.2.1 Vision of the Model

An application consists of different tasks and their interactions (i.e., inter-task data and non-data dependencies). To analyze the structure and dynamics of its tasks, one can represent the application using graphical models where tasks are represented as nodes and task interactions as edges. In spite of their wide use in validating resource scheduling, task mapping and automatic parallelization, as discussed in Section 2, their application to modeling the runtime application traffic behaviors is limited. For instance, the communication task graphs (CTGs) used in prior NoC studies are not able to capture dynamic data dependencies, i.e., when a data set is generated and exchanged, and how different data sets are related at runtime. Ignoring such dependencies might lead to biased network performance measurement.

To give an intuition, we show a simple PE-based NoC example in which ignoring the data dependencies can lead to erroneous estimates of the NoC performance for an application of interest. Figure 6.3 shows three routers $i$, $j$ and $k$ (each with a single input buffer) interfacing three processing elements (PEs) and exchanging data for calculating the average and variance of a time series stored in tile $i$. Let us assume PE $i$ sends this data to PEs $j$ and $k$.
The results computed by PE $j$ will be reused by PE $k$, i.e., the average of the data set will be sent to PE $k$ for calculation of the variance; thus there is a data dependency between these two tasks (i.e., calculation of the average and of the variance). During execution, the packets issued by PE $j$ might never conflict with those injected by PE $i$, because the computation in PE $j$ usually takes more time than moving the data from PE $i$ to PE $k$. However, if we use a conventional CTG or even a trace-based benchmark that does not consider task dependencies, we might end up with erroneous network performance measurements. This happens when the link between Router $i$ and Router $j$ is heavily congested, such that the packet injected by Router $i$ waits longer than the computation time of PE $j$. In such a case, Router $j$ would still mistakenly inject the "results" as instructed by the collected trace even though it has not received the full data set from Router $i$, resulting in an unrealistic traffic pattern.

To address these problems, we propose a dynamic graphical model learned at runtime, not only for accurate characterization of the application but also for practical use as a realistic traffic generator. More precisely, we propose to model each task as a data generation system, which consists of: i) a timed finite state machine that governs its system state transition at runtime and ii) a data generation process that determines its communication patterns. By relating the input of the system to the system state transition that determines the output in a timed fashion, the proposed model is able to capture runtime inter-task dependencies and characterize the spatio-temporal patterns of the communication.
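The failure mode in the three-router example can be made concrete with a small Python sketch. It is only an illustration: the function names and all cycle counts are hypothetical and are not taken from any measured trace.

```python
def trace_replay_injection(recorded_injection_time):
    # Trace replay injects at the recorded time, whether or not the
    # input data has actually arrived in the current simulation.
    return recorded_injection_time


def dependency_aware_injection(input_arrival_time, compute_time):
    # A dependent task can only start computing once its input has arrived,
    # so its results are injected after arrival plus computation.
    return input_arrival_time + compute_time


# Hypothetical profiled run (cycles): data from PE i reaches PE j at cycle 10
# and PE j computes for 50 cycles before injecting its results.
recorded = dependency_aware_injection(10, 50)

# Hypothetical congested run: the same data only arrives at cycle 80.
congested_arrival = 80
replayed = trace_replay_injection(recorded)
realistic = dependency_aware_injection(congested_arrival, 50)

# True when trace replay injects results before their input has arrived.
premature = replayed < congested_arrival
```

With these numbers the replayed injection happens at cycle 60, before the input arrives at cycle 80, whereas the dependency-aware injection is delayed to cycle 130.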
6.2.2.2 Model Description

The keystone of the model is an abstraction of the application that is not only mathematically expressive in capturing the runtime application behaviors and its task structure, but also practically easy to learn and use for realistic traffic generation. Towards this end, we follow the same idea used to characterize a parallel program from a compiler perspective and define an application as a collection of tasks. Each task can be understood as a sequence of basic operations. Given a task, its execution might have i) data dependencies (i.e., it requires the output of other tasks) and/or ii) non-data dependencies (e.g., synchronization) on prior tasks. Once these dependencies are satisfied, the behavior of a task can be summarized as follows: i) it processes its input (either from prior tasks or from user input), ii) it generates a new set of data as output for tasks in the subsequent execution path and iii) it exchanges them following a specific pattern (i.e., a specific distribution of data generation). Intuitively, a task can be abstracted as a data generation system: it checks its input and transits its state from IDLE to READY as its dependencies on prior tasks are satisfied over time. If the system enters READY, it will operate on its input and map it to output over an execution time horizon. Otherwise, it will remain idle and wait for the receipt of all its input. To formally characterize it, we introduce the following definition:

Definition 1: An application task $A(t)$ is a data generation system determined uniquely by a quadruple $(M, G(t), T, C)$ over time horizon $[t, t+T]$. $M$ is a timed finite state machine. $\{G(t), t \in T\}$ is a data generation process where $G(t')$ denotes the number of data units generated over time interval $[t, t+t']$. Function $C$ maps a task $A(t)$ to a set $C(A(t))$ containing all other prior tasks upon which the execution of $A(t)$ has dependencies.
An application task $A(t)$ is defined over its execution horizon $T$, i.e., its active time period. To run task $A(t)$, all prior tasks in $C(A(t))$ have to be finished. To check whether such dependencies are met over time, $A(t)$ maintains a timed state machine $M$ to drive the system state transition from IDLE to READY. Upon READY, the execution will be initiated to generate a new set of data that might be used by subsequent tasks. The data generation can be characterized by a process $\{G(t), t \in T\}$. To detail the timed state machine $M$, we formally introduce the following definition:

Definition 2: A timed finite state machine $M$ is a sextuple $(I, S, s_0, O, F, \Lambda)$ where $I$, $S$ and $O$ are finite disjoint sets of inputs, states and outputs, respectively. $s_0$ is the initial state. $F$ is the transition function $F: 2^I \times S \times 2^T \to S$. $\Lambda$ is the output function $\Lambda: S \times T \to 2^O \times 2^T$.

Of note, $I$ and $O$ are the input and output alphabets with finitely many symbols, respectively. The idea is to introduce these two sets to model the input and output of a task. $I$ and $O$ provide an abstract description of different dependency types. In practice, we use a simple integer alphabet $\{0, 1, 2\}$ for both $I$ and $O$. The input is "0" if the corresponding dependency is not met. Otherwise, "1" and "2" denote that a data dependency or a non-data dependency (i.e., a synchronization requirement) is satisfied, respectively. We use the finite alphabet set to avoid any architecture-specific assumptions, e.g., type of data or width of channels, such that the model is self-contained and general without a specific machine model, which might limit the applicability of the formalism.

$F$ is a timed transition function that maps a vector of inputs $I(A(t)) = \{i_k \mid i_k \in I\}$, the current state $s \in S$ and a vector of time stamps $\{t_k\}$ associated with $I(A(t))$ to the next state. We refer to $i_k \in I(A(t))$ as an input channel, and $|I|$ is the width of the input channel.
Each input channel $i_k \in I(A(t))$ is paired with a time stamp $t_k$ (denoted as $(i_k, t_k)$) which determines the earliest time at which $i_k$ can be checked. We introduce this time stamp to account for the time cost of task execution and communication, which is detailed later in the discussion of the output function $\Lambda$. Each $i_k$ connects to an output of an upstream task on which the execution of task $A(t)$ depends, thus $|I(A(t))| = |C(A(t))|$. The dependency of $A(t)$ on a prior task $A'(t)$ is satisfied if and only if a letter "1" or "2" in $I$ is received by input channel $i_k \in I(A(t))$ and its associated time stamp $t_k$ is not greater than the current time stamp $t$ at which the transition condition is being checked, i.e., the causal constraint. In contrast to an ordinary finite state machine, we introduce an extra temporal dimension to guard the state transition such that the timing information of the application can be captured. Consequently, the transition function $F$ drives the system state into READY if and only if $i_k \neq 0$ and $t_k \le t$, $\forall i_k \in I(A(t))$.

The output function $\Lambda$ maps the timed current state $(s, t)$, where $s \in S$ and $t \in T$, to a vector of outputs $O(A(t)) = \{o_k \mid o_k \in O\}$, guarded by an array of time stamps $\{t + \tau_k\}$. Similarly, we define $o_k \in O(A(t))$ as an output channel. $\tau_k$ denotes the delay of output channel $o_k$ caused by the execution of the task on the input data set $I(A(t))$ and by the data generation process (i.e., communicating data over a certain period of time); it is equal to $\tau_{k,e} + \tau_{k,c}$, the execution delay and the communication delay, respectively. Of note, $\tau_{k,e}$ relies on the mapping function from the task to a specific processing entity (e.g., a dedicated PE or a processor), i.e., the delay is decided by how "fast" the task can be processed. In the model, we make no assumption on the mapping function or the processing entity, hence enhancing the expressivity of the model. $\tau_{k,c}$ is the span of the data generation process, which is described by $\{G(t), t \in T\}$.
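The guarded transition and the time-stamped outputs described above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in this work: the class and function names are hypothetical, only the IDLE and READY states are modeled, and a single execution delay is shared by all output channels as a simplifying assumption.

```python
from dataclasses import dataclass

IDLE, READY = "IDLE", "READY"


@dataclass
class InputChannel:
    symbol: int = 0         # 0: dependency not met, 1: data, 2: synchronization
    timestamp: float = 0.0  # earliest time at which the symbol may be checked


def next_state(channels, now):
    """Return READY iff every channel holds a nonzero symbol whose
    time stamp is not greater than the current time (causal constraint)."""
    satisfied = all(ch.symbol != 0 and ch.timestamp <= now for ch in channels)
    return READY if satisfied else IDLE


def output_timestamps(now, exec_delay, comm_delays):
    """Time stamps guarding the outputs: current time plus execution delay
    plus the communication delay of each output channel."""
    return [now + exec_delay + delay for delay in comm_delays]


# Hypothetical task with one data and one synchronization dependency.
channels = [InputChannel(symbol=1, timestamp=5.0),
            InputChannel(symbol=2, timestamp=8.0)]
state_early = next_state(channels, now=6.0)  # second time stamp not reached
state_late = next_state(channels, now=8.0)   # both dependencies satisfied
guards = output_timestamps(now=8.0, exec_delay=3.0, comm_delays=[1.0, 2.5])
```

A task without any input channel is READY immediately in this sketch, which corresponds to a task with no dependencies on prior tasks.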
Given a specific task, the data generation process could be arbitrarily complicated, yet it is still possible to find a stochastic process model that best characterizes its behaviors. For instance, the process could be memory-less (e.g., a Poisson process), long-range memory (e.g., a self-similar or fractal process) or a general-stable process.

Connecting Definitions 1 and 2, we have constructed the backbone of the model for NoC applications. Compared to the conventional definition of a task in the context of parallel program analysis, we view each task as a data generation process whose behaviors are governed by a timed state machine $M$ and a data generation process $G(t)$, given the execution time horizon $T$. Its dependencies are characterized by $C(A(t))$. Given a collection of tasks $A = \{A_i(t)\}$, we are able to construct a graphical model $B(t) = (A, E, t)$ where each vertex $a_i$ corresponds to an application task $A_i(t)$ and each directed edge $e_{i,j}$ exists if and only if task $a_i$ has a dependency, either data or non-data, on task $a_j$. Formally, we have the following definition:

Definition 3: A NoC application $B(t) = (A, E, t, T)$ over its execution time horizon $[t, t+T]$ is a dynamical directed graph where each vertex $a_i \in A$ is an application task $A_i(t)$ and edge $e_{i,j} \in E$ if and only if $A_j(t) \in C(A_i(t))$.

In contrast to previous graphical models for application traffic, the proposed model not only translates the spatial dependencies into geometric characteristics of the graph (i.e., nodes, edges and their connection pattern), but also introduces a detailed description of tasks that is able to preserve the temporal dependencies. In the following discussion, we will present a traffic synthesis technique based on the proposed model to address problem P2).
6.2.3 Benchmark Workloads synthesis

The large-scale benchmark synthesis problem can be stated as follows: How can the traffic be generated for a given size and the application profiled by the proposed model $B(t) = (A, E, t, T)$ such that the traffic characteristics of the real application are preserved? Thus, our objective is to build a traffic generator for NoC evaluation without interfacing it with a full-system simulator, such that the target NoC is identically stressed but less simulation time is required.

To formally define the problem, let $A$ be the universal set of tasks involved in $B(t)$. Assuming $|A| = n$, let $S(t)$ be the $n$-dimensional state vector of $B(t)$ such that $S(t) = [s_0(t), s_1(t), \ldots, s_{n-1}(t)]^T$. We define the vector sequence $E_0 = \langle S(0), S(\Delta t), S(2\Delta t), \ldots, S(T) \rangle$ as the recorded states of tasks during application execution on the target architecture over the finite horizon $[0, T]$. In other words, $E_0$ is the task state transition trace recorded from the execution of the real application. We define $E(B(t)) = \langle S(0), S(\Delta t), S(2\Delta t), \ldots, S(T) \rangle$ as an execution of benchmark $B(t)$ over a finite horizon $[0, T]$, where $\Delta t$ is the time step of interest, i.e., the cycle of the simulation clock. Intuitively, $E(B(t))$ is the simulated system state transition trace. The output of each task $A_i(t)$ is uniquely decided by the system state $s$ and the time stamp $t$ through the output function (see Definition 2). Therefore, the system state transition trace determines the traffic characteristics of the application. As a result, given an execution horizon $T$, ideally, the simulated system state transition trace $E(B(t))$ should be equal to the recorded state trace $E_0$.
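The comparison between a simulated and a recorded state trace can be sketched as follows. This is a minimal Python illustration under stated assumptions: states are encoded as 0 (IDLE) and 1 (READY), both traces are sampled at the same time steps, and the function name and example traces are hypothetical.

```python
import math

# Assumed numeric encoding of task states, for illustration only.
STATE_CODE = {"IDLE": 0, "READY": 1}


def trace_deviation(simulated, recorded):
    """Euclidean norm of the element-wise difference between two state
    traces, each given as a list of per-time-step state vectors."""
    if len(simulated) != len(recorded):
        raise ValueError("traces must cover the same time steps")
    total = 0.0
    for sim_vec, rec_vec in zip(simulated, recorded):
        if len(sim_vec) != len(rec_vec):
            raise ValueError("state vectors must have the same dimension")
        for s, r in zip(sim_vec, rec_vec):
            total += (STATE_CODE[s] - STATE_CODE[r]) ** 2
    return math.sqrt(total)


# Hypothetical two-task traces over three time steps.
recorded_trace = [["IDLE", "IDLE"], ["READY", "IDLE"], ["READY", "READY"]]
simulated_trace = [["IDLE", "IDLE"], ["IDLE", "IDLE"], ["READY", "READY"]]

deviation = trace_deviation(simulated_trace, recorded_trace)
```

In this example the traces differ in a single state at a single time step, so the deviation is 1.0; identical traces give a deviation of 0.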
Formally, we can formulate the benchmark synthesis problem as:

NoC benchmark synthesis problem: Given an application profiled by $B(t)$, a target architecture, execution time $T$ and the recorded application state transition trace $E_0$, determine the initial state $s_0$, output function $\Lambda$ and data generation process $G(t)$ for each task $A_i(t)$ to obtain an execution $E(B(t))$ of $B(t)$ that minimizes its deviation from the recorded trace:

$$\min_{s_0,\, \Lambda,\, G(t)} \; \| E(B(t)) - E_0 \|_2 \tag{6.1}$$

Equation (6.1) shows that the proposed model provides a way to quantify the similarity between the synthesized traffic and the real application traffic by measuring the norm of the deviation between the state transition traces in the two cases.

To solve this problem, it should first be noted that the difficulty in minimizing (6.1) resides both in accurately identifying the task structures, i.e., tasks and their runtime inter-dependencies, and in capturing their communication patterns (e.g., memory access events), i.e., learning the data generation process $G(t)$. We thus propose a synthesis framework based on runtime architecture-independent model learning. Figure 6.2 shows the overview of the proposed benchmark synthesis framework. The overall framework can be understood as a two-stage process where i) an architecture-independent application profiling and model learning stage is set up for the analysis of the runtime application task graph and the construction of the NoC application model $B(t)$, upon which ii) a subsequent benchmark generation stage is built to introduce realistic variation to the generated traffic model for extrapolated traffic synthesis given a target architecture. It should be noted that, instead of extracting static task graphs, we define the NoC application model $B(t)$ in Section 3.2.2 as a dynamical graph that can only be learned during the execution of the program.
This is because a statically extracted task graph is not sufficiently representative of applications whose tasks and spatio-temporal inter-task dependencies are unknown prior to execution.

Specifically, we have modified the Contech compiler [108], which is based on the LLVM compiler framework [74] with OpenMP support and provides the ability to observe and manipulate the intermediate representation of a program. Following the Contech compiler, the adopted profiling methodology is two-layered. The first layer takes the source code of the application as input and translates it into an instrumented LLVM intermediate representation (IR). The compiler in the first layer runs a function-by-function check to identify the basic blocks (e.g., basic actions or predefined functions) and inserts inlined code into the target ISA assembly to collect the properties of memory access events, i.e., address, size, type and timing information, during the execution of the application. To capture the inter-task dependencies, the synchronizing actions are identified by analyzing the LLVM IR or the name of the function invoked. The address of the action, the order of the action with respect to other synchronization actions on this address and the time stamps from before and after the action are recorded in a local buffer for each thread. Eventually, a global event list is generated, in which events from the same thread are stored in program order rather than in the micro-architectural order resulting from out-of-order processors or memory consistency, thus avoiding specific architectural assumptions.

The second layer takes the extracted event list to infer the application model $B(t)$. Each task accumulates a list of basic block IDs and memory accesses from the event list until a synchronizing action is encountered. Then all previous blocks are assumed to be in the same context and merged into a single task $A_i$.
The task dependencies between other synchronizing actions are checked such that, for each $A_i \in B(t)$, we are able to determine the input channels $I(A_i)$ (or, equivalently, the output channels) upon finishing the processing of all basic blocks in the event list. Stated alternatively, we can construct node $A_i$ and edge $e_{i,j}$ for all choices of $i$ and $j$ in $B(t)$, given the event list collected by the LLVM compiler with instrumentation at execution time. This constitutes the topology, i.e., the structural features, of the proposed graphical model. Recall that we define an application task $A_i$ as a quadruple $(M, G(t), T, C)$, where $C(A_i)$ denotes the set of prior tasks on which $A_i$ depends. By identifying tasks and their dependencies, we have also learned the function $C$.

To practically use the graphical model to generate traffic aligned with realistic application behaviors, we should also derive the finite state machine $M$ and the data generation process $G(t)$ from the trace collected in the first layer. We define each input channel of a task to correspond to the dependency on an upstream task. The state transits as its dependencies on prior tasks are met, i.e., as either data or synchronizing dependencies are satisfied. The data generation process $G(t)$ is initiated once the state machine $M$ enters the end state, where all dependencies are satisfied. Recall that we have recorded all memory access events by running the instrumented program. All memory access events, when mapped to a NoC-based architecture, translate to data injection events. Combined with the recorded time stamps of the memory accesses, it is possible to either i) directly use the trace or ii) fit a stochastic process $G(t)$ for the data injection of each task $A_i$. Together with the execution horizon $T$ over which we run the program, we have learned the application traffic model $B(t)$.
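As one illustration of option ii), the sketch below fits a constant-rate Poisson model to recorded injection time stamps and derives the expected time needed to inject a given amount of data. The Poisson assumption, the function names and the example time stamps are all hypothetical choices made for this sketch; other processes would need a different estimator.

```python
def estimate_poisson_rate(timestamps):
    """Estimate a constant injection rate as the reciprocal of the mean
    inter-arrival time, which is the maximum-likelihood estimate when
    the gaps between injections are exponentially distributed."""
    if len(timestamps) < 2:
        raise ValueError("need at least two events to estimate a rate")
    ordered = sorted(timestamps)
    span = ordered[-1] - ordered[0]
    if span <= 0:
        raise ValueError("events must span a positive time interval")
    return (len(ordered) - 1) / span


def expected_comm_delay(data_units, rate):
    """Expected time to generate data_units at the given average rate."""
    return data_units / rate


# Hypothetical injection time stamps (in cycles) recorded for one task.
injection_times = [0.0, 2.0, 4.0, 6.0, 8.0]
rate = estimate_poisson_rate(injection_times)
delay = expected_comm_delay(10, rate)
```

Here four gaps over eight cycles give a rate of 0.5 data units per cycle, so ten data units are expected to take 20 cycles.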
It should be noted that fitting a stochastic process $G(t)$ to the recorded data generation process could be arbitrarily difficult, as it might not be aligned with a known stochastic process or might change so quickly over time that we do not have sufficient data for estimating the distribution parameters. Otherwise, we can fit a stochastic model to $\{G(t), t \in T\}$ to further reduce the complexity of the model. As a case study, we assume $G(t)$ follows a Poisson distribution such that for output channel $o_k$:

$$P\{G(t) = k\} = \frac{(\lambda t)^k}{k!} e^{-\lambda t} \tag{6.2}$$

where $\lambda$ is the strength of the Poisson flow. Given the size of the data to be generated as $L$, the statistical average of $\tau_{k,c}$ is given by

$$E[\tau_{k,c}] = \frac{L}{E[G(t)]}\, t \tag{6.3}$$

Since $E[\tau_{k,c}]$ is an unbiased estimate of $\tau_{k,c}$, we use $E[\tau_{k,c}]$ to replace $\tau_{k,c}$. Of note, the Poisson assumption serves as a case study, whereas Equation (6.3) can be applied to other processes.

Given the constructed model $B(t)$, a follow-up question is how we can change the graphical model such that i) we can simulate the runtime variations of the traffic (i.e., Problem P3), and ii) we can scale it to test different NoCs while preserving the spatio-temporal characteristics of the traffic (i.e., Problem P2). Next, we address these problems by proposing a network generation algorithm based on complex network theory.

6.3 Evolvable Benchmark Synthesis

6.3.1 Overview

Given an application described by the proposed model $B(t)$, it is desirable to generate an array of benchmarks that are diverse in scale but "similar" to $B(t)$ in spatial and temporal behaviors. As we discussed in Section 3, the spatial dependencies are encoded by the structural characteristics of $B(t)$, while the temporal dependencies are embedded in the structure of the task (i.e., the timed finite state machine $M$ and the data generation process $G(t)$). Therefore, an efficient way to preserve such dependencies when editing the graph is to keep key structural features of the model at proper scales.
For example, if we look at the graphical model at the highest scale, we will observe a single node. If we then replicate this single node and go back to the original scale, we expect a graph that is very similar to the original one but doubled in size. Following the same idea, we can preserve any structure in the graph as long as we replicate a coarsened node at a proper scale. More precisely, we propose a scaling algorithm based on complex network generation that produces graphs that are similar to $B(t)$.

6.3.2 Measuring the Graph Similarity

To measure the similarity between graphs of various sizes, we introduce a set of structural metrics $M = \{\rho, \gamma, \beta, D_{avg}, P_{avg}\}$ which are widely used for comparing graphs. The average node degree $D_{avg}$ shows the local interconnection strength. The average path distance $P_{avg}$ shows the average distance between all possible pairs of nodes in the graph. We denote $\rho$ as the assortativity metric, which measures the tendency of nodes to connect with other nodes that have similar degrees, as shown in Figure 6.4. For directed graphs, the in- and out-assortativity are measured, respectively. In general, $\rho$ lies between $-1$ and $1$. When $\rho = 1$, the network is said to have perfect assortative mixing patterns; when $\rho = 0$, the network is non-assortative; while at $\rho = -1$ the network is completely disassortative. The clustering coefficient $\gamma$ is a measure of the degree to which nodes in a graph tend to cluster together. The betweenness centrality $\beta$ is an indicator of a node's centrality in a network. It is equal to the number of shortest paths from all vertices to all others that pass through that node. Betweenness has important implications for the proposed graphical model. To give an intuition, we visualize these metrics in Figure 6.4 using example graphs.

Figure 6.4: Example graphs visualize assortativity, betweenness and clustering.
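Three of these metrics can be computed with a short, dependency-free Python sketch. It rests on several simplifying assumptions that are not taken from this work: the graph is a dictionary mapping every node to its set of successors, the average path distance is taken over reachable ordered pairs only, and assortativity is the Pearson correlation of total (in plus out) degrees across edges instead of separate in- and out-assortativity. The task names in the example are hypothetical.

```python
from collections import deque
import math


def total_degrees(graph):
    """In-degree plus out-degree of every node (all nodes must be keys)."""
    deg = {v: len(succ) for v, succ in graph.items()}
    for succ in graph.values():
        for w in succ:
            deg[w] += 1
    return deg


def average_degree(graph):
    deg = total_degrees(graph)
    return sum(deg.values()) / len(deg)


def average_path_length(graph):
    """Mean BFS distance over all ordered pairs connected by a directed path."""
    total, pairs = 0, 0
    for src in graph:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for w in graph[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


def degree_assortativity(graph):
    """Pearson correlation between the degrees at the two ends of each edge."""
    deg = total_degrees(graph)
    xs = [deg[u] for u in graph for _ in graph[u]]
    ys = [deg[w] for u in graph for w in graph[u]]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy) if vx and vy else 0.0


# Hypothetical fork-join task graph: one source, three workers, one sink.
fork_join = {"DI": {"CF1", "CF2", "CF3"},
             "CF1": {"PG"}, "CF2": {"PG"}, "CF3": {"PG"}, "PG": set()}
d_avg = average_degree(fork_join)
p_avg = average_path_length(fork_join)
rho = degree_assortativity(fork_join)
```

For this small fork-join graph every edge joins a degree-3 node to a degree-2 node or vice versa, so the sketch reports a strongly disassortative structure.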
To understand the physical meaning of the metrics and motivate their link to the task structures in a realistic setting, we present a case study where we run the multi-threaded coarse-grain hierarchical parallel genetic algorithm (HPGA) and show the variations in task structure over time in Figure 6.5. The sequential version of HPGA has a very simple task structure consisting of three basic blocks: i) distribution of individuals (DI), i.e., candidate solutions; ii) calculation of fitness (CF); iii) production of the new generation (PG) based on fitness. In the example, the host is able to create new CF tasks and populations, i.e., pools of candidate solutions, to parallelize the execution. Over the execution time, the task structure varies while preserving several important structural features: i) The disassortativity of the graph is preserved, i.e., nodes of high degree tend to connect to nodes of low degree. The task graph of HPGA is strongly disassortative, suggesting the existence of global synchronization nodes. ii) The majority of nodes remain loosely clustered, which indicates the source of potential parallelism. iii) The DI task preserves its high betweenness centrality as multiple populations and the corresponding CF and PG tasks are created, which suggests that DI is a synchronization node.

Even though the example is just a case study with a very simple task structure, we can make the following observations: i) The structural feature analysis of the extracted application task graph can help us identify critical tasks, such as synchronization nodes, and potential parallelism. ii) By preserving key structural features like assortativity (not necessarily the absolute value of the metrics), we might be able to introduce realistic variations to the original task graph, especially when we have no prior knowledge of how the real application changes over time. Based on these observations, we next present our benchmark scaling algorithm.
6.3.3 A Complex Network-inspired Benchmark Scaling

Let us define the editing function as $\mathcal{E}: V \times E \to V \times E$ and $M(B(t))$ as the similarity vector under the above-mentioned metrics $M$. The problem of generating evolvable benchmarks can be formally stated as:

Figure 6.5: The evolution of genetic algorithm at runtime.

NoC benchmark scaling problem: Given an application profiled by $B(t)$ and the metrics of interest $M' \subseteq M$, determine a sequence of editing functions $\mathcal{E} = \langle \mathcal{E}_0, \ldots, \mathcal{E}_{n-1} \rangle$ such that

$$\min_{\mathcal{E}_0, \ldots, \mathcal{E}_{n-1}} \; \| M'(B(t)) - M'(\mathcal{E}(B(t))) \|_2 \tag{6.4}$$

subject to:

$$|V'(t)| \ge N, \quad V'(t) \in \mathcal{E}(B(t)) \tag{6.5}$$

Starting with a $B(t)$ as seed, we need to determine a series of editing functions applied to $B(t)$ to generate $B'(t)$ such that a subset of structural features $M'$ is preserved.

Lemma 1: The problem described by (6.4) is NP-hard.

Proof: The proof follows by noticing that the calculation of the average path of a graph requires finding all the paths in the graph. Thus the problem solution contains the solution to the longest path problem, which is NP-hard. Because the problem in (6.4) contains, as a subclass of problems, one that is NP-hard, it follows that (6.4) is also NP-hard.

Therefore, we propose a heuristic to solve this problem, which is inspired by complex network generation and the multiscale theory applied to solve combinatorial optimization problems in [109]. Alg. 3 shows the overall procedure of the proposed algorithm. The proposed algorithm is a V-cycle scheme that solves the problem described in (6.4) using coarsening and refining iterations at multiple scales, as shown in Figure 6.6.
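A single editing step and the resulting deviation in (6.4) can be sketched as follows. This Python example is only an illustration under stated assumptions: the editing operation is node replication, the metric subset consists of the average degree and the average path distance over reachable ordered pairs, and the graph representation (a dictionary from each node to its set of successors) and all names are hypothetical.

```python
from collections import deque
import math


def replicate_node(graph, node, copy_name):
    """Return a new graph in which copy_name has the same successors
    and the same predecessors as node; the input graph is not modified."""
    new_graph = {v: set(succ) for v, succ in graph.items()}
    new_graph[copy_name] = set(graph[node])
    for v, succ in graph.items():
        if node in succ:
            new_graph[v].add(copy_name)
    return new_graph


def metric_vector(graph):
    """(average total degree, average path distance over reachable pairs)."""
    edges = sum(len(succ) for succ in graph.values())
    d_avg = 2 * edges / len(graph)
    total, pairs = 0, 0
    for src in graph:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for w in graph[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    p_avg = total / pairs if pairs else 0.0
    return (d_avg, p_avg)


def metric_deviation(seed, edited):
    """Euclidean norm of the difference between the two metric vectors."""
    return math.dist(metric_vector(seed), metric_vector(edited))


# Hypothetical seed: one source, two workers, one sink.
seed = {"DI": {"CF1", "CF2"}, "CF1": {"PG"}, "CF2": {"PG"}, "PG": set()}
grown = replicate_node(seed, "CF1", "CF3")
deviation = metric_deviation(seed, grown)
```

Replicating a worker node adds one node and two edges while keeping the fork-join shape, so the deviation stays small relative to the metric values themselves.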
Our proposed algorithm starts from a seed application profiled by graph $B(t)$ and recursively changes the graph into greater scales (i.e., upscaling) until a sanity check is violated. The sanity check controls how deep the V-cycle goes by setting a lower bound on both the number of nodes and the number of edges remaining. Once it is violated, the upscaling stops. Then an array of downscaling functions is applied to project the graph with "coarser details" to a graph of a finer resolution. After the graph is downscaled, a series of editing functions, i.e., node replication, insertion or deletion, is performed. To scale the benchmark while preserving the structural characteristics of the original graph, only node replication is considered. In other cases, like the simulation of application variation, there is no restriction on editing operations.

Algorithm 3 Benchmark scaling algorithm Gen(B(t))
Require: Graph seed $B(t)$; selected metric set $M' \subseteq M$; downscaling function $\Phi$; upscaling function $\Phi^{-1}$; editing function $\mathcal{E}$; expected size of graph $N$
Ensure: A set of accepted graphs $B'(t)$
  i = 0
  if SanityCheck($B(t)$) == false then
      return $B(t)$
  else
      $B'(t) = \Phi(B(t))$
      $B'(t) = $ Gen($B'(t)$)
      while $|V'(t)| < N$ and $\| M'(B(t)) - M'(B'(t)) \|_2 > \epsilon$ do
          $B(t)_{scaled} = \Phi^{-1}(B'(t))$
          $B'(t) = \mathcal{E}(B(t)_{scaled})$
      end while
      return $B'(t)$
  end if

6.4 Experimental Results

Experimental setup: To validate our mathematical framework for benchmark generation and scaling that preserves the structural features of the extracted task graph, we consider three graph-based application traffic benchmarks, blackscholes, canneal and freqmine, from Parsec 2.1. We present two sets of experiments to validate the proposed application traffic model and NoC benchmark synthesis algorithms.
In the first set of experiments, we compare i) the packet injection patterns and ii) the average latency of the network during the execution of the region of interest (ROI), i.e., the parallel phase, obtained from a full-system simulation against those obtained on a dedicated cycle-accurate NoC simulator driven by the traffic generated by the proposed model. We learn the model $B(t)$ by instrumenting the applications and collecting execution traces. The full-system simulation is performed with the Gem5 simulator on 32- and 64-core in-order 2 GHz Alpha ISA processors running a Linux kernel of version 2.6.27, patched to support 4 to 64 Alpha cores. The NoC interfaced with the processors follows the Garnet network model with a mesh topology under deterministic dimension-order routing (DOR). The flit size is set to 8 bytes. Each input port has 4 virtual channels, and the depth of each virtual channel is 4 flits. The dedicated NoC simulator is a cycle-accurate simulator written in C++ with settings identical to those used in the full-system simulation.

[Figure 6.6: The overview of benchmark scaling algorithm. The procedure follows a V-cycle of coarsening and refining operations. By preserving nodes under proper scales, it is able to protect the structural features of interest. Diagram labels: (V,E) at finer resolution; (V,E) at coarser resolution; V-cycle; graph editing; (V',E') at original resolution; similar structure.]

We first report three experiments performed under the full-system simulations using Gem5 on a 32-core system and the NoC stress test using a cycle-accurate C++ NoC simulator. To measure how well the traffic behavior under the synthetic traffic fits that under the full-system simulation, i.e., whether the network communication exhibits close patterns under the two workloads, we measure the distribution of the average injection strength during the ROI over all 32 cores considered.
The average injection strength is calculated by dividing the total number of packets generated by the elapsed time. The results are reported in Figure 6.7 and normalized by the maximum injection strength of both cases. It is observed that the distributions of injection strength obtained under the synthesized NoC traffic are consistent with those measured during the full-system simulation in all three benchmarks.

[Figure 6.7: Measuring the distribution of injection strength over different processors under three application benchmarks, blackscholes, canneal and freqmine, using both full-system simulation and synthesized traffic workloads based on the proposed model during ROI. The injection strength is calculated as the injection rate of a processor averaged over the execution time. In all three cases, the synthetic traffic workloads stress the target NoC to exhibit close injection distributions. Panels: Blackscholes, Canneal, Freqmine. Axes: record core ID (horizontal), normalized injection strength (vertical). Series: recorded, simulated.]

[Figure 6.8: Comparison of average latency. Vertical axis: average latency/cycle. Series: 32-core full system, 32-core simulated, 64-core full system, 64-core simulated, for Blackscholes, Canneal and Freqmine.]

It should be noted that the injection strength distribution is contributed by all runtime communication and computation events that either produce or consume data. These events are inter-coupled via the task dependencies embedded in the execution path of the application. Without incorporating such dependencies into the synthesized traffic, it is difficult to closely fit the real traffic behaviors that are usually identified through full-system simulation.
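As a small illustration of the injection strength measure described above, the following sketch computes the per-core injection strength (packets generated divided by elapsed time) and normalizes both series by the maximum over both cases. The packet counts and cycle count are made-up numbers, and the function names are hypothetical.

```python
def injection_strength(packets_per_core, elapsed_cycles):
    """Average injection strength per core: packets generated / elapsed time."""
    return [p / elapsed_cycles for p in packets_per_core]


def normalize_jointly(recorded, simulated):
    """Normalize both series by the maximum strength observed in either case."""
    peak = max(max(recorded), max(simulated))
    return [r / peak for r in recorded], [s / peak for s in simulated]


# Made-up packet counts for three cores over the same ROI length.
recorded = injection_strength([1200, 800, 400], elapsed_cycles=10_000)
simulated = injection_strength([1000, 900, 500], elapsed_cycles=10_000)
rec_norm, sim_norm = normalize_jointly(recorded, simulated)
```

Normalizing by a shared maximum keeps the two curves on the same scale, so that a gap between them reflects a real difference in injection behavior rather than a difference in normalization.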
In addition to the injection strength distribution, we have also measured the average network latency for networks of different sizes driven either by the full-system simulation trace or by the traffic generated by the proposed model. The results are reported in Figure 6.8. Under different network settings, the NoC simulation driven by the proposed model demonstrates consistently close latency performance compared to that measured under full-system simulation, with a mean error of 1.2% and 2.1% for the 32-core and 64-core simulations, respectively.

In the second set of experiments, we check whether the proposed model is able to scale up the constructed application model to an expected scale while introducing minimal deviation in the set of structural metrics of interest (see Section 4.2). To motivate the protection of structural features in a graphical model, not only in the proposed model but in general, we should be aware of the following fact: as mentioned in the previous discussion, the structural characteristics of most application graphical models naturally encode the spatial dependencies through the construction of their geometric structures, i.e., the connection of nodes via edges. In fact, prior research efforts on the parallelization of algorithms largely rely on the analysis of such structural features and their implications. A change in such structures has a significant influence on the execution of the application. Scaling a graphical model that can be used for traffic synthesis is therefore a shortcut to efficiently obtaining an array of benchmarks.
[Figure 6.9: Measuring structural similarities of graphical models scaled by a factor of 4, 8 and 16. Panels: A) Blackscholes, B) Canneal, C) Freqmine, each for scaling factors 4, 8 and 16. Axes: metrics ($\beta$, $\gamma$, $\alpha$, avg P, avg D) on the horizontal axis, similarity ratio on the vertical axis.]

However, editing the model arbitrarily might invalidate its applicability to traffic synthesis due to the loss of fidelity. Therefore, we propose the NoC benchmark scaling algorithm based on complex network theory to obtain new models with the expected scales while respecting the original structural characteristics. We first constructed the model based on the application trace collected during the ROI phase for all three applications. Then, we used the models as seeds for the proposed algorithm. New models are generated with different expected network sizes, i.e., scaling factors of 4, 8 and 16. We measured similarity under a set of metrics. The results are reported in Figure 6.9. For each scaling factor, the measurement is averaged over 100 iterations. Several key observations can be made from the results: i) The proposed algorithm maintains a low level of deviation on average across the set of metrics considered. For a scaling factor of 2, all three graphs stay structurally consistent with the original graph. ii) As the scaling factor increases, the average deviation increases due to the structural modifications introduced randomly during the refining process. The refining process in the proposed algorithm randomly connects each newly added node, whether replicated or randomly introduced, to an existing node in the graph.
As the scaling factor goes up, the graph might undergo more levels of coarsening and refining, i.e., a "deep-V" process, which increases the chance of random modifications to the graph during the process. Overall, the proposed model can reliably scale the benchmark by at least a factor of 8 and preserves some set of metrics even with a factor of 16.

6.5 Summary

In this work, we have proposed a mathematical framework to synthesize real-world benchmarks that capture the spatio-temporal dependencies of applications. We validate the synthesized traffic through a statistical comparison against full-system simulation results under real-world application workloads. To allow for the realistic generation of scalable benchmarks that preserve the spatio-temporal dependencies in applications, we have also proposed a NoC benchmark scaling algorithm. The experimental results show that the scaled graphical models are structurally consistent with the original graphs.

Chapter 7
Application-driven Runtime Reconfigurable Communication System Synthesis for CPS Design: A General Mathematical Framework

Workloads induced by real-world applications demonstrate strong spatio-temporal variability. Heterogeneous application tasks create inter-coupled traffic patterns with time-varying data and control dependencies, exacerbating the requirements of workloads running over a large-scale NoC-based manycore platform [19]. For instance, big-data applications like biological simulations [148] usually exhibit high dimensionality in their task structures [149], which also vary vastly as a function of widely ranging objectives and unpredictable user input. Consequently, the failure to capture such variations through the development of a best-fit NoC-based system translates to systems that are prone to computation, communication and power inefficiencies.
To address the spatio-temporal behavior of applications, prior research efforts have focused on two design methodologies: the offline analysis of applications combined with the optimization of application-specific NoC architectures, and the development of online optimization techniques. For instance, application-specific NoC synthesis has been well studied to achieve the best possible performance for a specific application. The synthesis usually generates well-optimized NoC configurations and application mappings that are superior to a baseline design in terms of energy efficiency and performance [124][132], application-specific performance (NoC-based accelerators) [83], reliability [153], and lifetime cost [90]. Such optimization considers one or a subset of applications and captures their traffic patterns and task structures. The synthesis is static, with no adaptivity, and is performed prior to practical deployment, assuming that the application structure does not change over time. Thus, performance is very well optimized only if a fixed set of applications is considered throughout the life cycle of the platform and the applications are almost temporally homogeneous. However, the assumptions made before the synthesis usually do not hold. In reality, a dedicated NoC is rarely developed solely for exclusive task processing. Instead, the NoC usually serves as the fundamental communication infrastructure that integrates a set of heterogeneous processing entities expected to run a wide range of applications upon deployment. Most applications are very diverse in terms of communication patterns and, more importantly, their traffic patterns also change over time. Thus, a statically customized network structure can hardly be a feasible traffic carrier that brings good performance in general cases.
[Figure 7.1: Overview of the proposed mathematical framework. The reconfigurable NoC system and its associated applications are characterized through the proposed system and application graphical models. Then the optimization of the network structural configuration is performed by exploiting the submodular property of the problem. In case a valid solution does not exist, we introduce a relaxation of the problem constraints to obtain a feasible solution while preserving the optimality bound. Diagram labels: application profiling (data, control); application modeling (Section II-B, $P(k,t)$); system profiling; architectural domain at time t (data, control); system modeling (Section II-A); greedy construction (Section II-C); submodular optimization (Section III); architectural study; compatibility check; relaxation; circuit component N.]

As an alternative to application-specific NoCs, reconfigurable NoCs address this problem by changing their structural features [26][61], routing algorithm [107], resource management strategy [75][67], or task assignments [125] to fit time-varying application requirements. In spite of their successful application in diverse settings, we still lack a solid theoretical foundation upon which an analytical design methodology could be built to optimize the reconfiguration with optimality guarantees in general cases. Most previous reconfigurable NoCs are proposed, evaluated and validated through a subset of experimental instances without looking at the entire problem space from a mathematical perspective. To address the above-mentioned problems, we propose a general mathematical modeling framework that considers the spatio-temporal characteristics of workloads, and we make the following novel contributions:

• We propose a mathematical framework for capturing the dynamic nature of reconfigurable NoCs and applications that enables the formulation of major reconfigurable NoC optimization problems. Our analytical formalism can be applied to arbitrary network topologies and sizes, routing, or heterogeneous resource allocation problems.

• We illustrate the efficacy of this formalism by formulating the NoC reconfiguration as a dynamic optimization problem. We prove that this optimization is NP-hard and demonstrate that our mathematical formulation satisfies the submodularity property, justifying that our proposed greedy-based algorithms can attain the optimality region.

• We evaluate the impact of the proposed mathematical formalism by solving the NoC reconfiguration problem. The experimental results show a 52.3% reduction in network latency, an increased capability of handling heavy traffic, and a 30.2% reduction in energy for our lightweight reconfigurable NoC when compared to the baseline design.

The paper is organized as follows: In Section II, we formally set up the reconfigurable NoC platform and introduce several definitions that help to enrich the model expressivity. Section III presents a case study of the NoC runtime reconfiguration problem, considering the case where the application is characterized through a time-dependent graphical model (e.g., a time-varying application graph). We show that this optimization problem is NP-hard and prove its submodularity. By exploiting the submodularity property, we propose greedy heuristics with bounded optimality. Sections IV and V summarize our experimental results and our main achievements.

7.1 Mathematical Modeling and Optimization Framework

7.1.1 Architectural Modeling Framework

Definition 1: A reconfigurable network-on-chip (NoC) is defined as a connected dynamic directed graph $G(t) = (N(t), E(t), \mathcal{R}(t), C)$ at time $t$, where $n_i \in N(t)$ is a tile and represents a collection of functional units, and $E(t)$ denotes the collection of physical links between different nodes in $N(t)$. The edge $e_{i,j} \in E(t)$ denotes the link from $n_i$ to $n_j$.
We should note that $N(t)$ is the set of enabled network functional units (e.g., DSPs, general processors, customized processing elements, memories or communication transceivers). The composition of $N(t)$ can change as subsets of nodes are enabled or disabled over time. Also, a reconfigurable NoC usually takes advantage of different switching techniques for the best possible performance under diverse workloads. Therefore, $e_{i,j}$ could be a regular link between two routers or a direct link established between tiles without interfacing with routers. To distinguish between regular and circuit links, we introduce the following definition:

Definition 2: Edge $e_{i,j}$ is a circuit link if there exists a direct link from $n_i$ to $n_j$ that allows circuit switching. $E(t)$ induces a function $I : E(t) \to \mathbb{R}$ such that $e_{i,j}$ is a circuit link if and only if $I(e_{i,j}) = 1$.

The circuit switching link is set up in the regular network to improve throughput over critical traffic paths in reconfigurable NoCs. Circuit switching reserves the entire bandwidth of the dedicated link and skips all routing stages, thus greatly improving the communication throughput when used between a pair of nodes. By connecting different circuit links together, it is possible for a subset of nodes to communicate over such dedicated circuit links for faster data exchange. To characterize such a collection of nodes, we introduce the circuit component concept as follows:

Definition 3: A subset $N'(t) \subseteq N(t)$ is a circuit component $T$ if and only if for any pair of nodes $n_i, n_j \in N'(t)$, there exists a circuit link between them.

A circuit component $T$ in $G(t)$ forms a connected subgraph enabling data transfer over dedicated links with augmented bandwidth and reduced latency. For example, the simplest circuit component is a pair of nodes with a direct link that skips the routers at both ends.
Generally speaking, the reconfiguration of a NoC can be viewed as a process of enabling circuit components in the regular network to provide express ways for critical data transmission.

[Figure 7.2: Augmented connectivity sets are shown for $n_0$ and $n_3$. $n_3$ is configurable to set up a link with $n_1$ and $n_5$, whereas both links cannot be established at the same time. A similar link to $n_0$ is invalid due to physical limitation. Diagram labels: nodes 0 to 5; $C(n_3) = \{n_1, n_5\}$ with $X_1 + X_5 \leq 1$; $C(n_0) = \{n_2\}$; invalid link; possible link.]

Definition 4: The function $\mathcal{R}(t) : N(t) \times N(t) \to G(t)$ is the routing function at time $t$ that maps a pair of nodes $n_i$ and $n_j$ to a subgraph $G'(t)$ of $G(t)$ such that: i) $n_i, n_j \in G'(t)$ and ii) there exists at least one path from $n_i$ to $n_j$.

Of note, $(N(t), E(t), \mathcal{R}(t))$ forms a 3-tuple that defines a regular NoC without reconfiguration features, where $N(t)$ and $E(t)$ characterize its structural properties and $\mathcal{R}(t)$ denotes all possible paths for data exchange. To be able to model and express the reconfiguration capability, we define the augmented connectivity function $C$:

Definition 5: $C : N(t) \to 2^{N(t)}$ is the augmented connectivity function (ACF) that associates each node $n_i \in N(t)$ at time $t$ with a subset of nodes $C(n_i) = \{n_j \mid n_j \in N(t)\}$ to which $e_{i,j}$ could possibly be set up. The set $C(n_i)$ is called the augmented connectivity set (ACS).

By definition, $C(n_i)$ denotes all the possible links that could alternatively be established for $n_i$ other than the current links in $\{e_{i,j} \mid e_{i,j} \in E(t)\}$. We should note that the detailed form of $C$ and its range are limited by the physical constraints of a specific architecture. For instance, in Figure 7.2, a physical link cannot be inserted between $n_3$ and $n_0$, which are too far away from each other given physical limitations (e.g., the propagation delay should be less than one cycle). Moreover, a link might not be set up from $n_i$ to $n_j$ even if $n_j \in C(n_i)$.
Such a constraint comes mostly from architectural limitations that forbid some links from being set up simultaneously, as shown in Figure 7.2. To consider such cases, we introduce a set of constraints associated with $C(n_i)$. Let $X_j$ be a binary variable such that $X_j = 1$ if $n_j \in C(n_i)$ and $e_{i,j}$ is set up. Otherwise, $X_j = 0$.

$$\sum_j \alpha_j X_j \leq K \qquad (7.1)$$

where $\alpha_j \in \{0, 1\}$ represents whether $X_j$ is masked or not in the constraint, and $K \in \mathbb{N}$ denotes the maximum number of links that can be established simultaneously. Equation (7.1) defines a constraint that encodes the physical limitation that a subset of nodes in $C(n_i)$ cannot connect to $n_i$ at the same time. By changing the configuration of $\{\alpha_j\}$, eq. (7.1) forms an array of constraints that capture all such physical limitations. Thus, we can define the reconfigurable NoC in the general case:

Definition 6 (reconfigurability): A node $n_i$ is reconfigurable if and only if $C(n_i) \neq \emptyset$. An NoC system $G(t) = (N(t), E(t), \mathcal{R}(t), C)$ is reconfigurable at time $t$ if there exists at least one node $n_i \in N(t)$ that is reconfigurable.

Definition 6 formally defines reconfigurability. An NoC system $G(t)$ is reconfigurable as long as any one of its nodes has a non-empty ACS. By picking nodes from the ACS and constructing new edges over time, $E(t)$ evolves as a function of the reconfiguration decisions made prior to time $t$. Of note, Definition 1 enforces the connectivity of a given NoC system. Therefore, the routing algorithm $\mathcal{R}(t)$ will also change accordingly to guarantee that all the nodes are reachable. Combined with the dynamics of $N(t)$, $G(t) = (N(t), E(t), \mathcal{R}(t), C)$ describes a dynamical system with time-varying structural characteristics, i.e., $N(t)$, $E(t)$ and $\mathcal{R}(t)$, driven by the reconfiguration decisions made upon $C(n_i)$ over time. By introducing attributes of interest and associating them with $G(t)$ and the application tasks, we are able to construct the theoretical basis on which general NoC reconfiguration could be formulated as optimization problems.
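The link constraint (7.1) and Definition 6 can be checked mechanically. The sketch below is only an illustration under assumed data structures: the ACS is a dictionary from node index to a set of candidate neighbors, and the mask and link selection are dictionaries keyed by node index. The values encode the example of Figure 7.2, where $n_3$ may link to $n_1$ or $n_5$ but not both; the function names are hypothetical.

```python
def links_allowed(mask, selected, k):
    """One instance of constraint (7.1): sum_j mask[j] * X[j] <= K."""
    return sum(mask[j] * selected.get(j, 0) for j in mask) <= k


def is_reconfigurable(acs):
    """Definition 6: the NoC is reconfigurable if some node has a non-empty ACS."""
    return any(len(candidates) > 0 for candidates in acs.values())


# Augmented connectivity sets from the Figure 7.2 example.
acs = {0: {2}, 1: set(), 2: set(), 3: {1, 5}, 4: set(), 5: set()}

# Constraint attached to n_3: X_1 + X_5 <= 1.
mask_n3 = {1: 1, 5: 1}
one_link = links_allowed(mask_n3, {1: 1}, k=1)
both_links = links_allowed(mask_n3, {1: 1, 5: 1}, k=1)
```

Several masks can be attached to the same node to express different groups of mutually exclusive links, which mirrors how eq. (7.1) forms an array of constraints.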
To provide an illustrative example of this framework, we present the runtime NoC reconfiguration optimization problem given time-dependent workloads characterized by graphical models.

7.1.2 Application Modeling Framework

Generally speaking, any runtime NoC reconfiguration is an optimization process that searches for the best-fit network structural configurations and application task assignments (i.e., the mapping of tasks to specific tiles), given an objective function and a set of constraints. In contrast to application-specific NoC synthesis, runtime optimization is an online process repeatedly synchronized with the time-dependent characteristics of the running applications. This process usually consists of two phases: an execution phase and an optimization phase. In the optimization phase, the optimizer takes the profile of the applications as input and decides on the best possible network configuration for the execution phase. Profiling the application is a very complex research topic, and several models have been developed in different contexts. A detailed discussion is beyond the scope of this work. Next, we consider profiling applications via graphical models, although the proposed NoC system model $G(t)$ places no constraints on profiling techniques.

Definition 7: An application $A(t)$ is a dynamical directed graph $A(t) = (V(t), C(t), P(\Sigma, F))$ at time $t$. Each vertex $v_i \in V(t)$ is a task of the application. $c_{i,j} \in C(t)$ represents a directed data/control dependency from task $v_i$ to task $v_j$.

Each application $A(t)$ induces a profiling system $P(\Sigma, F)$. $\Sigma$ is the finite functionality alphabet given a specific application context. It contains symbols that characterize the functionality of interest. The functionality profiling function $F : V \to 2^{\Sigma}$ relates each task $v_i$ of $A(t)$ at time $t$ to a finite set of symbols defined in $\Sigma$, called the functionality requirement set $F(v_i)$.
To provide some intuition, $\Sigma$ can be as simple as the integer set $\{0, 1, 2, 3\}$ in the context of numeric computation, where 2 and 3 represent "addition" and "multiplication", and 0 and 1 represent "integer" and "floating-point", respectively. $F$ induces a functionality requirement set for each node $v_i$. A node $v_i$ with $F(v_i) = \{1, 2, 3\}$ requires floating-point addition and multiplication operations, while a node $v_j$ with $F(v_j) = \{0, 2\}$ requires only integer addition. By changing the alphabet $\Sigma$ based on the application context, i.e., profiling the application tasks with the details of interest, $P(\Sigma, F)$ is able to characterize each task with sufficient mathematical detail.

Moreover, we use the same profiling system $P(\Sigma, F)$ to characterize the "capabilities" of each tile $n_i \in G(t)$, so that we can easily compare the capabilities of the NoC $G(t)$ with the functionality requirements from the application domain. This constitutes the foundation for application task assignment. For each $n_i$ of $G(t)$, we define $F(n_i)$ as the capability set such that an application task $v_i$ with functionality requirement set $F(v_i)$ can be mapped to $n_i$ only if $F(v_i) \subseteq F(n_i)$.

Each directed edge $c_{i,j} \in C(t)$ is characterized by a data generation process $P_{c_{i,j}}(k, t) = P\{N_\tau(t) = k, k \in \mathbb{N}\}$, where $N_\tau(t)$ is a counting process that denotes the number of packets generated in the time interval $[t, t + \tau]$. $P_{c_{i,j}}(k, t)$ captures the time-varying behavior of the application. To provide some intuition, the data generation process could be a Poisson process if no memory effect is present, or it could be a fractal process governed by power laws exhibiting long-range memory dependency. Therefore, for an execution phase of length $T$, the average traffic volume $q(c_{i,j})$ from task $v_i$ to task $v_j$ can be calculated as follows:

$$q(c_{i,j}) = \int_{t}^{t+T} \int_{-\infty}^{\infty} k \, P_{c_{i,j}}(k, t) \, dk \, dt \qquad (7.2)$$

We define by $b(c_{i,j})$ the minimal bandwidth requirement for communication from $v_i$ to $v_j$.
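To make eq. (7.2) concrete, the following sketch evaluates it numerically for the memoryless case mentioned above. It rests on assumptions that are not in the text: the counting process is Poisson with a rate given by a hypothetical function `rate_fn(t)`, the counting window equals one time unit, the inner integral over $k$ is replaced by a truncated sum over packet counts, and the outer integral uses a midpoint rule.

```python
import math


def mean_packets(mu, k_max=200):
    """Truncated sum over k of k * Poisson(k; mu), the discrete inner term of (7.2)."""
    p = math.exp(-mu)  # P(N = 0)
    total = 0.0
    for k in range(1, k_max + 1):
        p *= mu / k    # P(N = k) obtained from P(N = k - 1)
        total += k * p
    return total


def traffic_volume(rate_fn, t_start, horizon, steps=1000):
    """Midpoint-rule integral over [t_start, t_start + horizon] of the mean count."""
    dt = horizon / steps
    return sum(
        mean_packets(rate_fn(t_start + (i + 0.5) * dt)) * dt
        for i in range(steps)
    )


# Constant rate of 2 packets per time unit over an execution phase of length 50.
q = traffic_volume(lambda t: 2.0, t_start=0.0, horizon=50.0)
```

For a constant rate the result reduces to rate times horizon, which gives a quick sanity check; a fractal or otherwise long-memory process would need a different model for the inner term.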
It should be noted that the bandwidth requirement $b(c_{i,j})$ comes from the execution time constraints posed by the associated tasks. We define $B(e_{i,j}, I(e_{i,j}))$ as the bandwidth provided by the link $e_{i,j} \in E(t)$ given the NoC system $G(t)$. The attribute $I(e_{i,j})$ is introduced in $B$ to account for the bandwidth difference between regular and circuit links. We check the validity of assigning a task to a tile in the NoC system by comparing the functionality requirement and capability sets. In addition, a pair of tasks $v_i$, $v_j$ can be mapped to $n_i$ and $n_j$ only if $b(c_{i,j}) \leq B(e_{i,j}, I(e_{i,j}))$ and $b(c_{j,i}) \leq B(e_{j,i}, I(e_{j,i}))$.

7.1.3 Runtime Reconfiguration Problem Formulation

Based on the mathematical description of the reconfigurable NoC and the application, the runtime NoC reconfiguration is performed in the optimization phase by modifying the structural properties of $G(t)$, i.e., changing $E(t)$ and constructing circuit components $T$, based on the ACF $C$, given the application $A(t)$ and the execution phase horizon $T$. To formally state the optimization problem, we first introduce the definition of a compatible partition of an application $A(t)$ given $G(t)$. Let $\Pi = \{\pi_1, \ldots, \pi_n\}$ be a partition of a given application $A(t) = (V(t), C(t), P(\Sigma, F))$.

Definition 8: The partition $\Pi$ is compatible with the NoC system $G(t) = (N(t), E(t), \mathcal{R}(t), C)$ at time $t$ if and only if for any partition element $\pi_k$ there exists a circuit component $T_k$ in $G(t)$ such that

$$\bigcup_{v_i \in \pi_k} F(v_i) \subseteq \bigcup_{n_i \in T_k} F(n_i), \qquad (7.3)$$

$$|\pi_k| \leq |T_k|, \qquad (7.4)$$

$$T_i \cap T_j = \emptyset, \quad \forall i \neq j \qquad (7.5)$$

By Definition 8, the reconfiguration optimization process can be understood as a search process in which a compatible partition is found such that $G(t)$ is also partitioned by the corresponding circuit components, which cover the maximum amount of the overall traffic volume. Alternatively stated, the reconfiguration objective is to adapt the NoC structure to provide maximum bandwidth, i.e., the dedicated traffic path, to the maximum possible share of traffic.
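The three conditions of Definition 8 translate directly into set operations. The sketch below is illustrative and makes assumptions beyond the text: the pairing between partition elements and candidate circuit components is given as two parallel lists rather than searched for, each candidate is assumed to already be a valid circuit component, and functionality sets reuse the example alphabet $\{0, 1, 2, 3\}$. All identifiers are hypothetical.

```python
def is_compatible(partition, components, task_needs, tile_caps):
    """Check (7.3)-(7.5) when partition element k is paired with component k."""
    if len(partition) != len(components):
        return False
    used_tiles = set()
    for tasks, tiles in zip(partition, components):
        needed = set().union(*(task_needs[v] for v in tasks))
        offered = set().union(*(tile_caps[n] for n in tiles))
        if not needed <= offered:        # (7.3) functionality coverage
            return False
        if len(tasks) > len(tiles):      # (7.4) tiles are not shared by tasks
            return False
        if used_tiles & set(tiles):      # (7.5) components are disjoint
            return False
        used_tiles |= set(tiles)
    return True


# Tasks: "vi" needs floating-point add and multiply, "vj" needs integer add.
task_needs = {"vi": {1, 2, 3}, "vj": {0, 2}}
tile_caps = {"n0": {0, 1, 2, 3}, "n1": {0, 2}, "n2": {0, 2}}

ok = is_compatible([{"vi", "vj"}], [{"n0", "n1"}], task_needs, tile_caps)
too_weak = is_compatible([{"vi"}], [{"n1", "n2"}], task_needs, tile_caps)
```

Note that (7.3) only compares the unions of requirements and capabilities, so it is a necessary screening step and does not by itself produce a task-to-tile mapping inside the component.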
Constraint (7.3) represents a sanity check which guarantees that the circuit component $T_k$, to which the subset of tasks in $\pi_k$ is assigned, covers all functionalities required to execute them. Constraint (7.4) indicates that the computing resources within $T_k$ cannot be shared at the same time. Constraint (7.5) induces a one-to-one assignment between a partition element $\pi_k$ and a circuit component $T_k$. To quantify the bandwidth gain due to the adoption of circuit components, we introduce the intra-component bandwidth factor $f_a(T_i)$ and the inter-component bandwidth factor $f_r(T_i, T_j)$. The factor $f_a$ represents the ratio between the bandwidth of a circuit link and that of a regular link. The factor $f_r$ is defined as follows:

$$f_r(T_i, T_j) = \frac{O(T_i, T_j)}{D(T_i, T_j)} \qquad (7.6)$$

where $O(T_i, T_j)$ represents the number of non-overlapping links of all possible paths from $T_i$ to $T_j$, and $D(T_i, T_j)$ is the Manhattan distance from $T_i$ to $T_j$, calculated as the minimum Manhattan distance from any node $n_i \in T_i$ to any node $n_j \in T_j$. The factor $f_r$ gives an upper bound for the bandwidth gain, assuming that data can be promptly exchanged within a circuit component so that all paths between two circuit components can be used simultaneously to send the data, and that the data is gathered from those paths at the destination node at no cost. Thus, we can formulate the runtime NoC reconfiguration problem as follows:

Runtime NoC reconfiguration – primal problem formulation: Given a reconfigurable NoC system $G(t) = (N(t), E(t), \mathcal{R}(t), C)$, an application $A(t) = (V(t), C(t), P(\Sigma, F))$ at time $t$ and an execution phase horizon $T$, find a partition $\Pi$ of $A(t)$ and the corresponding circuit components $\{T_k\}$ that minimize the following cost function:

$$\min_{\Pi, \mathcal{T}} \sum_{T_i} \sum_{T_j \neq T_i} \left( \frac{q(T_i, T_j)}{f_r(T_i, T_j)} + \lambda \, \big( D(T_i, T_j)(E_s + E_l) \big) \, q(T_i, T_j) \right) \qquad (7.7)$$

Subject to: Partition $\Pi$ is compatible.

Of note, $q(T_i, T_j)$ is the sum of the average traffic volume from $T_i$ to $T_j$ given the length of the execution phase $T$.
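A short numerical sketch may help in reading the cost function (7.7). It is an illustration only and assumes inputs the text does not specify in this form: tiles have 2D mesh coordinates, the inter-component traffic volumes and the counts of non-overlapping links $O(T_i, T_j)$ are supplied as dictionaries keyed by component index pairs, and the weight of the energy term is passed as `lam`. All names and numbers are hypothetical.

```python
def manhattan(comp_a, comp_b, coords):
    """D(Ti, Tj): minimum Manhattan distance between any two member tiles."""
    return min(
        abs(coords[a][0] - coords[b][0]) + abs(coords[a][1] - coords[b][1])
        for a in comp_a
        for b in comp_b
    )


def reconfiguration_cost(components, coords, traffic, disjoint_links,
                         e_switch, e_link, lam):
    """Evaluate (7.7) over all ordered pairs of distinct circuit components."""
    cost = 0.0
    for i, t_i in enumerate(components):
        for j, t_j in enumerate(components):
            q = traffic.get((i, j), 0.0)
            if i == j or q == 0.0:
                continue
            d = manhattan(t_i, t_j, coords)
            f_r = disjoint_links[(i, j)] / d          # eq. (7.6)
            cost += q / f_r + lam * d * (e_switch + e_link) * q
    return cost


# Two components on a small mesh; only T0 -> T1 carries traffic.
coords = {"a": (0, 0), "b": (0, 1), "c": (2, 1)}
components = [{"a", "b"}, {"c"}]
cost = reconfiguration_cost(
    components, coords,
    traffic={(0, 1): 10.0},
    disjoint_links={(0, 1): 2},
    e_switch=1.0, e_link=1.0, lam=0.5,
)
```

In this example the distance is 2 and $f_r = 1$, so the first term contributes 10 and the energy term contributes 20; moving more traffic inside components lowers both terms at once.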
The volume $q(T_i, T_j)$ is calculated as the sum of the traffic volume from all the nodes in $T_i$ to all the nodes in $T_j$, given that $\pi_k$ and $\pi_l$ are assigned to them, respectively. The summation in the objective function (7.7) is determined by two terms that consider the efficiency of communication and energy, respectively. The communication efficiency is quantified by how much traffic is left with no dedicated link to use (i.e., $q(T_i, T_j)$) and how well the traffic can be delivered (i.e., $f_r(T_i, T_j)$). Ideally, the first term is zero when either all traffic uses the dedicated communication bandwidth or there exists an infinite number of non-overlapping traffic paths between two circuit components. $E_s$ is the switching energy in a router and $E_l$ is the traversing energy per hop. $D(T_i, T_j)$ is the Manhattan distance from $T_i$ to $T_j$. Thus, the second term calculates the overall energy consumption of all inter-component traffic that travels over regular links. $\lambda$ is a tuning parameter decided in experiments, which balances the contribution of these two terms.

The objective function (7.7) seeks to guide the reconfiguration of the NoC structure given the current application workload such that the number of circuit components is maximized to sustain the traffic requirements while also improving the energy efficiency, i.e., the dedicated links provide superior bandwidth and reduce energy consumption by skipping multiple routing stages. Next, we show that this problem is NP-hard and propose a greedy algorithm that solves the problem while also considering the convergence guarantees to optimality.

7.1.4 Complexity and Algorithm Analysis

Theorem 1: The runtime NoC reconfiguration problem described in (7.7) is NP-hard.

Proof: The proof follows by noticing that there exists an NoC system $G(t) = (N(t), E(t), \mathcal{R}(t), C)$ such that any partition $\Pi$ of $A(t)$ is compatible with it.
Therefore, the problem reduces to a quadratic assignment problem, which is NP-hard, between the partition $\{\pi_k\}$ and the circuit components $\{T_k\}$ that minimizes (7.7). Because (7.7) contains (as a subclass of problems) one that is NP-hard, it follows that (7.7) is also NP-hard.

Next, we show that the feasibility space of (7.7) given by the compatibility constraint (see Definition 8) is submodular. To prove that the optimization problem (7.7) is submodular, it should be noted that (7.7) can be equivalently defined as its dual maximization problem:

Runtime NoC reconfiguration – dual problem formulation: Given a reconfigurable NoC system $G(t) = (N(t), E(t), \mathcal{R}(t), C)$, an application $A(t) = (V(t), C(t), P(\Sigma, F))$ at time $t$ and an execution phase horizon $T$, find a partition $\Pi$ of $A(t)$ and the corresponding circuit components $\{T_i\}$ that maximize the following objective function:

$$\max_{\Pi, \mathcal{T}} \sum_{T_i} \left( f_a(T_i) \frac{q(T_i)}{q} + \frac{1}{\lambda} E_{\Delta}(T_i) \right) \qquad (7.8)$$

Subject to: Partition $\Pi$ is compatible.

Of note, $q(T_i)$ is the overall traffic volume in $T_i$ and $q$ denotes the overall traffic volume of $A(t)$ given the execution phase horizon $T$. $f_a(T_i)$ is the intra-component bandwidth factor, i.e., the ratio of bandwidth between a circuit link and a regular link. Thus, the first term in the summation considers the bandwidth gain obtained by assigning dedicated circuit links to application workloads. $E_{\Delta}(T_i)$ represents the amount of energy saved by using dedicated links for the traffic $q(T_i)$, compared to the energy consumed by using regular links instead. $E_{\Delta}(T_i)$ depends on how each task in $A(t)$ is assigned to tiles in $G(t)$.

Lemma 1: The runtime NoC reconfiguration objective function described in (7.8) is submodular.

Proof: Given an application partition $\Pi$, let us define two compatible circuit component sets $\mathcal{T}_A \subseteq \mathcal{T}_B$, where $\mathcal{T}_A$ or $\mathcal{T}_B$ is a collection of circuit components that are compatible with a subset of partition elements in $\Pi$. Let $T_e \in \mathcal{T}$ be an arbitrary circuit component to which a partition element $\pi_k$ will be assigned.
We denote by $G(T_A)$ the value of the objective function defined in (7.8) given $T_A$. Thus, if $T_e \notin T_B$,

$$G(T_A \cup T_e) - G(T_A) = f_a(T_e)\,\frac{q(T_e)}{q} + \frac{1}{\lambda}\,E_{\Delta}(T_e) \tag{7.9}$$

$$G(T_B \cup T_e) - G(T_B) = f_a(T_e)\,\frac{q(T_e)}{q} + \frac{1}{\lambda}\,E_{\Delta}(T_e) \tag{7.10}$$

Otherwise, if $T_e \in T_B$, $G(T_B \cup T_e) - G(T_B) = 0$. Therefore,

$$G(T_B \cup T_e) - G(T_B) \le G(T_A \cup T_e) - G(T_A) \tag{7.11}$$

holds for any $T_A \subseteq T_B \subseteq T$ and $T_e \in T$. Hence, the objective function (7.8) is submodular. Moreover, (7.9) shows that $G$ is also monotonic. For monotonic submodular functions [97], we have the following theorem.

Theorem 2: Given a monotonic submodular function $G$ with $G(\emptyset) = 0$, the greedy maximization algorithm returns

$$G(\Pi, T_{greedy}) \ge \left(1 - \frac{1}{e}\right) \max_{|T| \le N} G(\Pi, T) \tag{7.12}$$

where $N$ is the maximum number of circuit components that can possibly be constructed.

Thus, even though the runtime NoC reconfiguration problem is NP-hard, we can propose a greedy heuristic, Algorithm 4, with a guaranteed optimality bound. In Algorithm 4, $G(t)$ and $A(t)$ are first constructed for an execution phase of length $T$. The algorithm then partitions the application $A(t)$, constrained by the physical limitations posed by $G(t)$, such that the maximum possible amount of traffic is covered within the partition components $\{\pi_k\}$, i.e., the cut weight is minimal. We assume the Fiduccia-Mattheyses algorithm [46], which is able to handle unbalanced partitions. The physical limitations mainly serve to rule out infeasible partitions. A partition is considered infeasible if any of its components exceeds a predefined size such that no compatible circuit component can be found; see Definition 8, (7.3), (7.4), and (7.5). A greedy heuristic then constructs the circuit component $T_e$ that maximizes the incremental gain of $G$ and adds it to $T$. The cycle repeats until $\Pi$ is compatible. Algorithm 4 returns a solution to (7.8) with bounded optimality as long as one exists.
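The greedy selection step and the bound in (7.12) can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the objective is a simplified stand-in that sums the traffic of flows whose two endpoints fall inside a chosen circuit component (a weighted coverage function, which is monotone and submodular), and the function names, candidate components, and flow volumes are all hypothetical.

```python
import math
from itertools import combinations


def covered_traffic(components, flows):
    """Total volume of flows whose endpoints share at least one chosen component."""
    return sum(
        volume
        for (src, dst), volume in flows.items()
        if any(src in comp and dst in comp for comp in components)
    )


def greedy_select(candidates, flows, max_components):
    """Repeatedly add the candidate component with the largest marginal gain."""
    chosen = []
    for _ in range(max_components):
        base = covered_traffic(chosen, flows)
        best, best_gain = None, 0.0
        for comp in candidates:
            if comp in chosen:
                continue
            gain = covered_traffic(chosen + [comp], flows) - base
            if gain > best_gain:
                best, best_gain = comp, gain
        if best is None:  # no candidate adds traffic coverage
            break
        chosen.append(best)
    return chosen


def brute_force_best(candidates, flows, max_components):
    """Exhaustive optimum over all subsets of at most max_components candidates."""
    best = 0.0
    for k in range(max_components + 1):
        for subset in combinations(candidates, k):
            best = max(best, covered_traffic(list(subset), flows))
    return best


# Hypothetical traffic volumes between tile pairs (illustrative values only).
flows = {(10, 11): 362.0, (11, 14): 357.0, (13, 14): 353.0,
         (0, 1): 70.0, (1, 2): 49.0, (5, 6): 27.0}
# Hypothetical candidate circuit components (sets of tile indices).
candidates = [frozenset(c) for c in
              ([10, 11, 13, 14], [10, 11], [13, 14], [0, 1, 2], [5, 6], [11, 14])]

chosen = greedy_select(candidates, flows, max_components=2)
greedy_value = covered_traffic(chosen, flows)
optimum = brute_force_best(candidates, flows, max_components=2)
bound = (1 - 1 / math.e) * optimum
print(greedy_value, optimum, greedy_value >= bound)
```

On this toy instance the greedy choice happens to match the exhaustive optimum; in general only the $(1 - 1/e)$ factor of (7.12) is guaranteed.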
If no circuit components can be constructed that are compatible with the given partition, or if it takes indefinitely long to reach a compatible configuration, Algorithm 4 may fail in practice. To obtain an approximate solution in such cases, we propose a relaxation process.

Definition 9: A subset of nodes $N'(t) \subseteq N(t)$ is an $l$-relaxed circuit component if and only if for any pair of nodes $n_i$, $n_j$ there exists a link $e_{i,j}$ between them such that at most $l$ of all such links are regular links.

Definition 10: A partition $\Pi$ is $l$-compatible with the NoC system $G(t) = (N(t), E(t), (t), C)$ at time $t$ if and only if for any partition element $\pi_k$ there exists an $l$-relaxed circuit component $T_i$ in $G(t)$ such that (7.3), (7.4), and (7.5) are met.

Definitions 9 and 10 relax the concepts of circuit component and compatible partition. Ideally, we reconfigure the NoC structure to fit the application workloads in the hope that all traffic paths can exploit dedicated bandwidth, i.e., circuit links. However, such circuit links may be difficult to construct under architectural and physical constraints. In such cases we therefore relax Definition 8: a link must still exist between any pair of nodes, but some of these links are allowed to be regular links.
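Definition 9 can be checked mechanically. The following Python sketch is one possible reading of it, under the assumption that links are stored in a dictionary keyed by unordered node pairs and labelled either "circuit" or "regular"; the function names, the data layout, and the example tiles are hypothetical.

```python
from itertools import combinations


def is_l_relaxed_component(nodes, link_type, l):
    """True if every node pair is linked and at most l of those links are regular."""
    regular = 0
    for a, b in combinations(sorted(nodes), 2):
        kind = link_type.get(frozenset((a, b)))
        if kind is None:  # a pair without any link violates the definition
            return False
        if kind == "regular":
            regular += 1
    return regular <= l


def minimal_relaxation(nodes, link_type):
    """Smallest l for which the node set is l-relaxed, or None if a pair has no link."""
    pairs = [frozenset(p) for p in combinations(sorted(nodes), 2)]
    if any(p not in link_type for p in pairs):
        return None
    return sum(1 for p in pairs if link_type[p] == "regular")


# Hypothetical example: three tiles, two circuit links and one regular link.
links = {
    frozenset((10, 11)): "circuit",
    frozenset((11, 14)): "circuit",
    frozenset((10, 14)): "regular",
}
nodes = {10, 11, 14}
print(is_l_relaxed_component(nodes, links, 0))  # strict circuit component?
print(is_l_relaxed_component(nodes, links, 1))  # 1-relaxed circuit component?
print(minimal_relaxation(nodes, links))
```

With $l = 0$ the check reduces to requiring circuit links between every pair of nodes, so the relaxation level can be raised step by step until a feasible component is found.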
Definition 9 generalizes the circuit component such that not only the dedicated links, i.e., circuit links, but also the express links, i.e., one-hop regular links between routers, are considered. Thus, the constraint for (7.8) can be relaxed to $l$-compatibility as stated by Definition 10. Note that relaxing the constraint does not affect the submodularity of $G$, so the optimality bound in (7.12) still holds.

Figure 7.3: A case study: low-level architecture details. Panels (A) to (D) cover the 4x4 mesh of tiles 0 to 15, the one-bit switch box with ports E, S, W and N, the "Pass", "Drive" and "Sink" configuration patterns, and the REQ/ACK negotiation of circuit link ownership between tiles 13 and 11.

Algorithm 4: Greedy maximization algorithm for (7.8)
Require: Application $A(t) = (V(t), C(t), P(\ , F))$; reconfigurable NoC $G(t) = (N(t), E(t), (t), C)$; execution phase length $T$
Ensure: Compatible application partition $\Pi$ and constructed circuit component set $T$
  $\Pi = \mathrm{Partition}(A(t), G(t))$
  repeat
    $T = T \cup \arg\max_{T_e} \left( G(\Pi, T \cup T_e) - G(\Pi, T) \right)$
  until $\Pi$ is compatible or $T == N(t)$

In what follows, we instantiate a reconfigurable NoC architecture as an optimization case study to evaluate the efficacy of the proposed mathematical framework with realistic workloads characterized by graphical models.

7.2 Architectural Case Study

To understand how the proposed mathematical framework can be applied effectively to a reconfigurable NoC system, we consider a simple reconfigurable NoC with switch boxes that provide dedicated links, i.e., circuit links, between different network tiles. More precisely, we study a regular mesh NoC system $G(t) = (N(t), E(t), (t))$ to which a subnet of dedicated links is attached.
Formally, for each node $n_i \in N(t)$, ACFC is induced to generate a set of nodes to which a link can be set up (i.e., the ACS, see Definition 5). Therefore, we have a simple reconfigurable NoC system characterized by the quadruple $G(t) = (N(t), E(t), (t), C)$. To detail the construction of the ACS for each node $n_i$, we set up low-level architectural features for the switch box. A switch box is a set of programmable pass gates. For simplicity of illustration, Figure 7.3.(B) shows only its NMOS part. To allow circuit links to be set up between different tiles, a switch box is organized as an array of one-bit switch boxes. The length of the array is equal to the bit width of a circuit link. A one-bit switch box consists of 6 pass gates, so that any pair of its ports can be directly connected by a dedicated link. A tile connects to a switch box through a set of similar links controlled by pass gates. By cascading such links among a subset of nodes, circuit links can be established between them. Figure 7.3.(B) shows a configuration of a set of switch boxes in which the set of nodes $\{n_{10}, n_{11}, n_{13}, n_{14}\}$ becomes a circuit component $T_k$.

Algorithm 5: $l$-relaxed greedy maximization algorithm for (7.8)
Require: Application $A(t) = (V(t), C(t), P(\ , F))$; reconfigurable NoC $G(t) = (N(t), E(t), (t), C)$; execution phase length $T$
Ensure: Compatible application partition $\Pi$ and constructed circuit component set $T$
  $l = 0$
  $\Pi = \mathrm{Partition}(A(t), G(t))$
  repeat
    repeat
      $T = T \cup \arg\max_{T_e} \left( G(\Pi, T \cup T_e) - G(\Pi, T) \right)$
    until $\Pi$ is $l$-compatible or $T == N(t)$
    if $\Pi$ is $l$-compatible then
      return $T$ and $\Pi$
    end if
    $l = l + 1$
  until $l == \left| \arg\max_{\pi_k} |\pi_k| \right|$

To simplify the hardware setup and minimize the area overhead, we assume that the nodes in a circuit component time-multiplex the link, i.e., any circuit link has one driver during each assigned time slot.
More precisely, for an execution phase of length $T$, each node $n_i \in T_k$ is assigned a bandwidth share, i.e., the time during which it is the driver of the circuit link, proportional to its traffic share in $q(T_k)$. Therefore, each switch box can work in 3 possible modes, namely "Driver", "Sink" and "Pass", based on the ownership of the link. A switch box is in "Pass" mode if the tile connected to it is neither the driver nor the sink of the data being transferred, so that it simply passes the data on. Otherwise, it is in "Driver" or "Sink" mode. The working mode can be identified from the configuration pattern of the switch box, as shown in Figure 7.3.(C). A switch box is in "Pass" mode if i) the tile is disconnected from the switch box and ii) the switch box is programmed as any one of the three configurations at the top of Figure 7.3.(C). Otherwise, it is in "Driver" or "Sink" mode.

As a simple reconfigurable NoC, we enforce a statically assigned time slot for each node in the circuit component. There is thus a chance that a node becomes the driver with no data to transmit. To maximize link utilization in such a case, it must be possible to shift circuit link ownership during an idle "Driver" phase. We therefore adopt a self-negotiated link access control (SNAC), as shown in Figure 7.3.(D). We build a one-bit bi-directional negotiation link (NL) between tiles. A simple negotiation protocol is implemented over the circuit link between a driver-sink pair to bargain over the ownership. More precisely, a "Driver" automatically obtains access to the NL and the circuit link during its assigned slot. While a data transmission is in progress, no negotiation is necessary. Otherwise, if the driver has no data to send, either because a transmission completed before its assigned time slot expired or because no transmission was planned, the driver sends the ACK sequence "010" to the sink to yield control of the circuit link.
Upon receiving this sequence, if the sink has anything to send to the current driver, a negotiation takes place: the sink sends the REQ sequence "101" and waits for an ACK sequence. Upon a valid ACK, the sink obtains the rest of the time slot and uses it for transmission to the previous driver, i.e., the sink and the driver switch roles. Otherwise, after a preset waiting threshold, the negotiation fails and the sink aborts the request. In the following discussion, we consider a set of real-world applications and perform the optimization to construct circuit components by exploiting the submodularity of the problem as stated in (7.8).

7.3 Experimental Results

Experiment setup: We consider real-world workloads induced by 6 SoC applications that are characterized by the proposed graphical model $A(t) = (V(t), C(t), P(\ , F))$. The applications include the video object plane decoder (VOPD), multi-window display (MWD), MP3 encoder/decoder, H.263 encoder/decoder and MPEG-4. The number of tasks ranges from 11 to 16. We set up the 4x4 reconfigurable NoC described in Section 7.2 as the target system. The NoC system is implemented in fully synthesizable Verilog and synthesized in an SMIC 65nm process using Synopsys Design Compiler with a fixed frequency constraint of 200MHz. All simulations are run in Synopsys VCS, with Tcl scripts that load the application traffic workloads. Throughout the simulations, the NoC adopts wormhole switching for regular data transmission, with the flit width varied from 16 to 256 bits so that we can test network performance under different flit injection rates given a fixed data generation rate. Each port in the router has four 4-flit virtual channels. We do not insert repeaters into the circuit links and assume that the propagation delay must stay within 1 cycle at 200MHz. Thus, we constrain the feasible size of one circuit component to be less than 5.
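The relation between flit width and flit injection rate at a fixed data generation rate can be made concrete with a short Python sketch. It assumes the simplest possible model, in which a node must inject data_rate / (bytes_per_flit x clock frequency) flits per cycle and header and packetization overheads are ignored; the function name and the 100 MB/s per-node data rate are hypothetical, while the 200MHz clock and the flit widths come from the setup above.

```python
def flit_injection_rate(data_rate_bytes_per_s, flit_width_bits, clock_hz):
    """Flits per node per cycle needed to sustain a fixed data generation rate."""
    bytes_per_flit = flit_width_bits / 8
    flits_per_second = data_rate_bytes_per_s / bytes_per_flit
    return flits_per_second / clock_hz


CLOCK_HZ = 200e6    # 200MHz frequency constraint from the experiment setup
DATA_RATE = 100e6   # hypothetical per-node data generation rate (100 MB/s)

rates = {
    width: flit_injection_rate(DATA_RATE, width, CLOCK_HZ)
    for width in (16, 32, 64, 128, 256)
}
for width, rate in rates.items():
    print(f"flit width {width:3d} bits -> {rate:.6f} flits/node/cycle")
```

Under this model, halving the flit width doubles the required injection rate, which is why sweeping the flit width exercises the network at different traffic pressures.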
Power estimation is done by feeding Switching Activity Interchange Format (SAIF) files to Design Compiler during synthesis. We extract the switching statistics by RTL simulation in VCS and convert them into SAIF files. Algorithm 5 is implemented in C++. We use the Fiduccia-Mattheyses algorithm to partition the applications.

Performance evaluation: Figure 7.4 shows the optimized network configurations for all 6 applications, obtained by solving the submodular maximization problem in (7.8). The application partition is first obtained by minimizing the cut cost, i.e., the traffic between different partition components; see (7.7) for a detailed explanation. Then, we run the $l$-relaxed greedy maximization algorithm to construct the circuit components $T$, such that the resulting circuit component configuration is compatible with the partition. As shown in Figure 7.4, our algorithm identifies the critical traffic paths of all applications (solid rectangles) and constructs circuit components (dashed rectangles) that cover most of them. An important observation is that the network configuration varies greatly across applications, which suggests the spatio-temporal variability of the applications when they are ported to the NoC system over time.

Figure 7.4: Optimized network configuration for real world applications. Panels: (A) VOPD, (B) MP3 Encoder, (C) H.263 Encoder, (D) H.263 Decoder, (E) MWD, (F) MPEG-4. Legend: application partition, established circuit link, communication bandwidth (MB/s), circuit component, unused tiles.

Figure 7.5: Network latency and energy savings comparison between the baseline and optimized reconfigurable NoC under different traffic workloads. (a) Network latency measurements (average latency in cycles versus injection rate in flits/node/cycle, for MP3 Encoder, H.263 Decoder, H.263 Encoder, VOPD, MPEG4 and MWD) and (b) corresponding normalized energy consumptions (normalized power savings per application for flit widths of 16, 32, 64, 128 and 256 bits).

To evaluate the performance under different traffic pressures, we run the applications on networks with different physical interconnection bandwidths (i.e., flit widths from 16 to 256 bits) at 200MHz and measure the network latency. In such settings, the flit injection rate has to increase or decrease to meet the application bandwidth requirements as the physical bandwidth shrinks or grows. We compare the network latency of the reconfigurable NoC optimized by Algorithm 5 with that of the baseline regular mesh NoC. The results are reported in Figure 7.5.(a). For applications with smaller bandwidth requirements, such as MP3 Encoder, H.263 Decoder and H.263 Encoder, neither network exhibits a saturation phase transition under the experimental settings. However, the optimized reconfigurable network shows a 52.3% latency reduction on average compared to the baseline design. This is because most of the communication with heavy traffic loads is identified and takes advantage of the dedicated links without traveling through multiple routing stages. For applications with heavy traffic requirements, such as VOPD, MPEG4 and MWD, the optimized network not only shows improved latency before the phase transition but is also able to endure greater traffic pressure, i.e., the network becomes heavily congested only at a greater flit injection rate than in the baseline design.
This improvement comes from the fact that the communication paths with the heaviest traffic are mostly covered by circuit links, which alleviates the traffic pressure on the regular network. To show the improved energy efficiency of our optimized network, we report the normalized energy savings under different network settings in Figure 7.5.(b). Combined with Figure 7.5.(a), we have the following key observations: i) our optimized network shows improved energy efficiency ranging from 22% to 38%, with an average of 30.2%, under all workloads, and ii) the energy efficiency increases as the network becomes more congested. These observations are supported by the fact that traffic over the circuit links consumes less energy by skipping multiple routers. The energy savings are even greater when the network is congested, as the recursive switching within a router for blocked packets can be avoided.

7.4 Summary

In this work, we lay the theoretical foundation for modeling the reconfigurable NoC in general and propose a mathematical framework for optimizing NoC reconfiguration. We formulate NoC reconfiguration as an optimization problem and prove its submodularity. Based on our theoretical analysis, we propose a greedy algorithm with a guaranteed optimality bound. As a case study, we propose a simple reconfigurable NoC as an architectural instance to validate the framework. We perform the proposed optimization and evaluate it with real-world workloads. The results show a 52.3% reduction in network latency on average, an increased capability of handling heavy traffic, and a 30.2% energy reduction compared to the baseline design.

Bibliography

[1] Chiuso A. Regularization and bayesian learning in dynamical systems: Past, present and future. Annual Reviews in Control - submitted, 20XX.
[2] T. Abdelzaher. Research Challenges in Distributed Cyber-Physical Systems. IEEE/IFIP Intl. Conf. on Embedded and Ubiquitous Computing, 2008.
[3] Vikram Adve and Rizos Sakellariou. Compiler synthesis of task graphs for parallel program performance prediction. In Languages and Compilers for Parallel Computing, pages 208–226. Springer, 2001.
[4] Kunal Agrawal, Charles E Leiserson, and Jim Sukha. Executing task graphs using work-stealing. In Parallel & Distributed Processing (IPDPS), 2010 IEEE International Symposium on, pages 1–12. IEEE, 2010.
[5] Alaa R Alameldeen and David A Wood. Variability in architectural simulations of multi-threaded workloads. In High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings. The Ninth International Symposium on, pages 7–18. IEEE, 2003.
[6] Nikolay A. Atanasov, Jérôme Le Ny, Kostas Daniilidis, and George J. Pappas. Decentralized active information acquisition: Theory and application to multi-robot slam. pages 4775–4782, May 2015.
[7] B. He et al. Grand challenges in interfacing engineering with life sciences and medicine. IEEE Trans. on Biomedical Engineering, 2013.
[8] F. Bach. Learning with Submodular Functions: A Convex Optimization Perspective. ArXiv e-prints, November 2011.
[9] R. Baheti and H. Gill. Cyber-physical systems. In T. Samad and A. M. Annaswamy, editors, The Impact of Control Technology. 2011.
[10] D. Baleanu, J.A.T. Machado, and A.C.J. Luo. Fractional Dynamics and Control. Springer, 2011.
[11] Albert-László Barabási and Réka Albert. Emergence of scaling in random networks. Science, 286(5439):509–512, 1999.
[12] Nick Barrow-Williams, Christian Fensch, and Simon Moore. A communication characterisation of splash-2 and parsec. In Workload Characterization, 2009. IISWC 2009. IEEE Int'l Symp. on, pages 86–97. IEEE, 2009.
[13] J. Bassingthwaighte, L. Liebovitch, and B. West. Properties of fractal phenomena in space and time. In Fractal physiology. Springer, 1994.
[14] Roberto Benzi, Giovanni Paladin, Giorgio Parisi, and Angelo Vulpiani. On the multifractal nature of fully developed turbulence and chaotic systems.
Journal of Physics A: Mathematical and General, 17(18):3521, 1984.
[15] M. Besserve, N. Logothetis, and B. Schölkopf. Statistical analysis of coupled time series with kernel cross-spectral density operators. In NIPS, 2013.
[16] Maamar Bettayeb, Said Djennoune, Said Guermah, and Malek Ghanes. Structural properties of linear discrete-time fractional-order systems. Proceedings of the 17th World Congress of the International Federation of Automatic Control, Seoul, Korea, pages 15262–15266, 2008.
[17] Christian Bienia, Sanjeev Kumar, and Kai Li. Parsec vs. splash-2: A quantitative comparison of two multithreaded benchmark suites on chip-multiprocessors. In Workload Characterization, 2008. IISWC 2008. IEEE International Symposium on, pages 47–56. IEEE, 2008.
[18] Christian Bienia, Sanjeev Kumar, Jaswinder Pal Singh, and Kai Li. The parsec benchmark suite: Characterization and architectural implications. Technical Report TR-811-08, Princeton University, January 2008.
[19] P. Bogdan. Mathematical modeling and control of multifractal workloads for data-center-on-a-chip optimization. In Proceedings of the 9th International Symposium on Networks-on-Chip, page 21. ACM, 2015.
[20] P. Bogdan and R. Marculescu. Towards a science of cyber-physical systems design. IEEE/ACM Intl. Conf on Cyber-physical Systems (ICCPS), 2011.
[21] Paul Bogdan. A cyber-physical systems approach to personalized medicine: challenges and opportunities for noc-based multicore platforms. In Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, pages 253–258. EDA Consortium, 2015.
[22] Paul Bogdan and Yuankun Xue. Cyber-physical systems for personalized and precise medicine. In 2015 IEEE 58th International Midwest Symposium on Circuits and Systems (MWSCAS), pages 1–4. IEEE, 2015.
[23] R. Bousseljot, D. Kreiseler, and A. Schnabel. Nutzung der ekg-signaldatenbank cardiodat der ptb über das internet. Biomedizinische Technik/Biomedical Engineering, 40(s1):317–318, 1995.
[24] R. Bousseljot, D. Kreiseler, and A. Schnabel. Nutzung der ekg-signaldatenbank cardiodat der ptb über das internet. Biomedizinische Technik, 40(S1):317–318, 1995.
[25] Laurent Calvet and Adlai Fisher. Forecasting multifractal volatility. Journal of econometrics, 105(1):27–58, 2001.
[26] Chia-Hsin Owen Chen, Sunghyun Park, Tushar Krishna, Suvinay Subramanian, Anantha P Chandrakasan, and Li-Shiuan Peh. Smart: A single-cycle reconfigurable noc for soc applications. In Proceedings of the Conference on Design, Automation and Test in Europe, pages 338–343. EDA Consortium.
[27] Canus Christophe, Jacques Lévy Véhel, and Claude Tricot. Continuous large deviation multifractal spectrum: definition and estimation. In Fractals 98. World Scientific, 1998.
[28] Jason Cong and Bo Yuan. Energy-efficient scheduling on heterogeneous multi-core architectures. In Proceedings of the 2012 ACM/IEEE international symposium on Low power electronics and design, pages 345–350.
[29] J. Crutchfield and B. McNamara. Equations of motion from a data series. Complex systems, 1(417-452):121, 1987.
[30] William James Dally and Brian Patrick Towles. Principles and practices of interconnection networks. Elsevier, 2004.
[31] Li Daqing, Kosmas Kosmidis, Armin Bunde, and Shlomo Havlin. Dimension of spatially embedded networks. Nature Physics, 7(6):481–484, 2011.
[32] Robert P Dick, David L Rhodes, and Wayne Wolf. Tgff: task graphs for free. In Proceedings of the 6th international workshop on Hardware/software codesign, pages 97–101. IEEE Computer Society, 1998.
[33] Christopher M Dobson. Protein folding and misfolding. Nature, 426(6968):884–890, 2003.
[34] Jack Edmonds. Matroids and the greedy algorithm. Mathematical Programming, 1(1):127–136.
[35] Thorsten Emmerich, Armin Bunde, Shlomo Havlin, Guanliang Li, and Daqing Li. Complex networks embedded in space: Dimension and scaling relations between mass, topological distance, and euclidean distance.
Physical Review E, 87(3):032802, 2013.
[36] John R Engen. Analysis of protein conformation and dynamics by hydrogen/deuterium exchange ms, 2009.
[37] D. Sussillo et al. A recurrent neural network for closed-loop intracortical brain-machine interface decoders. J. of Neural Engineering, 2012.
[38] J. L. Collinger et al. High-performance neuroprosthetic control by an individual with tetraplegia. Lancet, 381, 2013.
[39] L. R. Hochberg et al. Neuronal ensemble control of prosthetic devices by a human with tetraplegia. Nature, 442, 2006.
[40] S. P. Kim et al. Neural control of computer cursor velocity by decoding motor cortical spiking activity in humans with tetraplegia. J. of Neural Engineering, 2008.
[41] S. Shoham et al. Statistical encoding model for a primary motor cortical brain-machine interface. IEEE Trans. on Biomedical Engineering.
[42] C. Ethier, E. R. Oby, and L. E. Miller. Restoration of grasp following paralysis through brain-controlled stimulation of muscles. Nature, 485, 2012.
[43] E. V. Evarts. Temporal patterns of discharge of pyramidal tract neurons during sleep and waking in the monkey. J. of Neurophysiology, 1964.
[44] J. M. Fan. Intention estimation in brain-machine interfaces. J. of Neural Eng, 11, 2014.
[45] Anja Feldmann, Anna C Gilbert, and Walter Willinger. Data networks as cascades: Investigating the multifractal nature of internet wan traffic. In ACM SIGCOMM Computer Communication Review, volume 28, pages 42–55. ACM, 1998.
[46] Charles M Fiduccia and Robert M Mattheyses. A linear-time heuristic for improving network partitions. In Design Automation, 1982. 19th Conference on, pages 175–181. IEEE, 1982.
[47] Kunal Ganeshpure and Sandip Kundu. On run time task graph extraction of soc. In SoC Design Conference (ISOCC), 2010 International, pages 380–383. IEEE, 2010.
[48] Mahboobeh Ghorbani and Paul Bogdan. A cyber-physical system approach to artificial pancreas design.
In Proceedings of the ninth IEEE/ACM/IFIP international conference on hardware/software codesign and system synthesis, page 17. IEEE Press, 2013.
[49] AL. Goldberger, LAN Amaral, L Glass, JM Hausdorff, PCh Ivanov, RG Mark, JE Mietus, Moody GB, C-K Peng, and HE Stanley. Physiobank, physiotoolkit, and physionet: Components of a new research resource for complex physiologic signals. Circulation, 101(23):e215–e220, 2000.
[50] Ary L Goldberger, Luis AN Amaral, Leon Glass, Jeffrey M Hausdorff, Plamen Ch Ivanov, Roger G Mark, Joseph E Mietus, George B Moody, Chung-Kang Peng, and H Eugene Stanley. Physiobank, physiotoolkit, and physionet components of a new research resource for complex physiologic signals. Circulation, 101(23):e215–e220, 2000.
[51] C. Granger and R. Joyeux. Essays in econometrics. chapter An Introduction to Long-memory Time Series Models and Fractional Differencing, pages 321–337. Harvard University Press, Cambridge, MA, USA, 2001.
[52] Cristian Grecu et al. Towards open network-on-chip benchmarks. In Networks-on-Chip, 2007. NOCS 2007. First Int'l Symp. on, pages 205–205. IEEE.
[53] Said Guermah, Said Djennoune, and Maamar Bettayeb. Controllability and observability of linear discrete-time fractional-order systems. Applied Mathematics and Computer Science, 18(2):213–222, 2008.
[54] John E Hall. Guyton and Hall textbook of medical physiology. Elsevier Health Sciences, 2015.
[55] G. Henry. Gray's Anatomy 40th Edition. Elsevier, Limited, 2008.
[56] R. Herrmann. Fractional Calculus: An Introduction for Physicists. World Scientific.
[57] Joel Hestness, Boris Grot, and Stephen W Keckler. Netrace: dependency-driven trace-based network-on-chip simulation. In Proc. of the Third Int'l Workshop on Network on Chip Architectures. ACM, 2010.
[58] J. R. M. Hosking. Fractional differencing. Biometrika, 68(1):165–176, 1981.
[59] Plamen Ch Ivanov, Luis A Nunes Amaral, Ary L Goldberger, Shlomo Havlin, Michael G Rosenblum, Zbigniew R Struzik, and H Eugene Stanley. Multifractality in human heartbeat dynamics. Nature, 399(6735):461–465, 1999.
[60] Plamen Ch Ivanov, Luís A Nunes Amaral, Ary L Goldberger, Shlomo Havlin, Michael G Rosenblum, H Eugene Stanley, and Zbigniew R Struzik. From 1/f noise to multifractal cascades in heartbeat dynamics. Chaos: An Interdisciplinary Journal of Nonlinear Science, 11(3):641–652, 2001.
[61] Chris Jackson and Simon J Hollis. Skip-links: A dynamically reconfiguring topology for energy-efficient nocs. In System on Chip (SoC), 2010 International Symposium on, pages 49–54. IEEE, 2010.
[62] Friston K., Harrison L., and Penny W. Dynamic causal modelling. NeuroImage, 19(4):1273–1302, 2003.
[63] Jan W Kantelhardt, Stephan A Zschiegner, Eva Koscielny-Bunde, Shlomo Havlin, Armin Bunde, and H Eugene Stanley. Multifractal detrended fluctuation analysis of nonstationary time series. Physica A: Statistical Mechanics and its Applications, 316(1):87–114, 2002.
[64] Torsten Kempf, Kingshuk Karuri, Stefan Wallentowitz, Gerd Ascheid, Rainer Leupers, and Heinrich Meyr. A sw performance estimation framework for early system-level-design using fine-grained instrumentation. In Design, Automation and Test in Europe, 2006. DATE'06. Proceedings, volume 1, pages 6–pp. IEEE, 2006.
[65] NC Kenkel and DJ Walker. Fractals in the biological sciences. Coenoses, pages 77–100, 1996.
[66] Katrin Kirchhoff and Jeff Bilmes. Submodularity for data selection in machine translation. pages 131–141, October 2014.
[67] Sebastian Kobbe, Lars Bauer, Daniel Lohmann, Wolfgang Schröder-Preikschat, and Jörg Henkel. Distrm: distributed resource management for on-chip many-core systems. In Proceedings of the seventh IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis.
[68] Andreas Krause and Carlos Guestrin.
Submodularity and its applications in optimized information gathering. ACM Trans. Intell. Syst. Technol., 2(4):32:1–32:20, July 2011.
[69] Andreas Krause, Ajit Singh, and Carlos Guestrin. Near-optimal sensor placements in gaussian processes: Theory, efficient algorithms and empirical studies. J. Mach. Learn. Res., 9:235–284, June 2008.
[70] Santosh Kumar, Wendy Nilsen, Misha Pavel, and Mani Srivastava. Mobile health: Revolutionizing healthcare through transdisciplinary research. Computer, (1):28–35, 2013.
[71] Vipin Kumar, Ananth Grama, Anshul Gupta, and George Karypis. Introduction to parallel computing: design and analysis of algorithms. Addison Wesley, 2003.
[72] Yu-Kwong Kwok and Ishfaq Ahmad. Benchmarking and comparison of the task graph scheduling algorithms. Journal of Parallel and Distributed Computing, 59(3):381–422, 1999.
[73] Christian Landles and Gillian P Bates. Huntingtin and the molecular pathogenesis of huntington's disease. EMBO reports, 5(10):958–963, 2004.
[74] Chris Lattner and Vikram Adve. Llvm: A compilation framework for lifelong program analysis & transformation. In Code Generation and Optimization, CGO 2004. Int'l Symp. on, pages 75–86. IEEE, 2004.
[75] Jaekyu Lee, Si Li, Hyesoon Kim, and Sudhakar Yalamanchili. Adaptive virtual channel partitioning for network-on-chip in heterogeneous architectures. ACM Transactions on Design Automation of Electronic Systems, 2013.
[76] Hui Lin and Jeff Bilmes. Multi-document summarization via budgeted maximization of submodular functions. pages 912–920, 2010.
[77] Weichen Liu et al. A noc traffic suite based on real applications. In VLSI (ISVLSI), IEEE Computer Society Annual Symposium on. IEEE, 2011.
[78] L. Ljung and T. Glad. On global identifiability for arbitrary model parametrizations. Automatica, 30(2):265–276, 1994.
[79] Lennart Ljung. Perspectives on system identification. Annual Reviews in Control, 34(1):1–12, 2010.
[80] L. Lovász.
Mathematical Programming The State of the Art: Bonn 1982. Springer Berlin Heidelberg, Berlin, Heidelberg, 1983.
[81] Shaun Lovejoy and Daniel Schertzer. Scale, scaling and multifractals in geophysics: twenty years on. In Nonlinear dynamics in geosciences, pages 311–337. Springer, 2007.
[82] Brian N. Lundstrom, Matthew H. Higgs, William J. Spain, and Adrienne L. Fairhall. Fractional differentiation by neocortical pyramidal neurons. Nature Neuroscience, 11(11):1335–1342, November 2008.
[83] Turbo Majumder, Partha Pratim Pande, and Ananth Kalyanaraman. High-throughput, energy-efficient network-on-chip-based hardware accelerators. Sustainable Computing: Informatics and Systems, 3(1):36–46, 2013.
[84] B Mandelbrot, L Calvet, and A Fisher. Large deviations and the distribution of price changes. Technical Report, 1165, 1997.
[85] P Martinez, D Schertzer, and KK Pham. Texture modelisation by multifractal processes for sar image segmentation. 1997.
[86] Kaushik Matia, Yosef Ashkenazy, and H Eugene Stanley. Multifractal properties of price fluctuations of stocks and commodities. EPL (Europhysics Letters), 61(3):422, 2003.
[87] P. Mehta and S. Meyn. Q-learning and pontryagin's minimum principle. In Decision and Control, 2009 held jointly with the 2009 28th Chinese Control Conference. CDC/CCC 2009. Proc. of the 48th IEEE Conference on. IEEE, 2009.
[88] Alexandra Meliou, Andreas Krause, Carlos Guestrin, and Joseph M. Hellerstein. Nonmyopic informative path planning in spatio-temporal models. pages 602–607, 2007.
[89] C Meneveau and KR Sreenivasan. Simple multifractal cascade model for fully developed turbulence. Physical review letters, 59(13):1424, 1987.
[90] Brett H Meyer, Adam S Hartman, and Donald E Thomas. Cost-effective slack allocation for lifetime improvement in noc-based mpsocs. In Proceedings of the Conference on Design, Automation and Test in Europe, pages 1596–1601. European Design and Automation Association, 2010.
[91] E. Montroll and M. Shlesinger.
Maximum entropy formalism, fractals, scaling phenomena, and 1/f noise: a tale of tails. Journal of Statistical Physics, 1983. [92] F.C. Moon. Chaotic and Fractal Dynamics: An Introduction for Applied Scientists and Engineers. A Wiley-Interscience publication. Wiley, 1992. [93] Pedro A Moreno, Patricia E V´ elez, Ember Mart´ ınez, Luis E Garreta, N´ estor D´ ıaz, Siler Amador, Irene Tischer, Jos´ e M Guti´ errez, Ashwinikumar K Naik, Fabi´ an Tobar, et al. The human genome: a multifractal analysis. BMC genomics, 12(1):506, 2011. [94] Jean-Franc ¸ois Muzy, Emmanuel Bacry, and Alain Arneodo. Wavelets and multifractal formalism for singular signals: Application to turbulence data. Physical review letters, 67(25):3515, 1991. [95] Ravi Namballa, Nagarajan Ranganathan, and Abdel Ejnioui. Control and data flow graph extraction for high-level synthesis. In VLSI, 2004. Proc.. IEEE Computer society Annual Symp. on, pages 187–192. IEEE, 2004. [96] G. L. Nemhauser, L. A. Wolsey, and M. L. Fisher. An analysis of approximations for maximizing submodular set functions part I, journal=. [97] George L Nemhauser and Leonard A Wolsey. Best algorithms for approximating the maximum of a submodular set function. Mathematics of operations research, 3(3):177–188, 1978. [98] H. Ohlsson, J. Roll, and L. Ljung. Manifold-constrained regressors in system identification. In Decision and Control, 2008. CDC 2008. 47th IEEE Conference on, pages 1364–1369, Dec 2008. [99] K. Oldham and J. Spanier. The fractional calculus : theory and applications of differentiation and integration to arbitrary order. Mathematics in science and engineering. Academic Press, New York, 1974. [100] L Olsen. A multifractal formalism. Advances in mathematics, 116(1):82–196, 1995. [101] A. Olshevsky. Minimal controllability problems. IEEE Transactions on Control of Network Systems, 1(3):249–258, Sept 2014. 134 [102] M. Opper and G. Sanguinetti. Variational inference for markov jump processes. In J.C. Platt, D. Koller, Y . 
Singer, and S.T. Roweis, editors, Advances in Neural Information Processing Systems 20, pages 1105–1112. Curran Associates, Inc., 2008. [103] N. H. Packard, J. P. Crutchfield, J. D. Farmer, and R. S. Shaw. Geometry from a time series. Phys. Rev. Lett., 45:712–716, Sep 1980. [104] Esko Pekkarinen, Lasse Lehtonen, Erno Salminen, and Timo D H¨ am¨ al¨ ainen. A set of traffic models for network-on-chip benchmarking. In System on Chip (SoC), 2011 Int’l Symp. on. IEEE, 2011. [105] S. Pequito, G. Ramos, S. Kar, A. P. Aguiar, and J. Ramos. On the Exact Solution of the Minimal Controllability Problem. ArXiv e-prints, January 2014. [106] Kirk Pruhs, Jiri Sgall, and Eric Torng. Online scheduling. pages 115–124. CRC Press, 2003. [107] Zhiliang Qian, Syed Mohsin Abbas, and Chi-Ying Tsui. Fsnoc: A flit-level speedup scheme for network on-chips using self-reconfigurable bidirectional channels. Very Large Scale Integration (VLSI) Systems, IEEE Trans. on, 2015. [108] Brian P Railing, Eric R Hein, and Thomas M Conte. Contech: Efficiently generating dynamic task graphs for arbitrary parallel programs. ACM Trans. on Architecture and Code Optimization (TACO), 12(2):25, 2015. [109] Dorit Ron, Ilya Safro, and Achi Brandt. Relaxation-based coarsening and multiscale graph organi- zation. Multiscale Modeling & Simulation, 9(1):407–423, 2011. [110] Hern´ an D Rozenfeld and Hern´ an A Makse. Fractality and the percolation transition in complex networks. Chemical Engineering Science, 64(22):4572–4575, 2009. [111] Hern´ an D Rozenfeld, Chaoming Song, and Hern´ an A Makse. Small-world to fractal transition in complex networks: a renormalization group approach. Physical review letters, 104(2):025701, 2010. [112] D. Rubin. Causal inference using potential outcomes: Design, modeling, decisions. Journal of the American Statistical Association, 100, 2005. [113] J. Sabatier, O. P. Agrawal, and J. A. Tenreiro Machado. 
Advances in Fractional Calculus: Theo- retical Developments and Applications in Physics and Engineering. Springer Publishing Company, Incorporated, 1st edition, 2007. [114] E. Salinas and L. F. Abbott. Vector reconstruction from firing rates. J. of Computational Neuro- science, 1, 1994. [115] Erno Salminen and et. al. Requirements for network-on-chip benchmarking. In NORCHIP Confer- ence, 2005. 23rd. IEEE, 2005. [116] Erno Salminen, Cristian Grecu, Timo D H¨ am¨ al¨ ainen, and Andr´ e Ivanov. Network-on-chip bench- marks specifications part i: application modeling and hardware description. [117] J. C. Sanchez, J. C. Principe, J. M. Carmena, M. A. Lebedev, and M. A. Nicolelis. Simultaneus prediction of four kinematic variables for a brain-machine interface using a single recurrent neural network. IEEE Engineering in Medicine and Biology Society (EMBS, 2004. [118] G. Schalk, D.J. McFarland, T. Hinterberger, N. Birbaumer, and J.R. Wolpaw. Bci2000: A general- purpose brain-computer interface (bci) system. IEEE Transactions on Biomedical Engineering, 51(6):1034–1043, 2004. 135 [119] Francois Schmitt, Daniel Schertzer, and Shaun Lovejoy. Multifractal analysis of foreign exchange data. Applied stochastic models and data analysis, 15(1):29–53, 1999. [120] Franc ¸ois Schnitzler, Jia Yuan Yu, and Shie Mannor. Sensor selection for crowdsensing dynamical systems. 2015. [121] Alexander Schrijver. A combinatorial algorithm minimizing submodular functions in strongly poly- nomial time. J. Comb. Theory Ser. B, 80(2):346–355, November 2000. [122] Dennis J Selkoe. Cell biology of protein misfolding: the examples of alzheimer’s and parkinson’s diseases. Nature cell biology, 6(11):1054–1061, 2004. [123] M. Shlesinger, G. Zaslavsky, and J. Klafter. Strange kinetics. Nature, 363(6424):31–37, 1993. [124] Amit Kumar Singh, Akash Kumar, and Thambipillai Srikanthan. Accelerating throughput-aware runtime mapping for heterogeneous mpsocs. 
ACM Transactions on Design Automation of Electronic Systems (TODAES), 18(1):9, 2013. [125] Amit Kumar Singh, Muhammad Shafique, Akash Kumar, and J¨ org Henkel. Mapping on multi/many-core systems: survey of current and emerging trends. In Proceedings of the 50th Annual Design Automation Conference, page 1. [126] Chaoming Song, Lazaros K Gallos, Shlomo Havlin, and Hern´ an A Makse. How to calculate the fractal dimension of a complex network: the box covering algorithm. Journal of Statistical Me- chanics: Theory and Experiment, 2007(03):P03006, 2007. [127] Chaoming Song, Shlomo Havlin, and Hernan A Makse. Self-similarity of complex networks. Na- ture, 433(7024):392–395, 2005. [128] Chaoming Song, Shlomo Havlin, and Hern´ an A Makse. Origins of fractality in the growth of complex networks. Nature Physics, 2(4):275–281, 2006. [129] Yu-Qin Song, Jin-Long Liu, Zu-Guo Yu, and Bao-Gen Li. Multifractal analysis of weighted net- works by a modified sandbox algorithm. Scientific reports, 5, 2015. [130] Vassos Soteriou, Hangsheng Wang, and Li-Shiuan Peh. A statistical traffic model for on-chip inter- connection networks. In Modeling, Analysis, and Simulation of Computer and Telecommunication Systems, 2006. MASCOTS 2006. 14th IEEE Int’l Symp. on. IEEE, 2006. [131] F. Sowell. Unpublished manuscript, title = Maximum likelihood estimation of fractionally inte- grated time series models, year = 1989,. [132] Krishnan Srinivasan, Karam S Chatha, and Goran Konjevod. Linear-programming-based techniques for synthesis of network-on-chip architectures. Very Large Scale Integration (VLSI) Systems, IEEE Transactions on, 14(4):407–420, 2006. [133] A. Svenkeson, B. Glaz, S. Stanton, and B. West. Spectral decomposition of nonlinear systems with memory. Physical Review E, 93(2), 2016. [134] Tam´ as T´ el, ´ Agnes F¨ ul¨ op, and Tam´ as Vicsek. Determination of fractal dimensions for geometrical multifractals. Physica A: Statistical Mechanics and its Applications, 159(2):155–166, 1989. 
[135] Yves Tessier, Shaun Lovejoy, Pierre Hubert, Daniel Schertzer, and Sean Pecknold. Multifractal analysis and modeling of rainfall and river flows and scaling, causal transfer functions. Journal of Geophysical Research: Atmospheres, 101(D21):26427–26440, 1996. 136 [136] Stefan Thurner, Christian Windischberger, Ewald Moser, Peter Walla, and Markus Barth. Scaling laws and persistence in human brain activity. Physica A: Statistical Mechanics and its Applications, 326(34):511 – 521, 2003. [137] Constantino Tsallis. Possible Generalization of Boltzmann-Gibbs Statistics. J. Statist. Phys., 52:479–487, 1988. [138] Robert G. Turcott and Malvin C. Teich. Fractal character of the electrocardiogram: Distinguishing heart-failure and normal patients. Annals of Biomedical Engineering, 24(2):269–293. [139] Keith S Vallerio and Niraj K Jha. Task graph extraction for embedded system synthesis. In VLSI Design, Proc. 16th Int’l Conf. on, pages 480–486. IEEE, 2003. [140] M. Velliste, S. Perel, M. C. Spalding, A. S. Whitford, and A. B. Schwartz. Cortical control of a prosthetic arm for self-feeding. Nature, 453, 2008. [141] Zhe Wang and et. al. A systematic network-on-chip traffic modeling and generation methodology. In Circuits and Systems (APCCAS), 2014 IEEE Asia Pacific Conference on, pages 675–678. IEEE, 2014. [142] Dai-Jun Wei, Qi Liu, Hai-Xin Zhang, Yong Hu, Yong Deng, and Sankaran Mahadevan. Box- covering algorithm for fractal dimension of weighted networks. Scientific reports, 3:3049, 2013. [143] Herwig Wendt, St´ ephane G Roux, St´ ephane Jaffard, and Patrice Abry. Wavelet leaders and bootstrap for multifractal analysis of images. Signal Processing, 89(6):1100–1114, 2009. [144] Gerhard Werner. Fractals in the nervous system: Conceptual implications for theoretical neuro- science. Frontiers in Physiology, 1:15, 2010. [145] B. West. Fractal physiology and chaos in medicine. World Scientific, 2012. 
[146] Steven Cameron Woo, Moriyoshi Ohara, Evan Torrie, Jaswinder Pal Singh, and Anoop Gupta. The splash-2 programs: Characterization and methodological considerations. In ACM SIGARCH computer architecture news, volume 23, pages 24–36. ACM, 1995. [147] Y Xue, S. Pequito, J. Coelho, P. Bogdan, and G. Pappas. Minimum number of sensors to ensure observability of physiological systems: A case study. In Communication, Control, and Computing (Allerton), 2016 54th Annual Allerton Conference on, pages 1181–1188. IEEE, 2016. [148] Yuankun Xue, Zhiliang Qian, Paul Bogdan, Fan Ye, and Chi-Ying Tsui. Disease diagnosis-on- a-chip: Large scale networks-on-chip based multicore platform for protein folding analysis. In Proceedings of the 51st Annual Design Automation Conference, pages 1–6. ACM, 2014. [149] Yuankun Xue, Zhiliang Qian, Guopeng Wei, Paul Bogdan, Chi-Ying Tsui, and Radu Marculescu. An efficient network-on-chip(noc) based multicore platform for hierarchical parallel genetic algo- rithms. In Networks-on-Chip (NOCS), 2014 The Intl. Symposium on, pages 290–295, Sep 2014. [150] Yuankun Xue, Saul Rodriguez, and Paul Bogdan. A spatio-temporal fractal model for a cps ap- proach to brain-machine-body interfaces. In 2016 Design, Automation & Test in Europe Conference & Exhibition (DATE), pages 642–647. IEEE, 2016. [151] Zu-Guo Yu, V o Anh, and Ka-Sing Lau. Measure representation and multifractal analysis of com- plete genomes. Physical Review E, 64(3):031903, 2001. [152] Wei-Xing Zhou et al. Multifractal detrended cross-correlation analysis for two nonstationary sig- nals. Physical Review E, 77(6):066211, 2008. 137 [153] Yong Zou and Sudeep Pasricha. Reliability-aware and energy-efficient synthesis of noc based mp- socs. In Quality Electronic Design (ISQED), 2013 14th International Symposium on, pages 643– 650. IEEE, 2013. 138
Asset Metadata
Creator
Xue, Yuankun (author)
Core Title
Theoretical and computational foundations for cyber‐physical systems design
School
Viterbi School of Engineering
Degree
Doctor of Philosophy
Degree Program
Electrical Engineering
Publication Date
04/11/2018
Defense Date
01/23/2018
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
application profiling,complex networks,cyber physical system,manycore systems,mathematical modeling,multifractal,network on chip,OAI-PMH Harvest,reconfigurable systems,statistic physics,statistical inference,statistical machine learning
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Bogdan, Paul (committee chair), Jonckheere, Edmond (committee member), Krishnamachari, Bhaskar (committee member), Nakano, Aiichiro (committee member)
Creator Email
urashima9616@gmail.com,yuankunx@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c89-5016
Unique identifier
UC11670340
Identifier
etd-XueYuankun-6207.pdf (filename),usctheses-c89-5016 (legacy record id)
Legacy Identifier
etd-XueYuankun-6207.pdf
Dmrecord
5016
Document Type
Dissertation
Rights
Xue, Yuankun
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA