UNDERWATER NAVIGATION STRATEGIES AND EMERGENT COLLECTIVE BEHAVIOR IN
BIOINSPIRED SWIMMERS
by
Haotian Hang
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(MECHANICAL ENGINEERING)
May 2025
Copyright 2025 Haotian Hang



Dedication
To all kindred spirits.



Acknowledgements
I would like to thank the many people in my life who were crucial to the successful completion of this work.
First, I would like to express my deepest gratitude to my advisor, Prof. Eva Kanso, for her guidance
and support throughout my graduate studies. This thesis would not have been possible without her help
and advice. She has greatly broadened my perspective and vision in doing research. Her inspiration and
guidance opened the door to multiple research areas for me, e.g., nonlinear dynamics and control,
machine learning, high-performance computing, and statistical physics, so that I did not become a "pure fluid
dynamicist". In fact, it was not until very recently that I began to realize that she had led me into the field
of biological physics, which is emerging as a part of physics [334]. Her creative thinking, her strong
enthusiasm for and interest in research, and her excellent writing skills also inspired me a lot.
Importantly, she showed me how complex biophysical phenomena can be understood from
first principles, with mathematical tools that are as simple as possible.
I want to thank my thesis committee members, Prof. Mitul Luhar, Prof. Aiichiro Nakano, Prof. Ivan
Bermejo-Moreno, and Prof. Assad Oberai, for their insightful advice, guidance, and support. I also thank
Prof. Geoffrey Spedding and Prof. Mitul Luhar for organizing the SuperFluid Group meeting. These
meetings sparked many insightful discussions among different groups.
I would also like to thank the mini-MURI team, which shares the same funding with our group on
fish schooling, including Prof. Matthew McHenry from UCI and his postdoc Ashley Peterson; Prof. Rajat
Mittal and Prof. Jung-Hee Seo from JHU, and their student Ji Zhou; Prof. Derek Paley from University
of Maryland, his postdoc Dr. Weikuo Yen, and his student Rose Gebhardt. Discussions with them during
the multi-disciplinary project exposed me to different perspectives on the same problem, which greatly
broadened my view. I would also like to thank the other MURI team; the review meetings with them also
broadened my horizons.
I would like to thank my other collaborators, who helped me a lot in my research. I would like to thank
Prof. Amneet Pal Singh Bhalla from SDSU and Prof. Boyce E. Griffith from the University of North Carolina at
Chapel Hill for developing the open-source CFD software IBAMR, as well as for their unconditional support
in helping me use it. This was crucial for my PhD research. I would also like to thank Prof.
Kunihiko (Sam) Taira and Prof. Jeff Eldredge for insightful discussions on data-driven methods, especially
the discussion with Dr. Alec J. Linot from Prof. Taira’s group on graph neural networks. I would like to
thank Dr. Alex Barnett for detailed help and discussion when we started working on boundary element
methods. I would also like to thank Dr. Josh Merel for his guidance on reinforcement learning and
Prof. John Costello for his help in guiding my first paper during my PhD, on bending propulsors.
I would then like to thank all my lab members, who made this lab feel like a big, warm family,
including Chenchen Huang, Jingyi Liu, Hanliang Guo, Tommaso Redaelli, JP Raimondi, Alyssa Chan,
Hao Cheng, Yusheng Jiao, Zitao Yu, Feng Ling, Basile Radisson, Kalyan Naik Banoth, Yi Man, Morgan
Jones, Janna Nawroth, Anup Kanale, and Sina Heydari. Their kind support and all the research-related and
non-research-related discussions with them were extremely helpful to me. Among them, I would like to give special
thanks to a few. First, I would like to thank Sina Heydari, who was my mentor when I started my
PhD. His guidance and our collaboration were great. Second, I’d like to thank Feng Ling
and Yusheng Jiao. Because their educational backgrounds differ from those of most of us, who are
trained in engineering, their perspectives from mathematics and physics have been invaluable to me. I would
also like to thank Hanliang Guo, who technically did not overlap with me, but did come to the lab often
during the first several years and met with us at every DFD meeting. His insight and long-term career
advice have been very helpful.
Most importantly, I would like to thank my parents for their constant love and support. Their support and
education have shaped me into the person I am now. They not only tried their best to give me the
highest quality of education, at both financial and emotional cost, but also, through their own enthusiasm
for pursuing knowledge, set an excellent example for me from an early age. My family is always my
biggest support.
I would like to thank my friend, Siyan Song. It was the wish not to move too far from her that
pushed me to work hard for the tedious and stressful college entrance exams and other standardized tests, which
gave me the chance to study at top universities. Ironically, however, this has also taken me further and further
from her.
I would like to thank my undergraduate advisors. The lab led by Prof. Hong Liu at SJTU is very
supportive of undergraduate students doing research. My advisors, Prof. Yang Xiang, Prof. Bin Zhang,
Dr. Bin Yu, and Dr. Suyang Qin, not only taught me a great deal, from the basics of reading the literature,
designing experiments, and writing code, but also, more importantly, gave me the chance to lead my own research
project starting in my sophomore year. This deeply cultivated my interest in doing research in bio-inspired fluid
dynamics.
I would like to thank my roommates Greg Park and Binyou Wang, who are also pursuing their doctoral
degrees at the same time, in dental school and chemistry, respectively. We not only took care of each
other in daily life, but also shared our research progress and had interesting discussions. I would also like to
thank my aunt Qiong Jin and uncle Guangming Wang, who took great care of me, especially when I
had my appendectomy during my first year and during the pandemic. I’d also like to thank my mentors
during my internship at Wells Fargo, Dr. Nengfeng Zhou and Dr. Harry Zhang. The internship experience
greatly deepened my understanding of robust and interpretable machine learning. I’d also like to thank
Jorge Sanchez from Santa Monica Airport, who took me up into the air and let me experience interesting
fluid phenomena, like stall and turbulence, while keeping me safe.
I started my PhD during the pandemic. Although this made many things tough at the beginning, it
also had an upside: many online seminars emerged during that time, and some of them
have continued since. I would like to thank the two that influenced me the most: ReCoVor and IBiM.
Lastly, I would also like to thank the whole open-source community, from code sharing on GitHub to the many
questions answered on Stack Overflow to numerous tutorials on YouTube and Bilibili. These materials
enabled me to learn new concepts much more quickly than was possible in previous years, and they are also
invaluable resources for the success of numerous large language models (LLMs), which were extremely helpful
toward the end of my PhD. I also hope to contribute my knowledge and expertise to these communities in the future.
I’d like to end this acknowledgment with two famous quotes I like:
՚צכԒэѿݚѺ䩟Ћࠀߓ๨܊䩟Ћܞ߹ھК䩟ЋТ߹Ћϳс
Philosophers have only interpreted the world, in various ways; the point, however, is to change it. -- Karl Marx



Table of Contents
Dedication
Acknowledgements
List of Tables
List of Figures
Abstract
Chapter 1: Introduction
1.1 General thoughts
1.1.1 A path toward autonomy in fluid environment
1.1.2 Biomimetics and Bioinspiration
1.2 Literature review
1.2.1 Literature review on fish swimming
1.2.2 Literature review on flow sensing
1.2.2.1 Biological basis of flow sensing and development of flow sensors
1.2.2.2 Mathematical models on flow sensing and underwater navigation
1.2.2.3 Source seeking and Chemotaxis
1.2.3 Literature review on collective locomotion
1.2.3.1 Near-field High Fidelity Hydrodynamic Interactions
1.2.3.2 Fish school as a model of social interaction and active matter
Chapter 2: Hydrodynamics of individual swimmers: flexible versus rigid flapping swimmer
2.1 Problem formulation
2.2 Performance of flexible swimmer comparing to rigid swimmer
2.3 Passive Flexion improves swimming efficiency but not speed
2.4 Passive Hydrodynamics provide guideline for active flexion
2.5 Parametric study and relation to biological observations
2.6 Summary
Chapter 3: Navigation in unsteady flows: wake tracking
3.1 Learning To Track Hydrodynamic Trails Using Local Flow Sensing
3.2 Learned policies are analogous to Braitenberg’s Vehicles
3.2.1 Statistical Measures for Evaluating the Performance of Braitenberg Strategies
3.3 Robustness of the RL-inspired strategies over Parameter Space and under Noise
3.4 RL-Inspired Braitenberg Strategies Track Both Odorless and Scented Trails
3.5 Generalization to Unseen Hydrodynamic Trails
3.5.1 Application of Braitenberg Strategies in 3D wakes
3.6 Stability Analysis in Traveling Wave Signal Emphasizes the Importance of Sensor Placement
3.7 Versatile navigation strategies are applicable to track vortical and turbulent plume
3.8 Biological Data
3.9 Analysis of proportional controller in traveling wave signal field
3.9.1 Linear stability analysis of the system
3.9.2 Nonlinear analysis of the system
3.10 Analysis in traveling wave signal field using gradient in longitudinal direction
3.11 Discussion of the dynamic system when having larger speed
3.12 Summary
Chapter 4: Navigation in unsteady flows: point-to-point navigation
4.1 Zermelo optimization problem
4.2 Distinguishing Egocentric from Geocentric Learning
4.3 Egocentric Learning Requires Sensing Flow Gradients
4.4 Egocentric Policies are More Robust to Transfer to New Flow Environments
4.5 Egocentric Policies are Rotationally-Invariant and Better Adapt to Novel Conditions Unexplored During Training
4.6 Interpretation of Underwater RL Policies
4.7 Summary
Chapter 5: Hydrodynamics of groups of multiple flapping swimmers
5.1 Mathematical models of flow-coupled flapping swimmers
5.2 Flow coupling leads to stable emergent formations
5.3 Emergent formations save energy compared to solitary swimming
5.4 Linear phase-distance relationship in emergent formations is universal
5.5 Leader’s wake unveils opportunities for stable emergent formations
5.6 Parametric analysis over the entire space of phase lags and lateral offsets
5.7 Analysis of larger groups of inline and side-by-side swimmers
5.8 Mechanisms leading to loss of cohesion in larger inline formations
5.9 Critical size of inline formations beyond which cohesion is lost
5.10 Phase control to stabilize unstable inline school
5.11 Mapping emergent spatial patterns to energetic benefits
5.12 Feedback control for maintaining school cohesion in uncoordinated flapping swimmers
5.12.1 Flow sensing model
5.12.2 Simplified sensing scenarios
5.12.3 Flow sensing model during free swimming
5.12.4 Frequency analysis
5.12.5 Sliding mode controller design
5.13 Discussion
Chapter 6: Collective phenomena at extreme scale
6.1 Mathematical model
6.2 Statistical and data-driven analysis
6.2.1 Order parameters
6.2.2 Identifying splitting and merging events
6.2.3 Spatial correlation in velocity fluctuations.
6.2.4 Time delays during turning and information propagation within the group.
6.2.5 Coarse Graining of the active matter
6.3 Dynamic reorganization, splitting and merging in large fish schools
6.4 Global polarization is lost with increasing number of swimmers
6.5 More is different
6.6 Flow interactions trigger spontaneous self-reorganization within the school
6.7 Scale-free correlation in cohesive groups breaks down during school self-reorganization
6.8 Information propagation during turning, splitting, and rejoining
6.9 Analytical derivation of continuum model of small perturbation in polarized schools
6.10 Global rotational order is lost with an increasing number of swimmers independent of hydrodynamic interaction
6.11 Correlation length in stable milling patterns
6.12 Scaling law in stable milling patterns predicts the loss of stability
Chapter 7: Conclusions and Future Work
7.1 Conclusion
7.2 Visions on Future Work
7.2.1 Full-stack of Biomimetics
7.2.2 Physics-aware machine learning
Bibliography
Appendix A: Fluid simulation models
A.1 CFD simulation
A.2 Vortex Sheet simulation
A.2.1 Forces and moments
A.2.2 Numerical implementation
A.2.3 Fast multipole methods
A.3 Minimal Hydrodynamic Models
A.3.1 Time-delayed particle model
A.3.2 Potential dipole model
A.4 Odor simulation
A.4.1 Laminar Plume
A.4.2 Turbulent Plume
Appendix B: Reinforcement learning
Appendix C: Agent-based model
C.1 Mathematical model
C.2 Computational method



List of Tables
3.1 Comparison between different source-seeking strategies in different signal fields
3.2 Success rate of excitatory and inhibitory strategies in CFD simulations of flows past stationary and moving bodies
3.3 Training instances of RL policies based on different sensory cues
3.4 Biological data of animals’ locomotion capabilities
4.1 Minimal observations for successful learning
4.2 Performance of trained RL policies
C.1 Summary of the dataset for polarized schools



List of Figures
1.1 Schematics of a bio-inspired underwater vehicle
1.2 Summary of my work in the context of autonomy
1.3 Bioinspiration and Biomimetics
1.4 Physics mechanism as a low dimensional latent space
1.5 Literature review on flow sensing
2.1 Bending rules of swimming fish
2.2 Wake and flow velocity of free swimmers
2.3 Active anterior-to-posterior bending of free swimmers
2.4 Active anterior-to-posterior bending can minimize lateral forces and negative thrust
2.5 Active anterior-to-posterior bending of swimmers fixed in oncoming flow
2.6 Performance of active anterior-to-posterior bending as a function of phase
2.7 Passive anterior-to-posterior bending of free swimmers
2.8 Active bending in agreement with passive hydrodynamics improves swimming performance
2.9 Swimming performance of actively flexion swimmer scaled by performance of rigid swimmer
2.10 Relation to fish swimming behavior
3.1 Following odorless and scented trails at different scales
3.2 Concentration of odor versus x at the midline behind the source in different plumes
3.3 RL-policies and Braitenberg strategies for tracking hydrodynamic trails
3.4 Generalization to unseen flows
3.5 Stability analysis in 1D traveling signal field
3.6 Versatile strategies track vortical and turbulent wakes
3.7 RL-training for different sensory cues
3.8 Parametric study for different sensory cues
3.9 Statistical characteristics of the RL-trained policies and RL-inspired strategies
3.10 Robustness of the RL policies and RL-inspired strategies to sensory limitations
3.11 Parametric study for different sensory cues using gradient in longitudinal direction
3.12 Trajectories in traveling wave signal field
3.13 Analysis of the controller measuring lateral gradient of the signal field
3.14 Validation of the analytical solution
3.15 Analysis of the controller measuring longitudinal gradient of the signal field
3.16 Trajectories in traveling wave signal field with large agent speed
4.1 Hand-crafted policies for Zermelo problem
4.2 Autonomous Underwater Navigation in Unsteady Flows
4.3 Learning Underwater Navigation Using Egocentric Observations Requires Sensing Flow Gradients
4.4 Trajectories of Trained Agents in Physical and Flow Observation Spaces
4.5 Flow Observations and Transfer from Low- to High-Fidelity Flow representations
4.6 Distribution of observations when generalizing from VS wake to CFD wake
4.7 Distribution of observations in wakes of different Reynolds numbers
4.8 Transfer to Novel Reynolds Numbers and Policy Invariance under Rotational Symmetry
4.9 Transfer to Locations outside Training Domain and Interpretation of RL Policies
5.1 Flow-coupled swimmers self-organize into stable pairwise formations
5.2 Pairs of swimmers in CFD simulations
5.3 Pairs of swimmers in VS simulations
5.4 Influence of fluid property
5.5 Emergent equilibria in pairwise formations
5.6 Hydrodynamic benefits and linear phase-distance relationship in pairs of swimmers
5.7 Hydrodynamic torque for pair of swimmers at different spatial configurations
5.8 Predictions of equilibrium formations from the wake of a solitary swimmer
5.9 Predictions of equilibrium locations, power savings, and cohesion, from the wake of a solitary leader
5.10 Time-delay particle model and stability of swimmer
5.11 Equilibria are dense over the parameter space
5.12 Larger inline and side-by-side formations
5.13 Inline formations
5.14 Side-by-side formations
5.15 Loss of cohesion in larger groups of inline swimmers
5.16 Prediction of equilibrium formations, cohesion, and power savings from the wake of upstream swimmers
5.17 CFD simulation of larger inline schools
5.18 Passive and active methods for stabilizing an emergent formation of four swimmers
5.19 Alternative formations of four swimmers
5.20 Fourier analysis of flow velocity sensed by the follower
5.21 Dominant frequency of sensed flow velocity f_s
5.22 Controller behavior
6.1 Emergent behavior in a school of 50,000 fish
6.2 More is different: self-organized behavior depends on group size
6.3 School cohesiveness depends on hydrodynamic intensity of individual swimmers
6.4 Polarized school does not split without hydrodynamic interaction
6.5 Noise is not required for self-organization
6.6 Correlation length in cohesive and dynamically changing polarized schools
6.7 Information transfer during turning, splitting and merging
6.8 Correlation length during turning and splitting
6.9 Additional data on turning
6.10 Additional data on merging
6.11 Additional data on splitting
6.13 Emergent collective behaviors at extreme scale
6.15 Correlation length and scaling law in milling pattern
A.1 Schematic of the vortex sheet model
A.2 Schematics of time delayed particle model
A.3 Parametric study of the model behavior over frequency ratio and amplitude ratio
A.4 Scaled distance as a function of time for different parameter values
C.1 Reproduction of phase diagram



Abstract
Autonomous vehicles navigating in unsteady fluid environments, either in air or water, have attracted
growing attention in applications ranging from aviation to ocean monitoring. To achieve efficient navigation
in diverse and dynamically changing environments, the ability to harvest energy and extract information from fluid
structures is important. Aquatic and aerial animals have demonstrated fascinating abilities in these tasks.
Importantly, they typically do not approach these tasks individually; instead, collective locomotion plays
a crucial role in their efficiency and mobility.
In this thesis, we start with a single agent, discussing the optimal gait to achieve high efficiency and
speed in the context of fluid-structure interaction (FSI). We also used reinforcement learning (RL) to
discover novel strategies for navigating bio-inspired vortical wakes. Our work uncovered the physical
mechanisms behind these navigation strategies and emphasized the crucial role of flow sensing, especially
sensing of flow gradients, in navigating through traveling-wave signal fields.
Then, we moved on to multiple agents with high-fidelity hydrodynamic interactions. Building on prior
findings that passive hydrodynamics is enough to stabilize two synchronized agents at energy-saving
equilibria, we explored how these passive formations scale to larger schools by solving the FSI problems
among up to 10 agents. We found that "cooperative" side-by-side spatial patterns, which share energetic
benefits, scale to more agents, while "selfish" inline schools, with energetic savings in favor of trailing
agents, lose their cohesion beyond a certain number of agents. We also designed feedback controllers to
stabilize unstable schools based purely on local flow sensing.
To study even larger schools, we employed a self-propelled particle model of fish based on data-inferred
behavioral rules and far-field flow interactions. At smaller group sizes (∼ 100 agents), these models are
known to generate collective patterns such as polarized schooling and rotationally-ordered milling. Our
work shows that "more is different" by simulating up to 10^4 agents. The globally ordered schools typically break
down as the number of agents increases, and locally ordered subgroups interact with each other, continually
splitting and merging. We evaluated the information transfer within polarized clusters and found that
information transfer within a cluster is enhanced during merging and suppressed during splitting.
These results not only inform the future design and control of autonomous robots tasked with navigation
in fluid environments but also contribute to a broader perspective on nonlinear dynamics, flow control,
and biological physics.



Chapter 1
Introduction
1.1 General thoughts
To begin this thesis toward a Doctor of Philosophy, I would like to share some philosophical
thoughts on, and motivations for, my work.
1.1.1 A path toward autonomy in fluid environment
The concept of autonomy was first introduced by Immanuel Kant, for whom it stands for the ideal of free will or a rational self.
Following these ideas, the first autonomous robots, Elmer and Elsie, were built by William Grey Walter
in the late 1940s [212]. They were capable of phototaxis, that is, moving in response to a light stimulus. Nowadays, autonomy is considered in all aspects of engineering, including self-driving cars,
unmanned factories, etc.
Here we consider autonomous navigation in a flow field, which is inherently different from
moving on land. Three main points make flow-based navigation much more difficult.
(1) Autonomous driving takes place in an information-rich environment, where vision and telecommunication are
typically available. (2) When maneuvering on the road, a 2D motion is typically considered; namely, the
vehicle will not leave the terrain it is moving on. In contrast, when navigating in fluid environments,
motions in different directions are coupled nonlinearly. (3) Road environments are artificially constructed
with prescribed maps. Fluid environments, on the other hand, are constantly changing. These differences bring unique
challenges and opportunities to achieving autonomy in a fluid environment. Autonomy is typically not
achieved by a single controller; instead, a hierarchy of controllers that addresses different problems is
preferable [491, 224, 407]. In particular, three levels of control autonomy for robotics are defined in [147]:
sensory-motor autonomy, reactive autonomy, and cognitive autonomy. When navigating in fluid environments, interaction with fluid is important in each level of autonomy, as illustrated in Fig. 1.2. At the
lowest level, in sensory-motor autonomy, understanding the mechanism of fluid-structure interaction facilitates the design of propulsor morphology and kinematics to achieve optimal efficiency or reach certain
force production. My contributions in this area are [502, 181, 377]. At the middle level, near-field flow
interactions among different agents provide energetic benefits and stability at equilibrium locations in reactive autonomy. My contributions in this area are [193, 182]. At the highest level, the agents find
ways to navigate through complex time-varying environments. Traditionally, this is achieved by "Simultaneous Localization and Mapping (SLAM)"-like tasks [292, 120, 23, 164, 327, 326, 251], which involve
transforming egocentric information to a geocentric (allocentric) frame. Recently, data-driven methods have
enabled us to work directly on nonlinear information in the egocentric frame, for an individual or for a group
of collaborative agents. My contributions in this area are [184, 217, 183].
1.1.2 Biomimetics and Bioinspiration
To achieve autonomy, biological behavior is a treasure trove given to us by nature [491, 291]. Many interesting thoughts came from biology and finally became very useful engineering designs, such as radar and
winglets. However, what to learn and how to learn from biological behavior is still an open question. Here
I’d like to share my own opinion.
As illustrated in Fig. 1.3, I classified "learning from biological behavior" into four stages. The first stage
is directly mimicking the "superficial" features of biological behaviors, like the kinematics or morphology
[Figure 1.1 labels: fluid environment; stimulus source; obstacles; aquatic animals; underwater vehicles. Flow sensing: lateral line system (canal neuromasts, superficial neuromasts). Actuation: fins (caudal, dorsal, pectoral, ventral, anal). Focal agent: fluid-structure interaction, flow sensing, visual sensing, olfactory sensing, sensory cues, internal goal, actuation; other animal/vehicle.]
Figure 1.1: Schematics of a bio-inspired underwater vehicle. A focal agent, which is a fish or a vehicle for analysis, senses
and interacts with the local flow field surrounding it. The flow field contains information generated by other fish, vehicles, or
obstacles in the field. The internal goal of the fish can be efficient swimming, collective behavior, predation, escaping, quick
maneuver, etc, or a combination of some goals. The actuation is derived to achieve the internal goal based on sensory input.
[Figure 1.2 labels: axes "Level of control" and "Fidelity of physics model" (low to high); cognitive autonomy, reactive autonomy, sensory-control autonomy; path planning, estimation, localization, noise rejection, energy harvest, prescribe force and motion, optimize efficiency. Panel titles: source seeking via information contained in flow; PIV measurement of flapping wings; CFD simulation of bending propulsor; collective locomotion using behavior laws; emergent formation of fish school; vorticity color scale from -10 to 10.]
Figure 1.2: Summary of my work in the context of autonomy. [147] defined three levels of control autonomy for robotics:
sensory-motor autonomy, reactive autonomy, and cognitive autonomy. At lower levels, we need to resolve more accurate physics
models, and at higher levels, we need to apply more complex control methodologies. A. Collective locomotion of a fish school at extreme scale. The simulation contains 50,000 agents under local behavior rules and all-to-all hydrodynamic interactions [183]. B. Deep
reinforcement learning (DRL) discovers interpretable and generalizable sensory control strategies for tracking hydrodynamic
trails purely based on local flow sensing [184, 217]. C. Flow-coupled swimmers self-organized into formations of different spatial
patterns and power saving [193]. D. Leading edge vortex (LEV) formation around flapping wings with active or passive pitching
motion visualized by Particle Image Velocimetry (PIV) [502, 377]. E. Biologically-inspired bending swimmers outperform rigid
swimmers in swimming speed and efficiency [181].
of animals. A typical example is Leonardo da Vinci’s flying machine designs (Fig. 1.3). However, these
attempts failed because of a lack of understanding of physics in addition to the constraints in material
properties and power sources.
The second stage is taking inspiration from the key features of biological behaviors. George Cayley
is regarded as "the father of aviation" for his concept of the modern aeroplane in 1799, in which he proposed
to generate lift and thrust separately. Let us look at the similarities and differences
between this idea and real bird flight. Bird flight can be separated temporally into two phases: the
flapping phase and the gliding phase. In the flapping phase, birds generate both thrust and lift
using the unsteady motion of their wings. In the gliding phase, birds keep their wings straight, generating
lift while slowing down because of drag. Cayley’s ingenious idea discarded the complex
flapping phase entirely and borrowed from birds only how to generate lift using cambered wings. He then creatively proposed
to generate lift and thrust in a spatially separate way instead of a temporally separate way by introducing
another thrust-generating engine. A similar story also appeared in the relationship between Von Neumann
architecture and the biological brain.
The third stage of biomimetics emerged in the new century. With the advances in science and technology, man-made vehicles can now truly mimic biological behaviors in some aspects [288, 523]. To continue
with examples from aviation, the “Robobee”, which mimics the flapping flight of insects, took off in 2013 [288,
335]. These advances are based on a deeper understanding of the underlying physics [272, 126, 109], and
design and manufacture of micro actuators which are not biomimetic or bioinspired [499]. Meanwhile,
there are also parallel advances in artificial muscles [30, 313], neuromorphic computing [152, 294], etc,
which reproduce part of biological behaviors.
The fourth stage of bioinspiration will come for two reasons. First, biological behavior is
not necessarily optimal. Although millions of years of natural selection have endowed animals with incredible
abilities in interacting with their surrounding environments, natural selection is not an optimization process.
Instead, it is based on a "final elimination" principle. Moreover, the goals of biological behaviors and
engineering applications are both complex and might not align with each other. Second, man-made vehicles
will operate in environments that animals never reach. For example, the first helicopter
has operated on Mars, which, as far as we know, has not been explored by any animal. Moreover,
researchers have begun considering the possibility of designing flapping-wing vehicles for Mars [225].
Throughout this timeline, the important thing is understanding the universal physical mechanisms
underlying these phenomena. As illustrated in Fig. 1.4, although the real-world scenario experienced by
animals and vehicles is a high-dimensional space, the underlying physics lies in a low-dimensional subspace.
For example, fish and dolphins are far apart in the phylogenetic tree, but they developed tails of similar
shape (although with different orientations) because they interact with the same fluid environment.
In the rest of this thesis, we study fish for their fascinating abilities in propulsion efficiency [404,
255], maneuverability [482], utilizing ambient flow structures [270], and collective behavior [266, 194,
[Figure 1.3 timeline labels: Biology; 1480s, biomimetic (failed); since 1800s, bioinspiration (find key feature); since 2000s, biomimetic; ???, bioinspiration (beyond biology).]
Figure 1.3: Bioinspiration and Biomimetics. Biological behaviors provide many intuitions for engineering applications.
First stage. Early attempts at biomimetics (like da Vinci’s sketch) directly mimicked the morphology and kinematics of biological
behavior. These early attempts failed because of a lack of understanding of physics and engineering techniques. Second stage.
Bioinspiration takes inspiration from biological behaviors and designs engineering applications based on their key features. Third
stage. With the advance of measuring techniques and a deeper understanding of physics, people can mimic exactly what animals
are doing in a certain aspect. Fourth stage. Man-made vehicles could operate in environments that animals never explore and
would surpass the performance of animals in the near future. Under these scenarios, biological behaviors would still provide us
with inspiration and guidelines. Sources of the subfigures are listed as follows. First row (from left to right): generated by ChatGPT
- DALL-E 3; da Vinci’s sketch of a flying machine; the Wright brothers’ first flight [500]; Robobee [288]; "Ingenuity" Helicopter
operating on Mars [211]. Second row (from left to right): generated by ChatGPT - DALL-E 3; von Neumann architecture [476];
architecture of neuromorphic computing [307].
193], which are not achievable by any man-made vehicle (Fig. 1.1). Although many studies have
focused on these problems, it remains unclear to what extent these behaviors are based on active control or
passive interaction. Moreover, what is the minimal sensory cue required for active control under different
scenarios?
The physical mechanism behind fish behavior has two parts: an “internal” mechanism, which
is how the fish makes decisions based on all the sensory cues it gets from the environment, and an
“external” mechanism, which is how it interacts with the ambient flow field to sense and navigate.
The philosophy of my work is that we build an abstract computational model of the fish and use it to
discover the “external” mechanism of interacting with the ambient fluid environment. Based on what is
optimal in interacting with the fluid, we give a reasonable deduction of the fish’s “internal” mechanism.
[Figure 1.4 labels: Nature phenomena (high dimensional) → Encoder → Physics mechanism (low dimensional) → Decoder → Engineering application (high dimensional).]
Figure 1.4: Physics mechanism as a low dimensional latent space. Both biological behavior and engineering applications
live in high-dimensional real-world scenarios. We need to learn the low-dimensional latent space between them, which is physics.
It translates knowledge from natural phenomena to engineering purposes, like the encoder and decoder of an autoencoder.
1.2 Literature review
1.2.1 Literature review on fish swimming
Millions of years of natural selection endowed fish with remarkable abilities to swim efficiently compared
to underwater man-made propulsors [404, 255]. Several fish use body and caudal fin (BCF) deformations
for propulsion. Details of BCF deformations have been used to classify fish swimming modes [404, 405,
282, 415]. In most BCF swimming modes, anterior-to-posterior bending of the fish body seems
to play an important role in swimming efficiency, and it is often linked to increased flexibility towards the
fish tail or caudal peduncle [87, 451, 450, 452, 156].
In an effort to document the bending rules of fluid-based propulsors, both aerial and aquatic, [284]
collected morphometric data of the flexion parameters across length scales and animal taxa. They identified
two parameters to characterize the bending behavior: flexion ratio l/L between the flexion distance l
and total length L of the propulsor and maximum flexion angle A. They found that the flexion ratio and
maximum flexion angle of all surveyed animals, including fish, clustered in a limited design space: bending
occurs at about 70% of body length at maximum flexion angle of about 30°, as shown in Figure 2.1 for
swimming fish based on the data collected in [284]. It is unclear to what extent this anterior-to-posterior bending is active or whether it follows passively from the interaction of a flexible posterior
with the fluid motion. Either way, these findings raise the question of whether hydrodynamics could have
provided a selective force for driving this convergent bending design.
To address these questions, we analyze the influence of bending on the swimming speed and efficiency
of a simplified fish model that consists of anterior and posterior sections connected via a rotational joint
at the flexion point (see Figure 2.1A). The fish anterior undergoes periodic planar pitching while the posterior either (i) moves in synchrony with the anterior as if the two parts were a single rigid body, (ii)
bends actively at distinct amplitude and phase relative to the fish anterior, or (iii) bends passively due to
interactions with the flow generated by the fish anterior. We find that swimming with passive bending
could be more efficient than rigid flapping but at the cost of diminished swimming speed. Active bending provides more possibilities to alter the swimming performance through not only the flexion ratio and
maximum flexion angle as reported in [284] but also the phase difference ϕ between the flapping motions
of the anterior and posterior parts. Importantly, we find that antiphase anterior-to-posterior flexion can simultaneously enhance swimming speed and efficiency in a region of the design parameter space (l/L, A).
Flexion ratios and angles that lead to significant improvements in speed and efficiency overlap with the
observations of real fish reported in [284]. We analyze in depth the hydrodynamic mechanisms underlying
these improvements in swimming performance.
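As a rough illustration of the prescribed cases (i) and (ii) above, the joint kinematics can be written as two sinusoids that share a frequency and differ in amplitude and phase. The sketch below is only a kinematic toy and not the fluid-structure model used in this thesis: the function names, the sinusoidal form, the pivot at the leading edge, and all sample parameter values are assumptions made for illustration. Passive bending, case (iii), is omitted because it requires solving the coupled flow problem.

```python
import math

def section_angles(t, freq, anterior_amp, flexion_amp=0.0, phase=0.0):
    """Return (anterior pitch angle, posterior flexion angle relative to the anterior), in radians.

    flexion_amp = 0 gives case (i): anterior and posterior move as one rigid body.
    flexion_amp > 0 gives case (ii): active bending with phase offset `phase`.
    """
    theta_a = anterior_amp * math.sin(2.0 * math.pi * freq * t)
    alpha = flexion_amp * math.sin(2.0 * math.pi * freq * t + phase)
    return theta_a, alpha

def tail_tip(t, body_length, flexion_ratio, freq, anterior_amp,
             flexion_amp=0.0, phase=0.0):
    """Tail-tip position for a two-section body pitching about its leading edge (assumed pivot)."""
    theta_a, alpha = section_angles(t, freq, anterior_amp, flexion_amp, phase)
    l = flexion_ratio * body_length                 # flexion distance l, with l/L the flexion ratio
    joint_x, joint_y = l * math.cos(theta_a), l * math.sin(theta_a)
    tip_x = joint_x + (body_length - l) * math.cos(theta_a + alpha)
    tip_y = joint_y + (body_length - l) * math.sin(theta_a + alpha)
    return tip_x, tip_y

# Illustrative values only: flexion ratio 0.7 and maximum flexion angle 30 degrees.
quarter_period = 0.25  # instant of peak anterior pitch when freq = 1
rigid = tail_tip(quarter_period, 1.0, 0.7, 1.0, 0.2)
antiphase = tail_tip(quarter_period, 1.0, 0.7, 1.0, 0.2, math.radians(30), math.pi)
```

With a phase of π the posterior section flexes opposite to the anterior pitch, so at peak pitch the tail tip lies closer to the swimming axis than in the rigid case.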
Details of the flow field around swimming fish have received a great deal of attention. Several studies
used particle image velocimetry (PIV) to measure the flow field around live fish and analyze the interplay
between body deformations and thrust production; see, e.g., [328, 329, 270, 451, 156]. In-silico models of
various degrees of fidelity to fish morphology and kinematics have also been used to examine the effects of
body deformations on swimming speed and efficiency [121, 235, 450, 127]. Importantly, several experimental and numerical studies have shown that plates and foils undergoing pitching or heaving motions provide
good approximations of the fluid-structure interactions in swimming fish [52, 488, 254, 308], including the
reverse von Kármán vortex wake left behind swimming fish and flapping foils [434, 442].
A variety of fluid-structure interaction models have been proposed to analyze the effect of body flexibility on bending in flows; see, e.g., [408] and references therein. Here, we present a brief literature review
focused on this topic. [188] conducted experiments on a flapper with a rigid leading edge and flexible tail
fixed in a water channel and found that flexibility can enhance both efficiency and thrust production. [122]
and [123] numerically simulated the flapping motion of articulated rigid links and found that joint flexibility can reduce the power required for flapping. [6] used a filament of uniform flexibility to model the tail of
swimming fish in the context of the vortex sheet method and predicted enhancement in efficiency rather
than thrust when choosing parameters (dimensionless rigidity and reduced pitching frequency) that are
consistent with biological data. [379] conducted a large set of experiments on two-dimensional pitching
and heaving flexible plates at various stiffness values, kinematic parameters, and incoming flow speeds. By
combining grid search and gradient-based optimization methods, they found that optimizing the pitching
angle with heaving can almost double the propulsor efficiency compared to heave-only motions. [197]
conducted simulations combining three-dimensional Navier-Stokes equation with one-dimensional Euler–Bernoulli beam theory to analyze the motion of heaving flexible plates, and identified local peaks in
swimming speed over a parameter space consisting of the beam material property and heaving frequency.
Also using three-dimensional simulations of pitching plates of uniform flexibility, [100] found that the
phase delay ϕ between the leading and trailing edge of the plate decreases with increasing stiffness κ. For
large stiffness, the plate moves in no-neck mode (in-phase in our notation), in which thrust production is
close to that of a rigid pitching plate with similar trailing edge displacement.
The effects of uniform flexibility on swimming speed, thrust generation, swimming energetics, and
stability have been analyzed in numerous other experimental and computational studies; see, e.g., [411,
139, 312, 311, 88, 201, 479, 393]. Specifically, [277, 189, 435] have indicated that flexibility could lessen
or prevent thrust production. Flexible propulsors have also been widely used in man-made biomimetic
underwater autonomous vehicles (UAV); see, e.g., [151, 233, 160, 523, 489]. Most notable is the Tunabot
design of [523] and [489] which mimics the shape and bending kinematics of yellowfin tuna (Thunnus
albacares) and Atlantic mackerel (Scomber scombrus).
While most studies have focused on uniformly flexible bodies, [87] and [284] noted that the stiffness
along the fish body is not uniform, but decreases towards the tail, and that the propulsor becomes highly
flexible at the flexion point of the body; see Figure 2.1(A). To explore the effects of non-uniform flexibility
on efficiency and thrust production, [285] considered flexible plates of inhomogeneous stiffness undergoing heaving and pitching motions in a water tunnel and found that non-uniform stiffness can improve
thrust production, and that in order to achieve optimal propulsion, the morphologic factor (flexion ratio)
and kinematic factor (motion type and motion parameters) should be considered simultaneously. [470,
471] analyzed the effect of non-uniform flexibility on flight performance in the context of a tumbling wing
model, and found that wing tip flexibility that follows the empirical rules reported in [284] leads to improved flight performance.
1.2.2 Literature review on flow sensing
Swimming animals and obstacles can produce long-lasting wakes in their fluid environment as shown in
Figure 1.5A. Thus, the air and sea are not information-free but contain lots of information from animals
and vehicles moving through [419]. This information is extremely important in underwater environments
since lots of typical localization and sensing methods, such as GPS, camera, and radar, do not work well
underwater.
Many aquatic animals exhibit fascinating abilities to sense and respond to the information contained
in such flows by the lateral line system of fish [129, 303, 140, 420, 337, 391, 85, 350] or the whiskers of
harbor seal [106, 401, 493, 162, 185]. This ’distant-touch’ capability is manifested in behaviors ranging
[Figure 1.5 panel labels: A, B, C, D; annotations: northerly wind direction, bend sensors, whisker, flexible membrane.]
Figure 1.5: Literature review on flow sensing. A. Vortex street behind Jan Mayen Island in an image taken from the NASA
Terra satellite, using one downward-looking camera on the Multiangle Imaging Spectroradiometer (MISR). Northerly winds run
from left to right, and the streamwise observation length is 365 km. Image courtesy of NASA/GSFC/LaRC/ JPL and the MISR
team and [419]. B. Illustration of fish’s lateral line system including superficial neuromast and canal neuromast. Image courtesy
of [497]. C. The first piezoresistive ALL sensor fabricated by [134] in 2002. D. Flow sensor inspired by whisker of harbor seal
developed by [35].
from rheotaxis [350] to obstacle avoidance [498], schooling [194, 33, 503, 193], energy harvest [270] and
predation [370]. These behaviors have even been reported in blind fish, where visual cues are totally
absent [498, 319, 350]. Many mathematical models have been developed to explain these flow sensing
behaviors [391, 85].
1.2.2.1 Biological basis of flow sensing and development of flow sensors
Biological experiments have found that many fishes have a remarkable ability to sense the flow field
around them via the lateral line [110, 48, 90, 247, 91, 129, 303, 497, 140, 420, 337, 391, 85, 350, 373]. The
sensing units of the lateral line system are called neuromasts, which come in two types: superficial and
canal neuromasts, as shown in Figure 1.5B. Superficial neuromasts (SNs) sit on the skin of the fish
and are directly excited by the viscous boundary layer [247, 497]. SNs are able to sense flow direction
and velocity [90]. Canal neuromasts (CNs) are recessed in bone-like canals and are suitable for measuring
pressure [91, 85]. To explain the neuromasts’ ability to sense the flow field, many mathematical models
have been proposed. CNs can be simplified as a rigid hemispherical cupula sliding over a frictionless plate and
coupled with a linear spring [410, 336]. [302] simplifies the SN as a cylindrical beam that deflects in response
to an oscillating flow field.
Studies have also found that other animals have their own organs to sense the flow field, such as the whiskers
of harbor seals [106, 401, 493, 162, 185], the hairs of arthropods [67], and the antennae of copepods [508]. All these
biological sensing organs are accurate in sensing the flow field; for example, seals were reported to detect
flow velocities as low as 245 µm s−1 using specially adapted whiskers [105].
Inspired by biological behavior, much effort has been put into developing bio-inspired flow sensors [246, 276, 516]. The first artificial lateral line (ALL) sensor was fabricated in 2002, as shown in Figure 1.5C [134]. Since then, many ALL sensors have been developed, as reviewed in [516, 276, 410]. Nowadays, state-of-the-art artificial SN sensors can reach a detection threshold as low as 2.5 µm s−1 [300].
Meanwhile, [35, 36] developed whisker-like sensors inspired by the whiskers of harbor seals as shown
in Figure 1.5D. Another promising kind of flow sensor utilizes quantum sensors to sense the magnetic
field induced by the motion of sea water [150], which is inherently remote sensing and characterizes the
structure of the flow instead of measuring the flow at a single point.
1.2.2.2 Mathematical models on flow sensing and underwater navigation
Another important question is: once local flow properties can be measured, how can we decode
properties of the flow field from them or use them as the basis of underwater navigation? There are mainly two
approaches to utilizing local flow information: physics-based methods and data-driven methods.
Using physics-based methods, previous studies have found that from these local measurements, we
can decode some local flow properties such as whether the flow is shearing or rotating locally or the vorticity and Q criterion [85, 83, 84]. [506] showed the ability of artificial lateral lines in both dipole source
localization and wake characterization. [463] built a sensing platform with distributed pressure sensors
and placed it behind a cylinder in a water channel. From the signals of these pressure sensors, both
information about the wake, such as its frequency, traveling speed, and wavelength, and the lateral displacement
and orientation of the platform relative to the von Kármán vortex street can be decoded. Furthermore,
using data-driven methods, some global, structural properties of the flow field can be discovered. [82, 7]
found that flow measurements from an array of sensors, or a time series of data at a fixed position, can be used to decode the type of wake behind a pitching airfoil using a neural network. Using a Bayesian
method, [468, 483] found the optimal sensor placement on a swimming fish for locating an oscillating cylinder or a leading fish school in the flow field. [372] placed a trailing airfoil that can rotate passively in the
wake of a leading oscillatory airfoil in a water tunnel experiment and found that the induced angular velocity of the trailing airfoil can be used to classify vortex wakes via a neural network. In a water tunnel
experiment, [503] decodes the phase difference between the follower hydrofoil and the leader using the
time series data of a differential pressure sensor mounted on the follower via 1D-CNN and controls the
follower to achieve vortex phase matching based on this information.
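To make the physics-based local classification mentioned above concrete, the vorticity and the Q criterion can be computed directly from the local velocity gradient. The sketch below assumes a 2D flow and assumes that the four velocity-gradient components have already been estimated from neighboring sensors; the function name and the sample gradients are illustrative and are not taken from the cited works. It uses the standard definition $Q = \frac{1}{2}(\|\Omega\|^2 - \|S\|^2)$, with $S$ and $\Omega$ the symmetric and antisymmetric parts of the velocity gradient.

```python
def vorticity_and_q(du_dx, du_dy, dv_dx, dv_dy):
    """Return (vorticity, Q) for a 2D velocity gradient.

    Q > 0 indicates that rotation dominates strain locally, Q < 0 the opposite.
    """
    vorticity = dv_dx - du_dy
    # Strain-rate tensor S (symmetric part of the velocity gradient).
    s11, s22 = du_dx, dv_dy
    s12 = 0.5 * (du_dy + dv_dx)
    # Rotation tensor Omega (antisymmetric part); only one independent entry in 2D.
    w12 = 0.5 * (du_dy - dv_dx)
    norm_s_sq = s11 ** 2 + s22 ** 2 + 2.0 * s12 ** 2
    norm_w_sq = 2.0 * w12 ** 2
    q = 0.5 * (norm_w_sq - norm_s_sq)
    return vorticity, q

# Three canonical local flows (illustrative values):
rotation = vorticity_and_q(0.0, -2.0, 2.0, 0.0)   # rigid-body rotation, u = -2y, v = 2x
strain = vorticity_and_q(1.0, 0.0, 0.0, -1.0)     # pure strain, u = x, v = -y
shear = vorticity_and_q(0.0, 3.0, 0.0, 0.0)       # simple shear, u = 3y
```

Rigid rotation gives positive Q, pure strain gives negative Q, and simple shear sits on the boundary with Q equal to zero even though its vorticity is nonzero, which is why vorticity alone does not separate shearing from rotating flow.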
Rheotaxis, the behavior of fish to orient and swim against an incoming flow, has long been studied as a standard scenario in flow sensing and navigation [287, 12, 11,
318, 25, 337, 458, 350, 373]. [85] optimized the sensor placement on an elliptical body, and designed a
proportional controller based on differential pressure sensing for rheotaxis. [350] found that by sensing
the temporal change of circulation, fish are able to perform rheotaxis behavior in a parabolic flow.
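The idea of a proportional controller driven by differential pressure can be illustrated with a deliberately simple toy. This is not the controller or the sensor model of [85]: the assumption that the left-right pressure difference varies like the sine of the heading error, the gain values, and the function names are all hypothetical and chosen only to show the feedback structure.

```python
import math

def differential_pressure(heading_error, sensor_gain=1.0):
    """Toy sensor model (assumption): pressure difference proportional to sin(heading error)."""
    return sensor_gain * math.sin(heading_error)

def simulate_rheotaxis(initial_error, k_p=1.0, dt=0.01, n_steps=2000):
    """Integrate d(theta)/dt = -k_p * delta_p with forward Euler and return the heading history.

    theta is the heading error relative to the upstream direction, in radians.
    """
    theta = initial_error
    history = [theta]
    for _ in range(n_steps):
        turn_rate = -k_p * differential_pressure(theta)
        theta += dt * turn_rate
        history.append(theta)
    return history

headings = simulate_rheotaxis(2.0)  # start about 115 degrees away from upstream
```

Under these assumptions the heading error decays monotonically to zero, so the agent ends up facing upstream. A realistic sensor model would replace `differential_pressure` without changing the loop.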
Another interesting direction in underwater navigation is the extension of Zermelo’s problem to unsteady flow fields. [46] investigated Zermelo’s problem using a Reinforcement Learning (RL) approach in a
2D turbulent environment, assuming the absolute position is known. [174] extended Zermelo’s problem to traveling across the flow past a cylinder based on local flow sensing. They used RL to train the agent to
solve the navigation problem and found that sensing local velocity is more informative than sensing local
vorticity.
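For orientation, Zermelo’s problem asks how a vehicle moving at fixed speed relative to the fluid should steer to reach a target in minimum time. In the simplest setting of a steady, uniform current, the time-optimal path is a straight line flown at a constant heading that cancels the cross-track drift. The sketch below covers only this baseline case, not the unsteady flows studied in the works cited above; the function name and the sample numbers are assumptions for illustration.

```python
import math

def zermelo_heading(start, target, current, speed):
    """Constant heading (radians) and travel time to reach `target` in a uniform current.

    Returns None when the current is too strong for the target to be reached.
    """
    dx, dy = target[0] - start[0], target[1] - start[1]
    distance = math.hypot(dx, dy)
    ex, ey = dx / distance, dy / distance      # unit vector toward the target
    nx, ny = -ey, ex                           # left-pointing normal to the track
    w_along = current[0] * ex + current[1] * ey
    w_cross = current[0] * nx + current[1] * ny
    if abs(w_cross) > speed:
        return None                            # cross-track drift cannot be cancelled
    crab_angle = math.asin(-w_cross / speed)   # rotate the heading into the drift
    ground_speed = speed * math.cos(crab_angle) + w_along
    if ground_speed <= 0.0:
        return None                            # agent is swept away from the target
    heading = math.atan2(ey, ex) + crab_angle
    return heading, distance / ground_speed

# Target one unit downrange with a cross current of half the swimming speed.
solution = zermelo_heading((0.0, 0.0), (1.0, 0.0), (0.0, 0.5), 1.0)
blocked = zermelo_heading((0.0, 0.0), (1.0, 0.0), (0.0, 2.0), 1.0)
```

In unsteady or spatially varying flows no such closed form exists in general, which is part of the motivation for the learning-based approaches discussed above.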
An important issue is dealing with self-induced flow signals. A big challenge in applying these navigation strategies to real robots is that the external flow is coupled nonlinearly with the flow structure
induced by the focal swimmer’s body motion. [264] used the Fast Fourier Transform (FFT) to analyze
pressure and shear stress signals from a fish in a school and found that the signals in the frequency domain
can reflect the relative position, phase difference, and tail-beat frequency of its neighbor. [5] built a model
between moving kinematics and pressure sensing and found that at low-speed swimming, acceleration of
the swimmer dominates the sensed signal. [395] implemented a Braitenberg strategy on a robotic fish with
pressure sensors to fulfill rheotaxis in a uniform flow by moving against the flow. [468, 483] optimized the
sensor placement on moving fish bodies using Bayesian filter coupled with Navier-Stokes solver.
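The frequency-domain analysis mentioned above can be sketched on a synthetic sensor trace. The example below is not the analysis of [264]: the signal is made up, the function name is hypothetical, and a direct discrete Fourier transform is written out with the standard library so that the snippet is self-contained (an FFT would return the same spectrum faster).

```python
import cmath
import math

def dominant_frequency(signal, dt):
    """Return the frequency (in 1/time units of dt) with the largest spectral power.

    The mean is removed first so the zero-frequency bin does not dominate.
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]
    best_bin, best_power = 0, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_bin, best_power = k, power
    return best_bin / (n * dt)

# Synthetic pressure trace (assumed): a 3 Hz component plus a weaker 7 Hz component.
dt = 0.01
trace = [math.sin(2 * math.pi * 3.0 * i * dt) + 0.3 * math.sin(2 * math.pi * 7.0 * i * dt)
         for i in range(200)]
peak = dominant_frequency(trace, dt)
```

In a real measurement the self-induced component at the swimmer’s own tail-beat frequency would also appear in the spectrum and would have to be separated from the neighbor’s contribution.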
1.2.2.3 Source seeking and Chemotaxis
Among the various biological behaviors of sensing flow stimuli and responding accordingly, the source seeking
problem attracted our interest because it has implications for both understanding bio-behavior (such as predation and schooling) and designing bio-inspired vehicles. In most bio-behavior, responses
to hydrodynamic or mechanical signals and to olfactory or chemical signals are combined. Many studies
reveal that animals are responding to chemical signals ranging from micro scale to macro scale, including
bacteria [41], Pacific salmon [187], seabirds [338], lobsters [28, 108], fruit flies and mosquitoes [352], blue
crabs [487], and moths [63, 64]. To uncover the mechanism of chemotaxis, various gradient-based models
of following chemical signals have been proposed based on the hypothesis that the concentration field is spatiotemporally smooth with a large signal-to-noise ratio and a sufficiently large amplitude, such as [207, 138, 22,
344, 47, 321, 179, 21]. [179] consider tracking time-dependent signals, advected by a background flow in
one dimension. [163] developed an adaptive chemotaxis model in a one-dimensional traveling wave signal
field. [79, 77, 78] employed extremum seeking control to estimate the gradient of the scalar field while moving
toward source. To deal with intermittent and noisy fields, which are common in macro-scale phenomena,
[465, 128] proposed a Bayesian-based method called the "infotaxis" strategy, which maximizes information
acquisition by the swimmer. It has been implemented on robots [297, 279, 299]. [280] utilized tree-based
method, spatial-aware optimization, and DRL to find a more optimal strategy which beats infotaxis. [223]
consider a walking drosophila larva following the odor motion at Reynolds and Péclet numbers in
the range $O(10^2)$ to $O(10^3)$. However, the choice of using a Reynolds-averaged Navier-Stokes (RANS) fluid
solver prevents them from resolving the spatial-temporal flow field, which restricts the possible strategies.
[414] used reinforcement learning (RL) to train recurrent neural network (RNN) agents in tracking turbulent wakes. They proved that memory is required for navigating in this scenario, but did not provide
any interpretation or proof of stability of their policy. With limited memory, [464] optimized a finite state
machine for olfactory tracking in turbulent regimes.
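The gradient-based class of models reviewed above can be illustrated in its most basic form: an agent probes the concentration around itself, estimates the gradient by finite differences, and steps uphill. The sketch assumes exactly the idealized conditions named in the text (a smooth, steady, noise-free field); the Gaussian field, step size, and function names are hypothetical and do not reproduce any of the cited models.

```python
import math

def concentration(x, y, width=2.0):
    """Assumed smooth, steady concentration field: a Gaussian plume centered at the origin."""
    return math.exp(-(x * x + y * y) / (2.0 * width ** 2))

def climb_gradient(start, step=0.1, probe=1e-3, n_steps=200):
    """Move at fixed step length along the locally estimated concentration gradient."""
    x, y = start
    for _ in range(n_steps):
        # Central finite differences stand in for spatially separated sensors.
        gx = (concentration(x + probe, y) - concentration(x - probe, y)) / (2.0 * probe)
        gy = (concentration(x, y + probe) - concentration(x, y - probe)) / (2.0 * probe)
        norm = math.hypot(gx, gy)
        if norm < 1e-12:
            break                      # no detectable gradient
        x += step * gx / norm
        y += step * gy / norm
    return x, y

final_position = climb_gradient((3.0, -2.0))
```

Because the step length is fixed, the agent ends up hopping back and forth within one step of the source. With intermittent or noisy signals the finite-difference estimate becomes unreliable, which is the regime addressed by infotaxis and the learning-based strategies cited above.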
1.2.3 Literature review on collective locomotion
Animals typically perform locomotion in groups for various reasons [356, 359, 96, 94, 95, 429, 253, 230,
522, 367]. When considering the groups of flying and swimming animals, the hydrodynamic interaction
plays a crucial role in near-field interactions. The wake behind swimming fish forms coherent patterns [442,
436], which interact with the trailing swimmers and provide energetic benefits and stability [486, 374,
402, 519]. However, when studying collective behavior composed of hundreds of agents, social interaction drives each individual to locally attract to and align with their neighbors [96, 94, 95, 154]. When
hydrodynamic interaction needs to be combined with this model, reduced-order models are typically employed [142, 521] due to the computational complexity in solving the fluid-structure interaction among
many individuals [73]. These models are also connected to both social interaction models and active matter [469, 440, 439, 293].
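Collective states produced by attraction and alignment models, such as polarized schooling and milling, are commonly distinguished with two scalar order parameters: the polarization (magnitude of the mean heading vector) and a rotational order parameter (mean normalized angular momentum about the group centroid). The sketch below uses these common textbook definitions, which may differ in detail from those adopted later in this thesis; the function name and the test configurations are illustrative.

```python
import math

def order_parameters(positions, headings):
    """Return (polarization, milling) for agents with 2D positions and heading angles.

    polarization is close to 1 when all agents swim in the same direction;
    milling is close to 1 when agents circle around the group centroid.
    """
    n = len(positions)
    ux = [math.cos(h) for h in headings]
    uy = [math.sin(h) for h in headings]
    polarization = math.hypot(sum(ux) / n, sum(uy) / n)

    cx = sum(p[0] for p in positions) / n
    cy = sum(p[1] for p in positions) / n
    spin = 0.0
    for (x, y), ex, ey in zip(positions, ux, uy):
        rx, ry = x - cx, y - cy
        r = math.hypot(rx, ry)
        if r > 0.0:
            spin += (rx * ey - ry * ex) / r    # z-component of r_hat cross heading
    milling = abs(spin) / n
    return polarization, milling

# A polarized school: four agents on a square, all heading along +x.
school = order_parameters([(0, 0), (1, 0), (0, 1), (1, 1)], [0.0] * 4)

# A mill: eight agents on a circle, each heading tangentially.
angles = [2.0 * math.pi * k / 8 for k in range(8)]
mill = order_parameters([(math.cos(a), math.sin(a)) for a in angles],
                        [a + math.pi / 2 for a in angles])
```

The polarized school returns a polarization of one and a milling value of zero, and the mill returns the reverse, so the pair of numbers separates the two globally ordered states.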
1.2.3.1 Near-field High Fidelity Hydrodynamic Interactions
Flow interactions are thought to allow flying and swimming animals to derive energetic benefits when
moving in groups [486]. However, direct assessment of such benefits is challenging, chiefly because animal groups do not generally conform to regular patterns – individuals in these groups dynamically change
their position relative to their neighbors [356, 429, 295, 16, 314]. Moreover, direct energetic measurements in moving animals, flying or swimming, are notoriously difficult and often unreliable as a proxy for
hydrodynamic energy savings [192, 296, 237, 295, 266, 518, 438, 519]. These difficulties hinder a direct
mapping from the spatial pattern of the group to the energetic benefits or costs incurred at each position
within the group.
An understanding of how the spatial arrangement of individuals within a group influences their cost
of locomotion can provide insights into the evolution of social structures, resource allocation, and overall
fitness of each individual in cooperative activities such as foraging, mating, and evasion [490, 96, 94, 97,
24, 136, 216]. It could also guide the design of bio-inspired engineering systems and algorithms that steer
groups of entities, such as swarms of autonomous robotic vehicles, underwater or in flight, that collaborate to achieve a desired task while minimizing energy consumption and improving the overall system
efficiency [262, 523, 92, 265, 42].
Various direct and indirect approaches have been employed to understand the potential energetic benefits of group movement. Li et al. [266] associated energy savings in pairs of flapping robotic swimmers
with a linear relationship between their flapping phase lag and relative distance. Based on this, a strategy,
called Vortex Phase Matching, was extrapolated for how fish should behave to maximize hydrodynamic
benefits: a follower fish should match its tailbeat phase with the local vorticity created by a leader fish.
Pairs of freely swimming fish seemed to follow this linear phase-distance relationship even with impaired
vision and lateral line sensing, that is, in the absence of sensory cues about their relative position and
neighbor-generated flows. Interestingly, the same linear phase-distance relationship was uncovered independently in flapping hydrofoils and accredited solely to flow interactions [524, 339, 341]. It is, therefore,
unclear whether vortex phase matching is an active strategy, mediated by sensing and feedback control,
that fish employ to minimize energy expenditure or if it arises passively through flow interactions between
flapping swimmers. Importantly, active or passive, it is unclear if this strategy scales to groups larger than
two.
In an effort to directly gauge the energetic benefits of schooling, metabolic energetic measurements
were recently performed in solitary fish and in groups of eight fish, and impressive energetic savings were attributed to schooling compared to solitary swimming when the fish were challenged to swim at high
speeds [519]. Lamentably, the study made no mention of the spatial patterns assumed by these physically-thwarted individuals [519]. In an independent previous study [16], changes in spatial patterns and tailbeat
frequencies were reported in similar experiments, albeit with no energetic measurements. Specifically, [16]
showed that, when challenged to sustain higher swimming speeds, the fish in a group rearranged themselves in a side-by-side pattern as the speed increased, presumably to save energy.
Taking the results of [519, 16] together, are we to conclude that side-by-side formations are more
energetically beneficial than, say, inline or diagonal formations? The answer is not simple! The metabolic
measurements of [295] in a school of eight fish report that side-by-side formations, though beneficial,
produce the least energetic savings compared to diagonal formations [486]. In an experimental study of a
single fish interacting with a robotic flapping foil, the freely-swimming fish positioned itself in an inline
formation in the wake of the flapping foil, supporting the hypothesis that swimming in the wake of one
another is an advantageous strategy to save energy in a school [438]. Why did the fish in the experiments
of [16] self-organize in a side-by-side formation when challenged to swim at higher swimming speeds?
The answer is not simple because ample hydrodynamic mechanisms for energy savings in fish schools
have been stipulated for each possible configuration – side-by-side, inline, and diagonal (see, e.g., Fig. 1
of [519]) – but no assessment is provided of the relative advantage of these configurations. For example,
side-by-side formations, where fish mirror each other by flapping antiphase, are thought to create a wall-like effect that reduces swimming cost [519, 143]. A fish swimming in the wake between two leading
fish encounters a reduced oncoming velocity, leading to reduced drag and thrust production [486]. Inline
formations, where fish swim in tandem, are thought to provide benefits to both leader and follower, by
an added mass push from follower to leader [143, 454] and reduced pressure on the follower [249]. All
of these mechanisms can, in principle, be exploited by schooling fish as they dynamically change their
relative spacing in the group. But are these mechanisms equally advantageous? Or is there a hierarchy
of hydrodynamic benefits depending on the relative position within the school? The literature offers no
comparative analysis of the energetic savings afforded by each of these configurations.
The study of [295] is arguably the closest to addressing this question, but, to map the energetic benefits
for pairwise configurations, the authors employed statistical averages in a school of eight fish, thus inevitably combining the various hydrodynamic mechanisms at play and cross-polluting the estimated benefits of each configuration. A cleaner analysis in pairs of flapping foils shows that these relative positions
– side-by-side, inline, and diagonal – all emerge spontaneously and stably due to flow interactions [341],
but provides no method for estimating the energetic requirements of these formations, let alone comparing them energetically. Even vortex phase matching makes no distinction between side-by-side, inline, or
diagonal pairs of fish [266]. It simply postulates that an unknown amount of energetic benefit is acquired
when the linear phase-distance relationship is satisfied. Thus, to date, despite the widespread notion that
group movement saves energy, a direct comparison of the energetic savings afforded by different spatial
formations remains lacking. Importantly, it is unknown whether and how the postulated benefits scale
with increasing group size.
Here, to circumvent the challenges of addressing these questions in biological systems, we formulate
computational models that capture the salient hydrodynamic features of single and pairs of swimming fish.
Namely, we represent each fish as a freely-swimming hydrofoil undergoing pitching oscillations about its
leading edge. A single-flapping hydrofoil shares many hydrodynamic aspects with its biological counterpart, including an alternating, long-lived pattern of vorticity in its wake [442, 443, 254, 467, 415]. These
similarities have been demonstrated repeatedly, within biologically relevant ranges of flapping parameters [442, 436], for different geometries [475, 431, 165, 20], material properties [323, 378, 20], and flapping
kinematics [451, 235, 208].
Although studies have found that, in large schools, passive hydrodynamics is not enough to form or
maintain a school [73], whether two swimmers can form a stable school passively remains controversial.
Studies on pitching/heaving airfoils/plates/flags have found that a stable formation can form via purely
passive hydrodynamics [524, 33, 384, 339, 248]. For the models that characterize the body deformation of
fish more realistically, [466] trained the following fish to harvest vortices in the wake of the leading fish
via combining reinforcement learning and direct numerical simulation. Two more recent studies showed
that two fish-like swimmers are unable to form a stable school using only passive hydrodynamic interaction [485, 526]. Experimental studies using robotic fish have found that vortex phase matching (VPM)
is the key for energy saving in fish schooling [266], and a controller has been learned to achieve VPM using proprioceptive sensing [265]. Importantly, [526] found that by applying RL-trained controllers which guide
a single swimmer to swim at a given speed in a given direction, schooling fish can self-organize into stable and energy-saving schools without any knowledge about each other. This again emphasizes the
importance of hydrodynamic interaction in stabilizing fish schools dynamically.
1.2.3.2 Fish school as a model of social interaction and active matter
When considering an increasing number of agents interacting with each other, the details of physical
interaction become less important (Fig. 1.2); instead, using a combination of passive physical interaction and active neural control, animals of diverse sizes, species, and substrates, ranging from cells [172] and
insects [153, 69, 397] to birds [68, 45] and fish [154], arrive at a similar set of behavior laws: alignment,
attraction, and propulsion [469, 440, 439, 96, 94, 95, 441, 10, 161, 72, 93]. Simulations, either directly based
on these simple rules, or based on behavior rules derived from biological experiments [154, 60], have successfully reproduced the collective patterns observed in nature – disordered swarming, rotational milling,
and polarized schooling [9, 209, 96, 61, 142, 203]. These global states emerge as a result of combining
local behavior laws and individual noise (free will). However, in most of these simulations, either a small
number of swimmers (< O(100)) [61, 142, 203], or a periodic boundary condition [469, 440, 439, 364, 170,
55, 505, 81], or bounded domain [158, 39, 203], are typically considered.
Many examples in biology and society have shown that "more is different" [8, 423]. In addition, animal
groups typically live and maneuver with a free boundary, which allows an inflow of information [71]. Thus, we are
curious about whether the global patterns that emerge in a small number of swimmers remain
stable as the number of agents increases with a free boundary.
Collective responsiveness in animal groups manifests in long-ranged spatial correlations [333, 68, 213].
Correlation measures how the change in the behavior of one individual influences the behavior of others
in the group. An animal group exhibits maximal responsiveness to a perturbation, say, caused by an attacking predator [180, 390], when the range of spatial correlations scales with the group size, that is, when
correlations are scale-free [68, 180]. Analysis of empirical data of large bird flocks confirms that spatial
correlations scale linearly with group size [68]. This is surprising because in most systems, correlations
decay with distance, due to noise and the nature of interactions between individuals [68, 520]. For example,
in the game of telephone, where a player secretly shares a phrase with the next person, who then passes it
along to the next player and so on, the interaction range is one, whereas the correlation length — representing how far the phrase spreads before becoming distorted — goes well beyond one, but typically does not
scale with group size [310, 65]. In physical models of flow-coupled swimmers, microscopic [75, 445, 446]
and inertial [444, 193, 340], perturbations get amplified as they propagate via the fluid medium, hindering group cohesion. Biological swimmers, on the other hand, seem to correlate their tailbeat frequencies
and phase [524, 16, 339, 517, 266], but the extent of flow-mediated correlations is limited in space [193].
Recent evidence suggests that visual interactions are sufficient to induce collective responsiveness and
scale-free correlations in polarized robotic groups [520]. Could scale-free correlations be a universal feature of emergent polarization in self-propelled individuals interacting visually, including in schooling fish?
If so, how do flow interactions and dynamic self-organization within the group affect spatial correlations?
Importantly, how fast does information travel within a polarized group?
Inspired by the analysis of information propagation in bird flocks [17], we consider the behavior of fish
during collective turns. We find that the information about the change in direction propagates linearly in
time across the group, at speeds much faster than the individual swimming speed. This is in sharp contrast
to the diffusive information propagation in symmetric, consensus-based models [469], and in the absence
of behavioral inertia [17]. Here, we show that symmetry is broken due to the non-reciprocal nature of
the interactions between individual swimmers [19, 149], much like in the game of telephone, where the
message is transmitted from one person to the next person who did not already have the information. This
non-reciprocity inherently breaks symmetry and ensures that the message travels ballistically in time in
one direction, as opposed to the diffusive propagation that occurs when each person randomly chooses to
transmit the information in either direction, left or right, irrespective of where it came from [40].
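The contrast between ballistic and diffusive propagation can be illustrated with a toy one-dimensional chain. This is a minimal sketch and not the model used in this thesis: the function names and the unit displacement per handoff are assumptions made for illustration only. In the directed case the message always moves to the next person, whereas in the undirected case it moves left or right with equal probability, and its root-mean-square displacement is computed exactly from the random-walk distribution.

```python
import math


def directed_distance(n_handoffs):
    """Distance travelled when each handoff goes to the next person in line."""
    return n_handoffs


def undirected_rms_distance(n_handoffs):
    """Exact RMS displacement when each handoff goes left or right with probability 1/2."""
    center = n_handoffs
    # probs[k] is the probability that the message sits at position k - center.
    probs = [0.0] * (2 * n_handoffs + 1)
    probs[center] = 1.0
    for _ in range(n_handoffs):
        updated = [0.0] * len(probs)
        for k, p in enumerate(probs):
            if p > 0.0:
                updated[k - 1] += 0.5 * p
                updated[k + 1] += 0.5 * p
        probs = updated
    mean_square = sum(p * (k - center) ** 2 for k, p in enumerate(probs))
    return math.sqrt(mean_square)


# Compare the two propagation modes for an increasing number of handoffs.
for n in (4, 16, 64):
    print(n, directed_distance(n), undirected_rms_distance(n))
```

In this toy chain the directed distance grows linearly with the number of handoffs, while the undirected root-mean-square distance grows as its square root.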
Another direction concerns a continuum description of the school, treating it as active matter using statistical physics models such as the Ising model [301] and the XY model [245]. Unsteady dynamics of
collective animal groups are regarded as "active turbulence" [170, 55, 114, 39, 171]. These phenomena are
similar to inertial turbulence in the sense of having chaotic dynamics and deterministic statistics. However, the two differ in how energy is supplied and transferred: in inertial turbulence, energy comes from macroscopic activities and transfers from large
to small length scales. In active turbulence, on the other hand, energy comes
from the activity of microscopic particles and transfers from small to large length scales [170, 55, 114, 39].
In inertial turbulence, the famous 5/3 law of the energy spectrum was derived from dimensional analysis
by Kolmogorov [243, 244] in 1941, and validated by numerous experiments and simulations [514, 349].
However, in active turbulence, different systems would generate different E − k relationships in energy
spectrum [376]. This illustrates that the energy transfer rate depends on the specific system we are considering.
Thus, deriving a continuum description is crucial for unraveling the physics behind active turbulence [57]. The Toner-Tu equation [440, 439, 441] provides a general form of the PDE governing the temporal
evolution of active matter, which includes anisotropic and higher-order terms in addition to the standard
advection and diffusion terms of the Navier-Stokes equations. Based on this, different systems have been analyzed using velocity correlation and velocity-density correlation in Fourier space [364, 505] to discover the sound
speed [158], correlation of fluctuations [68], etc. Recent advances in data-driven methods have made it increasingly possible to derive continuum descriptions directly from microscopic particle simulations [363,
427, 258, 132, 131].
These collective behavior models and analysis tools not only explain the collective behavior of animal
groups but also connect well to the development of active matter [293, 70, 114, 474], the social behavior of
humans [4, 93, 261, 354, 325, 171], and potential applications in coordinated robot swarms [267, 42, 430, 520].
Chapter 2
Hydrodynamics of individual swimmers: flexible versus rigid flapping
swimmer
2.1 Problem formulation
We model the flexible swimmer as a planar two-link body of total length L, negligible thickness e ≪ L, and
total mass per unit depth m = ρeL, where ρ is both the swimmer and fluid density, assuming a neutrally-buoyant fish. The flexion point indicates where the anterior link (of length l) is joined to the posterior
link; see Figure 2.1A. The anterior link undergoes sinusoidal pitching motion θa(t) = a sin(2πf t), where
θa is the angle relative to the swimming direction, taken to be parallel to the x-axis. Here, a is the flapping
amplitude, f = 1/T the flapping frequency, and T the flapping period. When the posterior link is connected
rigidly to the anterior link at zero flexion, the posterior motion θp(t) is equal to θa(t) and the flexion angle
α = θp − θa is identically zero for all time. The two links form a single rigid plate (Figure 2.2A) whose
swimming motion due to sinusoidal pitching has been extensively analyzed [220, 409, 322, 194, 250]. To
explore the effects of body bending on swimming, we consider two cases: (i) active bending where the
flexion angle α(t) is controlled by the swimmer, and (ii) passive bending where the flexion angle α(t) is
dictated by the physics of fluid-structure interactions.
[Figure 2.1 graphic. Panel A: schematic marking the flexion point, flexion angle α, and flexion distance l. Panel B: axes are flexion angle A (0° to 80°) and flexion ratio l/L (0 to 1), with body lengths from centimeters to meters, for sturgeon, molly, rosy barb, butterfly fish, koi, dogfish, leopard shark, tiger shark, tuna, Atlantic salmon, flying fish, clownfish, bass, and yellow amberjack.]
Figure 2.1: Bending rules of swimming fish. A. Schematic of flexion parameters in fish with flexion distance l, ratio l/L, and
maximum angle defined as in [284]. B. Flexion ratio l/L and maximum flexion angle A observed in different fish species; data
taken from [284]. The observed flexion ratio and angles are fairly consistent among different fish species, despite large variations
in length scales.
[Figure 2.2 graphic. Panels: A. Rigid Swimmer; B. In-phase Active Flexion; C. Anti-phase Active Flexion; D. Passive Flexion. Insets show angles (−30° to 30°) over one period (0 to 1).]
Figure 2.2: Wake and flow velocity of free swimmers. A. Rigid swimmer undergoing periodic pitching (inset) of period
T = 1 and amplitude a = 15°. B., C. Active bending with both anterior and posterior sections undergoing periodic pitching albeit
at different amplitude and phase. The anterior follows the same pitching motion (blue line in inset) as the rigid swimmer while
the relative rotation of the posterior follows a prescribed Jacobi elliptic sine function (red line in inset) with flexion amplitude
A = 30°, flexion ratio l/L = 0.7, elliptic modulus M = 0.9, and phase ϕ = 0 (in-phase) and ϕ = π (antiphase) in B. and
C., respectively. D. Passive bending of posterior while anterior follows the same prescribed pitching as the rigid swimmer. Joint
parameters are set to κ = 0 and c = 1. The dissipation time is set to be Tdiss = 1.625T in A.-C., √2.09T in D.
When the two-link swimmer bends actively, we allow the anterior link to have a phase advantage of
magnitude ϕ relative to the flapping motion of the posterior link. At ϕ = 0, both anterior and posterior
links flap in phase and the swimmer bends in the direction of flapping (Figure 2.2B); for ϕ = π, they flap
antiphase resulting in bending in the opposite direction to the anterior pitching motion (Figure 2.2C). The
flexion angle α(t) follows a Jacobi elliptic sine function α(t) = A sn(4Kf t, M), where A is the maximum
flexion angle and M is the elliptic modulus that controls the shape of the elliptic sine function. As M → 0,
the elliptic sine function tends to a sinusoidal function and as M → 1, it approaches a square wave shape.
The parameter K is introduced to ensure that the flapping frequency of the posterior link is the same as
that of the anterior link; namely, K = F(π/2, M), where F(π/2, M) is the incomplete elliptic integral of
the first kind evaluated at π/2, that is, the complete elliptic integral. Unless otherwise specified, we fix M = 0.9 and explore the effects of the
anterior-to-posterior phase difference ϕ, flexion ratio l/L, and maximum flexion angle A on the swimming
performance.
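The prescribed flexion α(t) = A sn(4Kf t, M) can be evaluated with a short standard-library sketch. The function names below are hypothetical, and the sketch rests on two assumptions: it takes the elliptic parameter m = k² as its second argument (whether M in the text denotes the modulus k or the parameter m is not stated here), and it omits the phase ϕ because the formula above does not show where it enters. The Jacobi sine is computed with the classical arithmetic-geometric-mean recursion rather than with any solver used in this thesis.

```python
import math


def complete_elliptic_k(m):
    """Complete elliptic integral of the first kind, K(m), with parameter m = k**2 < 1."""
    a, b = 1.0, math.sqrt(1.0 - m)
    for _ in range(40):  # the arithmetic-geometric mean converges quadratically
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return math.pi / (2.0 * a)


def jacobi_sn(u, m):
    """Jacobi elliptic sine sn(u | m) from the arithmetic-geometric-mean recursion."""
    a = [1.0]
    b = math.sqrt(1.0 - m)
    c = [math.sqrt(m)]
    while abs(c[-1]) > 1e-15 and len(a) < 40:
        a_next = 0.5 * (a[-1] + b)
        c.append(0.5 * (a[-1] - b))
        b = math.sqrt(a[-1] * b)
        a.append(a_next)
    n = len(a) - 1
    phi = (2 ** n) * a[n] * u
    # Step the amplitude back down from level n to level 0.
    for i in range(n, 0, -1):
        phi = 0.5 * (phi + math.asin(c[i] / a[i] * math.sin(phi)))
    return math.sin(phi)


def flexion_angle(t, A, f, m):
    """Prescribed flexion alpha(t) = A * sn(4 K f t | m), which has period 1/f."""
    K = complete_elliptic_k(m)
    return A * jacobi_sn(4.0 * K * f * t, m)


# One flapping period sampled at quarter-period intervals (A in degrees, f = 1).
for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(t, flexion_angle(t, 30.0, 1.0, 0.9))
```

Setting m = 0 in this sketch recovers the sinusoid A sin(2πf t), consistent with the limit M → 0 described above.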
We write the equations governing the self-propelled motion of the two-link swimmer in
non-dimensional form. To this end, we scale all parameter values using L/2 as the characteristic length
scale, 1/f as the characteristic time scale, and $\rho (L/2)^2$ as the characteristic mass per unit depth. Accordingly, velocities are scaled by $Lf/2$, forces by $\rho f^2 (L/2)^3$, moments by $\rho f^2 (L/2)^4$, and power by $\rho f^3 (L/2)^4$. The equation of motion governing the free swimming x(t) is given by Newton’s second law
$m\ddot{x} = F_x - D_x.$ (2.1)
Here, Fx and Dx denote the x-components, in the swimming direction, of the hydrodynamic pressure force
normal to the swimmer and the drag force due to skin friction tangential to the swimmer. By definition, Fx
and Dx can be either positive or negative, and thus can propel the swimmer forward or resist its motion.
However, for notational convenience, we refer to Fx as thrust and Dx as drag, though technically they can
be either; we refer to negative values of Fx as negative thrust.
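The characteristic scales listed above can be collected in a small helper, given here as a minimal sketch. The function names and the sample dimensional values (density, length, frequency, thickness) are hypothetical and chosen only for illustration. The second function uses the fact that the swimmer mass per unit depth ρeL divided by the mass scale ρ(L/2)² simplifies to 4e/L.

```python
def characteristic_scales(rho, L, f):
    """Scales per unit depth built from length L/2, time 1/f, and mass rho*(L/2)**2."""
    length = L / 2.0
    return {
        "length": length,
        "time": 1.0 / f,
        "mass_per_depth": rho * length ** 2,
        "velocity": L * f / 2.0,
        "force": rho * f ** 2 * length ** 3,
        "moment": rho * f ** 2 * length ** 4,
        "power": rho * f ** 3 * length ** 4,
    }


def nondimensional_mass(e, L):
    """m = rho*e*L divided by rho*(L/2)**2, which simplifies to 4*e/L."""
    return 4.0 * e / L


# Hypothetical dimensional values: water density, a 0.2 m body flapping at 2 Hz.
scales = characteristic_scales(rho=1000.0, L=0.2, f=2.0)
for name, value in scales.items():
    print(name, value)
print("nondimensional mass", nondimensional_mass(e=0.002, L=0.2))
```

Because e ≪ L, the non-dimensional mass 4e/L is small in this scaling.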
We calculate the hydrodynamic pressure force in the context of the inviscid vortex sheet model [343,
204, 205, 194], and the drag force based on a skin friction model that emulates the effect of fluid viscosity [135, 384]. A brief overview of the vortex sheet method and its numerical implementation is given in
Sec. A.2.
When the swimmer bends passively, the relative rotation α(t) of the posterior end is not prescribed a
priori and follows from the physics of fluid-structure interactions. Considering that the rotational joint at
the flexion point is equipped with a torsional spring of stiffness κ and damping coefficient c, we write the
equation governing the rotational motion of the posterior link
$I_p(\ddot{\theta}_a + \ddot{\alpha}) + c\dot{\alpha} + \kappa\alpha = M_p + M_{\mathrm{inertia}},$ (2.2)
where Ip and Mp are the moment of inertia and hydrodynamic moment acting on the posterior link about
the flexion point, Minertia is an inertial moment that arises because the flexion point about which the
moments are balanced is moving.
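To make the structure of Eq. (2.2) concrete, the following minimal sketch integrates it in time with a classical fourth-order Runge-Kutta step. It is not the coupled fluid-structure solver of this thesis: the callback `moment(t, alpha, alpha_dot)` is a hypothetical stand-in for the right-hand side Mp + Minertia, which in the actual model comes from the vortex sheet computation, and the function name, time step, and sample parameter values are assumptions.

```python
import math


def integrate_flexion(I_p, c, kappa, a, f, moment, alpha0, alpha_dot0, dt, n_steps):
    """Integrate I_p*(theta_a'' + alpha'') + c*alpha' + kappa*alpha = moment with RK4."""
    omega = 2.0 * math.pi * f

    def theta_a_ddot(t):
        # Second derivative of the prescribed pitching theta_a = a*sin(2*pi*f*t).
        return -a * omega ** 2 * math.sin(omega * t)

    def rhs(t, alpha, alpha_dot):
        applied = moment(t, alpha, alpha_dot) - c * alpha_dot - kappa * alpha
        return alpha_dot, applied / I_p - theta_a_ddot(t)

    t, alpha, alpha_dot = 0.0, alpha0, alpha_dot0
    history = [(t, alpha, alpha_dot)]
    for _ in range(n_steps):
        k1 = rhs(t, alpha, alpha_dot)
        k2 = rhs(t + 0.5 * dt, alpha + 0.5 * dt * k1[0], alpha_dot + 0.5 * dt * k1[1])
        k3 = rhs(t + 0.5 * dt, alpha + 0.5 * dt * k2[0], alpha_dot + 0.5 * dt * k2[1])
        k4 = rhs(t + dt, alpha + dt * k3[0], alpha_dot + dt * k3[1])
        alpha += dt / 6.0 * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0])
        alpha_dot += dt / 6.0 * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1])
        t += dt
        history.append((t, alpha, alpha_dot))
    return history


def no_moment(t, alpha, alpha_dot):
    """Placeholder for M_p + M_inertia: no fluid loading at all (illustration only)."""
    return 0.0


# Free joint (kappa = 0, c = 0) with no loading, started with theta_p = 0 at rest.
a = math.radians(15.0)
run = integrate_flexion(1.0, 0.0, 0.0, a, 1.0, no_moment, 0.0, -a * 2.0 * math.pi, 1e-3, 250)
print(run[-1])
```

In this unloaded case with a free joint, the posterior link stays at rest in the fixed frame, so the flexion angle simply mirrors the anterior pitching, α = −θa. Any realistic behavior requires replacing the placeholder moment with the hydrodynamic and inertial moments of the model.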
To assess the swimming performance of the two-link swimmer, we introduce four metrics: the period-average swimming speed $U = \int_t^{t+T} \dot{x}\,dt$ at steady state, the thrust force $F_x$, the period-average input
power $P = \int_t^{t+T} P(t)\,dt$ required to maintain the prescribed flapping motions, and the propulsion efficiency $mU^2/(2PT)$ defined as the kinetic energy of the swimmer divided by the input work over one
flapping period.
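The period-average metrics can be computed from sampled time series as in the following minimal sketch. The function names and the sample data are hypothetical. One stated assumption: the integrals are divided by T here so that the averages keep the units of the integrand; in the non-dimensional units of the text, T = 1 and the two forms coincide.

```python
import math


def trapezoid(values, times):
    """Trapezoidal-rule integral of sampled values over the sampled times."""
    total = 0.0
    for i in range(1, len(times)):
        total += 0.5 * (values[i] + values[i - 1]) * (times[i] - times[i - 1])
    return total


def period_metrics(times, x_dot, power, m):
    """Return (U, P_avg, efficiency) over one period spanning times[0] to times[-1]."""
    T = times[-1] - times[0]
    U = trapezoid(x_dot, times) / T
    P_avg = trapezoid(power, times) / T
    # Kinetic energy m*U**2/2 divided by the input work P_avg*T over one period.
    efficiency = m * U ** 2 / (2.0 * P_avg * T)
    return U, P_avg, efficiency


# Hypothetical steady-state samples over one non-dimensional period T = 1.
times = [i / 100.0 for i in range(101)]
x_dot = [2.0 + 0.3 * math.sin(2.0 * math.pi * t) for t in times]
power = [4.0 + 1.0 * math.cos(2.0 * math.pi * t) for t in times]
print(period_metrics(times, x_dot, power, m=0.5))
```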
[Figure 2.3 graphic. Columns: A. Rigid Flapping; B. In-phase Active Flexion; C. Anti-phase Active Flexion. Rows: swimming speed U, thrust force Fx, input power P, and circulation Γ versus time t, from ts to ts + 3T.]
Figure 2.3: Active anterior-to-posterior bending of free swimmers. Time-dependent speed, thrust, input power and
circulation for A. rigid swimmer undergoing pitching at a = 15°, T = 1, active flexion B. at phase difference ϕ = 0, C. at phase
difference ϕ = π. In B. and C., flexion ratio l/L = 0.7 and flexion angle A = 30°. Solid lines represent the instantaneous
values and dashed lines represent time-period averages. Average thrust Fx is positive in all cases. The results are shown after
the swimmers have reached steady state ts = 15T. The dissipation time is set to be Tdiss = 1.625T.
[Figure 2.4 graphic. Force hodographs of lateral force Fy versus thrust Fx for A. Rigid Flapping, B. In-phase Active Flexion, and C. Anti-phase Active Flexion, with instants i, ii, and iii marked. The bottom row shows the force distribution at instants i, ii, and iii (scale bar: 10 units of force).]
Figure 2.4: Active anterior-to-posterior bending can minimize lateral forces and negative thrust. Force hodograph
of A. rigid flapping, B. in-phase flexion, C. antiphase flexion for the same cases shown in Figure 2.3. The arrow indicates the
direction of time. The white, blue, red, yellow points are: t/T = 0, 0.25, 0.5, 0.75. Bottom row shows force distribution at
the instants indicated in red stars. Blue arrows represent pitching direction of the anterior link motion while red arrows of the
posterior link.
2.2 Performance of flexible swimmer compared to rigid swimmer
We compare the free swimming that results from flapping while undergoing active and passive bending
to that of rigid flapping. All swimmers have the same total length L and undergo the same sinusoidal
pitching motion about their leading edge θa = a sin(2πf t) with a = 15° and f = 1, albeit exhibiting
distinct bending patterns. Figure 2.2 shows snapshots of the wake represented by the free vortex sheet
and velocity field generated by a swimmer undergoing A rigid flapping, B in-phase active bending with
flexion amplitude A = 30° and flexion ratio l/L = 0.7, C antiphase active bending at the same flexion
amplitude and ratio, and D passive bending. All snapshots are taken at the same instant in the flapping
cycle (at 0.25T after steady state has been reached). Compared to the rigid swimmer, in-phase flexion
produces wider wakes and larger leading edge circulation and instantaneous flow speeds, while antiphase
flexion is characterized by a leaner, longer wake with weaker leading edge circulation and lower flow
speeds. The main features of the instantaneous flow during antiphase flexion, namely, the leaner wake
and weaker leading edge circulation are also observed in passive bending of the swimmer body. These
flow features are important for the hydrodynamic forces exerted on the swimmer as discussed later.
We quantitatively evaluate the steady state motion of the rigid and actively-bending swimmers in Figure 2.2A-C. In Figure 2.3, from top to bottom, we report the instantaneous (solid lines) and period-average
(dashed lines) values of the swimming speed, thrust force, input power, and circulation. On average, rigid
flapping produces the lowest swimming speed, while antiphase flexion produces the highest. Fluctuations around
the average swimming speed are the smallest for the swimmer undergoing antiphase flexion. The discrepancy in average swimming speeds between the three flapping modes is surprising at first sight given that
the average values of the thrust force are comparable. Note that in all cases, average thrust Fx is positive.
However, a closer look at the instantaneous thrust shows that the swimmer undergoing antiphase flexion hardly experiences negative thrust over its flapping cycle. In-phase flexion leads to negative thrust of
[Figure 2.5 graphic. Columns: A. Rigid Flapping; B. In-phase Active Flexion; C. Anti-phase Active Flexion. Rows: drag force Dx, input power P, and circulation Γ versus time t, from ts to ts + 3T.]
Figure 2.5: Active anterior-to-posterior bending of swimmers fixed in oncoming flow. Time-dependent drag force,
input power, and wake circulation for A. rigid swimmer, B. in-phase active flexion (ϕ = 0), C. antiphase active flexion (ϕ = π).
In all cases, the swimmer is fixed in a uniform oncoming flow at U = 9. Results are shown after the swimmers have reached
steady state ts = 11T. From top to bottom, drag force, input power and circulation of wake are shown. Solid lines represent the
instantaneous values and dashed lines represent time-period averages. The dissipation time is set to be Tdiss = 1.625T.
high magnitudes over larger subintervals of the flapping cycle, as highlighted further in Figure 2.4. Consequently, the required input power for in-phase flexion is largest compared to both rigid flapping and
antiphase flexion. This is also true of the overall wake circulation. It is worth noting that, by Kelvin’s
circulation theorem, circulation around the leading edge must be equal to the overall circulation in the
wake. Therefore, compared to rigid flapping, in-phase flexion increases the circulation around the leading
edge of the swimmer while antiphase flexion decreases leading edge circulation as noted qualitatively in
Figure 2.2.
The results in Figure 2.3 indicate that the swimmer undergoing antiphase flexion achieves higher swimming speed at lower power requirement and energetic cost. To elucidate the hydrodynamic forces at play,
we report in Figure 2.4 the force hodograph defined as a plot of the lateral pressure force Fy versus thrust
Fx acting on each swimmer. Snapshots of the distribution of hydrodynamic pressure forces along the
swimmers are depicted in bottom row of Figure 2.4, and indicate that the x-component of the forces on
[Figure 2.6 graphic. Panel A: Active Flexion of Free Swimmer, showing scaled speed, scaled input power, scaled efficiency, and flexion agreement versus flexion phase. Panel B: Active Flexion of Swimmer Fixed in Flow, showing scaled drag, scaled input power, scaled efficiency, and flexion agreement versus flexion phase.]
Figure 2.6: Performance of active anterior-to-posterior bending as a function of phase. A. Free swimmer: swimming
speed (black), power requirement (red), and efficiency (blue) are scaled by the corresponding values of a rigid swimmer. B.
Swimmer fixed in oncoming flow of uniform speed U: drag force (black), power requirement (red), and efficiency (blue) are
scaled by the corresponding values of a pitching rigid plate. Parameter values are set to a = 15°, l/L = 0.7, A = 20°, M = 0.9,
U = 9. Flexion agreement parameter Z between the relative velocity of an actively bending posterior and the fluid velocity
generated by a passively bending swimmer at zero stiffness κ = 0 (green) as a function of phase ϕ during A. free swimming and
B. holding station in oncoming flow U = 5.
the anterior and posterior sections during antiphase flexion act opposite to each other as in a tug-of-war,
leading to overall reduction in thrust values. Importantly, antiphase flexion also reduces the lateral force
and negative thrust, with negative thrust experienced only over a small subinterval of the flapping period,
as noted earlier. In contrast, in-phase flexion significantly increases the lateral force and negative thrust.
To explain the effect of flexion on the lateral force experienced by the swimmer, it is instructive to reexamine the flow field around the rigid and actively-bending swimmers in Figure 2.2A-C. A large leading
edge vortex (LEV) is known to generate a large lift in flapping flight; see, e.g., [109, 124]. In swimming,
larger leading edge circulation creates larger lateral force, which explains why, compared to rigid flapping,
in-phase flexion increases the lateral force acting on the swimmer while antiphase flexion decreases it. Lift
is beneficial for flight but large lateral forces are detrimental to swimming speed, as noted in [117] for fish
and recapitulated here in the context of our swimmer model.
To further analyze the difference in the swimming performance between rigid flapping and flapping
with in-phase and antiphase active bending, we fix the swimmer in an oncoming uniform flow of speed U
and we compute the hydrodynamic drag forces in each case. Unlike in the case of the free swimmer where
the period-average of the total thrust and drag forces must be zero, here the swimmer may experience
a net period-average force. In Figure 2.5, we report the drag force, input power, and circulation in the
wake of the fixed swimmer. Period-average values are shown in dashed lines. Compared to rigid flapping,
antiphase flexion reduces instantaneous drag, power, and circulation, while in-phase flexion increases
all three quantities. Reduced drag implies lower thrust requirement for steady state swimming, which
provides another perspective for understanding the improved performance of antiphase flexion.
We next examine the period-average performance of actively bending swimmers as a function of
anterior-to-posterior phase difference ϕ. In Figure 2.6A, we consider the case of free swimming, we fix
the flexion ratio l/L = 0.7 and flexion amplitude A = 20°, and we plot the swimming speed U, input
power P and efficiency η versus ϕ, all scaled by the corresponding values of a rigidly flapping swimmer
Urigid, Prigid, ηrigid, respectively. We find that active bending is always beneficial in terms of enhanced
speed relative to rigid flapping, albeit at an increased power requirement. Importantly, as the anteriorto-posterior bending changes from in-phase flexion to flexion at a phase lag, the scaled speed increases
and the scaled power requirement decreases. Optimal performance occurs at ϕ = 0.9 and ϕ = 0.8 in
terms of maximum swimming speed and minimum input power and maximum efficiency, respectively. In
Figure 2.6B, we fix the swimmer in oncoming flow of uniform speed U and compute the scaled drag force
Dx, input power P and efficiency η as a function of ϕ scaled by the corresponding values of a fixed rigid
flapper. We find that, as the anterior-to-posterior bending changes from in-phase flexion to flexion at a
phase lag, the scaled drag decreases and so does the scaled power requirement. Specifically, analogous to
the free swimmer, drag and input power are minimal at ϕ = 0.8. Taken together, these results imply that
antiphase active flexion is near optimal for enhancing speed and efficiency and reducing drag force and
power requirement.
To complete this analysis, we also explored the effect of the flapping parameter M on the swimming
performance. We found that for a fixed phase, the swimming speed and efficiency change monotonically
[Figure 2.7 panel labels: A. Passive flexion: prescribed pitch and passive tail flexion versus time t, from t_s to t_s + 5T, for κ = 0, κ = 20, and κ = 40. B. Flexion amplitude and phase versus stiffness κ. C. Scaled speed and scaled efficiency versus stiffness κ, for l/L = 0.6, 0.7, and 0.8.]
Figure 2.7: Passive anterior-to-posterior bending of free swimmers. A. Top to bottom: passive flexion angle α of posterior
end at three stiffness values: κ = 0, κ = 20, and κ = 40. The anterior link is pitching at a = 15°. The damping ratio is set to
c = 1, and flexion ratio to l/L = 0.7. B. The bending parameters (phase ϕ and maximum flexion angle A). C. Swimming speed
U and propulsion efficiency η as a function of stiffness κ for three flexion ratios l/L = 0.6, 0.7, 0.8 reported in orange, green
and black, respectively. The results are shown after the swimmers have reached steady state t_s = 25T. The dissipation time is
set to be T_diss = √2.09T.
with M, with maximum speed and minimum efficiency as M → 1. That is, reversing the bending direction
with a quick flick improves speed at the cost of decreasing efficiency.
2.3 Passive Flexion improves swimming efficiency but not speed
Is active bending necessary for obtaining this enhancement in swimming speed and efficiency over rigid
flapping? To address this question, we examine the free swimming of a passively bending swimmer, where
the posterior end flaps passively under the effect of hydrodynamic forces and moments generated by the
pitching motion of the anterior section. Elastic forces due to a spring of stiffness κ located at the flexion
point are also at play. In Figure 2.7A, we keep all parameter values the same as those used for the actively
bending swimmer, and, from top to bottom, we report the flapping motions of the anterior and posterior
ends for stiffness values κ = 0, 20, and 40. At zero stiffness, flexion introduces no restoring forces and
[Figure 2.8 panel labels: A. In-phase active flexion: large leading edge vortex; energy cost, since the fish posterior actively bends opposite to the local flow produced during passive flexion. B. Anti-phase active flexion: smaller leading edge vortex and downwash; improved speed and efficiency, since the fish posterior actively bends in agreement with the local flow produced during passive flexion.]
Figure 2.8: Active bending in agreement with passive hydrodynamics improves swimming performance. Illustration
of the interaction with the flow field for A. in-phase and B. anti-phase active flexion. Maximum improvement in swimming
performance occurs when the tail benefits from flows created by the anterior portion of the fish body. Light grey arrows represent
the flow direction (in accordance with Figure 2.2). Red arrows and blue arrows represent the flapping direction of the posterior
and anterior sections of the fish body, respectively.
moments. The posterior part rotates antiphase relative to the flapping motion of the anterior part, at an
amplitude comparable to the anterior pitching amplitude. The associated wake, shown in Figure 2.2D and
represented by the free vortex sheet in the inset of Figure 2.7A, shares similar features with the wake obtained
during antiphase active flexion. At spring stiffness κ = 20, the flexion amplitude increases (α_max ≈ 80°),
and the wake also exhibits larger lateral dispersion. At large stiffness κ = 40, the posterior part rotates
in-phase with the anterior part at the same flapping amplitude in a way reminiscent of rigid flapping, as
reported in [100] for flexible pitching plates.
The relative motion of the posterior part is close to a sinusoidal function for all stiffness values κ.
Therefore, for each κ value, we fit α(t) by a sine function α(t) = A sin(ωt − ϕ) using a standard algorithm [324]. All fits have at least 95% confidence and a frequency ω ≈ 2π, indicating that the
fitting has converged. In Figure 2.7B, we report the fitted flexion amplitude A and phase difference
ϕ as a function of spring stiffness κ for three distinct flexion ratios l/L = 0.6, 0.7 and 0.8. We find that
for all l/L, as stiffness κ increases, the phase difference ϕ decreases monotonically from π to 0, implying
that the posterior flapping motion changes from antiphase to in-phase. The maximum flexion angle A first
increases with increasing κ, then decreases to nearly zero at large stiffness, implying rigid flapping of both
anterior and posterior ends.
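To make the fitting step concrete, the following is a minimal sketch of a sine fit, not the algorithm of [324]. It assumes the frequency is held fixed at ω = 2π, so that α(t) = A sin(ωt − ϕ) = p sin(ωt) + q cos(ωt) becomes linear in p = A cos ϕ and q = −A sin ϕ and can be solved by ordinary least squares. The function name `fit_sine` and the synthetic samples are illustrative assumptions.

```python
import math

def fit_sine(times, alpha, omega=2 * math.pi):
    """Least-squares fit of alpha(t) = A*sin(omega*t - phi) at a fixed omega.

    Assumption: omega is known. The model is rewritten as
    p*sin(omega*t) + q*cos(omega*t), with p = A*cos(phi), q = -A*sin(phi),
    and the 2x2 normal equations are solved in closed form.
    """
    ss = sc = cc = ys = yc = 0.0
    for t, y in zip(times, alpha):
        s, c = math.sin(omega * t), math.cos(omega * t)
        ss += s * s
        sc += s * c
        cc += c * c
        ys += y * s
        yc += y * c
    det = ss * cc - sc * sc
    p = (ys * cc - yc * sc) / det
    q = (yc * ss - ys * sc) / det
    amplitude = math.hypot(p, q)
    phase = math.atan2(-q, p) % (2 * math.pi)  # phase returned in [0, 2*pi)
    return amplitude, phase

# Synthetic, noise-free samples over five periods (illustrative values only).
true_A, true_phi = 0.8, 2.5
times = [k / 200 for k in range(1000)]
samples = [true_A * math.sin(2 * math.pi * t - true_phi) for t in times]
A_fit, phi_fit = fit_sine(times, samples)
print(A_fit, phi_fit)
```

On noise-free data the fit recovers the amplitude and phase used to generate the samples. A nonlinear fit with ω left free, as the reported ω ≈ 2π suggests was done in the text, would additionally return the frequency.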
We compute the associated swimming speed and efficiency for each stiffness value κ and we scale
the results by those of a rigid swimmer; see Figure 2.7C. Clearly, the swimmer with passive flexion never
surpasses the swimming speed of a rigid swimmer. At very low stiffness (κ ≈ 0), passive flexion results
in a swimming speed close to that of rigid flapping while doubling the swimming efficiency. The increase
in swimming efficiency at small stiffness comes purely from a decrease in power requirement compared
to rigid flapping. This is in contrast to active flexion where the enhancement in swimming speed and
efficiency noted in Figure 2.6 comes at an increase in power requirement relative to rigid flapping. As κ
increases, the scaled swimming speed and propulsion efficiency decrease, indicating that restoring elastic
forces are detrimental to both speed and efficiency. For large κ, the speed and efficiency converge to
the same speed and efficiency as the rigid swimmer, consistent with the results of [100]. We repeat this
analysis for three flexion ratios l/L = 0.6, 0.7, 0.8. The scaled speed seems to increase monotonically
with increasing l/L, whereas the scaled efficiency seems to peak at l/L = 0.7, though only for a range of small
κ values. These findings imply that, unlike flapping insect wings [125, 206], restoring elastic forces seem
to be detrimental to swimming performance. Swimming efficiency peaks at low stiffness values when the
restoring spring forces are weak and the posterior end is driven passively by the fluid forces.
2.4 Passive Hydrodynamics provide guideline for active flexion
Could the swimmer learn from passive flexion to improve its performance by bending actively in a way that
exploits the hydrodynamic forces generated naturally during passive flexion? The results in Figures 2.2-2.6
for actively bending swimmers suggest that maximum benefit occurs for near anti-phase flexion, whereas
the results in Figure 2.7 show that maximum efficiency for passively bending swimmers occurs for zero
stiffness (κ = 0) for which the posterior bends anti-phase. Importantly, the main features of the instantaneous flow during antiphase active flexion (Figure 2.2C) are also observed in passive bending at zero
stiffness (Figure 2.2D). We thus posit that active bending is most beneficial when the swimmer actively
beats its tail in a direction that takes advantage of the natural flows that arise during passive bending.
To test this hypothesis, we define a flexion agreement parameter Z that aims to relate passive and active
bending. Starting from a swimmer bending passively at zero spring stiffness κ = 0 (Figures 2.2D and 2.7A),
we assume a hypothetical posterior that is actively flapping about the flexion point of the swimmer at a
relative angle α = A sin(2πt−ϕ). We compute the fluid velocity u(s, t) induced by the passively-bending
swimmer along the hypothetical posterior that is bending actively. The flexion agreement parameter Z is
given by
\[ Z = \frac{1}{T}\,\frac{1}{L-l} \int_0^T \int_0^{L-l} u(s,t) \cdot v(s,t) \, ds \, dt, \tag{2.3} \]
where v(s, t) = s α̇ n is the relative velocity of the hypothetical actively-flapping posterior and s is a
dummy variable denoting the rectilinear distance from the flexion point. Positive values of the flexion
agreement parameter imply a beneficial interaction between the flow generated during passive flexion
and the velocity of the hypothesized posterior during active flexion, whereas negative values indicate a
detrimental one.
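As an illustration of how Eq. (2.3) can be evaluated numerically, the sketch below applies a midpoint rule in s and t. It is not the thesis implementation. The callable `u_normal`, which should return the component of the passive-flexion flow along the posterior normal n, is a hypothetical input, and the toy flow used in the example is purely illustrative, not a simulation result. The posterior length L − l = 0.6 follows from L = 2 and l/L = 0.7, and T = 1 is assumed.

```python
import math

def flexion_agreement(u_normal, A, phi, length, T=1.0, ns=200, nt=200):
    """Midpoint-rule estimate of Z = (1/T)(1/length) * double integral of u.v.

    u_normal(s, t): flow velocity component along the normal n of the
                    hypothetical posterior (assumed to be supplied by a
                    passive-flexion simulation; hypothetical interface).
    v = s * alpha_dot * n, with alpha(t) = A*sin(2*pi*t/T - phi).
    """
    ds, dt = length / ns, T / nt
    omega = 2 * math.pi / T
    total = 0.0
    for j in range(nt):
        t = (j + 0.5) * dt
        alpha_dot = A * omega * math.cos(omega * t - phi)
        for i in range(ns):
            s = (i + 0.5) * ds
            total += u_normal(s, t) * s * alpha_dot
    return total * ds * dt / (T * length)

# Toy flow (illustrative assumption): normal velocity that grows with s and
# oscillates like an antiphase posterior would.
toy_flow = lambda s, t: s * math.cos(2 * math.pi * t - math.pi)

ell = 0.6                  # L - l for L = 2 and l/L = 0.7
amp = math.radians(20)     # flexion amplitude in radians
Z_anti = flexion_agreement(toy_flow, amp, math.pi, ell)
Z_in = flexion_agreement(toy_flow, amp, 0.0, ell)
print(Z_anti, Z_in)
```

For this toy flow, Z is positive for antiphase flexion and negative, with equal magnitude, for in-phase flexion, which mirrors the sign convention described above.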
In Figure 2.6A, we set A to be equal to the maximum flexion angle of the passively-bending swimmer,
and we vary ϕ from 0 to π. We find that the agreement parameter Z, normalized by its maximum value,
is largest for antiphase active flexion and smallest near in-phase active flexion. This result indicates that
antiphase active flexion matches the local flow created during passive flexion best, whereas in-phase active
flexion acts opposite to these flows. That is, the swimmer during antiphase flexion can utilize better the
flow field generated by the pitching motion of its anterior section, and thus it can achieve higher swimming
speed and efficiency compared to in-phase flexion.
To emphasize the effect of the interaction between the flow field and kinematics of active flexion on
swimming performance, we schematically summarize the two cases of in-phase and antiphase flexion
in Figure 2.8. During in-phase active flexion, the flow field is characterized by a strong leading edge
vortex around the anterior section of the fish and large lateral forces. Further, the posterior part bends
[Figure 2.9 panel labels: A. In-phase active flexion. B. Anti-phase active flexion. Each panel shows contours of scaled speed and scaled efficiency over flexion ratio l/L (0.4 to 1.0) and maximum flexion angle A (0° to 60°).]
Figure 2.9: Swimming performance of actively flexing swimmers scaled by the performance of a rigid swimmer. Average
speed and efficiency versus flexion ratio l/L and flexion angle A for A. in-phase flexion (ϕ = 0) and B. antiphase flexion (ϕ = π).
The amplitude for the proximal part and the elliptic modulus are a = 15° and M = 0.9. Dark grey areas indicate regions of
diminished performance while light grey areas indicate improved performance over a rigid swimmer. The dissipation time is set
to be T_diss = 1.625T.
opposite to the local flow field of a passively bending swimmer. During antiphase active flexion, the
leading edge vortex is smaller and it is followed by a counter-rotating vortex around the fish mid-section
such that the posterior part is moving in synchrony with the downwash flow induced by this counter-rotating vortex. The flow field thus assists the motion of the posterior end. These results indicate that active
body deformations that are in agreement with local flows produced during passive deformations are more
advantageous for enhancing swimming speeds and efficiencies.
2.5 Parametric study and relation to biological observations
Lastly, we explore the effect of maximum flexion angle A and flexion ratio l/L on the period-average
values of the swimming speed and efficiency for both in-phase and antiphase active flexion. Specifically,
[Figure 2.10 labels: flexion angle A (0° to 60°) versus flexion ratio l/L (0.4 to 0.9); regions of enhanced speed and enhanced efficiency; fish data: a Molly, b Rosy barb, c Clownfish, d Butterfly fish, e Flying fish, f Koi, g Bass, h Atlantic salmon, i Yellow amberjack, j Dogfish, k Leopard shark, l Tuna, m Sturgeon, n Tiger shark.]
Figure 2.10: Relation to fish swimming behavior. Comparison of fish flexion parameters (black dots, from [284]) and
regions of optimal performance predicted by our model (anti-phase active flexion swimmer): pink region corresponds to 200%
enhancement in swimming speed, and green region corresponds to 300% enhancement in swimming efficiency, both compared
to a swimmer of the same total length rigidly flapping with no flexion. Overlap of the two regions is indicated in beige. The
grey contour line encloses a region of 600% enhancement in efficiency.
we examine the range l/L ∈ [0.4, 1.0] and A ∈ [0°, 60°] for ϕ = 0 and ϕ = π. In Figure 2.9, we report the period-average values normalized by the corresponding values for a rigid swimmer with pitching
amplitude equal to the anterior part amplitude. We highlight in light and dark grey, respectively, the regions in the parameter space where the flexible swimmer outperforms and underperforms the rigid swimmer. The swimmer with in-phase flexion swims slower than the rigid swimmer for small flexion ratios
(l/L < 0.7) and high flexion amplitudes (A > 40°), and swims faster than the rigid swimmer otherwise. This
swimmer, however, is always less efficient than the rigid swimmer for reasons explained previously. In
Figure 2.9B, for most parameter values, the antiphase swimmer outperforms the rigid swimmer in terms
of swimming speed and efficiency. Note that the region with the highest swimming speed advantage
lies in l/L ∈ [0.6, 0.8] and A ∈ [30°, 60°], and the region with the highest efficiency advantage lies in
l/L ∈ [0.6, 0.8] and A ∈ [25°, 50°].
We compare the regions of highest swimming speed and efficiency obtained during antiphase flexion
in Figure 2.9B to the design parameters of biological fish reported in Figure 2.1. In Figure 2.10, we plot
the flexion angle A as a function of the flexion ratio l/L for the fish data in Figure 2.1 and we superimpose on this design space the regions of 200% enhancement in speed and 300% enhancement in efficiency
from Figure 2.9B. As shown in Figure 2.10, there is significant overlap between these regions of improved
performance and the biological data. Indeed, all biological data lie within the region of improved efficiency.
Many of these fish are known to exhibit migratory behavior that requires efficient swimming. Even
baby clownfish are reported to migrate over long distances [413]. Tuna can cover 7600 km in one traveling
phase [214] and tiger sharks are capable of traveling long distances in a short time [412]. Our results in
Figure 2.10 are consistent with these facts: tiger shark and clownfish lie in the intersection region of
improved speed and efficiency, and tuna lies in the region with the highest increase in efficiency. On the other
hand, butterflyfish, which are only known to migrate over short distances during spawning [504], and koi,
for which there is no evidence of migration, lie in the regions characterized by smaller increases in speed
and efficiency.
To conclude this section, a few remarks on Re numbers are in order. The fish listed in Figure 2.10 vary
in length, swimming speed, tailbeat frequency, and cross-sectional geometry, whereas the model abstracts
these details into a simple two-link fish and flows represented by the vortex sheet method. Specifically, the
biological data span a wide range of Re numbers (Re ∼ 10³–10⁷). In the model, we used non-dimensional
parameters with fluid density ρ = 1, fish length L = 2, tailbeat frequency f = 1, and we obtained a
range of dimensionless swimming speeds U = 4.5–10 by varying the bending kinematics. Here, U = 4.5
is the speed of the rigid flapper. Because skin friction is accounted for in the model, it is possible to
calculate an effective Re = ρLU/µ in the context of our dimensionless vortex sheet model, where µ is a
dimensionless viscosity. Starting from the density (10³ kg·m⁻³) and viscosity (10⁻³ Pa·s) of water, and
using the range of length scales and flapping frequencies from the biological fish data, we arrive at a range
of non-dimensional viscosity µ ∼ 10⁻⁸–10⁻². However, in our model, our choice of the drag coefficient
C_d = 0.664 √(ρµL) = 0.04 (see Appendix B) fixes the value of the dimensionless viscosity to µ ≈ 10⁻³.
Thus, Re for the range of swimming speeds (U = 4.5–10) obtained in the model is of the order Re ∼ 10⁴.
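The arithmetic behind these estimates can be checked with a short calculation. The sketch below assumes that the drag coefficient relation reads C_d = 0.664 √(ρµL), inverts it for µ, and evaluates Re = ρLU/µ at the two ends of the reported speed range. Variable names are illustrative.

```python
import math

# Dimensionless values quoted in the text.
rho, L, Cd = 1.0, 2.0, 0.04

# Invert Cd = 0.664 * sqrt(rho * mu * L) for the dimensionless viscosity mu.
mu = (Cd / 0.664) ** 2 / (rho * L)

def reynolds(U):
    """Effective Reynolds number Re = rho * L * U / mu."""
    return rho * L * U / mu

re_low, re_high = reynolds(4.5), reynolds(10.0)
print(f"mu = {mu:.2e}, Re from {re_low:.0f} to {re_high:.0f}")
```

Under this assumption, µ is approximately 1.8 × 10⁻³ and Re ranges from about 5 × 10³ to 1.1 × 10⁴, consistent with the orders of magnitude stated above.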
2.6 Summary
We analyzed the swimming performance of flapping swimmers undergoing active and passive deformations. Whereas fish exhibit a variety of swimming modes [404], we simplified body deformations to account for only anterior-to-posterior bending, with one degree of freedom describing the relative rotation
between the two sections. We explored the effects of morphological and kinematic parameters on the
swimming speed and efficiency.
We found that passive body bending, at negligible body stiffness and minor elastic forces, caused
anterior-to-posterior antiphase flexion. This antiphase flexion is dictated by the flow physics and causes
the swimmer’s morphology to get more streamlined compared to rigid flapping with no flexion, thus creating leaner wakes that reduce drag and power requirement and increase efficiency. While drag reduction is desirable for improved efficiency, passive bending also reduces thrust production, thus diminishing
swimming speed. Interestingly, restoring elastic forces seemed detrimental to both swimming speed and
efficiency for a range of intermediate stiffness values. These findings are consistent with the hypothesis
that for maximum efficiency, the fish tail and posterior body should flex like water, exhibiting little or
no resistance to the flows generated by the flapping motion of the anterior portion of the body. This hypothesis could explain how anesthetized fish, with no muscle activity, placed in periodic wakes generate
oscillatory body deformations that allow the fish to swim upstream [31].
We also found that a swimmer that actively creates antiphase anterior-to-posterior bending enjoyed
the same benefits of leaner wake and reduced drag as a passively bending swimmer while mitigating
the reduction in thrust and swimming speed. To quantify the hydrodynamic mechanisms leading to this
improved swimming performance, we introduced a flexion agreement parameter that compares the active
flexion velocity of the swimmer’s posterior to the local flow velocity during passive flexion. We found
that during active in-phase flexion, the posterior beats opposite to the local flow that would naturally arise
during passive flexion, leading to a negative flexion agreement parameter, and thus lower swimming speed
and efficiency. During active antiphase flexion, the posterior flaps consistently with the local flow, leading
to a positive flexion agreement parameter, and improved speed and efficiency.
These findings suggest tremendous versatility in swimming performance, even when accounting only
for coarse anterior-to-posterior bending motions. They indicate that fish can readily and fluidly transition
from efficient (passive bending) to fast (active bending) by actively beating their tail in agreement with the
local flow generated during passive bending.
To explore the role of flow physics on the convergent bending rules of [284], we examined the effect
of flexion ratio and maximum flexion angle on the performance of swimmers undergoing active antiphase
anterior-to-posterior bending. We found an optimal region in this design space that simultaneously enhances swimming speed and efficiency. Importantly, we found that this region has a significant overlap
with the fish bending parameters reported in [284]; see Figure 2.10 and Figure 2.9B. Fish are able to adjust
their swimming speed by altering their tailbeat frequency [199]. Thus, fish could, in principle, maintain
kinematic flapping patterns that optimize efficiency while increasing their tailbeat frequency to achieve
higher swimming speeds.
Taken together, our results have two major implications on understanding the role of body bending
in fish swimming. They are consistent with the hypothesis that fish that actively bend their bodies in a
fashion that exploits the local hydrodynamics can at once improve speed and efficiency. They also support
the hypothesis that flow physics could have provided a selective force for driving the evolution of fish
bending patterns.
Beyond the trade-offs between speed and efficiency explored here, our model could be generalized
in future studies to examine transient swimming maneuvers such as turning. [371] proposed that a passive posterior not only improves efficiency, but also improves fish maneuverability compared to a rigid
body. [117] pointed out that although large lateral forces are detrimental to fish swimming speed, they
can improve fish maneuverability. Our model predicts large lateral forces during in-phase active flexion,
suggesting that this bending pattern, while not optimal for forward swimming, could be beneficial for
turning motions. These considerations, as well as models of higher fidelity to the fish biomechanics and
fluid environment, will be explored in future research.
Finally, we found that active body bending and tailbeat patterns that match the local flow velocities
produced naturally by anterior sections of the fish body could lead to improved performance and
energy savings. This finding might have important implications for understanding the mechanisms driving body and
caudal fin deformations in swimming [383, 53], schooling [194, 33, 266], and navigating ambient unsteady
flows [270].
Chapter 3
Navigation in unsteady flows: wake tracking
Characteristics of Laminar, Turbulent, and Vortical Trails
Consider a source of size L moving in a fluid at a constant speed U with a constant odor emission rate r.
Depending on the lengthscale of the source and the fluid environment, water or air, the signal field, consisting of the odor concentration, has different characteristics. At the micron-scale in water, the Reynolds
number, the ratio of inertial to viscous forces, Re = UL/ν, where ν is the kinematic viscosity, is nearly
zero, and the Péclet number, the ratio of advective to diffusive transport rates, Pe = UL/D, where D is molecular diffusivity, is typically less than 10 [Berg1977, Redaelli2021, 40]. The odor concentration diffuses
into a laminar plume, governed by the advection-diffusion equation (Fig. 1.1A, Methods). The strength of
the signal decays exponentially with distance, with a global maximum located at the source of the signal
(Fig. 1.1A,D).
For a millimeter-scale insect moving in air, Re and Pe are large. The signal field is transported by a
turbulent flow (Fig. 3.1B, Methods). Here, we employed a particle-based two-dimensional plume model
to simulate the random release of odor packets from the source [137, 414]: the odor packets get advected
by a mean background flow and random cross-stream perturbation, while they decrease in intensity and
increase in size over time (Methods).
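A minimal sketch of such a particle-based plume is given below. It is not the model of [137, 414] or the implementation described in Methods: the release probability, the Gaussian cross-stream random walk, the linear packet growth, the exponential intensity decay, and all parameter values are illustrative assumptions.

```python
import math
import random

def simulate_plume(n_steps=200, dt=0.05, U=1.0, sigma=0.3,
                   release_prob=0.5, r0=0.05, growth=0.02, decay=0.5, seed=0):
    """Return a list of odor packets [x, y, radius, intensity] after n_steps.

    All parameters are hypothetical placeholders chosen for illustration.
    """
    rng = random.Random(seed)
    packets = []
    for _ in range(n_steps):
        # Random release of a new packet at the source, placed at the origin.
        if rng.random() < release_prob:
            packets.append([0.0, 0.0, r0, 1.0])
        for p in packets:
            p[0] += U * dt                                   # mean-flow advection
            p[1] += sigma * math.sqrt(dt) * rng.gauss(0, 1)  # cross-stream perturbation
            p[2] += growth * dt                              # packet grows in size
            p[3] *= math.exp(-decay * dt)                    # packet loses intensity
    return packets

def concentration(packets, x, y):
    """Odor signal at (x, y) as a sum of Gaussian packets."""
    return sum(I * math.exp(-((x - px) ** 2 + (y - py) ** 2) / (2 * r * r))
               for px, py, r, I in packets)

packets = simulate_plume()
print(len(packets), concentration(packets, 1.0, 0.0))
```

Because packets are released at random and wander across the stream, a sensor at a fixed location sees intermittent encounters rather than a smooth gradient, which is the property of the turbulent regime discussed in this section.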
For a source of moderate size and speed, such as a swimming fish [370, 106, 419], the wake is characterized by coherent vortical structures (Fig. 3.1C). Here, we solved the Navier-Stokes equations, using an open
source computational fluid dynamics (CFD) solver [210, 43, 167], for the flow past a pitching airfoil following sinusoidal oscillations in a uniform background flow, where we set the Reynolds number to Re = 5000
and the Strouhal number St = fA/U, with f the flapping frequency and A the tailbeat amplitude, to St = 0.25, comparable to those of
swimming fish (Methods). We solved for the concentration field numerically by solving the advection-diffusion
equation at Péclet number Pe = 10,000. The signal field has a traveling-wave character of wavelength
λ and frequency f. In fact, this traveling-wave behavior emerges for all physical quantities (odor, speed,
pressure, and vorticity) in the wake of the source (Fig. 3.1C).
The signal field in the laminar plume (Fig. 3.1A) has a single local maximum at the source, whereas
in the turbulent (Fig. 3.1B) and vortical (Fig. 3.1C) trails, a source-seeker would encounter multiple local
maxima and minima advected downstream by the background flow. Importantly, the signal decays
downstream at different rates (Fig. 3.2): in the laminar and turbulent regimes, the signal decays exponentially downstream because of molecular and turbulent diffusion [98], whereas in the vortical wake, the
signal persists much longer than in both laminar and turbulent plumes (Fig. 3.2). For an aspiring trail
follower, this behavior of the signal field is intimately connected to which sensing and response strategies are viable. Whereas it is feasible to measure the gradient in the signal field and use a gradient ascent
strategy to locate the source in the laminar flow regime [41], spatial gradients are not well defined in the
turbulent regime where the signal is sporadic and scarce [465, 280, 464, 414]. In this work, we focus on
following vortical trails [86].
3.1 Learning To Track Hydrodynamic Trails Using Local Flow Sensing
Consider the problem of an artificial agent navigating an unsteady hydrodynamic trail with the aim of
locating its source. The unsteady trail consists of alternating vortices formed in the wake of a flapping
Table 3.1: Comparison between different source-seeking strategies in different signal fields. An
autonomous agent applies different strategies to track signal fields of different characteristics. We propose two new strategies that track turbulent plumes and vortical wakes, respectively. In turbulent plumes,
our proposed strategy does not need to know the mean flow direction a priori [330, 456, 259, 414], and requires less memory [465, 387, 464]. In vortical flows, our proposed strategy responds instantaneously
to the spatial gradient without requirements on memory and is optimized via DRL [86].
[Table 3.1 columns: Flow environment; Signal; Non-local information; Signal gradient; Strategy; Memory; Related organism; Method. Strategies listed: Gradient ascent [Berg & Brown 1972]; Run and tumble [Berg & Brown 1972]; Surge and cast [Murlis et al. 1992]; Infotaxis [Vergassola et al. 2007]; RL with RNN [Singh et al. 2023]; Finite state controller [Verano et al. 2023]; [Reddy et al. 2022]; [Kadakia et al. 2022]; [Breugel & Dickinson 2014; Leitch et al. 2021]; Phase gradient [Colvert et al. 2020]; This study (two entries). Flow environments: laminar; turbulent; turbulent with mean flow; vortical wake. Related organisms: bacteria, moth, Drosophila, mouse, Drosophila larva. Methods: handcrafted, RL, Bayesian, bioinspired.]
airfoil at Reynolds number Re = 5000 and St = 0.25, comparable to those of swimming fish [442, 436, 254]
(see Sec. A.1). Traveling-wave signal fields with wavelength λ and frequency f emerge for all different
physical quantities in the wake, as illustrated in the bottom part of Fig. 3.1C, which is unique to vortical
flows, in contrast to the signal fields of laminar and turbulent plumes (bottom part of Fig. 3.1A,B).
Thus, we started by considering mechano-sensing, especially sensing flow speed because of its biological
relevance [247, 129, 369, 391, 85].
To track the vortical wake, we modeled the agent as a self-propelled swimmer [262, 79, 207], with no
inertia, moving at a constant speed V at a varying angle θ(t) from the x-axis. It is equipped with two sets
of flow sensors, located at ℓ_head and ℓ_tail from the center, each providing one sensory measurement, s_head
and s_tail, respectively (Fig. 3.1D). To make the task harder, we let the agent access only the gradient of
the flow field, as opposed to sensing the absolute value of the flow [174, 414]. Thus, inspired by the lateral line
system and functional evidence suggesting that flow receptors are optimized to correlate with differential
flow signals [247, 129, 391, 85], we considered s_head = (n · ∇|u|)_head and s_tail = (n · ∇|u|)_tail, which
measure the local gradients of flow speed normal to the agent’s swimming direction. Apart from these two
[Figure 3.1 panels A to F. Recoverable labels: axes x, y, and f t; fields vorticity, pressure, speed, and odor; L = 1, U = 1; agent, environment, hydrodynamic trail, actor-critic neural network, action a_t, sensory observation o_t, reward r_t; sensor, left-right gradient observation.]
Figure 3.1: Following odorless and scented trails at different scales. A. In laminar plumes, odor is released from a fixed
source, advected by a uniform background flow, and diffuses, at a Péclet number Pe = 10.
The steady-state solution is plotted on top, and concentration along the midline y = 0 is plotted at the bottom (Method Sec. A.4.1).
B. Plume simulation represents the stochastic release of odor packets from a source, which are transported by the turbulent wind.
The homogeneous turbulent flow is modeled by the superposition of a uniform background flow and random perturbation [137,
414] (Method Sec. A.4,A.1). C. Hydrodynamic trail created by a Joukowsky airfoil undergoing pitching oscillations in uniform
background flow U = 1 with tailbeat amplitude A = 0.2 and frequency f = 1.25, at Re = 5000 and St = 0.25, comparable to
that of swimming fish [442, 254]. Odor is released from the airfoil’s trailing edge at an emission rate of R = 1, carried by the flow
field and undergoing diffusion, resulting in a Péclet number of Pe = 10,000. Various physical quantities in the flow field, e.g.,
flow speed, pressure, vorticity, and odor, exhibit traveling wave characteristics. The decay rates of concentration along the midline
in these plumes are reported in Fig. 3.2. The vortical flow shows the weakest decay. D. The agent is modeled as a swimmer with
body-fixed frame (t, n), moving at a speed V and able to control its heading direction t. The heading angle θ is defined as
the angle between its orientation and the x-axis of the inertial frame. Two pairs of sensors, each measuring the normal gradient of the flow
signal, are placed on the head and tail of the agent at distances ℓ_head and ℓ_tail from the center. E. Control policies are represented
by neural networks and trained by Deep Reinforcement Learning (DRL). The agent takes the local observation o_t (local signal
gradient) as input and navigates within the environment with actions a_t (angular velocity). F. Statistics on the gradient directions
of various physical quantities within the flow field in C.
sensory measurements, the agent is at all times ignorant of all other information in the flow field and
blind to its position relative to the source generating the flow. To respond to this limited and localized
information, with no memory of past measurements, we provided the agent only control over its angular
velocity ˙θ, as opposed to direct control over its heading angle θ [174]. The agent’s motion is thus described
by the nonholonomic unicycle model [262, 79, 207],
\[ \dot{x} = V\cos\theta, \qquad \dot{y} = V\sin\theta, \qquad \dot{\theta} = \pi(s_{\mathrm{head}}, s_{\mathrm{tail}}). \tag{3.1} \]
[Figure 3.2 graphic: odor concentration c (0 to 1) versus x (-7 to -1) along the midline; curves: laminar plume, turbulent plume, vortical wake.]
Figure 3.2: Concentration of odor versus x at the midline behind the source in different plumes. Odor concentration
in laminar plumes is shown as a red line. Time-averaged odor concentration in the turbulent plumes is shown as the blue line,
with the standard deviation illustrated as a transparent box. Time-averaged concentration in vortical flow (Re = 5000, St = 0.25,
Pe = 10,000) is shown as a black line, and a sample of the instantaneous curve is shown in grey.
[Figure 3.3 graphic. A. Learning curves (reward versus episodes, ×10^4) and policy colormaps of turning rate (-3 to 3, in units of U/L) over the observation space (sensor at tail, sensor at head). B. Example trajectories in the (x, y) plane. C. Excitatory Braitenberg and Inhibitory Braitenberg strategies (turning rate versus sensor at tail). D. Trajectories for the speed, vorticity, pressure, and odor cues. E. Gradient-direction statistics with success rates (excitatory, inhibitory): speed (RL) 87%, 91%; speed 88%, 100%; pressure 89%, 72%; vorticity 88%, 90%; odor 100%.]
Figure 3.3: RL-policies and Braitenberg strategies for tracking hydrodynamic trails. A. Learning curves corresponding
to 10 example training instances, where half converged to one of two classes of policies; moving average of the total reward
is shown in red and blue, total reward for randomly-chosen 10% of the training episodes from all training instances is shown
as black dots. Convergence of different training instances is reported in Tab. 3.3. Representative policies of the two classes of
converged policies are represented as colormaps over the observation space (stail, shead), which are normalized by maximal signal
value in the flow field. B. Example trajectories based on the RL policies. C. Abstraction of excitatory and inhibitory Braitenberg
strategies from RL strategies. Evaluation of the agent’s behavior based on the RL policies and simplified counterparts (Sec. 3.2.1
and Figs. 3.9 and 3.10) shows no major difference in performance. D. Example trajectories of excitatory (red) and inhibitory (blue)
policies are superimposed based on different sensory cues. E. Statistics on the gradient direction of various physical quantities
along trajectories applying the RL policies or Braitenberg strategies. The success rates of excitatory and inhibitory strategies are
given below the statistics of each sensory cue.
We aimed to learn control policies ˙θ = π(shead, stail) that guide the agent to track vortical wakes.
To train the agent to track the hydrodynamic trail from repeated experiences of navigating in the flow
environment, we employed model-free deep reinforcement learning based on Proximal Policy Optimization [400]. Details of the RL setup are given in Method Sec. B. We repeated the training twenty times, starting from randomized initialization in the flow field of Fig. 3.1C. Each training session consists of 30,000
episodes, using flow speed as the signal field, and with parameter values Ωmax = 3U/L, V = 0.25U, and
ℓhead = ℓtail = 0.25L. In each episode, the agent was assigned a random initial position, orientation, and
phase in the period of the unsteady wake. Surprisingly, these training instances converge to two classes
of policies robustly (Table 3.3). In Fig. 3.3A, we reported the evolution of the cumulative reward for ten
training instances, five from each of the two converged classes of policies, shown in red and blue; solid
lines show average value, and transparent envelopes show standard deviations.
The two classes of policies exhibit striking features (Fig. 3.3A): (1) both are largely independent of the
measurement at the head and (2) although both converge during training, they exhibit opposite behaviors.
The red excitatory policy instructs the agent to turn in the same direction as the flow gradient at tail, that
is, in the direction of increasing flow speed. The blue inhibitory policy instructs the agent to turn in the
opposite direction to the flow gradient at tail, that is, in the direction of decreasing flow speed. Deploying
either policy, the agent successfully tracks the wake to its generating source (Fig. 3.3B). When applying the
excitatory strategy, it ventures more into the wake, slaloming between vortices, similar to experimental
observations in live animals [106, 370, 270, 401].
3.2 Learned policies are analogous to Braitenberg’s Vehicles
These features of policies suggest that they can be approximated analytically as bang-bang controllers [403]
whose behaviors depend only on the sensory measurement at tail stail. Starting from the colormaps in
Fig. 3.3A that express the RL action ˙θ over the observation space (shead, stail), without loss of significant
features, we discounted the sensory measurement at head shead entirely and used ℓ and s to refer to the
location and measurement of a single sensor, where ℓ can be either positive or negative; a negative ℓ implies a sensor at the tail.
These simplifications (Fig. 3.3C) lead to the analytic expressions
\[ \text{Excitatory: } \dot{\theta} = \Omega\,\mathrm{sgn}(s), \qquad \text{Inhibitory: } \dot{\theta} = -\Omega\,\mathrm{sgn}(s), \tag{3.2} \]
where Ω = Ωmax.
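As an illustration, the controllers in (3.2) coupled to the unicycle model (3.1) can be sketched in a few lines of Python. This is a minimal sketch, not the implementation used in this work: the explicit-Euler update, the timestep dt, and the function names are assumptions made here for illustration, while V = 0.25 and Ω = 3 follow the values quoted in the text (in units of U and U/L).

```python
import math

def sgn(s: float) -> float:
    """Sign of the sensed signal: +1, -1, or 0 when the signal vanishes."""
    return float((s > 0) - (s < 0))

def braitenberg_rate(s: float, omega: float, excitatory: bool = True) -> float:
    """Turning rate of Eq. (3.2): +omega*sgn(s) (excitatory) or -omega*sgn(s) (inhibitory)."""
    gain = omega if excitatory else -omega
    return gain * sgn(s)

def unicycle_step(x, y, theta, s, V=0.25, omega=3.0, dt=0.01, excitatory=True):
    """One explicit-Euler step (assumed integrator) of Eq. (3.1) driven by Eq. (3.2).

    s is the normal gradient of the signal measured by the single sensor.
    """
    theta_dot = braitenberg_rate(s, omega, excitatory)
    return (x + V * math.cos(theta) * dt,
            y + V * math.sin(theta) * dt,
            theta + theta_dot * dt)
```

In a full simulation, s would be re-evaluated at the sensor location x + ℓt at every step before calling unicycle_step.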
It is remarkable that RL robustly converged to two trail-following policies that are readily interpretable
and approximated by the simple analytic expressions in (3.2). These RL-inspired strategies exhibit similar behavior to the RL policies they approximate (Fig. 3.3B,D, Movie S.1), while being more amenable to
analysis.
These strategies in Fig. 3.3A,C evoke Braitenberg’s vehicles, originally proposed as thought experiments for constructing complex behavior from simple sensorimotor responses to environmental stimuli [54]. Braitenberg’s vehicles have been used to model animals’ tropotaxis behaviors [406], including
rheotaxis [395], phototaxis [525], chemotaxis [273], etc.
3.2.1 Statistical Measures for Evaluating the Performance of Braitenberg Strategies
Fig. 3.9A shows the distribution of the terminal locations of the agent. The excitatory policy is more
accurate in homing in on the source generating the flow, but the inhibitory policy has a higher success rate
in moving upstream. The distribution of search times in Fig. 3.9B illustrates that the inhibitory strategy
locates the source of the wake faster because the corresponding trajectories meander less in the wake. For
evaluating how much the agent can utilize the flow field for efficient navigation, we calculated the flow
agreement parameter along each trajectory [194, 14, 193]
\[ \text{Flow agreement parameter} = \frac{1}{(t_f - t_0)\,\Omega^2} \int_{t_0}^{t_f} \dot{\theta}(t)\,\omega(\mathbf{x}(t), t)\,\mathrm{d}t, \tag{3.3} \]
which evaluates how much the control input ˙θ(t) matches the local flow vorticity ω(x(t), t). We also
calculated the thrust parameter to evaluate how well the agent can utilize the flow field to generate thrust.
Given that the thrust force of a flapping swimmer scales with the square of the swimmer’s lateral velocity
relative to the surrounding fluid’s velocity [442, 148, 339], we defined thrust parameter along a trajectory
as
\[ \text{Thrust parameter} = \frac{1}{(t_f - t_0)\,V^2} \int_{t_0}^{t_f} \left(\dot{y}(t) - v(\mathbf{x}(t), t)\right)^2 \mathrm{d}t. \tag{3.4} \]
Fig. 3.9C, D show the flow agreement parameter and the thrust parameter of trajectories generated by both
excitatory and inhibitory strategies starting from 76,500 different initial conditions. These parameters
characterize the opportunity for the agent to utilize the wake for efficient navigation. Trajectories
generated by the excitatory strategy have, on average, a positive flow agreement parameter and a larger thrust
parameter. In contrast, the inhibitory strategy produces trajectories with negative flow agreement parameters and lower thrust parameters. These results show that, by moving inside the wake, the excitatory
strategy has more opportunity to benefit from the flow to swim more efficiently [270].
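The two trajectory measures in (3.3) and (3.4) can be evaluated from sampled trajectory data as in the sketch below. It assumes that the turning rate, local vorticity, lateral velocity of the agent, and lateral flow velocity have already been sampled along the trajectory at the times in `times`; the trapezoidal quadrature and all function names are choices made here for illustration and are not taken from the original analysis.

```python
def trapezoid(values, times):
    """Trapezoidal-rule integral of sampled values over (possibly uneven) sample times."""
    return sum(0.5 * (values[i] + values[i + 1]) * (times[i + 1] - times[i])
               for i in range(len(times) - 1))

def flow_agreement(theta_dot, vorticity, times, omega):
    """Eq. (3.3): time average of theta_dot * vorticity, normalized by omega**2."""
    integrand = [td * w for td, w in zip(theta_dot, vorticity)]
    return trapezoid(integrand, times) / ((times[-1] - times[0]) * omega ** 2)

def thrust_parameter(y_dot, v_flow, times, V):
    """Eq. (3.4): time average of the squared relative lateral velocity, normalized by V**2."""
    integrand = [(yd - v) ** 2 for yd, v in zip(y_dot, v_flow)]
    return trapezoid(integrand, times) / ((times[-1] - times[0]) * V ** 2)
```

With this normalization, an agent that always turns at rate Ω in the same sense as a local vorticity of magnitude Ω has a flow agreement parameter of 1, and one that always turns against it has a value of -1.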
3.3 Robustness of the RL-inspired strategies over Parameter Space and
under Noise
The policies’ performance in locating the source when starting downstream is then evaluated as in Methods
Sec. B. The excitatory and inhibitory RL policies achieved 87% and 91% success rates, while the RL-inspired counterparts achieved 88% and 100%, all when tested in the same wake used during training.
Statistical properties of the tests for each policy showed that the excitatory policy is more accurate in
homing in on the source, while the inhibitory strategy locates the source faster because the corresponding
trajectories meander less in the wake (see quantitative analysis in Sec. 3.2.1 and Fig. 3.9). By the same
token, because it causes the agent to move inside the wake and slalom between vortices, the excitatory
policy provides the agent with more opportunities to utilize the flow for efficient navigation. These trends
are consistent in the original RL policies and simpler RL-inspired strategies.
To test the robustness of the policies to real-world noise, we challenged the agent in two major ways:
(i) we decreased the rate at which it collects observations and responds to flow signal by increasing the
decision timestep ∆t, and (ii) we imposed sensory limits smin below which it was not able to sense the
flow signal (Fig. 3.10). Both RL policies and RL-inspired strategies exhibited remarkable robustness to
these reduced abilities, with stronger robustness exhibited by the RL-trained policies.
The parameter values of the controllers in (3.2) were inherited from the RL policies, with angular speed
Ω = Ωmax and ℓ = −0.25L. The next question we asked is whether the performance of the controller is
robust or sensitive to the specific parameter choice. To address this question, we tested the performance
of each RL-inspired strategy in (3.2) over an entire range of sensor locations ℓ/L ∈ [−0.75, 0.75] and
angular velocities ΩL/U ∈ [0.5, 10], corresponding to turning radii V /ΩL ∈ [0.025, 0.5]. We used the
same wake employed during training and performed the standard test introduced in Methods Sec. B for
each combination of parameters (ℓ/λ, V /ΩL). The resulting success rates are shown in Fig. 3.4A, B. The
parameter values of the RL policies are indicated in black dots. Both excitatory (red) and inhibitory (blue)
policies are successful, with over 80% success rates, for nearly the entire parameter space provided that
sensory measurements are taken at tail (ℓ < 0). Importantly, because the performance of the RL-inspired
strategies is insensitive to parameter values, an agent does not need to fine-tune its sensor location ℓ
(provided ℓ < 0) and turning radius V /ΩL to a specific wake type; its performance should be robust to
variations in the wake itself.
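The correspondence between the tested angular velocities and the quoted turning radii follows from V/(ΩL) = (V/U)/(ΩL/U). The short check below is a sketch using the value V = 0.25U from the training setup; the function name is hypothetical.

```python
def turning_radius(omega_L_over_U: float, V_over_U: float = 0.25) -> float:
    """Dimensionless turning radius V/(Omega L) from the dimensionless turning rate Omega L/U."""
    return V_over_U / omega_L_over_U

# End points of the tested range and the RL value Omega = 3U/L.
radii = [turning_radius(w) for w in (0.5, 3.0, 10.0)]
print(radii)  # approximately [0.5, 0.0833, 0.025]
```

The end points ΩL/U = 0.5 and 10 recover the stated range V/ΩL ∈ [0.025, 0.5], and the RL value Ω = 3U/L corresponds to a turning radius of about 0.083L.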
3.4 RL-Inspired Braitenberg Strategies Track Both Odorless and Scented
Trails
How do these trail-following strategies perform when different types of flow information are available to
the agent, especially when measuring the odor field carried by the flow field?
The traveling wave characteristic is robust across these sensory cues (Fig. 3.1C). RL training based on
these 3 types of sensory cues converged except for the policy based on vorticity measurements (Table 3.3
and Fig. 3.7). For pressure/odor, RL training led to only inhibitory/excitatory-like policies. The RL policies
in Fig. 3.7 also inspire the bang-bang controller (3.2).
In Fig. 3.3D, we show results based on direct implementation of the simpler RL-inspired strategies
in (3.2) for the same parameter values used during RL-training ℓ = −0.25L and Ω = 3U/L. Sample trajectories, starting from the same initial conditions and following the excitatory and inhibitory strategies
are plotted in Fig. 3.3D and Suppl. Movies S.3. All of them have a high success rate in the wake we considered. To explain this, we plotted the direction of the spatial gradient of the signal along the corresponding
trajectories in Fig. 3.3E. Along these trajectories, the gradients preferentially point upstream, whereas the signal field as a whole
shows no preferred direction (Fig. 3.1F). This indicates that the motion of the agent shapes the signal it acquires.
Success rates of different sensory cues over the parameter space (ℓ/λ, V /ΩL) are shown in Fig. 3.8 for
both excitatory and inhibitory strategies, exhibiting high success rates, over 80%, over large ranges of the
parameter space, provided that the sensory measurement is trailing the agent’s position (ℓ < 0). Consistent
with RL training, chemo-sensing functions only under the excitatory strategy. That is, it requires turning
in the direction of higher concentration because the gradient of concentration is, on average, pointing
outside the wake (Fig. 3.1F). All other signals functioned well via excitatory and inhibitory policies, even
the vorticity, for which the training did not converge. This is because the reward used in RL training
requires the agent to accurately locate the source, whereas the success rate in Fig. 3.8 is based on the
[Figure 3.4 graphic. Panels A (Excitatory Braitenberg) and B (Inhibitory Braitenberg): colormaps of success rate (0 to 100%) over sensor location (-0.75 to 0.75) and turning radius V/ΩL (0.025 to 0.5), with 50% and 80% contours. Panels C to G: sample trajectories with success rates (excitatory, inhibitory), matched to the cases in Table 3.2: C. 91%, 5%; D. 100%, 76%; E. 100%, 100%; F. 100%, 55%; G. 85%, 84%.]
Figure 3.4: Generalization to unseen flows. A., B. Parametric study performed using the simplified policies in Fig. 3.3C with
flow speed as sensory cue. Success rate is plotted as a function of sensor location ℓ/λ and turning radius V /ΩL. The corresponding
colormaps using other sensory cues are reported in Fig. 3.8. Sample trajectories based on employing the RL-inspired strategies
in unseen wakes: C. pitching airfoil at Re = 1000 and St = 0.25 placed in a background flow which changes its direction; D. 3D
swimming fish [402]; E. pitching airfoil at Re = 5000 and St = 0.1; F. burst-and-coast pitching airfoil at Re = 1000, St = 0.25,
and duty cycle 0.6; G. fixed cylinder at Re = 400. Furthermore, success rates for all hydrodynamic trails we tested are reported
in Table 3.2. Parameters are chosen as Ω = ±3, and ℓ = −0.25λ.
ability of the agent to move upstream, even if it ends at a lateral offset y from the source (trajectories
in Fig. 3.3D). But even by this measure of success, the performance of all other sensory measurements
exceeded that of the vorticity sensor, both in range and robustness to parameter values and in maximum
success rate. These results, failure of vorticity-based RL training and suboptimal performance of associated
RL-inspired strategies, are consistent with recent reports of the inefficacy of sensing vorticity in point-to-point underwater navigation [174, 217], and challenge approaches that advocate for sensing vorticity [350].
Sensing flow velocity, pressure, or odor fares better in tracking flows. Indeed, since pressure sensors
are most readily available commercially [463, 395, 173], the strategy based on sensing pressure would be
most amenable for immediate implementation in a robotic demonstration. Combining different types of
sensors at once will be the topic of future studies and may lead to more robust trail-following behavior.
3.5 Generalization to Unseen Hydrodynamic Trails
We next tested the performance of the strategies in (3.2) in wakes not seen during training of the “parent”
RL policies. Specifically, we tested in flows past pitching foils mimicking a turning gait (Fig. 3.4C) and intermittent swimming (Fig. 3.4F), and at different Strouhal and Reynolds numbers spanning Re = 500 -
5000 and St = 0.1 - 0.25 (Fig. 3.4E, Table 3.2); four wakes generated by flows past a fixed cylinder spanning
Re = 200 - 500 (Fig. 3.4G, Table 3.2); and one three-dimensional (3D) wake past an undulating body at
Re = 2400 from [402] (Fig. 3.4D). For the latter, we extended the simplified policies to control both yaw and
pitch angles of the agent (Sec. 3.5.1). Because each wake has a different wavelength, we used the wavelength λ of the wake to scale the corresponding sensor location ℓ = −0.25λ relative to the wake in Fig. 3.1.
Performance in all the wakes we tested is summarized in Table 3.2. Without any other parameter
tuning, the RL-inspired strategies achieved success rates of nearly 100% in guiding the agent to reorient upstream and track most of these hydrodynamic trails. In challenging scenarios, e.g., when the direction of the background
flow changes (Fig. 3.4C) or the flapping is intermittent (Fig. 3.4F), the traveling wave structure is
not available everywhere at all times, and the excitatory strategy performs notably better than
the inhibitory strategy by navigating inside the wake, where the traveling wave structure is available. Despite their parsimony and simplicity, the RL-inspired strategies are remarkably generalizable, allowing the
agent to track hydrodynamic trails of various characteristics, from thrust to drag jets to the 3D wake behind
undulatory self-propelled swimmers (Fig. 3.4).
3.5.1 Application of Braitenberg Strategies in 3D wakes
In 3D, we defined the coordinate x = (x, y, z) of the agent in the fixed inertial frame (ex, ey, ez) and affixed
a body frame (t, n, b) to the agent. For the body-fixed frame, t is the swimming direction of the swimmer,
n and b are both orthogonal to t and orthogonal to each other. We used Euler angles (θ, ψ, ϕ) to describe
the orientation of the agent. We exclude the spinning around t by setting ϕ = 0, and control its heading
[Figure 3.5 graphic. A. Trajectories over the traveling wave for sensor at tail, sensor at center, and sensor at head. B. Phase portraits over (x + λf t, θ), with θ from -π to π, marking stable, unstable, and neutral equilibria.]
Figure 3.5: Stability analysis in 1D traveling signal field. A. Example trajectories of agent with sensor placements at tail
(ℓ = −0.25), center (ℓ = 0), and head (ℓ = 0.25), starting from the same initial condition (x, θ, t)|0 = (0, 0.55π, 0) in a
simplified signal field A cos(2πx/λ + 2πf t). B. Phase portraits on the phase space (x + λf t, θ) for sensor at tail (ℓ = −0.25),
center (ℓ = 0), and head (ℓ = 0.25). Trajectories are superimposed after they have reached the quasi-steady state. Swimming
speed V = 0.25 and turning rate Ω = 1.
angle θ and ψ based on the gradient of the signal field in b and n directions, respectively. Thus, in lab frame
(ex, ey, ez), the body frame is expressed as t = (cos ψ cos θ,sin ψ cos θ, − sin θ), n = (− sin ψ, cos ψ, 0),
b = (cos ψ sin θ,sin ψ sin θ, cos θ). Based on this convention, equation of motion of the agent is written
as
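\[ \dot{\mathbf{x}} = (\dot{x}, \dot{y}, \dot{z}) = V\mathbf{t}, \qquad \dot{\theta} = -\Omega\,\mathrm{sgn}(\mathbf{b}\cdot\nabla g)\big|_{\mathbf{x}+\ell\mathbf{t}}, \qquad \dot{\psi} = \Omega\,\mathrm{sgn}(\mathbf{n}\cdot\nabla g)\big|_{\mathbf{x}+\ell\mathbf{t}}, \tag{3.5} \]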
where g(x, t) is the signal field. In the case shown in Fig. 3.4D, the sensory cue is chosen to be the flow
speed |u|.
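The body frame and the 3D controller in (3.5) can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the gradient of the signal at the sensor location x + ℓt is assumed to be supplied by the caller (for instance from an interpolated flow field).

```python
import math

def body_frame(theta, psi):
    """Body-frame vectors (t, n, b) in the lab frame for Euler angles theta, psi (phi = 0)."""
    ct, st = math.cos(theta), math.sin(theta)
    cp, sp = math.cos(psi), math.sin(psi)
    t = (cp * ct, sp * ct, -st)
    n = (-sp, cp, 0.0)
    b = (cp * st, sp * st, ct)
    return t, n, b

def dot(a, b):
    """Euclidean inner product of two 3-vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def sgn(s):
    """Sign function returning +1.0, -1.0, or 0.0."""
    return float((s > 0) - (s < 0))

def control_3d(theta, psi, grad_g, omega):
    """Angular rates of Eq. (3.5); grad_g is the signal gradient at the sensor x + l*t."""
    _, n, b = body_frame(theta, psi)
    theta_dot = -omega * sgn(dot(b, grad_g))
    psi_dot = omega * sgn(dot(n, grad_g))
    return theta_dot, psi_dot
```

A quick sanity check on the convention is that (t, n, b) stays orthonormal for any pair of angles, which the expressions above satisfy.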
3.6 Stability Analysis in Traveling-Wave Signal Emphasizes the Importance
of Sensor Placement
The most salient and common features among all these wakes are their (quasi-) periodicity and traveling-wave structure (Fig. 3.1C). But how does this feature contribute to the success of the RL policies and
Braitenberg strategies? More importantly, why is the sign of the control gain unimportant, while the sensor
placement is crucial? To address these questions, we analyzed the stability of the RL-inspired strategies in
a 1D traveling-wave signal of the form g(x, t) = A cos(2π(x + λf t)/λ). It is a solution of the wave
equation \( g_{tt} = \lambda^2 f^2 g_{xx} \). This 1D wave solution is not only the simplest representation of the hydrodynamic
wake left by a fish-like swimmer [339, 193], but is also fundamental to different fields of physics, e.g. sound
propagation, electromagnetic waves, quantum physics, etc.
The excitatory and inhibitory strategies are related by change of sign from Ω to −Ω, which, in the
traveling wave signal field, is equivalent to a translation in either space x → x+λ/2 or time t → t+1/(2f).
It thus suffices to analyze, say, the excitatory strategy with the understanding that the asymptotic behavior
of both strategies is similar.
Trail tracking in this 1D traveling wave signal field amounts to moving upstream in the direction of
θ = 0, opposite to the traveling wave direction. Numerical simulations show that upstream motion in
this signal field depends on the location ℓ of the sensor (Fig. 3.5A). To analyze the problem analytically,
we introduced a coordinate transformation z = x + λf t, where z represents a Lagrangian point moving
with the traveling wave, and substituted z and g(z) = g(x + λf t) into (3.1) and (3.2) to eliminate explicit
dependence on time,
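\[ \dot{z} = V\cos\theta + \lambda f, \qquad \dot{\theta} = \Omega\,\mathrm{sgn}\!\left[\frac{2\pi A}{\lambda}\,\sin\theta\,\sin\!\left(\frac{2\pi}{\lambda}(z + \ell\cos\theta)\right)\right]. \tag{3.6} \]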
The system in (3.6) has two moving equilibria at θ = 0 (moving upstream) and θ = ±π (moving downstream). The stability of these equilibria depends on the sensor location. Fig. 3.5B depicts
the phase portraits of (3.6) over the phase space (z, θ). Dashed black lines indicate level sets where ˙θ is
0. The phase velocity ˙θ alternates direction between these level sets. The linear stability at these equilibria θ = 0, ±π varies periodically over coordinate z. The shape of the level sets ensures that for ℓ < 0,
a trajectory travels a longer distance in the regions where the phase velocity points towards θ = 0; the
upstream equilibrium is thus stable. For ℓ > 0, trajectories travel longer in the regions where the phase
velocity points towards θ = ±π; the downstream equilibrium is stable. At the bifurcation point ℓ = 0,
both equilibria are neutrally stable.
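The reduced system (3.6) is simple enough to explore numerically. The sketch below implements its right-hand side and a basic explicit-Euler integrator. The values V = 0.25 and Ω = 1 and the initial heading 0.55π follow the caption of Fig. 3.5, whereas λ = 1, f = 1, A = 1, the timestep, and the integrator are assumptions made here for illustration, since they are not specified in the text.

```python
import math

def reduced_rhs(z, theta, ell, V=0.25, omega=1.0, lam=1.0, f=1.0, A=1.0):
    """Right-hand side (z_dot, theta_dot) of Eq. (3.6) for the excitatory strategy."""
    s = (2 * math.pi * A / lam) * math.sin(theta) * math.sin(
        2 * math.pi * (z + ell * math.cos(theta)) / lam)
    sign = (s > 0) - (s < 0)
    return V * math.cos(theta) + lam * f, omega * sign

def simulate(ell, z0=0.0, theta0=0.55 * math.pi, dt=1e-3, steps=20000, **params):
    """Explicit-Euler integration (assumed scheme) of Eq. (3.6); returns the final (z, theta)."""
    z, theta = z0, theta0
    for _ in range(steps):
        z_dot, theta_dot = reduced_rhs(z, theta, ell, **params)
        z += z_dot * dt
        theta += theta_dot * dt
    return z, theta
```

Calling simulate with ℓ = -0.25, 0, and 0.25 and comparing the final headings gives a direct way to probe the sensor-placement dependence described above; the outcome will depend on the assumed wave parameters and timestep.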
3.7 Versatile navigation strategies are applicable to track vortical and
turbulent plume
Are the lateral gradient of the flow field and a sensor offset at the tail required for wake tracking? To address this question,
we tried two different changes in the design of the agent.
In Fig. 3.6A, we considered using the tangential (longitudinal) gradient of flow speed in place of the lateral gradient.
In simulation, the longitudinal gradient works as well as the lateral gradient in wake tracking. A similar stability
analysis is carried out in Sec. 3.10 and Fig. 3.15. The stability analysis also shows that moving upstream is an
asymptotic behavior when the sensor is at the tail.
The fact that the RL policies ignore entirely the sensory measurements at head for wake tracking is
puzzling. In nature, most sensors are located at the head of animals. The setae and whiskers of copepods
and harbor seals precede the organism, and the lateral line sensors, while distributed along the entire fish
body, occur in higher density at the head [391, 85]. Why does our optimal RL solution seem to contradict
these observations? It doesn’t. It simply reflects a time delay between sensing and response. For example,
consider a seal with flow sensors at its head and the time delay between sensory input and response output.
If the point-particle model is used to describe how the seal orients to the sensed signal, it would seem that
its response depends on a past signal, now in the flow behind its current location, as if instantaneously
responding to a sensor at its tail. To support this proposition, we examined the RL policies in a simplified
traveling wave signal field. Namely, we considered a policy π(s(x(t − τ ), t − τ )) where the response
depends on a delayed signal measured at time t − τ , where τ is the time delay, in a 1D signal field that
satisfies g(x, t) = g(2π(x + λf t)/λ), where λf is the speed of the traveling signal. Taking the Taylor
series expansion of x(t − τ ) about time t, assuming small τ , we get that
\[ \dot{\theta}(t) = \pi\big(s(x(t-\tau), t-\tau)\big) \approx \pi\big(s(x - \ell\cos\theta, t)\big), \qquad \text{where } \ell = V\tau. \tag{3.7} \]
That is, a time-delayed response is equivalent to having a sensor located at a distance V τ behind the
agent’s location. The agent with a time-delayed response (Fig. 3.6B), like its counterpart with instantaneous
response to a sensor at the tail (solid lines in Fig. 3.3B,D), tracks the wake to its generating source with
similar trajectory and success rate.
The last question is how to extend this controller to turbulent flow. In turbulent flow, the spatial gradient of the
signal field is ill-defined in the entire domain (there is a jump at the boundary of each odor packet, and the gradient is zero
elsewhere). Thus, we employed a temporal difference measurement, which is a first-order approximation of
measuring the tangential gradient with an offset. It is formulated as follows. At each timestep t, the agent
compares the current signal g(x(t), t) with the signal at a previous time, g(x(t − τ ), t − τ ), and
turns according to the sign of their difference:
\[ \dot{\theta}(t) = \Omega\,\mathrm{sgn}\{ g[x(t), t] - g[x(t-\tau), t-\tau] \} \tag{3.8} \]
The temporal difference of the signal is an approximation of the longitudinal gradient of the signal field plus
a time delay. In the turbulent plume, Ω > 0 and Ω < 0 generate similar success rates (Fig. 3.6C), but do
not intuitively map to excitatory or inhibitory strategies since they respond to a temporal difference
instead of a lateral gradient. The success rate exceeds that of the RNN agents in [414].
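A minimal sketch of the temporal-difference rule in (3.8) is given below. It assumes that the signal is sampled at a fixed timestep so that the delay τ corresponds to an integer number of steps; the class name, the buffer-based bookkeeping, and the choice to return zero turning rate until enough history has accumulated are assumptions made here for illustration.

```python
from collections import deque

class TemporalDifferenceController:
    """Turning rate Omega * sgn(g_now - g_delayed), as in Eq. (3.8)."""

    def __init__(self, omega: float, delay_steps: int):
        self.omega = omega
        # Holds the current sample plus the delay_steps samples before it.
        self.history = deque(maxlen=delay_steps + 1)

    def turning_rate(self, signal: float) -> float:
        self.history.append(signal)
        if len(self.history) < self.history.maxlen:
            return 0.0  # assumed behavior: no turning until a delayed sample exists
        diff = signal - self.history[0]
        return self.omega * float((diff > 0) - (diff < 0))

controller = TemporalDifferenceController(omega=2.0, delay_steps=2)
rates = [controller.turning_rate(g) for g in [0.0, 0.1, 0.3, 0.2, 0.2]]
print(rates)  # [0.0, 0.0, 2.0, 2.0, -2.0]
```

Flipping the sign of omega gives the second variant discussed above, and the same buffer can be reused for the time-delayed policy in (3.7) by feeding the delayed sample directly to the policy.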
In addition to biological data, we collected data from underwater robotic systems at different levels of
biomimicry [512, 195, 266, 269]. Robotic systems capable of large ranges of body deformations exhibited
[Figure 3.6 graphic. Panels A, B, C: trajectories in the (x, y) plane; scale bar L = 1. Success rates shown in the panels: 83%, 82%; 94%, 86%; 100%, 98%.]
Figure 3.6: Versatile strategies track vortical and turbulent wakes. A. In a vortical wake, an agent measuring longitudinal
gradient of flow speed at tail with parameters ℓ = −0.25, Ω = 3 is applied. B. In a vortical wake, an agent measuring time
longitudinal gradient at time delay with parameters τ = 0.1, Ω = 3 is applied. C. In a turbulent plume [137, 414], an agent
measuring temporal difference along its trajectory as described in Sec. 3.7 with parameters τ = 1, Ω = 2 is applied. The
trajectories show the classic "cast and surge" behavior [103, 330]. Parameters are chosen in the range of the values of harbor
seals. As a baseline for comparison, the success rate in [414] of RNN agent tracking signal field of constant direction is about
75%.
turning radii comparable to those of swimming organisms [512, 195, 266]. Using the results in Fig. 3.4A,
we predicted ranges of sensor locations (shown in yellow in Table 3.4) for which these robotic systems
would exhibit success rates exceeding 95%. To probe these predictions numerically, we employed a three-link fish model [218, 216] and developed, based on the RL-inspired excitatory and inhibitory strategies,
controllers that directly mapped local flow signals to body deformations, much like Braitenberg’s vehicles
that linked signal intensity to wheel rotation [54]. Numerical tests indicate the success of the three-link
fish in tracking a traveling wave signal (Suppl. Movie S.4). Further analysis and experimental validation
of these predictions will be the topic of future research.
3.8 Biological Data
We sought to evaluate our predictions of the success rate of the RL-inspired policies over the parameter
space of turning radius V /ΩL and sensor location translated from time delay ℓ/L ≈ V τ /L in terms of
the parameters achievable by biological and robotic systems. To this end, we gathered data on the turning
radius R and swimming speeds V of aquatic organisms and underwater robotic vehicles. Fish data are
based on the review paper by Domenici et al. 1997 [111] and other sources [486, 396, 56, 231, 112, 186,
360, 421, 236, 484, 50, 481]. Copepod data are collected from [342], and data for sea lions, which are close
relatives of harbor seals, came from [144].
We next sought to collect data on the time delay τ between mechanosensing and motor response of
aquatic animals. For fish, [113] measured the latency of fish evasion response to a mechanical stimulus by
measuring body flexion and neural activity. The time delay ranged from 5 to 150 ms. Similarly, in [392],
the authors measured the stimulus-to-response time exhibited by hatchling Xenopus tadpoles and found
the response time is about 70 ms. For harbor seals, [232] measured the reaction time of harbor seals to
acoustic stimulus, and found the reaction time of body motion ranged from 188 – 982 ms. For copepods,
[260] measured the reaction time of tethered Undinula vulgaris (Calanoida) to a hydrodynamic stimulus.
They found that the reaction time of power stroke was under 2.5 ms following the onset of the stimulus.
Some of these data were available in the min-max range (e.g. [232, 113]), which we converted to mean
and standard deviation using the range rule std = (max − min)/4 [455]. The complete dataset is available
in Table 3.4.
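For concreteness, the range-rule conversion and the translation from time delay to sensor offset, ℓ/L ≈ Vτ/L, can be written as below. This is a sketch under stated assumptions: taking the mean as the midpoint of the reported range is an assumption made here (the text specifies only the rule for the standard deviation), and the function names are hypothetical.

```python
def range_to_mean_std(lo: float, hi: float):
    """Convert a (min, max) range to (mean, std).

    The std uses the range rule (max - min) / 4; the midpoint mean is an assumption.
    """
    return (lo + hi) / 2, (hi - lo) / 4

def equivalent_sensor_offset(V: float, tau: float, L: float) -> float:
    """Dimensionless sensor offset l/L ~ V*tau/L implied by a response delay tau."""
    return V * tau / L

fish = range_to_mean_std(5, 150)     # response time in ms
seal = range_to_mean_std(188, 982)   # response time in ms
print(fish, seal)  # (77.5, 36.25) (585.0, 198.5)
```

When combining a response time with a swimming speed, the delay must first be converted from milliseconds to seconds so that Vτ/L is dimensionless.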
Table 3.2: Success rate of excitatory and inhibitory strategies in CFD simulations of flows past
stationary and moving bodies. Wake type depends on how many vortices are shed in a period and their
spatial relationship [494, 398]. When two single vortices are shed in one period, such as Figs. 3.1C of the
main text, the wake is called a 2S wake. If two pairs of vortices and two single vortices are shed during
one period, such as in Fig. 3.4E of the main text, the wake is categorized as a 2P+2S wake. A von Kármán
vortex street that generates a jet pointing upstream is a drag wake (Fig. 3.4G in the main text), while a
reverse von Kármán vortex street is a thrust wake (Fig. 3.1C). The success rate of excitatory and inhibitory
strategies of ℓ = −0.25λ and Ω = 3U/L are reported in the right two columns.
label  structure           f     U     ν            A     Re    St     wake type          success excitatory (%)  success inhibitory (%)
I      airfoil             1.25  1     2 · 10^-4    0.2   5000  0.25   2S thrust          88                      100
II     airfoil             1.25  1     2 · 10^-3    0.2   500   0.25   2S balanced        88                      100
III    airfoil             0.5   1     2 · 10^-4    0.2   5000  0.1    2P+2S              100                     100
IV     airfoil             0.5   1     2 · 10^-3    0.2   500   0.1    2S drag            100                     100
V      airfoil             1.25  1     1 · 10^-3    0.2   1000  0.25   intermittent wake  100                     55
VI     airfoil             1.25  1     1 · 10^-3    0.2   1000  0.25   curved wake        91                      5
VII    cylinder            -     1     5 · 10^-3    -     200   0.203  2S drag            99                      99
VIII   cylinder            -     1     3.33 · 10^-3 -     300   0.216  2S drag            80                      80
IX     cylinder            -     1     2.5 · 10^-3  -     400   0.222  2S drag            85                      84
X      cylinder            -     1     2 · 10^-3    -     500   0.229  2S drag            100                     100
XI     3D undulating fish  1     0.48  2 · 10^-4    0.32  2400  0.42   2S balanced        100                     76
Table 3.3: Training instances of RL policies based on different sensory cues. We conducted multiple
training instances for four different signals transported by the flow field. The cumulative reward is shown
in Fig. 3.3 of the main text and Fig. 3.7 based on representative training instances for each sensory cue.
sensory cue          number of training instances  convergence fraction  excitatory fraction  inhibitory fraction
speed n · ∇|u|       20                            17/20                 8/17                 9/17
vorticity n · ∇|ω|   15                            0/15                  -                    -
pressure n · ∇p      11                            9/11                  0/9                  9/9
odor n · ∇c          9                             9/9                   9/9                  0/9
Table 3.4: Biological data of animals’ locomotion capabilities.

Swimming speed and turning rate of animals
Group        Species                                            Common name      Body length (m)  Swimming speed (m/s)  Turning radius (m)   Reynolds number  Reference
Fish         Oncorhynchu                                        Pacific trout    0.33             0.165                 0.1254               54450            Weihs 1973; Sato et al. 2007; Brett 1965
Fish         Coryphaena hippurus                                Dolphinfish      2 ± 0.3          1.5 ± 0.5             0.33 ± 0.0795        3.00E+06         Domenici & Blake 1997; Webb & Keyes 1981
Fish         Xenomystus nigri                                   Knifefish        0.113            0.95 ± 0.165          0.006215             107350           Domenici & Blake 1997; Kasapi et al. (1993)
Fish         Pterophyllum eimekei                               Angelfish        0.073            0.785 ± 0.0675        0.004745 ± 0.00015   57305            Domenici & Blake 1997; Domenici & Blake 1991
Fish         Esox lucius                                        Pike             0.511 ± 0.044    0.221 ± 0.035         0.04599 ± 0.004      112931           Domenici & Blake 1997; Diana 1980; Harper & Blake 1990
Fish         Micropterus dolomieu                               Smallmouth bass  0.31 ± 0.07      0.8 ± 0.2             0.0341 ± 0.014       248000           Domenici & Blake 1997; Webb 1983; Peake & Farrell 2004
Fish         Oncorhynchus mykiss                                Rainbow trout    0.117 ± 0.0035   0.35 ± 0.01           0.02106 ± 0.003      40950            Domenici & Blake 1997; Webb 1976; Kieffer et al. 1998
Fish         Seriola dorsalis                                   Yellowtail       0.189            0.7708 ± 0.0936       0.04347              145681           Domenici & Blake 1997; Webb & Keyes 1981; Wegner et al. 2018
Fish         Thunnus albacares                                  Yellowfin tuna   0.845 ± 0.0475   0.68 ± 0.11           0.3972 ± 0.1744      574600           Domenici & Blake 1997; Blake et al. 1995; Block et al. 1997
Pinniped     Zalophus                                           Sea Lion         1.805 ± 0.085    2.8857 ± 0.7675       0.405 ± 0.1394       5.20E+07         Fish et al. 2003
Zooplankton  Bestiolina similis and Parvocalanus crassirostris  Copepod          0.00005          0.0017 ± 0.0007       0.000076 ± 0.000032  0.1              Niimoto et al. 2020

Swimming speed and turning rate of robots (underwater robots)
Type                  Body length (m)  Swimming speed (m/s)  Turning radius (m)  Reference
Multilink fish        0.45             0.25                  0.09                Li et al. 2020
Multilink fish        0.34             0.063                 0.1                 Hirata et al. 2000
Multilink fish        0.65             0.19                  0.22104491          Yu et al. 2008
Driven by caudal fin  1.23             1.1                   1.800724499         Liang et al. 2011
Driven by propeller   1.6              1.1                   4.201690498         Liang et al. 2011

Response time of animals
Group        Response time (ms)  Reference
Fish         5-150               Domenici & Hale 2019; Roberts et al. 2019
Seal         188-982             Kastelein et al. 2011
Zooplankton  2.5                 Lenz & Hartline 1999
62



[Figure 3.7 graphic. Panels A-D (odor, speed, pressure, vorticity): reward versus training episodes (×10^4).
Insets: turning rate (in units of U/L, from -3 to 3) as a colormap over the signals at the tail sensor and
the head sensor, each ranging from -1 to 1.]
Figure 3.7: RL training for different sensory cues. A. Lateral gradient of odor concentration. B. Lateral gradient of flow
speed. C. Lateral gradient of pressure. D. Lateral gradient of vorticity magnitude. The convergence rates over multiple training
instances are reported in Table 3.3. In A and C, RL training converges to excitatory and inhibitory strategies, respectively. In B,
different training instances converge to either excitatory or inhibitory strategies. In D, the training does not converge. The insets
show the converged RL policy by plotting the action $\dot\theta$ as a colormap over the observation space $(s_{tail}, s_{head})$.
These colormaps exhibit features similar to, though less pronounced than, those obtained in Fig. 3.3A of the main text: to
first-order approximation, the action $\dot\theta$ is mostly independent of the signal $s_{head}$. We thus approximate these
strategies following (3.2). Parameter values: $\ell_{tail} = -0.25L$, $\ell_{head} = 0.25L$, $\Omega = 3U/L$. The flow field used
during training is the same as that in Fig. 3.3 of the main text.
3.9 Analysis of proportional controller in traveling wave signal field
To further analyze the system, we replace the $\Omega\,\mathrm{sgn}(\cdot)$ function in (3.6) with the linear control law
$G\,(\cdot)$. To remove the explicit dependence on time, we apply the coordinate transformation $z = x + f\lambda t$, so that
the system is written as
\[
\dot z = V\cos\theta + f\lambda, \qquad
\dot\theta = \frac{2AG\pi}{\lambda}\,\sin\theta\,\sin\!\left(\frac{2\pi}{\lambda}\,(z + \ell\cos\theta)\right)
\tag{3.9}
\]
The phase portrait of this system is plotted in Fig. 3.13. To analyze the behavior of this dynamical
system, we first find its equilibria. The lines θ = 0 and θ = ±π are two manifolds on which $\dot\theta = 0$; they correspond
to moving upstream and downstream in the signal field. When the speed of the agent V is larger than
the traveling speed of the signal field fλ, a set of equilibrium points (z, θ) = ((nVλ + 2fλℓ)/2V, π ±
arccos(fλ/V)), n ∈ Z, appears, as shown in Fig. 3.13(D-F).
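The equilibrium points quoted above can be checked numerically. The following Python sketch evaluates the right-hand side of (3.9) at those points; the function names `rhs` and `equilibria` are illustrative and not taken from the original work, and the parameter values are the ones quoted in the caption of Fig. 3.13 (A = λ = f = G = 1).

```python
import math

def rhs(z, theta, V, f, lam, A, G, ell):
    """Right-hand side of Eq. (3.9): returns (dz/dt, dtheta/dt)."""
    zdot = V * math.cos(theta) + f * lam
    phase = 2.0 * math.pi / lam * (z + ell * math.cos(theta))
    thetadot = 2.0 * A * G * math.pi / lam * math.sin(theta) * math.sin(phase)
    return zdot, thetadot

def equilibria(n, V, f, lam, ell):
    """Isolated equilibria for integer n; they exist only when V > f*lam."""
    if V <= f * lam:
        return []
    z_eq = (n * V * lam + 2.0 * f * lam * ell) / (2.0 * V)
    delta = math.acos(f * lam / V)
    return [(z_eq, math.pi + delta), (z_eq, math.pi - delta)]

# Large-speed case of Fig. 3.13 (V = 5) with an aft sensor (ell = -0.25).
for z_eq, th_eq in equilibria(n=3, V=5.0, f=1.0, lam=1.0, ell=-0.25):
    print(z_eq, th_eq, rhs(z_eq, th_eq, 5.0, 1.0, 1.0, 1.0, 1.0, -0.25))
```

Both components of the right-hand side vanish to round-off at these points, and the list is empty for V ≤ fλ, in line with the statement that the isolated equilibria appear only when the agent outruns the wave.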
[Figure 3.8 graphic. Panels A-D (odor, speed, pressure, vorticity): success rate (%), from 0 to 100, as a function of
sensor location ℓ/λ (from -0.5 to 0.5) and turning radius V/ΩL (from 0.025 to 0.5), with 50% and 80% levels marked.]
Figure 3.8: Parametric study for different sensory cues. Success rate of the agent as a function of sensor location ℓ/λ
and turning radius R = V /ΩL for different sensory cues with excitatory (red) and inhibitory (blue) RL-inspired strategies. Four
different sensory cues are tested: odor (A.), flow speed (B.), pressure (C.), and vorticity (D.).
3.9.1 Linear stability analysis of the system
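Then, we analyze the linear stability of the system at these equilibria as follows. The Jacobian matrix of
the system is
\[
J = \begin{pmatrix}
\dfrac{\partial \dot z}{\partial z} & \dfrac{\partial \dot z}{\partial \theta} \\[2mm]
\dfrac{\partial \dot\theta}{\partial z} & \dfrac{\partial \dot\theta}{\partial \theta}
\end{pmatrix}
= \begin{pmatrix}
0 & -V\sin\theta \\
C & D
\end{pmatrix}
\tag{3.10}
\]
where
\[
\begin{aligned}
C &= AG\left(\frac{2\pi}{\lambda}\right)^{2} \sin\theta\,\cos\!\left(\frac{2\pi}{\lambda}(z + \ell\cos\theta)\right), \\
D &= \frac{2AG\pi}{\lambda}\,\cos\theta\,\sin\!\left(\frac{2\pi}{\lambda}(z + \ell\cos\theta)\right)
   - AG\left(\frac{2\pi}{\lambda}\right)^{2} \ell\,\sin^{2}\theta\,\cos\!\left(\frac{2\pi}{\lambda}(z + \ell\cos\theta)\right).
\end{aligned}
\tag{3.11}
\]
At equilibria θ = 0, ±π, the Jacobian matrix is simplified as
\[
J_{\theta = n\pi} = \begin{pmatrix}
0 & 0 \\
0 & \dfrac{2AG\pi}{\lambda}\,(-1)^{n} \sin\!\left(\dfrac{2\pi}{\lambda}\,(z + \ell(-1)^{n})\right)
\end{pmatrix},
\qquad n = 0, \pm 1
\tag{3.12}
\]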
[Figure 3.9 graphic. Columns: RL-trained policies and Braitenberg strategies. Rows A-D: probability density
functions (pdf) of the terminal y location (from -1.5 to 1.5), the duration (t_f - t_0) f (from 0 to 100), the flow
agreement parameter (from -1 to 1), and the thrust parameter (from 0 to 9).]
Figure 3.9: Statistical characteristics of the RL-trained policies and RL-inspired strategies. The statistical characteristics
are based on all trajectories, out of the 76,500 test cases for each policy, that succeed in moving upstream. The success rates for
RL-trained policies and RL-inspired strategies are 86.82% and 87.5%, respectively, for the excitatory policies (red), and 91.47%
and 100%, respectively, for the inhibitory policies (blue). A. Distribution of the terminal location $y(t_f)$ of the swimmer. The
dashed grey line represents the tailbeat amplitude of the pitching airfoil. B. Distribution of the duration $t_f - t_0$ to reach the
source, normalized by the period $1/f$ of the wake. The RL-trained policies are, on average, slower in reaching the terminal
location than the RL-inspired strategies. C. Flow agreement parameter and D. thrust parameter of the
trajectories following each strategy. Note that, while the RL-inspired strategies are pure bang-bang controllers, the RL-trained
policies are not. The small asymmetry in the distribution of terminal positions $y(t_f)$ based on RL-trained policies in panel A
and the larger Shannon entropy ($-\int p \log p\,dx$) in the flow agreement and thrust parameters highlight the slight
deviations between the RL policies and the RL-inspired Braitenberg strategies.
[Figure 3.10 graphic. A: success rate (%) versus decision timestep scaled by pitching period, fΔt (from 0 to 0.5).
B: success rate (%) versus sensing limit scaled by average signal strength (from 0 to 2). C: rate of undetected
signal (%) versus sensing limit scaled by average signal strength (from 0 to 2). Curves compare RL-trained policies
and Braitenberg strategies.]
Figure 3.10: Robustness of the RL policies and RL-inspired strategies to sensory limitations. A. Success rate as a
function of decision timestep ∆t for the excitatory (red) and inhibitory (blue) strategies. RL-trained policies are
robust to increases in ∆t up to 50% of the pitching period 1/f, while Braitenberg strategies are robust up to 35%-45% of the
pitching period. B. Success rate as a function of a sensory constraint imposed on the agent, by which it cannot discern or
respond to flow gradients below a certain limit. Namely, for RL-trained policies, we set $s_{head}$ or $s_{tail}$ to zero before
passing them to the actor DNN whenever $|s_{head}| < s_{threshold}$ or $|s_{tail}| < s_{threshold}$. For RL-inspired strategies,
we set $\dot\theta = 0$ for $|s| < s_{threshold}$. The decrease in success rate of the RL-trained policies is more gradual than
that of the RL-inspired strategies. C. Proportion of timesteps (in %) for which the agent detects no signal. The proportion is
calculated based on all successful trajectories. The RL-trained policies are more robust than the Braitenberg strategies (higher
success rates), even though the proportions of time with no detected signal are larger for the RL-trained policies.
[Figure 3.11 graphic. Panels A-D (odor, speed, pressure, vorticity): success rate (%), from 0 to 100, as a function of
sensor location ℓ/λ (from -0.5 to 0.5) and turning radius V/ΩL (from 0.025 to 0.5), with 50% and 80% levels marked.]
Figure 3.11: Parametric study for different sensory cues using gradient in longitudinal direction. Success rate of the
agent as a function of sensor location ℓ/λ and turning radius R = V /ΩL for different sensory cues with excitatory (red) and
inhibitory (blue) RL-inspired strategies. Four different sensory cues are tested: odor (A.), flow speed (B.), pressure (C.), and
vorticity (D.).
[Figure 3.12 graphic. A: signal field over x/λ (from -4 to 4) and y/λ (from -2 to 2), with the traveling direction of the
signal field indicated. B (measure lateral gradient) and C (measure longitudinal gradient): trajectories over the same
axes for a sensor at the tail, center, and head.]
Figure 3.12: Trajectories in traveling wave signal field. A. Illustration of the traveling wave signal field
$A\cos(2\pi x/\lambda + 2\pi f t)$. B., C. Trajectories for a sensor at the tail (red, ℓ = −0.25λ), center (green, ℓ = 0), and
head (blue, ℓ = 0.25λ), with initial orientation ranging from 0 to 2π (corresponding to color ranging from dark to light). In B.
and C., the lateral gradient and the longitudinal gradient of the signal field are used as the sensory cue, respectively.
Parameters are A = 1, G = 1, V = 0.25.
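Let λ1, λ2 be the two eigenvalues of the Jacobian matrix,
\[
\lambda_1 \lambda_2 = \det(J) = 0, \qquad
\lambda_1 + \lambda_2 = \mathrm{tr}(J) = \frac{2AG\pi}{\lambda}\,(-1)^{n} \sin\!\left(\frac{2\pi}{\lambda}\,(z + \ell(-1)^{n})\right)
\tag{3.13}
\]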
Thus, for equilibria θ = 0, ±π, at least one of the eigenvalues is 0, and the sign of the other eigenvalue
changes periodically with z. From a linear stability perspective, neither moving upstream nor moving downstream can
therefore be classified as stable or unstable. This periodic change of stability is also illustrated in
the phase portrait (Fig. 3.13). Moreover, ℓ acts as a bifurcation parameter: at the same location on the z-θ plane, if the
system is locally stable when ℓ > 0, it is unstable when ℓ < 0, and vice versa.
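At equilibria (z, θ) = ((nVλ + 2fλℓ)/2V, π ± arccos(fλ/V)), n ∈ Z, the Jacobian matrix is written
as
\[
J_{(z,\theta) = \left(\frac{nV\lambda + 2f\lambda\ell}{2V},\; \pi \pm \arccos\frac{f\lambda}{V}\right)}
= \begin{pmatrix}
0 & -V\sin\theta \\
AG\left(\dfrac{2\pi}{\lambda}\right)^{2} \sin\theta\,(-1)^{n} &
-AG\left(\dfrac{2\pi}{\lambda}\right)^{2} \ell \left(1 - \dfrac{f^{2}\lambda^{2}}{V^{2}}\right)(-1)^{n}
\end{pmatrix},
\qquad n \in \mathbb{Z}
\tag{3.14}
\]
Let λ1, λ2 be the two eigenvalues of the Jacobian matrix,
\[
\begin{aligned}
\lambda_1 \lambda_2 &= \det(J) = AGV\left(\frac{2\pi}{\lambda}\right)^{2}\left(1 - \frac{f^{2}\lambda^{2}}{V^{2}}\right)(-1)^{n}, \\
\lambda_1 + \lambda_2 &= \mathrm{tr}(J) = AG\left(\frac{2\pi}{\lambda}\right)^{2} \ell \left(1 - \frac{f^{2}\lambda^{2}}{V^{2}}\right)(-1)^{n+1}.
\end{aligned}
\tag{3.15}
\]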
If n is an even number, λ1λ2 > 0. In that case, if ℓ > 0, then λ1 + λ2 < 0 and both eigenvalues have negative real
parts, so the equilibria are stable, as shown by blue points in Fig. 3.13(F). If ℓ < 0, then λ1 + λ2 > 0 and
both eigenvalues have positive real parts, so the equilibria are unstable, as shown by red points in
Fig. 3.13(D). If n is an odd number, λ1λ2 < 0: one eigenvalue is negative and the other is
positive. Thus, the equilibria are saddle points, as shown by green points in Fig. 3.13(D-F).
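The classification above depends only on the signs of det(J) and tr(J) in (3.15), so it can be tabulated with a few lines of Python. This is a minimal sketch, not the code used in this work: the function name `classify_equilibrium` and the label "neutral" for the case tr(J) = 0 (borrowed from the legend of Fig. 3.13) are assumptions made for illustration.

```python
import math

def classify_equilibrium(n, ell, V, f=1.0, lam=1.0, A=1.0, G=1.0):
    """Classify the n-th isolated equilibrium from det(J) and tr(J) in Eq. (3.15)."""
    if V <= f * lam:
        raise ValueError("isolated equilibria require V > f*lam")
    k2 = (2.0 * math.pi / lam) ** 2
    sin2 = 1.0 - (f * lam / V) ** 2        # sin^2(theta) at the equilibrium
    sign = 1.0 if n % 2 == 0 else -1.0     # (-1)^n
    det = A * G * V * k2 * sin2 * sign
    tr = -A * G * k2 * ell * sin2 * sign   # (-1)^(n+1) = -(-1)^n
    if det < 0:
        return "saddle"
    if tr < 0:
        return "stable"
    if tr > 0:
        return "unstable"
    return "neutral"  # det > 0 and tr = 0: purely imaginary eigenvalues

# Large-speed case (V = 5) for aft, mid, and fore sensors.
for ell in (-0.25, 0.0, 0.25):
    print(ell, [classify_equilibrium(n, ell, V=5.0) for n in (0, 1)])
```

With these inputs the even-n equilibria come out unstable for the aft sensor, neutral for the mid sensor, and stable for the fore sensor, while the odd-n equilibria are saddles in all three cases.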
[Figure 3.13 graphic. Phase portraits in the (x + λft, θ) plane, with x + λft from 0 to 3 and θ from -π to π.
Columns: sensor at tail (ℓ/λ = -0.25), sensor at middle (ℓ/λ = 0), sensor at head (ℓ/λ = 0.25). Rows: small speed
V = 0.25 (A-C) and large speed V = 5 (D-F). Manifolds are labeled stable, unstable, or neutral.]
Figure 3.13: Analysis of the controller measuring lateral gradient of the signal field. The phase portrait of the dynamical
system (Equation 3.9) is plotted on the z-θ plane. Sensor location ℓ and agent speed V are two bifurcation parameters. (A., D.),
(B., E.), and (C., F.) represent sensors at the tail (ℓ = −0.25), center (ℓ = 0.0), and head (ℓ = 0.25), respectively. (A. - C.)
represent the situation in which the agent speed V = 0.25 is smaller than the traveling speed of the signal field fλ = 1, while
(D. - F.) represent a large agent speed V = 5. In (D. - F.), points represent equilibrium points, in which red, green, orange, and
blue represent unstable source nodes, saddle nodes, neutral equilibrium points, and stable equilibrium points, respectively.
Other parameters are chosen as A = 1, λ = 1, f = 1, G = 1.
In conclusion, the linear stability of both moving upstream (θ = 0) and moving downstream (θ = ±π) changes
periodically with z. When the agent's swimming speed V is larger than the traveling speed fλ of the signal
field, a set of isolated equilibria appears. With aft sensors (ℓ < 0), half of these equilibria are unstable
and the other half are saddle points. With fore sensors (ℓ > 0), half of these equilibria are stable
and the other half are saddle points. The system has two bifurcation parameters: the sensor location ℓ
and the agent speed V.



3.9.2 Nonlinear analysis of the system
Next, we performed a nonlinear analysis of the stability of moving upstream. Expanding Equation (3.9) in a
Taylor series around the equilibrium manifold θ = 0 gives
\[
\dot\theta = \frac{2AG\pi}{\lambda}\left(\theta - \frac{\theta^{3}}{6} + O(\theta^{5})\right)
\sin\!\left(\frac{2\pi}{\lambda}\left(z + \ell - \ell\,\frac{\theta^{2}}{2} + \ell\,O(\theta^{4})\right)\right)
\tag{3.16}
\]
in which $z(t) = z_0 + \int_0^t (V\cos\theta(\tau) + f\lambda)\,d\tau = z_0 + f\lambda t + \int_0^t V\cos\theta(\tau)\,d\tau$,
which is expanded as
$z(t) = z_0 + f\lambda t + \int_0^t \left[V - V\theta^2/2 + V\,O(\theta^4)\right] d\tau
= z_0 + f\lambda t + Vt - \int_0^t \left[V\theta^2/2 + V\,O(\theta^4)\right] d\tau$.
Since θ(τ) is the unknown we are solving for, the order of $\int_0^t V\theta^2/2\,d\tau$ is not known a priori, but the
numerical simulations in Fig. 3.14 show that, for small V, discarding this term has little influence on the behavior of the
system. We therefore use the estimate $z \approx z_0 + (V + f\lambda)t$. Keeping all the terms up to $O(\theta^3)$, the
system is written as
\[
\dot\theta = \frac{2AG\pi}{\lambda}\left(\theta - \frac{\theta^{3}}{6}\right)
\sin\!\left(\left(\frac{2\pi V}{\lambda} + 2\pi f\right)t + \frac{2\pi}{\lambda}\left(\ell + z_0 - \ell\,\frac{\theta^{2}}{2}\right)\right)
\tag{3.17}
\]
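For simplicity and clarity of the equation, we define $W = (V + f\lambda)/AG$,
$\hat t = 2AG\pi t/\lambda + 2\pi AG(\ell + z_0)/(\lambda(V + f\lambda))$, and $C = \pi\ell/\lambda$, so that
$W\hat t = 2\pi(V + f\lambda)t/\lambda + 2\pi(\ell + z_0)/\lambda$ reproduces the phase in the equation above. The equation
is then written as
\[
\frac{d\theta}{d\hat t} = \left(\theta - \frac{\theta^{3}}{6}\right)\sin\!\left(W\hat t - C\theta^{2}\right)
\tag{3.18}
\]
For convenience, we drop the hat from $\hat t$ for the derivations below
\[
\dot\theta = \left(\theta - \frac{\theta^{3}}{6}\right)\left(\sin(Wt)\cos(C\theta^{2}) - \cos(Wt)\sin(C\theta^{2})\right)
\tag{3.19}
\]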
When θ → 0, we approximate $\cos(C\theta^2) \approx 1$, with error $O(\theta^4)$, and $\sin(C\theta^2) \approx C\theta^2$,
with error $O(\theta^6)$. Thus the simplified system at θ ≈ 0 is
\[
\dot\theta = \theta\sin Wt - \theta^{3}\left(\frac{\sin Wt}{6} + C\cos Wt\right),
\tag{3.20}
\]
which is a modified FitzHugh–Nagumo model with nonlinear coefficients [146, 332, 163]. This equation
is also a Bernoulli differential equation, which the substitution $u = \theta^{-2}$ transforms into a first-order
non-homogeneous linear differential equation. The solution is written as
\[
\theta = \left[\frac{e^{2(1-\cos Wt)/W}}
{e^{2(1-\cos Wt_0)/W}/\theta_0^{2} + 2\int_{t_0}^{t} e^{2(1-\cos W\tau)/W}\left(\dfrac{\sin W\tau}{6} + C\cos W\tau\right)d\tau}\right]^{1/2}
\tag{3.21}
\]
The solution is oscillatory and has "period" T ≈ 2π/W. Since
$\int_{2m\pi/W}^{2(m+1)\pi/W} e^{2(1-\cos W\tau)/W}\sin W\tau\,d\tau = 0$,
only the cos Wτ term contributes to the asymptotic behavior when t → ∞.
To determine the asymptotic behavior of the system, we consider an upper bound and a lower bound
of θ. For the lower bound, we look at the sequence $t_m = 2m\pi/W$, m ∈ Z, for which
\[
\lim_{m\to\infty} \theta(t_m) = \left[\frac{1}{\dfrac{CW}{\pi}\displaystyle\int_0^{2\pi/W} e^{2(1-\cos W\tau)/W}\cos W\tau\,d\tau}\right]^{1/2} t_m^{-1/2}
\tag{3.22}
\]
Since $\int_0^{2\pi/W} e^{2(1-\cos W\tau)/W}\cos W\tau\,d\tau < 0$, the case C > 0, which corresponds to ℓ > 0, makes the
argument of the square root negative, so the expression is not meaningful. On the other hand, if C < 0, which corresponds to
ℓ < 0, the lower bound $\theta(t_m)$ decays as $O(t_m^{-1/2})$.
For the upper bound, we look at the sequence $t_m = (2m + 1)\pi/W$, m ∈ Z, for which
\[
\lim_{m\to\infty} \theta(t_m) = \left[\frac{e^{4/W}}{\dfrac{CW}{\pi}\displaystyle\int_0^{2\pi/W} e^{2(1-\cos W\tau)/W}\cos W\tau\,d\tau}\right]^{1/2} t_m^{-1/2}
\tag{3.23}
\]
Similar to the analysis of the lower bound, when ℓ < 0, the upper bound $\theta(t_m)$ decays as $O(t_m^{-1/2})$, and
when ℓ > 0, the argument of the square root is negative and the upper bound is not meaningful. According to the squeeze
theorem [417], if ℓ < 0, both the lower-bound and the upper-bound sequences of θ decay as $O(t^{-1/2})$, which implies
that the function θ(t) decays as $O(t^{-1/2})$ when t → ∞.

[Figure 3.14 graphic. θ (from 0 to 0.6π) versus t (from 0 to 15) for the numerical simulation and the analytical
solution. A: small speed V = 0.25. B: large speed V = 5.]
Figure 3.14: Validation of the analytical solution. Comparison of θ over t with aft sensors (ℓ = −0.25) between the
numerical simulation (blue line) and the analytical solution (red line) for both small agent speed V = 0.25 (A.) and large agent
speed V = 5 (B.). Parameters and initial conditions are A = 1, λ = 1, f = 1, G = 1, x0 = 0, θ0 = 0.3π.
A similar analysis is carried out at the manifolds θ = ±π. When ℓ > 0, these manifolds are stable with the
same convergence rate $O(t^{-1/2})$. When ℓ < 0, they are unstable. Combined with the linear
stability analysis at the equilibria z = nπ/k − ℓ cos θ, this shows that when ℓ < 0, namely when an aft sensor is employed,
the agent moves upstream in the traveling wave. The comparison between the numerical simulation and the
analytical solution is given in Fig. 3.14.
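The consistency between the reduced model (3.20) and its closed-form solution (3.21) can also be checked independently of the full simulation. The sketch below is illustrative and is not the code behind Fig. 3.14: it integrates (3.20) with a classical fourth-order Runge-Kutta scheme from t0 = 0 and evaluates the integral in (3.21) with Simpson's rule. The name `decay_check`, the step size, and the initial value θ0 = 0.2 are assumptions; W = 1.25 and C = −π/4 follow from the definitions of W and C with V = 0.25, ℓ = −0.25 and A = λ = f = G = 1. Time here is the rescaled time of (3.18).

```python
import math

def decay_check(theta0, W, C, t_end, dt=2e-3):
    """Return (theta_rk4, theta_closed) at t_end for Eq. (3.20), starting at t0 = 0.

    theta_rk4: RK4 integration of Eq. (3.20).
    theta_closed: Eq. (3.21), with its integral evaluated by Simpson's rule.
    Assumes theta0 > 0 and that the denominator of Eq. (3.21) stays positive.
    """
    def f(t, th):  # right-hand side of Eq. (3.20)
        s, c = math.sin(W * t), math.cos(W * t)
        return th * s - th ** 3 * (s / 6.0 + C * c)

    def g(t):  # integrand in the denominator of Eq. (3.21)
        s, c = math.sin(W * t), math.cos(W * t)
        return math.exp(2.0 * (1.0 - c) / W) * (s / 6.0 + C * c)

    steps = max(1, int(round(t_end / dt)))
    h = t_end / steps
    th, integral = theta0, 0.0
    for i in range(steps):
        t = i * h
        k1 = f(t, th)
        k2 = f(t + h / 2, th + h / 2 * k1)
        k3 = f(t + h / 2, th + h / 2 * k2)
        k4 = f(t + h, th + h * k3)
        th += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        integral += h / 6 * (g(t) + 4 * g(t + h / 2) + g(t + h))
    numerator = math.exp(2.0 * (1.0 - math.cos(W * t_end)) / W)
    denominator = 1.0 / theta0 ** 2 + 2.0 * integral  # e^{2(1-cos 0)/W} = 1 at t0 = 0
    return th, math.sqrt(numerator / denominator)

W, C = 1.25, -math.pi / 4
T = 2.0 * math.pi / W
for m in (10, 20):
    num, closed = decay_check(0.2, W, C, m * T)
    print(m, num, closed, num * math.sqrt(m * T))
```

The last printed column, θ(t_m) t_m^{1/2}, should level off as m grows if θ decays as t^{-1/2} for an aft sensor.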
3.10 Analysis in traveling wave signal field using gradient in longitudinal direction
Next, we analyzed an agent following the longitudinal gradient of the signal field (in the t direction) in the same
traveling wave signal field. The dynamical system is written as
\[
\dot x = V\cos\theta, \qquad \dot y = V\sin\theta, \qquad
\dot\theta = -\frac{2\pi AG}{\lambda}\,\cos\theta\,\sin\!\left[\frac{2\pi}{\lambda}\,(x + \ell\cos\theta) + 2\pi f t\right]
\tag{3.24}
\]
[Figure 3.15 graphic. Phase portraits in the (x + λft, θ) plane, with x + λft from 0 to 3 and θ from -π to π.
Columns: sensor at tail (ℓ/λ = -0.25), sensor at middle (ℓ/λ = 0), sensor at head (ℓ/λ = 0.25). Rows: small speed
V = 0.25 (A-C) and large speed V = 5 (D-F). Manifolds are labeled stable, unstable, half stable, or neutral.]
Figure 3.15: Analysis of the controller measuring longitudinal gradient of the signal field. The phase portrait of the
dynamical system (Equation 3.25) is plotted on the z-θ plane. Sensor location ℓ and agent speed V are two bifurcation parameters.
(A., D.), (B., E.), and (C., F.) represent aft sensors (ℓ = −0.25), mid sensors (ℓ = 0.0), and fore sensors (ℓ = 0.25),
respectively. (A. - C.) represent the situation in which the agent speed V = 0.25 is smaller than the traveling speed of the
signal field fλ = 1, while (D. - F.) represent a large agent speed V = 5. In (D. - F.), points represent equilibrium points, in
which red, green, orange, and blue represent unstable source nodes, saddle nodes, neutral equilibrium points, and stable
equilibrium points, respectively. Other parameters are chosen as A = 1, λ = 1, f = 1, G = 1.
Similar to Sec. 3.9, a coordinate transformation z = x + λft is performed, and thus the system is
written as
\[
\dot z = V\cos\theta + f\lambda, \qquad
\dot\theta = -\frac{2\pi AG}{\lambda}\,\cos\theta\,\sin\!\left(\frac{2\pi}{\lambda}\,(z + \ell\cos\theta)\right)
\tag{3.25}
\]
Similar to Sec. 3.9.1, the linear stability analysis is inconclusive. We therefore turn to the phase portrait plotted in Fig. 3.15.
Under this scenario, neither moving upstream (θ = 0) nor moving downstream (θ = ±π) is an equilibrium. Instead,
equilibria are located at θ = ±π/2, and both are half-stable in the θ direction. For the equilibrium θ = π/2 with aft
sensors (ℓ = −0.25), we found that it is stable when approached from above ($\theta \to \pi/2^{+}$, i.e., θ > π/2) but
unstable when approached from below ($\theta \to \pi/2^{-}$). By the symmetry of the dynamical system,
$\theta \to -\pi/2^{+}$ is unstable and $\theta \to -\pi/2^{-}$ is stable.
This implies that if the initial condition is θ ∈ (−π/2, π/2), θ oscillates within this range. If the initial
condition is θ > π/2 or θ < −π/2, θ converges to π/2 or −π/2, respectively; any small disturbance that then drives θ into
the range (−π/2, π/2) leaves it oscillating inside this range. This makes θ = 0 an attractor. As in the case where the
gradient in the n direction is used as the sensory cue, the sensor location ℓ acts as a bifurcation
parameter: a fore sensor (ℓ > 0) makes θ = ±π an attractor.
To analyze the problem in more depth, we linearized the controller to a proportional controller;
θ = ±π/2 are then two manifolds of the dynamical system. When the speed of the agent V is larger than
the traveling speed of the signal field λf, a set of equilibrium points (z, θ) = ((nVλ + 2fλℓ)/2V, π ±
arccos(fλ/V)), n ∈ Z, appears, as shown in Fig. 3.15(D-F).
Here, we analyze the stability of the dynamical system at the equilibrium θ = π/2. We perform the coordinate
transformation $\hat\theta = \theta - \pi/2$, and get
\[
\dot z = -V\sin\hat\theta + f\lambda, \qquad
\dot{\hat\theta} = \frac{2\pi AG}{\lambda}\,\sin\hat\theta\,\sin\!\left(\frac{2\pi}{\lambda}\,(z - \ell\sin\hat\theta)\right)
\tag{3.26}
\]
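For simplicity, we drop the hat from $\hat\theta$ and perform the same Taylor expansion as before
\[
\dot\theta = \frac{2\pi AG}{\lambda}\,\sin\theta\,\sin\!\left(2\pi f t + \frac{2\pi}{\lambda}\left(z_0 - \ell\theta + \ell\,\frac{\theta^{3}}{6}\right)\right)
\tag{3.27}
\]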
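For simplicity and clarity of the equation, we define $W = f\lambda/AG$,
$\hat t = 2AG\pi t/\lambda + 2\pi AG z_0/(f\lambda^{2})$, and $C = \pi\ell/\lambda$, so that
$W\hat t = 2\pi f t + 2\pi z_0/\lambda$. The equation is written as
\[
\frac{d\theta}{d\hat t} = \left(\theta - \frac{\theta^{3}}{6}\right)\sin\!\left(W\hat t - 2C\theta\right)
\tag{3.28}
\]
For convenience, we drop the hat from $\hat t$ for the derivations below
\[
\frac{d\theta}{dt} = \left(\theta - \frac{\theta^{3}}{6}\right)\left(\sin(Wt)\cos(2C\theta) - \cos(Wt)\sin(2C\theta)\right)
\tag{3.29}
\]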
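When θ → 0, we approximate $\cos(2C\theta) \approx 1 - 2C^{2}\theta^{2}$, with error $O(\theta^{4})$, and
$\sin(2C\theta) \approx 2C\theta - 4C^{3}\theta^{3}/3$, with error $O(\theta^{5})$. Keeping the terms up to $O(\theta^{2})$,
the simplified system at θ ≈ 0 is
\[
\frac{d\theta}{dt} = \theta\sin(Wt) - 2C\theta^{2}\cos(Wt)
\tag{3.30}
\]
It is also a Bernoulli differential equation, and the solution is
\[
\theta = \frac{e^{-\cos(Wt)/W}}
{e^{-\cos(Wt_0)/W}/\theta_0 + \displaystyle\int_{t_0}^{t} 2C\,e^{-\cos(W\tau)/W}\cos(W\tau)\,d\tau}
\tag{3.31}
\]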
With aft sensors (ℓ < 0) and initial condition θ0 > 0, $\lim_{t\to\infty}\theta(t) = 0$. With initial condition
θ0 < 0, θ(t) moves away from 0.
Returning to the original coordinate (before the transformation $\hat\theta = \theta - \pi/2$), with aft sensors (ℓ < 0), an
initial condition θ ∈ (π/2, π) gives $\lim_{t\to\infty}\theta(t) = \pi/2$, whereas for an initial condition θ ∈ (0, π/2),
θ(t) moves away from π/2.
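The one-sided (half-stable) behavior described above can be reproduced by integrating the reduced quadratic model directly. The sketch below is only illustrative: `integrate_reduced_model` is a hypothetical helper, `c` stands for whatever coefficient multiplies θ² cos(Wt) in (3.30), and the values W = 1, c = −1 (negative, as for an aft sensor), θ0 = ±0.1, and the escape threshold of 1.5 rad are assumptions chosen for the demonstration rather than values from this work.

```python
import math

def integrate_reduced_model(theta0, W, c, t_end, dt=1e-3, theta_max=1.5):
    """RK4 integration of d(theta)/dt = theta*sin(W t) - c*theta^2*cos(W t).

    Returns (theta, escaped). escaped is True if |theta| exceeded theta_max,
    in which case the integration stops early (the small-angle model no
    longer applies there).
    """
    def f(t, th):
        return th * math.sin(W * t) - c * th * th * math.cos(W * t)

    th, t = theta0, 0.0
    for _ in range(int(round(t_end / dt))):
        k1 = f(t, th)
        k2 = f(t + dt / 2, th + dt / 2 * k1)
        k3 = f(t + dt / 2, th + dt / 2 * k2)
        k4 = f(t + dt, th + dt * k3)
        th += dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += dt
        if abs(th) > theta_max:
            return th, True
    return th, False

# Ten wave periods, starting slightly above and slightly below the manifold.
print(integrate_reduced_model(+0.1, W=1.0, c=-1.0, t_end=20 * math.pi))
print(integrate_reduced_model(-0.1, W=1.0, c=-1.0, t_end=20 * math.pi))
```

With a negative coefficient, the positive perturbation shrinks over the ten periods while the negative one leaves the neighborhood of the manifold, which is the asymmetry the text calls half-stable.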
[Figure 3.16 graphic. Trajectories over x/λ (from -100 to 100) and y/λ (from -60 to 60) for a sensor at the tail,
center, and head. A: measure lateral gradient. B: measure longitudinal gradient.]
Figure 3.16: Trajectories in traveling wave signal field with large agent speed. A., B. trajectories for sensor at tail (red,
ℓ = −0.25λ), center (green, ℓ = 0), head (blue, ℓ = 0.25λ) with initial orientation ranging from 0 to 2π (corresponds to
color ranging from dark to light ). In A. and B., lateral gradient and longitudinal gradient are used as sensory cue, respectively.
Parameters are A = 1, G = 1, V = 5.
3.11 Discussion of the dynamic system when having larger speed
In the dynamical systems analyzed in Sec. 3.9 and 3.10, the swimming speed V is another bifurcation parameter,
for sensing both the lateral gradient and the longitudinal gradient. The phase portraits for V > fλ are plotted in
Fig. 3.13(D-F) and Fig. 3.15(D-F).
For measuring the lateral gradient, a larger speed has two effects. First, a set of equilibria appears
at (z, θ) = ((nVλ + 2fλℓ)/2V, π ± arccos(fλ/V)), n ∈ Z. Second, because V > λf, the time derivative
of the transformed coordinate, ż = ẋ + λf, becomes negative at θ = ±π, where ż = −V + λf, whereas it is positive for
all θ in the cases of Fig. 3.13(A-C). Since the coordinate z is attached to the frame that moves with the traveling wave, the
physical meaning of the bifurcation in speed V is whether the motion of the agent or the motion of the
wave itself dominates the relative motion between the agent and the wave.
In Fig. 3.13(D), the manifolds at both θ = 0 and θ = ±π are stable, and the fixed points
are either unstable points or saddle points. This shows that when the swimming speed is larger than the
wave's traveling speed and the agent responds to aft sensors, both moving upstream and moving downstream
are stable. On the other hand, with fore sensors (Fig. 3.13F), both θ = 0 and θ = ±π are unstable
manifolds, and the fixed points are either stable fixed points or saddle points. This implies that responding
to fore sensors makes both moving upstream and moving downstream unstable.
For the sensory control strategy measuring the longitudinal gradient, similar effects occur, as shown in Fig. 3.15.
With aft sensors (ℓ < 0), both manifolds at θ = ±π/2 become unstable, which makes both moving
upstream and moving downstream attractors (Fig. 3.15D). With fore sensors (ℓ > 0), both moving
upstream and moving downstream are repellers (Fig. 3.15F).
3.12 Summary
Using RL, we discovered two policies for tracking biologically relevant hydrodynamic trails at intermediate
Reynolds numbers. The policies rely only on local and instantaneous flow sensing. From these policies, we
extracted two parsimonious, interpretable, and generalizable response strategies, where the swimmer measures locally a differential flow signal and responds by turning either towards or away from the direction
of stronger signal (Fig. 3.3). These remarkably simple response strategies depend only on two parameters –
the minimum turning radius R = V /ΩL of the agent and the sensor location ℓ. Through rigorous stability
analyses, we proved that the Braitenberg strategies are stable in signal fields with traveling wave character, provided posterior sensor placement ℓ < 0 (Fig. 3.5). We showed that this requirement for sensor
placement ℓ at the tail is equivalent to a time delay τ between sensing and response. Using Monte Carlo
simulations, we demonstrated that these rigorous results carry over to unfamiliar wakes (Fig. 3.4, Table 3.2)
and to sensors that probe different types of flow signals (Fig. 3.3D,E and Fig. 3.8). Moreover, inspired by
these strategies, we hand-crafted a Braitenberg strategy to track turbulent plumes via responding to the
temporal-difference of the signal field, where spatial gradient of signal is not available (Fig. 3.6B).
We then commented on the relation to other source-seeking problems and strategies [26, 386] (Table 3.1). Diverse
animals, ranging in size and movement abilities, live at different Reynolds numbers and
Péclet numbers, and thus experience signal fields with different features. The turbulent plume typically
considered in source seeking is homogeneous turbulence; namely, the flow model combines a uniform background flow
with random perturbations [465, 386, 223, 464]. However, the fluid environments typically
encountered by aquatic and aerial animals are not like this. Instead, long-lasting coherent structures exist
and dominate the time evolution of the flow, even in high Reynolds number turbulent flows [419, 432].
To assess the broader significance of our results, we gathered experimental data of the turning radius
R, swimming speed V , and time delays τ between sensory input to motor response output recorded in
fish [392, 113], harbor seals [232], and copepods [260] (Table 3.4, Sec. 3.8). We chose these organisms
because of strong experimental evidence indicating their ability to track hydrodynamic trails [370, 369,
106, 401, 49, 509]. Copepods live in low Reynolds number flow. Thus, the environment is represented
by a diffusive source in a uniform background flow as in Fig 3.6A. Gradient-ascent algorithm has been
traditionally applied in this regime [2, 234, 41, 3, 480, 207]. Here, in Fig 3.6A, we showed that having a
time-delay in the controller does not influence its performance. The parameters of the controller (turning
radius, response time) are chosen based on the actual parameters of copepods. Different kinds of fish cover
a large range of Reynolds numbers. Here we used the intermediate Reynolds number case as we considered
before. In Fig 3.6B, using the parameters of fish, time-delayed controller also successfully tracked the wake.
Harbor seals are larger and swim faster, and thus live in a higher Reynolds number regime.
Studies have found that harbor seals can track not only the vortical wakes left by conspecifics [401] but
also the turbulent wake left by a small submarine [106]. We therefore extended the time-delayed
strategy to a turbulent plume [137, 414]. Although such a plume has no single traveling wave structure with a clear
wavelength and frequency, odor structures still travel downstream. Since there is no spatial gradient in
this model, we used the temporal difference of the signal along the agent's trajectory in place of the spatial gradient,
as described in Sec. 3.7. The extended strategy also works in the turbulent plume because the odor
structures travel downstream. Like the strategy proposed in [223], the proposed strategy
depends on the downstream motion of odor packets, but it does not require the mean flow direction or explicit
knowledge of the moving direction of odor packets (Table 3.1). The trajectories shown in Fig. 3.6C are
computed with parameters close to those of the harbor seal.



Chapter 4
Navigation in unsteady flows: point-to-point navigation
4.1 Zermelo optimization problem
Classic Zermelo problem in uniform flow. The Zermelo problem considers a swimmer crossing a river
of width H [515]. Here, we consider a uniform background flow of speed U and a constant swimmer speed V <
U. Let t = (cos θ, sin θ) be the swimmer's heading direction, with θ the angle between the heading
of the swimmer and the flow direction. Given this setup, the course velocity of the swimmer is (−U +
V cos θ, V sin θ). Thus, the time required for crossing the river is H/(V sin θ), and the streamwise drift
distance is H(U − V cos θ)/(V sin θ). The orientation θ can be optimized to minimize either the crossing time or
the streamwise drift by setting the derivative with respect to θ to zero, which gives
\[
\text{Time optimal: } \theta_{opt} = \frac{\pi}{2}, \qquad
\text{Drift optimal: } \theta_{opt} = \cos^{-1}\frac{V}{U}.
\tag{4.1}
\]
Policies are visualized in Fig. 4.1.
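As a quick numerical check of (4.1), the crossing time and streamwise drift can be scanned over the heading angle. This is a minimal sketch under the setup stated above; the function names are illustrative, and the values U = 1, V = 0.8, H = 1 are chosen only for the example.

```python
import math

def crossing_time(theta, H, V):
    """Time to cross a river of width H at heading theta."""
    return H / (V * math.sin(theta))

def streamwise_drift(theta, H, U, V):
    """Downstream drift accumulated while crossing at heading theta."""
    return H * (U - V * math.cos(theta)) / (V * math.sin(theta))

def optimal_headings(U, V):
    """Headings from Eq. (4.1): (time-optimal, drift-optimal); requires V < U."""
    return math.pi / 2.0, math.acos(V / U)

U, V, H = 1.0, 0.8, 1.0
theta_time, theta_drift = optimal_headings(U, V)

# Brute-force scan over headings in (0, pi) to compare against Eq. (4.1).
grid = [0.001 * i for i in range(1, 3141)]
scan_time = min(grid, key=lambda th: crossing_time(th, H, V))
scan_drift = min(grid, key=lambda th: streamwise_drift(th, H, U, V))
print(theta_time, scan_time)
print(theta_drift, scan_drift)
print(streamwise_drift(theta_drift, H, U, V), H * math.sqrt(U**2 - V**2) / V)
```

The last line compares the drift at the drift-optimal heading with H√(U² − V²)/V, which is what the drift expression reduces to when cos θ = V/U.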
Figure 4.1: Hand-crafted policies for Zermelo problem. The field of optimal orientations for A. a naive
agent that does not see the flow and B. a Zermelo policy that navigates to the target in a uniform background flow.
Thick black lines show the boundary beyond which there is no solution for point-to-point navigation using this
Zermelo formulation. Both the naive and the Zermelo policies perform poorly
when navigating across an unsteady wake, with overall success rates of 9.8% and 31.85%, respectively. C.
Zermelo's time-optimal (green arrows) and D. drift-optimal (orange arrows) policies for entering the wake
subject to a mean background flow in the x-direction.
When considering point-to-point navigation in a uniform flow (−U, 0), a swimmer with speed V
needs to navigate to a target located at ∆x = (∆x, ∆y) relative to the swimmer. A viable strategy is
choosing a constant heading
\[
\theta_{opt} = -\arcsin\frac{U\,\Delta y}{V\sqrt{\Delta x^{2} + \Delta y^{2}}} + \arctan\frac{\Delta y}{\Delta x}
\tag{4.2}
\]
This equation has no solution when $\left|\dfrac{U\,\Delta y}{V\sqrt{\Delta x^{2} + \Delta y^{2}}}\right| > 1$.
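The constant-heading rule (4.2) can be sketched in a few lines of Python. The helper names below are hypothetical, and `math.atan2(dy, dx)` is used in place of arctan(∆y/∆x) so that the quadrant of the target is preserved; that substitution is an assumption of this sketch rather than part of (4.2). The function returns `None` in the no-solution case stated above.

```python
import math

def constant_heading(dx, dy, U, V):
    """Heading angle from Eq. (4.2), or None when the arcsin argument exceeds 1 in magnitude."""
    arg = U * dy / (V * math.hypot(dx, dy))
    if abs(arg) > 1.0:
        return None
    return -math.asin(arg) + math.atan2(dy, dx)

def ground_velocity(theta, U, V):
    """Velocity over ground for heading theta in the uniform flow (-U, 0)."""
    return (-U + V * math.cos(theta), V * math.sin(theta))

# A fast swimmer (V > U) with an upstream target, and a slow swimmer (V < U)
# with a downstream target; both example targets are illustrative.
for dx, dy, U, V in [(3.0, 4.0, 1.0, 2.0), (-3.0, 1.0, 1.0, 0.8)]:
    theta = constant_heading(dx, dy, U, V)
    vx, vy = ground_velocity(theta, U, V)
    print(theta, vx * dy - vy * dx, vx * dx + vy * dy)
```

For each case, the second printed number (the cross product of the ground velocity with the target direction) should vanish, and the third (their dot product) should be positive, meaning the resulting course points straight at the target. A caller would still need to check the sign of that dot product, since a slow swimmer cannot make progress toward targets that lie upstream.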
4.2 Distinguishing Egocentric from Geocentric Learning
Three Stages of Navigation across Unsteady Flows. Consider the problem of an artificial swimmer
tasked with navigating to a destination located across an unsteady wake (Fig. 4.2A, Sec. A.1). The wake
consists of a trail of alternating-sign vortices generated by a freestream flow of speed U past a fixed cylinder
of diameter D. The swimmer, modeled as a self-propelled agent, is constrained to move at a constant
swimming speed V = 0.8U, lower than the freestream speed U.
This problem is challenging because when positioned outside the wake, the swimmer cannot overcome
the flow; it drifts downstream. In Zermelo’s classic optimization problem, a swimmer in an overpowering
uniform stream with control only over its heading direction can, at best, optimize its motion either to
minimize the time it takes to travel a given distance across the stream (Fig. 4.2B, Sec. 4.1) or to minimize
its downstream drift distance (Fig. 4.2C). To navigate to a target across an unsteady wake, the swimmer
must follow three stages: enter the wake, slalom between vortices to exploit the weaker flows in the wake
to swim upstream, and exit upstream of the target to ensure reaching it despite the stronger downstream
current (Fig. 4.2A). These three stages – entering, zigzagging inside, and exiting the wake – universally
characterize navigation across an unsteady wake; they arise in trajectories based on time-optimal control
given full knowledge of the spatiotemporal evolution of the flow field, when the swimmer has direct control
over its heading direction [174] and when it has control only over the rotational rate at which it changes
its heading direction (Fig. 4.2A, Sec. 4.1). Slaloming inside the wake was reported in live fish negotiating
unsteady wakes [271] and is thought to endow fish with energetic benefits when swimming alone [271]
and in groups (e.g., [486, 193] and references therein).
Because full knowledge of the spatiotemporal evolution of the flow field is often unavailable for robotic
or biological underwater navigators, it was demonstrated in [174] that an optimal strategy for entering,
zigzagging in, and exiting the wake can be learned using Deep RL based only on local observations of
the flow velocity and relative position of the target. The problem of navigating across unsteady wakes in
strong currents is thus controllable and solvable with either full or partial observations of the flow field, as
long as observations are provided in an inertial frame of reference. But is learning feasible in the agent’s
Umwelt, from an agent-centric perspective? If feasible, what does the agent learn? And can it transfer its
learning to novel wakes and conditions unexplored during training? We investigated these three questions
in detail.
Egocentric versus Geocentric Sensing To illustrate the difference between egocentric and geocentric
sensing, consider the geocentric learning in [174] where the agent was tasked with reaching a target located
at x⋆ ≡ (x⋆, y⋆), with coordinates (x⋆, y⋆) expressed in an inertial frame of reference (ex, ey); the policy
took as input the components of the fluid velocity u ≡ (u, v) locally at the agent’s location x ≡ (x, y),
and the components ∆x ≡ (∆x, ∆y) of the relative position ∆x = x⋆ − x of the target, all in the inertial
frame of reference (ex, ey) (Fig. 4.2E). To obtain these observations, the swimmer must first measure these
quantities using on-board sensors in its own body-fixed frame, say, (t, n) chosen to coincide with the
swimmer’s heading t and transverse n directions (Fig. 4.2F). Then, to transform these measurements into
an inertial frame, the agent needs to know its own orientation θ, i.e., heading direction t ≡ (cos θ, sin θ),
relative to the inertial frame (ex, ey), which usually means the assistance of a satellite, compass, or inertial
measurement unit (Table 4.1). Additionally, to properly align the inertial frame relative to the freestream
direction as done in [174], the swimmer must know the freestream direction in advance, which is typically
unavailable in underwater environments [391, 85, 26, 350].
In an equivalent egocentric learning set-up, the agent collects sensory observations directly in its body frame (t, n), with no prior knowledge of the freestream direction and no dependence on terrestrial coordinates. Specifically, the agent observes, at its location, the longitudinal and transverse components (ub, vb) ≡ (u · t, u · n) of the fluid velocity u and the components (∆xb, ∆yb) ≡ (∆x · t, ∆x · n) of the relative position ∆x of the target (Fig. 4.2F). It has no knowledge of its own orientation θ. Therefore, egocentric observations, if amenable to learning in underwater environments, would at once be less demanding in terms of sensory requirements and offer greater flexibility in underwater environments where obtaining and communicating external sensory data to the agent is infeasible.
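The relation between the two observation sets can be written out as a projection onto the body frame. The sketch below is illustrative only: in a simulation the orientation θ is available and can be used to generate body-frame quantities, whereas the egocentric agent described above would measure them directly and never see θ. The function names and the choice of n as t rotated by +90° are assumptions.

```python
import math

def to_body_frame(vec, theta):
    """Project an inertial-frame vector onto the body frame (t, n)."""
    t = (math.cos(theta), math.sin(theta))
    # Assumption: n is t rotated counterclockwise by 90 degrees.
    n = (-math.sin(theta), math.cos(theta))
    return (vec[0] * t[0] + vec[1] * t[1],
            vec[0] * n[0] + vec[1] * n[1])

def egocentric_observation(u, dx, theta):
    """Return (dx_b, dy_b, u_b, v_b) from inertial-frame u and relative position dx."""
    ub, vb = to_body_frame(u, theta)
    dxb, dyb = to_body_frame(dx, theta)
    return (dxb, dyb, ub, vb)
```

For example, a swimmer heading along ey (θ = π/2) in a flow u = (−1, 0) with the target straight ahead at ∆x = (0, 2) observes a purely transverse flow and a purely longitudinal target offset.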
Table 4.1: Minimal observations for successful learning. An autonomous swimmer navigating to a
target location across an unsteady wake measures both the local flow velocity F and target position T using
onboard sensors in its own body-frame and responds by controlling its rate of change of heading direction
Ω = ˙θ (Fig. 4.2). To transform the measurements F and T into geocentric observations, the agent must know
its own orientation relative to an inertial frame X. Navigation using geocentric observations is achievable
without knowledge of flow gradients. For successful egocentric navigation, additional knowledge of flow
gradients in either the tangential or normal direction is required. A comparison of the minimal sensory
requirements for successful learning indicates that egocentric sensing has the advantage of eliminating
the additional time delays and computations inherent to obtaining inertial measurements X [298, 89] at
the expense of requiring more flow measurements F to compute local flow gradients.
Strategy      | Environment | Observations                        | Action | Successful learning | Sensor requirement
RL [Ref 29]   | CFD         | ∆x, ∆y, u, v                        | θ      | yes                 | FTX
RL Geocentric | CFD, VS     | ∆x, ∆y, u, v                        | Ω      | no                  | FTX
RL Geocentric | CFD, VS     | ∆x, ∆y, θ, u, v                     | Ω      | yes                 | FTX
RL Egocentric | CFD, VS     | ∆xb, ∆yb, ub, vb                    | Ω      | no                  | FT
RL Egocentric | CFD, VS     | ∆xb, ∆yb, ub, vb, n · ∇ub, n · ∇vb  | Ω      | yes                 | FFT
RL Egocentric | CFD, VS     | ∆xb, ∆yb, ub, vb, t · ∇ub, t · ∇vb  | Ω      | yes                 | FFT
Sensor key: Flow F: (ub, vb); Target position T: (∆xb, ∆yb); Orientation X: θ
Formulating the Learning Problem We formulated the learning problem such that the policy π(a|o) outputs an action a, given a set of observations o, aimed at guiding the artificial agent to a target location across the wake (Sec. B). To reflect practical limitations on motion steering in biological and robotic systems [118, 32, 414], we considered the agent’s action a to control the rate of change ˙θ of its heading direction; the agent has no direct control over its heading angle θ. The policy π(˙θ|o) is learned by maximizing, through repeated interactions with the environment, a cumulative reward composed of a sparse reward given once the swimmer reaches the target and a dense reward given at every timestep equal to the change in distance to the target. Each training episode is initiated by randomly positioning the target (x⋆, y⋆) inside a circular region at one side of the wake and the agent (xo, yo) inside an equally-sized circular region at the opposite side of the wake, pointing in a random orientation θo (Fig. 4.2A). Each episode is initialized at a random time t_o, i.e., a random phase of the wake evolution.
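The formulation above can be sketched as a single environment step. This is a minimal sketch under stated assumptions, not the thesis code: the explicit Euler update, the capture radius, the size of the sparse bonus, and the sign convention of the dense reward (positive when the distance decreases) are all assumptions, and `flow` is a hypothetical callable returning the local fluid velocity.

```python
import math

def step(state, omega, flow, target, time, dt,
         V=0.8, capture_radius=0.1, bonus=100.0):
    """Advance the swimmer by one time step and return (new_state, reward, done).

    state  : (x, y, theta) position and heading of the swimmer
    omega  : action, the rate of change of heading (theta_dot)
    flow   : hypothetical callable flow(x, y, time) -> (u, v)
    target : (x_star, y_star)
    """
    x, y, theta = state
    u, v = flow(x, y, time)
    d_old = math.hypot(target[0] - x, target[1] - y)

    # Kinematics: the swimmer is advected by the flow and swims at speed V
    # along its heading; the action only changes the heading rate.
    x_new = x + dt * (u + V * math.cos(theta))
    y_new = y + dt * (v + V * math.sin(theta))
    theta_new = theta + dt * omega

    d_new = math.hypot(target[0] - x_new, target[1] - y_new)
    reward = d_old - d_new          # dense reward: change in distance to target
    done = d_new < capture_radius   # assumed definition of reaching the target
    if done:
        reward += bonus             # sparse reward on reaching the target
    return (x_new, y_new, theta_new), reward, done
```

In a uniform stream (−1, 0) with V = 0.8, a swimmer pointing upstream still drifts downstream at speed 0.2, so a target placed downstream yields a small positive dense reward at each step.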
[Figure 4.2 graphic. Panels A-F: vorticity field (color scale −3 to 3 × U/D) with swimmer trajectory and target; time-optimal and drift-optimal strategies in uniform flow; agent-environment loop (observations + rewards, actions, actor-critic neural network, CFD simulations of unsteady flows, ˙θ = π(o)); geocentric frame (ex, ey) and egocentric frame (t, n) with heading and course directions.]
Figure 4.2: Autonomous Underwater Navigation in Unsteady Flows. A. Unsteady flow generated by a uniform freestream flow U past a cylinder of diameter D at Re = 400. A swimmer moving at constant speed V = 0.8U must navigate the wake to reach the target (black star). Motion planning using time-optimal control (black trajectory) requires prior knowledge of the entire flow field and its time evolution. In a uniform background flow, the swimmer can, at best, B. optimize the time it takes to move cross-stream by orienting itself perpendicular to the flow, thus moving in the direction $\tan^{-1}(V/U)$, or C. optimize the distance it drifts downstream by orienting itself at an angle upstream, thus moving in the direction $\tan^{-1}\left(\frac{V/U}{\sqrt{1-V^{2}/U^{2}}}\right)$ (Sec. 4.1). D. To navigate across unsteady flows, we train, using Deep RL, a swimmer that senses the ambient flow and target location locally in either E. a terrestrial geocentric frame (ex, ey) or F. a body-fixed egocentric frame (t, n).
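The two course directions quoted for panels B and C can be evaluated for V = 0.8U. The sketch below is an illustration with hypothetical function names; the angle β, measured from the cross-stream direction toward upstream, is introduced here only to parameterize the heading and is not notation from the text.

```python
import math

def course_angle(beta, U, V):
    """Course angle relative to the stream axis for a heading tilted upstream
    by beta from the cross-stream direction (V < U assumed)."""
    downstream = U - V * math.sin(beta)   # net speed along the stream
    cross = V * math.cos(beta)            # net speed across the stream
    return math.atan2(cross, downstream)

def time_optimal_course(U, V):
    """Heading perpendicular to the flow: course direction arctan(V/U)."""
    return math.atan(V / U)

def drift_optimal_course(U, V):
    """Minimum downstream drift: course direction arctan((V/U)/sqrt(1 - V^2/U^2))."""
    ratio = V / U
    return math.atan(ratio / math.sqrt(1.0 - ratio ** 2))

U, V = 1.0, 0.8
time_deg = math.degrees(time_optimal_course(U, V))
drift_deg = math.degrees(drift_optimal_course(U, V))
```

For V/U = 0.8 the drift-optimal course is steeper than the time-optimal one, and it coincides with arcsin(V/U), which is the largest course angle attainable over all headings.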
4.3 Egocentric Learning Requires Sensing Flow Gradients
To assess the advantages and limitations of egocentric sensing, we compared learning based on egocentric
and geocentric observations. We asked, in the same fluid environment, which set of observations facilitates
learning the task of navigating across the unsteady wake with no prior knowledge of the fluid environment.
Learning with Geocentric Observations Starting from the same set of geocentric observations o = (∆x, ∆y, u, v) employed in [174], the swimmer failed to learn the navigation task. In [174], the swimmer learned successfully because it had direct control over its heading angle θ. To remedy this, and because these geocentric observations require implicit knowledge of the swimmer’s heading angle θ in the inertial frame, we allowed the swimmer to explicitly observe θ, thus augmenting the geocentric observations to o = (∆x, ∆y, θ, u, v) (Table 4.1). The policy converged in each of the 17 training sessions we conducted, with some variation in reward (Fig. 4.3A, Suppl. Table 4.2). To highlight the importance of flow sensing, we trained a flow-blind swimmer that observed only its own orientation and relative position to the target (∆x, ∆y, θ). The flow-blind swimmer failed to reach the target (Fig. 4.3A, 4.1), performing worse than the flow-blind swimmer in [174] because of the different action (˙θ versus θ) taken by the agents.

[Figure 4.3 graphic. Learning curves (rewards versus timesteps, ×10⁷) in CFD and VS wakes for A. geocentric and B. egocentric agents, including the flow-blind and flow-limited cases; success rate (%) for C. geocentric and D. egocentric policies.]
Figure 4.3: Learning Underwater Navigation Using Egocentric Observations Requires Sensing Flow Gradients. We trained RL policies with geocentric and egocentric observations in two flow environments: CFD and VS simulations. All policies underwent training of equal length (2 × 10⁷ timesteps for CFD and 0.5 × 10⁷ timesteps for VS). Learning curves represent the moving mean of cumulative rewards per episode, calculated over a window of 500 episodes. A. Geocentric observations: sensing (∆x, ∆y, θ) only, the flow-blind agent failed to learn in 6 instances of learning. Adding flow sensing abilities (u, v), the agent learned to navigate in CFD wake in all 17 instances of learning. Training in VS wake with the same observations succeeded in all 4 instances, with faster convergence. B. Egocentric observations: sensing (∆xb, ∆yb, ub, vb) only, the agent failed in CFD wake in all 10 instances of learning. Adding local flow gradients (n · ∇ub, n · ∇vb) resulted in successful learning in all 16 instances of learning in the CFD wake and 4 instances of learning in the VS wake. Success rate of each of the trained policies is evaluated over a distribution of 1000 randomly generated test conditions for C. geocentric and D. egocentric agent. The larger variance of success rates in CFD compared to VS wakes reflects that CFD flows are more challenging to navigate.
Table 4.2: Performance of trained RL policies. Testing the RL policies within the trained region using a total of 1000 randomly chosen test cases. Action Ω = ˙θ. Success rate is reported as a percentage. In CFD wake at Re = 400, we conducted 17 training instances for the geocentric policy, 16 for the egocentric policy, 10 for the egocentric flow-limited policy, and 6 for the geocentric flow-blind policy. In reduced-order VS wake, we conducted 6 training instances for each of the geocentric, egocentric, and egocentric flow-limited policies.
Success rate (%) is reported as best, worst, mean, and std.
Strategy      | Observations                        | Action | Environment | # Policies | best | worst | mean | std.
RL Geocentric | ∆x, ∆y, θ, u, v                     | Ω      | CFD         | 17         | 98.5 | 83.5  | 94.4 | 4.4
RL Geocentric | ∆x, ∆y, θ                           | Ω      | CFD         | 6          | 14.3 | 3.4   | 10.1 | 3.7
RL Egocentric | ∆xb, ∆yb, ub, vb                    | Ω      | CFD         | 10         | 29.6 | 0.7   | 17.0 | 10
RL Egocentric | ∆xb, ∆yb, ub, vb, n · ∇ub, n · ∇vb  | Ω      | CFD         | 16         | 99.3 | 70.9  | 88.8 | 9.9
RL Egocentric | ∆xb, ∆yb, ub, vb, t · ∇ub, t · ∇vb  | Ω      | CFD         | 2          | 76.5 | 61.5  | 69.0 | 7.5
RL Geocentric | ∆x, ∆y, θ, u, v                     | Ω      | VS          | 6          | 99.3 | 97.2  | 98.4 | 0.7
RL Egocentric | ∆xb, ∆yb, ub, vb                    | Ω      | VS          | 6          | 65.7 | 9.8   | 42.9 | 20.8
RL Egocentric | ∆xb, ∆yb, ub, vb, n · ∇ub, n · ∇vb  | Ω      | VS          | 6          | 98.9 | 68.9  | 90.1 | 12.1
Learning with Egocentric Observations We next trained the swimmer using the same set of observations taken in the body frame, o = (∆xb, ∆yb, ub, vb). This flow-limited swimmer failed to learn (Fig. 4.3B). When, in addition, we provided the swimmer with the ability to sense the transverse flow gradient (n · ∇ub, n · ∇vb), that is, when considering an augmented set of six egocentric observations o = (∆xb, ∆yb, ub, vb, n · ∇ub, n · ∇vb), the policy converged in each of the 16 training sessions, reaching a reward as high as that of the geocentric policy (Fig. 4.3B, Table 4.2). Egocentric learning is also possible when augmenting the local observations to sense the longitudinal flow gradients t · ∇ub and t · ∇vb in the direction of motion of the agent (Suppl. Table 4.2). Sensing flow gradients is thus essential for egocentric autonomous underwater navigation in unsteady environments.
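One way to picture the transverse gradient observation is as a difference between two flow sensors offset along n. The sketch below assumes a central finite difference with a hypothetical sensor spacing h and a hypothetical callable `flow(x, y)`; how the gradients are actually evaluated in the thesis is not stated in this section.

```python
import math

def egocentric_flow_observation(flow, x, y, theta, h=0.05):
    """Return (u_b, v_b, n.grad(u_b), n.grad(v_b)) at the swimmer's location.

    flow : hypothetical callable flow(x, y) -> (u, v) in the inertial frame
    h    : assumed half-spacing between two transverse flow sensors
    """
    t = (math.cos(theta), math.sin(theta))
    n = (-math.sin(theta), math.cos(theta))  # assumed: t rotated by +90 degrees

    def body_velocity(px, py):
        u, v = flow(px, py)
        return (u * t[0] + v * t[1], u * n[0] + v * n[1])

    ub, vb = body_velocity(x, y)
    # Sample the flow on either side of the body along the transverse direction n.
    ub_p, vb_p = body_velocity(x + h * n[0], y + h * n[1])
    ub_m, vb_m = body_velocity(x - h * n[0], y - h * n[1])
    dub_dn = (ub_p - ub_m) / (2.0 * h)
    dvb_dn = (vb_p - vb_m) / (2.0 * h)
    return (ub, vb, dub_dn, dvb_dn)
```

In a linear shear flow u = (−1 + 0.5y, 0), a swimmer aligned with ex recovers the shear rate 0.5 as its transverse gradient of ub, a signal that is absent in a uniform stream.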
Requirement for Sensing Flow Gradients is Independent of Wake Model To further substantiate
our conclusion that sensing flow gradients at the swimmer’s scale is necessary for egocentric point-to-point
navigation in coherent flows, we repeated our reinforcement learning methodology using a different model
of the fluid environment. Namely, we emulated the CFD wake using a well-known inviscid vortex street
(VS) model consisting of two infinite rows of equal-strength, opposite-sign point vortices [394]. Training
in this reduced-order representation of the flow field, we arrived at the same result: egocentric learning is
not possible without the additional observations of either longitudinal or transverse flow gradients.
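For readers unfamiliar with the inviscid vortex street, the velocity induced by two infinite rows of point vortices has a closed form based on the cotangent. The sketch below uses the textbook expression for one infinite row and superimposes two staggered rows on a uniform stream (−U, 0). The parameter values, the spacing ratio b/a = 0.28, the frozen (time-independent) snapshot, and the function names are assumptions for illustration, not the settings used in the thesis.

```python
import cmath
import math

def row_velocity(z, z0, gamma, a):
    """Conjugate velocity u - i*v at z induced by an infinite row of point
    vortices of circulation gamma, spaced a apart, with one vortex at z0."""
    s = math.pi * (z - z0) / a
    return gamma / (2j * a) * (cmath.cos(s) / cmath.sin(s))

def street_velocity(x, y, gamma=1.0, a=1.0, b=0.28, U=1.0):
    """Fluid velocity (u, v) for a staggered street superimposed on (-U, 0).

    The upper row (y = +b/2) carries +gamma and the lower row (y = -b/2)
    carries -gamma, shifted by a/2, so that the induced flow between the
    rows opposes the freestream (assumed sign convention).
    """
    z = complex(x, y)
    w = row_velocity(z, complex(0.0, b / 2), gamma, a)
    w += row_velocity(z, complex(a / 2, -b / 2), -gamma, a)
    return (-U + w.real, -w.imag)
```

Far from the street the two rows cancel and the freestream is recovered, while between the rows the streamwise average of the induced velocity is gamma/a against the freestream, which is the weaker flow that a slaloming swimmer can exploit.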
Training sessions in the VS environment converged faster than in the CFD environment (Fig. 4.3A,B),
with similar convergence trends across multiple training sessions: the geocentric policy learned faster
while the egocentric policy was capable of reaching equally high rewards but with slightly larger training
variance and longer convergence time.
Flow Observations Explain Why Flow Gradients are Necessary for Egocentric Navigation The
trained agent, whether using geocentric or egocentric observations and whether trained in CFD or VS
wake, followed the three stereotypical stages of navigation across an unsteady wake in strong currents:
entering the wake, slaloming between vortices to swim upstream, and exiting the wake upstream of the
target (Fig. 4.4A,B, Suppl. Movie 1). We plotted the corresponding trajectories in the flow observation subspaces consisting of (u, v) for the geocentric agent and (ub, vb) and (n · ∇ub, n · ∇vb) for the egocentric
agent (Fig. 4.4D-F). To distinguish the flow sensing cues in each of the three stages of navigation, we
highlighted the wake entry, zigzag within, and exit stages.
Upstream motions require the agent’s velocity in the upstream direction x˙ · ex = ex · (u + V t) to be positive. This streamwise velocity can only be positive when the agent is within the wake. For the geocentric agent with direct access to u = ex · u, it suffices that u + V ≥ 0 for the agent to unambiguously determine that it is inside the wake. Indeed, as the geocentric agent slalomed between vortices in the physical space, its motion induced periodic oscillations in the observation space for which u + V ≥ 0, reflecting that the geocentric agent learned the sensory cue u + V ≥ 0 to stay inside the wake and move upstream (Fig. 4.4D).
The egocentric agent also learned to enter the wake and change direction to stay in the wake to satisfy x˙ · t = ub + V ≥ 0 (Fig. 4.4E), but this condition alone guarantees neither upstream motion nor that the agent is located within the wake. For example, an agent located outside the wake and pointing downstream also satisfies this condition. Therefore, the agent needs additional observations of
flow gradients (Fig. 4.4F) to determine when it is inside the wake. To further support this claim, we repeated the egocentric training with the agent tasked to swim in the upstream direction, with no specific
target position, starting from initial locations inside the wake. The agent failed to learn by observing only
fluid velocities (ub, vb) without additional observations of flow gradients such as (n · ∇ub, n · ∇vb) or
(t · ∇ub, t · ∇vb) or both.
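The ambiguity of the egocentric cue can be made concrete with the counterexample given above: a swimmer outside the wake, in the freestream (−U, 0), pointing downstream. The sketch below evaluates both cues with U = 1 and V = 0.8; the function name is hypothetical.

```python
import math

U, V = 1.0, 0.8

def cues(u, theta):
    """Return (geocentric cue, egocentric cue, upstream speed) for fluid velocity u.

    geocentric cue : u . e_x + V
    egocentric cue : u . t + V  (equals x_dot . t)
    upstream speed : e_x . (u + V t)  (equals x_dot . e_x)
    """
    t = (math.cos(theta), math.sin(theta))
    ub = u[0] * t[0] + u[1] * t[1]
    return (u[0] + V, ub + V, u[0] + V * t[0])

# Outside the wake, pointing downstream (theta = pi): the egocentric cue is
# satisfied even though the swimmer is moving downstream.
geo_cue, ego_cue, upstream_speed = cues((-U, 0.0), math.pi)
```

Here the egocentric cue evaluates to U + V > 0 while the upstream speed is −(U + V) < 0, whereas the geocentric cue is negative for any heading outside the wake, so only the geocentric cue identifies the wake unambiguously.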
Next, we considered the 1000 test cases employed in Fig. 4.3, counted the cases that reached each target, and interpolated the success rate over a regular grid spanning the target training domain. The policies
trained and tested in the same wake, whether in CFD or VS, achieved nearly 100% success (Fig. 4.5A,B),
consistent with Fig. 4.3C,D. In Fig. 4.5, we plotted, for all 1000 test cases, the likelihood of encountering
specific flow observations as a red colormap on the space of observations consisting of (u, v) for the geocentric agent and (ub, vb) and (n · ∇ub, n · ∇vb) for the egocentric agent. The biggest difference appeared
in the egocentric observations of velocity gradient – the CFD wake offered much richer signals of transverse flow gradients, while the gradients in the VS wake were more concentrated. These flow gradients
are essential for an egocentric navigator to differentiate its location within or outside the wake.
4.4 Egocentric Policies are More Robust to Transfer to New Flow Environments
Transfer from Low- to High-Fidelity Flow Environments In Fig. 4.4C, we tested the RL policies trained in VS wake when placed in the CFD wake. Remarkably, both geocentric and egocentric policies succeeded in entering the wake, zigzagging between vortices to swim upstream, and even exiting the wake at an appropriate time and location. With the egocentric policy, after the swimmer missed the target by a small distance on its first attempt, it swam back into the wake and continued to navigate upstream (Suppl. Movie 2). This remarkable adaptive behavior shows that egocentric policies are resilient and robust to perturbations, and it has two important implications for applying transfer learning techniques in underwater environments [351]. First, it shows that the agent continues to perform reliably in unseen environments and avoids actions that put it at risk [222, 448]. Second, and importantly, it allows the agent to continue to collect and update its observations, which is a key factor in the success of transfer learning [437]. These findings will open new opportunities for bridging the gap between simulations and real environments [351, 173] using lifelong learning algorithms [353].

[Figure 4.4 graphic. Physical-space trajectories: A. train and test in CFD (Re = 400); B. train and test in VS (vortex street); C. train in VS, test in CFD (Re = 400). Observation-space trajectories for each case: D. u · ey versus u · ex; E. u · n versus u · t; F. n · ∇(u · n) versus n · ∇(u · t); the regions u · ex + V > 0 and u · t + V > 0 are marked.]
Figure 4.4: Trajectories of Trained Agents in Physical and Flow Observation Spaces. Trajectories are shown in black (geocentric) and blue (egocentric) for the same initial conditions and target location. A. Both agents, trained in high-fidelity CFD wake at Re = 400, learn to enter, slalom inside, and exit the wake. Agents trained in a reduced flow representation consisting of a von Kármán vortex street (VS) succeed when tested in B. the reduced VS wake and C. the high-fidelity CFD wake (Suppl. Movies 1 & 2). Corresponding trajectories in the spaces of flow observations D. (u, v) for the geocentric agent, E. (ub, vb) and F. (n · ∇ub, n · ∇vb) for the egocentric agent. The three stages of navigation are marked using dark red for wake entry, dark blue for wake exit, black (geocentric) and light blue (egocentric) for slaloming inside the wake.
[Figure 4.5 graphic. Rows A-C: success-rate maps (0 to 100%) and likelihood maps of flow velocity, (u · ex, u · ey) and (u · t, u · n), and of flow gradient, (n · ∇(u · t), n · ∇(u · n)); the regions u · ex + V > 0 and u · t + V > 0 are marked.]
Figure 4.5: Flow Observations and Transfer from Low- to High-Fidelity Flow Representations. Green colormap shows the success in reaching a target based on the 1000 random test cases; sample trajectories from Fig. 4.4 are superimposed; red colormaps show the likelihood of flow observations for all 1000 test cases. Geocentric and egocentric agents A. trained and tested in CFD wake at Re = 400, B. trained and tested in VS wake, and C. trained in VS and tested in CFD wake. Likelihood plots of observations of relative target locations are provided in Fig. 4.6.
We next tested the VS-trained policy in the CFD wake using all 1000 test cases (Fig. 4.5C). The geocentric policy outperformed the egocentric policy because the latter had difficulty reaching targets farther from the wake on the first approach. This difficulty is due to inaccuracy in the exit conditions. But even when the agent missed the target, it re-entered the wake and tried again (Fig. 4.4C, Suppl. Movie 2). The analysis in observation space (Fig. 4.5C) emphasizes that aspects of the task, such as entering and zigzagging between vortices, are more robust to transfer from low- to high-fidelity flows, while exiting the wake is more sensitive to flow gradients. Therefore, a divide-and-conquer approach, say using curriculum learning [38, 453], may optimize transfer learning in underwater navigation by breaking up the policy into sub-tasks and focusing on improving the most challenging aspect (here accurate exit conditions) in higher-fidelity flow environments [159].
[Figure 4.6 graphic. Panels A-F: histograms of target position, (∆x, ∆y) or (∆xb, ∆yb); agent orientation θ (−π to π); flow velocity, (u · ex, u · ey) or (u · t, u · n); and flow gradient, (n · ∇(u · t), n · ∇(u · n)).]
Figure 4.6: Distribution of observations when generalizing from VS wake to CFD wake. Observations collected by geocentric and egocentric agents, with histograms of the geocentric observations (∆x, ∆y), (u, v) and θ shown in the left three columns and of the egocentric observations (∆xb, ∆yb), (ub, vb) and (∂ub/∂yb, ∂vb/∂yb) in the right three columns. A., B. observations collected by a VS-trained agent tested in the same wake. C., D. observations collected by a VS-trained agent tested in Re = 400 CFD wake. Obvious similarities and differences exist between observations collected in these different environments, with velocity gradients being remarkably different. E., F. observations collected by a CFD-trained agent (at Re = 400) when tested in the same wake.
[Figure 4.7 graphic. Panels A-F: histograms of target position, (∆x, ∆y) or (∆xb, ∆yb); agent orientation θ (−π to π); flow velocity, (u · ex, u · ey) or (u · t, u · n); and flow gradient, (n · ∇(u · t), n · ∇(u · n)).]
Figure 4.7: Distribution of observations in wakes of different Reynolds numbers. Observations collected by geocentric and egocentric agents when tested in different wakes, with histograms of the geocentric observations (∆x, ∆y), (u, v) and θ shown in the left three columns and of the egocentric observations (∆xb, ∆yb), (ub, vb) and (∂ub/∂yb, ∂vb/∂yb) in the right three columns. Observations are collected by CFD-trained agents (at Re = 400) when tested in A., B. Re = 200 CFD wake not seen during training; C., D. the same CFD wake used during training; and E., F. Re = 1000 CFD wake not seen during training. Observations exhibit similarities in range in almost all quantities with the most notable differences in velocity gradients. White-to-red colormap indicates the frequency of observations.

Egocentric Policies Transfer Better to Different Reynolds Numbers We next tested the policies learned in CFD at Re = 400 in novel CFD wakes, ranging from Re = 200 to 1000, not seen during training. In Fig. 4.8A, B, we show sample trajectories at Re = 200 and 1000. Both geocentric and egocentric policies succeeded, albeit with some struggle at higher Re where the vortex wake became unstable. In Fig. 4.8C, we report the performance of all policies trained in CFD (Re = 400) as Re varied. Namely, we tested each of the 17 geocentric and 16 egocentric policies on the 1000 random cases at each of the six values of Re, a total of 198,000 tests. We found that at lower Reynolds numbers (Re = 200, 300), egocentric policies generalized better in a statistically significant manner. At higher Reynolds numbers, the performance of both geocentric and egocentric policies declined, and the difference in success rate between the two became less significant statistically. The inability of both egocentric and geocentric policies to keep up with flows at higher Re is caused by the changing nature of the wake: at lower Re, the downstream wake is stable, but at higher Re, the wake loses stability [219], as evident from the cross-stream motion of the vortices in Fig. 4.8B, introducing truly novel physical challenges to deal with. Notwithstanding the changing flow physics, the gradual degradation of performance with Re (Fig. 4.8) suggests ample opportunity for updating the learned policy as Re increases using lifelong learning [353].
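The test count and the kind of two-sample comparison used for Fig. 4.8C can be illustrated from numbers already reported. The sketch below recomputes the total number of tests and evaluates a Welch-type t statistic from the Re = 400 summary statistics in Table 4.2 (geocentric: mean 94.4, std 4.4, 17 policies; egocentric with transverse gradients: mean 88.8, std 9.9, 16 policies). The choice of the unequal-variance (Welch) variant and the treatment of the reported std as a sample standard deviation are assumptions; the per-Reynolds-number data behind Fig. 4.8C are not reproduced here, and no p-value is computed.

```python
import math

def welch_t(mean1, std1, n1, mean2, std2, n2):
    """Welch two-sample t statistic from summary statistics."""
    standard_error = math.sqrt(std1 ** 2 / n1 + std2 ** 2 / n2)
    return (mean1 - mean2) / standard_error

# Total number of tests: (17 + 16) policies x 1000 cases x 6 Reynolds numbers.
n_tests = (17 + 16) * 1000 * 6

# Geocentric versus egocentric success rates at Re = 400 (Table 4.2, CFD rows).
t_stat = welch_t(94.4, 4.4, 17, 88.8, 9.9, 16)
```

Turning the statistic into a p-value additionally requires the Student t distribution with the appropriate degrees of freedom, which is omitted from this sketch.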
The likelihood of flow observations based on all tests at Re = 200, Re = 400, and Re = 1000 is shown
in Fig. 4.7. The space of flow observations exhibited similar features to those reported in Fig. 4.5 with
periodic oscillations in the space of flow observations reflecting zigzagging motion of the agent inside
the wake. However, the magnitude of flow signals differed with Re number. Transfer to lower Re means
weaker flows and requires interpolation of flow observations acquired during training. Transfer to higher
Re means stronger flows and requires extrapolation of flow observations, which is notoriously difficult for
learning-based models [242]. Consideration of the physics of the flow environment is thus of paramount
importance in assessing the limitations of transfer learning to novel flows.
4.5 Egocentric Policies are Rotationally-Invariant and Better Adapt to
Novel Conditions Unexplored During Training
To further assess the advantages and limitations of the egocentric and geocentric policies, we subjected
them to two challenges. First, we tested both policies under superimposed rotations to the entire flow
field that introduced misalignment between the wake direction and the inertial frame. Second, with the
reference frame properly aligned with the wake and in the same flow used during training, we subjected
both policies to novel conditions unexplored during training.
Egocentric Policies Naturally Obey Rotational Symmetries To emphasize the distinction between
the egocentric and geocentric navigators, we gradually rotated the CFD wake relative to the inertial frame
(ex, ey), and at each degree of misalignment between the wake and the inertial frame, we tested the
performance of the trained agent considering initial conditions and target locations in the same domains
relative to the wake as those explored during training (Fig. 4.8D,E, Suppl. Movie 1). The performance of
the geocentric policy degraded rapidly with increasing misalignment between the wake and the reference
frame, while the egocentric policy, by construction, maintained its high performance at any degree of
misalignment. These results demonstrate that the egocentric policy is rotationally symmetric – that is,
invariant to the absolute orientation of the wake – while the geocentric policy requires a priori knowledge
of the alignment between the wake and the inertial frame.
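[Figure 4.8 graphic. A, B: sample trajectories at Re = 200 and Re = 1000. C: box plots of success rate (%) versus Reynolds number (200, 300, 400, 500, 600, 1000) for egocentric (Ego) and geocentric (Geo) policies, with significance markers. D: trajectories at Re = 400 with the inertial frame (ex, ey) misaligned with the wake. E: success rate (%) versus rotation angle from −π to π.]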
Figure 4.8: Transfer to Novel Reynolds Numbers and Policy Invariance under Rotational Symmetry. Agents trained at Re = 400 succeed when tested at A. Re = 200 and B. Re = 1000. C. Success rates of
geocentric and egocentric agents trained in CFD wake at Re = 400 and tested across a range of Reynolds
numbers are summarized using box plots, where the median, lower and upper quartiles are indicated with
horizontal bars, and outliers are marked by ‘×’. All 17 geocentric and 16 egocentric policies are included,
each tested over 1000 test cases. To evaluate the difference in performance between the egocentric and
geocentric policies, a two-sample t-test is used [426, 145]. The null hypothesis states that there is no significant difference in success rates. A smaller p-value indicates stronger evidence against the null hypothesis,
suggesting a more significant difference in performance. *p < 0.05, **p < 0.01, ***p < 0.001. Likelihood
plots of visual and flow observations collected at Re = 200, 400 and 1000 are provided in Fig. 4.7. D.
Agents trained at Re = 400 and tested in the same wake under −30◦ misalignment between the wake and
inertial frame. E. With increasing misalignment, the success rate of the geocentric agent quickly drops to
nearly zero, whereas the performance of the egocentric agent is invariant to such rotations.
Invariance to rotations and successfully reaching the target irrespective of the direction of the unsteady
currents is a major advantage of egocentric learning; in geocentric learning, an incorrect estimate or a
change in flow direction would require a re-training of the policy.
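The rotational invariance of egocentric observations follows directly from their definition as projections onto the body frame. The sketch below rotates an entire scene (flow velocity, relative target position, and heading) by an angle α and compares the two observation vectors before and after; the function names and the sample numbers are hypothetical.

```python
import math

def rotate(vec, alpha):
    """Rotate a 2D vector counterclockwise by alpha."""
    c, s = math.cos(alpha), math.sin(alpha)
    return (c * vec[0] - s * vec[1], s * vec[0] + c * vec[1])

def geocentric_obs(u, dx, theta):
    """Observations expressed in the inertial frame (ex, ey)."""
    return (dx[0], dx[1], theta, u[0], u[1])

def egocentric_obs(u, dx, theta):
    """Observations expressed in the body frame (t, n)."""
    t = (math.cos(theta), math.sin(theta))
    n = (-math.sin(theta), math.cos(theta))  # assumed: t rotated by +90 degrees
    return (dx[0] * t[0] + dx[1] * t[1], dx[0] * n[0] + dx[1] * n[1],
            u[0] * t[0] + u[1] * t[1], u[0] * n[0] + u[1] * n[1])

# Hypothetical scene, then the same scene rotated rigidly by alpha.
u, dx, theta, alpha = (-1.0, 0.2), (3.0, -1.0), 0.3, 1.1
u_rot, dx_rot, theta_rot = rotate(u, alpha), rotate(dx, alpha), theta + alpha

ego_before = egocentric_obs(u, dx, theta)
ego_after = egocentric_obs(u_rot, dx_rot, theta_rot)
geo_before = geocentric_obs(u, dx, theta)
geo_after = geocentric_obs(u_rot, dx_rot, theta_rot)
```

The egocentric vectors agree to rounding error, so a policy acting on them returns the same action in the rotated scene, while the geocentric inputs change with α and fall outside what the policy saw during training.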
RL Policies Tested Outside the Training Domain In Fig. 4.9A, B, we tested the behavior of both
policies starting at locations upstream of the target. The agent with geocentric observations failed immediately and headed outside the wake. The egocentric agent performed better; it initially turned toward
the target, and when it missed, it went back into the wake, zigzagged through the vortices, and exited
the wake to locate the target. This impressive robustness to new conditions is a hallmark of egocentric
policies; it emphasizes that the policy itself acts as a resilient feedback controller with built-in redundancy
that ensures functionality even in the event of failure. When placed downstream of the target (Fig. 4.4C),
both egocentric and geocentric policies performed well at moderate downstream locations, but further
downstream, the egocentric policy failed first. The failure occurred despite the agent’s attempt to enter
the wake and engage with the vortices.
Systematic Assessment of Policy Performance Outside the Training Domain We next systematically challenged the swimmer to reach a fixed target located at the center of the training domain starting
from any initial position in the flow field, including at locations unexplored during training (Fig. 4.9). To
standardize these tests, we initialized the agent’s position on a regular grid over the entire fluid domain
and, at each grid point, we initialized the agent orientation using 36 distinct initial orientations evenly distributed from 0° to 360°. We fixed the initial phase t_o of the flow. In total, we performed 34,020 test cases
per policy. As expected, both geocentric and egocentric policies almost surely succeeded when starting
within the training domain (Fig. 4.9A, B). The egocentric policy generalized better upstream of the training
domain (90% success of egocentric policy versus 55% success of geocentric policy). The geocentric policy
performed better at downstream locations (63% success of geocentric policy versus 47% success of egocentric policy), but both policies reached a limit beyond which they failed (straight grey lines in Fig. 4.9A, B).
Failure downstream of the training domain occurred before physical limitations due to viscous decay of
the vortex structures were reached, reflecting the limitations of the policies themselves.
4.6 Interpretation of Underwater RL Policies
Policy Limitations versus Limitations due to Flow Physics Intuitively, when placed directly upstream of the target, we expect even a naive navigator that orients toward the target, ignoring entirely the
[Figure 4.9 graphic: panels A–F; color scale: success rate (%), 0 to 100; inset axes θ and ˙θ marking the preferred direction; upstream and downstream directions labeled.]
Figure 4.9: Transfer to Locations outside Training Domain and Interpretation of RL Policies. A.
Geocentric and B. egocentric agents starting from initial conditions unseen during training: geocentric
agents fail upstream of the target location but outperform egocentric agents moderately downstream of
the target (Suppl. Movie 3). Success rates of C. geocentric and D. egocentric agents reaching a fixed
target (∗) starting anywhere in the wake (green colormap) with 100% success of both policies within the
training domain (black circle), 58% and 66% in favor of egocentric policy outside the training domain, and
overall 60% and 68% success across the entire domain. Both policies fail downstream: solid lines marking
the failure of the geocentric policy align with the direction of the “time-optimal” strategy (Fig. 4.2C), and
those of the egocentric policy align with the direction of the “drift-optimal” strategy (Fig. 4.2B). The field
of “preferred orientations” defined by the stable fixed points of the average policy for E. geocentric and
F. egocentric agents explains the behavior of the trained agent inside and outside the training domain.
Preferred orientations that align with time-optimal and drift-optimal strategies are highlighted in green
and orange, respectively.
flow field, to reach the target simply by drifting downstream (SI, Fig. 4.1A). A slightly savvier navigator,
aware of only the background uniform flow U, could exploit this flow to reach the target from a broader
range of upstream locations (SI, Fig. 4.1B). But the geocentric policy has no such “intuition” of the flow. Its
failure at upstream locations is due to policy limitations. When collecting observations in an inertial frame,
upstream situations are novel to the policy. The egocentric agent performs better because the self-centric
view of the flow and target provides a richer set of observations.
Far downstream, we expect flow physics to impose limits on what is achievable by even the savviest
agent. Vortices decay downstream. This viscous diffusion is best illustrated by the Oseen
solution, in which an initially concentrated vortex decays due to viscosity. Far downstream, the
wake’s streamwise velocity u approaches the freestream velocity U and the flow exhibits weaker gradients,
prohibiting the swimmer from exploiting the wake to move upstream. We thus expect the performance of
both policies to deteriorate as the vortex intensity decreases, with a faster drop in the performance of the
egocentric policy because it relies on flow gradients to navigate and is thus more disadvantaged in flows
with weaker gradients.
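To make the viscous decay concrete, the sketch below evaluates the azimuthal velocity of the Lamb–Oseen vortex, u_θ(r, t) = Γ/(2πr) [1 − exp(−r²/(4νt))], and tracks its peak over time. The circulation Γ, viscosity ν, and sampling parameters are arbitrary illustrative values, not parameters of the simulations reported here, and `peak_velocity` is a hypothetical helper.

```python
import math


def lamb_oseen_velocity(r, t, gamma=1.0, nu=1e-3):
    """Azimuthal velocity of a Lamb-Oseen vortex at radius r > 0 and time t > 0."""
    return gamma / (2.0 * math.pi * r) * (1.0 - math.exp(-r * r / (4.0 * nu * t)))


def peak_velocity(t, gamma=1.0, nu=1e-3, n=2000, r_max=2.0):
    """Largest azimuthal velocity over a uniform sample of radii in (0, r_max]."""
    radii = [r_max * (i + 1) / n for i in range(n)]
    return max(lamb_oseen_velocity(r, t, gamma, nu) for r in radii)


# The peak velocity scales as 1/sqrt(t): quadrupling the age halves the peak.
p1 = peak_velocity(1.0)
p4 = peak_velocity(4.0)
p16 = peak_velocity(16.0)
print(p1 / p4, p4 / p16)
```

In a wake, an older vortex is one found farther downstream, so the weakening peak velocity corresponds to the weaker gradients available to the swimmer at downstream locations.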
Dynamical Systems Interpretation of RL Policies To elucidate the reasons for the disparity in performance between geocentric and egocentric policies, we analyzed these policies using tools from dynamical
systems theory [13, 422, 218, 184]. From the dynamical systems perspective, the average policy defines
a deterministic function ˙θ = π(o), and this averaged rotational dynamics forms a “dynamical flow field”
over the phase space of action and observations. This is a high-dimensional space that prohibits direct
visualization of the policy and complicates the analysis of its stability and convergence to the target position [218, 184]. Luckily, at a given location (x, y) and phase t, the observations depend on the agent’s
heading direction, and the average policy can be viewed as a dynamical system ˙θ = π(o(θ)) over the phase
space (θ, ˙θ). We thus defined the field of preferred directions θp as follows: of all potential orientations θ at
a given location (x, y) and phase t, a preferred direction is a stable equilibrium of the dynamical system
˙θ = π(o(θ)), that is, an orientation at which the average policy vanishes, π(o(θ)) = 0, and its derivative with respect to θ is
negative, ∂ ˙θ/∂θ < 0 (Fig. 4.9C, inset).
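The definition of the preferred direction can be sketched numerically as a search for downward zero crossings of the average policy over the heading angle. The two policies below are toy stand-ins chosen for illustration, not the trained networks, and `preferred_directions` is a hypothetical helper; in practice, the callable would evaluate the average policy on observations generated at a fixed location and phase.

```python
import math


def preferred_directions(policy, n=720):
    """Headings in [0, 2*pi) where policy(theta) crosses zero with negative slope."""
    thetas = [2.0 * math.pi * k / n for k in range(n + 1)]
    values = [policy(th) for th in thetas]
    stable = []
    for k in range(n):
        # A change from positive to non-positive brackets a stable equilibrium.
        if values[k] > 0.0 and values[k + 1] <= 0.0:
            lo, hi = thetas[k], thetas[k + 1]
            for _ in range(50):  # bisection refinement of the crossing
                mid = 0.5 * (lo + hi)
                if policy(mid) > 0.0:
                    lo = mid
                else:
                    hi = mid
            stable.append(0.5 * (lo + hi) % (2.0 * math.pi))
    return stable


def toy_policy(theta, theta_target=1.0):
    """Toy turning rate with one stable heading at theta_target."""
    return -math.sin(theta - theta_target)


def ambiguous_policy(theta):
    """Toy turning rate with two stable headings, at 0.5 and 0.5 + pi."""
    return -math.sin(2.0 * (theta - 0.5))


single = preferred_directions(toy_policy)
double = preferred_directions(ambiguous_policy)
print(single, double)
```

A location where the search returns more than one heading corresponds to a location drawn with multiple arrows in the field of preferred directions.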
In Fig. 4.9C, D, we plotted the preferred directions over the entire domain for the geocentric and egocentric policies. Locations with multiple arrows imply multiple preferred directions. Generalizability of
the egocentric policy upstream of the target (Fig. 4.9B) correlates with its tendency to have multiple preferred directions at these locations, increasing the chance of taking a correct action. In contrast, at these
locations, the geocentric policy instructs the agent to confidently take action towards a single preferred
direction that does not lead to the target (Fig. 4.9A, C), exhibiting a worst-case scenario in decision-making.
This explains why the geocentric policy behaves much worse than the naive policy at upstream locations.
Downstream and outside the wake, the geocentric policy tries to minimize either the downstream drift
(Fig. 4.9C, orange arrows), the time to reach the wake (Fig. 4.9C, green arrows), or a trade-off between both.
The egocentric policy mostly instructs the agent to move into the wake while minimizing downstream drift
(Fig. 4.9D). Downstream and inside the wake, however, the preferred directions of the geocentric policy
clearly favor upstream motion (Fig. 4.9C), while the egocentric policy exhibits multiple preferred directions
that confuse the agent and lead to failure (Fig. 4.9D). This explains why the performance of the egocentric
policy deteriorates faster than that of the geocentric policy at downstream locations. The ability of the
geocentric policy to unambiguously favor upstream directions inside the wake can be attributed to the
fact that it has knowledge of the agent’s orientation relative to the orientation of the wake (through the
a priori knowledge of the wake alignment with the inertial frame (ex, ey) and the agent’s observation θ),
whereas the egocentric policy does not. A direct comparison of the trajectories in Fig. 4.9A, B with the vector
field of preferred directions in Fig. 4.9C, D makes clear that, even when the egocentric agent is initially placed at
a location with an unambiguous preferred direction and tries to enter and engage with the wake, failure
occurs as the agent moves into locations with multiple preferred directions.
This analysis has several important implications. It shows that ambiguity in the preferred direction
is favorable when flow physics acts in concert with the desired task (upstream locations), but ambiguity
is detrimental when flow physics challenges the desired task (downstream locations). It also shows that
the application of tools rooted in dynamical systems theory unveils promising paths for evaluating and
interpreting the behavior of machine-learned policies.
4.7 Summary
We investigated a fundamental problem of underwater navigation within a flow regime of direct relevance
to medium-scale robotic underwater vehicles [523, 298]. At these scales, underwater navigation often
involves interactions with unsteady wakes consisting of persistent and coherent vortex structures at intermediate Reynolds numbers. Learning to enter, slalom within, or exit such flows is thus essential for
any underwater robotic mission involved in ocean exploration and surveillance [263, 174, 298, 173]. We
analyzed, using a combination of physics-based simulations and reinforcement learning methods, the feasibility of robot-centric learning in such flow environments. Unlike existing learning studies that require
inertial observations [174, 298], say with the help of a satellite, or continuous measurements of a global
direction of gravity [320] or wind [414], in robot-centric learning, observations are collected in the robot’s
own world, through on-board sensors, with no a priori or acquired knowledge of a global flow direction
or inertial frame of reference.
Our study demonstrated that (1) learning underwater navigation from a robot-centric perspective is
feasible provided that the robotic agent senses local flow velocities and local flow gradients; (2) robot-centric policies respect physical symmetries and are invariant to flow rotations; (3) robot-centric policies
exhibit adaptive behavior in unknown environments, allowing the robot to re-enter the wake and try
again when missing the target, and (4) robot-centric policies facilitate transfer learning from reduced to
high-fidelity flow environments and between different Reynolds number flows.
Our analysis of the sensory requirements for autonomous underwater navigation (Table 4.1) indicates
that self-centric sensing in the agent’s own world eliminates potential time delays and computations inherent to assessing inertial signals [298, 89] at the expense of requiring more sensors to observe spatial
variations in the flow field at the scale of the navigator. We envision that, to learn more complex and
diverse navigation tasks in future underwater [298, 173] and aerial [37, 388] robotic applications, flow
gradients measured at multiple locations and directions [85, 84], using a distributed array of flow sensors [82] along the swimmer, and supplemented by the ability to remember and update a history of flow
observations [465, 82, 86, 414] might be necessary. These directions will be investigated in future work.
In addition to its implications for robotic systems, our study opens avenues for understanding the link
between flow sensing and behavior in biological systems [309, 414]. Aquatic organisms that interact with
coherent vortex structures have bilateral arrays of flow sensors suited for computing flow gradients [391,
85], e.g., fish lateral line system [247, 305, 366] and harbor seal whiskers [106]. The methods we propose
here offer an exciting opportunity for future studies that unravel how flow sensing abilities in aquatic
organisms have been shaped, not only by the size and morphology of the organism, but also by the flow
environment it navigates.
Taken together, our work establishes promising directions for learning, from a robot-centric perspective, in dynamically changing physical environments, provides systematic analyses particularly suited for
bridging the gap between simulations and real-world environments, and opens avenues for future investigation of the mapping between environmental conditions and sensory requirements in biological and
robotic systems.
Chapter 5
Hydrodynamics of groups of multiple flapping swimmers
5.1 Mathematical models of flow-coupled flapping swimmers
Inspired by the experiments of [339, 266], we study self-organization in the context of flapping swimmers,
coupled passively via the fluid medium, with no mechanisms for visual [305, 176, 141, 116, 281], flow
sensing [129, 391, 85, 184], or feedback control [466, 218, 265] (Fig. 5.1). The swimmers are rigid, of finite
body length L and mass per unit depth m, and undergo pitching oscillations of identical amplitude A
and frequency f in the (x, y)-plane of motion, such that the pitching angle for swimmer j is given by
θj = A sin(2πft + ϕj), j = 1, 2, ..., N, where N is the total number of swimmers. In pairwise interaction,
we set ϕ1 = 0 and ϕ2 = −ϕ, with ϕ being the phase lag between the oscillating pair. We fixed the lateral
distance ℓ between the swimmers to lie in the range ℓ ∈ [−L, L], and allowed the swimmers to move freely
in the x-direction in an unbounded two-dimensional fluid domain of density ρ and viscosity µ.
When unconstrained, the swimmers may drift laterally relative to each other, as illustrated in dipole
models [444, 228] and high-fidelity simulations of undulating swimmers [155, 466]. However, this drift
occurs at a slower time scale than the swimming motion, and can, in principle, be corrected by separate
feedback control mechanisms [526]. Here, we focus on the dynamics in the swimming direction.
Hereafter, all parameters are scaled using the body length L as the characteristic length scale, flapping period T = 1/f as the characteristic time scale, and $\rho L^2$ as the characteristic mass per unit depth.
Accordingly, velocities are scaled by $Lf$, forces by $\rho f^2 L^3$, moments by $\rho f^2 L^4$, and power by $\rho f^3 L^4$.
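As a quick consistency check of these scales, the sketch below collects them in one place and verifies that the power scale equals the force scale times the velocity scale. The function name `characteristic_scales` and the numerical values of L, f, and ρ are hypothetical and chosen only for illustration; all quantities are per unit depth.

```python
import math


def characteristic_scales(L, f, rho):
    """Characteristic scales (per unit depth) built from L, f and rho."""
    return {
        "length": L,
        "time": 1.0 / f,
        "mass": rho * L**2,
        "velocity": L * f,
        "force": rho * f**2 * L**3,
        "moment": rho * f**2 * L**4,
        "power": rho * f**3 * L**4,
    }


# Hypothetical dimensional values: L in m, f in Hz, rho in kg/m^3.
scales = characteristic_scales(L=0.1, f=2.0, rho=1000.0)

# Dimensional consistency: power = force * velocity, moment = force * length.
print(math.isclose(scales["power"], scales["force"] * scales["velocity"]))
print(math.isclose(scales["moment"], scales["force"] * scales["length"]))
```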
The equations governing the free motion xj (t) of swimmer j are given by Newton’s second law (here,
the downstream direction is positive),
$$ m\ddot{x}_j = -F_j \sin\theta_j + D_j \cos\theta_j. \tag{5.1} $$
The hydrodynamic forces on swimmer j are decomposed into a pressure force Fj acting in the direction
normal to the swimmer and a viscous drag force Dj acting tangentially to the swimmer. These forces
depend on the fluid motion, which, in turn, depends on the time history of the states of the swimmers.
To maintain their pitching motions, swimmers exert an active moment Ma about the leading edge,
whose value is obtained from the balance of angular momentum. The hydrodynamic power P expended
by a flapping swimmer is given by $P = M_a \dot{\theta}$.
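The sketch below evaluates the right-hand side of Eq. (5.1) and the cycle-averaged power for prescribed inputs. It is a toy illustration only: in the models used here, the forces and the active moment come from the fluid solver, whereas the moment below is a made-up damping-like law M_a = c ˙θ with a hypothetical coefficient c, and the function names are assumptions.

```python
import math


def streamwise_acceleration(F, D, theta, m):
    """Acceleration from Eq. (5.1): (-F sin(theta) + D cos(theta)) / m."""
    return (-F * math.sin(theta) + D * math.cos(theta)) / m


def mean_power(moment, theta_dot, period=1.0, n=1000):
    """Cycle average of P = M_a * theta_dot using the midpoint rule."""
    dt = period / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        total += moment(t) * theta_dot(t) * dt
    return total / period


# Pitching kinematics theta = A sin(2 pi f t), in dimensionless units with f = 1.
A = math.radians(15.0)
f = 1.0
c = 0.3  # hypothetical moment coefficient, for illustration only


def theta_dot(t):
    return 2.0 * math.pi * A * f * math.cos(2.0 * math.pi * f * t)


def toy_moment(t):
    return c * theta_dot(t)


P_mean = mean_power(toy_moment, theta_dot, period=1.0 / f)
print(P_mean)  # analytically c * (2 pi A f)^2 / 2 for this toy moment
```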
To compute the hydrodynamic forces and swimmers’ motion, we used two fluid models (Figs. 5.1,
5.2, 5.3 and 5.4). First, we employed a computational fluid dynamics (CFD) solver of the Navier-Stokes
equations tailored to resolving fluid-structure interactions (FSI) based on an adaptive mesh implementation
of the immersed boundary method [210, 167, 43]. Then, we solved the same FSI problem, in the limit of
thin swimmers, using the more computationally-efficient inviscid vortex sheet (VS) model [343, 204, 205,
194, 181]. To emulate the effect of viscosity in the VS model, we allowed shed vorticity to decay after a
dissipation time τdiss; larger τdiss correlates with larger Reynolds number Re in the Navier-Stokes model;
see SI for a brief overview of the numerical implementation and validation of both methods.
5.2 Flow coupling leads to stable emergent formations
We found, in both CFD and VS models, that pairs of swimmers self-organize into relative equilibria at a
streamwise separation distance d that is constant on average, and swim together as a single formation at
an average free-swimming speed U (Figs. 5.1 and 5.5). We distinguished four types of relative equilibria:
inline, diagonal, side-by-side inphase and side-by-side antiphase (Fig. 5.1).
Inline formations at ℓ = 0 arise when the follower positions itself, depending on its initial distance
from the leader, at one of many inline equilibria, each with its own basin of attraction (Fig. 5.5A). These
inline equilibria occur at average spacing d that is approximately an integer multiple of UT, consistent
with previous experimental [33, 384, 339] and numerical [524, 355, 362, 101, 194, 14] findings.
When offsetting the swimmers laterally at ℓ ̸= 0 (Fig. 5.5B), the leader-follower equilibria that arise
at ℓ = 0 shift slightly but persist, giving rise to diagonal leader-follower equilibria [341]. Importantly, at
a lateral offset ℓ, inphase swimmers (ϕ = 0) that are initially placed side-by-side reach a relative equilibrium where they travel together at a close, but non-zero, average spacing d ≤ L. That is, a perfect
side-by-side configuration of inphase flapping swimmers is unstable but the more commonly-observed
configuration [266] where the two swimmers are slightly shifted relative to each other is stable. This configuration is fundamentally distinct in terms of cost of transport from the mirror-symmetric side-by-side
configuration that arises when flapping antiphase at ϕ = π (Fig. 5.6A). Both side-by-side equilibria were
observed experimentally in heaving hydrofoils [341], albeit with no assessment of the associated hydrodynamic power and cost of transport.
We next examined the effect of varying the phase ϕ on the emergent traveling formations. Starting
from initial conditions so as to settle on the first equilibrium d/UT ≈ 1 when ϕ = 0, and increasing ϕ, we
found, in both CFD and VS simulations, that the spacing d/UT at equilibrium increased with increasing ϕ
(Fig. 5.5C). This increase is linear, as evident when plotting d/UT as a function of ϕ (Fig. 5.6B). Indeed, in
Fig. 5.6B, we plotted the emergent average separation distance d/UT as a function of ϕ for various values
[Figure 5.1 graphic: panels A–D (Inline, Diagonal, Side-by-side (inphase), Side-by-side (antiphase)); axes: longitudinal x/L (0 to 6) and lateral y/L (−1 to 1); color scales: vorticity from −10 to 10 ×(U/L) and circulation from −0.01 to 0.01 ×(UL).]
Figure 5.1: Flow-coupled swimmers self-organize into stable pairwise formations. A. inline (ℓ = 0, ϕ = π/2), B.
diagonal (ℓ = L/2, ϕ = 0), C. inphase side-by-side (ℓ = L/2, ϕ = 0) and D. antiphase side-by-side (ℓ = L/2, ϕ = π) in
CFD (left) and VS (right) simulations. Power savings at steady state relative to respective solitary swimmers are reported in
Fig. 5.6. Parameter values are A = 15°, Re = 2πρAfL/µ = 1645 in CFD, and fτdiss = 2.45 in VS simulations. Corresponding
hydrodynamic moments are given in Fig. 5.7. Simulations at different Reynolds numbers and dissipation times are given in
Figs. 5.2, 5.3 and 5.4.
of ℓ. Except for the antiphase side-by-side formation, the linear phase-distance relationship ϕ/2π ∝ d/UT
persisted for ℓ ̸= 0.
The key observation, that pairs of flapping swimmers passively self-organize into equilibrium formations, is independent of both scale and fluid model. In our CFD simulations (Figs. 5.2 and 5.4), we tested a
range of Reynolds numbers Re = ρUL/µ from 200 to 2000, which covers the entire range of existing CFD
[Figure 5.2 graphic: panels A–D; color scales: vorticity from −10 to 10 ×(U/L) and flow agreement parameter from −1 to 1; axes: x/L (0 to 8) and y/L (−1 to 1).]
Figure 5.2: Pairs of swimmers in CFD simulations. Vorticity field (left column) and flow agreement parameter V (right
column) in the wake of a pair of inline and inphase swimmers at Reynolds numbers ReA = 206, 308, 411, and 1645, respectively. The
pitching amplitude of leader and follower is set to A = 15°, except in A, where the follower is pitching at A = 13.5°.
[Figure 5.3 graphic: panels A–D; color scales: circulation from −0.01 to 0.01 ×(UL) and flow agreement parameter from −1 to 1; axes: x/L (0 to 8) and y/L (−1 to 1).]
Figure 5.3: Pairs of swimmers in VS simulations. Snapshots of the swimmers (left column) and flow agreement parameter
V (right column) in the wake of a pair of swimmers in the VS model for dissipation time τdiss = 2.45T, 3.45T, 4.45T, and 9.45T,
respectively. The pitching amplitude is set to A = 15°, and the pitching frequency to f = 1.
[Figure 5.4 graphic: panels A and B; vertical axis: separation distance d/UT (0 to 2.5); horizontal axis: time t/T (0 to 30); legend entries: ReA = 1645, 822, 411, 308, 206 (CFD) and τdiss = 2.45T, 3.45T, 4.45T, 9.45T, ∞ (VS).]
Figure 5.4: Influence of fluid properties. A. Separation distance versus time in a pair of swimmers in the CFD model for five
values of Reynolds number ReA = 206, 308, 411, 822, 1645 (Fig. 5.2). B. Separation distance versus time in a pair of swimmers
in the VS model for five values of dissipation time τdiss = 2.45T, 3.45T, 4.45T, 9.45T, ∞ (Fig. 5.3). In each case, the separation distance d is
normalized by UT, where U is the swimming speed of that case. In both A and B, the swimmers stabilize near d/UT = 1. The pitching
amplitude of leader and follower is set to A = 15°, except at ReA = 206, where the follower is pitching at A = 13.5° to avoid
collision.
simulations [33, 14], where $Re \sim O(10^2)$, and experiments [33, 339, 341], where $Re \sim O(10^3)$. In our
VS simulations, we varied τdiss from 2.45T to ∞ (Figs. 5.3 and 5.4). Note that the separation distance d is
scale-specific and increases with Re; at low Re, a compact inline formation is reached where the two swimmers “tailgate” each other, as observed in [362]. However, the scaled separation distance d/UT remains
nearly constant for all Re and τdiss (Fig. 5.4).
The fact that these equilibria emerge in time-forward simulations is indicative of stability [424]. A
more quantitative measure of linear stability can be obtained numerically by perturbing each equilibrium,
either by applying a small impulsive or step force after steady state is reached [341] or by directly applying a small perturbation to the relative equilibrium distance between the two swimmers and examining
the time evolution of d and F to quantify variations in hydrodynamic force δF as a function of signed
variations in distance δd from the equilibrium [194]. In either case, we found that the force-displacement
response to small perturbations at each equilibrium exhibited the basic features of a linear spring-mass
system, where δF/δd is negative, indicating that the hydrodynamic force acts locally as a restoring spring
force that causes the initial perturbation to decay and that stabilizes the two swimmers together at their
equilibrium relative position. Larger values of |δF/δd| imply faster linear convergence to the stable equilibrium and thus stronger cohesion of the pairwise formation. Results of this quantitative stability analysis
are discussed in subsequent sections.
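The perturbation test described above amounts to estimating the local slope δF/δd and checking its sign. A minimal sketch follows, assuming that δF and δd are measured with the same sign convention; the perturbation data are synthetic values made up for illustration, not simulation output, and the function names are hypothetical.

```python
def force_gradient(delta_d, delta_F):
    """Ordinary least-squares slope of delta_F versus delta_d."""
    n = len(delta_d)
    mean_d = sum(delta_d) / n
    mean_F = sum(delta_F) / n
    num = sum((d - mean_d) * (F - mean_F) for d, F in zip(delta_d, delta_F))
    den = sum((d - mean_d) ** 2 for d in delta_d)
    return num / den


def is_linearly_stable(delta_d, delta_F):
    """A negative slope means the force acts as a restoring spring."""
    return force_gradient(delta_d, delta_F) < 0.0


# Synthetic perturbation data (illustrative only).
delta_d = [-0.02, -0.01, 0.0, 0.01, 0.02]
delta_F = [0.05, 0.025, 0.0, -0.025, -0.05]

slope = force_gradient(delta_d, delta_F)
print(slope, is_linearly_stable(delta_d, delta_F))
```

A larger magnitude of the fitted slope would indicate a stiffer effective spring and thus a more cohesive formation.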
5.3 Emergent formations save energy compared to solitary swimming
We evaluated the hydrodynamic advantages associated with these emergent formations by computing the
hydrodynamic power Psingle of a solitary swimmer and Pj of swimmer j in a formation of N swimmers.
We calculated the cost of transport COTj = Pj/mU of swimmer j and the change in COT compared to
solitary swimming, ∆COTj = (COTsingle − COTj)/COTsingle (Fig. 5.6A). We also calculated the average
change in cost of transport $\Delta\mathrm{COT} = \sum_{j=1}^{N} \Delta\mathrm{COT}_j / N$ for each formation (Fig. 5.6B). In all cases, except
[Figure 5.5 graphic: panels A–C; spacing d/UT (0 to 2.5) versus time t/T (0 to 30) in VS and CFD simulations; curve labels: Inline, Diagonal, Side-by-side (inphase); panel C annotation: increasing ϕ.]
Figure 5.5: Emergent equilibria in pairwise formations. A. Time evolution of scaled streamwise separation distance d/UT
for a pair of inline swimmers at ϕ = 0. Depending on initial conditions, the swimmers converge to one of two equilibria at distinct
separation distance. B. At ℓ = L/2, d/UT changes slightly compared to inline swimming in panel A. Importantly, a new side-by-side inphase equilibrium is now possible where the swimmers flap together at a slight shift in the streamwise direction. C.
Starting from the first equilibrium in panel A, d/UT increases linearly as we increase the phase lag ϕ between the swimmers.
for the antiphase side-by-side formation, in both CFD and VS simulations, the swimmers traveling in
equilibrium formations save power and cost of transport compared to solitary swimming. The savings are
larger at tighter lateral spacing ℓ.
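The cost-of-transport bookkeeping can be sketched as follows, using the definitions COTj = Pj/mU and ∆COTj = (COTsingle − COTj)/COTsingle, so that positive values denote savings. The power, mass, and speed values are hypothetical numbers chosen to mimic a follower that saves 60% while the leader saves nothing; they are not results from the simulations.

```python
import math


def cost_of_transport(power, mass, speed):
    """COT = P / (m U)."""
    return power / (mass * speed)


def delta_cot(cot_single, cot_j):
    """Relative change in COT; positive values are savings."""
    return (cot_single - cot_j) / cot_single


def mean_delta_cot(cot_single, cots):
    """Average of the relative changes over all swimmers in a formation."""
    return sum(delta_cot(cot_single, c) for c in cots) / len(cots)


# Hypothetical dimensionless values, for illustration only.
mass, speed = 1.0, 2.0
cot_single = cost_of_transport(power=4.0, mass=mass, speed=speed)
cot_leader = cost_of_transport(power=4.0, mass=mass, speed=speed)
cot_follower = cost_of_transport(power=1.6, mass=mass, speed=speed)

follower_saving = delta_cot(cot_single, cot_follower)
average_saving = mean_delta_cot(cot_single, [cot_leader, cot_follower])
print(follower_saving, average_saving)
```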
For inline and diagonal formations, these hydrodynamic benefits are granted entirely to the follower,
whose hydrodynamic savings can be as high as 60% compared to solitary swimming (Fig. 5.6A) [194].
Intuitively, because in 2D flows, vortex-induced forces decay with the inverse of the square of the distance
from the vortex location, flow coupling between the two inline or diagonal swimmers is non-reciprocal;
the follower positioned in or close to the leader’s wake interacts more strongly with that wake than the
leader does with the follower’s wake (Figs. 5.1A, B and 5.7A, B).
In side-by-side formations, by symmetry, flow coupling between the two swimmers is reciprocal, or
nearly reciprocal in inphase flapping (Figs. 5.1C, D and 5.7C, D). Thus, hydrodynamic benefits or costs
are expected to be distributed equally between the two swimmers. Indeed, for inphase flapping, the hydrodynamic benefits are shared equally between both swimmers. For antiphase flapping the cost is also
shared equally (Fig. 5.6A).
The biased distribution of benefits in favor of the follower in inline and diagonal formations could
be a contributing factor to the dynamic nature of fish schools [429, 314]. The egalitarian distribution of
[Figure 5.6 graphic: panel A, % saving in COT for swimmers 1 and 2 in VS and CFD simulations (Inline, Diagonal, Side-by-side (inphase), Side-by-side (antiphase)); panel B, spacing d/UT (0 to 2.5) versus phase lag ϕ/2π (0 to 1), colored by COT saving or expenditure (−40% to 40%), with legend entries CFD, VS, Li et al. 2020, Newbolt et al. 2019, Kurt et al. 2021, Kim et al. 2010, Peng et al. 2018, Ramananarivo et al. 2016, Heydari & Kanso 2021, Arranz et al. 2022, Thandiackal & Lauder 2023, Zhu et al. 2014.]
Figure 5.6: Hydrodynamic benefits and linear phase-distance relationship in pairs of swimmers. A. Change in cost
of transport compared to solitary swimmers for the inline, diagonal, side-by-side inphase and side-by-side antiphase formations
shown in Fig. 5.1. B. Emergent formations in pairs of swimmers in CFD and VS models satisfy a linear phase-distance relationship,
consistent with experimental [384, 339, 266, 248, 438] and numerical [238, 362, 194, 14] studies. With the exception of the antiphase
side-by-side formation, swimmers in these formations have a reduced average cost of transport compared to solitary swimming.
benefits in the inphase side-by-side formation could explain the abundance of this pairwise configuration
in natural fish populations [266] and why groups of fish favor this configuration when challenged to swim
at higher speeds [16, 281].
5.4 Linear phase-distance relationship in emergent formations is universal
To probe the universality of the linear phase-distance relationship, we compiled, in addition to our CFD
and VS results, a set of experimental [384, 339, 266] and numerical [238, 362, 194, 248, 14] data from
the literature [193]. Data including CFD simulations of deformable flapping flags (9) [238], (□) [524] and
flexible airfoil with low aspect ratio (▷) [14], physical experiments with heaving (⃝) [384, 339] and pitching
(▽) [248] rigid hydrofoils, fish-foil interactions (∗) [438], and fish-fish interactions (△) measured in pairs of
both intact and visually- and/or lateral line-impaired live fish [266] are superimposed on Fig. 5.6B. All data
collapsed onto the linear phase-distance relationship ϕ/2π ∝ d/UT, with the largest variability exhibited
[Figure 5.7 graphic: panels A–D (Inline, Diagonal, Side-by-side (inphase), Side-by-side (antiphase)); active moment Ma of swimmers 1 and 2 versus time (t − tss)/T from 0 to 5, in CFD and VS simulations.]
Figure 5.7: Hydrodynamic torque for pairs of swimmers in different spatial configurations. A. inline (ℓ = 0, ϕ = 0),
B. diagonal (ℓ = L/4, ϕ = 0), C. inphase side-by-side (ℓ = L/2, ϕ = 0) and D. antiphase side-by-side (ℓ = L/2, ϕ = π) in
CFD (left) and VS (right) simulations. For each simulation, we show the active torque Ma exerted by the swimmers after a time
tss, ensuring that steady state has been reached. The non-reciprocity in the effects of the leader on the follower in the inline and diagonal
configurations is apparent. In the side-by-side configurations, a simple shift of the data in the inphase flapping case and a simple
mirror symmetry in the antiphase flapping case would show that flow coupling is reciprocal.
by live fish with close streamwise distance, where the interaction between fish bodies may play a role.
The side-by-side inphase formations trivially satisfy this linearity because d/UT ≈ ϕ/2π = 0, but the
side-by-side antiphase formations do not; in the latter, d/UT = 0 while ϕ/2π = 1/2.
These findings strongly indicate that flow-coupled flapping swimmers passively organize into stable
traveling equilibrium formations with linear phase-distance relationship. This relationship is independent of the geometric layout (inline versus laterally-offset swimmers), flapping kinematics (heaving versus pitching), material properties (rigid versus flexible), tank geometry (rotational versus translational),
fidelity of the fluid model (CFD versus VS versus particle model), and system (biological versus robotic,
2D versus 3D). Observations that are robust across such a broad range of systems are expected to have
common physical and mechanistic roots that transcend the particular set-up or system realization.
Importantly, this universal relationship indicates that flow physics passively positions a swimmer at
locations d where the swimmer’s flapping phase ϕ matches the local phase of the wake ϕwake = 2πd/UT,
such that the effective phase ϕeff = ϕ − ϕwake is zero. Moreover, because the quantity UT is nearly equal
to the wavelength of the wake of a solitary swimmer, the phase ϕwake = 2πd/UT is practically equal to
the phase of a solitary leader. These observations have two major implications. First, they are consistent
with the vortex phase matching introduced in [266] as a strategy by which fish maximize hydrodynamic
benefits. However, they proffer that vortex phase matching is an outcome of passive flow interactions
among flapping swimmers, and not necessarily an active strategy implemented by fish via sensing and
feedback mechanisms. Second, they led us to hypothesize that emergent side-by-side formations can be
predicted from symmetry arguments, while emergent inline and diagonal formations can be predicted
entirely from kinematic considerations of the leader’s wake without considering two-way flow coupling
between the two swimmers.
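The phase-matching condition can be written as a short calculation. The sketch below reads "zero effective phase" modulo 2π, so that the predicted scaled spacings are d/UT = ϕ/2π + n for integer n; this reading, the choice of branches n = 1, 2, 3 for inline followers, and the function names are assumptions made for illustration.

```python
import math


def wake_phase(d, U, T):
    """Local phase of the wake at streamwise distance d: 2 pi d / (U T)."""
    return 2.0 * math.pi * d / (U * T)


def effective_phase(phi, d, U, T):
    """phi - phi_wake, wrapped to the interval [-pi, pi)."""
    raw = phi - wake_phase(d, U, T)
    return (raw + math.pi) % (2.0 * math.pi) - math.pi


def predicted_spacings(phi, branches=(1, 2, 3)):
    """Scaled spacings d/(UT) at which the effective phase vanishes modulo 2 pi."""
    return [phi / (2.0 * math.pi) + n for n in branches]


# Inphase followers sit near integer multiples of UT.
print(predicted_spacings(0.0))
# A phase lag of pi shifts each equilibrium by half a wavelength.
print(predicted_spacings(math.pi))
# The effective phase vanishes at a predicted spacing.
print(effective_phase(math.pi / 2.0, d=1.25, U=1.0, T=1.0))
```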
5.5 Leader’s wake unveils opportunities for stable emergent formations
To challenge our hypothesis that the leader’s wake contains information about the emergent pairwise
equilibria, we examined the wake of a solitary swimmer in CFD and VS simulations (Fig. 5.8A, B). By
analyzing the wake of a solitary swimmer, without consideration of two-way coupling with a trailing
swimmer, we aimed to assess the opportunities available in that wake for a potential swimmer, undergoing
flapping motions, to position itself passively in the oncoming wake and extract hydrodynamic benefit.
Therefore, in the following analysis, we treated the potential swimmer as a “virtual” particle located at
a point (x, y) in the oncoming wake and undergoing prescribed transverse oscillations A sin(2πft − ϕ) in
the y-direction, at velocity v(t; ϕ) = 2πAf cos(2πft − ϕ)ey, where ey is a unit vector in the y-direction.
The oncoming wake is blind to the existence of the virtual particle. Guided by our previous findings that
stable equilibrium formations in pairwise interactions occur at zero effective phase ϕeff = ϕ − ϕwake = 0,
where the net hydrodynamic force on the trailing swimmer is zero and where small perturbations lead to
negative force gradients, we introduced two assessment tools: a flow agreement parameter field V(x, y; ϕ)
that measures the degree of alignment, or matching, between the flapping motion of the virtual particle
and the transverse flow of the oncoming wake, and a thrust parameter field T(x, y; ϕ) that estimates the
potential thrust force required to undergo such flapping motions.
Specifically, inspired by [14] and following [194], we defined the flow agreement parameter V(x, y; ϕ) as
$$ V(x, y; \phi) = \frac{\frac{1}{T}\int_{t}^{t+T} \mathbf{v}\cdot\mathbf{u}\, dt'}{\frac{1}{T}\int_{t}^{t+T} \mathbf{v}\cdot\mathbf{v}\, dt'}, $$
where t is chosen after the oncoming wake has reached steady state. The normalized V(x, y; ϕ) describes how well the oscillatory motion v(t; ϕ) of the virtual
particle matches the local transverse velocity u(x, y, t) of the oncoming wake [194]. Positive (negative)
values of V indicate that the flow at (x, y) is favorable (unfavorable) to the flapping motion of the virtual
follower.
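A numerical evaluation of the flow agreement parameter at a single point can be sketched as below. Because v points along ey, the product v · u reduces to the particle speed times the transverse wake velocity. The wake used here is a toy traveling wave of wavelength UT, not the CFD or VS wake, and its amplitude, the values of A, f, and U, and the function names are assumptions made for illustration.

```python
import math


def flow_agreement(u_y, x, y, phi, A, f, t0=0.0, n=2000):
    """Period average of v*u_y divided by the period average of v*v (midpoint rule)."""
    T = 1.0 / f
    dt = T / n
    num = 0.0
    den = 0.0
    for i in range(n):
        t = t0 + (i + 0.5) * dt
        # Transverse velocity of the virtual particle at phase phi.
        v = 2.0 * math.pi * A * f * math.cos(2.0 * math.pi * f * t - phi)
        num += v * u_y(x, y, t) * dt
        den += v * v * dt
    return num / den  # the 1/T prefactors cancel in the ratio


# Toy wake: transverse velocity traveling downstream at speed U (hypothetical).
A, f, U = 0.1, 1.0, 1.0
u0 = 2.0 * math.pi * A * f  # amplitude chosen so that the maximum of V is 1


def toy_wake(x, y, t):
    return u0 * math.cos(2.0 * math.pi * f * t - 2.0 * math.pi * x * f / U)


V_match = flow_agreement(toy_wake, x=1.0, y=0.0, phi=0.0, A=A, f=f)
V_oppose = flow_agreement(toy_wake, x=1.5, y=0.0, phi=0.0, A=A, f=f)
V_shifted = flow_agreement(toy_wake, x=0.25, y=0.0, phi=math.pi / 2.0, A=A, f=f)
print(V_match, V_oppose, V_shifted)
```

For this toy wake, the parameter is largest where x/UT equals ϕ/2π plus an integer, which is the same condition as zero effective phase.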
In Fig. 5.8A and B, we show V(x, y; ϕ = 0) as a field over the physical space (x, y). Blue
regions indicate where the local flow favors the follower’s flapping motion. In both CFD and VS simulations, the locations with the maximum flow agreement parameter closely coincide with the stable equilibria
(black circles) obtained from solving pairwise interactions. These findings imply that hydrodynamic coupling in pairs of flapping swimmers is primarily non-reciprocal – captured solely by consideration of the
effects of the leader’s wake on the follower. This non-reciprocity allows one, in principle, to efficiently
and quickly identify opportunities for hydrodynamic benefits in the leader’s wake, without the need to
perform costly two-way coupled simulations and experiments.
Importantly, our findings suggest a simple rule for identifying the locations of stable equilibria in any
oncoming wake from considerations of the flow field of the wake itself: a potential swimmer undergoing a
flapping motion at phase ϕ tends to position itself at locations (x∗, y∗) of maximum flow agreement V(x, y; ϕ)
between its flapping motion and the oncoming wake.
To verify this proposition, we show in Fig. 5.9A, as a function of phase ϕ, the streamwise locations of
the local maxima of V(x, y; ϕ) computed based on the CFD and VS models, and scaled by UT, where U
is the speed of the solitary swimmer. We superimpose onto these results the equilibrium configurations
obtained from pairwise interactions in the context of the CFD (♦), VS (■), and time-delay particle (⃝)
models, where we modified the latter to account for non-zero lateral offset ℓ (Sec. A.3.1). Predictions of
the equilibrium configurations based on maximal flow agreement parameter agree remarkably well with
actual equilibria based on pairwise interactions, and they all follow the universal linear phase-distance
relationship shown in Fig. 5.5B.
The wake of a solitary swimmer contains additional information that allows us to evaluate the relative
power savings of a potential follower and relative stability of the pairwise formation directly from the
leader’s wake, without accounting for pairwise interactions. Assessment of the relative power savings
follows directly from the maximal value of the flow agreement parameter: larger values imply more power
savings and reduced cost of transport. To verify this, we calculated the maximal V(x∗, y∗; ϕ) in the wake
of the solitary swimmer, where we expected the follower to position itself in pairwise interactions. In
Fig. 5.9B, we plotted these V values as a function of lateral distance ℓ for ϕ = 0. We superimposed the
power savings ∆P based on pairwise interactions of inphase swimmers using the CFD and VS simulations
and normalized all quantities by the maximal value of the corresponding model to highlight variations in
these quantities as opposed to absolute values. Power savings are almost constant for ℓ < 0.25L, but
decrease sharply as ℓ increases. This trend is consistent across all models, with the most pronounced
drop in the CFD-based simulations because the corresponding velocity field u decays more sharply when
moving laterally away from the swimmer.
Next, to assess the stability of the virtual particle based only on information in the oncoming wake
of a solitary swimmer, we estimated the thrust force based on the fact that the thrust magnitude scales
with the square of the swimmer’s lateral velocity relative to the surrounding fluid’s velocity [442, 148,
339]. We defined the thrust parameter field $T(x, y; \phi) = -\frac{1}{T}\int_{t}^{t+T} |(v - u) \cdot e_y|^2 \, dt'$, normalized using
$\frac{1}{T}\int_{t}^{t+T} |v \cdot e_y|^2 \, dt'$. At the locations of the maxima of V(x∗, y∗; ϕ), a negative slope ∂T/∂d of the thrust
parameter is an indicator of linear stability or cohesion of the potential equilibria; that is, emergent pairwise formations are expected to be stable if a small perturbation in distance about the locations (x∗, y∗)
of maximal V is accompanied by an opposite, restorative change in T. Indeed, in both CFD and VS wakes,
∂T/∂d at (x∗, y∗) is negative (Fig. 5.8).
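The thrust parameter and its slope can be sketched in the same way. The helper names `thrust_parameter` and `slope_wrt_distance`, and the three sample values of T along the streamwise direction, are hypothetical and serve only to show the sign check; they are not results from the CFD or VS wakes.

```python
import numpy as np

def thrust_parameter(vy, uy):
    """Minus the period average of (vy - uy)^2, normalized by that of vy^2.

    vy, uy: transverse velocity of the virtual particle and of the wake,
    sampled uniformly over one flapping period.
    """
    vy = np.asarray(vy, dtype=float)
    uy = np.asarray(uy, dtype=float)
    return -np.mean((vy - uy) ** 2) / np.mean(vy ** 2)

def slope_wrt_distance(T_values, d_values):
    """Finite-difference estimate of dT/dd on a grid of separation distances."""
    return np.gradient(np.asarray(T_values, dtype=float),
                       np.asarray(d_values, dtype=float))

t = np.arange(200) / 200.0
vy = np.cos(2 * np.pi * t)
T_still = thrust_parameter(vy, np.zeros_like(vy))  # quiescent fluid
T_match = thrust_parameter(vy, vy)                 # flow moves with the particle

# Hypothetical samples of T around a candidate equilibrium at d/UT = 1.0.
d_grid = np.array([0.9, 1.0, 1.1])
T_grid = np.array([-0.40, -0.45, -0.50])
slopes = slope_wrt_distance(T_grid, d_grid)
is_restoring = slopes[1] < 0  # negative slope is read as a sign of stability
```

With this normalization, a quiescent fluid gives T = −1 and a flow that moves exactly with the particle gives T = 0; the stability indicator is only the sign of the slope at the location of maximal V.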
In Fig. 5.9C, we plotted |∂T/∂d| as a function of lateral distance ℓ for ϕ = 0. We superimposed the
magnitude of the eigenvalues |δF/δd| obtained from the linear stability analysis of pairwise interactions
in inphase swimmers using the VS and time-delay particle models. As in Fig. 5.9B, all quantities are normalized by the maximal value of the corresponding model to highlight variations in these quantities as
opposed to absolute values. Also, as in Fig. 5.9B, all models produce consistent results: pairwise cohesion
is strongest for ℓ < 0.25L, but weakens sharply as ℓ increases, with the most pronounced drop in the
CFD-based simulations.
[Figure 5.8, panels A and B. Axes: longitudinal x/L and lateral y/L. Quantities shown: flow agreement parameter (color scale from −1 to 1) and thrust parameter, with the slopes ∂T/∂d marked.]
Figure 5.8: Predictions of equilibrium formations from the wake of a solitary swimmer. A, B. Snapshots of vorticity
and fluid velocity fields created by a solitary swimmer in CFD and VS simulations and corresponding flow agreement parameter V
fields for a virtual follower at ϕ = 0. Locations of maximum V-values (i.e., peaks in the flow agreement parameter field) coincide
with the emergent equilibria in inphase pairwise formations (indicated by black circles). Contour lines represent flow agreement
parameter at ±0.25, ±0.5. Thrust parameter T is shown at ℓ = 0 and ℓ = 0.5L. A negative slope ∂T/∂d indicates stability of
the predicted equilibria. See also Figs. 5.2 and 5.3.
A few comments on our virtual particle model and diagnostic tools in terms of the flow agreement and
thrust parameters are in order. Our model differs from the minimal particle model used in [33, 339], which
treated both swimmers as particles with minimal ‘wakes’ and considered two-way coupling between them
(see Sec. A.3.1). In our analysis, the oncoming wake can be described to any desired degree of fidelity of the
fluid model, including using experimentally constructed flows when available. Indeed, our flow agreement
and thrust parameters are agnostic to how the flow field of the oncoming wake is constructed. Additionally,
these diagnostic tools are equally applicable to any oncoming wake, not necessarily produced by a single
swimmer, but say by multiple swimmers (as discussed later) or even non-swimming flow sources. Thus,
the approach we developed here could be applied broadly to analyze, predict, and test opportunities for
schooling and hydrodynamic benefits for live and robotic fish whenever measurements of an oncoming
flow field are available.
[Figure 5.9, panels A-C. A: separation distance d/UT versus phase lag ϕ/2π. B: normalized power saving versus lateral distance ℓ. C: normalized pairwise cohesion versus lateral distance ℓ. Legend: CFD solitary leader, VS solitary leader, CFD pairwise interactions, VS pairwise interactions, time-delay particle model. Leader and follower flap as A sin(2πft) and A sin(2πft − ϕ).]
Figure 5.9: Predictions of equilibrium locations, power savings, and cohesion, from the wake of a solitary leader.
A. Location of maximum V as a function of phase lag ϕ in the wake of solitary leaders in CFD and VS simulations. For comparison, equilibrium distances of pairwise simulations in CFD, VS and time-delay particle models (Sec. A.3.1 and Fig. 5.10) are
superimposed. Agreement between V-based predictions and actual pairwise equilibria is remarkable. B. V values also indicate
the potential benefits of these equilibria, here shown as a function of lateral distance ℓ for a virtual inphase follower in the wake
of a solitary leader in CFD and VS simulations. The power savings of an actual follower in pairwise formations in CFD and VS
simulations are superimposed. C. A negative slope ∂T/∂d of the thrust parameter T indicates stability and |∂T/∂d| expresses the
degree of cohesion of the predicted formations, here shown as a function of ℓ for an inphase virtual follower. |∂F/∂d| obtained
from pairwise formations in VS and time-delay particle models are superimposed (Fig. 5.10). Results in B and C are normalized
by the corresponding maximum values to facilitate comparison.
5.6 Parametric analysis over the entire space of phase lags and lateral
offsets
Having demonstrated consistency in the emergence of flow-mediated equilibria in both CFD and VS simulations, we next exploited the computational efficiency of the VS model to systematically investigate
emergent pairwise formations over the entire space of phase lag ϕ ∈ [0, 2π) and lateral offset ℓ ∈ [−L, L],
excluding side-by-side antiphase formations.
[Figure 5.10, panels A-C. Schematics of the time-delay particle model and of its extension with lateral offset, with leader y1 = A sin(2πft) and follower y2 = A sin(2πft − ϕ). Plot axes: separation distance d/UT versus time t/T for increasing lateral distance ℓ; period-averaged flow speed versus lateral distance ℓ (VS model and fitted curve).]
Figure 5.10: Time-delay particle model and stability of swimmer. A. Schematics of time-delayed particle model [384, 339]
and its extension to laterally-offset swimmers. Each swimmer generates hydrodynamic thrust by oscillating vertically, which
also leaves a wake behind it. The follower swimmer interacts with the wake of the leader. B. For an inphase pair, starting at initial
distance d/UT = 1.15, we incrementally increase the lateral offset from ℓ = 0 to ℓ = L. (inset) Lateral decay of flow speed
in the wake of a solitary swimmer in the vortex sheet model (blue line) and the fitted exponential curve (red line). The lateral
exponential decay in the time-delay particle model takes the form $\exp\left(-|\ell/1.6|^{2.73}\right)$. C. The hydrodynamic force ⟨F2⟩ acting
on the follower as a function of the corresponding distance from the equilibrium. Due to the decay of the leader’s wake in the
lateral direction (see inset in A), the hydrodynamic force ⟨F2⟩ experienced by the follower decreases in magnitude at increasing
lateral offset ℓ. Orange lines show the linear change in force at the corresponding equilibria. The slope δF/δd is a measure of
linear stability. Negative slopes imply stable formations for all ℓ ≤ L, but these slopes become more shallow as we increase ℓ.
Equilibrium configurations are dense over the entire range of parameters: for any combination of phase
lag ϕ and lateral offset ℓ, there exists an emergent equilibrium configuration where the pair of swimmers
travel together at a separation distance d/UT (Fig. 5.11A). Perturbing one or both parameters, beyond the
limits of linear stability, causes the swimmers to stably and smoothly transition to another equilibrium
at different spacing d/UT. Importantly, increasing the phase lag ϕ shifts the equilibrium positions in the
streamwise direction such that d/UT depends linearly on ϕ, but the effect of lateral distance for ℓ ≤ L is
nonlinear and nearly negligible for small ℓ: increasing the lateral offset ℓ by an entire bodylength L changes
the pairwise distance d/UT by about 15%. Our results explored emergent equilibria up to d/UT ≤ 2.5 and
are consistent with the experimental findings in [341], which explored up to nine downstream equilibria.
To assess the hydrodynamic advantages of these emergent formations, we calculated the average
change in hydrodynamic power per swimmer. The pair saves power compared to solitary swimming
(Fig. 5.11B). Power savings vary depending on phase lag ϕ and lateral distance ℓ: for the entire range of
ϕ from 0 to 2π, the school consistently achieves over 20% power reduction, as long as the lateral offset
is ℓ ≤ 0.25L. However, increasing ℓ from 0.25L to L reduces significantly the hydrodynamic benefit.
That is, swimmers can take great liberty in changing their phase without compromising much the average
energy savings of the school, as long as they maintain close lateral distance to their neighbor.
A calculation of the linear stability of each equilibrium in Fig. 5.11A shows that these emergent formations are linearly stable (Fig. 5.11C), and the degree of stability is largely insensitive to phase lag, with
strongest cohesion achieved at lateral offset ℓ ≤ 0.25L. The results in Fig. 5.11A-C are constructed using
pairwise interactions in VS simulations, but can be inferred directly from the wake of a solitary leader, as
discussed in the previous section and shown in Fig. 5.11D-F.
[Figure 5.11, panels A-F. Axes: separation distance d/UT and lateral distance ℓ/L. Color scales: phase lag ϕ, % COT saving per swimmer, stability/cohesion |∂F/∂d|, and flow agreement parameter. Inline and side-by-side configurations are marked.]
Figure 5.11: Equilibria are dense over the parameter space. For any given phase lag ϕ and at any lateral offset ℓ inside the
wake, the pair reaches equilibrium formations that are stable and power saving relative to a solitary swimmer. A. Equilibrium
separation distances, B. average power saving, and C. stability as a function of phase lag and lateral distance in a pair of swimmers.
Predictions of D. equilibrium locations, E. hydrodynamic benefits, and F. cohesion based on the wake of a solitary swimmer
following the approach in Figs. 5.8 and 5.9. For comparison, the contour lines from panels B and C based on pairwise interactions
are superimposed onto panels E and F (white lines). Simulations in panels A-C are based on pairwise interactions, and simulations
in panels D-F are based on the wake of a single swimmer, all in the context of the vortex sheet model with A = 15°, f = 1 and
τdiss = 2.45T.
5.7 Analysis of larger groups of inline and side-by-side swimmers
How do these insights scale to larger groups? To address this question, we systematically increased the
number of swimmers and computed the emergent behavior in larger groups based on flow-coupled VS
simulations.
In a group of six swimmers, all free to move in the streamwise x-direction, we found that the last three
swimmers split and form a separate subgroup (Fig. 5.12A). In each subgroup, swimmer 3 experiences the
largest hydrodynamic advantage (up to 120% power saving!), swimmer 2 receives benefits comparable to
those it received in pairwise formation (65% power saving), and swimmer 1 receives no benefit at all (Fig. 5.12C).
We asked if loss of cohesion is dependent on the number of inline swimmers. To address this question,
we gradually increased the number of swimmers from two to six (Fig. 5.13). We found that in a school of
three inline swimmers, flow interactions led to a stable emergent formation with hydrodynamic benefits
similar to those experienced by the three swimmers in each subgroup of Fig. 5.12A and C. When computing
the motion of four inline swimmers (Fig. 5.15A), we found that the leading three swimmers maintained
cohesion, at hydrodynamic benefits similar to a formation of three, but swimmer 4 separated and lagged
behind, receiving no advantage in terms of power savings because it split from the formation (Figs. 5.15D
and 5.13). In a group of five, the last two swimmers split and formed their own subgroup. That is, in all
examples, swimmer 4 consistently lost hydrodynamic advantage and served as local leader of the trailing
subgroup. These observations are consistent with [362] and demonstrate that flow interactions alone are
insufficient to maintain inline formations as the group size increases.
We next explored the robustness of the side-by-side pattern to a larger number of swimmers starting
from side-by-side initial conditions (Fig. 5.12B). The swimmers reached stable side-by-side formations
reminiscent of the configurations observed experimentally when fish were challenged to swim at higher
swimming speeds [16]. The swimmers in this configuration saved power compared to solitary swimming
(Fig. 5.12C): swimmers gained equally in terms of hydrodynamic advantage (up to 55% power saving for
the middle swimmers in a school of six), except the two edge swimmers which benefited less. We tested
these results by gradually increasing the number of swimmers from two to six (Fig. 5.14). The overall
trend of power saving among group members is robust to the total number of swimmers in
these side-by-side formations.
[Figure 5.12, panels A-C. A: inline formation. B: side-by-side formation. C: % saving in cost of transport versus swimmer index n (swimmers 1 to 6 and the group average) for inline and side-by-side formations.]
Figure 5.12: Larger inline and side-by-side formations. A. Inline formations lose cohesion and split into two subgroups
as depicted here for a group of six swimmers. B. Side-by-side formations remain cohesive. C. Power saving of each swimmer in
inline and side-by-side formations. Dissipation time τdiss = 2.45T. Simulations of inline formations and side-by-side formations
ranging from 2 to 6 swimmers are shown in Fig. 5.13 and Fig. 5.14.
5.8 Mechanisms leading to loss of cohesion in larger inline formations
To understand why three swimmers form a stable inline formation but four do not, we extended the analysis
in Fig. 5.8 to analyze the wake created behind two-swimmer (Fig. 5.16A) and three-swimmer (Fig. 5.16B)
groups. Specifically, we computed pairwise interactions in a two-swimmer school and considered the
combined wake of both swimmers after they had settled onto an equilibrium state. Similarly, we computed
the behavior of a three-swimmer school and analyzed the combined wake at steady state. Compared to
the single leader wake in Fig. 5.8B, in the wake of a two-swimmer school, positive flow agreement in the
(blue) region is enlarged and enhanced, corresponding to swimmer 3 receiving the largest power savings.
On the other hand, behind three inline swimmers, the region of positive flow agreement is weakened and
shrunk, indicating weaker potential for energy saving by a fourth swimmer.
Importantly, in the wake of the pairwise formation, the downstream jet is modest at the location of
maximum V, where swimmer 3 is expected to position itself for hydrodynamic benefit, thus allowing
swimmer 3 to reach this position and stay in formation (Fig. 5.16E). Also, at this location, the wake has a
substantial transverse velocity u · ey (Fig. 5.16G), which aids thrust production at a diminished cost. In
contrast, three inline swimmers generate a much stronger downstream jet at the location of maximum V
where swimmer 4 is expected to position itself (Fig. 5.16F). This jet prevents swimmer 4 from stably staying
in formation, and the transverse flow velocity u·ey is nearly zero for the entire flapping period (Fig. 5.16H)
indicating little opportunity for exploiting the flow generated by the three upstream swimmers for thrust
generation. This limitation is fundamental; it results from the flow physics that govern the wake generated
by the upstream swimmers. There is not much that a trailing swimmer can do to extract hydrodynamic
benefits from an oncoming flow field that does not offer any.
[Figure 5.13, panels A-F. A-E: snapshots of inline groups of 2 to 6 swimmers, with spacing d/UT between consecutive swimmers versus time t/T. F: % saving in cost of transport for each swimmer and for the group average.]
Figure 5.13: Inline formations. Snapshots of inline formations composed of 2, 3, 4, 5 and 6 swimmers in VS simulations at
steady state; time-evolution of pairwise distances is shown on the right. A and B. Formations composed of 2 or 3 swimmers
are stable, with consecutive spacing d/UT = 1. C. For a trail of 4 swimmers, the group splits into a leading subgroup of 3
swimmers while the fourth swimmer separates from the rest. D and E. For formations of 5 or 6 swimmers, the group splits into a
leading subgroup of 3 swimmers and another subgroup containing the remaining 2 or 3 swimmers. F. reports percent savings in
COT for each swimmer and the average of the whole group.
[Figure 5.14, panels A-F. A-E: side-by-side groups of 2 to 6 swimmers, with spacing d/UT between neighboring swimmers versus time t/T. F: % saving in cost of transport for each swimmer and for the group average.]
Figure 5.14: Side-by-side formations. Vortex sheet simulation of side-by-side formations with 2, 3, 4, 5 and 6 swimmers, from
top to bottom, respectively. On the right-hand side, we report pairwise spacing between them. In all of the groups the formations are
stable and the distances between every pair are close to zero. F. reports percent savings in COT for each swimmer and the average
of the whole group.
[Figure 5.15, panels A-D. A-C: inline groups in which swimmer 4, 5, or 6 separates. D: % saving in cost of transport versus swimmer index n for groups of 4, 5 and 6 swimmers, with group averages.]
Figure 5.15: Loss of cohesion in larger groups of inline swimmers. Number of swimmers that stay in cohesive formation
depends on parameter values. A-C. For dissipation time τdiss = 2.45T, 3.45T and 4.45T, the 4th, 5th and 6th swimmers separate
from the group, respectively. D. Power savings per swimmer in panels A-C, respectively. On average, all schools save equally in
cost of transport, but the distribution of these savings varies significantly between swimmers. In all cases, swimmer 3 receives the
most hydrodynamic benefits.
5.9 Critical size of inline formations beyond which cohesion is lost
We sought to understand what determines the critical group size, here three, beyond which inline formations lose cohesion and split into subgroups. Because we have established that the flow agreement
parameter V plays an important role in predicting emergent formations, we first examined V in the wake
of a pair of flapping swimmers in CFD (Fig. 5.2) and VS (Fig. 5.3) simulations. These results show that at
lower Re and smaller dissipation time τdiss, the flow agreement parameter V decays rapidly downstream of
the flapping swimmers, thus diminishing the opportunities for downstream swimmers to passively stay in
cohesive formation and achieve hydrodynamic benefits. We thus hypothesized that the number of swimmers that passively maintain a cohesive inline formation is not a universal property of the flow physics,
but depends on the flow regime.
[Figure 5.16, panels A-H. Field plots: longitudinal position (marked in units of UT) and lateral y/L, with color scales for the flow agreement parameter and the period-averaged streamwise velocity. Panels G and H: transverse fluid velocity uy and follower tailbeat velocity versus time t/T.]
Figure 5.16: Prediction of equilibrium formations, cohesion, and power savings from the wake of upstream swimmers. A, B. Snapshots of vorticity fields created by two inline inphase swimmers, and three inline inphase swimmers. C, D
show the corresponding flow agreement parameter V fields. Contour lines represent flow agreement parameter at ±0.25, ±0.5.
E, F plot the corresponding period-averaged streamwise velocity. Separation distances d/UT predicted by the locations of
maximal V are marked by circles in the flow agreement field. In the left column, separation distances d/UT based on freely
swimming triplets are marked by black circles and coincide with the locations of maximal V. In the right column, the orange
marker shows the prediction of the location of a fourth swimmer based on the maximum flow agreement parameter. In two-way
coupled simulations, swimmer 4 actually separates from the leading 3 swimmers as illustrated in Fig. 5.15A. CFD simulation shows
swimmer 4 will collide with swimmer 3 as in Fig. 5.17. G, H show the transverse flow velocity in a period at the location
predicted by the maximum flow agreement parameter and with a lateral offset ℓ = 0, 0.5L, L, in comparison to the follower’s
tailbeat velocity.
[Figure 5.17, panels A and B (3 and 4 swimmers). Vorticity fields over longitudinal x/L and lateral y/L (color scale in units of U/L); spacing d/UT versus time t/T, with a collision marked for the 4-swimmer case.]
Figure 5.17: CFD simulation of larger inline schools. CFD simulation of inline formations with 3 and 4 swimmers at Re
= 1645. Vorticity field is shown on the left-hand side. On the right-hand side, we report pairwise spacing between them. The pitching
amplitude of leader and follower is set to A = 15°.
We tested this hypothesis in VS simulations with increasing number of swimmers and increasing
τdiss. As we increased τdiss, the number of swimmers that stayed in cohesive inline formation increased
(Fig. 5.15B,C). These findings confirm that this aspect of schooling – the maximal number of swimmers
that passively maintain a cohesive inline formation – is indeed scale-dependent. Interestingly, an analysis
of the power savings in these formations shows that, although swimmers 4 and 5 stay in formation at
increased τdiss, swimmer 3 always receives the most hydrodynamic benefit (Fig. 5.15D).
We additionally tested the stability of inline formations in CFD simulations at Re = 1645 (Fig. 5.17)
and observed the same trend: an inline school of 3 swimmers remains cohesive, but a fourth swimmer
collides with the upstream swimmer. These observations imply that the loss of cohesion does not depend
on the specific fluid model. This is consistent qualitatively with existing results [362]. In [362], the authors
employed flexible heaving foils at Re = 200 and observed stable inline formations with a larger number of
swimmers. The flexible foil model and smaller Re make the swimmer more adaptive to changes in the flow
field, by passively modulating the amplitude and phase along its body, thus diverting some of the hydrodynamic energy into elastic energy and stabilizing the larger inline formation. This, again, emphasizes
that the number of swimmers in a stable inline group is not a universal property of the formation; rather,
it is model- and scale-dependent.
5.10 Phase control to stabilize unstable inline school
Consider a swimmer flapping at a phase ϕ that can ‘sense’ or measure the agreement of its flapping motion
with the local fluid velocity u (generated by sources other than itself) at its location (say at its midpoint),
over a time span of m flapping periods. The goal of the swimmer would be to adjust its current flapping
phase ϕ to a desired phase Φ that maximizes its agreement with the local velocity
$$\Phi(t) = \underset{\phi}{\operatorname{argmax}} \; \frac{\frac{1}{mT}\int_{\min(t-mT,\,0)}^{t} v(t', \phi) \cdot u(t') \, dt'}{\frac{1}{mT}\int_{\min(t-mT,\,0)}^{t} v(t', \phi) \cdot v(t', \phi) \, dt'}, \qquad (5.2)$$
using a proportional phase controller inspired by [265, 266],
$$\ddot{\phi}(t) = -\gamma^{2} \left[\phi(t) - \Phi(t)\right] - 2\gamma \dot{\phi}(t); \qquad (5.3)$$
Here, Φ(t) is the desired phase and γ is a constant that determines the speed of convergence. We chose the
parameters as follows: we set the number of periods m = 2 that describes the memory of the swimmer
of the ambient fluid u, such that the time history spans two pitching periods, 2T. We set the control
gain γ = 3 to ensure that the actual phase ϕ(t) can reach the desired phase Φ(t) to within 1% relative error
within 1.5T.
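The two steps of Eqs. (5.2) and (5.3) can be sketched as follows. This is an illustration under stated assumptions rather than the controller used in the simulations: the swimmer's transverse velocity is taken as a unit-amplitude cosine cos(2πft − ϕ), the maximization in Eq. (5.2) is done by a grid search over ϕ using the most recent m periods of flow history, the sensed flow `u_hist` is a synthetic sinusoid, Φ is held constant during the integration, and time is measured in units where T = 1. The names `desired_phase` and `simulate_controller` are hypothetical.

```python
import numpy as np

def desired_phase(t_hist, u_hist, f, n_grid=360):
    """Grid-search version of Eq. (5.2) over a stored flow history."""
    phis = np.linspace(0.0, 2.0 * np.pi, n_grid, endpoint=False)
    # Candidate swimmer velocities, one row per trial phase (assumed form).
    v = np.cos(2.0 * np.pi * f * t_hist[None, :] - phis[:, None])
    score = np.mean(v * u_hist[None, :], axis=1) / np.mean(v * v, axis=1)
    return phis[np.argmax(score)]

def simulate_controller(phi0, Phi, gamma, dt, n_steps):
    """Integrate Eq. (5.3) with semi-implicit Euler for a constant target Phi."""
    phi, dphi = phi0, 0.0
    out = np.empty(n_steps + 1)
    out[0] = phi
    for k in range(n_steps):
        ddphi = -gamma**2 * (phi - Phi) - 2.0 * gamma * dphi
        dphi += dt * ddphi   # update the phase rate first
        phi += dt * dphi     # then the phase, using the new rate
        out[k + 1] = phi
    return out

# Synthetic sensed flow over m = 2 periods, lagging by 1 rad (hypothetical).
f, m = 1.0, 2
t_hist = np.arange(400) * (m / f / 400)
u_hist = 0.3 * np.cos(2.0 * np.pi * f * t_hist - 1.0)
Phi = desired_phase(t_hist, u_hist, f)

# Drive the phase from 0 toward Phi with gain gamma = 3.
gamma, dt = 3.0, 1e-3
phi = simulate_controller(0.0, Phi, gamma, dt, n_steps=5000)
```

For a constant Φ, Eq. (5.3) is a critically damped second-order system, so the phase error starting from rest decays as (1 + γt)e^{−γt} without overshoot; the sketch can be checked against that closed form.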
By implementing this phase controller in swimmer 4 in a group of four inline swimmers, the swimmer
is able to stabilize itself in the formation as shown in Figure 5.18. However, this stabilization is expensive:
by actively controlling its phase to stay in formation, swimmer 4 spends 100% more hydrodynamic power
than swimming alone.
5.11 Mapping emergent spatial patterns to energetic benefits
We next returned to the school of four swimmers, which, when positioned inline and flapping inphase,
lost cohesion as the trailing swimmer separated from the school. We aimed to investigate strategies for
stabilizing the emergent school formation and mapping the location of each member in the school to the
potential benefit or cost it experiences compared to solitary swimming.
Inspired by vortex phase matching as an active strategy for schooling [266, 265], we tested whether
phase control is a viable approach to maintain cohesion and gain hydrodynamic benefits. We devised an
active feedback control strategy, where the swimmer senses the oncoming transverse flow velocity at its
location and adjusts its flapping phase to maximize the agreement V between its flapping motion and
the local flow (see Sec. 5.10 for more details). When applied to swimmer 4 (Fig. 5.18A), this phase controller
led to a stable formation, albeit at no benefit to swimmer 4; in fact, swimmer 4 spent 100% more power
compared to solitary swimming, whereas the power savings of swimmers 2 and 3 remained robustly at the
same values as in the formation without swimmer 4. The inability of swimmer 4 to extract hydrodynamic
benefits from the oncoming flow is due to a fundamental physical limitation, as explained in Fig. 5.16; by
the non-reciprocal nature of flow interactions, changing the phase of the trailing swimmer has little effect
on the oncoming flow field generated by the upstream swimmers. If the oncoming wake itself presents no
opportunity for hydrodynamic benefit, phase control cannot generate such benefit.
We next investigated whether collaborative phase modulation could aid in maintaining school cohesion
by imposing that each swimmer flaps at a phase lag ∆ϕ relative to the swimmer ahead (Fig. 5.18B,C). We
found a range of values of ∆ϕ at which the school became passively stable, but without providing much
hydrodynamic benefit to the trailing swimmer; in fact, at certain ∆ϕ, cohesion came at a hydrodynamic
cost to swimmer 4, much like the active phase control strategy.
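A short sketch of what such a sequential phase schedule could look like is given below. It assumes that each swimmer pitches sinusoidally and that a follower's motion takes the form A sin(2πft − ϕ), so that swimmer n (counting from 0 at the front) lags the leader by n∆ϕ; the function name `flapping_angles` and the sign convention are assumptions made for illustration.

```python
import numpy as np

def flapping_angles(t, n_swimmers, A, f, dphi):
    """Pitching angle of each swimmer, one row per swimmer.

    Swimmer n follows A*sin(2*pi*f*t - n*dphi), i.e. every swimmer lags
    the one ahead of it by the same phase dphi (assumed convention).
    """
    t = np.asarray(t, dtype=float)
    n = np.arange(n_swimmers)[:, None]
    return A * np.sin(2.0 * np.pi * f * t[None, :] - n * dphi)

# Four swimmers, 15-degree amplitude, unit frequency, 30-degree lag per swimmer.
t = np.arange(240) / 120.0   # two periods, 120 samples per period
theta = flapping_angles(t, n_swimmers=4, A=np.deg2rad(15.0), f=1.0,
                        dphi=np.deg2rad(30.0))
```

With a 30° lag and 120 samples per period, each row of `theta` is the row above it delayed by 10 samples, which is a quick way to confirm that the schedule is applied uniformly along the school.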
Lastly, we investigated whether a lateral offset of some of the swimmers could passively stabilize the
emergent formation. The choice of which swimmers to displace laterally and by how much is not unique.
Thus, we probed different scenarios and obtained multiple stable formations (Fig. 5.18D and Fig. 5.19).
For example, pairing any two of the four swimmers side-by-side, say at the leading, middle, or trailing
end of the school, led to cohesive formations. The distribution of hydrodynamic cost or benefit varied
depending on the spatial pattern of the school and the individual position within the school. Staggering
the swimmers in a zigzag pattern also stabilized the school, but did not always allow the trailing swimmer
to improve its cost of transport. Staggering the swimmers in a "Diamond" formation stabilized the school
and, of all the stable formations we tested, led to the highest savings in cost of transport for the entire
school (Fig. 5.18D). These results are consistent with existing evidence that diamond formations are both
stable [444] and energetically optimal [486, 101]. But unlike individuals in an infinite diamond lattice [486],
individuals in a finite diamond formation do not receive equal energetic benefits.
Our findings highlight the versatility and fluidity of the emergent spatial patterns in groups of flapping
swimmers and emphasize that energetic benefits vary depending on the position of the individual within
the school. Importantly, these findings imply that, although many emergent formations do not globally
optimize the savings of the entire school, hydrodynamic interactions within these formations offer individuals numerous opportunities to achieve varying levels of energetic savings [296], potentially creating
competition among school members over advantageous positions in the school.
[Figure 5.18, panels A-D. A: phase control (sense local flow velocity; phase control to maximize flow agreement). B, C: phase tuning, with spacing d4/UT and % saving in cost of transport versus ∆ϕ; regions of loss of school cohesion, expenditure, and savings are marked. D: side-by-side pairing, staggering, and diamond patterns. Per-swimmer % values are annotated in panels A, B and D.]
Figure 5.18: Passive and active methods for stabilizing an emergent formation of four swimmers. A. In an inline
school of four swimmers, the leading three swimmers flap inphase, but swimmer 4 actively controls its phase in response to
the flow it perceives locally to match its phase to that of the local flow as proposed in [266]. The phase controller stabilizes
swimmer 4 in formation but at no hydrodynamic benefit. B. Sequentially increasing the phase lag by a fixed amount ∆ϕ = −30°
in an inline school of four swimmers stabilizes the trailing swimmer but at no hydrodynamic benefit. C. Gradually tuning the
phase lag ∆ϕ in a school of four swimmers as done in panel B. At moderate phase lags, the school stays cohesive (top plot) but
swimmer 4 barely gets any power savings (bottom plot). D. By laterally offsetting the swimmers, four swimmers, all flapping
inphase, form cohesive schools with different patterns, e.g. with side-by-side pairing of two swimmers, staggered, and diamond
patterns. The time evolution of separation distances is shown in Fig. 5.19. Individuals in each pattern receive a different amount of
hydrodynamic benefit. Diamond formation provides the most power saving for the school as anticipated in [486] for a school in
a regular infinite lattice. In panels A, B and D, % values indicate the additional saving or expenditure in cost of transport relative
to solitary swimming.
5.12 Feedback control for maintaining school cohesion in uncoordinated
flapping swimmers
We employed the minimal time-delayed particle model developed in [339] and described in Sec. A.3.1 to
describe a pair of flow-coupled oscillating swimmers (Fig. A.2). With the aim to stabilize the formation of
a pair of swimmers flapping at different frequencies, we devise a feedback controller based on local flow
sensing inspired by [264]. We consider frequency control instead of amplitude control for two reasons.
First, as suggested in [415], when fish try to accelerate or decelerate, they typically adjust their undulation
[Figure 5.19 graphic: panels A-E. Each panel shows the formation of four swimmers and the pairwise spacing d/UT plotted against time t/T.]
Figure 5.19: Alternative formations of four swimmers. Vortex sheet simulations of four swimmers in alternative formations. On the right-hand side, we report the pairwise spacing between them. The lateral distance is ℓ = 0.25L. A. Two leading
swimmers swim side by side. The third and fourth swimmers collide with each other. B. The same configuration as in A, with a larger
initial distance between the third and fourth swimmers. They form a stable school. The fourth swimmer stays at the second equilibrium behind the third swimmer. C. An additional swimmer is placed side by side with the second swimmer in a school of three
inline swimmers. The second and third swimmers stay at the second equilibrium from the first swimmer, and the fourth swimmer
stays at the second equilibrium from the second and third swimmers. D. An additional swimmer is placed side by side with the third
swimmer in a school of three inline swimmers. The last two swimmers stay at the second equilibrium from the second swimmer.
E. Four inline inphase swimmers when initially placed close to the second equilibrium. Power saving per swimmer is reported in
Fig. 5.18.
frequency while maintaining the same body deformation amplitude for optimal hydrodynamic performance. Thus, developing a frequency controller is of biological relevance. Second, a mismatch in frequency is more
detrimental to the stability of the emergent formation than a mismatch in amplitude, as demonstrated in [339] and Fig. A.3.
5.12.1 Flow sensing model
The parametric study in Fig. A.3 shows that the pair of swimmers loses cohesion when the follower
flaps at the same amplitude (A2/A1 = 1) but at a different frequency (f2/f1 ≠ 1) from the leader. To
form a coherent school, an intuitive idea is for the follower to match its own frequency to the
frequency of the leader. However, the frequency of the leader f1 is unknown to the follower. Thus, we
need to estimate it based on the follower’s local information. The follower has access to its own swimming
velocity $\dot{x}_2(t)$, the flow velocity at its own location $e^{-\Delta t/\tau}\,\dot{y}_1(t - \Delta t)$, and the hydrodynamic force $F_2(t)$
acting on it. Here we discuss scenarios in which the follower, from this local information, can estimate
the flapping frequency f1 of the leader.
5.12.2 Simplified sensing scenarios
We first considered a one-way coupled problem in which the follower probes the flow velocity in the wake
of the leader, but its swimming motion is not influenced by the leader’s wake. Thus, the swimming speed
of the follower is given by $U_2 = \pi A_2 f_2 \sqrt{2C_T/C_D}$ and its motion is given by $x_2(t) = -U_2 t$. The flow
velocity left by the leader is given by $v(x,t) = 2\pi A_1 f_1\, e^{-(x+U_1 t)/(U_1\tau)} \cos(2\pi f_1 x/U_1)$. The signal sensed
by the follower as a function of time is $v(x_2(t),t) = 2\pi A_1 f_1\, e^{-(-U_2 t+U_1 t)/(U_1\tau)} \cos(2\pi f_1 U_2 t/U_1)$. Thus, the
dominant frequency is $f_1 U_2/U_1 = f_2 A_2/A_1$. Under this scenario, the follower cannot decode information
about the leader’s oscillatory frequency.
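As a numerical illustration of this one-way coupled scenario, the sketch below samples the sensed signal in the limit τ → ∞ and reads off its dominant Fourier frequency. The parameter values, the helper name `dominant_frequency`, and the use of U1 = πA1f1√(2CT/CD) for the leader (by analogy with the expression for U2) are assumptions made for illustration only.

```python
import numpy as np

def dominant_frequency(signal, dt):
    """Frequency of the largest non-DC peak in the one-sided spectrum."""
    spectrum = np.abs(np.fft.rfft(signal - np.mean(signal)))
    freqs = np.fft.rfftfreq(len(signal), d=dt)
    return freqs[1:][np.argmax(spectrum[1:])]

# Illustrative (assumed) parameters; CT and CD cancel in the ratio U2/U1.
A1, A2 = 1.0, 1.0
f1, f2 = 1.0, 0.7
CT, CD = 1.0, 0.5
U1 = np.pi * A1 * f1 * np.sqrt(2.0 * CT / CD)  # assumed by analogy with U2
U2 = np.pi * A2 * f2 * np.sqrt(2.0 * CT / CD)

dt = 0.01
t = np.arange(10000) * dt                      # 100 time units
x2 = -U2 * t                                   # follower trajectory
v_sensed = 2.0 * np.pi * A1 * f1 * np.cos(2.0 * np.pi * f1 * x2 / U1)  # tau -> infinity

fs = dominant_frequency(v_sensed, dt)
print(fs, f2 * A2 / A1)
```

With equal amplitudes the sensed frequency coincides with the follower's own frequency, which is why the signal carries no information about f1 in this scenario.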
We next considered both swimmers to be tethered, such that the gap distance d(t) between them
is kept constant at d and the incoming flow velocity equals the self-propelled swimming speed U1 of the leader.
Here, ∆t = d/U1 is a constant. The flow velocity sensed by the follower is $2\pi A_1 f_1\, e^{-d/(U_1\tau)} \cos(2\pi f_1(t - d/U_1))$,
from which the frequency of the leader f1 can be decoded by a frequency analysis. Alternatively, if the
follower senses the fluid force instead of the flow velocity, the force can be represented as
a Fourier series with four modes, at frequencies |f1 − f2|, f1 + f2, 2 max(f1, f2), and 2 min(f1, f2), which
encode the leader’s frequency f1. In fact, the first two Fourier modes are sufficient for the follower
to decode the frequency of the leader.
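The four-mode structure of the sensed force can be checked numerically. The sketch below is an illustration under simplifying assumptions, not the thesis implementation: both swimmers have unit amplitude, τ → ∞, the delay d/U1 is an arbitrary constant, and the record length is chosen so that every mode falls exactly on an FFT bin. The helper `top_peaks` and the decoding step, which uses the follower's knowledge of its own frequency f2, are hypothetical.

```python
import numpy as np

def top_peaks(signal, dt, n_peaks):
    """Frequencies of the n_peaks largest non-DC bins of the one-sided spectrum."""
    spectrum = np.abs(np.fft.rfft(signal))
    spectrum[0] = 0.0                              # ignore the mean
    freqs = np.fft.rfftfreq(len(signal), d=dt)
    return freqs[np.argsort(spectrum)[-n_peaks:]]

f1_true, f2 = 1.0, 0.7      # leader and follower frequencies (assumed values)
delay = 0.3                 # d / U1, arbitrary constant delay
dt = 0.01
t = np.arange(10000) * dt

# Force proportional to the squared relative transverse velocity (unit amplitudes).
force = (np.cos(2 * np.pi * f2 * t) - np.cos(2 * np.pi * f1_true * (t - delay))) ** 2

four_modes = np.sort(top_peaks(force, dt, 4))      # |f1-f2|, 2*min, f1+f2, 2*max

# The two strongest modes are the cross terms |f1 - f2| and f1 + f2.
low, high = np.sort(top_peaks(force, dt, 2))
candidates = np.array([(high + low) / 2.0, (high - low) / 2.0])
# One candidate is the follower's own frequency; the other is the leader's.
f1_est = candidates[np.argmax(np.abs(candidates - f2))]
print(four_modes, f1_est)
```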
5.12.3 Flow sensing model during free swimming
[Figure 5.20 graphic: panels A and B. Fourier magnitude (×10³) plotted against frequency for f2/f1 = 0.6, 0.7 and f2/f1 = 1.2, 1.4.]
Figure 5.20: Fourier analysis of flow velocity sensed by the follower. A. The follower’s frequency is
smaller than that of the leader (f2/f1 = 0.6, 0.7). B. The follower’s frequency is larger than that of the
leader (f2/f1 = 1.2, 1.4).
[Figure 5.21 graphic: colormap of fs over leader frequency f1 and follower frequency f2, each ranging from 0.25 to 2.0, with the line f1 = f2 and the decision boundary marked.]
Figure 5.21: Dominant frequency of sensed flow velocity fs as a function of leader frequency f1 and follower frequency f2.
The simplified scenarios discussed above are insufficient to decode the frequency of the leader while
swimming freely. To this end, we considered the original two-way coupled problem in Eq. (A.27). We
used the flow velocity v(x2(t), t) at the location of the follower as sensory cue and took the Fourier series
expansion of this flow velocity. We considered the limit of high Reynolds number (τ → ∞) to ensure long
enough signal. Fig. 5.20A,B shows four typical examples. When the follower’s frequency f2 is smaller than
that of the leader and the leader frequency is normalized to f1 = 1, the frequency of the dominant mode
is nearly independent of the follower’s frequency (Fig. 5.20A): for f2 = 0.6 and f2 = 0.7, respectively,
the dominant mode has frequency fs = 1.15 and fs = 1.18 (the follower frequency f2 is reflected in the
second mode of flow velocity). When the follower’s frequency f2 is larger than that of the leader, the first mode
only reflects the frequency of the follower fs ≈ f2 (Fig. 5.20B): for f2 = 1.2 and f2 = 1.4, respectively,
the dominant mode has frequency fs = 1.2 and fs = 1.4.
We set out to probe whether f2/f1 = 1 indeed marks the boundary between the regimes in which the dominant
Fourier mode fs of v(x2(t), t) reflects either f1 or f2. We discretized the frequency space (f1, f2) over
[0.25, 2] × [0.25, 2] using a 175 × 175 grid and, at each grid point, solved the coupled time-delay
system in Eq. (A.27) over the time interval [0, 100] using a timestep $dt = 10^{-3}$. We evaluated v(x2(t), t)
over the entire time interval and calculated the dominant Fourier mode. Results are shown as a colormap
over the (f1, f2) space in Fig. 5.21. We found that the boundary separating the regions where f1 or f2 is reflected
in the dominant mode fs is not f1 = f2 (dashed line in Fig. 5.21). Instead, it is a line with a smaller slope
(solid line). Below this line, fs is close but not exactly equal to f1. Above this line, fs is equal to f2 (up
to a small numerical error $< 10^{-8}$). To understand this transition in fs at the solid line, we went back to
the equation of motion, Eq. (A.27), in an effort to determine the dominant frequency analytically.
5.12.4 Frequency analysis
To simplify the equations of motion, we assumed that the leader moves at a constant speed U1 equal
to its time-averaged speed, $\dot{x}_1(t) = -U_1$. Substituting into Eq. (A.27), we obtained the following decoupled
system of equations
$$
\begin{aligned}
m\ddot{x}_2 &= -F_2 + D_2, \qquad D_2 = C_D(\dot{x}_2)^2, \\
F_2 &= C_T\left(2\pi A_2 f_2 \cos(2\pi f_2 t - \phi) - 2\pi A_1 f_1\, e^{-\Delta t/\tau} \cos(2\pi f_1(t - \Delta t))\right)^2, \\
\Delta t &= x_2/U_1 + t.
\end{aligned}
\tag{5.4}
$$
We expanded the expression for the force F2, ignored the high frequency (f1 + f2) term (in simulation,
we checked that ignoring this term does not influence the dynamics of the problem), and took the limit of
τ → ∞:
$$
\begin{aligned}
F_2 = {}& C_1 + C_2 + C_1\cos(4\pi f_2 t - 2\phi) + C_2\cos(4\pi f_1 t - 2t - 2x_2/U_1) \\
& - C_3\cos(2\pi(f_1 - f_2)t - t - x_2/U_1 + \phi),
\end{aligned}
\tag{5.5}
$$
where the coefficients $C_1 = 2\pi^2 A_2^2 f_2^2 C_T$, $C_2 = 2\pi^2 A_1^2 f_1^2 C_T$, and $C_3 = 4\pi^2 A_1 A_2 f_1 f_2 C_T$ are introduced to
simplify the notation.
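For reference, the coefficients C1, C2, and C3 follow from expanding the square of a difference of two cosines. The identity below uses generic amplitudes a, b and phases α, β, which are shorthand introduced here for illustration and not notation from the thesis.

```latex
\begin{aligned}
C_T\,(a\cos\alpha - b\cos\beta)^2
  &= \tfrac{1}{2}C_T a^2 + \tfrac{1}{2}C_T b^2
   + \tfrac{1}{2}C_T a^2\cos 2\alpha + \tfrac{1}{2}C_T b^2\cos 2\beta \\
  &\quad - C_T\,ab\,\cos(\alpha - \beta) - C_T\,ab\,\cos(\alpha + \beta).
\end{aligned}
```

Setting a = 2πA2f2 and b = 2πA1f1 (the limit τ → ∞) gives C_T a²/2 = 2π²A2²f2²C_T = C1, C_T b²/2 = 2π²A1²f1²C_T = C2, and C_T ab = 4π²A1A2f1f2C_T = C3. The last term, cos(α + β), is the high-frequency (f1 + f2) contribution that is neglected.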
We assumed that $x_2 = B_0 + B_1 t + B_2\cos(B_3 t)$ at steady state, where B0, B1, B2, and B3 are unknown constants representing the follower’s initial location, its average speed, and the amplitude and frequency
of its oscillation, respectively. Substituting into Eq. (5.5), we get
$$
\begin{aligned}
F_2 = {}& C_1 + C_2 + C_1\cos(4\pi f_2 t - 2\phi) + C_2\cos[(4\pi f_1 - 2 - 2B_1/U_1)t - 2B_0/U_1 - 2B_2\cos(B_3 t)/U_1] \\
& - C_3\cos[(2\pi f_1 - 2\pi f_2 - 1 - B_1/U_1)t - B_0/U_1 - B_2\cos(B_3 t)/U_1 + \phi]
\end{aligned}
\tag{5.6}
$$
The complexity of this equation comes from the composite trigonometric functions: the unknown function x2 appears inside the cosine, e.g., in the term $\cos(B_2\cos(B_3 t)/U_1)$. The Fourier expansion of
$\cos[B_2\cos(B_3 t)]$ gives $J_{B_2}(0) - 2J_{B_2}(2)\cos(2B_3 t) + \text{h.o.t.}$, where $J_\alpha(n)$ denotes the Bessel function of the first
kind of order $n$ evaluated at $\alpha$. In this expansion, the first two terms provide a good approximation of the nested trigonometric functions.
As such, Eq. (5.6) can be further simplified to
$$
\begin{aligned}
F_2 = {}& C_1 + C_2 + C_1\cos(4\pi f_2 t - 2\phi) \\
& + C_2\,[J_{2B_2}(0) - 2J_{2B_2}(2)\cos(2B_3 t)]\cos[(4\pi f_1 - 2 - 2B_1/U_1)t - 2B_0/U_1] \\
& - C_2\,[2J_{2B_2}(1)\cos(B_3 t) - 2J_{2B_2}(3)\cos(3B_3 t)]\sin[(4\pi f_1 - 2 - 2B_1/U_1)t - 2B_0/U_1]\cos(2B_3 t) \\
& - C_3\,[J_{B_2}(0) - 2J_{B_2}(2)\cos(2B_3 t)]\cos[(2\pi(f_1 - f_2) - 1 - B_1/U_1)t - B_0/U_1 + \phi] \\
& + C_3\,[2J_{B_2}(1)\cos(B_3 t) - 2J_{B_2}(3)\cos(3B_3 t)]\sin[(2\pi(f_1 - f_2) - 1 - B_1/U_1)t - B_0/U_1 + \phi]
\end{aligned}
\tag{5.7}
$$
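The quality of the two-term truncation can be checked numerically. In standard notation, with J_n(z) the Bessel function of the first kind of order n evaluated at z, the Jacobi–Anger expansion reads cos(z cos θ) = J_0(z) − 2J_2(z) cos(2θ) + higher-order terms. The sketch below compares both sides for a small argument; the value z = 0.2 and the helper name are assumptions chosen only for illustration.

```python
import numpy as np
from scipy.special import jv  # Bessel function of the first kind, jv(order, argument)

def truncated_cos_of_cos(z, theta):
    """Two-term Jacobi-Anger approximation of cos(z * cos(theta))."""
    return jv(0, z) - 2.0 * jv(2, z) * np.cos(2.0 * theta)

z = 0.2                                    # assumed small oscillation amplitude
theta = np.linspace(0.0, 2.0 * np.pi, 1001)
exact = np.cos(z * np.cos(theta))
max_error = np.max(np.abs(exact - truncated_cos_of_cos(z, theta)))
print(max_error)                           # residual is of the order of J_4(z)
```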
From the above equation, when $f_2 > \sqrt{J_{2B_2}(0)}\,f_1$, the term with frequency 2f1 has the highest amplitude.
When $f_2 < \sqrt{J_{2B_2}(0)}\,f_1$, the term with frequency $f_1 - f_2 - (1 + B_1/U_1)/2\pi$ has the highest amplitude. The
decision boundary between these two modes is $f_2 = \sqrt{J_{2B_2}(0)}\,f_1$. Since B2 is small, $\sqrt{J_{2B_2}(0)}$
is slightly smaller than 1, and the analysis agrees with the numerical results in Fig. 5.21.
Lastly, we went back to the equation of motion $m\ddot{x}_2 = -F_2 + C_D(\dot{x}_2)^2$, which is a forced damped
system. The dominant frequency of the forcing term F2 determines the frequency of the solution x2. When the
motion of the follower is dominated by its own transverse oscillations, the frequency of x2 is 2f2. When
sensing flow velocity, similar to the first simplified case described in Sec. 5.12.2, the dominant frequency
of the signal is f2, given that the amplitude of both swimmers is the same. On the other hand, when the
interaction term dominates the motion of the follower, the frequency of the follower’s motion (f1 − f2)
and the spatial pattern of the wake add up. Thus, the dominant frequency is f1 + constant.
5.12.5 Sliding mode controller design
Our goal is to design a controller that stabilizes two uncoordinated swimmers in a cohesive school formation. Via the analysis in Sec. 5.12.1, we found that the follower can extract information about the frequency
of the leader f1 based only on local flow sensing, but our sensing algorithm does not provide an accurate
measurement of f1. Through analyzing the passive pairs in Sec. 5.12.1, we know that a subtle mismatch in
frequency leads to unstable formations. Thus, instead of sensing once and matching frequency, we applied
a controller that periodically senses the flow and adjusts the frequency.
From Sec. 5.12.1, both the flow velocity at the position of the follower and the follower’s swimming velocity
contain information about the leader’s frequency. However, when the follower senses its own swimming velocity, 2f2
and f1 − f2 + constant create ambiguity. Thus, we used flow velocity as the sensory cue and designed a
sliding mode controller as follows,
$$
f_2 =
\begin{cases}
f_2 - c_1, & |f_2 - f_s| \le c_3, \\
f_s - c_2, & |f_2 - f_s| > c_3,
\end{cases}
\tag{5.8}
$$
where c1, c2, c3 are constants. This controller can be understood intuitively as follows. When the sensed
frequency fs is close to the follower’s own frequency (the difference is less than a threshold c3), it means
that f2 > f1, and f2 needs to decrease. However, since the value of f1 is unknown, the
controller decreases f2 by a small step c1 until fs is no longer close to f2. When this happens, fs is
f1 plus a constant, and the controller switches f2 to fs minus a constant c2. In this study, these constants
are chosen as c1 = 0.03, c2 = 0.2, c3 = 0.1. Since the controller requires a Fourier series expansion of the
signal, we apply it intermittently, with a time interval of 5/f1.
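The switching rule in Eq. (5.8) can be written as a single update function. The sketch below is a minimal illustration: the function name is hypothetical, and the sensed frequency fs is assumed to be supplied by a separate Fourier analysis of the flow signal performed once per control interval.

```python
def update_follower_frequency(f2, fs, c1=0.03, c2=0.2, c3=0.1):
    """One application of the switching rule in Eq. (5.8).

    f2: current follower frequency; fs: dominant frequency of the sensed flow.
    c1, c2, c3 default to the values quoted in the text.
    """
    if abs(f2 - fs) <= c3:
        # The sensed signal mirrors the follower itself: step the frequency down.
        return f2 - c1
    # The sensed signal reflects the leader (f1 plus an offset): jump close to it.
    return fs - c2

# Two illustrative calls (input values are assumed, not simulation output).
stepped = update_follower_frequency(f2=1.2, fs=1.2)    # small decrement branch
switched = update_follower_frequency(f2=1.0, fs=1.18)  # switching branch
print(stepped, switched)
```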
This controller guides the follower to form a stable limit cycle with the leader. Fig. 5.22A shows the time
evolution of the scaled distance with different initial conditions. Simulations show that limit cycles exist in
multiple spatial locations, which is different from the passive limit cycle, which is a global attractor. This
implies that larger inline schools can be constructed based on this frequency controller, where different
uncoordinated swimmers stay at different distances. We plotted the scaled distance over frequency ratio
f2/f1 in Fig. 5.22B for one of these trajectories, in which black dots show when the controller is applied.
[Figure 5.22 graphic: panel A plots the scaled distance d/λ against time t, with separation, collision, and equilibrium outcomes marked; panel B plots the scaled distance d/λ against the frequency ratio f2/f1.]
Figure 5.22: Controller behavior. A. Scaled distance as a function of time for different initial conditions.
B. Scaled distance versus frequency ratio for the case plotted as the black line in A. Black dots show where the
controller changes the frequency f2. The behavior of the swimmer is shown in the SI movie.
5.13 Discussion
We analyzed how passive flow interactions mediate self-organization in groups of flapping swimmers, all
heading in the same direction. Our approach relied on a hierarchy of fluid-structure interaction models
and aimed to distill which aspects of self-organization are universal and which are scale-dependent.
We found that a pair of flapping swimmers self-organizes into inline, diagonal, or side-by-side formations (Fig. 5.1). The emergent formation depends on the swimmers’ flapping phase and initial conditions.
In fact, the distinction between these types of formation is somewhat arbitrary because, as phase varies,
the emergent equilibria are dense over the space of lateral offset and separation distance (Fig. 5.11). These
findings are consistent with experimental observations [266, 438, 339, 341], but go beyond these observations to quantify the hydrodynamic benefits to each member in these formations. Two side-by-side
swimmers flapping inphase save energy, compared to solitary swimming, and share the hydrodynamic
benefits nearly equally. When flapping antiphase, the side-by-side swimmers exert extra effort compared
to solitary swimming, contrary to a common misconception that this configuration saves hydrodynamic
energy [519]. In leader-follower formations, whether inline or diagonal, hydrodynamic benefits are bestowed entirely on the follower (Fig. 5.6A).
Importantly, we showed that the wake of a solitary leader contains information that unveils opportunities for the emergence of stable and energetically-favorable formations in pairs of swimmers. Equilibrium
locations and trends in power savings and school cohesion can all be predicted entirely from kinematic
considerations of the leader’s wake with no consideration of the two-way coupling between the two swimmers (Fig. 5.9). These results are important because they highlight the non-reciprocal or asymmetric nature
of flow coupling in leader-follower configurations, inline or diagonal, at finite Re and open new avenues for
future studies of non-reciprocal flow-coupled oscillators. These oscillators have distinct properties from
classic mechanical and biological oscillators, such as Huygens pendula or viscosity-dominant oscillators,
where the coupling between the oscillators is reciprocal; see, e.g., [425, 286, 345, 478, 175].
Our analysis has practical importance in that it provides efficient diagnostics and predictive tools that
are equally applicable to computational models and experimental data and could, therefore, be applied
broadly to analyze, predict, and test opportunities for schooling and hydrodynamic benefits in live and
robotic fish when flow measurements are available.
Case in point, we used these diagnostic tools to explain the mechanisms leading to scattering in larger
groups of inline swimmers and to predict when the wake of a leading group of swimmers offers no opportunities for a follower to benefit from passive hydrodynamics (Fig. 5.16). As the number of
flow-coupled swimmers increases, side-by-side formations remain robust, but inline formations become unstable
beyond a critical number of swimmers (Figs. 5.12 and 5.15). The critical number depends on the fluid
properties and can be predicted by analyzing the wake of the leading group of swimmers. We also developed control laws that stabilize schools when passive hydrodynamics alone cannot
(Sec. 5.10, Sec. 5.12.5). Future work will focus on testing these findings experimentally and in CFD simulations with increasing numbers of swimmers, while accounting for body deformations [181], lateral
dynamics [248, 102], and variable flapping amplitudes and frequencies [339, 182].
Our findings could have far-reaching consequences on our understanding of biological fish schools.
Field and laboratory experiments [356, 295, 16, 281] have shown that actual fish schools do not generally
conform to highly regularized patterns, and schooling fish dynamically change their position in the school.
Neighboring fish vary from side-by-side to inline and diagonal configurations. Importantly, in laboratory
experiments that challenged groups of fish to sustain high swimming speeds, the fish rearranged themselves in a side-by-side pattern as the speed increased, much like the pattern in Fig. 5.12B, presumably to
save energy [16]. These empirical observations, together with our findings that side-by-side formations
provide the fairest distribution of efforts among school members (Fig. 5.12B and C), offer intriguing interpretations of the results in [16, 281]: when the fish are not challenged by a strong background current to
sustain high swimming speeds, they position themselves as they please spatially, without much consideration to equal sharing of hydrodynamic benefits. But when challenged to swim at much higher speeds
than their average swimming speed, fish are forced to cooperate.
To expand on this, our results suggest a connection between flow physics and what is traditionally
thought of as social traits: greed versus cooperation. We posit that there is a connection between the
resources that arise from flow physics – in the form of energetic content of the wake of other swimmers
– and greedy versus cooperative group behavior. In cohesive inline formations, the leader is always disadvantaged and hydrodynamic benefits are accorded entirely to trailing swimmers (Figs. 5.6A and 5.12C).
Importantly, flows generated by these inline formations present serious impediments for additional swimmers to join the line downstream (Figs. 5.12 and 5.15). Thus, we could call these formations greedy, leaving
no resources in the environment for trailing swimmers. This thought, together with our interpretation of
the observations in [16] that cooperation to achieve an egalitarian distribution of hydrodynamic benefits
is forced, not innate, raises an interesting hypothesis. The dynamic repositioning of members within the
school (e.g., Fig. 5.18) could be driven by greed and competition to occupy hydrodynamically advantageous
positions, much like in a peloton of racing cyclists [51]. On a behavioral time scale, these ideas, besides their
relevance to schooling fish, open up opportunities for analyzing and comparing the collective flow physics
in cooperative versus greedy behavior in animal groups, from formations of swimming ducklings [513] and
flying birds [45, 374] to pelotons of racing cyclists [51]. From an evolutionary perspective, it is particularly
exciting to explore the prospect that flow physics could have acted as a selective pressure in the evolution
of social traits such as cooperation and greed in aquatic animal groups.
Chapter 6
Collective phenomena at extreme scale
6.1 Mathematical model
We modeled each fish as a self-propelled swimmer moving at a constant velocity U (units m·s⁻¹) relative
to the flow velocity. Each swimmer follows behavioral laws that are derived from shallow-water experiments in a circular tank [154]. These behavioral laws instruct swimmers to be attracted to and align with
their Voronoi neighbors. Additionally, we used a dipolar flow field to model the far-field hydrodynamic
interaction of fish [328, 444, 142, 203]. The far-field flow of an individual is approximated by a potential
dipole. Following [142, 203], we used the self-propelled velocity U and the intensity of attraction term to
define the characteristic length and time scales and arrived at three non-dimensional control parameters: alignment intensity Ia, noise intensity In, and dipole strength If. In non-dimensional form, the motion of swimmer
j, where j = 1, 2, · · ·, N and N is the number of swimmers in the school, follows the set of stochastic
differential equations (SI Sec. C):
$$
\dot{\mathbf{x}}_j = \mathbf{p}_j + \mathbf{U}_j, \qquad d\theta_j = \left[\Omega_j + (\mathbf{p}_j\cdot\nabla)\mathbf{U}_j\cdot\mathbf{p}_j^{\perp}\right]dt + I_n\,dW \tag{6.1}
$$
Here, $\mathbf{x}_j \equiv (x_j, y_j)$ and $\mathbf{p}_j \equiv (\cos\theta_j, \sin\theta_j)$ represent the position and heading direction of swimmer j.
The vector Uj represents the flow velocity generated by all other swimmers at the location of swimmer j,
Ωj denotes the angular velocity proposed by the vision-based response; see more details in SI Sec. C. W(t)
is the standard Wiener process [492], modeling the "free will" of fish.
The system of equations (6.1) is solved numerically using the explicit forward Euler method with timestep
$\Delta t = 10^{-2}$. The computational cost comes mainly from two parts, the hydrodynamic interaction and
the Voronoi tessellation, which have time complexities of $O(N^2)$ and $O(N \log N)$, respectively. For the hydrodynamic interaction, the fast multipole method (FMM), although it has a smaller asymptotic time complexity
of $O(N)$, has not shown an advantage over direct summation at the order of $10^4$
agents [510]. Thus, we optimized
and parallelized our direct-sum code using a just-in-time compiler, Numba [252], which compiles, optimizes, and
parallelizes the Python code to approach the performance of C or Fortran. The Voronoi tessellation is handled by the SciPy spatial package [472]. Simulations are performed on an Exxact Valence Workstation with
a 56-core Intel Xeon W9-3495X CPU. With this software and hardware setup, a timestep takes about 1
second for 10,000 agents, with the hydrodynamic interaction and the Voronoi tessellation each taking about half of
the computational time. Detailed descriptions can be found in Sec. C.
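A single explicit time step of Eq. (6.1) might look like the sketch below. It is an illustration under stated assumptions rather than the thesis code: the flow velocity U, the behavioral turning rate Omega, and the flow-gradient term (p·∇)U·p⊥ are taken as precomputed arrays, the function name `euler_step` is hypothetical, and the Wiener increment is sampled as √dt times a standard normal variate.

```python
import numpy as np

def euler_step(x, theta, U, Omega, strain_term, I_n, dt, rng):
    """Advance positions x (N, 2) and headings theta (N,) by one forward Euler step.

    U (N, 2): flow velocity at each swimmer; Omega (N,): behavioral turning rate;
    strain_term (N,): the term (p . grad) U . p_perp. All three are assumed to be
    computed elsewhere (hydrodynamic sum and Voronoi-neighbor rules).
    """
    p = np.column_stack((np.cos(theta), np.sin(theta)))      # heading unit vectors
    dW = np.sqrt(dt) * rng.standard_normal(theta.shape)      # Wiener increments
    x_new = x + (p + U) * dt
    theta_new = theta + (Omega + strain_term) * dt + I_n * dW
    return x_new, theta_new

# Noise-free, flow-free check: swimmers translate along their headings at unit speed.
rng = np.random.default_rng(0)
theta0 = np.array([0.0, np.pi / 2, np.pi])
x0 = np.zeros((3, 2))
x1, theta1 = euler_step(x0, theta0, np.zeros((3, 2)), np.zeros(3), np.zeros(3),
                        I_n=0.0, dt=1e-2, rng=rng)
print(x1)
```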
6.2 Statistical and data-driven analysis
6.2.1 Order parameters
Order parameters of the school can be defined either globally over the entire school or locally. Evaluating
the order parameter globally has been done in previous papers from our group [142, 203]. The polarization
order parameter P and milling order parameter M are calculated as follows
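$$
P = \frac{1}{N}\left\|\sum_{i=1}^{N}(\cos\theta_i, \sin\theta_i)\right\|, \qquad
M = \frac{1}{N}\sum_{i=1}^{N}\frac{|\dot{\mathbf{x}}_i\cdot(\mathbf{x}_i - \bar{\mathbf{x}})|}{\|\dot{\mathbf{x}}_i\|\,\|\mathbf{x}_i - \bar{\mathbf{x}}\|}
\tag{6.2}
$$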
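As a small illustration, the global polarization P can be evaluated from the heading angles alone. The sketch below covers only P; the function name and the test headings are assumptions made for the example.

```python
import numpy as np

def polarization(theta):
    """Polarization order parameter: norm of the mean heading unit vector."""
    return np.abs(np.mean(np.exp(1j * np.asarray(theta))))

# Fully aligned school versus headings spread evenly over the circle.
P_aligned = polarization(np.full(100, 0.3))
P_spread = polarization(np.linspace(0.0, 2.0 * np.pi, 100, endpoint=False))
print(P_aligned, P_spread)
```

Writing each heading as the complex number exp(iθ) gives the same value as the vector form, since the modulus of the mean complex heading equals the norm of the mean unit vector.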
6.2.2 Identifying splitting and merging events
Fish remained cohesive in relatively small groups, but in large schools, we observed dynamic splitting
and merging where the large school got divided into subgroups, each moving in a different direction that
seemed to randomly rejoin and divide again for the entire simulation time. To identify these splitting and
merging events, we examined the time evolution of the polar order parameter: P rapidly decreased or
increased when a splitting or merging event occurred. To determine the time scale at which these events
took place, we calculated the dominant frequency of dP/dt using the fast Fourier transform (FFT). In the
absence of splitting and merging events, such as in small groups of fish, the FFT is characterized by high
frequencies due to individual-level noise. We discarded these frequencies (equivalent to applying a low-pass filter)
to identify the frequencies at which the splitting and merging events occurred in large schools. Specifically, we discarded
all frequencies larger than 0.5. The inverse of the remaining dominant frequency defines the time scale of splitting
and merging.
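The procedure above can be sketched as follows. This is an illustrative reading of the text, not the thesis code: the function name, the finite-difference derivative via `np.gradient`, and the synthetic polarization signal are assumptions; only the cutoff frequency of 0.5 is taken from the text.

```python
import numpy as np

def splitting_merging_timescale(P, dt, cutoff=0.5):
    """Inverse of the dominant frequency of dP/dt after discarding frequencies > cutoff."""
    dPdt = np.gradient(P, dt)                       # finite-difference time derivative
    spectrum = np.abs(np.fft.rfft(dPdt))
    freqs = np.fft.rfftfreq(len(P), d=dt)
    keep = (freqs > 0.0) & (freqs <= cutoff)        # low-pass: drop DC and high frequencies
    f_dominant = freqs[keep][np.argmax(spectrum[keep])]
    return 1.0 / f_dominant

# Synthetic signal: slow oscillation (frequency 0.05) plus fast jitter (frequency 3).
dt = 0.01
t = np.arange(20000) * dt
P = 0.5 + 0.4 * np.sin(2 * np.pi * 0.05 * t) + 0.05 * np.sin(2 * np.pi * 3.0 * t)
timescale = splitting_merging_timescale(P, dt)
print(timescale)
```

In this synthetic example the fast jitter dominates the raw spectrum of dP/dt, so the cutoff is what allows the slow component to be identified.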
Expectation–maximization algorithms [107], such as K-means [289] or mixture models [119], struggle
to identify intertwined clusters with time-varying shapes. Here, we used density-based methods that
are designed to separate low- and high-density regimes in the domain and identify complex-shaped clusters [130, 62, 304, 399, 226]; particularly, we used the Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN) algorithm [130, 62, 304], implemented in the scikit-learn package [361],
which has been successfully applied to identify clusters in simulations of the Vicsek model [316].
6.2.3 Spatial correlation in velocity fluctuations
The degree of polarization P provides little insight into the collective response in a school [68, 72]. To
understand collective response, we examined how fluctuations in each swimmer’s velocity correlate with
those of others. For swimmer i, we defined the fluctuation δvi around the group’s mean velocity as $\delta\mathbf{v}_i = \mathbf{v}_i - \langle\mathbf{v}\rangle_N$,
where $\langle\mathbf{v}\rangle_N = \sum_{j=1}^{N}\mathbf{v}_j/N$. By construction, $\sum_{i=1}^{N}\delta\mathbf{v}_i = 0$, which simply indicates no net
motion in the center of mass reference frame of the school. We defined the spatial correlation function
C(r) of fluctuations, which measures the average inner product of velocity fluctuations of swimmers at a
distance r from each other,
$$
C(r) = \frac{1}{C_o}\,\frac{\sum_i\sum_j(\delta\mathbf{v}_i\cdot\delta\mathbf{v}_j)\,\delta(r - r_{ij})}{\sum_i\sum_j\delta(r - r_{ij})}. \tag{6.3}
$$
Here, δ(r − rij ) is a smooth Dirac-delta function selecting pairs of swimmers at mutual distance r and Co
is a normalization factor such that C(r = 0) = 1.
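One way to evaluate Eq. (6.3) on discrete data is to replace the smooth delta function by distance bins. The sketch below makes that substitution and normalizes by the mean squared fluctuation so that perfectly correlated pairs give C = 1; both choices, along with the function name and the synthetic data, are assumptions made for illustration.

```python
import numpy as np

def velocity_fluctuation_correlation(positions, velocities, bin_edges):
    """Binned estimate of C(r): mean of dv_i . dv_j over pairs at distance r, normalized."""
    dv = velocities - velocities.mean(axis=0)            # fluctuations about the mean
    i, j = np.triu_indices(len(positions), k=1)          # all distinct pairs
    r_ij = np.linalg.norm(positions[i] - positions[j], axis=1)
    dots = np.einsum("ij,ij->i", dv[i], dv[j])
    num, _ = np.histogram(r_ij, bins=bin_edges, weights=dots)
    cnt, _ = np.histogram(r_ij, bins=bin_edges)
    C0 = np.mean(np.einsum("ij,ij->i", dv, dv))          # assumed normalization
    C = np.full(len(cnt), np.nan)                        # empty bins stay NaN
    np.divide(num, cnt * C0, out=C, where=cnt > 0)
    return C

# Two distant subgroups with opposite velocity fluctuations (synthetic data).
rng = np.random.default_rng(1)
group_a = rng.uniform(0.0, 1.0, size=(20, 2))
group_b = rng.uniform(0.0, 1.0, size=(20, 2)) + np.array([100.0, 0.0])
positions = np.vstack((group_a, group_b))
velocities = np.vstack((np.tile([1.0, 0.5], (20, 1)), np.tile([1.0, -0.5], (20, 1))))
C = velocity_fluctuation_correlation(positions, velocities, np.array([0.0, 5.0, 50.0, 150.0]))
print(C)
```

Pairs within a subgroup are fully correlated and pairs across subgroups are fully anti-correlated, which is the signature expected when a school splits into groups moving in different directions.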
6.2.4 Time delays during turning and information propagation within the group
To characterize a collective turn performed by a cohesive polarized group of swimmers, we examined
the time evolution of the curvature $\kappa_i = \|\mathbf{v}_i\times\dot{\mathbf{v}}_i\| / \|\mathbf{v}_i\|^3$ of the trajectory traced by swimmer i, where $\dot{\mathbf{v}}_i$ is the
swimmer’s acceleration. In 2D, the curvature can be calculated directly in terms of the time derivatives
of the coordinates $(x_i, y_i)$, namely, $\kappa_i(t) = (\dot{x}_i\ddot{y}_i - \dot{y}_i\ddot{x}_i)/(\dot{x}_i^2 + \dot{y}_i^2)^{3/2}$. The time evolution of the curvature κi(t) of a
swimmer i undergoing a turn exhibits a maximum at the time of the turn. Inspired by [17, 18], and given
two swimmers i and j, we defined the mutual turning delay τij as the time shift of the full curve
of κj(t) that maximizes its overlap with κi(t),
$$
\tau_{ij} = \operatorname*{argmax}_{\tau}\;\kappa_i(t)\,\kappa_j(t - \tau). \tag{6.4}
$$
Here, τij < 0 means fish i turns ahead of fish j and vice versa. In the absence of noise, time ordering
requires that τij = τik + τkj , for each triplet i, j, k. For example, if i turns 10 time units before k, and k
turns 5 time units before j, then i turns 15 time units before j. Because we are dealing with a noisy system,
this equality may not be strictly satisfied, but τij is equal to τik + τkj on average.
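The mutual delay can be estimated from sampled curvature signals with a discrete cross-correlation. The sketch below interprets the overlap in Eq. (6.4) as a sum over time samples, which is an assumption; the function name and the Gaussian-shaped synthetic turns are illustrative.

```python
import numpy as np

def mutual_turning_delay(kappa_i, kappa_j, dt):
    """Shift tau maximizing sum_t kappa_i(t) * kappa_j(t - tau)."""
    overlap = np.correlate(kappa_i, kappa_j, mode="full")
    # Lags corresponding to the 'full' output, from -(len(kappa_j)-1) to len(kappa_i)-1.
    lags = np.arange(-(len(kappa_j) - 1), len(kappa_i))
    return lags[np.argmax(overlap)] * dt

# Swimmer i turns at t = 50, swimmer j turns 10 time units later.
t = np.arange(200.0)
kappa_i = np.exp(-0.5 * ((t - 50.0) / 3.0) ** 2)
kappa_j = np.exp(-0.5 * ((t - 60.0) / 3.0) ** 2)
tau_ij = mutual_turning_delay(kappa_i, kappa_j, dt=1.0)
tau_ji = mutual_turning_delay(kappa_j, kappa_i, dt=1.0)
print(tau_ij, tau_ji)
```

The negative value of `tau_ij` matches the sign convention in the text, where τij < 0 means that fish i turns ahead of fish j.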
We next ranked the group of fish undergoing a turn based on their time of maximal curvature. For
each fish i, we calculated how many other fish it has turned ahead of [17, 70]. The order of this number
146



(the number of other fish that a fish precedes in turning) defines a rank for the fish; the first-ranked fish
is ahead of the largest number of fish, and its turning time is used to set the time t1 of the onset of the
turning event. In a perfect system, with no noise, the turning time ti of a lagging fish i can be calculated
directly relative to the turning time of the first-ranked fish 1, ti = t1 + τi1. However, because the system is
noisy, this method of calculating ti introduces small statistical errors. To minimize these errors, we define
ti using the mutual delay τij with respect to all swimmers j ranked higher than i,
$$
t_i = \frac{1}{\mathrm{rank}_i - 1}\sum_{\mathrm{rank}_j < \mathrm{rank}_i}(t_j + \tau_{ij}), \qquad \mathrm{rank}_i > 1. \tag{6.5}
$$
6.2.5 Coarse Graining of the active matter
The school can be viewed as an active fluid with spatiotemporally changing density and velocity fields.
Given particle-based simulations, hydrodynamic fields are obtained by coarse-graining. A popular coarse-graining approach is based on convolution kernels [418, 477, 427], weight functions that translate discrete
fine-grained particle densities into continuous fields. Namely, given the particle positions xi(t) and velocities vi(t), the associated density field ρ(x, t) and momentum field p(x, t) = ρv are calculated as
$$
\rho(\mathbf{x}, t) = \sum_i K[\mathbf{x} - \mathbf{x}_i(t)] \tag{6.6}
$$
The kernel K(x) is chosen to be a Gaussian kernel, $K(\mathbf{x}) = \frac{1}{2\pi\sigma^2}\,e^{-\|\mathbf{x}\|^2/(2\sigma^2)}$, so that the total number of
particles is recovered from $\int d^2x\,\rho(\mathbf{x}, t) = N$.
The average density of a school is calculated as
$$
\langle\rho(t)\rangle = \frac{\int_\Omega\rho(\mathbf{x}, t)\,d\mathbf{x}}{\int_\Omega d\mathbf{x}}, \tag{6.7}
$$
where Ω is defined as the region where $\rho(\mathbf{x}, t) > 1\times10^{-2}$.
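The Gaussian coarse-graining of the density and the thresholded average can be sketched on a uniform grid as follows. The grid extent, the kernel width σ = 0.2, the random particle positions, and the function names are assumptions for illustration; only the threshold 10⁻² is taken from the text.

```python
import numpy as np

def coarse_grained_density(points, grid_x, grid_y, sigma):
    """Density field rho(x) = sum_i K(x - x_i) with a normalized 2D Gaussian kernel."""
    X, Y = np.meshgrid(grid_x, grid_y, indexing="ij")
    dx = X[..., None] - points[:, 0]                 # shape (nx, ny, N)
    dy = Y[..., None] - points[:, 1]
    kernel = np.exp(-(dx**2 + dy**2) / (2.0 * sigma**2)) / (2.0 * np.pi * sigma**2)
    return kernel.sum(axis=-1)

def average_density(rho, threshold=1e-2):
    """Mean of rho over the region where rho exceeds the threshold (uniform grid)."""
    return rho[rho > threshold].mean()

rng = np.random.default_rng(2)
points = rng.uniform(0.0, 1.0, size=(50, 2))          # 50 particles in the unit square
grid = np.linspace(-2.0, 3.0, 101)                    # grid spacing 0.05
rho = coarse_grained_density(points, grid, grid, sigma=0.2)

cell_area = (grid[1] - grid[0]) ** 2
total_particles = rho.sum() * cell_area               # approximates the integral of rho
print(total_particles, average_density(rho))
```

Because the kernel is normalized, the integral of the density over the domain recovers the number of particles, which provides a quick sanity check on the grid resolution and extent.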
6.3 Dynamic reorganization, splitting and merging in large fish schools
We numerically simulated the motion of a school of 50,000 fish coupled via visual feedback rules and
flow interactions in an unbounded planar domain (Fig. 6.1, Suppl. movie 1). Each swimmer followed
behavioral rules derived empirically from shallow-water experiments, where it turned towards its Voronoi
neighbors, aligned with the same neighbors, and experienced rotational white noise [154, 61]. Additionally,
each swimmer generated a dipolar flow field and responded to the combined flow generated by all other
swimmers [142, 203]. We normalized the swimming speed U and intensity of rotational attraction to
one by a proper choice of characteristic time and length scales [142]. Accordingly, three dimensionless
parameters (In, Ia, If ) distinguished the behavior of individual swimmers, representing, respectively, the
rotational noise, alignment, and hydrodynamic intensities (Methods). Here, we used parameter values
that, in smaller groups of 100 fish, led to stable polarized schooling [142, 203] (Fig. 6.2A). We optimized
our computational algorithms in order to scale our simulations to groups of the order of $10^4$ swimmers
(Methods). In the group of 50,000 fish, starting from random initial conditions, the fish self-organized
into coherent polarized structures that dynamically split and merged, exhibiting large density fluctuations
(Fig. 6.1), comparable to empirical observations of large bird flocks and fish schools [486, 68, 180, 357, 45,
390, 389].
6.4 Global polarization is lost with increasing number of swimmers
To understand the emergence of this dynamic reorganization in large schools, we systematically varied the
number of swimmers N. In Fig. 6.2A-C, snapshots show cohesive and highly polarized schools of 100 and
1000 swimmers and loss of global cohesion in a school of 10,000 swimmers, where distinct polarized
clusters moved in different directions. Sample simulations at N = 100, 1000, 10,000, and 50,000 are reported
in Fig. 6.2D-G. The polarization order parameter P = |
PN
j=1 e
iθj
|/N, where θj is the orientation of
148



swimmer j, is consistently close to 1 for N = 100 and 1000, indicating high polarization at all time. For N=
10,000 and 50,000, P fluctuates violently, reflecting the reorganization and constant splitting and merging
in larger schools: a sharp decrease in P indicates a splitting event, while a sharp increase indicates a
merging event. Defining the school mean velocity ⟨v⟩ =
PN
j=1 vj/N, we found that, on average, schools
swam faster than the individual swimming speed U for N =100 and 1000 but slower for N =10,000
and 50,000 because of the breaking of these larger schools into subgroups that themselves swam faster
but in random directions. For example, in Fig. 6.2A-C, ∥⟨v⟩∥ = 1.20, 1.08, 0.54 for N =100, 1000 and
10,000, respectively, with the four subgroups in Fig. 6.2C swimming at average speeds 1.14, 0.83, 0.86,
10a
Figure 6.1: Emergent behavior in a school of 50,000 fish. School organizes into coherent polarized clusters that dynamically
split and merge, exhibiting large density fluctuations, as shown here in a massive merging event in a portion of the school.
Parameter values: Ia = 9, In = 0.5, If = 0.01, T = 1000, and N = 50,000. Suppl. Movie 1. Only 20% of fish are plotted within
a subgroup of the school.
149



1.08, albeit in different directions (see also Suppl. Movie 1). The time evolution of cos(∠⟨v⟩), where
∠⟨v⟩ represents the school’s overall orientation shows more frequent changes in orientation at smaller
N: in the larger schools, frequent splitting and merging events created subgroups that moved in random
directions, hindering the entire school from turning together cohesively. Fig. 6.2F and G show the number
of subgroups per school identified by our automatic clustering algorithm (Methods) and number of fish
per cluster. The larger schools at N= 10,000 and 50,000 exhibited wider distributions while the average
number of clusters and average number of fish per cluster varied little. Because of the behavioral and
statistical similarities between N= 10,000 and N = 50,000, and to save computational effort, hereafter we
analyze the behavior of groups of up to 10,000 fish.
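The order parameter P and the school mean velocity ⟨v⟩ used in this section can be evaluated in a few lines. The following is a minimal sketch, not the thesis code: the function names and the two-subgroup example (two subgroups of 50 fish swimming at speed 1.1 in perpendicular directions) are illustrative assumptions chosen to show how a school can move slower than its subgroups.

```python
import cmath
import math


def polarization(headings):
    """P = |sum_j exp(i*theta_j)| / N for heading angles given in radians."""
    total = sum(cmath.exp(1j * theta) for theta in headings)
    return abs(total) / len(headings)


def mean_velocity(velocities):
    """School mean velocity <v> = sum_j v_j / N for a list of (vx, vy) tuples."""
    n = len(velocities)
    return (sum(v[0] for v in velocities) / n,
            sum(v[1] for v in velocities) / n)


def speed(v):
    """Magnitude of a 2D velocity vector."""
    return math.hypot(v[0], v[1])


# Hypothetical example: two equal subgroups, each swimming at speed 1.1,
# one along +x and one along +y.
subgroup_a = [(1.1, 0.0)] * 50
subgroup_b = [(0.0, 1.1)] * 50
school_speed = speed(mean_velocity(subgroup_a + subgroup_b))
```

In this example each subgroup swims at 1.1, yet the magnitude of the school mean velocity is smaller because the two directions partially cancel, which is the effect described above for the larger schools.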
6.5 More is different
Figure 6.2: More is different: self-organized behavior depends on group size. Snapshots of three schools of A. 100, B. 1000, and C. 10,000 fish. For N = 100 and 1000, the school is globally polarized and remains coherent in time, while for N = 10,000, the school continuously self-reorganizes, dynamically splitting and merging. Blue arrows indicate the school's average velocity, and green arrows indicate the average speed of each cluster; turning, splitting, and merging events are marked in the time series. Time evolution of D. school polarization P and E. average orientation cos ∠⟨v⟩. Distributions (probability density functions) of F. number of clusters and G. number of fish per cluster shown in log scale; dashed line indicates statistical average and shaded grey area indicates standard deviation. Parameter values: Ia = 9, In = 0.5, If = 0.01, T = 1000. In D-G, N = 100, 1000, 10,000, and 50,000 (from top to bottom). See Suppl. Movies 1 & 2.

In Fig. 6.3A-C, we systematically varied the school size from 100 to 10,000 swimmers. Up to N ≈ 1000, the swimmers exhibited stable schooling, behaving mostly as an indivisible entity, with consistently high polarization values P, where P here is averaged over the last 80% of the simulation time, discounting the initial 20% to eliminate transient effects due to random initialization. Beyond N = 1000, the school began to split, forming locally polarized subgroups that dynamically rejoined and separated again. This dynamic reorganization caused a decrease in the global polarization order parameter and an increase in its variance (Fig. 6.3A). In the highly polarized and cohesive regime, the school turned frequently and rarely split, but as N increased, the frequency of global turning events decreased while the frequency of splitting and merging increased (Fig. 6.3B). The average density of the school increased monotonically up to N ≈ 1000, while, locally, the average nearest neighbor distance (NND) remained nearly unchanged and the average distance to Voronoi neighbors (VND) decreased (Fig. 6.3C). That is, in the cohesive regime, the school became denser with increasing N not by getting uniformly closer to all neighbors, but by getting closer to distant neighbors while maintaining the same distance to nearest neighbors. As N increased beyond the cohesive regime, the average density and distance to nearest and Voronoi neighbors (NND and VND) all exhibited large fluctuations, reflecting dynamic reorganization within the school.
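Two of the statistics used in this section, the time average that discards the initial 20% of a run and the average nearest neighbor distance (NND), can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the NND is computed by brute force rather than with the optimized algorithms referred to in Methods, and the Voronoi-based VND is omitted because it requires a triangulation routine.

```python
import math


def time_averaged(series, discard_fraction=0.2):
    """Mean of a time series after discarding the initial transient."""
    start = int(len(series) * discard_fraction)
    kept = series[start:]
    return sum(kept) / len(kept)


def average_nnd(positions):
    """Average nearest-neighbor distance by brute force, O(N^2) in the group size."""
    total = 0.0
    for i, (xi, yi) in enumerate(positions):
        total += min(math.hypot(xi - xj, yi - yj)
                     for j, (xj, yj) in enumerate(positions) if j != i)
    return total / len(positions)


# Hypothetical usage: a polarization trace with a short transient,
# and nine swimmers placed on a unit-spaced 3 x 3 grid.
p_trace = [0.0, 0.0] + [1.0] * 8
grid = [(float(i), float(j)) for i in range(3) for j in range(3)]
p_mean = time_averaged(p_trace)
nnd = average_nnd(grid)
```

The brute-force loop is only practical for small groups; for schools of the order of 10,000 swimmers a spatial data structure would be needed.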
Figure 6.3: School cohesiveness depends on hydrodynamic intensity of individual swimmers. Time-averaged values as functions of number of fish N with hydrodynamic interaction (If = 0.01): A. school polarization P, with shaded area indicating standard deviation, B. dominant frequency of dP/dt and cos ∠⟨v⟩, C. average nearest neighbor distance (NND), average distance to Voronoi neighbors (VND), and average density. Corresponding plots in the absence of hydrodynamic interaction (If = 0) are shown in Fig. 6.4. D. Time-averaged polar order parameter P as a function of hydrodynamic intensity If ranging from $10^{-4}$ to 5 for schools of size N = 100, N = 1000, and N = 10,000. E. Comparison of heatmaps of nearest neighbors for N = 100 and N = 10,000. Top row: with hydrodynamic interaction If = 0.01; bottom row: without hydrodynamic interactions If = 0. Parameter values: Ia = 9, In = 0.5, T = 1000.
6.6 Flow interactions trigger spontaneous self-reorganization within the school
We next asked which mechanisms lead to school splitting and merging with larger N. The vision-based rules of attraction to and alignment with Voronoi neighbors lead to no fragmentation of the group, independent of group size. We thus tested whether fragmentation with increasing group size could arise from either noise or hydrodynamic interactions. To test the effect of each, we first suppressed all hydrodynamic interactions, and considered a school of 10,000 swimmers interacting only via vision-based rules. We observed no splitting and merging, independent of noise levels (Fig. 6.4D). We thus concluded that noise alone is not sufficient for self-reorganization. Additionally, without hydrodynamic interaction, the average density of the school increased monotonically with the number of swimmers, leading to unrealistically dense patterns and a distribution of nearest neighbor distances that does not fit experimental observations (Fig. 6.3E, [367]). Hydrodynamic interactions are needed. We next maintained the same noise level and varied the intensity of the dipolar field If across several orders of magnitude from $10^{-4}$ to 5: since $I_f \sim a^2 U$ is proportional to the square of the swimmer's bodylength and to its speed, a weaker dipolar intensity represents smaller and slower fish and a larger dipolar intensity represents larger and faster fish [142]. In Fig. 6.3D, we report results across this wide range of If for N = 100, 1000, and 10,000 swimmers. Smaller schools maintain school cohesion at larger values of If. In larger schools, cohesion is lost at smaller values of If, indicating that the capacity for cohesive schools depends on the hydrodynamic intensity of individual swimmers, which in turn depends on their size and speed. That is, smaller fish form larger cohesive schools.

On the other hand, with hydrodynamic interaction and in the absence of noise, the phenomena of dynamic separation and merging are unchanged (Fig. 6.5). This shows that this self-organization does not depend on noise; instead, hydrodynamic interaction is both necessary and sufficient.
Figure 6.4: Polarized school does not split without hydrodynamic interaction. A. Average polar order parameter, B. density, and C. average nearest neighbor distance are plotted as functions of the number of fish without hydrodynamic interaction (If = 0). D. A snapshot of the school composed of 10,000 swimmers without hydrodynamic interaction. E., F. Average distances to first and second shell Voronoi neighbors are plotted as functions of N. G. Correlation length ξ is a linear function of school size L. The slope of the fitting line is 0.30.

Figure 6.5: Noise is not required for self-organization. A. A snapshot and B. time evolution of polar order parameter P for a case with N = 10,000 swimmers without noise and with hydrodynamic interaction. Parameters are N = 10,000, Ia = 9, In = 0, If = 0.01. C. Time-averaged polar order parameter P is plotted as a function of noise intensity In for N = 100, N = 1000, and N = 10,000.
6.7 Scale-free correlation in cohesive groups breaks down during school self-reorganization
Collective responsiveness in animal groups is reflected through scale-free behavioral correlations [68]. For example, the range of spatial correlation in polarized flocks of birds was shown to scale with the linear size of the flock [68]. The linear scaling of correlation range with group size implies that the effective perception range of each individual encompasses the entire group and enables seamless transfer of information between members regardless of distance. We asked whether these conclusions are generic to polarized groups of self-propelled individuals, including our simulations of schooling fish, and how self-reorganization within the school, in the form of constant splitting and merging, affects the range of spatial correlation and the ability to transfer information among school members.

To address these questions, we first considered cohesive and highly polarized groups of swimmers ranging in size from N = 100 to 1000, where, consistent with [68], we analyzed snapshots with a high degree of polarization (P > 0.9). For swimmer i, we defined the fluctuation $\delta\mathbf{v}_i$ around the group's mean velocity as $\delta\mathbf{v}_i = \mathbf{v}_i - \langle\mathbf{v}\rangle$ (Fig. 6.6A,B). By construction, $\sum_{i=1}^{N} \delta\mathbf{v}_i = 0$, indicating no net fluctuation in the motion of the center of mass of the school. We calculated the spatial correlation function C(r) of velocity fluctuations as in Eq. (6.3).

Here, the Dirac-delta function $\delta(r - r_{ij})$, where $r_{ij} = \|\mathbf{r}_{ij}\|$ and $\mathbf{r}_{ij} = \mathbf{x}_i - \mathbf{x}_j$, selects pairs of swimmers at mutual distance r, and $C_o$ is a normalization factor such that C(r = 0) = 1. The span of r does not exceed the school length L of the group, defined as $L = \max(r_{ij})$. A positive value of C(r) close to 1 implies that the fluctuations are nearly parallel and strongly correlated. Conversely, a negative value of C(r) close to −1 implies that the fluctuations are antiparallel and anticorrelated. A value of C(r) ≈ 0 implies a random distribution of velocity fluctuations with no correlation.
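A discrete version of this correlation function can be sketched as below. This is an illustration under stated assumptions rather than the implementation behind Eq. (6.3): the Dirac delta is replaced by distance bins of width `dr`, the normalization is taken as the mean squared fluctuation so that self-pairs would give 1, and the function names and the ten-swimmer example are hypothetical. The helper `first_zero_crossing` returns the distance at which the binned curve first changes sign, by linear interpolation between bin centers.

```python
import math


def correlation_function(positions, velocities, dr):
    """Binned spatial correlation of velocity fluctuations.

    Returns a list of (bin_center, C) for every non-empty distance bin.
    """
    n = len(positions)
    mean_vx = sum(v[0] for v in velocities) / n
    mean_vy = sum(v[1] for v in velocities) / n
    fluct = [(vx - mean_vx, vy - mean_vy) for vx, vy in velocities]
    # Normalization: mean squared fluctuation (assumes it is non-zero).
    c0 = sum(fx * fx + fy * fy for fx, fy in fluct) / n
    sums, counts = {}, {}
    for i in range(n):
        for j in range(i + 1, n):
            r = math.hypot(positions[i][0] - positions[j][0],
                           positions[i][1] - positions[j][1])
            b = int(r / dr)
            dot = fluct[i][0] * fluct[j][0] + fluct[i][1] * fluct[j][1]
            sums[b] = sums.get(b, 0.0) + dot
            counts[b] = counts.get(b, 0) + 1
    return [((b + 0.5) * dr, sums[b] / (counts[b] * c0)) for b in sorted(sums)]


def first_zero_crossing(curve):
    """Distance at which C(r) first goes from positive to non-positive."""
    for (r_a, c_a), (r_b, c_b) in zip(curve, curve[1:]):
        if c_a > 0.0 and c_b <= 0.0:
            return r_a + (r_b - r_a) * c_a / (c_a - c_b)
    return None


# Hypothetical example: ten swimmers on a line; the front half fluctuates
# upward and the back half downward relative to the mean velocity.
positions = [(float(i), 0.0) for i in range(10)]
velocities = [(1.0, 1.0 if i < 5 else -1.0) for i in range(10)]
curve = correlation_function(positions, velocities, dr=1.0)
crossing = first_zero_crossing(curve)
```

In the example, nearby pairs are mostly correlated and the most distant pairs are anticorrelated, so the curve changes sign at an intermediate distance, mirroring the generic form described in this section.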
In Fig. 6.6C, we report C(r) versus r for the snapshot presented in Fig. 6.6A. This form is generic: at short distances, the correlation is close to 1 and decays with increasing r, becoming negative at large interindividual distances. Such behavior indicates that within a group, there is a strong correlation at short distances and strong anticorrelation at large distances, and there is no range of r over which the velocity fluctuations are uncorrelated.
To explain the behavioral implications of this form of C(r), we defined the correlation length ξ as the relative distance r at which C(r) is zero, C(r = ξ) = 0. By definition, the value of ξ is the maximal size of the positively correlated domain. In Fig. 6.6D, the resulting correlation length ξ is plotted versus school size L using simulations at various sets of parameters (If, Ia, In) and school size N ≤ 1000, provided P > 0.9. We found that ξ increases linearly with L, much like in the case of starling flocks [68]. Correlated domains in cohesive and polarized fish schools are larger in larger groups, implying scale-free correlations. Interestingly, in our simulations, the slope is nearly one-third, similar to the slope reported in [68]. This means that the critical behavior is dominated by fluctuations that are self-similar up to the scale of the correlation length, which is one-third the size of the school. The robustness of scale-free correlation in disparate systems, from empirical bird flocks [68] to the present simulations of fish schools, and the similarity in slope obtained in these different systems indicate the existence of a universal law behind these phenomena.
But does this scale-free correlation generalize to larger groups that continuously reorganize? To answer this question, we considered the school of 10,000 swimmers described in Fig. 6.2. At each snapshot, we identified all clusters of cohesive swimmers, selected highly polarized clusters for which P > 0.9, calculated the corresponding ξ and L, and plotted the joint probability density function of cluster size L and correlation length ξ as a heatmap over the (L, ξ) space (Fig. 6.6E). The (L, ξ) values are concentrated at and below the scale-free correlation line (dashed grey) obtained in stable schools in Fig. 6.6D. Highlighted on this plot are (L, ξ) values corresponding to two examples: in one example, the entire school S of N = 10,000 split into three cohesive subgroups S1, S2, S3 of respective sizes n1 = 826, n2 = 6616, and n3 = 2460; in another example, two subgroups M1, M2 of respective sizes n1 = 5168 and n2 = 2334 merged into a larger subgroup M of size n = 7502. These results indicate a loss of scale-free correlation during school reorganization.
6.8 Information propagation during turning, splitting, and rejoining
Scale-free correlation reflects the potential for indirect transfer of information between individuals in the group but it does not describe the efficiency of a collective response to environmental factors [68, 17]. An efficient collective response depends on how localized perturbations succeed in modifying the behavior of the entire group. Take, for example, a group changing its overall heading direction (Fig. 6.7A and Suppl. Movie 3). The actual execution of such turns cannot be instantaneous, because a certain amount of time is needed to propagate the turn throughout the group. During this time, cohesion is strained by the mismatch between individuals who have already turned and those who have not yet done so, as reflected by a drop in polarization P. Therefore, the speed at which information is transferred from individual to individual plays a crucial role in maintaining group cohesion, which in turn is key for scale-free correlation and collective responsiveness.

Figure 6.6: Correlation length in cohesive and dynamically changing polarized schools. A. A snapshot of a stable school with N = 1000 and B. its corresponding velocity fluctuation. C. Correlation function C(r) is the average inner product of the velocity fluctuations of pairs of swimmers at mutual distance r. The function changes sign at r = ξ, which gives a good estimate of the average size of the correlated domains. D. The orientation correlation length ξ is plotted as a function of the school size L of the polarized schools. The parameters of local rules are ◦: Ia = 9, In = 0.5; ×: Ia = 7, In = 0.5; ∇: Ia = 5, In = 0.5; □: Ia = 9, In = 0.3; D: Ia = 9, In = 0.7. The darkness of the colors indicates the number of fish in the simulations (all the data are averaged over the snapshots where the school remains cohesive). The equation of the fitting line is ξ ≈ 0.37L − 0.84 with coefficient of determination $R^2 = 0.83$. E. Histogram of correlation length and cluster size for all clusters that emerged in the simulation of 10,000 fish. The markers indicate the schools and subgroups in Fig. 6.7E, I before and after the splitting or merging.
We aimed to quantify information transfer in cohesive groups and to assess the effect of school reorganization on information transfer in larger schools that constantly split and merged. To fix ideas, we first analyzed, following [17], a collective turn in a cohesive group of N = 1000 swimmers. Given the full trajectory of each swimmer i in the group (Fig. 6.7A), we calculated the curvature $\kappa_i$ as a function of time and identified the time $t_i$ of maximum curvature. For each pair of swimmers, i and j, we calculated their mutual turning delay, $\tau_{ij} = t_i - t_j$, defined as the amount of time by which swimmer j turns before ($\tau_{ij} > 0$) or after ($\tau_{ij} < 0$) swimmer i (Fig. 6.7B). From the delays $\tau_{ij}$, we ranked all swimmers in the group according to their turning order, identifying the first to turn, the second, and so on. We then labeled each swimmer i by its order $o_i$ in terms of its absolute turning time $t_i$ with respect to the top-ranking swimmer (Methods). We found that the top-ranking swimmers (the first swimmers to turn) are physically close to each other (Fig. 6.7C). That is, the collective turn has a spatially localized origin that propagates across the group through swimmer-to-swimmer transfer of information.

Figure 6.7: Information transfer during turning, splitting and merging. A. Polarized school of N = 1000 fish turns by "free will". Trajectories of each fish are plotted in different colors. B. Sample curvature versus time for first-rank, middle-rank, and last-rank swimmers. C., D. Rank of fish and information travel distance are plotted versus absolute turning delay. The information transfer speed is 23.2. The change of polar order parameter P and correlation length ξ/L for this case is shown in Fig. 6.8A. The inset in D. compares information transfer speed in all different scenarios, including additional data in Figs. 6.9, 6.10, 6.11 marked using symbols with dashed lines. E. An example snapshot for a school with N = 10,000 when all the swimmers are swimming together ($t_1 = 209$), and at a later time when the school splits into 3 clusters ($t_2 = t_1 + 65$). Trajectories of individual fish during the splitting process are shown; red, blue, and green represent the three clusters at $t_2$. F. Sample curvature versus time from each subgroup. G., H. Rank of fish and information travel distance are plotted versus absolute turning delay. The change of polar order parameter P and correlation length ξ/L for this case is shown in Fig. 6.8B. The information transfer speed is 4.9, 8.7, and 7.4 for the red, blue, and green subgroups, respectively. I. Two clusters in a school of N = 10,000 fish merge together. During the merging, one of the clusters turns, and the other cluster splits into two parts, and one part of it also turns. Trajectories of all swimmers are plotted. J. Sample curvatures from each subgroup. K., L. Rank of fish and information travel distance are plotted versus absolute turning delay. The information transfer speed is 30.0 and 38.9 for the red and blue subgroups, respectively. The movies of these cases are shown in SI Movie 3.

Figure 6.8: Correlation length during turning and splitting. A. Polar order parameter P and correlation length over school size ξ/L during the turning, as in the case in Fig. 6.7A. B. Polar order parameter P and correlation length over school size ξ/L during the splitting, as in the case in Fig. 6.7D.
From this ranking, we sought a dispersion law that describes how much distance d the information travels in a time t. Given that the motion of the group is two-dimensional and that the turn has a localized origin, the information propagates a distance $d_i = \sqrt{o_i/\rho}$, where ρ is the school density, which remains nearly constant during the turn [17]. The curve d(t) (Fig. 6.7C) has a clear linear regime for early and intermediate times, implying that, following the first-rank fish, the distance traveled by the information grows linearly with time, d(t) = ct, where c is the speed of propagation of information; its value is about 20 times that of the self-propelled speed U of individual swimmers in our model. We repeated this analysis for 5 instances of turning in schools of various sizes, ranging from 100 to 2000 swimmers (Fig. 6.9). The information transfer speed fluctuated with the number of swimmers but remained consistently an order of magnitude larger than that of an individual swimmer. Similar results were reported in empirical observations of bird flocks undergoing collective turns [17, 18] and in agitation waves that travel much faster than the individual animal's speed in fish schools [380, 157, 180, 368] and bird flocks [375, 191].
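The ranking and dispersion law described above can be sketched as follows. This is a minimal illustration, not the procedure detailed in Methods: the function names are hypothetical, the first swimmer to turn is given rank 0 so that d = 0 at zero delay (an assumption made here), and the speed c is estimated by a least-squares line through the origin over delays up to a chosen `t_max`. The synthetic turning times are constructed so that the true speed is 20.

```python
import math


def information_travel(turn_times, density):
    """Return (delay, distance) pairs with d_i = sqrt(o_i / rho).

    Swimmers are ranked by their turning time; rank 0 is the first to turn.
    """
    t_first = min(turn_times)
    ranked = sorted(turn_times)
    return [(t - t_first, math.sqrt(rank / density))
            for rank, t in enumerate(ranked)]


def propagation_speed(curve, t_max):
    """Least-squares slope through the origin of d(t) = c t for delays <= t_max."""
    points = [(t, d) for t, d in curve if t <= t_max]
    return (sum(t * d for t, d in points) /
            sum(t * t for t, d in points))


# Synthetic, hypothetical turning times: swimmer of rank o turns at
# t = sqrt(o / rho) / c_true, so the information front moves at c_true.
c_true, rho = 20.0, 2.0
turn_times = [math.sqrt(o / rho) / c_true for o in range(1000)]
curve = information_travel(turn_times, rho)
c_est = propagation_speed(curve, t_max=0.5)
```

With real trajectories the linear regime holds only for early and intermediate delays, so `t_max` would have to be chosen by inspecting the curve d(t).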
We next examined splitting events during school self-reorganization. In Fig. 6.7D, we show trajectories of the splitting event pointed out earlier (Figs. 6.2D, 6.6E), where the school of N = 10,000 swimmers, starting from a polarized state, splits into three subgroups (labeled as red, green, and blue), each subgroup turning in a different direction. We analyzed each subgroup, defined at the snapshot after splitting, computing the turning sequence of each swimmer within its subgroup (Fig. 6.7E). We then calculated the information travel speed within each subgroup (Fig. 6.7F). Different subgroups have nearly the same information transfer speed (∼ 3× the self-propelled velocity), which is much slower than in free turning (Figs. 6.7C,D).
Lastly, we examined information transfer during merging events. In Fig. 6.7I, we show trajectories of the merging event mentioned earlier (Fig. 6.6E), where two subgroups, starting from polarized states in different directions (labeled as red and blue), turn and merge into a single subgroup. During merging, the fish in different subgroups do not mix together (Fig. 6.7G). Instead, different subgroups turn individually to both move closer and reach a consensus in moving direction. Therefore, we calculated the curvature for each fish and found the pairwise time delays within subgroups based on the snapshot before merging (Figs. 6.7E,F). Interestingly, the information transfer speed increases to up to 40-fold that of the self-propelled speed, much faster than information propagation in free turning and splitting. We confirmed these results by analyzing other rejoining events with total numbers of fish ranging from 1000 to 3000 (Fig. 6.10). This indicates that continuous information input from neighboring clusters increases the information transfer speed (Fig. 6.7D).
6.9 Analytical derivation of continuum model of small perturbation in polarized schools
In [17], Attanasi et al. showed that a kinematic collective model with only alignment interaction, like the Vicsek model [469], would have a quadratic, diffusive propagation of information. They explained their finding of linear and fast propagation in bird flocks by introducing a moment of inertia against the social force on each individual to construct a higher-order dynamical model. In our study, on the other hand, the local kinematic model is prescribed. This raises the question of why we also obtain a linear propagation of information. To address this, we found that the most important feature of our model is a frontal bias derived from experimental measurements [154].

To analyze the dispersion rate in our model and emphasize the importance of frontal bias, we defined a simplified model in 1D and 2D and kept only the alignment terms of our model. We sought to derive a continuum model from the microscopic interaction rules.

Figure 6.9: Additional data on turning. A.-C. Turning trajectories of a school containing 100 fish and corresponding absolute turning time delay and rank. The information transfer speed is 11.4. D.-F. Turning trajectories of a school containing 500 fish and corresponding absolute turning time delay and rank. The information transfer speed is 12.7. G.-I. Turning trajectories of a school containing 2000 fish and corresponding absolute turning time delay and rank. The information transfer speed is 18.2.
Figure 6.10: Additional data on merging. A.-C. Turning and merging trajectories of a school containing 1000 fish and corresponding absolute turning time delay and rank. The information transfer speed is 28.7. D.-F. Turning and merging trajectories of a school containing 3000 fish and corresponding absolute turning time delay and rank. The information transfer speed is 26.0 and 37.8 for the red and blue subgroups, respectively.
Figure 6.11: Additional data on splitting. A.-C. Turning and splitting trajectories of a school containing 1000 fish and corresponding absolute turning time delay and rank. The information transfer speed is 3.0. D.-F. Turning and splitting trajectories of a school containing 2000 fish and corresponding absolute turning time delay and rank. The information transfer speed is 3.4.

Figure 6.12: Schematics of 1D and 2D lattice models. A. A one-dimensional model of particles on an equally spaced lattice with spacing α. B. A two-dimensional equally spaced lattice with mesh size α. C. A two-dimensional lattice with mesh size α. The particles in front of the focal particle have already moved in the perturbation direction by a small amount ψα. In all models, the group direction is the positive x direction.

We start from the microscopic equation describing the time evolution of a swimmer's heading,
\[
\dot{\theta}_i = I_a \, \frac{\sum_{j \in \mathcal{N}_i} \sin\phi_{ij}\,(1 + \gamma \cos\theta_{ij})}{\sum_{j \in \mathcal{N}_i} (1 + \gamma \cos\theta_{ij})}, \tag{6.8}
\]
where γ ∈ [0, 1] is a parameter we introduce here to vary the strength of the frontal bias: γ = 1 is equivalent to the numerical model we used in the main text, while γ = 0 means no frontal bias.

We derive a continuum equation under the following conditions. Firstly, we consider a highly polarized school, which means that the orientation of each swimmer within the school can be decomposed into the average heading direction of the school ⟨θ⟩ and a small fluctuation $\varphi_i$ of the individual swimmer i about the average heading ⟨θ⟩, namely $\theta_i = \langle\theta\rangle + \varphi_i$. Without loss of generality, we assume ⟨θ⟩ = 0, which aligns the positive x-direction with the moving direction of the group. Based on this assumption, $\sin\phi_{ij} = \sin(\theta_j - \theta_i) = \sin(\varphi_j - \varphi_i) \approx \varphi_j - \varphi_i$, and $\cos\theta_{ij} = \cos\left(\arctan\frac{y_j - y_i}{x_j - x_i} - \theta_i\right) = \cos\left(\arctan\frac{y_j - y_i}{x_j - x_i} - \varphi_i\right)$. Substituting these relationships into (6.8), we get
\[
\frac{\partial \varphi_i}{\partial t} = \frac{I_a}{N} \sum_{j \in \mathcal{N}_i} (\varphi_j - \varphi_i) \left[ 1 + \gamma \cos\left( \arctan\frac{y_j - y_i}{x_j - x_i} - \varphi_i \right) \right]. \tag{6.9}
\]
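Secondly, we assume that the swimmers are located on a 2D lattice with mesh size α (Fig. 6.12B), with the mesh orientation aligned with the swimming direction. We aim to coarse-grain the discrete equations (6.8) over a coarse-graining box containing a focal swimmer and its four immediate neighbors, such that a swimmer i responds to its direct front, left, back, and right neighbors, indexed by $i_1, i_2, i_3, i_4$. Their positions relative to particle i are
\[
\mathbf{x}_{i_1} - \mathbf{x}_i = (\alpha, 0), \quad \mathbf{x}_{i_2} - \mathbf{x}_i = (0, \alpha), \quad \mathbf{x}_{i_3} - \mathbf{x}_i = (-\alpha, 0), \quad \mathbf{x}_{i_4} - \mathbf{x}_i = (0, -\alpha). \tag{6.10}
\]
Substituting (6.10) into (6.9), we get
\[
\begin{aligned}
\frac{\partial \varphi_i}{\partial t} = \frac{I_a}{4} \big[ & \varphi_{i_1}(\gamma \cos\varphi_i + 1) + \varphi_{i_3}(-\gamma \cos\varphi_i + 1) \\
& + \varphi_{i_2}(\gamma \sin\varphi_i + 1) + \varphi_{i_4}(-\gamma \sin\varphi_i + 1) - 4\varphi_i \big].
\end{aligned}
\tag{6.11}
\]
We reorganize the above equation to construct a finite difference scheme for each term:
\[
\begin{aligned}
\frac{\partial \varphi_i}{\partial t} ={}& \frac{\alpha^2 I_a}{4} \left[ \frac{\varphi_{i_1} + \varphi_{i_3} - 2\varphi_i}{\alpha^2} + \frac{\varphi_{i_2} + \varphi_{i_4} - 2\varphi_i}{\alpha^2} \right] \\
&+ \frac{\gamma \alpha I_a}{2} \left[ \cos\varphi_i \, \frac{\varphi_{i_1} - \varphi_{i_3}}{2\alpha} + \sin\varphi_i \, \frac{\varphi_{i_2} - \varphi_{i_4}}{2\alpha} \right].
\end{aligned}
\tag{6.12}
\]
The finite differences can be approximated by first-order and second-order derivatives,
\[
\frac{\partial \varphi}{\partial t} = \frac{\alpha^2 I_a}{4} \Delta\varphi + \frac{\gamma \alpha I_a}{2} \, \mathbf{t} \cdot \nabla\varphi, \tag{6.13}
\]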
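where $\mathbf{t} = (\cos\varphi, \sin\varphi)$. After linearization ($\cos\varphi \sim 1$, $\sin\varphi \sim \varphi$), we get
\[
\frac{\partial \varphi}{\partial t} = \frac{\alpha^2 I_a}{4} \Delta\varphi + \frac{\gamma \alpha I_a}{2} \left( \frac{\partial \varphi}{\partial x} + \varphi \frac{\partial \varphi}{\partial y} \right). \tag{6.14}
\]
This shows that the advection term only exists with frontal bias.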
Analyze information transfer speed on non-reciprocal 2D model We performed a linearization on
non-reciprocal 2D PDE (6.14) about φ ≈ 0 at second-order accuracy,
∂φ
∂t −
γαIa
2
∇φ · (1, φ) −
α
2
Ia
4
∆φ = 0 (6.15)
If we ignore the diffusion term ∆φ ≈ 0 and take γ = 1, the equation is simplified to
∂φ
∂t −
αIa
2
∂φ
∂x −
αIa
2
φ
∂φ
∂y = 0 (6.16)
Consider an initial condition of φ(t = 0, x, y) = φ0(x, y), according to characteristic equations, we
arrive at a solution in implicit form
φ(t, x, y) = φ0

x +
αIa
2
t, y +
αIa
2
φ(t, x, y)t

(6.17)
We first consider an initial condition with a perturbation only in the x-direction, of the form
φ0(x, y) = A sin(kx x). The corresponding solution is
\[
\varphi(t, x, y) = A\sin\!\left(k_x\left(x + \frac{\alpha I_a}{2}\,t\right)\right),
\tag{6.18}
\]
which has a temporal frequency of ω = kx αIa/2. The linear dispersion rate is ∂ω/∂kx = αIa/2, and the
perturbation travels in the negative x direction.
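As a sanity check on (6.18), the traveling wave can be substituted back into (6.16) numerically. The sketch below is illustrative only: `c` stands for the combination αIa/2, and the function names, step size, and sample values are assumptions.

```python
import math


def phi_wave(t, x, A, k_x, c):
    """Traveling-wave solution (6.18), with c = alpha * I_a / 2."""
    return A * math.sin(k_x * (x + c * t))


def residual(t, x, A, k_x, c, h=1e-5):
    """Left-hand side of (6.16) evaluated on phi_wave with centered differences.

    phi_wave does not depend on y, so the nonlinear term phi * dphi/dy vanishes.
    """
    dphi_dt = (phi_wave(t + h, x, A, k_x, c) - phi_wave(t - h, x, A, k_x, c)) / (2 * h)
    dphi_dx = (phi_wave(t, x + h, A, k_x, c) - phi_wave(t, x - h, A, k_x, c)) / (2 * h)
    return dphi_dt - c * dphi_dx


def crest_position(t, k_x, c):
    """x-location of the crest defined by k_x * (x + c * t) = pi / 2."""
    return math.pi / (2 * k_x) - c * t
```

The residual is zero up to discretization error, and the crest position decreases linearly in time at rate `c`, consistent with propagation toward negative x.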
We then consider an initial condition with a perturbation only in the y-direction, of the form
φ0(x, y) = A sin(ky y). The corresponding solution is
\[
\varphi(t, x, y) = A\sin\!\left(k_y\left(y + \frac{\alpha I_a}{2}\,\varphi(t, x, y)\,t\right)\right),
\tag{6.19}
\]
Under this scenario, the solution is only valid before a breaking time tb. After this time, the solution
becomes multi-valued and thus invalid, which is known as shock formation [29]. The time to form a shock
wave, namely the breaking time tb, depends on the maximum initial gradient
\[
t_b = \frac{2}{\alpha I_a \max\lvert\partial_y \varphi_0\rvert}
\tag{6.20}
\]
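For the sinusoidal initial condition, max |∂yφ0| = A ky, so (6.20) gives tb = 2/(αIa A ky). Before tb, the implicit relation (6.19) can be solved by fixed-point iteration, because the iteration map has slope at most t/tb < 1. The following sketch illustrates this under those assumptions; the function names, tolerance, and iteration cap are hypothetical choices, not part of the thesis.

```python
import math


def breaking_time(alpha, I_a, A, k_y):
    """Breaking time (6.20) for phi_0 = A sin(k_y y), where max|d phi_0/dy| = A k_y."""
    return 2.0 / (alpha * I_a * A * k_y)


def solve_implicit(t, y, alpha, I_a, A, k_y, tol=1e-12, max_iter=10000):
    """Solve phi = A sin(k_y (y + c phi t)), c = alpha I_a / 2, by fixed-point iteration.

    The map is a contraction only for t < breaking_time(...); beyond that the
    solution is multi-valued and the iteration is not expected to be meaningful.
    """
    c = alpha * I_a / 2.0
    phi = A * math.sin(k_y * y)  # start from the initial condition
    for _ in range(max_iter):
        new = A * math.sin(k_y * (y + c * phi * t))
        if abs(new - phi) < tol:
            return new
        phi = new
    raise RuntimeError("no convergence; t may exceed the breaking time")
```

For example, with α = 1, Ia = 1.5, A = 0.1, and ky = 2 (illustrative values), the breaking time is 20/3 and the iteration at t = tb/2 contracts by a factor of at most one half per step.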
Viscous diffusion smooths gradients, which prevents infinite slopes by balancing the nonlinear
steepening. For weak viscosity (α²Ia/4 ≪ 1), the shock has thickness ∼ α²Ia/4.
Information transfer of attraction interaction
Here, we seek to analyze the influence of the attraction term on the continuum description. On a regular
lattice, the attraction terms from the neighbors cancel out and thus have no influence on the PDE. However,
in real fish schools, the particles are constantly moving and dynamically changing their relative locations
due to the other terms. Thus, here we consider a small offset in the position of the neighbors. Because the
derivation of the model with alignment terms showed that information travels from the front of the school
to the back, we offset only the neighbor in front of a particle, while the other particles remain fixed on
the lattice (Fig. 6.12C). We seek to derive the influence of the attraction term under this model. The
equation of motion for a single particle is written as
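\[
\frac{\partial \varphi_i}{\partial t} = \frac{\alpha}{N}\sum_{j\in N_i}
\sin\!\left(\arctan\frac{y_j - y_i}{x_j - x_i} - \varphi_i\right)
\left[1 + \gamma\cos\!\left(\arctan\frac{y_j - y_i}{x_j - x_i} - \varphi_i\right)\right],
\tag{6.21}
\]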
Particle i again responds to its four neighbors, as in the derivation of the alignment model. However,
neighbor i1 in (6.10) is offset by a small angle ψ ∼ φ, namely xi1 − xi = α(cos ψ, sin ψ). Thus, the
equation of motion is expanded as
\[
\frac{\partial \varphi_i}{\partial t} = \frac{\alpha}{4}\left[
\sin\varphi + \sin\varphi\cos\varphi + \sin(\psi - \varphi)
+ \frac{\gamma}{2}\sin(2\psi - 2\varphi)\right]
\tag{6.22}
\]
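Taking the first-order approximation of all trigonometric functions, we get
\[
\frac{\partial \varphi_i}{\partial t} = \frac{\alpha}{4}\left[(1 - \gamma)\varphi + (1 + \gamma)\psi\right]
\tag{6.23}
\]
For a full visual bias (γ = 1), we have
\[
\frac{\partial \varphi_i}{\partial t} = \frac{\alpha\psi}{2}
\tag{6.24}
\]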
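The step from (6.22) to (6.23) and (6.24) is a small-angle expansion, which can be checked numerically. In this sketch, `alpha` denotes the prefactor written α in (6.21) to (6.24); the function names and the sample angles are illustrative assumptions.

```python
import math


def rhs_expanded(phi, psi, alpha, gamma):
    """Right-hand side of (6.22) as printed."""
    return alpha / 4.0 * (
        math.sin(phi)
        + math.sin(phi) * math.cos(phi)
        + math.sin(psi - phi)
        + gamma / 2.0 * math.sin(2.0 * psi - 2.0 * phi)
    )


def rhs_linearized(phi, psi, alpha, gamma):
    """Right-hand side of (6.23): first-order approximation of (6.22)."""
    return alpha / 4.0 * ((1.0 - gamma) * phi + (1.0 + gamma) * psi)
```

For angles of order 10^-3 the two expressions differ only at third order, and for γ = 1 the linearized form reduces to αψ/2 regardless of φ, which is (6.24).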
Since the perturbation in displacement is of the same order as the perturbation in orientation (ψ ∼ φ),
when combined with the alignment model, the attraction term acts as a self-augmenting forcing term that
enlarges the magnitude of the perturbation. In the lateral direction of (6.14), the magnitude of the
perturbation determines the information transfer speed via φ∂φ/∂y. Thus, the attraction term increases
the information transfer speed in a nonlinear manner. During splitting and merging, the relationship
between ψ and φ changes due to the presence of a neighboring group at a different location.
Influence of neighboring group and moving frame of reference
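Here, we discuss the influence of a neighboring group moving in the other direction, which resembles the
case of merging and splitting. During splitting and merging, the neighboring group can be modeled by a
mirror effect, which corresponds to a Neumann boundary condition. Keeping only linear terms, the equations
are written as
\[
\begin{aligned}
\frac{\partial \varphi}{\partial t} - \frac{\alpha I_a}{2}\frac{\partial \varphi}{\partial x} &= 0, & x &< 0,\\
\frac{\partial \varphi}{\partial t} + \frac{\alpha I_a}{2}\frac{\partial \varphi}{\partial x} &= 0, & x &> 0,\\
\frac{\partial \varphi}{\partial x} &= 0, & x &= 0.
\end{aligned}
\tag{6.25}
\]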
Considering an initial condition of φ(x, 0) = e^{ikx}, the solution can be written as
\[
\varphi(x, t) = e^{i(kx - \omega t)} + e^{-i(kx + \omega t)} = 2\cos(kx)\,e^{-i\omega t}
\tag{6.26}
\]
The mirror boundary therefore does not change the information transfer speed.
In addition, the school itself is moving, and the previous analysis only considers a coordinate frame
fixed on the school. To distinguish the two, we define x as the coordinate in the lab frame and X as the
coordinate in the moving frame of reference, namely x = X + V t, where V is the average velocity of the
school. For polarized schools, V ≈ 1 is close to the self-propelled velocity. The continuum equation in
the lab frame is transformed to
\[
\frac{\partial \varphi}{\partial t} + \left(V - \frac{\alpha I_a}{2}\right)\frac{\partial \varphi}{\partial x} = 0
\tag{6.27}
\]
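The change of frame leading to (6.27) can be spelled out in one step. This is a sketch under the assumption that the moving-frame equation is the linear advection part of (6.16) written in the coordinate X, with x = X + Vt as defined above; the symbol c is shorthand introduced here for αIa/2.

```latex
% Moving frame (coordinate X), linear advection with c = \alpha I_a / 2:
%   \partial_t \varphi |_X - c \, \partial_X \varphi = 0 .
% With x = X + V t, the chain rule gives
\[
\left.\frac{\partial}{\partial t}\right|_{X}
  = \left.\frac{\partial}{\partial t}\right|_{x} + V \frac{\partial}{\partial x},
\qquad
\frac{\partial}{\partial X} = \frac{\partial}{\partial x},
\]
% so the equation in the lab frame becomes
\[
\frac{\partial \varphi}{\partial t} + V \frac{\partial \varphi}{\partial x}
  - c \, \frac{\partial \varphi}{\partial x}
  = \frac{\partial \varphi}{\partial t} + (V - c)\,\frac{\partial \varphi}{\partial x} = 0 .
\]
```

In the lab frame the perturbation therefore travels at the net speed V − αIa/2, whose sign depends on whether the school moves faster than the information propagates backward through it.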
Hydrodynamic interaction model. Assuming that the group is heading in the same direction and ignoring
noise and all vision-based interactions in (C.1), a small perturbation in φi about the heading direction
propagates via hydrodynamic interactions only, following the simpler equation
\[
\frac{\partial \varphi_i}{\partial t} = \mathbf{p}_i \cdot
\left.\frac{d\mathbf{U}_i}{d\mathbf{x}}\right|_{\mathbf{x}_i} \cdot \mathbf{p}_i^{\perp}.
\tag{6.28}
\]
Here, to simplify the analysis, we consider the swimmers to form an infinite one-dimensional lattice of
equally spaced potential dipoles with mesh size α, such that the flow field at location i is given by [446]
\[
\mathbf{U}_i = \sum_{j=-\infty,\, j\neq i}^{\infty} \frac{I_f}{\pi}\,
\frac{\mathbf{p}_j^{\perp}\sin 2\theta_{ji} + \mathbf{p}_j\cos 2\theta_{ji}}{r_{ij}^{2}}
\tag{6.29}
\]
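Assuming that the perturbation has a periodic window containing K particles, we employ the analytical
expression derived in [446], which transforms the infinite summation into a finite summation
\[
\frac{\partial \varphi_i}{\partial t} = \frac{-2\pi^{2} I_f}{K^{3}\alpha^{3}}
\sum_{j=1,\, j\neq i}^{K}
\frac{\cos\!\left[\frac{\pi}{K}(i - j)\right]}{\sin^{3}\!\left[\frac{\pi}{K}(i - j)\right]}
\,\sin(\varphi_j - 2\varphi_i)
\tag{6.30}
\]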
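The cos/sin³ kernel and the K³ prefactor in (6.30) have the same structure as the standard lattice-sum identity Σ_n 1/(m + nK)³ = (π/K)³ cos(πm/K)/sin³(πm/K), which holds when m is not a multiple of K. Whether [446] proceeds through exactly this identity is an assumption; the sketch below only checks the identity numerically, with hypothetical function names and truncation length.

```python
import math


def truncated_lattice_sum(m, K, n_max=2000):
    """Symmetric truncation of sum over n of 1 / (m + n K)^3, with m not a multiple of K."""
    total = 0.0
    for n in range(-n_max, n_max + 1):
        total += 1.0 / float(m + n * K) ** 3
    return total


def closed_form(m, K):
    """Closed form (pi / K)^3 * cos(pi m / K) / sin^3(pi m / K) of the same sum."""
    x = math.pi * m / K
    return (math.pi / K) ** 3 * math.cos(x) / math.sin(x) ** 3
```

Because the kernel is odd in the offset m, contributions from offsets +m and −m enter with opposite signs, which is what pairs neighbors symmetrically about the focal swimmer in the linearized form.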
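Using a linear approximation, we get
\[
\frac{\partial \varphi_i}{\partial t} = \frac{4\pi^{2} I_f}{K^{2}\alpha^{3}}
\sum_{j=1}^{K/2} j\,
\frac{\cos\!\left[\frac{j\pi}{K}\right]}{\sin^{3}\!\left[\frac{j\pi}{K}\right]}\,
\frac{\varphi_{i+j} - \varphi_{i-j}}{2j\alpha}
\tag{6.31}
\]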
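Approximating the finite differences by first-order derivatives, we arrive at
\[
\frac{\partial \varphi}{\partial t} = \frac{4\pi^{2} I_f}{K^{2}\alpha^{3}}\,
\frac{\partial \varphi}{\partial x}
\sum_{j=1}^{K/2} j\,
\frac{\cos\!\left[\frac{j\pi}{K}\right]}{\sin^{3}\!\left[\frac{j\pi}{K}\right]},
\tag{6.32}
\]
where the summation is a constant. This shows that information transfer through hydrodynamic interaction
is linear.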
[Figure 6.13. Snapshots (i), (ii), (iii) with scale bars 5a and 20a; plot of rotational order M (0 to 1) versus number of fish N (10^2 to 10^4).]
Figure 6.13: Emergent collective behaviors at extreme scale. With the same set of parameters of local behavior laws
(Ia = 1.5, In = 0.3), at a small number of swimmers (N = 100), the school forms a stable "vortex-like" milling pattern (i). For
an intermediate number of swimmers (N = 1000), the milling pattern seldom breaks and quickly reforms (ii). For a large number of
swimmers (N = 10,000), a global milling pattern never forms (iii). Swimmers are represented as airfoils of unit length to illustrate
both their position and heading direction.
6.10 Global rotational order is lost with an increasing number of
swimmers independent of hydrodynamic interaction
For the milling pattern, the school also splits into local substructures. However, the averaged milling
order parameter drops more rapidly after reaching a certain number of agents, as illustrated in Fig. 6.13B.
This is because a global milling pattern cannot be sustained with an increasing number of agents, which
will be explained later.
The hydrodynamic interaction is the only repulsive term in our equations of motion (6.1). We therefore ask
whether the splitting of both patterns can happen without hydrodynamic interaction. For schooling patterns
without hydrodynamic interaction, the global cohesive pattern never splits with an increasing number
of agents (Fig. 6.14A). Rotationally ordered milling patterns, however, split similarly with an increasing
number of agents (Fig. 6.14B). This means that the splitting of the rotationally ordered state is independent
of the repulsive hydrodynamic interaction, and thus is intrinsic to the non-metric alignment and attraction rules.
[Figure 6.14. Plot of rotational order M (0 to 1) versus number of fish N (10^2 to 10^4).]
Figure 6.14: Emergent collective behaviors at extreme scale without hydrodynamic interaction. Dynamical phases
emerge depending on the number of swimmers. Rotationally-ordered milling patterns split similarly without hydrodynamic
interaction.
6.11 Correlation length in stable milling patterns
In a stable milling pattern, the net motion of the school, indicated by the average velocity over all swimmers
(v̄), is negligible and random. Thus, the velocity fluctuation cannot be simply defined as the velocity
difference between each swimmer and the mean velocity. Instead, the main feature of the milling pattern
is the rotational order; namely, all the swimmers rotate about the rotation center, which is close to
their center of mass (COM). Thus, the radial velocity is chosen as the perturbation of interest (Fig. 6.15A),
which is calculated as
\[
\delta v_{r,i} = \delta\mathbf{v}_i \cdot
\frac{\mathbf{x}_i - \bar{\mathbf{x}}}{\lVert \mathbf{x}_i - \bar{\mathbf{x}} \rVert}
\tag{6.33}
\]
The correlation function is thus defined as
\[
C(r) = \frac{\sum_{i,j} \delta v_{r,i}\,\delta v_{r,j}\,\delta(r - r_{ij})}{\sum_{i,j} \delta(r - r_{ij})}
\tag{6.34}
\]
The correlation length ξ is then defined, as before, as the distance at which the correlation function
crosses zero (Fig. 6.15B). Fig. 6.15C shows that the correlation length is a linear function of the average
radius, which indicates that scale-free correlation also applies to the milling pattern.
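The definitions (6.33) and (6.34) can be sketched with NumPy as follows. This is an illustrative implementation, not the analysis code of the thesis. It makes three assumptions: δv_i is taken to be the swimmer velocity v_i itself (the mean velocity being negligible in a milling state), the Dirac delta is replaced by histogram bins, and only distinct pairs i < j are counted. All function names are hypothetical.

```python
import numpy as np


def radial_fluctuations(x, v):
    """Radial velocity component of each swimmer about the center of mass, as in (6.33).

    x, v: arrays of shape (N, 2). Assumes no swimmer sits exactly at the COM.
    """
    dx = x - x.mean(axis=0)
    r_hat = dx / np.linalg.norm(dx, axis=1, keepdims=True)
    return np.sum(v * r_hat, axis=1)  # assumption: delta v_i = v_i


def correlation_function(x, dvr, bin_edges):
    """Binned estimate of C(r) in (6.34) over distinct pairs."""
    r_ij = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    prod = dvr[:, None] * dvr[None, :]
    iu = np.triu_indices(len(x), k=1)
    num, _ = np.histogram(r_ij[iu], bins=bin_edges, weights=prod[iu])
    den, _ = np.histogram(r_ij[iu], bins=bin_edges)
    with np.errstate(invalid="ignore", divide="ignore"):
        C = np.where(den > 0, num / den, np.nan)  # empty bins are left as NaN
    centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])
    return centers, C


def correlation_length(centers, C):
    """First positive-to-nonpositive crossing of C(r), by linear interpolation."""
    for a in range(len(C) - 1):
        if np.isnan(C[a]) or np.isnan(C[a + 1]):
            continue
        if C[a] > 0 and C[a + 1] <= 0:
            frac = C[a] / (C[a] - C[a + 1])
            return centers[a] + frac * (centers[a + 1] - centers[a])
    return None  # no sign change found in the sampled range
```

A purely rotating configuration gives zero radial fluctuation for every swimmer, which is why the radial component isolates deviations from the milling motion.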
6.12 Scaling law in stable milling patterns predicts the loss of stability
To explain why the milling pattern has a limit on the number of swimmers, which constrains the maximum
stable size independently of the repulsive force, we established scaling laws for stable milling patterns.
The stable milling pattern has two dominant features: (1) the agents are mostly concentrated around the
school’s boundary, instead of being uniformly distributed within the school; (2) a density wave continuously
propagates outward from the interior of the structure (Fig. 6.15D).
To quantify the first point, we calculated the average radius ⟨||xi − x̄||⟩ and the standard deviation
of the radius σ(||xi − x̄||). The standard deviation of the radius can be viewed as the half-width of the
boundary of the school. In Fig. 6.15F, we plot both as functions of the number of agents N. The
average radius scales with √N, and the width scales with N. The number N at which the extrapolated
width 2σ(||xi − x̄||) exceeds the average radius ⟨||xi − x̄||⟩ is close to the point beyond which we
cannot find a globally stable milling pattern.
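The crossover described above follows directly from the two scalings. If the mean radius is written as a√N and the standard deviation as cN, the width 2cN overtakes the mean radius at N* = (a/(2c))². The sketch below fits such power laws and evaluates the crossover; the prefactors `a` and `c` are hypothetical fit parameters, and no values from the thesis are implied.

```python
import numpy as np


def fit_power_law(N, y):
    """Least-squares fit of y = prefactor * N**exponent in log-log space."""
    exponent, log_prefactor = np.polyfit(np.log(N), np.log(y), 1)
    return float(np.exp(log_prefactor)), float(exponent)


def crossover_number(a, c):
    """N* at which the width 2 * c * N equals the mean radius a * sqrt(N)."""
    return (a / (2.0 * c)) ** 2
```

With illustrative prefactors a = 1 and c = 0.01, for instance, the crossover sits at N* = 2500; the actual value depends on the fitted prefactors of the data in Fig. 6.15F.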
To further explain this, we consider that the trajectory of inner agents in milling patterns satisfies an
Archimedean spiral
\[
r = b\theta,
\tag{6.35}
\]
where b is an unknown constant, representing the ratio between the velocity in the radial direction and
the angular velocity. Thus, we have σ(r) = b · 2πk, where σ(r) is the standard deviation of the distance
from a swimmer to the center of the school, and k represents how many circles it takes for the inner
swimmer to join the main stream at the boundary. The length of the trajectory it takes is
(b/2)[2πk√(1 + (2πk)²) + ln(2πk + √(1 + (2πk)²))] ≈ (b/2)[4π²k² + ln(4πk)]. Because the self-propelled
velocity of the swimmers is homogeneous, the length of this trajectory should be equal to the outer
swimmer’s trajectory. Thus, (b/2)[4π²k² + ln(4πk)] = 2π⟨r⟩C, where C represents how many periods the
outer swimmer travels during the time when an inner swimmer travels outward. Thus,
\[
\frac{k}{2} + \frac{\ln(4\pi k)}{8\pi^{2}k} = \frac{C\langle r\rangle}{\sigma(r)},
\tag{6.36}
\]
The solution of the equation is approximately k = 2C⟨r⟩/σ(r).
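The approximate solution of (6.36) can be compared with a numerical root. The sketch below uses plain bisection on the left-hand side minus the right-hand side, which is increasing in k on the bracket used; the function names, the bracket, and the iteration count are illustrative assumptions. It also includes the closed-form arc length of the spiral (6.35) used in the text.

```python
import math


def spiral_arc_length(b, theta):
    """Arc length of the Archimedean spiral r = b * theta from 0 to theta."""
    root = math.sqrt(1.0 + theta**2)
    return 0.5 * b * (theta * root + math.log(theta + root))


def solve_k(ratio, iterations=100):
    """Solve k/2 + ln(4 pi k) / (8 pi^2 k) = ratio, where ratio = C <r> / sigma(r)."""
    def f(k):
        return k / 2.0 + math.log(4.0 * math.pi * k) / (8.0 * math.pi**2 * k) - ratio

    lo, hi = 0.1, 2.0 * ratio + 1.0  # f(hi) > 0 always; f(lo) < 0 unless ratio is tiny
    if f(lo) > 0:
        raise ValueError("ratio too small for the bracket used in this sketch")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)
```

For a ratio C⟨r⟩/σ(r) of 5, for example, the numerical root lies just below 10, so the logarithmic term is a small correction to k = 2C⟨r⟩/σ(r).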
[Figure 6.15, panels A to G. Recoverable axes: radial fluctuations (scale bar 5a); correlation function C(r) versus distance r; correlation length ξ versus average radius r; r/rmax versus time, with snapshots at t = 915, 930, 945; radius (mean and 2 x std) versus number of fish N; speed of density wave versus number of fish N, with the minimum requirement of wave speed marked.]
Figure 6.15: Correlation length and scaling law in milling pattern. A. A snapshot of a stable milling pattern with N = 1000
and its corresponding radial velocity fluctuation δvr,i. B. Correlation function C(r) of the radial velocity fluctuations of pairs
of agents. The correlation function changes sign at ξ. C. The correlation length ξ is plotted as a function of average radius ⟨r⟩.
D. Coarse-grained density field is plotted as a function of space for several snapshots in a milling pattern composed of N = 900
swimmers. E. Density averaged as a function of distance to the center of the school ri = ||xi − ⟨x⟩||. The slope of the strips is
defined as the speed of the internal density wave. F. Average value of the distance to center ⟨r⟩ and its standard deviation σ(r)
are plotted as a function of number of fish N. G. Speed of the density wave is plotted as a function of number of fish N.
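Another relationship is the definition of the parameter b as the ratio of the radial velocity to the
angular velocity, b = vr / (√(1 − vr²)/⟨r⟩). Combining σ(r) = 2πkb with k ≈ 2C⟨r⟩/σ(r) gives
b = σ²(r)/(4πC⟨r⟩). Thus, we arrive at
\[
\frac{v_r}{\sqrt{1 - v_r^{2}}} = \frac{\sigma^{2}(r)}{4\pi C\langle r\rangle^{2}}
\propto \frac{\sigma^{2}(r)}{\langle r\rangle^{2}} \propto N
\tag{6.37}
\]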
This analysis provides a lower bound of the requirement for speed of the density wave. The number of
agents where the minimum requirement reaches the actual speed of density wave is also close to the point
where we cannot see a global milling structure.
Chapter 7
Conclusions and Future Work
7.1 Conclusion
In this thesis, we built a hierarchy of models describing fish’s swimming and schooling behaviors. As illustrated in Fig. 1.2, in the models with a lower number of swimmers, we employed high-fidelity physical
models, including high-resolution unsteady fluid models and detailed descriptions of fish kinematics. On
the other hand, with increasing numbers of individuals, the details of the body kinematics and hydrodynamics model become unimportant. Instead, the sensing and control part becomes more dominant. Thus,
a low-resolution physical model is coupled with behavior models to study the collective behavior of fish.
The main results are summarized as follows:
(1) Body flexion of fish plays a dominant role in fish’s swimming efficiency and speed. Having an active
tail flapping antiphase to the body kinematics, which matches the flow field generated by a passive tail,
can improve swimming speed and efficiency simultaneously.
(2) We investigated the role of local flow sensing in navigating unsteady flow fields with
the aid of deep reinforcement learning (RL). For tracking hydrodynamic wakes without any global information,
sensing the local gradient of the flow field at the tail is sufficient for tracking different wakes. While not a priori obvious, the RL policies led to parsimonious strategies, analogous to Braitenberg’s simplest vehicles, where
an agent senses local flow gradients and turns away from or towards the direction of the stronger signal. We
analyzed the stability of these strategies and demonstrated, for non-intuitive sensor placement, that they
robustly track unfamiliar flows using diverse types of sensors. For point-to-point navigation under adversarial flow, we extended the work in [174] by considering egocentric sensing. Comparing the egocentric
navigator to one using geocentric observations, we found that while sensing local flow velocities is sufficient for geocentric navigation, successful egocentric navigation requires additional information about
local flow gradients. Importantly, when evaluating both policies under conditions not seen during training, egocentric navigation strategies exhibited superior performance in unfamiliar conditions and novel
flows.
(3) Flow-coupled swimmers self-organise into stable schools, which provide hydrodynamic benefits to
swimming together. Different spatial patterns of swimmers generate different partitions of hydrodynamic
benefits among swimmers. An inline formation would give all the hydrodynamic benefits to the following
swimmers, while swimmers in side-by-side formation share hydrodynamic benefits equally. Moreover,
when scaling to larger numbers of swimmers, the "greedy" inline formation cannot maintain passive stability, while the "cooperative" side-by-side formation maintains its stability for an increasing number of
swimmers. We built diagnostic tools that predict the equilibrium positions and energetic benefits given
the wake of leading swimmers. Based on the characteristics of the flow field, we also developed controllers
that stabilize the unstable schools under different scenarios.
(4) Ordered states of fish schools, which emerge at ∼ 100 individuals, do not sustain their stability with
an increasing number of individuals, based on our simulations of 10^4 fish. For polarized schools, the
splitting of the structure is attributed to the repulsive far-field hydrodynamic interaction. However, in
the rotationally ordered "milling state", the structure has its own limiting capacity in the number of
swimmers. The speed of the internal density wave restricts the maximum capacity of a cohesive milling state.
Moreover, we studied the information transfer among different local structures in the dynamically changing
schools. We found that (1) prior to the splitting of a school or a cluster, the internal structure already
changes, as indicated by the decrease in correlation length; (2) the information transfer speed in the
rejoining process is enhanced compared to when the fish school is turning due to individual noise; (3) the
definition of the correlation of fluctuations, extended from polarized schooling to rotationally ordered
milling, shows that the milling state also has a scale-free correlation; and (4) a scaling law explains the
splitting of milling states in two ways: the fluctuation of the radius exceeds the average radius, and the
speed of the internal density wave can no longer be sustained.
7.2 Visions on Future Work
7.2.1 Full-stack of Biomimetics
As reviewed in Sec. 1.1.2, we are now getting close to mimicking biological behavior in certain individual
aspects. However, different aspects are developed in different communities, and a joint effort is needed
to integrate this knowledge and reach or even surpass biological behavior [474]. Take the example of
perhaps the most successful robotic fish: Tunabot [523] and the robotic fish developed in [266]. Years
of effort have brought the kinematics and morphology of the tailbeat to a hydrodynamic efficiency very close
to that of actual fish. However, if we take the efficiency of generating the kinematics into account, the overall
efficiency is still much lower. This is because the tailbeat motions of these robots are generated by electric
motors coupled with mechanical transmissions. This problem becomes even more challenging when considering
high-frequency oscillations [523]. On the other hand, artificial muscles [290] have been developed
and recently applied to underwater robotics [433]. However, the design and optimization of such actuators
present significant challenges, as they require solving inverse problems corresponding to highly coupled
nonlinear fluid-structure interaction problems. In nature, these kinds of problems are solved by millions
of years of natural selection.
Nowadays, performing optimization on this kind of coupled system is becoming increasingly feasible
with the development of artificial intelligence (AI) [346, 256, 1, 358]. In the development of AI,
there are algorithms mimicking natural selection, such as the Genetic Algorithm (GA) [196] and Ant Colony
Optimization [115]. In addition, there are many algorithms that are not bio-inspired, such as Bayesian
models [416] and the recently successful frequency-based models (e.g., neural networks) [200].
7.2.2 Physics-aware machine learning
Data-driven models, especially deep learning, have changed our world dramatically in the past decade [256].
When it comes to scientific discovery and engineering applications, there are unique challenges and
opportunities [58, 229, 57]. The challenge comes from the lack of data: generating high-quality
data, either computationally or experimentally, is much more expensive than gathering datasets of images
or natural language. On the other hand, we have extensive prior knowledge from first principles,
including known governing equations and symmetries. Thus, how to utilize these priors to reduce the
required amount of data is an open question for the community. Researchers have tried to tackle this
question from different angles. Some proposed embedding the known physical laws into the loss function,
without changing the architecture of the neural network itself, as in physics-informed neural networks
(PINNs) [381, 382]. Others tried to enforce that the physics evolves in low-dimensional latent variables,
such as sparse identification of nonlinear dynamics (SINDy) [59], Deep Operator Network (DeepONet) [283],
Fourier Neural Operator (FNO) [268], and attention mechanisms [459, 462], or to perform weight sharing
based on the symmetry of the problem, such as Neural ODE [74] and group-invariant deep learning [80]. My
contribution in this area involves utilizing a known equation with unknown parameters [202] and applying
graph neural networks in agent-based simulation [275].
To perform control in nonlinear and unsteady environments, reinforcement learning (RL) has been
widely used [104, 178, 99, 317, 428, 15, 241, 190, 331]. For interaction with fluid environments,
model-free RL, based purely on trial and error, has been successfully applied to various problems,
including energy-efficient navigation [385, 466, 34], motion planning [218, 174, 217], and source seeking [184].
Importantly, some of the policies trained in simulated environments have been successfully transferred to
the real world, including energy-efficient soaring [388], flow control [133], whale tracking [215], and source
seeking [173]. Recently, model-based RL [177] has begun to show its superiority in interacting with physical
environments [306, 278]. Future directions include utilizing the above-mentioned physics-inspired ML
models in the design of model-based RL.
Bibliography
[1] Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro,
Greg S Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, et al. “Tensorflow: Large-scale
machine learning on heterogeneous distributed systems”. In: arXiv preprint arXiv:1603.04467
(2016).
[2] Julius Adler. “Chemoreceptors in Bacteria: Studies of chemotaxis reveal systems that detect
attractants independently of their metabolism.” In: Science 166.3913 (1969), pp. 1588–1597.
[3] Julius Adler. “Chemotaxis in bacteria”. In: Annual review of biochemistry 44.1 (1975), pp. 341–356.
[4] Jake K Aggarwal and Quin Cai. “Human motion analysis: A review”. In: Computer vision and
image understanding 73.3 (1999), pp. 428–440.
[5] Otar Akanyeti, Lily D Chambers, Jaas Ježov, Jennifer Brown, Roberto Venturelli,
Maarja Kruusmaa, William M Megill, and Paolo Fiorini. “Self-motion effects on hydrodynamic
pressure sensing: part I. Forward–backward motion”. In: Bioinspiration & biomimetics 8.2 (2013),
p. 026001.
[6] Silas Alben. “Optimal flexibility of a flapping appendage in an inviscid fluid”. In: Journal of Fluid
Mechanics 614 (2008), pp. 355–380.
[7] Mohamad Alsalman, Brendan Colvert, and Eva Kanso. “Training bioinspired sensors to classify
flows”. In: Bioinspiration & biomimetics 14.1 (2018), p. 016009.
[8] Philip W Anderson. “More Is Different: Broken symmetry and the nature of the hierarchical
structure of science.” In: Science 177.4047 (1972), pp. 393–396.
[9] Ichiro Aoki. “A simulation study on the schooling mechanism in fish.” In: Nippon Suisan
Gakkaishi 48 (1982), pp. 1081–1088. url: https://api.semanticscholar.org/CorpusID:84753385.
[10] Igor S Aranson and Lev S Tsimring. “Patterns and collective behavior in granular media:
Theoretical concepts”. In: Reviews of modern physics 78.2 (2006), pp. 641–692.
[11] G. P. Arnold. “Rheotropism in fishes”. In: Biological Reviews 49.4 (1974), pp. 515–576. doi:
https://doi.org/10.1111/j.1469-185X.1974.tb01173.x. eprint:
https://onlinelibrary.wiley.com/doi/pdf/10.1111/j.1469-185X.1974.tb01173.x.
[12] GP Arnold. “The reactions of the plaice (Pleuronectes platessa L.) to water currents”. In: Journal
of Experimental Biology 51.3 (1969), pp. 681–697.
[13] Vladimir I Arnold. Ordinary differential equations. Springer Science & Business Media, 1992.
[14] G Arranz, O Flores, and M Garcia-Villalba. “Flow interaction of three-dimensional self-propelled
flexible plates in tandem”. In: Journal of Fluid Mechanics 931 (2022).
[15] Donde P Ashmos, Dennis Duchon, and Reuben R McDaniel Jr. “Participation in strategic decision
making: The role of organizational predisposition and issue interpretation”. In: Decision Sciences
29.1 (1998), pp. 25–51.
[16] Intesaaf Ashraf, Hanaé Bradshaw, Thanh Tung Ha, José Halloy, Ramiro Godoy-Diana, and
Benjamin Thiria. “Simple phalanx pattern leads to energy saving in cohesive fish schooling”. In:
Proc. Nat. Acad. Sci. 114 (2017), pp. 9599–9604. issn: 10916490. doi: 10.1073/pnas.1706503114.
[17] Alessandro Attanasi, Andrea Cavagna, Lorenzo Del Castello, Irene Giardina, Tomas S Grigera,
Asja Jelić, Stefania Melillo, Leonardo Parisi, Oliver Pohl, Edward Shen, et al. “Information transfer
and behavioural inertia in starling flocks”. In: Nature physics 10.9 (2014), pp. 691–696.
[18] Alessandro Attanasi, Andrea Cavagna, Lorenzo Del Castello, Irene Giardina, Asja Jelic,
Stefania Melillo, Leonardo Parisi, Oliver Pohl, Edward Shen, and Massimiliano Viale. “Emergence
of collective changes in travel direction of starling flocks from individual birds’ fluctuations”. In:
Journal of The Royal Society Interface 12.108 (2015), p. 20150319.
[19] Yael Avni, Michel Fruchart, David Martin, Daniel Seara, and Vincenzo Vitelli. “The non-reciprocal
Ising model”. In: arXiv preprint arXiv:2311.05471 (2023).
[20] Fatma Ayancik, Qiang Zhong, Daniel B Quinn, Aaron Brandes, Hilary Bart-Smith, and
Keith W Moored. “Scaling laws for the propulsive performance of three-dimensional pitching
propulsors”. In: Journal of Fluid Mechanics 871 (2019), pp. 1117–1138.
[21] Shun-ichi Azuma, Mahmut Selman Sakar, and George J Pappas. “Stochastic source seeking by
mobile robots”. In: IEEE Transactions on Automatic Control 57.9 (2012), pp. 2308–2321.
[22] Ralf Bachmayer and Naomi Ehrich Leonard. “Vehicle networks for gradient descent in a sampled
environment”. In: Proceedings of the 41st IEEE Conference on Decision and Control, 2002. Vol. 1.
IEEE. 2002, pp. 112–117.
[23] Tim Bailey and Hugh Durrant-Whyte. “Simultaneous localization and mapping (SLAM): Part II”.
In: IEEE robotics & automation magazine 13.3 (2006), pp. 108–117.
[24] Iztok Lebar Bajec and Frank H Heppner. “Organized flight in birds”. In: Animal Behaviour 78.4
(2009), pp. 777–789.
[25] CF Baker and JC Montgomery. “The sensory basis of rheotaxis in the blind Mexican cave fish,
Astyanax fasciatus”. In: Journal of Comparative Physiology A 184.5 (1999), pp. 519–527.
[26] Keeley L Baker, Michael Dickinson, Teresa M Findley, David H Gire, Matthieu Louis,
Marie P Suver, Justus V Verhagen, Katherine I Nagel, and Matthew C Smear. “Algorithms for
olfactory search across species”. In: Journal of Neuroscience 38.44 (2018), pp. 9383–9389.
[27] C Bradford Barber, David P Dobkin, and Hannu Huhdanpaa. “The quickhull algorithm for convex
hulls”. In: ACM Transactions on Mathematical Software (TOMS) 22.4 (1996), pp. 469–483.
[28] Jennifer Basil and Jelle Atema. “Lobster orientation in turbulent odor plumes: simultaneous
measurement of tracking behavior and temporal odor patterns”. In: The Biological Bulletin 187.2
(1994), pp. 272–273.
[29] Harry Bateman. “Some recent researches on the motion of fluids”. In: Monthly Weather Review
43.4 (1915), pp. 163–170.
[30] Ray H Baughman. “Playing nature’s game with artificial muscles”. In: Science 308.5718 (2005),
pp. 63–65.
[31] David N Beal, Franz S Hover, Michael S Triantafyllou, James C Liao, and George V Lauder.
“Passive propulsion in vortex wakes”. In: Journal of fluid mechanics 549 (2006), pp. 385–402.
[32] Charalampos P. Bechlioulis, George C. Karras, Shahab Heshmati-Alamdari, and
Kostas J. Kyriakopoulos. “Trajectory Tracking With Prescribed Performance for Underactuated
Underwater Vehicles Under Model Uncertainties and External Disturbances”. In: IEEE
Transactions on Control Systems Technology 25.2 (2017), pp. 429–440. doi:
10.1109/TCST.2016.2555247.
[33] Alexander D Becker, Hassan Masoud, Joel W Newbolt, Michael Shelley, and Leif Ristroph.
“Hydrodynamic schooling of flapping swimmers”. In: Nature communications 6.1 (2015), pp. 1–8.
[34] Diederik Beckers and Jeff D Eldredge. “Deep reinforcement learning of airfoil pitch control in a
highly disturbed environment using partial observations”. In: Physical Review Fluids 9.9 (2024),
p. 093902.
[35] Heather Beem, Matthew Hildner, and Michael Triantafyllou. “Calibration and validation of a
harbor seal whisker-inspired flow sensor”. In: Smart Materials and Structures 22.1 (2012),
p. 014012.
[36] Heather R Beem and Michael S Triantafyllou. “Wake-induced ‘slaloming’ response explains
exquisite sensitivity of seal whisker-like sensors”. In: Journal of Fluid Mechanics 783 (2015),
pp. 306–322.
[37] Marc G Bellemare, Salvatore Candido, Pablo Samuel Castro, Jun Gong, Marlos C Machado,
Subhodeep Moitra, Sameera S Ponda, and Ziyu Wang. “Autonomous navigation of stratospheric
balloons using reinforcement learning”. In: Nature 588.7836 (2020), pp. 77–82.
[38] Yoshua Bengio, Jérôme Louradour, Ronan Collobert, and Jason Weston. “Curriculum learning”.
In: Proceedings of the 26th annual international conference on machine learning. 2009, pp. 41–48.
[39] Kazusa Beppu, Ziane Izri, Tasuku Sato, Yoko Yamanishi, Yutaka Sumino, and Yusuke T Maeda.
“Edge current and pairing order transition in chiral bacterial vortices”. In: Proceedings of the
National Academy of Sciences 118.39 (2021), e2107461118.
[40] Howard C Berg. Random walks in biology. Princeton University Press, 1993.
[41] Howard C Berg and Douglas A Brown. “Chemotaxis in Escherichia coli analysed by
three-dimensional tracking”. In: Nature 239.5374 (1972), pp. 500–504.
[42] Florian Berlinger, Melvin Gauci, and Radhika Nagpal. “Implicit coordination for 3D underwater
collective behaviors in a fish-inspired robot swarm”. In: Science Robotics 6.50 (2021), eabd8668.
[43] Amneet Pal Singh Bhalla, Rahul Bale, Boyce E Griffith, and Neelesh A Patankar. “A unified
mathematical framework and an adaptive numerical method for fluid–structure interaction with
rigid, deforming, and elastic bodies”. In: Journal of Computational Physics 250 (2013), pp. 446–476.
[44] Amneet Pal Singh Bhalla, Nishant Nangia, Panagiotis Dafnakis, Giovanni Bracco, and
Giuliana Mattiazzo. “Simulating water-entry/exit problems using Eulerian–Lagrangian and
fully-Eulerian fictitious domain methods within the open-source IBAMR library”. In: Applied
Ocean Research 94 (2020), p. 101932.
[45] William Bialek, Andrea Cavagna, Irene Giardina, Thierry Mora, Edmondo Silvestri,
Massimiliano Viale, and Aleksandra M Walczak. “Statistical mechanics for natural flocks of
birds”. In: Proceedings of the National Academy of Sciences 109.13 (2012), pp. 4786–4791.
[46] Luca Biferale, Fabio Bonaccorso, Michele Buzzicotti, Patricio Clark Di Leoni, and
Kristian Gustavsson. “Zermelo’s problem: Optimal point-to-point navigation in 2D turbulent
flows using reinforcement learning”. In: Chaos: An Interdisciplinary Journal of Nonlinear Science
29.10 (2019), p. 103138.
[47] Emrah Biyik and Murat Arcak. “Gradient climbing in formation via extremum seeking and
passivity-based coordination rules”. In: 2007 46th IEEE Conference on Decision and Control. IEEE.
2007, pp. 3133–3138.
[48] JHS Blaxter. “Structure and development of the lateral line”. In: Biological Reviews 62.4 (1987),
pp. 471–514.
[49] Horst Bleckmann, Joachim Mogdans, and Sheryl L Coombs. “Flow sensing in air and water”. In:
Berlin, Germany 976 (2014).
[50] BA Block, JE Keen, B Castillo, H Dewar, EV Freund, DJ Marcinek, RW Brill, and C Farwell.
“Environmental preferences of yellowfin tuna (Thunnus albacares) at the northern extent of its
range”. In: Marine biology 130 (1997), pp. 119–132.
[51] Bert Blocken, Thijs van Druenen, Yasin Toparlar, Fabio Malizia, Paul Mannion,
Thomas Andrianne, Thierry Marchal, Geert-Jan Maas, and Jan Diepens. “Aerodynamic drag in
cycling pelotons: New insights by CFD simulation and wind tunnel testing”. In: Journal of Wind
Engineering and Industrial Aerodynamics 179 (2018), pp. 319–337.
[52] Paolo Blondeaux, Francesco Fornarelli, Laura Guglielmini, Michael S. Triantafyllou, and
Roberto Verzicco. “Numerical experiments on flapping foils mimicking fish-like locomotion”. In:
Physics of Fluids 17.11 (2005), p. 113601. doi: 10.1063/1.2131923. eprint:
https://doi.org/10.1063/1.2131923.
[53] Meliha Bozkurttas, Haibo Dong, Rajat Mittal, James Tangorra, Ian Hunter, George Lauder, and
Peter Madden. “CFD-Based Analysis and Design of Biomimetic Flexible Propulsor for
Autonomous Underwater Vehicles”. In: 37th AIAA Fluid Dynamics Conference and Exhibit. 2007,
p. 4213.
[54] Valentino Braitenberg. Vehicles: Experiments in synthetic psychology. MIT press, 1986.
[55] Vasil Bratanov, Frank Jenko, and Erwin Frey. “New class of turbulence in active fluids”. In:
Proceedings of the National Academy of Sciences 112.49 (2015), pp. 15048–15053.
[56] JR Brett. “The relation of size to rate of oxygen consumption and sustained swimming speed of
sockeye salmon (Oncorhynchus nerka)”. In: Journal of the Fisheries Board of Canada 22.6 (1965),
pp. 1491–1501.
[57] Steven L Brunton and J Nathan Kutz. “Promising directions of machine learning for partial
differential equations”. In: Nature Computational Science 4.7 (2024), pp. 483–494.
[58] Steven L Brunton, Bernd R Noack, and Petros Koumoutsakos. “Machine learning for fluid
mechanics”. In: Annual Review of Fluid Mechanics 52 (2020), pp. 477–508.
[59] Steven L Brunton, Joshua L Proctor, and J Nathan Kutz. “Discovering governing equations from
data by sparse identification of nonlinear dynamical systems”. In: Proceedings of the National
Academy of Sciences 113.15 (2016), pp. 3932–3937.
[60] Daniel S Calovi, Alexandra Litchinko, Valentin Lecheval, Ugo Lopez, Alfonso Pérez Escudero,
Hugues Chaté, Clément Sire, and Guy Theraulaz. “Disentangling and modeling interactions in
fish with burst-and-coast swimming reveal distinct alignment and attraction behaviors”. In: PLoS
computational biology 14.1 (2018), e1005933.
[61] Daniel S Calovi, Ugo Lopez, Sandrine Ngo, Clément Sire, Hugues Chaté, and Guy Theraulaz.
“Swarming, schooling, milling: phase diagram of a data-driven fish school model”. In: New journal
of Physics 16.1 (2014), p. 015026.
[62] Ricardo JGB Campello, Davoud Moulavi, and Jörg Sander. “Density-based clustering based on
hierarchical density estimates”. In: Pacific-Asia conference on knowledge discovery and data
mining. Springer. 2013, pp. 160–172.
[63] Ring T Cardé, Ralph E Charlton, William E Wallner, and Yuri N Baranchikov.
“Pheromone-mediated diel activity rhythms of male Asian gypsy moths (Lepidoptera:
Lymantriidae) in relation to female eclosion and temperature”. In: Annals of the Entomological
Society of America 89.5 (1996), pp. 745–753.
[64] Ring T Cardé and Agenor Mafra-Neto. “Mechanisms of flight of male moths to pheromone”. In:
Insect pheromone research. Springer, 1997, pp. 275–290.
[65] Taylor N Carlson. “Modeling political information transmission as a game of telephone”. In: The
Journal of Politics 80.1 (2018), pp. 348–352.
[66] J Carrier, Leslie Greengard, and Vladimir Rokhlin. “A fast adaptive multipole algorithm for
particle simulations”. In: SIAM journal on scientific and statistical computing 9.4 (1988),
pp. 669–686.
[67] Jérôme Casas and Olivier Dangles. “Physical ecology of fluid flow sensing in arthropods”. In:
Annual review of entomology 55 (2010), pp. 505–520.
[68] Andrea Cavagna, Alessio Cimarelli, Irene Giardina, Giorgio Parisi, Raffaele Santagati,
Fabio Stefanini, and Massimiliano Viale. “Scale-free correlations in starling flocks”. In: Proceedings
of the National Academy of Sciences 107.26 (2010), pp. 11865–11870.
[69] Andrea Cavagna, Daniele Conti, Chiara Creato, Lorenzo Del Castello, Irene Giardina,
Tomas S Grigera, Stefania Melillo, Leonardo Parisi, and Massimiliano Viale. “Dynamic scaling in
natural swarms”. In: Nature Physics 13.9 (2017), pp. 914–918.
[70] Andrea Cavagna and Irene Giardina. “Bird flocks as condensed matter”. In: Annual Review of
Condensed Matter Physics 5.1 (2014), pp. 183–207.
[71] Andrea Cavagna, Irene Giardina, and Francesco Ginelli. “Boundary information inflow enhances
correlation in flocking”. In: Physical review letters 110.16 (2013), p. 168107.
[72] Andrea Cavagna, Irene Giardina, and Tomás S Grigera. “The physics of flocking: Correlation as a
compass from experiments to theory”. In: Physics Reports 728 (2018), pp. 1–62.
[73] Michail Chatzimanolakis, Pascal Weber, Michael Triantafyllou, and Petros Koumoutsakos.
“Schooling Hydrodynamics of 300 Fish”. In: APS Division of Fluid Dynamics Meeting Abstracts.
2021, H13–010.
[74] Ricky TQ Chen, Yulia Rubanova, Jesse Bettencourt, and David K Duvenaud. “Neural ordinary
differential equations”. In: Advances in neural information processing systems 31 (2018).
[75] Alan Cheng Hou Tsang and Eva Kanso. “Flagella-induced transitions in the collective behavior of
confined microswimmers”. In: Phys. Rev. E 90 (2 Aug. 2014), p. 021001. doi:
10.1103/PhysRevE.90.021001.
[76] Po-Wei Chou, Daniel Maturana, and Sebastian Scherer. “Improving stochastic policy gradients in
continuous control with deep reinforcement learning using the beta distribution”. In:
International conference on machine learning. PMLR. 2017, pp. 834–843.
[77] Jennie Cochran, Eva Kanso, Scott D Kelly, Hailong Xiong, and Miroslav Krstic. “Source seeking
for two nonholonomic models of fish locomotion”. In: IEEE Transactions on Robotics 25.5 (2009),
pp. 1166–1176.
[78] Jennie Cochran, Eva Kanso, and Miroslav Krstic. “Source seeking for a three-link model of fish
locomotion”. In: 2009 American Control Conference. IEEE. 2009, pp. 1808–1813.
[79] Jennie Cochran and Miroslav Krstic. “Nonholonomic source seeking with tuning of angular
velocity”. In: IEEE Transactions on Automatic Control 54.4 (2009), pp. 717–731.
[80] Taco Cohen and Max Welling. “Group equivariant convolutional networks”. In: International
conference on machine learning. PMLR. 2016, pp. 2990–2999.
[81] Jonathan Colen, Ming Han, Rui Zhang, Steven A Redford, Linnea M Lemma, Link Morgan,
Paul V Ruijgrok, Raymond Adkins, Zev Bryant, Zvonimir Dogic, et al. “Machine learning
active-nematic hydrodynamics”. In: Proceedings of the National Academy of Sciences 118.10 (2021),
e2016708118.
[82] Brendan Colvert, Mohamad Alsalman, and Eva Kanso. “Classifying vortex wakes using neural
networks”. In: Bioinspiration & biomimetics 13.2 (2018), p. 025003.
[83] Brendan Colvert, Kevin Chen, and Eva Kanso. “Local flow characterization using bioinspired
sensory information”. In: Journal of Fluid Mechanics 818 (2017), p. 366.
[84] Brendan Colvert, Kevin K Chen, and Eva Kanso. “Bioinspired sensory systems for shear flow
detection”. In: Journal of Nonlinear Science 27.4 (2017), pp. 1183–1192.
[85] Brendan Colvert and Eva Kanso. “Fishlike rheotaxis”. In: Journal of Fluid Mechanics 793 (2016),
pp. 656–666. doi: 10.1017/jfm.2016.141.
[86] Brendan Colvert, Geng Liu, Haibo Dong, and Eva Kanso. “Flowtaxis in the wakes of oscillating
airfoils”. In: Theoretical and Computational Fluid Dynamics 34.4 (2020), pp. 545–556.
[87] S. A. Combes and T. L. Daniel. “Flexural stiffness in insect wings II. Spatial distribution and
dynamic wing bending”. In: Journal of Experimental Biology 206.17 (2003), pp. 2989–2997. issn:
0022-0949. doi: 10.1242/jeb.00524. eprint:
https://jeb.biologists.org/content/206/17/2989.full.pdf.
[88] SA Combes and TL Daniel. “Shape, flapping and flexion: wing and fin design for forward flight”.
In: Journal of Experimental Biology 204.12 (2001), pp. 2073–2085.
[89] Christos Constantinidis, Matthew N Franowicz, and Patricia S Goldman-Rakic. “The sensory
nature of mnemonic representation in the primate prefrontal cortex”. In: Nature neuroscience 4.3
(2001), pp. 311–316.
[90] Sheryl Coombs, John Janssen, and Jacqueline F Webb. “Diversity of lateral line systems:
evolutionary and functional considerations”. In: Sensory biology of aquatic animals. Springer,
1988, pp. 553–593.
[91] Sheryl Coombs and Sietse Van Netten. “The hydrodynamics and structural mechanics of the
lateral line system”. In: Fish physiology 23 (2005), pp. 103–139.
[92] Mario Coppola, Kimberly N McGuire, Christophe De Wagter, and Guido CHE De Croon. “A
survey on swarming with micro air vehicles: Fundamental challenges and constraints”. In:
Frontiers in Robotics and AI 7 (2020), p. 18.
[93] Alessandro Corbetta and Federico Toschi. “Physics of human crowds”. In: Annual Review of
Condensed Matter Physics 14.1 (2023), pp. 311–333.
[94] Iain D Couzin, Jens Krause, et al. “Self-organization and collective behavior in vertebrates”. In:
Advances in the Study of Behavior 32 (2003), pp. 1–75.
[95] Iain D Couzin, Jens Krause, Nigel R Franks, and Simon A Levin. “Effective leadership and
decision-making in animal groups on the move”. In: Nature 433.7025 (2005), pp. 513–516.
[96] Iain D Couzin, Jens Krause, Richard James, Graeme D Ruxton, and Nigel R Franks. “Collective
memory and spatial sorting in animal groups”. In: Journal of theoretical biology 218.1 (2002),
pp. 1–11.
[97] Darren P Croft, Richard James, and Jens Krause. Exploring animal social networks. Princeton
University Press, 2008.
[98] Gabriel T Csanady. Turbulent diffusion in the environment. Vol. 3. Springer Science & Business
Media, 2012.
[99] Antoine Cully, Jeff Clune, Danesh Tarapore, and Jean-Baptiste Mouret. “Robots that can adapt
like animals”. In: Nature 521.7553 (2015), pp. 503–507.
[100] Hu Dai, Haoxiang Luo, Paulo J. S. A. Ferreira de Sousa, and James F. Doyle. “Thrust performance
of a flexible low-aspect-ratio pitching plate”. In: Physics of Fluids 24.10 (2012), p. 101903. doi:
10.1063/1.4764047. eprint: https://doi.org/10.1063/1.4764047.
[101] Longzhen Dai, Guowei He, Xiang Zhang, and Xing Zhang. “Stable formations of self-propelled
fish-like swimmers induced by hydrodynamic interactions”. In: Journal of The Royal Society
Interface 15.147 (2018), p. 20180490.
[102] Rishita Das, Sean D Peterson, and Maurizio Porfiri. “Stability of schooling patterns of a fish pair
swimming against a flow”. In: Flow 3 (2023), E31.
[103] CT David, JS Kennedy, and AR Ludlow. “Finding of a sex pheromone source by gypsy moths
released in the field”. In: Nature 303.5920 (1983), pp. 804–806.
[104] Thomas Degris, Patrick M Pilarski, and Richard S Sutton. “Model-free reinforcement learning
with continuous action in practice”. In: 2012 American Control Conference (ACC). IEEE. 2012,
pp. 2177–2182.
[105] Guido Dehnhardt, Björn Mauck, and Horst Bleckmann. “Seal whiskers detect water movements”.
In: Nature 394.6690 (1998), pp. 235–236.
[106] Guido Dehnhardt, Björn Mauck, Wolf Hanke, and Horst Bleckmann. “Hydrodynamic
trail-following in harbor seals (Phoca vitulina)”. In: Science 293.5527 (2001), pp. 102–104.
[107] Arthur P Dempster, Nan M Laird, and Donald B Rubin. “Maximum likelihood from incomplete
data via the EM algorithm”. In: Journal of the royal statistical society: series B (methodological) 39.1
(1977), pp. 1–22.
[108] Dana V Devine and Jelle Atema. “Function of chemoreceptor organs in spatial orientation of the
lobster, Homarus americanus: differences and overlap”. In: The Biological Bulletin 163.1 (1982),
pp. 144–153.
[109] Michael H. Dickinson, Fritz-Olaf Lehmann, and Sanjay P. Sane. “Wing Rotation and the
Aerodynamic Basis of Insect Flight”. In: Science 284.5422 (1999), pp. 1954–1960. issn: 0036-8075.
doi: 10.1126/science.284.5422.1954. eprint:
https://science.sciencemag.org/content/284/5422/1954.full.pdf.
[110] Sven Dijkgraaf. “The functioning and significance of the lateral-line organs”. In: Biological reviews
38.1 (1963), pp. 51–105.
[111] Paolo Domenici and Robert W Blake. “The kinematics and performance of fish fast-start
swimming”. In: Journal of Experimental Biology 200.8 (1997), pp. 1165–1178.
[112] Paolo Domenici and Robert W Blake. “The kinematics and performance of the escape response in
the angelfish (Pterophyllum eimekei)”. In: Journal of Experimental Biology 156.1 (1991),
pp. 187–205.
[113] Paolo Domenici and Melina E Hale. “Escape responses of fish: a review of the diversity in motor
control, kinematics and behaviour”. In: Journal of Experimental Biology 222.18 (2019), jeb166009.
[114] Amin Doostmohammadi, Jordi Ignés-Mullol, Julia M Yeomans, and Francesc Sagués. “Active
nematics”. In: Nature communications 9.1 (2018), p. 3246.
[115] Marco Dorigo, Mauro Birattari, and Thomas Stutzle. “Ant colony optimization”. In: IEEE
computational intelligence magazine 1.4 (2006), pp. 28–39.
[116] Ron Douglas and Mustafa Djamgoz. The visual system of fish. Springer Science & Business Media,
2012.
[117] E.G. Drucker and G.V. Lauder. “A hydrodynamic analysis of fish swimming speed: wake structure
and locomotor force in slow and fast labriform swimmers”. In: Journal of Experimental Biology
203.16 (2000), pp. 2379–2393. issn: 0022-0949. eprint:
https://jeb.biologists.org/content/203/16/2379.full.pdf. url:
https://jeb.biologists.org/content/203/16/2379.
[118] L. E. Dubins. “On Curves of Minimal Length with a Constraint on Average Curvature, and with
Prescribed Initial and Terminal Positions and Tangents”. In: American Journal of Mathematics 79.3
(1957), pp. 497–516. issn: 00029327, 10806377. (Visited on 10/01/2023).
[119] Richard O Duda, Peter E Hart, et al. Pattern classification and scene analysis. Vol. 3. Wiley New
York, 1973.
[120] Hugh Durrant-Whyte and Tim Bailey. “Simultaneous localization and mapping: part I”. In: IEEE
robotics & automation magazine 13.2 (2006), pp. 99–110.
[121] Jeff D Eldredge. “Numerical simulations of undulatory swimming at moderate Reynolds number”.
In: Bioinspiration & biomimetics 1.4 (2006), S19.
[122] Jeff D. Eldredge. “Dynamically coupled fluid–body interactions in vorticity-based numerical
simulations”. In: Journal of Computational Physics 227.21 (2008). Special Issue Celebrating Tony
Leonard’s 70th Birthday, pp. 9170–9194. issn: 0021-9991. doi: 10.1016/j.jcp.2008.03.033.
[123] Jeff D. Eldredge, Jonathan Toomey, and Albert Medina. “On the roles of chord-wise flexibility in a
flapping wing with hovering kinematics”. In: Journal of Fluid Mechanics 659 (2010), pp. 94–115.
doi: 10.1017/S0022112010002363.
[124] C. P. Ellington. “The Aerodynamics of Hovering Insect Flight. VI. Lift and Power Requirements”.
In: Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences 305.1122
(1984), pp. 145–181. issn: 00804622. url: http://www.jstor.org/stable/2396077.
[125] Charles P Ellington. “Power and efficiency of insect flight muscle”. In: Journal of Experimental
Biology 115.1 (1985), pp. 293–304.
[126] Charles Porter Ellington. “The aerodynamics of hovering insect flight. IV. Aerodynamic
mechanisms”. In: Philosophical Transactions of the Royal Society of London. B, Biological Sciences
305.1122 (1984), pp. 79–113.
[127] Christophe Eloy. “On the best design for undulatory swimming”. In: Journal of Fluid Mechanics
717 (2013), pp. 48–89.
[128] Thierry Emonet and Massimo Vergassola. “Olfactory cues and memories in animal navigation”.
In: Nature Reviews Physics (2024), pp. 1–2.
[129] Jacob Engelmann, Wolf Hanke, Joachim Mogdans, and Horst Bleckmann. “Hydrodynamic stimuli
and the fish lateral line”. In: Nature 408.6808 (2000), pp. 51–52.
[130] Martin Ester, Hans-Peter Kriegel, Jörg Sander, Xiaowei Xu, et al. “A density-based algorithm for
discovering clusters in large spatial databases with noise”. In: kdd. Vol. 96. 34. 1996, pp. 226–231.
[131] Nikolaos Evangelou, Tianqi Cui, Juan M Bello-Rivas, Alexei Makeev, and Ioannis G Kevrekidis.
“Tipping points of evolving epidemiological networks: Machine learning-assisted, data-driven
effective modeling”. In: Chaos: An Interdisciplinary Journal of Nonlinear Science 34.6 (2024).
[132] Gianluca Fabiani, Nikolaos Evangelou, Tianqi Cui, Juan M Bello-Rivas, Cristina P Martin-Linares,
Constantinos Siettos, and Ioannis G Kevrekidis. “Task-oriented machine learning surrogates for
tipping points of agent-based models”. In: Nature communications 15.1 (2024), p. 4117.
[133] Dixia Fan, Liu Yang, Zhicheng Wang, Michael S Triantafyllou, and George Em Karniadakis.
“Reinforcement learning for bluff body active flow control in experiments and simulations”. In:
Proceedings of the National Academy of Sciences 117.42 (2020), pp. 26091–26098.
[134] Zhifang Fan, Jack Chen, Jun Zou, David Bullen, Chang Liu, and Fred Delcomyn. “Design and
fabrication of artificial lateral line flow sensors”. English (US). In: Journal of Micromechanics and
Microengineering 12.5 (Sept. 2002), pp. 655–661. issn: 0960-1317. doi: 10.1088/0960-1317/12/5/322.
[135] F. Fang. “Hydrodynamic interactions between self-propelled flapping wings”. PhD thesis. New
York University, 2016.
[136] Damien R Farine and Hal Whitehead. “Constructing, conducting and interpreting animal social
network analysis”. In: Journal of animal ecology 84.5 (2015), pp. 1144–1163.
[137] Jay A Farrell, John Murlis, Xuezhu Long, Wei Li, and Ring T Cardé. “Filament-based atmospheric
dispersion model to achieve short time-scale structure of odor plumes”. In: Environmental fluid
mechanics 2 (2002), pp. 143–169.
[138] Jay A Farrell, Shou Pang, Wei Li, and Richard Arrieta. “Chemical plume tracing experimental
results with a REMUS AUV”. In: Oceans 2003. Celebrating the Past... Teaming Toward the Future
(IEEE Cat. No. 03CH37492). Vol. 2. IEEE. 2003, pp. 962–968.
[139] Kara L Feilich and George V Lauder. “Passive mechanical models of fish caudal fins: effects of
shape and stiffness on self-propulsion”. In: Bioinspiration & Biomimetics 10.3 (Apr. 2015),
p. 036002. doi: 10.1088/1748-3190/10/3/036002.
[140] Karla E Feitl, Victoria Ngo, and Matthew J McHenry. “Are fish less responsive to a flow stimulus
when swimming?” In: Journal of Experimental Biology 213.18 (2010), pp. 3131–3137.
[141] Russell D Fernald. “Fish vision”. In: Development of the vertebrate retina. Springer, 1989,
pp. 247–265.
[142] A. Filella, F. Nadal, C. Sire, E. Kanso, and C. Eloy. “Model of collective fish behavior with
hydrodynamic interactions”. In: Physical review letters 120.19 (2018), p. 198101.
[143] Frank E Fish and Clifford A Hui. “Dolphin swimming–a review”. In: Mammal Review 21.4 (1991),
pp. 181–195.
[144] Frank E Fish, Jenifer Hurley, and Daniel P Costa. “Maneuverability by the sea lion Zalophus
californianus: turning performance of an unstable body design”. In: Journal of Experimental
Biology 206.4 (2003), pp. 667–674.
[145] Ronald A Fisher. “On the interpretation of χ² from contingency tables, and the calculation of P”.
In: Journal of the royal statistical society 85.1 (1922), pp. 87–94.
[146] Richard FitzHugh. “Impulses and physiological states in theoretical models of nerve membrane”.
In: Biophysical journal 1.6 (1961), pp. 445–466.
[147] Dario Floreano and Robert J Wood. “Science, technology and the future of small autonomous
drones”. In: Nature 521.7553 (2015), pp. 460–466.
[148] Daniel Floryan, Tyler Van Buren, Clarence W Rowley, and Alexander J Smits. “Scaling the
propulsive performance of heaving and pitching foils”. In: Journal of Fluid Mechanics 822 (2017),
pp. 386–397.
[149] Michel Fruchart, Ryo Hanai, Peter B Littlewood, and Vincenzo Vitelli. “Non-reciprocal phase
transitions”. In: Nature 592.7854 (2021), pp. 363–369.
[150] Matt K Fu and John O Dabiri. “Magnetic signature of vertically migrating aggregations in the
ocean”. In: arXiv preprint arXiv:2207.03486 (2022).
[151] Shinpei Fujiwara and Satoru Yamaguchi. “Development of Fishlike Robot that Imitates
Carangiform and Subcarangiform Swimming Motions”. In: Journal of Aero Aqua Bio-mechanisms
6.1 (2017), pp. 1–8. doi: 10.5226/jabmech.6.1.
[152] Steve Furber. “Large-scale neuromorphic computing systems”. In: Journal of neural engineering
13.5 (2016), p. 051001.
[153] Simon Garnier, Jacques Gautrais, and Guy Theraulaz. “The biological principles of swarm
intelligence”. In: Swarm intelligence 1 (2007), pp. 3–31.
[154] Jacques Gautrais, Francesco Ginelli, Richard Fournier, Stéphane Blanco, Marc Soria,
Hugues Chaté, and Guy Theraulaz. “Deciphering Interactions in Moving Animal Groups”. In:
PLOS Computational Biology 8.9 (Sept. 2012), pp. 1–11. doi: 10.1371/journal.pcbi.1002678.
[155] Mattia Gazzola, Babak Hejazialhosseini, and Petros Koumoutsakos. “Reinforcement learning and
wavelet adapted vortex methods for simulations of self-propelled swimmers”. In: SIAM Journal on
Scientific Computing 36.3 (2014), B622–B639.
[156] Brad J. Gemmell, Stephanie M. Fogerson, John H. Costello, Jennifer R. Morgan, John O. Dabiri,
and Sean P. Colin. “How the bending kinematics of swimming lampreys build negative pressure
fields for suction thrust”. In: Journal of Experimental Biology 219.24 (2016), pp. 3884–3895. issn:
0022-0949. doi: 10.1242/jeb.144642. eprint:
https://jeb.biologists.org/content/219/24/3884.full.pdf.
[157] François Gerlotto, Sophie Bertrand, Nicolas Bez, and Mariano Gutierrez. “Waves of agitation
inside anchovy schools observed with multibeam sonar: a way to transmit information in
response to predation”. In: ICES Journal of Marine Science 63.8 (2006), pp. 1405–1417.
[158] Delphine Geyer, Alexandre Morin, and Denis Bartolo. “Sounds and hydrodynamics of polar active
fluids”. In: Nature materials 17.9 (2018), pp. 789–793.
[159] Dibya Ghosh, Avi Singh, Aravind Rajeswaran, Vikash Kumar, and Sergey Levine.
“Divide-and-conquer reinforcement learning”. In: arXiv preprint arXiv:1711.09874 (2017).
[160] Florence Gibouin, Christophe Raufaste, Yann Bouret, and Médéric Argentina. “Study of the
thrust–drag balance with a swimming robotic fish”. In: Physics of Fluids 30.9 (2018), p. 091901.
doi: 10.1063/1.5043137. eprint: https://doi.org/10.1063/1.5043137.
[161] Francesco Ginelli. “The physics of the Vicsek model”. In: The European Physical Journal Special
Topics 225 (2016), pp. 2099–2117.
[162] Nele Gläser, Sven Wieskotten, Christian Otter, Guido Dehnhardt, and Wolf Hanke.
“Hydrodynamic trail following in a California sea lion (Zalophus californianus)”. In: Journal of
Comparative Physiology A 197.2 (2011), pp. 141–151.
[163] Raymond E Goldstein. “Traveling-wave chemotaxis”. In: Physical review letters 77.4 (1996), p. 775.
[164] Klaus Gramann, Julie Onton, Davide Riccobon, Hermann J Mueller, Stanislav Bardins, and
Scott Makeig. “Human brain dynamics accompanying use of egocentric and allocentric reference
frames during navigation”. In: Journal of cognitive neuroscience 22.12 (2010), pp. 2836–2849.
[165] Melissa A Green, Clarence W Rowley, and Alexander J Smits. “The unsteady three-dimensional
wake produced by a trapezoidal pitching panel”. In: Journal of Fluid Mechanics 685 (2011),
pp. 117–145.
[166] Leslie Greengard and Vladimir Rokhlin. “A fast algorithm for particle simulations”. In: Journal of
computational physics 73.2 (1987), pp. 325–348.
[167] Boyce E Griffith, Richard D Hornung, David M McQueen, and Charles S Peskin. “An adaptive,
formally second order accurate version of the immersed boundary method”. In: Journal of
computational physics 223.1 (2007), pp. 10–49.
[168] Boyce E Griffith and Neelesh A Patankar. “Immersed methods for fluid–structure interaction”. In:
Annual review of fluid mechanics 52 (2020), p. 421.
[169] Boyce E Griffith and Charles S Peskin. “On the order of accuracy of the immersed boundary
method: Higher order convergence rates for sufficiently smooth problems”. In: Journal of
Computational Physics 208.1 (2005), pp. 75–105.
[170] Robert Großmann, Pawel Romanczuk, Markus Bär, and Lutz Schimansky-Geier. “Vortex arrays
and mesoscale turbulence of self-propelled particles”. In: Physical review letters 113.25 (2014),
p. 258104.
[171] François Gu, Benjamin Guiselin, Nicolas Bain, Iker Zuriguel, and Denis Bartolo. “Emergence of
collective oscillations in massive human crowds”. In: Nature 638.8049 (2025), pp. 112–119.
[172] Charlène Guillot and Thomas Lecuit. “Mechanics of epithelial tissue homeostasis and
morphogenesis”. In: Science 340.6137 (2013), pp. 1185–1189.
[173] Peter Gunnarson and John O Dabiri. “Fish-inspired tracking of underwater turbulent plumes”. In:
arXiv preprint arXiv:2403.06091 (2024).
[174] Peter Gunnarson, Ioannis Mandralis, Guido Novati, Petros Koumoutsakos, and John O Dabiri.
“Learning efficient navigation in vortical flow fields”. In: Nature communications 12.1 (2021),
pp. 1–7.
[175] Hanliang Guo, Lisa Fauci, Michael Shelley, and Eva Kanso. “Bistability in the synchronization of
actuated microfilaments”. In: Journal of Fluid Mechanics 836 (2018), pp. 304–323.
[176] DM Guthrie. “Role of vision in fish behaviour”. In: The behaviour of Teleost fishes. Springer, 1986,
pp. 75–113.
[177] Danijar Hafner, Jurgis Pasukonis, Jimmy Ba, and Timothy Lillicrap. “Mastering diverse domains
through world models”. In: arXiv preprint arXiv:2301.04104 (2023).
[178] Adrian M Haith and John W Krakauer. “Model-based and model-free mechanisms of human
motor learning”. In: Progress in motor control: Neural, computational and dynamic approaches.
Springer. 2013, pp. 1–21.
[179] Adel Hamdi. “The recovery of a time-dependent point source in a linear transport equation:
application to surface water pollution”. In: Inverse Problems 25.7 (2009), p. 075006.
[180] Nils Olav Handegard, Kevin M Boswell, Christos C Ioannou, Simon P Leblanc, Dag B Tjøstheim,
and Iain D Couzin. “The dynamics of coordinated group hunting and collective information
transfer among schooling prey”. In: Current biology 22.13 (2012), pp. 1213–1217.
[181] Haotian Hang, Sina Heydari, John H Costello, and Eva Kanso. “Active tail flexion in concert with
passive hydrodynamic forces improves swimming speed and efficiency”. In: Journal of Fluid
Mechanics 932 (2022), A35.
[182] Haotian Hang, Sina Heydari, and Eva Kanso. “Feedback control of uncoordinated flapping
swimmers to maintain school cohesion”. In: (ACC conference) (2024).
[183] Haotian Hang, Chenchen Huang, Alex Barnett, and Eva Kanso. “Fish schooling at extreme
scales”. In: (in preparation) (2025).
[184] Haotian Hang, Yusheng Jiao, Sina Heydari, Feng Ling, Josh Merel, and Eva Kanso. “Interpretable
and Generalizable Strategies for Stably Following Hydrodynamic Trails”. In: bioRxiv (2023),
pp. 2023–12.
[185] Wolf Hanke, Sven Wieskotten, Christopher Marshall, and Guido Dehnhardt. “Hydrodynamic
perception in true seals (Phocidae) and eared seals (Otariidae)”. In: Journal of Comparative
Physiology A 199.6 (2013), pp. 421–440.
[186] David G Harper and Robert W Blake. “Fast-start performance of rainbow trout Salmo gairdneri
and northern pike Esox lucius”. In: Journal of Experimental Biology 150.1 (1990), pp. 321–342.
[187] Arthur Davis Hasler and Allan T Scholz. Olfactory imprinting and homing in salmon: Investigations
into the mechanism of the imprinting process. Vol. 14. Springer Science & Business Media, 2012.
[188] S. Heathcote and I. Gursul. “Flexible Flapping Airfoil Propulsion at Low Reynolds Numbers”. In:
AIAA Journal 45.5 (2007), pp. 1066–1079. doi: 10.2514/1.25431. eprint:
https://doi.org/10.2514/1.25431.
[189] Sam Heathcote, Z Wang, and Ismet Gursul. “Effect of spanwise flexibility on flapping wing
propulsion”. In: Journal of Fluids and Structures 24.2 (2008), pp. 183–199.
[190] Nicolas Heess, Dhruva TB, Srinivasan Sriram, Jay Lemmon, Josh Merel, Greg Wayne, Yuval Tassa,
Tom Erez, Ziyu Wang, SM Eslami, et al. “Emergence of locomotion behaviours in rich
environments”. In: arXiv preprint arXiv:1707.02286 (2017).
[191] Charlotte K Hemelrijk, Lars van Zuidam, and Hanno Hildenbrandt. “What underlies waves of
agitation in starling flocks”. In: Behavioral ecology and sociobiology 69 (2015), pp. 755–764.
[192] J Herskin and JF Steffensen. “Energy savings in sea bass swimming in a school: measurements of
tail beat frequency and oxygen consumption at different swimming speeds”. In: Journal of Fish
Biology 53.2 (1998), pp. 366–376.
[193] Sina Heydari, Haotian Hang, and Eva Kanso. “Mapping Spatial Patterns to Energetic Benefits in
Groups of Flow-coupled Swimmers”. In: Elife (2024), pp. 2024–02.
[194] Sina Heydari and Eva Kanso. “School cohesion, speed and efficiency are modulated by the
swimmers flapping motion”. In: Journal of Fluid Mechanics 922 (2021).
[195] Koichi Hirata, Tadanori Takimoto, and Kenkichi Tamura. “Study on turning performance of a fish
robot”. In: First International Symposium on Aqua Bio-Mechanisms. Mitaka. 2000, pp. 287–292.
[196] John Henry Holland et al. Adaptation in natural and artificial systems: an introductory analysis
with applications to biology, control, and artificial intelligence. MIT press, 1992.
[197] Alexander P Hoover, Ricardo Cortez, Eric D Tytell, and Lisa J Fauci. “Swimming performance,
resonance and shape evolution in heaving flexible panels”. In: Journal of Fluid Mechanics 847
(2018), pp. 386–416.
[198] Alexander P Hoover, Joost Daniels, Janna C Nawroth, and Kakani Katija. “A Computational
Model for Tail Undulation and Fluid Transport in the Giant Larvacean”. In: Fluids 6.2 (2021), p. 88.
[199] Alexander P Hoover and Eric Tytell. “Decoding the relationships between body shape, tail beat
frequency, and stability for swimming fish”. In: Fluids 5.4 (2020), p. 215.
[200] Kurt Hornik, Maxwell Stinchcombe, and Halbert White. “Multilayer feedforward networks are
universal approximators”. In: Neural networks 2.5 (1989), pp. 359–366.
[201] Ru-Nan Hua, Luoding Zhu, and Xi-Yun Lu. “Locomotion of a flapping flexible plate”. In: Physics of
Fluids 25.12 (2013), p. 121901. doi: 10.1063/1.4832857. eprint: https://doi.org/10.1063/1.4832857.
[202] Chenchen Huang and Haotian Hang. Inferring unknown parameters of partially-observable system
using Physics-informed-DeepONet. https://github.com/chenchenhuang/DeepONet_physics_inferring.
[203] Chenchen Huang, Feng Ling, and Eva Kanso. “Collective phase transitions in confined fish
schools”. In: Proceedings of the National Academy of Sciences 121.44 (2024), e2406293121.
[204] Y. Huang, M. Nitsche, and E. Kanso. “Hovering in oscillatory flows”. In: Journal of Fluid Mechanics
804 (2016), pp. 531–549.
[205] Y. Huang, L. Ristroph, M. Luhar, and E. Kanso. “Bistability in the rotational motion of rigid and
flexible flyers”. In: Journal of Fluid Mechanics 849 (2018), pp. 1043–1067.
[206] Yangyang Huang, Monika Nitsche, and Eva Kanso. “Stability versus maneuverability in hovering
flight”. In: Physics of Fluids 27.6 (2015). issn: 10897666. doi: 10.1063/1.4923314. arXiv: 1411.6764.
[207] Yangyang Huang, Jeannette Yen, and Eva Kanso. “Detection and tracking of chemical trails in
bio-inspired sensory systems”. In: European Journal of Computational Mechanics 26.1-2 (2017),
pp. 98–114.
[208] Marcus Hultmark, Megan Leftwich, and Alexander J Smits. “Flowfield measurements in the wake
of a robotic lamprey”. In: Experiments in fluids 43 (2007), pp. 683–690.
[209] Andreas Huth and Christian Wissel. “The simulation of the movement of fish schools”. In: Journal
of theoretical biology 156.3 (1992), pp. 365–385.
[210] IBAMR. IBAMR: An adaptive and distributed-memory parallel implementation of the immersed
boundary (IB) method. https://ibamr.github.io.
[211] Ingenuity. https://en.wikipedia.org/wiki/Ingenuity_(helicopter).
[212] E Inglis-Arkell. The Very First Robot “Brains” Were Made of Old Alarm Clocks. Gizmodo. 2015.
[213] Christos C Ioannou, Iain D Couzin, Richard James, Darren P Croft, and Jens Krause. “Social
organisation and information transfer in schooling fish”. In: Fish cognition and behavior 2 (2011),
pp. 217–239.
[214] Tomoyuki Itoh, Sachiko Tsuji, and Akira Nitta. “Migration patterns of young Pacific bluefin tuna
(Thunnus orientalis) determined with archival tags”. In: Fishery Bulletin 101.3 (2003), pp. 514–534.
[215] Ninad Jadhav, Sushmita Bhattacharya, Daniel Vogt, Yaniv Aluma, Pernille Tonessen,
Akarsh Prabhakara, Swarun Kumar, Shane Gero, Robert J Wood, and Stephanie Gil.
“Reinforcement learning–based framework for whale rendezvous via autonomous sensing
robots”. In: Science Robotics 9.95 (2024), eadn7299.
[216] Yusheng Jiao, Brendan Colvert, Yi Man, Matthew J McHenry, and Eva Kanso. “Evaluating evasion
strategies in zebrafish larvae”. In: Proceedings of the National Academy of Sciences 120.7 (2023),
e2218909120.
[217] Yusheng Jiao, Haotian Hang, Josh Merel, and Eva Kanso. “Sensing flow gradients is necessary for
learning autonomous underwater navigation”. In: (accepted by Nature Communications) (2025).
[218] Yusheng Jiao, Feng Ling, Sina Heydari, Nicolas Heess, Josh Merel, and Eva Kanso. “Learning to
swim in potential flow”. In: Phys. Rev. Fluids 6 (5 May 2021), p. 050505. doi:
10.1103/PhysRevFluids.6.050505.
[219] Javier Jimenez. “On the linear stability of the inviscid Kármán vortex street”. In: Journal of Fluid
Mechanics 178 (1987), pp. 177–194.
[220] M. A. Jones. “The separated flow of an inviscid fluid around a moving flat plate”. In: Journal of
Fluid Mechanics 496 (2003), p. 405.
[221] M. A. Jones and M. J. Shelley. “Falling cards”. In: Journal of Fluid Mechanics 540 (2005),
pp. 393–425.
[222] Hao Ju, Rongshun Juan, Randy Gomez, Keisuke Nakamura, and Guangliang Li. “Transferring
policy of deep reinforcement learning from simulation to reality for robotics”. In: Nature Machine
Intelligence 4.12 (2022), pp. 1077–1087.
[223] Nirag Kadakia, Mahmut Demir, Brenden T Michaelis, Brian D DeAngelis, Matthew A Reidenbach,
Damon A Clark, and Thierry Emonet. “Odour motion sensing enhances navigation of complex
plumes”. In: Nature 611.7937 (2022), pp. 754–761.
[224] Rudolf E Kalman. “On the general theory of control systems”. In: Proceedings first international
conference on automatic control, Moscow, USSR. 1960, pp. 481–492.
[225] Chang-kwon Kang, Farbod Fahimi, Rob Griffin, D Brian Landrum, Bryan Mesmer,
Guangsheng Zhang, Taeyoung Lee, Hikaru Aono, Jeremy Pohly, Jesse McCain, et al.
Marsbee-swarm of flapping wing flyers for enhanced mars exploration. Tech. rep. 2019.
[226] Chia-Pin Kang, Hung-Chi Tu, Tzu-Fun Fu, Jhe-Ming Wu, Po-Hsun Chu, and
Darby Tien-Hao Chang. “An automatic method to calculate heart rate from zebrafish larval
cardiac videos”. In: BMC bioinformatics 19 (2018), pp. 1–10.
[227] E Kanso and A C H Tsang. “Pursuit and Synchronization in Hydrodynamic Dipoles”. In: Journal
of Nonlinear Science 25.5 (2015), p. 1141.
[228] E. Kanso and A. C. H. Tsang. “Dipole models of self-propelled bodies”. In: Fluid Dynamics
Research 46.6 (2014), p. 061407.
[229] George Em Karniadakis, Ioannis G Kevrekidis, Lu Lu, Paris Perdikaris, Sifan Wang, and Liu Yang.
“Physics-informed machine learning”. In: Nature Reviews Physics 3.6 (2021), pp. 422–440.
[230] Ehud D Karpas, Adi Shklarsh, and Elad Schneidman. “Information socialtaxis and efficient
collective behavior emerging in groups of information-seeking agents”. In: Proceedings of the
National Academy of Sciences 114.22 (2017), pp. 5589–5594.
[231] MA Kasapi, P Domenici, RW Blake, and D Harper. “The kinematics and performance of escape
responses of the knifefish Xenomystus nigri”. In: Canadian journal of zoology 71.1 (1993),
pp. 189–195.
[232] Ronald A Kastelein, Paul J Wensveen, John M Terhune, and Christ AF de Jong. “Near-threshold
equal-loudness contours for harbor seals (Phoca vitulina) derived from reaction times during
underwater audiometry: a preliminary study”. In: The Journal of the Acoustical Society of America
129.1 (2011), pp. 488–495.
[233] Robert K Katzschmann, Joseph DelPreto, Robert MacCurdy, and Daniela Rus. “Exploration of
underwater life with an acoustically controlled soft robotic fish”. In: Science Robotics 3.16 (2018).
[234] Evelyn F Keller and Lee A Segel. “Model for chemotaxis”. In: Journal of theoretical biology 30.2
(1971), pp. 225–234.
[235] Stefan Kern and Petros Koumoutsakos. “Simulations of optimized anguilliform swimming”. In:
Journal of Experimental Biology 209.24 (2006), pp. 4841–4857.
[236] James D Kieffer, Derek Alsop, and Chris M Wood. “A respirometric analysis of fuel use during
aerobic swimming at different temperatures in rainbow trout (Oncorhynchus mykiss)”. In:
Journal of Experimental Biology 201.22 (1998), pp. 3123–3133.
[237] Shaun S Killen, Stefano Marras, John F Steffensen, and David J McKenzie. “Aerobic capacity
influences the spatial position of individuals within fish schools”. In: Proceedings of the Royal
Society B: Biological Sciences 279.1727 (2012), pp. 357–364.
[238] Sohae Kim, Wei-Xi Huang, and Hyung Jin Sung. “Constructive and destructive interaction modes
between two tandem flexible flags in viscous flow”. In: Journal of fluid mechanics 661 (2010),
pp. 511–521.
[239] Diederik P Kingma and Jimmy Ba. “Adam: A method for stochastic optimization”. In: arXiv
preprint arXiv:1412.6980 (2014).
[240] Peter E Kloeden and Eckhard Platen. Stochastic differential equations. Springer, 1992.
[241] Jens Kober, J Andrew Bagnell, and Jan Peters. “Reinforcement learning in robotics: A survey”. In:
The International Journal of Robotics Research 32.11 (2013), pp. 1238–1274.
[242] Andrey Kolmogoroff. “Interpolation und Extrapolation von stationären zufälligen Folgen”. In:
Izvestiya Rossiiskoi Akademii Nauk. Seriya Matematicheskaya 5.1 (1941), pp. 3–14.
[243] Andrei Nikolaevich Kolmogorov. “The local structure of turbulence in incompressible viscous
fluid for very large Reynolds numbers”. In: Proceedings of the Royal Society of London. Series A:
Mathematical and Physical Sciences 434.1890 (1941), pp. 9–13.
[244] Andrey Nikolaevich Kolmogorov. “A refinement of previous hypotheses concerning the local
structure of turbulence in a viscous incompressible fluid at high Reynolds number”. In: Journal of
Fluid Mechanics 13.1 (1962), pp. 82–85.
[245] J Michael Kosterlitz. “The critical properties of the two-dimensional xy model”. In: Journal of
Physics C: Solid State Physics 7.6 (1974), p. 1046.
[246] Ajay Giri Prakash Kottapalli, Mohsen Asadnia, Jianmin Miao, and Michael Triantafyllou. “Touch
at a distance sensing: lateral-line inspired MEMS flow sensors”. In: Bioinspiration & biomimetics
9.4 (2014), p. 046011.
[247] AB Kroese and NA Schellart. “Velocity-and acceleration-sensitive units in the trunk lateral line of
the trout”. In: Journal of Neurophysiology 68.6 (1992), pp. 2212–2221.
[248] Melike Kurt, Amin Mivehchi, and Keith W Moored. “Two-dimensionally stable self-organization
arises in simple schooling swimmers through hydrodynamic interactions”. In: arXiv preprint
arXiv:2102.03571 (2021).
[249] Melike Kurt and Keith W Moored. “Flow interactions of two-and three-dimensional networked
bio-inspired control elements in an in-line arrangement”. In: Bioinspiration & biomimetics 13.4
(2018), p. 045002.
[250] Jérémie Labasse, Uwe Ehrenstein, and Philippe Meliga. “Numerical exploration of the pitching
plate parameter space with application to thrust scaling”. In: Applied Ocean Research 101 (2020),
p. 102278.
[251] Natalia Ladyka-Wojcik and Morgan D Barense. “Reframing spatial frames of reference: What can
aging tell us about egocentric and allocentric navigation?” In: Wiley Interdisciplinary Reviews:
Cognitive Science 12.3 (2021), e1549.
[252] Siu Kwan Lam, Antoine Pitrou, and Stanley Seibert. “Numba: A LLVM-based Python JIT compiler”.
In: Proceedings of the Second Workshop on the LLVM Compiler Infrastructure in HPC. 2015, pp. 1–6.
[253] Matz Larsson. “Why do fish school?” In: Current Zoology 58.1 (2012), pp. 116–128.
[254] George V Lauder, Jeanette Lim, Ryan Shelton, Chuck Witt, Erik Anderson, and James L Tangorra.
“Robotic models for studying undulatory locomotion in fishes”. In: Marine Technology Society
Journal 45.4 (2011), pp. 41–55.
[255] George V. Lauder. “Fish Locomotion: Recent Advances and New Directions”. In: Annual Review of
Marine Science 7.1 (2015). PMID: 25251278, pp. 521–545. doi:
10.1146/annurev-marine-010814-015614.
[256] Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. “Deep learning”. In: Nature 521.7553 (2015),
pp. 436–444.
[257] Jae H Lee, Alex D Rygg, Ebrahim M Kolahdouz, Simone Rossi, Stephen M Retta,
Nandini Duraiswamy, Lawrence N Scotten, Brent A Craven, and Boyce E Griffith.
“Fluid–structure interaction models of bioprosthetic heart valve dynamics in an experimental
pulse duplicator”. In: Annals of biomedical engineering 48.5 (2020), pp. 1475–1490.
[258] Seungjoon Lee, Yorgos M Psarellis, Constantinos I Siettos, and Ioannis G Kevrekidis. “Learning
black-and gray-box chemotactic PDEs/closures from agent based Monte Carlo simulation data”.
In: Journal of Mathematical Biology 87.1 (2023), p. 15.
[259] Katherine J Leitch, Francesca V Ponce, William B Dickson, Floris van Breugel, and
Michael H Dickinson. “The long-distance flight behavior of Drosophila supports an agent-based
model for wind-assisted dispersal in insects”. In: Proceedings of the National Academy of Sciences
118.17 (2021), e2013342118.
[260] PH Lenz and DK Hartline. “Reaction times and force production during escape behavior of a
calanoid copepod, Undinula vulgaris”. In: Marine Biology 133 (1999), pp. 249–258.
[261] Naomi Ehrich Leonard, Anastasia Bizyaeva, and Alessio Franci. “Fast and flexible multiagent
decision-making”. In: Annual Review of Control, Robotics, and Autonomous Systems 7 (2024).
[262] Naomi Ehrich Leonard and Edward Fiorelli. “Virtual leaders, artificial potentials and coordinated
control of groups”. In: Proceedings of the 40th IEEE conference on decision and control (Cat. No.
01CH37228). Vol. 3. IEEE. 2001, pp. 2968–2973.
[263] Naomi Ehrich Leonard, Derek A Paley, Francois Lekien, Rodolphe Sepulchre, David M Fratantoni,
and Russ E Davis. “Collective motion, sensor networks, and ocean sampling”. In: Proceedings of
the IEEE 95.1 (2007), pp. 48–74.
[264] Gen Li, Dmitry Kolomenskiy, Hao Liu, Benjamin Thiria, and Ramiro Godoy-Diana.
“Hydrodynamical Fingerprint of a Neighbour in a Fish Lateral Line”. In: Frontiers in Robotics and
AI 9 (2022).
[265] Liang Li, Danshi Liu, Jian Deng, Matthew J Lutz, and Guangming Xie. “Fish can save energy via
proprioceptive sensing”. In: Bioinspiration & biomimetics 16.5 (2021), p. 056013.
[266] Liang Li, Máté Nagy, Jacob M. Graving, Joseph Bak-Coleman, Guangming Xie, and
Iain D. Couzin. “Vortex phase matching as a strategy for schooling in robots and in fish”. In:
Nature Communications 11.1 (2020), p. 5408. doi: 10.1038/s41467-020-19086-0.
[267] Shuguang Li, Richa Batra, David Brown, Hyun-Dong Chang, Nikhil Ranganathan,
Chuck Hoberman, Daniela Rus, and Hod Lipson. “Particle robotics based on statistical mechanics
of loosely coupled components”. In: Nature 567.7748 (2019), pp. 361–365.
[268] Zongyi Li, Nikola Kovachki, Kamyar Azizzadenesheli, Burigede Liu, Kaushik Bhattacharya,
Andrew Stuart, and Anima Anandkumar. “Fourier neural operator for parametric partial
differential equations”. In: arXiv preprint arXiv:2010.08895 (2020).
[269] Jianhong Liang, Tianmiao Wang, and Li Wen. “Development of a two-joint robotic fish for
real-world exploration”. In: Journal of Field Robotics 28.1 (2011), pp. 70–79.
[270] James C Liao, David N Beal, George V Lauder, and Michael S Triantafyllou. “Fish exploiting
vortices decrease muscle activity”. In: Science 302.5650 (2003), pp. 1566–1569.
[271] James C. Liao, David N. Beal, George V. Lauder, and Michael S. Triantafyllou. “The Kármán gait:
novel body kinematics of rainbow trout swimming in a vortex street”. In: Journal of Experimental
Biology 206.6 (Mar. 2003), pp. 1059–1073. issn: 0022-0949.
[272] MJ Lighthill. Mathematical biofluiddynamics. Regional Conference Series in Applied
Mathematics. Society for Industrial and Applied Mathematics, Philadelphia, 1975.
[273] Achim Lilienthal and Tom Duckett. “Experimental analysis of gas-sensitive Braitenberg vehicles”.
In: Advanced Robotics 18.8 (2004), pp. 817–834.
[274] Zhonglu Lin, Amneet Pal Singh Bhalla, Boyce Griffith, Zi Sheng, Hongquan Li, Dongfang Liang,
and Yu Zhang. “How swimming style affects schooling of two fish-like wavy hydrofoils”. In:
arXiv preprint arXiv:2209.01590 (2022).
[275] Alec J Linot, Haotian Hang, Eva Kanso, and Kunihiko Taira. “Hierarchical equivariant graph
neural networks for forecasting collective motion in vortex clusters and microswimmers”. In:
arXiv preprint arXiv:2501.00626 (2024).
[276] Guijie Liu, Anyi Wang, Xinbao Wang, and Peng Liu. “A review of artificial lateral line in sensor
fabrication and bionic applications for robot fish”. In: Applied bionics and biomechanics 2016
(2016).
[277] Pengfei Liu and Neil Bose. “Propulsive performance from oscillating propulsors with spanwise
flexibility”. In: Proceedings of the Royal Society of London. Series A: Mathematical, Physical and
Engineering Sciences 453.1963 (1997), pp. 1763–1770.
[278] Zhecheng Liu, Diederik Beckers, and Jeff D Eldredge. “Model-Based Reinforcement Learning for
Control of Strongly-Disturbed Unsteady Aerodynamic Flows”. In: arXiv preprint arXiv:2408.14685
(2024).
[279] Thomas Lochmatter. “Bio-inspired and probabilistic algorithms for distributed odor source
localization using mobile robots”. PhD thesis. 2010.
[280] Aurore Loisy and Christophe Eloy. “Searching for a source without gradients: how good is
infotaxis and how to beat it”. In: Proceedings of the Royal Society A 478.2262 (2022), p. 20220118.
[281] Daniel A Burbano Lombana and Maurizio Porfiri. “Collective response of fish to combined
manipulations of illumination and flow”. In: Behavioural Processes 203 (2022), p. 104767.
[282] KH Low and CW Chong. “Parametric study of the swimming performance of a fish robot
propelled by a flexible caudal fin”. In: Bioinspiration & Biomimetics 5.4 (2010), p. 046002.
[283] Lu Lu, Pengzhan Jin, Guofei Pang, Zhongqiang Zhang, and George Em Karniadakis. “Learning
nonlinear operators via DeepONet based on the universal approximation theorem of operators”.
In: Nature machine intelligence 3.3 (2021), pp. 218–229.
[284] K. N. Lucas, N. Johnson, W. T. Beaulieu, E. Cathcart, G. Tirrell, S. P. Colin, B. J. Gemmell,
J. O. Dabiri, and J. H. Costello. “Bending rules for animal propulsion”. In: Nature communications
5.1 (2014), pp. 1–7.
[285] Kelsey N Lucas, Patrick J M Thornycroft, Brad J Gemmell, Sean P Colin, John H Costello, and
George V Lauder. “Effects of non-uniform stiffness on the swimming performance of a
passively-flexing, fish-like foil model”. In: Bioinspiration & Biomimetics 10.5 (Oct. 2015), p. 056019.
doi: 10.1088/1748-3190/10/5/056019.
[286] Enkeleida Lushi, Hugo Wioland, and Raymond E Goldstein. “Fluid flows created by swimming
bacteria drive self-organization in confined suspensions”. In: Proceedings of the National Academy
of Sciences 111.27 (2014), pp. 9733–9738.
[287] EP Lyon. “On rheotropism. I.—Rheotropism in fishes”. In: American Journal of Physiology-Legacy
Content 12.2 (1904), pp. 149–161.
[288] Kevin Y Ma, Pakpong Chirarattananon, Sawyer B Fuller, and Robert J Wood. “Controlled flight of
a biologically inspired, insect-scale robot”. In: Science 340.6132 (2013), pp. 603–607.
[289] J MacQueen. “Some methods for classification and analysis of multivariate observations”. In:
Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability. University of
California Press, 1967.
[290] John DW Madden, Nathan A Vandesteeg, Patrick A Anquetil, Peter GA Madden, Arash Takshi,
Rachel Z Pytel, Serge R Lafontaine, Paul A Wieringa, and Ian W Hunter. “Artificial muscle
technology: physical principles and naval prospects”. In: IEEE Journal of oceanic engineering 29.3
(2004), pp. 706–728.
[291] Manu S Madhav and Noah J Cowan. “The synergy between neuroscience and control theory: the
nervous system as inspiration for hard control challenges”. In: Annual Review of Control, Robotics,
and Autonomous Systems 3 (2020), pp. 243–267.
[292] Eleanor A Maguire, Neil Burgess, James G Donnett, Richard SJ Frackowiak, Christopher D Frith,
and John O’Keefe. “Knowing where and getting there: a human navigation network”. In: Science
280.5365 (1998), pp. 921–924.
[293] M Cristina Marchetti, Jean-François Joanny, Sriram Ramaswamy, Tanniemola B Liverpool,
Jacques Prost, Madan Rao, and R Aditi Simha. “Hydrodynamics of soft active matter”. In: Reviews
of modern physics 85.3 (2013), pp. 1143–1189.
[294] Danijela Marković, Alice Mizrahi, Damien Querlioz, and Julie Grollier. “Physics for neuromorphic
computing”. In: Nature Reviews Physics 2.9 (2020), pp. 499–510.
[295] S. Marras, S. S. Killen, J. Lindström, D. J. McKenzie, J. F. Steffensen, and P. Domenici. “Fish
swimming in schools save energy regardless of their spatial position”. In: Behavioral ecology and
sociobiology 69.2 (2015), pp. 219–226.
[296] Stefano Marras and Maurizio Porfiri. “Fish and robots swimming together: attraction towards the
robot demands biomimetic locomotion”. In: Journal of The Royal Society Interface 9.73 (2012),
pp. 1856–1868.
[297] Eduardo Martin Moraud and Dominique Martinez. “Effectiveness and robustness of robot
infotaxis for searching in dilute conditions”. In: Frontiers in neurorobotics 4 (2010), p. 1213.
[298] Ivan Masmitja, Mario Martin, Tom O’Reilly, Brian Kieft, Narcís Palomeras, Joan Navarro, and
Kakani Katija. “Dynamic robotic tracking of underwater targets using reinforcement learning”.
In: Science robotics 8.80 (2023), eade7811.
[299] Jean-Baptiste Masson. “Olfactory searches with limited space perception”. In: Proceedings of the
National Academy of Sciences 110.28 (2013), pp. 11261–11266.
[300] Michael E McConney, Nannan Chen, David Lu, Huan A Hu, Sheryl Coombs, Chang Liu, and
Vladimir V Tsukruk. “Biologically inspired design of hydrogel-capped hair sensors for enhanced
underwater flow detection”. In: Soft Matter 5.2 (2009), pp. 292–295.
[301] Barry M McCoy and Tai Tsun Wu. The two-dimensional Ising model. Harvard University Press,
1973.
[302] Matthew J McHenry, James A Strother, and Sietse M Van Netten. “Mechanical filtering by the
boundary layer and fluid–structure interaction in the superficial neuromast of the fish lateral line
system”. In: Journal of Comparative Physiology A 194 (2008), pp. 795–810.
[303] MJ McHenry, KE Feitl, JA Strother, and WJ Van Trump. “Larval zebrafish rapidly sense the water
flow of a predator’s strike”. In: Biology Letters 5.4 (2009), pp. 477–479.
[304] Leland McInnes and John Healy. “Accelerated hierarchical density based clustering”. In: 2017 IEEE
international conference on data mining workshops (ICDMW). IEEE. 2017, pp. 33–42.
[305] Amberle McKee, Alberto P Soto, Phoebe Chen, and Matthew J McHenry. “The sensory basis of
schooling by intermittent swimming in the rummy-nose tetra (Hemigrammus rhodostomus)”. In:
Proceedings of the Royal Society B 287.1937 (2020), p. 20200568.
[306] Mahmoud Medany, Lorenzo Piglia, Liam Achenbach, S Karthik Mukkavilli, and Daniel Ahmed.
“Model-Based Reinforcement Learning for Ultrasound-Driven Autonomous Microrobots”. In:
bioRxiv (2024).
[307] Idir Mellal, David Crompton, Milos Popovic, and Milad Lankarany. “A Flexible FPGA
Implementation of Morris-Lecar Neuron for Reproducing Different Neuronal Behaviors”. In:
International Conference on Neuromorphic Systems 2021. 2021, pp. 1–5.
[308] Karthik Menon and Rajat Mittal. “Flow physics and dynamics of flow-induced pitch oscillations
of an airfoil”. In: Journal of Fluid Mechanics 877 (2019), pp. 582–613.
[309] Josh Merel, Diego Aldarondo, Jesse Marshall, Yuval Tassa, Greg Wayne, and Bence Ölveczky.
“Deep neuroethology of a virtual rodent”. In: arXiv preprint arXiv:1911.09451 (2019).
[310] Alex Mesoudi and Andrew Whiten. “The multiple roles of cultural transmission experiments in
understanding human cultural evolution”. In: Philosophical Transactions of the Royal Society B:
Biological Sciences 363.1509 (2008), pp. 3489–3501.
[311] J-M Miao and M-H Ho. “Effect of flexure on aerodynamic propulsive efficiency of flapping flexible
airfoil”. In: Journal of Fluids and Structures 22.3 (2006), pp. 401–419.
[312] Sébastien Michelin and Stefan G Llewellyn Smith. “Resonance and propulsion performance of a
heaving flexible wing”. In: Physics of Fluids 21.7 (2009), p. 071902.
[313] Seyed M Mirvakili and Ian W Hunter. “Artificial muscles: Mechanisms, applications, and
challenges”. In: Advanced Materials 30.6 (2018), p. 1704407.
[314] Amir Mirzaeinia, F Heppner, and Mostafa Hassanalian. “An analytical study on leader and
follower switching in V-shaped Canada Goose flocks for energy management purposes”. In:
Swarm Intelligence 14.2 (2020), pp. 117–141.
[315] Rajat Mittal, Haibo Dong, Meliha Bozkurttas, FM Najjar, Abel Vargas, and Alfred Von Loebbecke.
“A versatile sharp interface immersed boundary method for incompressible flows with complex
boundaries”. In: Journal of computational physics 227.10 (2008), pp. 4825–4852.
[316] Hideyuki Miyahara, Hyu Yoneki, and Vwani Roychowdhury. “Vicsek Model Meets DBSCAN:
Cluster Phases in the Vicsek Model”. In: arXiv preprint arXiv:2307.12538 (2023).
[317] Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A Rusu, Joel Veness,
Marc G Bellemare, Alex Graves, Martin Riedmiller, Andreas K Fidjeland, Georg Ostrovski, et al.
“Human-level control through deep reinforcement learning”. In: Nature 518.7540 (2015),
pp. 529–533.
[318] John C Montgomery, Cindy F Baker, and Alexander G Carton. “The lateral line can mediate
rheotaxis in fish”. In: Nature 389.6654 (1997), pp. 960–963.
[319] John C Montgomery, Sheryl Coombs, and Cindy F Baker. “The mechanosensory lateral line
system of the hypogean form of Astyanax fasciatus”. In: The biology of hypogean fishes. Springer,
2001, pp. 87–96.
[320] Rémi Monthiller, Aurore Loisy, Mimi AR Koehl, Benjamin Favier, and Christophe Eloy. “Surfing
on turbulence: a strategy for planktonic navigation”. In: Physical Review Letters 129.6 (2022),
p. 064502.
[321] Brandon J Moore and Carlos Canudas-de-Wit. “Source seeking via collaborative measurements by
a circular formation of agents”. In: Proceedings of the 2010 American control conference. IEEE. 2010,
pp. 6417–6422.
[322] K. W. Moored and D. B. Quinn. “Inviscid scaling laws of a self-propelled pitching airfoil”. In:
AIAA Journal 57.9 (2019), pp. 3686–3700.
[323] Keith W Moored, Peter A Dewey, Birgitt M Boschitsch, AJ Smits, and H Haj-Hariri. “Linear
instability mechanisms leading to optimally efficient locomotion with flexible propulsors”. In:
Physics of Fluids 26.4 (2014).
[324] Jorge J Moré and Danny C Sorensen. “Computing a trust region step”. In: SIAM Journal on
Scientific and Statistical Computing 4.3 (1983), pp. 553–572.
[325] Xinyi Mou, Xuanwen Ding, Qi He, Liang Wang, Jingcong Liang, Xinnong Zhang, Libo Sun,
Jiayu Lin, Jie Zhou, Xuanjing Huang, et al. “From Individual to Society: A Survey on Social
Simulation Driven by Large Language Model-based Agents”. In: arXiv preprint arXiv:2412.03563
(2024).
[326] Naveed Muhammad, Juan Francisco Fuentes-Perez, Jeffrey A Tuhtan, Gert Toming, Mark Musall,
and Maarja Kruusmaa. “Map-based localization and loop-closure detection from a moving
underwater platform using flow features”. In: Autonomous Robots 43 (2019), pp. 1419–1434.
[327] Naveed Muhammad, Gert Toming, Jeffrey A Tuhtan, Mark Musall, and Maarja Kruusmaa.
“Underwater map-based localization using flow features”. In: Autonomous Robots 41 (2017),
pp. 417–436.
[328] UK Müller, BLE Van Den Heuvel, EJ Stamhuis, and JJ Videler. “Fish foot prints: morphology and
energetics of the wake behind a continuously swimming mullet (Chelon labrosus Risso)”. In:
Journal of Experimental Biology 200.22 (1997), pp. 2893–2906.
[329] Ulrike K Müller, Eize J Stamhuis, and John J Videler. “Riding the Waves: the Role of the Body Wave
in Undulatory Fish Swimming”. In: Integrative and Comparative Biology 42 (2002), pp. 981–987.
[330] John Murlis, Joseph S Elkinton, Ring T Carde, et al. “Odor plumes and how insects use them”. In:
Annual review of entomology 37.1 (1992), pp. 505–532.
[331] Kevin Murphy. “Reinforcement Learning: An Overview”. In: arXiv preprint arXiv:2412.05265
(2024).
[332] Jinichi Nagumo, Suguru Arimoto, and Shuji Yoshizawa. “An active pulse transmission line
simulating nerve axon”. In: Proceedings of the IRE 50.10 (1962), pp. 2061–2070.
[333] Máté Nagy, Zsuzsa Ákos, Dora Biro, and Tamás Vicsek. “Hierarchical group dynamics in pigeon
flocks”. In: Nature 464.7290 (2010), pp. 890–893.
[334] National Academies of Sciences, Engineering, and Medicine. Physics of Life. Washington, DC: The
National Academies Press, 2022. isbn: 978-0-309-27400-5. doi: 10.17226/26403.
[335] Aryan Naveen, Jalil Morris, Christian Chan, Daniel Mhrous, E Farrell Helbling,
Nak-Seung Patrick Hyun, Gage Hills, and Robert J Wood. “Hardware-in-the-Loop for
Characterization of Embedded State Estimation for Flying Microrobots”. In: arXiv preprint
arXiv:2411.06382 (2024).
[336] Sietse M van Netten. “Hydrodynamic detection by cupulae in a lateral line canal: functional
relations between physics and physiology”. In: Biological cybernetics 94.1 (2006), pp. 67–85.
[337] Sietse M van Netten and Matthew J McHenry. “The biophysics of the fish lateral line”. In: The
Lateral Line System. Springer, 2013, pp. 99–119.
[338] Gabrielle A Nevitt. “Olfactory foraging by Antarctic procellariiform seabirds: life at high
Reynolds numbers”. In: The Biological Bulletin 198.2 (2000), pp. 245–253.
[339] J. W. Newbolt, J. Zhang, and L. Ristroph. “Flow interactions between uncoordinated flapping
swimmers give rise to group cohesion”. In: Proceedings of the National Academy of Sciences 116
(2019), p. 201816098. issn: 0027-8424. doi: 10.1073/pnas.1816098116.
[340] Joel W Newbolt, Nickolas Lewis, Mathilde Bleu, Jiajie Wu, Christiana Mavroyiakoumou,
Sophie Ramananarivo, and Leif Ristroph. “Flow interactions lead to self-organized flight
formations disrupted by self-amplifying waves”. In: Nature communications 15.1 (2024), p. 3462.
[341] Joel W Newbolt, Jun Zhang, and Leif Ristroph. “Lateral flow interactions enhance speed and
stabilize formations of flapping swimmers”. In: Physical Review Fluids 7.6 (2022), p. L061101.
[342] Kacie TM Niimoto, Kyleigh J Kuball, Lauren N Block, Petra H Lenz, and Daisuke Takagi.
“Rotational maneuvers of copepod nauplii at low Reynolds number”. In: Fluids 5.2 (2020), p. 78.
[343] M. Nitsche and R. Krasny. “A numerical study of vortex ring formation at the edge of a circular
tube”. In: Journal of Fluid Mechanics 276 (1994), pp. 139–161.
[344] Petter Ögren, Edward Fiorelli, and Naomi Ehrich Leonard. “Cooperative control of mobile sensor
networks: Adaptive gradient climbing in a distributed environment”. In: IEEE Transactions on
Automatic control 49.8 (2004), pp. 1292–1302.
[345] Henrique M Oliveira and Luís V Melo. “Huygens synchronization of two clocks”. In: Scientific
reports 5.1 (2015), pp. 1–12.
[346] Gregory Ongie, Ajil Jalal, Christopher A Metzler, Richard G Baraniuk, Alexandros G Dimakis,
and Rebecca Willett. “Deep learning techniques for inverse problems in imaging”. In: IEEE
Journal on Selected Areas in Information Theory 1.1 (2020), pp. 39–56.
[347] OpenAI. OpenAI GitHub Baselines - Proximal Policy Optimization.
https://github.com/openai/baselines/tree/master/baselines/ppo2.
[348] OpenAI. OpenAI Spinning Up documentation - Proximal Policy Optimization.
https://spinningup.openai.com/en/latest/algorithms/ppo.html.
[349] Steven A Orszag. “Analytical theories of turbulence”. In: Journal of Fluid Mechanics 41.2 (1970),
pp. 363–386.
[350] Pablo Oteiza, Iris Odstrcil, George Lauder, Ruben Portugues, and Florian Engert. “A novel
mechanism for mechanosensory-based rheotaxis in larval zebrafish”. In: Nature 547.7664 (2017),
pp. 445–448.
[351] Sinno Jialin Pan and Qiang Yang. “A survey on transfer learning”. In: IEEE Transactions on
knowledge and data engineering 22.10 (2009), pp. 1345–1359.
[352] Rich Pang, Floris van Breugel, Michael Dickinson, Jeffrey A. Riffell, and Adrienne Fairhall.
“History dependence in insect flight decisions during odor tracking”. In: PLOS Computational
Biology 14.2 (Feb. 2018), pp. 1–26. doi: 10.1371/journal.pcbi.1005969.
[353] German I Parisi, Ronald Kemker, Jose L Part, Christopher Kanan, and Stefan Wermter. “Continual
lifelong learning with neural networks: A review”. In: Neural networks 113 (2019), pp. 54–71.
[354] Joon Sung Park, Joseph O’Brien, Carrie Jun Cai, Meredith Ringel Morris, Percy Liang, and
Michael S Bernstein. “Generative agents: Interactive simulacra of human behavior”. In:
Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology. 2023,
pp. 1–22.
[355] Sung Goon Park and Hyung Jin Sung. “Hydrodynamics of flexible fins propelled in tandem,
diagonal, triangular and diamond configurations”. In: Journal of Fluid Mechanics 840 (2018), p. 154.
[356] B. L. Partridge and T. J. Pitcher. “Evidence against a hydrodynamic function for fish schools”. In:
Nature 279.5712 (1979), pp. 418–419.
[357] Brian L Partridge. “Internal dynamics and the interrelations of fish in schools”. In: Journal of
comparative physiology 144 (1981), pp. 313–325.
[358] Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan,
Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, et al. “PyTorch: An imperative style,
high-performance deep learning library”. In: Advances in neural information processing systems 32
(2019).
[359] DS Pavlov, AO Kasumyan, et al. “Patterns and mechanisms of schooling behavior in fish: a
review”. In: Journal of Ichthyology 40.2 (2000), S163.
[360] Stephan J Peake and Anthony P Farrell. “Locomotory behaviour and post-exercise physiology in
relation to swimming speed, gait transition and metabolism in free-swimming smallmouth bass
(Micropterus dolomieu)”. In: Journal of Experimental Biology 207.9 (2004), pp. 1563–1575.
[361] Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion,
Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, et al.
“Scikit-learn: Machine learning in Python”. In: Journal of Machine Learning Research 12 (2011),
pp. 2825–2830.
[362] Ze-Rui Peng, Haibo Huang, and Xi-Yun Lu. “Hydrodynamic schooling of multiple self-propelled
flapping plates”. In: Journal of Fluid Mechanics 853 (2018), pp. 587–600.
[363] Anton Peshkov, Eric Bertin, Francesco Ginelli, and Hugues Chaté. “Boltzmann-Ginzburg-Landau
approach for continuous descriptions of generic Vicsek-like models”. In: The European Physical
Journal Special Topics 223.7 (2014), pp. 1315–1344.
[364] Anton Peshkov, Sandrine Ngo, Eric Bertin, Hugues Chaté, and Francesco Ginelli. “Continuous
theory of active matter systems with metric-free interactions”. In: Physical review letters 109.9
(2012), p. 098101.
[365] Charles S Peskin. “Numerical analysis of blood flow in the heart”. In: Journal of computational
physics 25.3 (1977), pp. 220–252.
[366] Ashley N Peterson, Alberto P Soto, and Matthew J McHenry. “Pursuit and evasion strategies in
the predator–prey interactions of fishes”. In: Integrative and comparative biology 61.2 (2021),
pp. 668–680.
[367] Ashley N Peterson, Nathan Swanson, and Matthew J McHenry. “Fish communicate with water
flow to enhance a school’s social network”. In: Journal of Experimental Biology 227.17 (2024).
[368] Winnie Poel, Bryan C Daniels, Matthew MG Sosna, Colin R Twomey, Simon P Leblanc,
Iain D Couzin, and Pawel Romanczuk. “Subcritical escape waves in schooling fish”. In: Science
Advances 8.25 (2022), eabm6385.
[369] Kirsten Pohlmann, Jelle Atema, and Thomas Breithaupt. “The importance of the lateral line in
nocturnal predation of piscivorous catfish”. In: Journal of Experimental Biology 207.17 (2004),
pp. 2971–2978.
[370] Kirsten Pohlmann, Frank W Grasso, and Thomas Breithaupt. “Tracking wakes: the nocturnal
predatory strategy of piscivorous catfish”. In: Proceedings of the National Academy of Sciences
98.13 (2001), pp. 7371–7374.
[371] B. Pollard and P. Tallapragada. “Passive Appendages Improve the Maneuverability of Fishlike
Robots”. In: IEEE/ASME Transactions on Mechatronics 24.4 (2019), pp. 1586–1596. doi:
10.1109/TMECH.2019.2916779.
[372] Beau Pollard and Phanindra Tallapragada. “Learning hydrodynamic signatures through
proprioceptive sensing by bioinspired swimmers”. In: Bioinspiration & Biomimetics 16.2 (2021),
p. 026014.
[373] Maurizio Porfiri, Peng Zhang, and Sean D Peterson. “Hydrodynamic model of fish orientation in a
channel flow”. In: bioRxiv (2021).
[374] Steven J Portugal, Tatjana Y Hubel, Johannes Fritz, Stefanie Heese, Daniela Trobe,
Bernhard Voelkl, Stephen Hailes, Alan M Wilson, and James R Usherwood. “Upwash exploitation
and downwash avoidance by flap phasing in ibis formation flight”. In: Nature 505.7483 (2014),
pp. 399–402.
[375] Andrea Procaccini, Alberto Orlandi, Andrea Cavagna, Irene Giardina, Francesca Zoratto,
Daniela Santucci, Flavia Chiarotti, Charlotte K Hemelrijk, Enrico Alleva, Giorgio Parisi, et al.
“Propagating waves in starling, Sturnus vulgaris, flocks under predation”. In: Animal behaviour
82.4 (2011), pp. 759–765.
[376] Kai Qi, Elmar Westphal, Gerhard Gompper, and Roland G Winkler. “Emergence of active
turbulence in microswimmer suspensions due to active hydrodynamic stress and volume
exclusion”. In: Communications Physics 5.1 (2022), p. 49.
[377] Suyang Qin, Haotian Hang, Yang Xiang, and Hong Liu. “Reynolds-number scaling analysis on lift
generation of a flapping and passive rotating wing with an inhomogeneous mass distribution”. In:
Chinese Journal of Aeronautics (2023).
[378] D B Quinn, G V Lauder, and A J Smits. “Scaling the propulsive performance of heaving flexible
panels”. In: Journal of fluid mechanics 738 (2014), p. 250.
[379] Daniel B. Quinn, George V. Lauder, and Alexander J. Smits. “Maximizing the efficiency of a
flexible propulsor using experimental optimization”. In: Journal of Fluid Mechanics 767 (2015),
pp. 430–448. doi: 10.1017/jfm.2015.35.
[380] D. V. Radakov. Schooling in the ecology of fish. New York: J. Wiley, 1973. isbn: 0706513517.
[381] Maziar Raissi, Paris Perdikaris, and George E Karniadakis. “Physics-informed neural networks: A
deep learning framework for solving forward and inverse problems involving nonlinear partial
differential equations”. In: Journal of Computational physics 378 (2019), pp. 686–707.
[382] Maziar Raissi, Alireza Yazdani, and George Em Karniadakis. “Hidden fluid mechanics: Learning
velocity and pressure fields from flow visualizations”. In: Science 367.6481 (2020), pp. 1026–1030.
[383] Srinivas Ramakrishnan, Meliha Bozkurttas, Rajat Mittal, and George V Lauder. “Thrust
production in highly flexible pectoral fins: a computational dissection”. In: Marine Technology
Society Journal 45.4 (2011), pp. 56–64.
[384] S. Ramananarivo, F. Fang, A. Oza, J. Zhang, and L. Ristroph. “Flow interactions lead to orderly
formations of flapping wings in forward flight”. In: Phys. Rev. Fluids 1 (7 Nov. 2016), p. 071201.
doi: 10.1103/PhysRevFluids.1.071201.
[385] Gautam Reddy, Antonio Celani, Terrence J Sejnowski, and Massimo Vergassola. “Learning to soar
in turbulent environments”. In: Proceedings of the National Academy of Sciences 113.33 (2016),
E4877–E4884.
[386] Gautam Reddy, Venkatesh N Murthy, and Massimo Vergassola. “Olfactory sensing and navigation
in turbulent environments”. In: Annual Review of Condensed Matter Physics 13 (2022), pp. 191–213.
[387] Gautam Reddy, Boris I Shraiman, and Massimo Vergassola. “Sector search strategies for odor trail
tracking”. In: Proceedings of the National Academy of Sciences 119.1 (2022), e2107431118.
[388] Gautam Reddy, Jerome Wong-Ng, Antonio Celani, Terrence J Sejnowski, and Massimo Vergassola.
“Glider soaring via reinforcement learning in the field”. In: Nature 562.7726 (2018), pp. 236–239.
[389] G Rieucau, A De Robertis, KM Boswell, and NO Handegard. “School density affects the strength
of collective avoidance responses in wild-caught Atlantic herring Clupea harengus: a simulated
predator encounter experiment”. In: Journal of Fish Biology 85.5 (2014), pp. 1650–1664.
[390] Guillaume Rieucau, Anders Fernö, Christos C Ioannou, and Nils Olav Handegard. “Towards of a
firmer explanation of large shoal formation, maintenance and collective reactions in marine fish”.
In: Reviews in Fish Biology and Fisheries 25 (2015), pp. 21–37.
[391] Leif Ristroph, James C Liao, and Jun Zhang. “Lateral line layout correlates with the differential
hydrodynamic pressure on swimming fish”. In: Physical Review Letters 114.1 (2015), p. 018102.
[392] Alan Roberts, Roman Borisyuk, Edgar Buhl, Andrea Ferrario, Stella Koutsikou, Wen-Chang Li,
and Stephen R Soffe. “The decision to move: response times, neuronal circuits and sensory
memory in a simple vertebrate”. In: Proceedings of the Royal Society B 286.1899 (2019), p. 20190297.
[393] Jaeha Ryu, Sung Goon Park, Wei-Xi Huang, and Hyung Jin Sung. “Hydrodynamics of a
three-dimensional self-propelled flexible plate”. In: Physics of Fluids 31.2 (2019), p. 021902. doi:
10.1063/1.5064482. eprint: https://doi.org/10.1063/1.5064482.
[394] Philip G Saffman. Vortex dynamics. Cambridge university press, 1995.
[395] Taavi Salumäe, Inaki Ranó, Otar Akanyeti, and Maarja Kruusmaa. “Against the flow: A
Braitenberg controller for a fish robot”. In: 2012 IEEE International Conference on Robotics and
Automation. IEEE. 2012, pp. 4210–4215.
[396] Katsufumi Sato, Yutaka Watanuki, Akinori Takahashi, Patrick JO Miller, Hideji Tanaka,
Ryo Kawabe, Paul J Ponganis, Yves Handrich, Tomonari Akamatsu, Yuuki Watanabe, et al. “Stroke
frequency, but not swimming speed, is related to body size in free-ranging seabirds, pinnipeds and
cetaceans”. In: Proceedings of the Royal Society B: Biological Sciences 274.1609 (2007), pp. 471–477.
[397] Sercan Sayin, Einat Couzin-Fuchs, Inga Petelski, Yannick Günzel, Mohammad Salahshour,
Chi-Yu Lee, Jacob M. Graving, Liang Li, Oliver Deussen, Gregory A. Sword, and Iain D. Couzin.
“The behavioral mechanisms governing collective motion in swarming locusts”. In: Science
387.6737 (2025), pp. 995–1000. doi: 10.1126/science.adq7832. eprint:
https://www.science.org/doi/pdf/10.1126/science.adq7832.
[398] Teis Schnipper, Anders Andersen, and Tomas Bohr. “Vortex wakes of a flapping foil”. In: Journal
of Fluid Mechanics 633 (2009), p. 411.
[399] Erich Schubert, Jörg Sander, Martin Ester, Hans Peter Kriegel, and Xiaowei Xu. “DBSCAN
revisited, revisited: why and how you should (still) use DBSCAN”. In: ACM Transactions on
Database Systems (TODS) 42.3 (2017), pp. 1–21.
[400] John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov. “Proximal policy
optimization algorithms”. In: arXiv preprint arXiv:1707.06347 (2017).
[401] N Schulte-Pelkum, S Wieskotten, W Hanke, G Dehnhardt, and B Mauck. “Tracking of biogenic
hydrodynamic trails in harbour seals (Phoca vitulina)”. In: Journal of Experimental Biology 210.5
(2007), pp. 781–787.
[402] Jung-Hee Seo and Rajat Mittal. “Improved swimming performance in schooling fish via
leading-edge vortex enhancement”. In: Bioinspiration & Biomimetics 17.6 (2022), p. 066020.
[403] Tim Seyde, Igor Gilitschenski, Wilko Schwarting, Bartolomeo Stellato, Martin Riedmiller,
Markus Wulfmeier, and Daniela Rus. “Is Bang-Bang Control All You Need? Solving Continuous
Control with Bernoulli Policies”. In: Advances in Neural Information Processing Systems 34 (2021).
[404] M. Sfakiotakis, D. M. Lane, and J. B. C. Davies. “Review of fish swimming modes for aquatic
locomotion”. In: IEEE Journal of Oceanic Engineering 24.2 (1999), pp. 237–252. doi:
10.1109/48.757275.
[405] Robert E Shadwick and Sven Gemballa. “Structure, kinematics, and muscle dynamics in
undulatory swimming”. In: Fish physiology 23 (2005), pp. 241–280.
[406] Danish Shaikh and Ignacio Rañó. “Braitenberg vehicles as computational tools for research in
neuroscience”. In: Frontiers in bioengineering and biotechnology 8 (2020), p. 565963.
[407] Claude Elwood Shannon. “A mathematical theory of communication”. In: The Bell system
technical journal 27.3 (1948), pp. 379–423.
[408] Michael J Shelley and Jun Zhang. “Flapping and bending bodies interacting with fluid flows”. In:
Annual Review of Fluid Mechanics 43 (2011), pp. 449–465.
[409] J. X. Sheng, A. Ysasi, D. Kolomenskiy, E. Kanso, M. Nitsche, and K. Schneider. “Simulating Vortex
Wakes of Flapping Plates”. In: Natural Locomotion in Fluids and on Surfaces. Ed. by
Stephen Childress, Anette Hosoi, William W. Schultz, and Jane Wang. New York, NY: Springer
New York, 2012, pp. 255–262. isbn: 978-1-4614-3997-4.
[410] Tan Shizhe. “Underwater artificial lateral line flow sensors”. In: Microsystem technologies 20.12
(2014), pp. 2123–2136.
[411] Kourosh Shoele and Rajat Mittal. “Flutter instability of a thin flexible plate in a channel”. In:
Journal of Fluid Mechanics 786 (2016), pp. 29–46. doi: 10.1017/jfm.2015.632.
[412] C Simpfendorfer. Galeocerdo cuvier. In IUCN 2013. IUCN Red List of Threatened Species, Version
2013.1. Website. http://www.iucnredlist.org/. 2009.
[413] Stephen D Simpson, Hugo B Harrison, Michel R Claereboudt, and Serge Planes. “Long-distance
dispersal via ocean currents connects Omani clownfish populations throughout entire species
range”. In: PLoS One 9.9 (2014), e107610.
[414] Satpreet H Singh, Floris van Breugel, Rajesh PN Rao, and Bingni W Brunton. “Emergent
behaviour and neural dynamics in artificial agents tracking odour plumes”. In: Nature Machine
Intelligence 5.1 (2023), pp. 58–70.
[415] Alexander J Smits. “Undulatory and oscillatory swimming”. In: Journal of Fluid Mechanics 874
(2019).
[416] Jasper Snoek, Hugo Larochelle, and Ryan P Adams. “Practical bayesian optimization of machine
learning algorithms”. In: Advances in neural information processing systems 25 (2012).
[417] Houshang H Sohrab. Basic real analysis. Vol. 231. Springer, 2003.
[418] Alexandre P Solon, Joakim Stenhammar, Michael E Cates, Yariv Kafri, and Julien Tailleur.
“Generalized thermodynamics of phase equilibria in scalar active matter”. In: Physical Review E
97.2 (2018), p. 020602.
[419] Geoffrey R Spedding. “Wake signature detection”. In: Annual review of fluid mechanics 46 (2014),
pp. 273–302.
[420] William J Stewart and Matthew J McHenry. “Sensing the strike of a predator fish depends on the
specific gravity of a prey fish”. In: Journal of Experimental Biology 213.22 (2010), pp. 3769–3777.
[421] John D Stieglitz, Edward M Mager, Ronald H Hoenig, Daniel D Benetti, and Martin Grosell.
“Impacts of Deepwater Horizon crude oil exposure on adult mahi-mahi (Coryphaena hippurus)
swim performance”. In: Environmental Toxicology and Chemistry 35.10 (2016), pp. 2613–2622.
[422] S.H. Strogatz. Nonlinear Dynamics and Chaos with Student Solutions Manual: With Applications to
Physics, Biology, Chemistry, and Engineering, Second Edition. CRC Press, 2018. isbn:
9780429680151.
[423] Steven Strogatz, Sara Walker, Julia M Yeomans, Corina Tarnita, Elsa Arcaute,
Manlio De Domenico, Oriol Artime, and Kwang-Il Goh. “Fifty years of ‘More is different’”. In:
Nature Reviews Physics 4.8 (2022), pp. 508–510.
[424] Steven H Strogatz. Nonlinear dynamics and chaos with student solutions manual: With applications
to physics, biology, chemistry, and engineering. CRC press, 1994.
[425] Steven H Strogatz and Ian Stewart. “Coupled oscillators and biological synchronization”. In:
Scientific american 269.6 (1993), pp. 102–109.
[426] Student. “The probable error of a mean”. In: Biometrika (1908), pp. 1–25.
[427] Rohit Supekar, Boya Song, Alasdair Hastewell, Gary PT Choi, Alexander Mietke, and
Jörn Dunkel. “Learning hydrodynamic equations for active matter from particle simulations and
experiments”. In: Proceedings of the National Academy of Sciences 120.7 (2023), e2206994120.
[428] Richard S Sutton and Andrew G Barto. Reinforcement learning: An introduction. MIT press, 2018.
[429] Jon C Svendsen, Jakob Skov, Mogens Bildsoe, and John Fleng Steffensen. “Intra-school positional
preference and reduced tail beat frequency in trailing positions in schooling roach under
experimental conditions”. In: Journal of fish biology 62.4 (2003), pp. 834–846.
[430] Milad Taghavi, Wei Wang, Kyubum Shim, Jinsong Zhang, Itai Cohen, and Alyssa Apsel.
“Coordinated behavior of autonomous microscopic machines through local electronic pulse
coupling”. In: Science Robotics (2024).
[431] Kunihiko Taira and Tim Colonius. “Three-dimensional flows around low-aspect-ratio flat-plate
wings at low Reynolds numbers”. In: Journal of Fluid Mechanics 623 (2009), pp. 187–207.
[432] Kunihiko Taira, Maziar S Hemati, Steven L Brunton, Yiyang Sun, Karthik Duraisamy,
Shervin Bagheri, Scott TM Dawson, and Chi-An Yeh. “Modal analysis of fluid flows: Applications
and outlook”. In: AIAA journal 58.3 (2020), pp. 998–1022.
[433] Yu Jun Tan, Gianmarco Mengaldo, and Cecilia Laschi. “Artificial Muscles for Underwater Soft
Robots: Materials and Their Interactions”. In: Annual Review of Condensed Matter Physics 15
(2023).
[434] Sadatoshi Taneda. “Experimental investigation of vortex streets”. In: Journal of the Physical
Society of Japan 20.9 (1965), pp. 1714–1721.
[435] James L Tangorra, George V Lauder, Ian W Hunter, Rajat Mittal, Peter GA Madden, and
Meliha Bozkurttas. “The effect of fin ray flexural rigidity on the propulsive forces generated by a
biorobotic fish pectoral fin”. In: Journal of Experimental Biology 213.23 (2010), pp. 4043–4054.
[436] Graham K Taylor, Robert L Nudds, and Adrian LR Thomas. “Flying and swimming animals cruise
at a Strouhal number tuned for high power efficiency”. In: Nature 425.6959 (2003), pp. 707–711.
[437] Matthew E Taylor and Peter Stone. “Transfer learning for reinforcement learning domains: A
survey.” In: Journal of Machine Learning Research 10.7 (2009).
[438] Robin Thandiackal and George Lauder. “In-line swimming dynamics revealed by fish interacting
with a robotic mechanism”. In: Elife 12 (2023), e81392.
[439] John Toner and Yuhai Tu. “Flocks, herds, and schools: A quantitative theory of flocking”. In:
Physical review E 58.4 (1998), p. 4828.
[440] John Toner and Yuhai Tu. “Long-range order in a two-dimensional dynamical XY model: how
birds fly together”. In: Physical review letters 75.23 (1995), p. 4326.
[441] John Toner, Yuhai Tu, and Sriram Ramaswamy. “Hydrodynamics and phases of flocks”. In: Annals
of Physics 318.1 (2005), pp. 170–244.
[442] George S Triantafyllou, Michael S Triantafyllou, and Mark A Grosenbaugh. “Optimal thrust
development in oscillating foils with application to fish propulsion”. In: Journal of Fluids and
Structures 7.2 (1993), pp. 205–224.
[443] M. S. Triantafyllou, G. S. Triantafyllou, and D. K. Yue. “Hydrodynamics of fishlike swimming”. In:
Annual review of fluid mechanics 32.1 (2000), pp. 33–53.
[444] A. C. H. Tsang and E. Kanso. “Dipole interactions in doubly periodic domains”. In: Journal of
Nonlinear Science 23.6 (2013), pp. 971–991.
[445] Alan Cheng Hou Tsang and Eva Kanso. “Density shock waves in confined microswimmers”. In:
Physical review letters 116.4 (2016), p. 048101.
[446] Alan Cheng Hou Tsang, Michael J Shelley, and Eva Kanso. “Activity-induced instability of
phonons in 1D microfluidic crystals”. In: Soft Matter 14.6 (2018), pp. 945–950.
[447] Hsue-Shen Tsien. “Symmetrical Joukowsky airfoils in shear flow”. In: Quarterly of Applied
Mathematics 1.2 (1943), pp. 130–148.
[448] Hiroyasu Tsukamoto, Soon-Jo Chung, and Jean-Jacques E Slotine. “Contraction theory for
nonlinear stability analysis and learning-based control: A tutorial overview”. In: Annual Reviews
in Control 52 (2021), pp. 135–169.
[449] Eric D Tytell, Chia-Yu Hsu, and Lisa J Fauci. “The role of mechanical resonance in the neural
control of swimming in fishes”. In: Zoology 117.1 (2014), pp. 48–56.
[450] Eric D Tytell, Chia-Yu Hsu, Thelma L Williams, Avis H Cohen, and Lisa J Fauci. “Interactions
between internal forces, body stiffness, and fluid environment in a neuromechanical model of
lamprey swimming”. In: Proceedings of the National Academy of Sciences 107.46 (2010),
pp. 19832–19837.
[451] Eric D Tytell and George V Lauder. “The hydrodynamics of eel swimming: I. Wake structure”. In:
Journal of Experimental Biology 207.11 (2004), pp. 1825–1841.
[452] Eric D Tytell, Megan C Leftwich, Chia-Yu Hsu, Boyce E Griffith, Avis H Cohen, Alexander J Smits,
Christina Hamlet, and Lisa J Fauci. “Role of body stiffness in undulatory swimming: insights from
robotic and computational models”. In: Physical Review Fluids 1.7 (2016), p. 073202.
[453] Selen Uguroglu and Jaime Carbonell. “Feature selection for transfer learning”. In: Joint European
Conference on Machine Learning and Knowledge Discovery in Databases. Springer. 2011,
pp. 430–442.
[454] James R Usherwood, Marinos Stavrou, John C Lowe, Kyle Roskilly, and Alan M Wilson. “Flying in
a flock comes at a cost in pigeons”. In: Nature 474.7352 (2011), pp. 494–497.
[455] Gerald Van Belle. Statistical rules of thumb. Vol. 699. John Wiley & Sons, 2011.
[456] Floris Van Breugel and Michael H Dickinson. “Plume-tracking behavior of flying Drosophila
emerges from a set of distinct sensory-motor reflexes”. In: Current Biology 24.3 (2014),
pp. 274–286.
[457] T. Van Buren, D. Floryan, and A. J. Smits. “Scaling and performance of simultaneously heaving
and pitching foils”. In: AIAA Journal 57.9 (2019), pp. 3666–3677.
[458] William J Van Trump and Matthew J McHenry. The lateral line system is not necessary for
rheotaxis in the Mexican blind cavefish (Astyanax fasciatus). 2013.
[459] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez,
Łukasz Kaiser, and Illia Polosukhin. “Attention is all you need”. In: Advances in neural information
processing systems 30 (2017).
[460] Nathan Vaughn, Leighton Wilson, and Robert Krasny. “A GPU-accelerated barycentric Lagrange
treecode”. In: 2020 IEEE International Parallel and Distributed Processing Symposium Workshops
(IPDPSW). IEEE. 2020, pp. 701–710.
[461] Wouter G van Veen, Johan L van Leeuwen, and Florian T Muijres. “Malaria mosquitoes use leg
push-off forces to control body pitch during take-off”. In: Journal of Experimental Zoology Part A:
Ecological and Integrative Physiology 333.1 (2020), pp. 38–49.
[462] Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Lio, and
Yoshua Bengio. “Graph attention networks”. In: arXiv preprint arXiv:1710.10903 (2017).
[463] Roberto Venturelli, Otar Akanyeti, Francesco Visentin, Jaas Ježov, Lily D Chambers, Gert Toming,
Jennifer Brown, Maarja Kruusmaa, William M Megill, and Paolo Fiorini. “Hydrodynamic pressure
sensing with an artificial lateral line in steady and unsteady flows”. In: Bioinspiration &
biomimetics 7.3 (2012), p. 036004.
[464] Kyrell Vann B Verano, Emanuele Panizon, and Antonio Celani. “Olfactory search with finite-state
controllers”. In: Proceedings of the National Academy of Sciences 120.34 (2023), e2304230120.
[465] Massimo Vergassola, Emmanuel Villermaux, and Boris I Shraiman. “‘Infotaxis’ as a strategy for
searching without gradients”. In: Nature 445.7126 (2007), pp. 406–409.
[466] Siddhartha Verma, Guido Novati, and Petros Koumoutsakos. “Efficient collective swimming by
harnessing vortices through deep reinforcement learning”. In: Proceedings of the National
Academy of Sciences 115.23 (2018), pp. 5849–5854.
[467] Siddhartha Verma, Guido Novati, Flavio Noca, and Petros Koumoutsakos. “Fast motion of heaving
airfoils”. In: Procedia Computer Science 108 (2017), pp. 235–244.
[468] Siddhartha Verma, Costas Papadimitriou, Nora Lüthen, Georgios Arampatzis, and
Petros Koumoutsakos. “Optimal sensor placement for artificial swimmers”. In: Journal of Fluid
Mechanics 884 (2020).
[469] Tamás Vicsek, András Czirók, Eshel Ben-Jacob, Inon Cohen, and Ofer Shochet. “Novel type of
phase transition in a system of self-driven particles”. In: Physical review letters 75.6 (1995), p. 1226.
[470] Lionel Vincent, Yucen Liu, and Eva Kanso. “Shape optimization of tumbling wings”. In: Journal of
Fluid Mechanics 889 (2020), A9.
[471] Lionel Vincent, Min Zheng, John H Costello, and Eva Kanso. “Enhanced flight performance in
non-uniformly flexible wings”. In: Journal of the Royal Society Interface 17.168 (2020), p. 20200352.
[472] Pauli Virtanen, Ralf Gommers, Travis E. Oliphant, Matt Haberland, Tyler Reddy,
David Cournapeau, Evgeni Burovski, Pearu Peterson, Warren Weckesser, Jonathan Bright,
Stéfan J. van der Walt, Matthew Brett, Joshua Wilson, K. Jarrod Millman, Nikolay Mayorov,
Andrew R. J. Nelson, Eric Jones, Robert Kern, Eric Larson, C J Carey, İlhan Polat, Yu Feng,
Eric W. Moore, Jake VanderPlas, Denis Laxalde, Josef Perktold, Robert Cimrman, Ian Henriksen,
E. A. Quintero, Charles R. Harris, Anne M. Archibald, Antônio H. Ribeiro, Fabian Pedregosa,
Paul van Mulbregt, and SciPy 1.0 Contributors. “SciPy 1.0: Fundamental Algorithms for Scientific
Computing in Python”. In: Nature Methods 17 (2020), pp. 261–272. doi: 10.1038/s41592-019-0686-2.
[473] Cees J Voesenek, Gen Li, Florian T Muijres, and Johan L Van Leeuwen. “Experimental–numerical
method for calculating bending moments in swimming fish shows that fish larvae control
undulatory swimming with simple actuation”. In: PLoS biology 18.7 (2020), e3000462.
[474] Giorgio Volpe, Nuno AM Araújo, Maria Guix, Mark Miodownik, Ayusman Sen, Samuel Sanchez,
Nicolas Martin, Laura Alvarez, Juliane Simmchen, Roberto Di Leonardo, et al. “Roadmap for
Animate Matter”. In: arXiv preprint arXiv:2407.10623 (2024).
[475] Karl D Von Ellenrieder, Kamal Parker, and Julio Soria. “Flow structures behind a heaving and
pitching finite-span wing”. In: Journal of Fluid Mechanics 490 (2003), pp. 129–138.
[476] von Neumann architecture. https://en.wikipedia.org/wiki/Von_Neumann_architecture.
[477] Erik Wallin and Martin Servin. “Data-driven model order reduction for granular media”. In:
Computational Particle Mechanics (2021), pp. 1–14.
[478] Kirsty Y Wan and Raymond E Goldstein. “Coordinated beating of algal flagella is mediated by
basal coupling”. In: Proceedings of the National Academy of Sciences 113.20 (2016), E2784–E2793.
[479] Li Wang. “Locomotion of a self-propulsive pitching plate in a quiescent viscous fluid”. In:
Proceedings of the Institution of Mechanical Engineers, Part C: Journal of Mechanical Engineering
Science 0.0 (2020), p. 0954406220903338. doi: 10.1177/0954406220903338. eprint:
https://doi.org/10.1177/0954406220903338.
[480] Zhian Wang and Thomas Hillen. “Classical solutions and pattern formation for a volume filling
chemotaxis model”. In: Chaos: An Interdisciplinary Journal of Nonlinear Science 17.3 (2007).
[481] Paul W Webb and Raymond S Keyes. “Division of labor between median fins in swimming
dolphin (Pisces: Coryphaenidae)”. In: Copeia 1981.4 (1981), pp. 901–904.
[482] Paul W Webb, Gina D LaLiberte, and Amy J Schrank. “Does body and fin form affect the
maneuverability of fish traversing vertical and horizontal slits?” In: Environmental Biology of
Fishes 46.1 (1996), pp. 7–14.
[483] Pascal Weber, Georgios Arampatzis, Guido Novati, Siddhartha Verma, Costas Papadimitriou, and
Petros Koumoutsakos. “Optimal flow sensing for schooling swimmers”. In: Biomimetics 5.1 (2020),
p. 10.
[484] Nicholas C Wegner, Mark A Drawbridge, and John R Hyde. “Reduced swimming and metabolic
fitness of aquaculture-reared California Yellowtail (Seriola dorsalis) in comparison to wild-caught
conspecifics”. In: Aquaculture 486 (2018), pp. 51–56.
[485] Chang Wei, Qiao Hu, Tangjia Zhang, and Yangbin Zeng. “Passive hydrodynamic interactions in
minimal fish schools”. In: Ocean Engineering 247 (2022), p. 110574.
[486] D Weihs. “The mechanism of rapid starting of slender fish”. In: Biorheology 10.3 (1973),
pp. 343–350.
[487] Marc J Weissburg and Richard K Zimmer-Faust. “Odor plumes and how blue crabs use them in
finding prey.” In: Journal of Experimental Biology 197.1 (1994), pp. 349–375.
[488] Li Wen and George Lauder. “Understanding undulatory locomotion in fishes using an
inertia-compensated flapping foil robotic device”. In: Bioinspiration & biomimetics 8.4 (2013),
p. 046013.
[489] Carl H White, George V Lauder, and Hilary Bart-Smith. “Tunabot Flex: a tuna-inspired robot with
body flexibility improves high-performance swimming”. In: Bioinspiration & Biomimetics 16.2
(2021), p. 026019.
[490] HAL Whitehead. “Analysing animal social structure”. In: Animal behaviour 53.5 (1997),
pp. 1053–1067.
[491] Norbert Wiener. Cybernetics: Or Control and Communication in the Animal and the Machine. MIT
press, 1948.
[492] Norbert Wiener. Norbert Wiener: Collected Works. Vol. 1. MIT Press Cambridge, Mass., 1976.
[493] S Wieskotten, G Dehnhardt, B Mauck, L Miersch, and W Hanke. “Hydrodynamic determination
of the moving direction of an artificial fin by a harbour seal (Phoca vitulina)”. In: Journal of
Experimental Biology 213.13 (2010), pp. 2194–2200.
[494] Charles HK Williamson and Anatol Roshko. “Vortex formation in the wake of an oscillating
cylinder”. In: Journal of fluids and structures 2.4 (1988), pp. 355–381.
[495] Leighton Wilson, Nathan Vaughn, and Robert Krasny. “A GPU-accelerated fast multipole method
based on barycentric Lagrange interpolation and dual tree traversal”. In: Computer Physics
Communications 265 (2021), p. 108017.
[496] Leighton Wilson, Nathan Vaughn, and Robert Krasny. BaryTree.
https://github.com/Treecodes/BaryTree.
[497] Shane P Windsor and Matthew J McHenry. “The influence of viscous hydrodynamics on the fish
lateral-line system”. In: Integrative and comparative biology 49.6 (2009), pp. 691–701.
[498] Shane P Windsor, Stuart E Norris, Stuart M Cameron, Gordon D Mallinson, and
John C Montgomery. “The flow fields involved in hydrodynamic imaging by blind Mexican cave
fish (Astyanax fasciatus). Part I: open water and heading towards a wall”. In: Journal of
Experimental Biology 213.22 (2010), pp. 3819–3831.
[499] Robert J Wood, E Steltz, and RS Fearing. “Optimal energy density piezoelectric bending
actuators”. In: Sensors and Actuators A: Physical 119.2 (2005), pp. 476–488.
[500] Wright brothers. https://en.wikipedia.org/wiki/Wright_brothers.
[501] T. Wu. “Hydromechanics of swimming propulsion. Part 1. Swimming of a two-dimensional
flexible plate at variable forward speeds in an inviscid fluid”. In: Journal of Fluid Mechanics 46.2
(1971), pp. 337–355.
[502] Yang Xiang, Haotian Hang, Suyang Qin, and Hong Liu. “Scaling analysis of the circulation growth
of leading-edge vortex in flapping flight”. In: Acta Mechanica Sinica 37.10 (2021), pp. 1530–1543.
[503] Wen-Hua Xu, Guo-Dong Xu, and Lei Shan. “Real-time parametric estimation of periodic
wake-foil interactions using bioinspired pressure sensing and machine learning”. In:
Bioinspiration & Biomimetics (2022).
[504] Shinji Yabuta. “Spawning migrations in the monogamous butterflyfish, Chaetodon trifasciatus”.
In: Ichthyological Research 44.2-3 (1997), pp. 177–182.
[505] Xingbo Yang and M Cristina Marchetti. “Hydrodynamics of turning flocks”. In: Physical review
letters 115.25 (2015), p. 258101.
[506] Yingchen Yang, Jack Chen, Jonathan Engel, Saunvit Pandya, Nannan Chen, Craig Tucker,
Sheryl Coombs, Douglas L Jones, and Chang Liu. “Distant touch hydrodynamic imaging with an
artificial lateral line”. In: Proceedings of the National Academy of Sciences 103.50 (2006),
pp. 18891–18895.
[507] Zhen Yang, Zheng Gong, Yonggang Jiang, Yueri Cai, Zhiqiang Ma, Xin Na, Zihao Dong, and
Deyuan Zhang. “Maximized Hydrodynamic Stimulation Strategy for Placement of Differential
Pressure and Velocity Sensors in Artificial Lateral Line Systems”. In: IEEE Robotics and
Automation Letters 7.2 (2022), pp. 2170–2177.
[508] Jeannette Yen, David W Murphy, Lin Fan, and Donald R Webster. “Sensory-motor systems of
copepods involved in their escape from suction feeding”. In: Integrative and comparative biology
55.1 (2015), pp. 121–133.
[509] Jeannette Yen, Marc J Weissburg, and Michael H Doall. “The fluid physics of signal perception by
mate-tracking copepods”. In: Philosophical Transactions of the Royal Society of London. Series B:
Biological Sciences 353.1369 (1998), pp. 787–804.
[510] Lexing Ying, George Biros, and Denis Zorin. “A kernel-independent adaptive fast multipole
algorithm in two and three dimensions”. In: Journal of Computational Physics 196.2 (2004),
pp. 591–626.
[511] Rio Yokota and Lorena Barba. “Hierarchical n-body simulations with autotuning for
heterogeneous systems”. In: Computing in Science & Engineering 14.3 (2012), pp. 30–39.
[512] Junzhi Yu, Lizhong Liu, Long Wang, Min Tan, and De Xu. “Turning control of a multilink
biomimetic robotic fish”. In: IEEE Transactions on Robotics 24.1 (2008), pp. 201–206.
[513] Zhi-Ming Yuan, Minglu Chen, Laibing Jia, Chunyan Ji, and Atilla Incecik. “Wave-riding and
wave-passing by ducklings in formation swimming”. In: Journal of Fluid Mechanics 928 (2021), R2.
[514] Vladimir E Zakharov, Victor S L’vov, and Gregory Falkovich. Kolmogorov spectra of turbulence I:
Wave turbulence. Springer Science & Business Media, 2012.
[515] Ernst Zermelo. “Über das Navigationsproblem bei ruhender oder veränderlicher Windverteilung”.
In: ZAMM-Journal of Applied Mathematics and Mechanics/Zeitschrift für Angewandte Mathematik
und Mechanik 11.2 (1931), pp. 114–124.
[516] Yufan Zhai, Xingwen Zheng, and Guangming Xie. “Fish lateral line inspired flow sensors and
flow-aided control: A review”. In: Journal of Bionic Engineering 18.2 (2021), pp. 264–291.
[517] Peng Zhang, Elizabeth Krasner, Sean D Peterson, and Maurizio Porfiri. “An information-theoretic
study of fish swimming in the wake of a pitching airfoil”. In: Physica D: Nonlinear Phenomena 396
(2019), pp. 35–46.
[518] Yangfan Zhang and George V Lauder. “Energetics of collective movement in vertebrates”. In:
Journal of Experimental Biology 226.20 (2023), jeb245617.
[519] Yangfan Zhang and George V. Lauder. “Energy conservation by group dynamics in schooling
fish”. In: (Oct. 2023). doi: 10.7554/elife.90352.1.
[520] Zhicheng Zheng, Yuan Tao, Yalun Xiang, Xiaokang Lei, and Xingguang Peng. “Body orientation
change of neighbors leads to scale-free correlation in collective motion”. In: Nature
Communications 15.1 (2024), p. 8968.
[521] Ji Zhou, Jung-Hee Seo, and Rajat Mittal. “Complex Emergent Dynamics in Fish Schools-Insights
from a Flow-Physics-Informed Model of Collective Swimming in Fish”. In: Bulletin of the
American Physical Society 67 (2022).
[522] Ji Zhou, Jung-Hee Seo, and Rajat Mittal. “Effect of schooling on flow generated sounds from
carangiform swimmers”. In: Bioinspiration & Biomimetics 19.3 (2024), p. 036015.
[523] Joseph Zhu, Carl White, Dylan K Wainwright, Valentina Di Santo, George V Lauder, and
Hilary Bart-Smith. “Tuna robotics: A high-frequency experimental platform exploring the
performance space of swimming fishes”. In: Science Robotics 4.34 (2019), eaax4615.
[524] Xiaojue Zhu, Guowei He, and Xing Zhang. “Flow-Mediated Interactions between Two
Self-Propelled Flapping Filaments in Tandem Configuration”. In: Physical Review Letters 113.23
(2014), p. 238105. issn: 0031-9007. doi: 10.1103/physrevlett.113.238105.
[525] Yan Zhu, Aljoscha Nern, S Lawrence Zipursky, and Mark A Frye. “Peripheral visual circuits
functionally segregate motion and phototaxis behaviors in the fly”. In: Current Biology 19.7
(2009), pp. 613–619.
[526] Yi Zhu, Jian-Hua Pang, and Fang-Bao Tian. “Stable schooling formations emerge from the
combined effect of the active control and passive self-organization”. In: Fluids 7.1 (2022), p. 41.

Appendix A
Fluid simulation models
A.1 CFD simulation
In our CFD simulations, a swimmer is modeled as a symmetric 2D Joukowsky airfoil [447]. The chord
length of the airfoil is the characteristic length L, and the maximum thickness is 0.12L. The airfoil undergoes pitching motion around its leading edge. Fluid-structure interactions are governed by the incompressible Navier-Stokes equations,
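\[
\frac{\partial \mathbf{u}}{\partial t} + \mathbf{u} \cdot \nabla \mathbf{u} = -\nabla p + \frac{1}{\mathrm{Re}} \Delta \mathbf{u}, \qquad \nabla \cdot \mathbf{u} = 0, \tag{A.1}
\]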
where u(x, t) and p(x, t) are the velocity and pressure fields, respectively. We solved for these fields numerically using the immersed boundary method (IBM), which handles the two-way coupled fluid-structure interaction [365, 167, 169, 43, 168, 315].
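The swimmer geometry described at the start of this section can be illustrated with a short sketch. It assumes the standard Joukowsky construction (a circle shifted along the real axis and mapped by z = ζ + 1/ζ), then tunes the shift by bisection so that the maximum thickness is 0.12 of the chord. The function names, the bisection bounds, and the number of sample points are illustrative choices, not taken from the thesis.

```python
import numpy as np

def joukowsky_outline(eps, n=2001):
    """Symmetric Joukowsky airfoil from a circle centered at -eps (map scale c = 1)."""
    theta = np.linspace(0.0, 2.0 * np.pi, n)
    zeta = -eps + (1.0 + eps) * np.exp(1j * theta)  # circle passing through zeta = 1
    return zeta + 1.0 / zeta                        # Joukowsky map, cusp at z = 2

def thickness_ratio(eps):
    """Maximum thickness divided by chord for a given circle offset."""
    z = joukowsky_outline(eps)
    chord = z.real.max() - z.real.min()
    return 2.0 * z.imag.max() / chord

def solve_offset(target=0.12, lo=0.01, hi=0.3, iters=60):
    """Bisection on the offset; thickness grows monotonically with eps on [lo, hi]."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if thickness_ratio(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def unit_chord_airfoil(target=0.12):
    """Outline scaled to chord L = 1 with the leading edge at the origin."""
    z = joukowsky_outline(solve_offset(target))
    chord = z.real.max() - z.real.min()
    return (z - z.real.min()) / chord

def pitch_about_leading_edge(z, angle):
    """Rotate the outline counterclockwise by `angle` (radians) about the origin."""
    return z * np.exp(1j * angle)
```

Because the leading edge sits at the origin after rescaling, a pitching motion about the leading edge reduces to a pure rotation of the complex coordinates.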
The immersed boundary formulation involves an Eulerian description of the flow field and a Lagrangian description of the immersed swimmers, modeled as Joukowsky airfoils. The boundary condition is mapped to a body force exerted on the fluid. The Lagrangian and Eulerian variables are coupled through the Dirac delta function, which is smoothed during discretization. Here, we used IBAMR [210], the implementation developed by the group of Professor Boyce Griffith, which has long been used to solve problems such as blood flow in the heart [365, 257], water entry/exit problems [44], fish swimming [199, 449, 473], insect flight [457, 461], flexible propulsors [198, 452, 197], self-propulsion of pitching/heaving airfoils [197, 507], and fish schooling [507, 274]. This implementation is based on an adaptive mesh, which enables us to accurately simulate self-propulsion and reach steady state in a large computational domain at a reasonable computational cost.
For simulating the wake behind fixed or motion-prescribed objects in an incoming flow (Chapters 3 and 4), we chose a [−24, 8] × [−8, 8] rectangular computational domain, with the object centered at the origin (0, 0). The coarsest Eulerian mesh is a uniform 128 × 64 Cartesian grid, with three layers of adaptive Eulerian mesh refining it; the refinement ratio between two consecutive layers is 4, and the refinement region is based on both the solid boundary and the vorticity. The simulation time step is $\Delta t = 2\times 10^{-4}$.
For self-propelled swimmers (Chapter 5), the computational domain is a rectangle of dimensions 80L × 20L, with periodic boundary conditions on the domain boundaries and a no-slip boundary condition on the surface of the airfoils. The initial location of the first swimmer is 12L away from the right boundary in the streamwise direction. The initial distance between the two swimmers d(t = 0) ranges from 1.5L to 4L to ensure we access the different equilibria that emerge in these pairwise formations. The coarsest Eulerian mesh is a uniform 500 × 125 Cartesian grid. The regions of the computational domain close to the airfoils and their wakes are refined: there are three layers of refinement mesh, and the refinement ratio for each layer is 4. The simulation time step is adaptive, with maximum time step $\Delta t_{\max} = 2.5\times 10^{-3}$.
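To make the mesh resolution concrete, the finest grid spacing implied by these settings can be worked out directly. The short Python sketch below does this arithmetic. It assumes that the three refinement layers sit on top of the coarsest grid and that each layer refines by the stated ratio of 4 in every direction; the function name is ours and is not part of IBAMR.

```python
def finest_grid_spacing(domain_length, coarse_cells, refinement_ratio, refinement_layers):
    """Grid spacing on the finest level of a block-structured adaptive mesh."""
    coarse_dx = domain_length / coarse_cells
    return coarse_dx / refinement_ratio ** refinement_layers

# Wake simulations: x-extent [-24, 8] is 32 units long, 128 coarse cells.
dx_wake = finest_grid_spacing(32.0, 128, refinement_ratio=4, refinement_layers=3)

# Self-propelled swimmers: x-extent 80L (with L = 1), 500 coarse cells.
dx_swim = finest_grid_spacing(80.0, 500, refinement_ratio=4, refinement_layers=3)

print(dx_wake, dx_swim)
```

Under these assumptions, the finest cells are 1/256 of a length unit for the wake simulations and 0.0025L for the self-propelled swimmers.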
The hydrodynamic forces Fx, Fy and moment M acting on each swimmer are calculated by integrating, over the surface of that swimmer, the traction force σ · n and moment x × (σ · n), where $\boldsymbol{\sigma} = -p\mathbf{I} + \mu\left(\nabla\mathbf{u} + \nabla\mathbf{u}^{T}\right)$ is the fluid stress tensor, x denotes positions on the surface of that swimmer, and n is the unit normal to the airfoil pointing into the fluid.
A.2 Vortex Sheet simulation
The coupled fluid-structure interaction between the two-link swimmer and the surrounding fluid is simulated using an inviscid vortex sheet model. In viscous fluids, boundary layer vorticity is formed along
the sides of the swimmer, and it is swept away at the swimmer’s tail to form a shear layer that rolls up
into vortices. In the vortex sheet model, the swimmer is approximated by a bound vortex sheet, denoted
by lb, whose strength ensures that no fluid flows through the rigid plate, and the separated shear layer is
approximated by a free regularized vortex sheet lw at the trailing edge of the swimmer. The total shed circulation Γ in the vortex sheet is determined so as to satisfy the Kutta condition at the trailing edge, which
is given in terms of the tangential velocity components above and below the bound sheet and ensures that
the pressure jump across the sheet vanishes at the trailing edge.
To express these concepts mathematically, it is convenient to use the complex notation z = x + iy, where $i = \sqrt{-1}$ and (x, y) denote the components of an arbitrary point in the plane. The bound vortex
sheet lb is described by its position zb(s, t) and strength γ(s, t), where s ∈ [0, L] denotes the arc length
along the sheet lb. The separated sheet lw is described by its position zw(Γ, t), Γ ∈ [0, Γw] where Γ is the
Lagrangian circulation around the portion of the separated sheet between its free end in the spiral center
and the point zw(Γ, t). The parameter Γ defines the vortex sheet strength γ = dΓ/ds.
By linearity of the problem, the complex velocity w(z, t) = u(z, t) − iv(z, t) is a superposition of the
contributions due to the bound and free vortex sheets
w(z, t) = wb(z, t) + ww(z, t). (A.2)
Figure A.1: (A) Schematic of the vortex sheet model for a two-dimensional bending swimmer. (B) Depiction of the different hydrodynamic forces acting on the swimmer.
In practice, the free sheet lw is regularized using the vortex blob method to prevent the growth of the Kelvin-Helmholtz instability. The bound sheet lb is not regularized in order to preserve the invertibility of the map between the sheet strength and the normal velocity along the sheet. The velocity components
wb(z, t) and ww(z, t) induced by the bound and free vortex sheets, respectively, are given by
$$w_b(z,t) = \int_0^L K_0\left(z - z_b(s,t)\right)\gamma(s,t)\,ds, \qquad w_w(z,t) = \int_0^{\Gamma_w} K_\delta\left(z - z_w(\Gamma,t)\right)d\Gamma, \tag{A.3}$$
where Kδ is the vortex blob kernel, with regularization parameter δ,
$$K_\delta(z) = \frac{1}{2\pi i}\,\frac{\bar z}{|z|^2 + \delta^2}, \qquad \bar z = x - iy. \tag{A.4}$$
If z is a point on the bound sheet for which δ = 0, wb is to be computed in the principal value sense.
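As an illustration of how the regularized kernel is used, the following Python sketch evaluates the complex velocity w = u − iv induced at a point by a set of discrete vortex blobs, i.e. a discretized version of the free-sheet integral. The function names and the single-vortex check are ours; this is a minimal sketch of the kernel evaluation, not the solver used in this work.

```python
import math

def blob_kernel(z, delta):
    """Regularized kernel K_delta(z) = conj(z) / (2*pi*i*(|z|^2 + delta^2))."""
    return z.conjugate() / (2j * math.pi * (abs(z) ** 2 + delta ** 2))

def wake_velocity(z, blob_positions, blob_circulations, delta):
    """Complex velocity w = u - i*v at z induced by vortex blobs (direct sum)."""
    return sum(g * blob_kernel(z - zk, delta)
               for zk, g in zip(blob_positions, blob_circulations))

# Check against a single point vortex (delta = 0) of circulation 2*pi at the origin:
# at z = 1 the velocity should be purely vertical with unit magnitude.
w = wake_velocity(1 + 0j, [0j], [2 * math.pi], delta=0.0)
u, v = w.real, -w.imag

# The same vortex with a finite blob size induces a weaker velocity at z = 1.
w_blob = wake_velocity(1 + 0j, [0j], [2 * math.pi], delta=0.5)
v_blob = -w_blob.imag
```

With δ = 0 the kernel is singular when the evaluation point coincides with a vortex, which is why the bound-sheet contribution must be interpreted in the principal value sense; a positive δ removes this singularity for the free sheet.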
The position of the bound vortex sheet zb is determined from the plate’s flapping (θa(t), θp(t)) and
swimming x(t) motions. The corresponding sheet strength γ(s, t) is determined by imposing the no penetration boundary condition on the plate, together with conservation of total circulation. The no penetration boundary condition is given by
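$$\mathrm{Re}\,[w\,n]\big|_{z_b} = \mathrm{Re}\,[w_{\mathrm{swimmer}}\,n], \tag{A.5}$$
where
$$n = \begin{cases} -\sin\theta_a + i\cos\theta_a, & s\in[0,l],\\ -\sin\theta_p + i\cos\theta_p, & s\in[l,L], \end{cases} \tag{A.6}$$
and
$$w_{\mathrm{swimmer}} = \begin{cases} \dot x - i\dot y - i\dot\theta_a\left[\bar z_b - (x - iy)\right], & s\in[0,l],\\ \dot x - i\dot y - i\dot\theta_a\left[\bar z_b - (x - iy)\right] - i\dot\alpha\left[\bar z_b - (x_A - iy_A)\right], & s\in[l,L]. \end{cases} \tag{A.7}$$
Conservation of the fluid circulation implies that $\int_{l_b}\gamma(s,t)\,ds + \Gamma_w(t) = 0$.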
The circulation parameter Γ along the free vortex sheet zw(Γ, t) is determined by the circulation shedding rate $\dot\Gamma_w$, according to the Kutta condition, which states that the fluid velocity at the trailing edge is finite and tangent to the swimmer. The Kutta condition can be obtained from the Euler equations by enforcing
that, at the trailing edge, the difference in pressure across the swimmer is zero. To this end, we integrate
the balance of momentum equation for inviscid planar flow along a closed contour containing the vortex
sheet and trailing edge,
$$[p]_\mp(s) = p_-(s) - p_+(s) = -\frac{d\Gamma(s,t)}{dt} - \frac{1}{2}\left(u_-^2 - u_+^2\right), \tag{A.8}$$
where $\Gamma(s,t) = \Gamma_w + \int_0^s \gamma(s',t)\,ds'$, 0 ≤ s ≤ L, is the circulation within the contour and p∓(s, t) and
u∓(s, t) denote the limiting pressure and tangential slip velocities on both sides of the swimmer. Since the
pressure difference across the free sheet is zero, it also vanishes at the trailing edge by continuity, which
implies that
$$\dot\Gamma_w = -\frac{1}{2}\left(u_-^2 - u_+^2\right)\Big|_{s=L}. \tag{A.9}$$
The values of u− and u+ are obtained from the average tangential velocity component and from the
velocity jump at the trailing edge, given by the sheet strength, evaluated at s = L
$$\bar u = \frac{u_+ + u_-}{2} = \mathrm{Im}\left[(w - w_{\mathrm{swimmer}})\,n\right], \qquad u_- - u_+ = \gamma. \tag{A.10}$$
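As a worked step that is not spelled out in the text, the shedding rate in (A.9) can be written directly in terms of the two quantities in (A.10), namely the average tangential velocity (denoted here by an overbar) and the sheet strength at the trailing edge, by factoring the difference of squares:

```latex
\dot{\Gamma}_w
  = -\tfrac{1}{2}\,(u_- - u_+)(u_- + u_+)\Big|_{s=L}
  = -\,\gamma\,\bar{u}\,\Big|_{s=L},
\qquad
u_\mp = \bar{u} \pm \tfrac{1}{2}\gamma .
```

In this form, the circulation shed per unit time is the product of the local sheet strength and the mean slip velocity at the trailing edge, so no separate evaluation of the two one-sided velocities is needed.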
Once shed, the vorticity in the free sheet moves with the flow. Thus the parameter Γ assigned to each
particle zw(Γ, t) is the value of Γw at the instant it is shed from the trailing edge. The evolution of the free
vortex sheet zw is obtained by advecting it in time with the fluid velocity,
$$\dot{\bar z}_w = w_w(z_w,t) + w_b(z_w,t). \tag{A.11}$$
A.2.1 Forces and moments
The hydrodynamic forces Fa and Fp acting on the anterior and posterior parts of the swimmer, respectively,
are given by
$$F_a = F_{ax} + iF_{ay} = \int_0^l n\,[p]_\mp\,ds, \qquad F_p = F_{px} + iF_{py} = \int_l^L n\,[p]_\mp\,ds. \tag{A.12}$$
The hydrodynamic moment Ma acting on the anterior part of the swimmer about its leading edge and the hydrodynamic moment Mp acting on the posterior part of the swimmer about the flexion point are given by
$$M_a = \int_0^l [p]_\mp\,s\,ds, \qquad M_p = \int_l^L [p]_\mp\,(s - l)\,ds. \tag{A.13}$$
Note that the components Fax, Fay and Fpx, Fpy can be written explicitly as
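$$\begin{aligned} F_{ax} &= \int_0^l [p]_\mp(-\sin\theta_a)\,ds, & F_{ay} &= \int_0^l [p]_\mp\cos\theta_a\,ds,\\ F_{px} &= \int_l^L [p]_\mp(-\sin\theta_p)\,ds, & F_{py} &= \int_l^L [p]_\mp\cos\theta_p\,ds, \end{aligned} \tag{A.14}$$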
where θp = θa + α and α is the flexion angle.
The total hydrodynamic force acting on the swimmer due to the pressure difference across the swimmer
is given by
$$F = F_x + iF_y, \tag{A.15}$$
where the components Fx and Fy are
$$\begin{aligned} F_x &= F_{ax} + F_{px} = \int_0^l [p]_\mp(-\sin\theta_a)\,ds + \int_l^L [p]_\mp(-\sin\theta_p)\,ds,\\ F_y &= F_{ay} + F_{py} = \int_0^l [p]_\mp\cos\theta_a\,ds + \int_l^L [p]_\mp\cos\theta_p\,ds. \end{aligned} \tag{A.16}$$
The total hydrodynamic moment acting on the swimmer about its leading edge is given by
$$M = \int_0^l [p]_\mp\,s\,ds + \int_l^L [p]_\mp\,(s - l + l\cos\alpha)\,ds. \tag{A.17}$$
We introduce a drag force D that emulates the effect of skin friction due to fluid viscosity. This force is
based on the Blasius laminar boundary layer theory as implemented by [135] in the context of the vortex
sheet model. Blasius theory provides an empirical formula for skin friction on one side of a horizontal plate
of length L placed in fluid of density ρf and uniform velocity U. In dimensional form, the Blasius formula is $D = -\frac{1}{2}\rho_f L\,C_f\,U^2$, where the skin friction coefficient $C_f = 0.664/\sqrt{Re}$ is given in terms of the Reynolds number $Re = \rho_f U L/\mu$. Substituting back in the empirical formula leads to $D = -C_d U^{3/2}$, where $C_d = 0.664\sqrt{\rho_f\,\mu\,L}$. Following [135], we write a modified expression of the drag force for a swimming plate
$$D = C_d\left(U_+^{3/2} + U_-^{3/2}\right), \tag{A.18}$$
where $U_\pm$ are the spatially-averaged tangential fluid velocities on the upper and lower side of the plate, respectively, relative to the swimming velocity U,
$$U_\pm(t) = \frac{1}{L}\int_0^L u_\pm(s,t)\,ds - U. \tag{A.19}$$
We estimate Cd to be approximately 0.04 in the experiments of [384].
The equation of motion governing the free swimming x(t) is given by Newton’s second law
$$m\ddot x = F_x - D_x, \tag{A.20}$$
where Dx is the x-component of the drag force D. When the swimmer bends passively, the relative rotation
α(t) = θp−θa of the posterior end is not prescribed a priori and follows from the physics of fluid-structure
interactions. Considering that the rotational joint at the flexion point is equipped with a torsional spring of
stiffness κ and damping coefficient c, we write the equation governing the relative rotation of the posterior
link
$$I_p\left(\ddot\theta_a + \ddot\alpha\right) + c\dot\alpha + \kappa\alpha = M_p + M_{\mathrm{inertia}}, \tag{A.21}$$
where Ip is the moment of inertia of the posterior link about the flexion point, and Minertia is an inertial
moment acting on the posterior link due to the free motion of the flexion point. Namely,
$$M_{\mathrm{inertia}} = m_p\,\mathrm{Im}\left[-z_A\,\bar a_A\right], \tag{A.22}$$
where $m_p = \rho_e(L - l)$ is the mass of the posterior link, $z_A = \frac{L-l}{2}\left(\cos\theta_p + i\sin\theta_p\right)$ is the position of the flexion point relative to the mass center of the posterior link, and $\bar a_A$ is the complex conjugate of the acceleration aA at the flexion point. The latter is given by
$$a_A = \left(\ddot x - l\,\ddot\theta_a\sin\theta_a\right) - i\,l\,\dot\theta_a^2\sin\theta_a. \tag{A.23}$$
For a swimmer undergoing active flexion, the flapping motion and body bending are produced by two active moments Ma and Mp exerted by the swimmer on the fluid about the leading edge O and the hinge A, respectively. The power input by the swimmer to overcome the moment of all the hydrodynamic forces about the leading edge is given by
$$P(t) = \dot\theta_a M_a + \dot\theta_p\left(M_p + l\,|F_p|\cos\alpha\right). \tag{A.24}$$
For a swimmer with passive flexion, the input power is given by
$$P(t) = \dot\theta_a\left(M_a + l\,|F_p|\cos\alpha - \kappa\alpha - c\dot\alpha\right). \tag{A.25}$$
Note that the skin drag does not contribute to input power.
A.2.2 Numerical implementation
The bound vortex sheet is discretized by 2n + 1 point vortices at zb(t) with strength ∆Γ = γ∆s. These
vortices are located at Chebyshev points that cluster at the two ends of the swimmer. Their strength is
determined by enforcing no penetration at the midpoints between the vortices, together with conservation
of circulation. The free vortex sheet is discretized by regularized point vortices at zw(t), which are released from the trailing edge at each time step with circulation given by (A.9). The free point vortices move with the discretized fluid velocity while the bound vortices move with the swimmer’s velocity. For the actively-bending swimmer, the discretization of equations (A.20) and (A.9, A.11) yields a coupled system of ordinary
differential evolution equations for the swimmer’s position, the shed circulation, and the free vorticity,
that is integrated in time using the 4th order Runge-Kutta scheme. For the passively-bending swimmer,
the discretization of equation (A.21) is added to the coupled system of equations to simultaneously solve
for the rotational motion of the posterior link relative to the anterior link. The details of the shedding
algorithm are given in [343]. The numerical values of the timestep ∆t, the number of bound vortices n,
and the regularization parameter δ are chosen so that the solution changes little under further refinement.
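The layout of the bound vortices and of the points where no penetration is enforced can be sketched as follows. This is a minimal illustration that assumes the Chebyshev extrema points mapped to the arc length interval [0, L]; the text does not state which family of Chebyshev points is used, and the function name is ours.

```python
import math

def chebyshev_vortex_layout(n, length):
    """Return (vortex_positions, control_points) in arc length along the plate.

    Assumption: the 2n+1 vortices sit at s_k = (L/2) * (1 - cos(k*pi/(2n))),
    which cluster near s = 0 and s = L; the 2n control points are the
    midpoints between neighboring vortices.
    """
    m = 2 * n
    vortices = [0.5 * length * (1.0 - math.cos(k * math.pi / m)) for k in range(m + 1)]
    controls = [0.5 * (a + b) for a, b in zip(vortices[:-1], vortices[1:])]
    return vortices, controls

vortices, controls = chebyshev_vortex_layout(n=4, length=1.0)
end_spacing = vortices[1] - vortices[0]
mid_spacing = vortices[5] - vortices[4]
```

With this layout, the 2n no-penetration conditions at the midpoints plus the single circulation constraint give 2n + 1 equations for the 2n + 1 unknown vortex strengths, so the linear system for the bound circulation is square.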
Finally, to emulate the effect of viscosity, we allow the shed vortex sheets to decay gradually by dissipating each incremental point vortex after a finite time Tdiss (Tdiss = 1.65T for the swimmer with active
flexion and Tdiss = √2.09T for passive flexion) from the time it is shed into the fluid. Larger Tdiss implies
that the vortices stay in the fluid for longer times, mimicking the effect of lower fluid viscosity. We refer
the reader to [205] for a detailed analysis of the effect of dissipation time on the hydrodynamic forces on a
stationary and moving plate in the vortex sheet model. Details of the numerical validation in comparison
to [220] and [221] are provided in [204].
A.2.3 Fast multipole methods
In the calculation of vortex sheet methods, the most computationally costly part is the interaction among free vortex sheets. Directly summing the kernel function in Equation (A.4) leads to a time complexity of $O(N^2)$, where N is the number of free point vortices. This makes the computation impractical when we have a large number of swimmers (e.g., more than 10 swimmers).
To deal with this issue, the Fast Multipole Method (FMM) is employed [66, 510, 460, 495]. FMM can reduce the computational cost to O(N) by clustering the free vortices into large groups and approximating their influence using multipole expansions [66, 511]. However, the original FMM is difficult to extend to non-Laplacian kernels [510], and the regularized Biot-Savart kernel used in the vortex sheet model (VSM) is not Laplacian. Thus, a kernel-independent FMM is required for our application [510, 135]. Specifically, we used the Barycentric Lagrange Dual Tree Traversal (BLDTT) algorithm [495] implemented in BaryTree [496]. It is a GPU-accelerated algorithm based on MPI and OpenACC. We refer readers to the original reference and the GitHub repository cited above for a thorough explanation of the theory and details behind this method.
A.3 Minimal Hydrodynamic Models
A.3.1 Time-delayed particle model
We employed a time-delayed particle model [339] to describe a pair of flow-coupled oscillating swimmers
(Fig. 1.1). Each swimmer is modeled as a point mass that oscillates in the y-direction to propel itself in the
x-direction. Let y1 = A1 sin(2πf1t) and y2 = A2 sin(2πf2t − ϕ) denote the transverse oscillations of the
leader and follower, respectively, where Ai and fi, i = 1, 2, are the amplitude and frequency of oscillations,
and ϕ is the phase difference between two swimmers, and let x1(t) and x2(t) denote their swimming
motion.
Each foil experiences a thrust force proportional to the square of its vertical velocity relative to the
ambient fluid and a drag force proportional to the square of its horizontal velocity relative to the ambient
fluid [442, 501, 148, 339], namely,
$$F_i = C_T\left(\dot y_i - v(x_i)\right)^2, \qquad D_i = C_D\left(\dot x_i - u(x_i)\right)^2, \tag{A.26}$$
where u(xi) and v(xi) are the components of the fluid velocity in the x- and y-directions at the location of the i-th swimmer, and CT and CD are the thrust parameter and drag parameter, respectively.
Figure A.2: Schematic of the time-delayed particle model. Each swimmer is modeled as a particle oscillating in the vertical direction as A sin(2πft) and leaving a wake that decays exponentially with time [339, 193].
Figure A.3: Parametric study of the model behavior over the frequency ratio f2/f1 and amplitude ratio A2/A1 (regions labeled collision, unstable position, limit cycle, stable equilibrium, and separation). Other parameters are fixed at τ = 1, A1 = sin 15°, f1 = 1, ϕ = 0. Reproduced independently from [339].
The leader swims into still water, which means u(x1) = v(x1) = 0, and creates a transverse wake (u(x2) = 0) in the fluid environment for the follower to interact with. The transverse velocity of the leader’s wake at its present location is the same as its oscillatory speed $\dot y_1(t)$. This wake decays exponentially with time as $e^{-\Delta t/\tau}$, where ∆t denotes the time elapsed since the leader occupied that location. The parameter τ depends on the scale of the problem: larger τ models weaker viscous effects or larger Reynolds number. Say that at time t − ∆t, the leader occupied the position where the follower is now located, x1(t − ∆t) = x2(t). The follower thus interacts with a transverse wake of velocity $e^{-\Delta t/\tau}\,\dot y_1(t - \Delta t)$. The
equations of motion for both swimmers (i = 1, 2) are given by
$$m\ddot x_i = -F_i + C_D\,\dot x_i^2, \tag{A.27}$$
where
$$F_1 = C_T\,\dot y_1^2, \qquad F_2 = C_T\left(\dot y_2(t) - e^{-\Delta t/\tau}\,\dot y_1(t - \Delta t)\right)^2, \qquad x_1(t - \Delta t) = x_2(t). \tag{A.28}$$
Here, m is the mass of each swimmer. The average swimming speed of the leader is solved analytically as $U_1 = \pi A_1 f_1\sqrt{2C_T/C_D}$, but the swimming speed of the follower is solved numerically via Runge–Kutta methods. Solutions are shown in Fig. A.4, where the separation distance d between the two swimmers is scaled by the wavelength of the wake left by the leader, λ = U1/f1.
Figure A.4: Scaled distance d/λ as a function of time t for different parameter values. (A) f2/f1 = 1, A2/A1 = 1, τ = 1, ϕ = 0 with different initial distances; (B) f2/f1 = 1, A2/A1 = 1, τ = 1 with phase ϕ ranging from 0 to 2π; (C) f2/f1 ∈ [0.5, 1.5], A2/A1 = 1, τ = 1; (D) f2/f1 = 1, A2/A1 ∈ [0.5, 1.5], τ = 1; (E) f2/f1 = 0.9, A2/A1 = 1.2, τ = 1; (F) f2/f1 = 1.1, A2/A1 = 0.9, τ = 1; (G) f2/f1 = 0.9, A2/A1 = 1.2, τ = 1, 2; (H) τ = ∞ with cases from A, B, E, and F. Other parameters are kept the same: CD = 0.25, CT = 0.96, m = 1.325 g/cm², A1 = sin 15°, f1 = 1 [339, 193].
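The analytical leader speed can be checked numerically with a few lines of Python. The sketch below integrates the leader's equation of motion in still water with a classical fourth-order Runge-Kutta scheme and compares the time-averaged speed with the closed-form expression. It is an illustration under stated assumptions, not the code used for the figures: we track the forward speed (the magnitude of the swimming velocity, so thrust enters with a positive sign), we use the parameter values quoted in the caption of Fig. A.4, and the time step, integration time, and averaging window are our own choices.

```python
import math

C_T, C_D, MASS = 0.96, 0.25, 1.325             # values quoted in the caption of Fig. A.4
A1, F1 = math.sin(math.radians(15.0)), 1.0     # leader amplitude and frequency

def leader_acceleration(t, speed):
    """Rate of change of the leader's forward speed in still water (u = v = 0)."""
    y_dot = 2.0 * math.pi * F1 * A1 * math.cos(2.0 * math.pi * F1 * t)
    return (C_T * y_dot ** 2 - C_D * speed ** 2) / MASS

def mean_leader_speed(t_end=60.0, dt=1e-3, t_avg=10.0):
    """Integrate with RK4 from rest and average the speed over the last t_avg time units."""
    steps = int(round(t_end / dt))
    avg_start = steps - int(round(t_avg / dt))
    speed, total = 0.0, 0.0
    for k in range(steps):
        t = k * dt
        k1 = leader_acceleration(t, speed)
        k2 = leader_acceleration(t + 0.5 * dt, speed + 0.5 * dt * k1)
        k3 = leader_acceleration(t + 0.5 * dt, speed + 0.5 * dt * k2)
        k4 = leader_acceleration(t + dt, speed + dt * k3)
        speed += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        if k >= avg_start:
            total += speed
    return total / (steps - avg_start)

U1_analytic = math.pi * A1 * F1 * math.sqrt(2.0 * C_T / C_D)
U1_numeric = mean_leader_speed()
wavelength = U1_analytic / F1                  # wake wavelength used to scale distances
```

The two speeds are expected to agree closely rather than exactly, because the instantaneous speed oscillates slightly around its mean at twice the flapping frequency.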
Following [339], we explored the behavior of the two swimmers over the entire space of frequency and
amplitude ratios f2/f1 and A2/A1 (Fig. A.3). For each set of parameter values, we calculated the separation
distance between the leader and follower for 16 different initial conditions ranging from d0/λ = 1 to
d0/λ = 16. We found, consistent with [339], that the swimmers reach one of five characteristic behaviors:
(1) reach a stable formation and swim together in a relative equilibrium; (2) always separate; (3) always
collide; (4) separate or collide based on initial conditions; or (5) reach stable periodic limit cycles.
It is instructive to show representative trajectories for each of the five behaviors identified in Fig. A.3.
In Fig. A.4A, we considered the case when the frequency and amplitude of the two swimmers are the same and the two swimmers oscillate in phase (ϕ = 0). Depending on the initial separation distance, the follower positions itself at one of multiple relative equilibria, consistent with [339, 194]. Each equilibrium has its own basin of attraction, as illustrated by dashed grey lines. When the swimmers flap at a phase lag relative to each other (ϕ ≠ 0), the equilibria shift following a linear phase-distance relationship ϕ/2π = d/λ + const, with λ = U1/f1 (Fig. A.4B) [339, 266, 193].
We next considered cases when the follower flaps at a mismatch in either amplitude or frequency relative to the leader (Fig. A.4C and D). When the follower flaps at larger frequency (f2/f1 > 1) or amplitude
(A2/A1 > 1.2), it always collides with the leader regardless of initial conditions. When the follower flaps
at smaller frequency (f2/f1 < 1) or amplitude (A2/A1 < 0.8), the two swimmers always separate. The
stability of the formation is more sensitive to differences in frequency; indeed, there is a big margin of amplitude mismatch (A2/A1 ∈ [0.8, 1.2]) for which uncoordinated swimmers can swim together cohesively,
but a tiny amount of frequency mismatch makes the school unstable (Fig. A.3).
The separation or collision can be explained intuitively as follows. Higher amplitude (higher frequency) generates higher thrust and vice versa (Eq. A.26). However, frequency mismatch brings additional and unique effects. It introduces a time-dependent phase shift ϕ(t) = 2π(f1 − f2)t. When the frequency of the follower is smaller than that of the leader (f2 < f1), the phase ϕ constantly increases with time, and given the aforementioned linear phase-distance relationship, the equilibrium distance d also increases. That is, f2 < f1 increases the phase difference and decreases the thrust on the follower, both of which lead to an increased separation between leader and follower. For f2 > f1, the phase decreases and the thrust increases, and both effects lead to collision.
We lastly considered two cases when both amplitude and frequency are mismatched (Fig. A.4E,F). In
Fig. A.4F, the frequency of the follower is larger (f2 > f1) and the vertical velocity of the follower is
smaller (A2f2 < A1f1). Here, depending on initial conditions, the follower either separates from the
leader or collides with the leader. Combinations of amplitude and frequency ratios that lead to either
collision or separation based on initial conditions are categorized as unstable [339] (Fig. A.3).
When the frequency of the follower is smaller than that of the leader (f2 < f1) but the transverse
velocity of the follower is larger (A2f2 > A1f1), they form a stable limit cycle (Fig. A.4E). When the
follower is close to the leader, the leader’s wake strongly affects the follower’s motion and the phase
effect discussed earlier pushes the follower away from the leader. When the separation distance is larger,
exponential decay of the wake makes the motion of the follower less affected by the wake. Because the
transverse velocity of the follower is larger, its self-propelled velocity is larger than that of the leader. Thus, the
follower swims forward and forms the limit cycle. The limit cycle is a global attractor in the system because
it only appears at the position where the magnitudes of the phase effect and thrust effect are close. Note
that the relative equilibria of the coordinated swimmers (A2/A1 = 1 and f2/f1 = 1) are independent
of τ (Fig. A.4H), but the limit cycles that emerge in the uncoordinated swimmers are scale-dependent;
they depend on the value of τ (Fig. A.4G). When considering the limit τ → ∞, the limit cycle disappears
(Fig. A.4H).
A.3.2 Potential dipole model
The far-field hydrodynamic interaction model assumes that each swimmer generates an inviscid dipole flow field [444, 75, 228, 227, 445]. The velocity and angular velocity created by all fish on fish i are written as
$$\mathbf{U}_i = \sum_{j=1,\,j\neq i}^{N} \frac{I_f}{\pi}\,\frac{\mathbf{p}_j^{\perp}\sin 2\theta_{ji} + \mathbf{p}_j\cos 2\theta_{ji}}{r_{ij}^2}, \qquad \Omega_i = \mathbf{p}_i\cdot\nabla\mathbf{U}_i\cdot\mathbf{p}_i^{\perp}, \tag{A.29}$$
where $I_f = \pi(a/2)^2 U$ is the strength of the fish-induced dipolar flow field, with a indicating the fish body length, and $\mathbf{p}^{\perp}$ is a unit vector orthogonal to p [228]. Eqs. (C.1) and (A.29) form a closed set of 3N differential equations governing the 3N unknowns (xi, yi, θi), where i = 1, . . . , N. These equations depend solely on three non-dimensional parameters, In, Ia, and If, representing the noise, alignment, and hydrodynamic intensities.
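The direct evaluation of the dipole-induced velocities can be sketched in Python as follows. This is a minimal illustration of the all-to-all sum, written with plain loops for clarity. Two conventions are assumptions on our part, since the text does not define them explicitly: the angle θji is taken as the bearing of swimmer i as seen from swimmer j, measured relative to the heading of j, and the orthogonal vector is taken as the heading rotated counterclockwise by 90 degrees. The angular velocity is omitted for brevity.

```python
import math

def dipole_velocities(positions, headings, dipole_strength):
    """Flow velocity at each swimmer induced by the far-field dipoles of all others.

    positions: list of (x, y) tuples; headings: list of heading angles (radians).
    Cost is O(N^2) because every pair of swimmers is visited.
    """
    velocities = []
    for i, (xi, yi) in enumerate(positions):
        ux, uy = 0.0, 0.0
        for j, (xj, yj) in enumerate(positions):
            if j == i:
                continue
            dx, dy = xi - xj, yi - yj
            r2 = dx * dx + dy * dy
            theta_ji = math.atan2(dy, dx) - headings[j]            # assumed convention
            px, py = math.cos(headings[j]), math.sin(headings[j])  # heading of swimmer j
            qx, qy = -py, px                                       # heading rotated by +90 degrees
            s, c = math.sin(2.0 * theta_ji), math.cos(2.0 * theta_ji)
            scale = dipole_strength / (math.pi * r2)
            ux += scale * (qx * s + px * c)
            uy += scale * (qy * s + py * c)
        velocities.append((ux, uy))
    return velocities

# Two swimmers in tandem, both heading along +x: each is pushed forward by the other.
tandem = dipole_velocities([(0.0, 0.0), (1.0, 0.0)], [0.0, 0.0], math.pi)
# Two swimmers side by side: the induced flow opposes the swimming direction.
abreast = dipole_velocities([(0.0, 0.0), (0.0, 2.0)], [0.0, 0.0], math.pi)
```

The two checks reproduce the familiar structure of a dipole field: the flow is directed along the heading directly ahead of and behind a swimmer, and against it on the sides.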
A.4 Odor simulation
A.4.1 Laminar Plume
An odor-emitting source with continuous emission rate R is located at point $\mathbf{x}^*$. The time evolution of the odor concentration field is governed by advection by the background flow u(x, t) and diffusion with constant diffusivity D,
$$\frac{\partial c}{\partial t} + \mathbf{u}\cdot\nabla c = D\,\Delta c + R\,\delta(\mathbf{x} - \mathbf{x}^*). \tag{A.30}$$
In the laminar plume, the flow is a uniform background flow u(x, t) = U = 1. The Péclet number is defined as the ratio between the advection rate and the diffusion rate, Pe = LU/D. The steady-state solution of (A.30) is given by
$$c(x,y) = \frac{R}{2\pi D}\exp\left(-\frac{U(x - x^*)}{2D}\right)K_0\left(\frac{U\,\|\mathbf{x} - \mathbf{x}^*\|}{2D}\right), \tag{A.31}$$
where K0 is the modified Bessel function of the second kind of order zero [465]. The parameters are chosen as: odor emission rate R = 0.25, uniform background flow velocity U = 1, and diffusion rate D = 0.1. These result in a Péclet number Pe = 10.
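The closed-form laminar plume can be evaluated without external libraries, as in the Python sketch below. It uses the integral representation K0(a) = ∫ exp(−a cosh t) dt over t from 0 to infinity, evaluated with the trapezoid rule, and folds the exponential prefactor into the integrand to avoid overflow far from the source. The function names, the truncation of the integral, and the number of quadrature points are our own choices. The sign of the exponent follows the expression as written in the text, which places the plume on the side x < x*; we infer, but the text does not state, that this corresponds to a background flow directed toward negative x.

```python
import math

def scaled_k0(a, b=0.0, n=4000):
    """Return exp(-b) * K0(a) for a > 0 and |b| <= a via the trapezoid rule."""
    if a <= 0.0:
        raise ValueError("a must be positive (evaluate away from the source)")
    t_max = math.acosh(1.0 + 50.0 / a)     # integrand has dropped by a factor exp(-50) here
    h = t_max / n

    def integrand(t):
        return math.exp(-(a * math.cosh(t) + b))

    interior = sum(integrand(k * h) for k in range(1, n))
    return h * (0.5 * (integrand(0.0) + integrand(t_max)) + interior)

def laminar_plume(x, y, source=(0.0, 0.0), R=0.25, U=1.0, D=0.1):
    """Steady concentration of the laminar plume at (x, y), away from the source."""
    dx, dy = x - source[0], y - source[1]
    a = U * math.hypot(dx, dy) / (2.0 * D)
    b = U * dx / (2.0 * D)
    return R / (2.0 * math.pi * D) * scaled_k0(a, b)

c_left = laminar_plume(-1.0, 0.0)    # inside the plume for the sign convention above
c_right = laminar_plume(1.0, 0.0)    # on the opposite side of the source
```

The default arguments are the parameter values listed in the text; the concentration is much larger on one side of the source than on the other, and it is symmetric about the axis y = y*.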
To consider the concentration field advected by a flow field (e.g., the flow field calculated via CFD simulation, Sec. A.1), we placed an odor-emitting source at the trailing edge of the airfoil and solved the advection-diffusion equation (A.30) for the concentration field advected by that flow field. The Péclet number in
Fig. 3.1C is 10,000.
A.4.2 Turbulent Plume
We employed a particle-based two-dimensional plume model [137, 414]. The plume model simulates the
random release of odor packets from a source with a fixed initial radius r0 and initial strength c0. A turbulent wind transports the odor packets, which spread out in all directions due to radial diffusion. The concentration decreases as the packet size increases,
$$c_t = c_0\left(\frac{r_0}{r_t}\right)^3. \tag{A.32}$$
The turbulent wind is simplified to the combination of a uniform background flow of velocity U = 1 and random perturbations with strength σ. The parameters are chosen as emission rate R = 1, initial radius and concentration of the plume r0 = 0.01 and c0 = 1, and cross-wind random perturbation strength σ = 0.01.
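The dilution law for a single odor packet is simple enough to state as a one-line function. The Python sketch below only encodes the cubic relation between concentration and packet radius with the parameter values given above; the growth law of the radius itself is not specified here, so the radius is treated as an input, and the guard against radii below the initial value is our own assumption.

```python
def packet_concentration(r_t, r0=0.01, c0=1.0):
    """Concentration of an odor packet of radius r_t, c_t = c0 * (r0 / r_t)**3."""
    if r_t < r0:
        raise ValueError("packet radius is assumed not to shrink below its initial value")
    return c0 * (r0 / r_t) ** 3

c_doubled = packet_concentration(0.02)   # radius doubled relative to r0
c_tenfold = packet_concentration(0.1)    # radius ten times r0
```

Doubling the packet radius reduces the concentration by a factor of eight, and a tenfold increase reduces it by a factor of one thousand.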



Appendix B
Reinforcement learning
We implement the clipped advantage proximal policy optimization (PPO) method proposed by [400] for
our RL training. PPO maximizes a surrogate objective that clips off unwanted changes when the policy
deviates too much from the policy of the previous cycle to ensure faster and more robust convergence.
We refer readers to the original reference cited above as well as OpenAI’s documentation of the PPO
algorithm [348] and their baseline implementations [347] for a thorough explanation of the theory and
details behind this method.
Our implementation can be separated into two parts. The main loop simulates the environment using the action sequence $a_t$ generated by the agent, and stores the observed rollouts for future updates (see Algorithm 1). Note that $n_o$ and $n_a$ are used to indicate the number of observable states and actions, respectively. Equations describing the motion of the agent were integrated numerically using an adaptive-time-step, explicit RK45 method between consecutive decision steps, which are 0.06 units of time apart. This choice of decision time step limits the sensory signal the agent can get in one single period of airfoil oscillation.
Parameters of the actor-critic networks of the RL agent are updated every N time steps for K epochs. Here the value of K is chosen to be 80 and the value of N is set to 4000, and we force all the updates of the networks to happen after an episode has finished. For simplicity, we assume our continuous action variables follow a normally distributed policy πθ with mean value represented by a neural network parametrized by θ and standard deviation σ, and the critic or value function Vϕ(ot) is also represented by a neural network with parameters ϕ. Specifically, both the mean policy and value function are implemented as feed-forward neural networks with two hidden layers and tanh activation functions. The sizes of the two hidden layers were fixed to 64 and 32, respectively. To better balance exploration and exploitation, we let the standard deviation of the action σ change according to the moving average of the reward. We start with a large standard deviation σ0 = 0.4 and a threshold for the average reward ϵr = 20. Every 100 episodes, we check the average reward; if it is larger than the threshold ϵr, we decrease the standard deviation by 0.01 and increase the reward threshold by 0.5. We also constrain the standard deviation to be no smaller than 0.1. Finally, using the trajectories collected during the previous N time steps, the parameters θ, ϕ are updated according to a total loss function L(θ, ϕ) via a gradient-based optimizer with back-propagation (see Algorithm 2).
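The schedule for the exploration noise can be summarized in a few lines of Python. The sketch below is our own illustration of the rule described above, not the training code itself: the function name and the sample reward averages are hypothetical, and we assume that the reward threshold keeps increasing even after the standard deviation has reached its lower bound, which the text does not specify.

```python
def update_exploration(sigma, reward_threshold, average_reward,
                       sigma_step=0.01, threshold_step=0.5, sigma_min=0.1):
    """One check of the schedule, intended to be called once every 100 episodes."""
    if average_reward > reward_threshold:
        sigma = max(sigma_min, sigma - sigma_step)   # explore less once the agent improves
        reward_threshold += threshold_step           # and demand more before the next decrease
    return sigma, reward_threshold

sigma, threshold = 0.4, 20.0                          # initial values from the text
for average_reward in [18.0, 21.0, 20.2, 25.0]:       # hypothetical 100-episode averages
    sigma, threshold = update_exploration(sigma, threshold, average_reward)
```

In this example only the second and fourth checks exceed the current threshold, so the standard deviation ends at 0.38 and the threshold at 21.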
An important side note is that since it is in general impossible to obtain an unrealized infinite horizon return $R_t = \sum_{t'=t}^{\infty}\gamma^{t'-t}\,r_{t'}$, we need to choose an appropriate estimator of this value based on finite length simulations. We can either simply truncate rewards after some step k by using
$$\hat R_t\big|_{\mathrm{truncation}} = r_t + \gamma r_{t+1} + \gamma^2 r_{t+2} + \cdots + \gamma^k r_{t+k}, \tag{B.1}$$
or we can use the trained value function (critic) to approximate the residual contribution to the return via k-step bootstrapping,
$$\hat R_t\big|_{\mathrm{bootstrapping}} = r_t + \gamma r_{t+1} + \gamma^2 r_{t+2} + \cdots + \gamma^k V_\phi(o_{t+k+1}). \tag{B.2}$$
The two approaches were compared in a previous paper by our group [218]. Based on that comparison, bootstrapping is used for all tasks depicted in the main text. Here, k = 900 is chosen to be large enough for the swimmer to have enough time to reach the source if it is able to do so (a typical successful
trajectory only needs 200 to 300 time steps).
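The difference between the two estimators can be seen in a small Python example. The sketch below is illustrative only: the function names are ours, and for the bootstrapped estimate we use the common k-step convention in which k discounted rewards are followed by the discounted critic value at step t + k, which may differ by one index from the expression written above.

```python
def truncated_return(rewards, t, k, gamma):
    """Discounted sum of the rewards from step t to step t + k, with no tail estimate."""
    return sum(gamma ** j * rewards[t + j] for j in range(k + 1))

def bootstrapped_return(rewards, values, t, k, gamma):
    """k discounted rewards plus the discounted critic estimate of everything after them."""
    partial = sum(gamma ** j * rewards[t + j] for j in range(k))
    return partial + gamma ** k * values[t + k]

# Constant reward of 1 with gamma = 0.5: the exact infinite-horizon return is 2.
rewards = [1.0, 1.0, 1.0, 1.0]
values = [2.0, 2.0, 2.0, 2.0]           # a critic that happens to be exact
R_trunc = truncated_return(rewards, t=0, k=2, gamma=0.5)
R_boot = bootstrapped_return(rewards, values, t=0, k=2, gamma=0.5)
```

In this toy case the truncated estimate underestimates the return because it discards the tail, whereas the bootstrapped estimate recovers the exact value because the critic is exact; with an imperfect critic the bootstrapped estimate inherits the critic's error, discounted by γ to the power k.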
Note that since we did not perform systematic hyperparameter tuning, readers might want to explore
different values for better performance. A possible way to improve our training is to replace the Gaussian
distribution we used in the policy with a beta distribution, since it has been found that the beta distribution is
more suitable for continuous control problems with bounded action domains, especially for policies close to
a bang-bang controller [76].
Algorithm 1 Environment simulation
1: for time step t = 0, 1, ... do
2:   if t = 0 or episode terminates then
3:     store time step of episode termination
4:     reset state s_t ∼ P(s_0)
5:     evaluate observation: o_t ∼ o(s_t)
6:   end if
7:   sample action from policy a_t ∼ π_θ(a_t | o_t)
8:   evolve next state according to physics
9:   evaluate next observation o_{t+1} ∼ o(s_{t+1}) and reward r_t ∼ r(a_t, s_{t+1})
10:  if t = 0 or mod(t, N) ≠ 0 then
11:    append current action, observation, reward, and probability of sampling the action to assemble vectors a_{N×n_a}, o_{N×n_o}, r_{N×1}, and π_θold(a|o)_{N×1}
12:  else
13:    update agent networks according to Algorithm 2
14:  end if
15: end for



Algorithm 2 Updating the agent
1: for update epoch number κ = 0, 1, ..., K do
2:   compute the truncated return using rewards r_{N×1} and assemble into vector R_{N×1}
3:   estimate infinite-horizon return using R_{N×1} and V_T = V_ϕ(o_T) if bootstrapping is desired
4:   using o_{N×n_o} and value function V_ϕ, evaluate expected returns at each time step and store into V_{N×1}
5:   compute the advantage A = R_{N×1} − V_{N×1}
6:   evaluate the probability of realizing a_{N×n_a} based on o_{N×n_o} for the policy π_θ, and store to π_θ(a|o)_{N×1}
7:   compute the action-likelihood ratio: ϱ_θ = π_θ(a|o)_{N×1} / π_θold(a|o)_{N×1}
8:   compute clipped surrogate loss function: L_clip(θ) = mean[min[ϱ_θ · A, clip(ϱ_θ, 1 − ϵ, 1 + ϵ) · A]]
9:   compute the value function loss: L_value(ϕ) = 0.5 × mean[(R_{N×1} − V_{N×1})²]
10:  compute the total loss: L(θ, ϕ) = −L_clip(θ) + L_value(ϕ) − α × entropy[π_θ]
11:  update parameters (θ, ϕ) to minimize the total loss using a gradient-based optimizer (e.g. Adam [239])
12: end for
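The loss terms in Algorithm 2 can be illustrated on a small batch with plain Python. The sketch below computes only the clipped surrogate and the value loss from per-sample log-probabilities, returns, and value estimates; the entropy bonus and the gradient step are omitted. The function name and the clipping parameter ϵ = 0.2 are assumptions made for this example, since the value of ϵ is not stated in the text.

```python
import math

def ppo_losses(logp_new, logp_old, returns, values, clip_eps=0.2):
    """Return (clipped surrogate objective, value loss), each averaged over the batch."""
    n = len(returns)
    surrogate, value_loss = 0.0, 0.0
    for lp_new, lp_old, ret, val in zip(logp_new, logp_old, returns, values):
        advantage = ret - val
        ratio = math.exp(lp_new - lp_old)                        # action-likelihood ratio
        clipped_ratio = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        surrogate += min(ratio * advantage, clipped_ratio * advantage)
        value_loss += 0.5 * (ret - val) ** 2
    return surrogate / n, value_loss / n

# Two samples: one with positive advantage and a ratio above the clip range,
# one with negative advantage and a ratio below the clip range.
L_clip, L_value = ppo_losses(
    logp_new=[math.log(1.5), math.log(0.5)],
    logp_old=[0.0, 0.0],
    returns=[2.0, 0.0],
    values=[1.0, 1.0],
)
total_without_entropy = -L_clip + L_value
```

In the first sample the clipping caps the gain from increasing the likelihood of a good action, and in the second sample the minimum selects the more pessimistic clipped term, which is the mechanism that keeps each update close to the previous policy.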



Appendix C
Agent-based model
C.1 Mathematical model
We consider a system of N fish, where each fish is represented as a self-propelled particle moving at a constant velocity U (m·s⁻¹) relative to the flow velocity. A fish creates a flow disturbance represented by its far-field potential dipole [444, 228] and follows behavioral laws derived from shallow water experiments of flagtails (Kuhlia mugil) [154, 61, 142, 203]. Accordingly, each fish interacts with the local flow generated by all other fish and reorients its heading direction to both get closer and align with its Voronoi neighbors [142, 203]. Consider that fish i is located at $\mathbf{x}_i \equiv (x_i, y_i)$ in an inertial (x, y)-frame, with velocity $\mathbf{v}_i = \dot{\mathbf{x}}_i$, where the overdot represents the derivative with respect to time t, and has a heading direction $\mathbf{p}_i \equiv (\cos\theta_i, \sin\theta_i)$ expressed in terms of a heading angle θi measured from the x-axis. We write the equations of motion of fish i directly in non-dimensional form, using the length scale $\sqrt{U/k_p}$ and timescale $1/\sqrt{U k_p}$, where kp (m⁻¹·s⁻¹) is the intensity with which a fish reorients to get closer to its neighbors [142],
$$\dot{\mathbf{x}}_i = U\mathbf{p}_i + \mathbf{U}_i, \qquad d\theta_i = \left\langle r_{ij}\sin\theta_{ij} + I_a\sin\phi_{ij}\right\rangle dt + \Omega_i\,dt + I_n\,dW_t. \tag{C.1}$$
Here, speed is normalized to $U = 1$. The non-dimensional noise intensity $I_n$ scales a standard Wiener process $W(t)$ modeling the fish “free will” [492]. The term $\langle\circ\rangle$ represents the fish reorientation in response
to visual feedback: it means that fish $i$ only “sees” its Voronoi neighbors $\mathcal{V}_i$, with attraction intensity normalized to one and non-dimensional alignment intensity $I_a$, both averaged with weight $1 + \cos\theta_{ij}$, which continuously models a rear blind angle [61],
$$\langle\circ\rangle = \sum_{j\in\mathcal{V}_i} \circ\,(1 + \cos\theta_{ij}) \Big/ \sum_{j\in\mathcal{V}_i} (1 + \cos\theta_{ij}). \tag{C.2}$$
The intermediate variables $r_{ij} = \|\mathbf{x}_i - \mathbf{x}_j\|$, $\theta_{ij} = \angle(\mathbf{x}_j - \mathbf{x}_i) - \theta_i$, and $\phi_{ij} = \theta_j - \theta_i$ represent, respectively, the relative distance, viewing angle, and difference in heading angle between fish $i$ and $j$. The vector $\mathbf{U}_i$ represents the flow velocity generated by all other swimmers at the location of swimmer $i$, and $\Omega_i$ denotes the angular velocity, which is given in Sec. A.3.2.
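As an illustration of Eqs. (C.1) and (C.2), the following minimal sketch evaluates the visual-feedback term $\langle r_{ij}\sin\theta_{ij} + I_a\sin\phi_{ij}\rangle$ for a single fish. It assumes the list of Voronoi neighbors is supplied by the caller and leaves out the hydrodynamic terms $\mathbf{U}_i$ and $\Omega_i$; the function name `visual_feedback` and the fallback value when all neighbors lie exactly in the blind direction are illustrative assumptions, not part of the model definition.

```python
import math


def visual_feedback(i, x, y, theta, neighbors, I_a):
    """Weighted average <r_ij sin(theta_ij) + I_a sin(phi_ij)> of Eq. (C.2).

    x, y, theta : positions and heading angles of all fish
    neighbors   : indices of the Voronoi neighbors of fish i (assumed given)
    I_a         : non-dimensional alignment intensity
    """
    num = 0.0
    den = 0.0
    for j in neighbors:
        dx, dy = x[j] - x[i], y[j] - y[i]
        r_ij = math.hypot(dx, dy)                  # relative distance
        theta_ij = math.atan2(dy, dx) - theta[i]   # viewing angle
        phi_ij = theta[j] - theta[i]               # heading difference
        w = 1.0 + math.cos(theta_ij)               # vanishes directly behind fish i
        num += w * (r_ij * math.sin(theta_ij) + I_a * math.sin(phi_ij))
        den += w
    # Assumption: no visual response if every neighbor has zero weight.
    return num / den if den > 1e-12 else 0.0
```

For example, a fish heading along the $x$-axis with an aligned neighbor at distance 2 directly to its left experiences a turning rate of 2 toward that neighbor, whereas a neighbor directly behind it contributes nothing because its weight is zero.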
C.2 Computational method
To numerically solve the system of equations (C.1) for a large number of fish $N$, one needs a computationally efficient approach to handle the all-to-all hydrodynamic interactions and the Voronoi tessellation at each time step. The computational cost of the hydrodynamic interactions in Eq. (A.29) scales as $O(N^2)$. To handle these interactions, we optimized and parallelized the code responsible for computing the direct sum in Eq. (A.29) using a just-in-time compiler called Numba [252]. Numba compiles, optimizes, and parallelizes the Python code to approach the computational performance of C or Fortran. Note that fast multipole methods (FMM) reduce the computational complexity of the hydrodynamic interactions from $O(N^2)$ to nearly $O(N)$ [166, 510], but FMM algorithms do not have a significant advantage over the direct sum in systems of the order of $10^4$ agents [510], hence our choice to directly optimize the $O(N^2)$ sum in (A.29). For the Voronoi tessellation in two dimensions (2D), efficient algorithms exist that reduce this task to $O(N \log N)$ [27]. We utilized the function Delaunay in SciPy [472]. We implemented these approaches in evaluating the right-hand sides of Eq. (C.1) at each time step $dt$, discretized the noise term
using $dW_t = \mathcal{N}(0, 1)\sqrt{dt}$, and used an explicit Euler–Maruyama method to integrate (C.1) forward in time with a timestep size $dt = 10^{-2}$ [240]. We ran our algorithm on an Exxact Valence Workstation with a 56-core Intel Xeon W9-3495X CPU. With this software and hardware setup, a timestep takes about 1 second for 10,000 agents, with hydrodynamic interactions and Voronoi tessellation taking about half of the computational time each. Integrating the motion of 10,000 agents over a time interval $T = 1000$ took about a day; integrating the motion of 50,000 swimmers for the same time interval took about three weeks. To verify the implementation, we reproduced the phase diagram of [142] in Fig. C.1. A summary of our dataset is given in Table C.1.
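The time-stepping scheme described above can be sketched for a single heading angle as follows. This is a simplified illustration, not the production code: the drift is passed in as a user-supplied function (in the full model it would be the visual-feedback and hydrodynamic terms of Eq. (C.1), evaluated for all fish at once), and the names `euler_maruyama_heading` and `num_steps` are hypothetical.

```python
import math
import random


def num_steps(T, dt):
    """Number of time steps needed to cover the interval T with step dt."""
    return round(T / dt)


def euler_maruyama_heading(theta0, drift, I_n, dt, n_steps, rng):
    """Integrate d(theta) = drift(theta) dt + I_n dW with explicit Euler-Maruyama.

    drift : callable returning the deterministic turning rate (assumed given)
    I_n   : non-dimensional noise intensity
    rng   : random.Random instance, so that runs are reproducible
    """
    theta = theta0
    sqrt_dt = math.sqrt(dt)
    for _ in range(n_steps):
        # Discretized Wiener increment dW = N(0, 1) * sqrt(dt)
        dW = rng.gauss(0.0, 1.0) * sqrt_dt
        theta += drift(theta) * dt + I_n * dW
    return theta
```

With $dt = 10^{-2}$ and $T = 1000$, `num_steps` gives $10^5$ steps; at about 1 second per step for 10,000 agents this amounts to roughly 28 hours, consistent with the reported run time of about a day.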
[Figure: phase diagram over alignment intensity and noise intensity, showing the polar order parameter P and the milling order parameter M, with phases labeled milling, turning, swarming, and schooling.]
Figure C.1: Reproduction of phase diagram. We systematically varied the alignment intensity $I_a$ and noise intensity $I_n$ to reproduce the phase diagram of fish schools in [142]. Other parameters are $N = 100$, $T = 1000$, and $I_f = 0.01$; 20 Monte Carlo simulations were performed at each parameter set.
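The two order parameters named in Fig. C.1 are not defined in this appendix. The sketch below assumes the definitions commonly used for this class of models: $P$ is the norm of the mean heading vector, and $M$ is the magnitude of the mean normalized angular momentum about the school centroid. Both definitions and the function names are assumptions made for illustration and should be checked against the main text.

```python
import math


def polar_order(theta):
    """Assumed definition: P = || (1/N) sum_i p_i ||, with p_i = (cos, sin)(theta_i)."""
    n = len(theta)
    cx = sum(math.cos(t) for t in theta) / n
    cy = sum(math.sin(t) for t in theta) / n
    return math.hypot(cx, cy)


def milling_order(x, y, theta):
    """Assumed definition: M = | (1/N) sum_i (r_i x p_i) / |r_i| | about the centroid."""
    n = len(theta)
    xc, yc = sum(x) / n, sum(y) / n
    total = 0.0
    for xi, yi, ti in zip(x, y, theta):
        rx, ry = xi - xc, yi - yc
        r = math.hypot(rx, ry)
        if r > 0.0:
            # z-component of the cross product between unit position and heading
            total += (rx * math.sin(ti) - ry * math.cos(ti)) / r
    return abs(total) / n
```

Under these definitions, a perfectly aligned school gives $P = 1$, while fish circling a common center give $M = 1$ and $P \approx 0$.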



Table C.1: Summary of the dataset for polarized schools. In total, we performed and analyzed 631 simulations at various parameter values and school sizes.
Ia  In    If         N                        ∆N   #MC  #    P
9   0.5   0.01       100                      -    5    5    0.96
9   0.5   0.01       1000                     -    5    5    0.79
9   0.5   0.01       10,000                   -    5    5    0.69
9   0.5   0.01       50,000                   -    1    1    0.81
9   0.5   0.01       110-540                  10   1    44   0.87-0.96
9   0.5   0.01       550-900                  50   1    7    0.78-0.89
9   0.5   0.01       1,500                    -    1    1    0.78
9   0.5   0.01       1,600                    -    1    1    0.68
9   0.5   0.01       2,000                    -    5    5    0.83
9   0.5   0.01       2,500                    -    1    1    0.76
9   0.5   0.01       3,000                    -    7    7    0.73
9   0.5   0.01       3,600                    -    1    1    0.74
9   0.5   0.01       4,900                    -    1    1    0.80
9   0.5   0.01       5,000                    -    7    7    0.67
9   0.5   0.01       6,400                    -    1    1    0.59
9   0.5   0.01       7,500                    -    6    6    0.77
9   0.5   0.01       8,100                    -    1    1    0.73
9   0.5   10⁻⁴ – 5   100, 200, 500, 1000      -    5    375  0.67-0.98
9   0.5   10⁻⁴ – 5   1500, 2000, 2500, 3000   -    1    60   0.66-0.98
9   0.5   10⁻⁴ – 5   10,000                   -    1    15   0.26-0.95
9   0.5   0          100, 1000, 10,000        -    10   30   0.96-0.99
5   0.5   0.01       100-1000                 100  1    10   0.83-0.92
5   0.5   0.01       5000                     -    1    1
7   0.5   0.01       100-1000                 100  1    10   0.87-0.95
9   0.7   0.01       100-1000                 100  1    10   0.80-0.94
9   0.3   0.01       100-1000                 100  1    10   0.9-0.97
9   0.3   0.01       5000                     -    1    1
9   0.0   0.01       100, 1000, 10,000        -    1    3    0.92-0.98
9   0.75  0.01       100, 1000, 10,000        -    1    3    0.73-0.87
9   1.0   0.01       100, 1000, 10,000        -    1    3    0.63-0.70
Asset Metadata
Creator Hang, Haotian (author) 
Core Title Underwater navigation strategies and emergent collective behavior in bioinspired swimmers 
Contributor Electronically uploaded by the author (provenance) 
School Andrew and Erna Viterbi School of Engineering 
Degree Doctor of Philosophy 
Degree Program Mechanical Engineering 
Degree Conferral Date 2025-05 
Publication Date 04/28/2025 
Defense Date 03/14/2025 
Publisher University of Southern California. Libraries (digital), University of Southern California (Los Angeles, California, USA) (original) 
Tag collective behavior,fish schooling,OAI-PMH Harvest,reinforcement learning,underwater navigation,vortex dynamics 
Format theses (aat) 
Language English
Advisor Kanso, Eva (committee chair), Nakano, Aiichiro (committee member), Bermejo-Moreno, Ivan (committee member), Oberai, Assad (committee member) 
Creator Email haotianh@usc.edu 
Permanent Link (DOI) https://doi.org/10.25549/usctheses-oUC11399KF03 
Unique identifier UC11399KF03 
Legacy Identifier etd-Hang-36813-47749 
Document Type Dissertation 
Rights Hang, Haotian 
Internet Media Type application/pdf 
Type texts
Source 202504w05-usctheses (batch), University of Southern California Dissertations and Theses (collection), University of Southern California (contributing entity) 
Access Conditions The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright. 
Repository Name University of Southern California Digital Library
Repository Location USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email uscdl@usc.edu
Abstract Autonomous vehicles navigating unsteady fluid environments have gained attention from aviation to ocean monitoring. Efficient navigation in dynamically changing flows requires harvesting energy and extracting information from fluid structures. Aquatic and aerial animals excel in these tasks, often relying on collective locomotion for enhanced efficiency and mobility.

We start with a single agent, discussing the optimal gait to achieve high efficiency and speed using fluid-structure interaction (FSI). We also employ reinforcement learning (RL) to navigate bio-inspired vortical wakes. We uncover the physical mechanisms behind these navigation strategies and emphasize the crucial role of flow sensing, especially of flow gradients.

Extending to multiple agents, we explore high-fidelity hydrodynamic interactions. While prior studies show two synchronized agents stabilize at energy-saving equilibria, we analyze how these formations scale to larger schools. "Cooperative" side-by-side patterns scale well, sharing energetic benefits, whereas "selfish" inline formations favor trailing agents but lose cohesion beyond a critical size.

To study larger schools, we employ a data-driven behavioral model with far-field flow interactions. At intermediate sizes ($\sim  100$), collective patterns like polarized schooling and rotationally-ordered milling emerge. Simulating up to $10^4$ agents, we observe global order breaking down, with locally ordered subgroups undergoing continuous splitting and merging. Information transfer within clusters intensifies during merging but diminishes during splitting.

These findings guide the design and control of autonomous robots in fluid environments while contributing to nonlinear dynamics, flow control, and biological physics. 