Human and organizational factors of PTC integration in railroad system and developing HRO-centric methodology for aligning technological and organizational change
(USC Thesis Other)
University of Southern California

Human and Organizational Factors of PTC Integration in Railroad System and Developing HRO-centric Methodology for Aligning Technological and Organizational Change

A Dissertation Submitted to the Faculty of the USC Graduate School in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy
Industrial and Systems Engineering

By Yalda Khashe

Academic Adviser and Committee Chair
Professor Najmedin Meshkati

Dissertation Committee
Professor Ann Majchrzak
Professor Mansour Rahimi

May 2018

Table of Contents

Acknowledgement
Executive Summary
CHAPTER 1. Introduction
1.1. Railroad Accidents and PTC
1.2. Problem Statement and Motivation
1.3. Scope
CHAPTER 2. Literature Review
2.1. Positive Train Control (PTC)
2.2. High Reliability Organization
2.2.1. Hallmarks of High Reliability Organizations
2.2.2. Implementation of HRO Principles
2.2.2.1. System Views of HRO
2.2.2.2. Decision Making at HROs
2.2.2.3. Safety Culture at HROs
2.2.3. High Reliability Organizing and Resiliency
2.2.4. Application of HRO in Railroads
2.2.5. New Technology Implementation in High-risk Organizations
2.3. Risk and Reliability Analysis in Railroads
2.3.1. Preliminary Hazard Analysis (PHA)
2.3.2. Hazard Operability Analysis (HAZOP)
2.3.3. Failure Modes and Effects Analysis (FMEA)
2.3.4. Failure Modes and Effects Criticality Analysis (FMECA)
2.3.5. Operating Hazard Analysis (OHA)
2.3.6. Fault Tree Analysis (FTA)
2.3.7. Event Tree Analysis (ETA)
2.3.8. Human Reliability Analysis (HRA)
2.3.9. Comparison of the Risk Analysis Tools
CHAPTER 3. Methodology
3.1. Overview
3.2. Hypothesis Testing and Data Analysis
3.2.1. Research Question and Data Collection
3.2.2. Interviews and Site Visits
3.2.3. Study 1 – Factors Influencing a Successful Performance
3.2.3.1. Multiple Discriminant Analysis
3.2.3.2. Z-test
3.2.4. Study 2 – PTC Delay Factors
3.2.4.1. PTC Failure Modes
3.2.4.2. PTC Delay Factors
3.3. Integration of HRO Principles into the PTC Operations
3.3.1. Adaptation of HRO Principles for Railroad Operations under PTC
3.3.2. Integration of HRO Principles into the Identified PTC Performance Measures
CHAPTER 4. Challenges and Potential Future Work
CHAPTER 5. Summary and Conclusion
References

List of Tables

Table 1 - PTC Preventable Accidents (1996-2015)
Table 2 - Comparison of risk analysis tools
Table 3 - PTC Visits and Interviews
Table 4 - Daily PTC Run Reports Summary
Table 5 - Multiple Discriminant Analysis
Table 6 - Summary of the Canonical Discriminant Function
Table 7 - Standardized Canonical Discriminant Function Coefficients
Table 8 - Structure Coefficients Matrix
Table 9 - Initialization Most Influential Factor
Table 10 - Disengage Most Influential Factor
Table 11 - Cut-out Most Influential Factor
Table 12 - Differences between Initialization and Disengage
Table 13 - Differences between Initialization and Cut-out
Table 14 - Differences between Disengage and Cut-out
Table 15 - Most Influential Factors Contributing to Failed PTC Runs
Table 16 - Most Influential Factors Contributing to Semi-successful PTC Runs
Table 17 - PTC Failure mode
Table 18 - PTC Delay Factors
Table 19 - HRO adaptation to PTC
Table 20 - Integration of HRO Principles into the PTC Operations

List of Figures

Figure 1 - September 2008, Metrolink Accident
Figure 2 - How PTC works
Figure 3 - Integrated Positive Train Control
Figure 4 - HRO Meta Models
Figure 5 - "Work-as-imagined" vs. "Work-as-done" in non-HROs
Figure 6 - HRO Characteristics vs. INPO Traits of a healthy Safety Culture
Figure 7 - A general model of limited reliability seeking
Figure 8 - PSA Procedure
Figure 9 - First level fault tree analysis for train derailment
Figure 10 - Event Tree
Figure 11 - Relationship of safety, human error and their influence
Figure 12 - Metrolink System Map
Figure 13 - Metrolink PTC Service Map
Figure 14 - System initialization under PTC
Figure 15 - FTA for over speeding
Figure 16 - FTA for Movement of a train through a switch left in the wrong position
Figure 17 - FTA for Incursion into established work zone
Figure 18 - FTA for train-to-train accidents
Figure 19 - Factors influencing PTC based on FTA
Figure 20 - Factors Contributing to Delayed PTC Runs
Figure 21 - Unidentified factors affecting successful PTC Runs (April - December 2016)
Figure 22 - Mapping between PTC Run Reports vs. Delay Reports Factors
Figure 23 - Metrolink Rotem Cab Simulator
Figure 24 - Example of a potential PTC Dashboard
To my Parents Shahriar Khashe and Sara Khaleghi Soroush and my Sister Saba for their Unconditional Love and Support

Acknowledgement

Throughout this journey many people offered their support and insight, and although I cannot mention them all, I would like to take this opportunity to acknowledge several who deserve special recognition.

I would like to express my sincere gratitude to my advisor and mentor, Professor Najmedin Meshkati, for his guidance, patience, and encouragement through this journey. His passion for his work and selfless dedication to his students’ success have inspired me to be a better scholar. I will forever treasure his invaluable support.

I am grateful to my mentor Professor Mansour Rahimi, who stood by me and guided me every step of the way. I would like to thank Professor Ann Majchrzak for her generous guidance and continued support. I want to express my appreciation towards Professor Jamal Abedi of University of California, Davis, whose instrumental support made this work possible.

My sincere thanks go to Ms. Gail Davis and her colleagues at Metrolink for their help with data collection and invaluable input regarding PTC operations.

It is my pleasure to acknowledge the significant help and insightful remarks provided by Professor Greg Placencia of California State Polytechnic University, Pomona; Professor Daved Van Stralen of Loma Linda University; RAdm Tom Mercer; and Professor James Moore of University of Southern California, during this research.

I would like to thank the faculty and the staff of the Daniel J. Epstein Department of Industrial and Systems Engineering for their support throughout my PhD program at USC.

Lastly, I would like to thank my family for their love and support. Maman and Baba, you believed in me, even when I did not believe in myself, and you pushed me when I was about to give up. All that I am and all I will ever be is because of you, and for that I will forever be grateful.
Saba, my baby sister, you are my best friend and confidant; I am so glad to have you in my life. My uncle Alireza, thank you for being my rock. Mehdi Dorri, thank you for your friendship, and for being there when I needed a shoulder to lean on. And to the memory of my aunt Mitra, to whom I never got the chance to say goodbye.

Executive Summary

An apparent rash of grave railroad accidents in the United States has not only damaged the railroad infrastructure and interrupted its operations but also endangered the safety and lives of train crewmembers and passengers. Railroads operate in high-risk, hazardous, and rapidly changing environments over long periods of time while facing the inevitable task of avoiding catastrophic events. Although train accidents are rare, they are highly visible, making the consequences of such failures disastrous. The fact that trains are a major means of transportation around the world is another factor that makes these failures highly significant.

One of the accidents that had a major impact on the US railroad industry was the 2008 collision between a Metrolink commuter train and a Union Pacific freight train near Chatsworth, California. The catastrophic result of the accident was the loss of 25 lives, 135 injuries, and millions of dollars in damages. Shortly after the accident, the US Congress passed the Rail Safety Improvement Act, which required Class I railroads to install Positive Train Control (PTC) systems by the end of 2015 on tracks that carry passengers or toxic-by-inhalation materials. The deadline was later extended to December 31, 2018 in light of the challenges that railroad organizations had in implementing PTC systems. This may seem surprising, as one would think well-established procedures should exist given the long history of railroads, but the reality is that railroads, as in all industries, must continually reestablish themselves with new technologies.
One of the challenges of PTC implementation is altering an existing safety system with this new technology. The installation of new technology always involves some changes to the organization and its components. Therefore, organizational and technological changes must be considered simultaneously.

High Reliability Organizations (HRO) methodology provides a systematic approach to organizational safety. This methodology creates a culture of mindfulness and can provide an extra layer of defense in railroad safety systems. To the best of our knowledge, and based on extensive research, HRO has not been implemented in railroads to the extent that it has in the healthcare, nuclear, and chemical-processing industries, nor has there been a system that was designed with HRO in mind. Most studies have focused on transforming an organization into an HRO, while it would be more productive to design a highly reliable organization.

In this dissertation we propose a guideline for improving service reliability and reducing service interruption in train operations under PTC by eliminating preventable failures and system variations. The main purpose of this guideline is to provide a systematic approach to identify the human, organizational, and technological factors that influence the reliability of the system, and to address and evaluate these factors using HRO characteristics.

We conducted an in-depth analysis of railroad operation under PTC to evaluate the contribution of implementing HRO elements to the improvement of service reliability and the reduction of service interruption in train operations under PTC. We identified four performance factors that have a significant influence on the reliability and serviceability of the Metrolink operation under PTC between April 4 and December 22, 2016.
Although technical factors such as onboard system failures or software issues affected the PTC operation, our studies showed that software download and updating issues during the initialization process, invalid speed reading due to a tachometer slip/slide, GPS, or navigation issues that result in a loss of location or signal in the PTC system, the PTC system not being activated at the designated point, and an invalid train consist displayed by the PTC system at the start of or during the train run are the factors that significantly contributed to PTC failures and major delays. Almost all of these issues arise from inadequate interaction of the subsystems or a lack of alignment between organizational and human factors and the technical system.

To show what it means for a railroad organization to be highly reliable, we integrated the HRO principles into train operations. We further expanded that concept to emphasize train operation under PTC, and provided a checklist for onboard factors, the communication network, and crew and dispatcher work and interaction, since these were the areas that corresponded to the factors that had the most influence on the reliability of PTC operation in the observed system.

The findings of this research will make a significant contribution to the risk reduction and safe operation of the railroad. The main goal of railroad organizations is the safety of their passengers. We optimize the schedule and improve the infrastructure, but at the end of the day we want to make sure that everyone who gets on the train or works on the track returns to their loved ones, safe and sound.

CHAPTER 1. Introduction

1.1. Railroad Accidents and PTC

On August 20, 1969, two Penn Central commuter trains collided head-on near Darien, Connecticut, killing four and injuring 43. That tragedy 45 years ago began the NTSB's call for development and implementation of Positive Train Control (PTC) systems.
Since then, the NTSB has issued almost 50 PTC-related safety recommendations and has included PTC on its Most Wanted List every year from its inception in 1990 until enactment of the RSIA. The NTSB's January 16, 2014, press release (NTSB, 2014) states:

“The NTSB has long been calling for PTC, which works by monitoring the location and movement of trains, then slowing or stopping a train that is not being operated in accordance with signal systems or operating rules. Just since 2004, the NTSB has completed investigations of 25 train accidents that killed 65, injured over 1100, and caused millions of dollars in damages all of which could have been prevented or mitigated by PTC.”

Unfortunately, despite some progress in the four decades since that original recommendation, train collisions still occur. Table 1 provides a list of railroad accidents between 1996 and 2015 that, based on NTSB investigations, could have been prevented or mitigated by the PTC technology. These accidents resulted in 84 deaths, 1,307 injuries, and millions of dollars in damages.
Date | Accident | Deaths / Injuries | Location | Report
1996 February 16 | Collision and Derailment of Maryland Rail Commuter MARC Train 286 and AMTRAK Train 29 | 11 deaths, 26 injured | Silver Spring, MD | (1997)
1999 January 17 | Collision involving three Consolidated Rail Corporation freight trains operating in fog on a double main track | 2 deaths | Near Bryan, OH | (2001)
2002 April 23 | Collision of Burlington Northern Santa Fe freight train with Metrolink passenger train | 2 deaths, 141 injured | Placentia, CA | (2003)
2003 October 12 | Derailment of Northeast Illinois Regional Commuter Railroad Train 519 | 2 deaths, 117 injured | Chicago, IL | (2005)
2003 November 15 | Collision of Union Pacific Railroad with a Burlington Northern Santa Fe Railway Company train | 2 injured | Kelso, WA | (2005)
2004 May 19 | Collision Between Two BNSF Railway Company Freight Trains | 1 death, 4 injured | Gunter, TX | (2006)
2004 June 28 | Collision and derailment of UP Railroad train MHOTU-23 With BNSF Railway Company train MEAP-TUL-126-D | 3 deaths, 30 injured | Macdona, TX | (2006)
2005 January 6 | Collision of Norfolk Southern Freight Train 192 With Standing Norfolk Southern Local Train P22 | 9 deaths, 554 injured | Graniteville, SC | (2005)
2005 July 10 | Collision of Two CN Freight Trains | 4 deaths | Anding, MS | (2007)
2007 November 10 | Collision of two Union Pacific Railroad freight trains | 1 death | Bertram, CA | (2008)
2007 November 30 | Collision of Amtrak Passenger Train 371 and Norfolk Southern Railway Company Freight Train 23M | 71 injured | Chicago, IL | (2009)
2008 May 28 | Collision Between Two Massachusetts Bay Transportation Authority Green Line Trains | 5 injured | Newton, MA | (2009)
2008 September 12 | Collision of Metrolink Train 111 with Union Pacific Freight Train LOF65-12 | 25 deaths, 135 injured | Chatsworth, CA | (2010)
2009 July 14 | Collision of Dakota, Minnesota & Eastern Railroad Freight Train and 19 Stationary Railcars | 2 deaths | Bettendorf, IA | (2012)
2011 April 17 | Collision of BNSF Coal Train With the Rear End of Standing BNSF Maintenance-of-Way Equipment Train | 2 deaths, 2 injured | Red Oak, IA | (2012)
2011 May 8 | Collision of Port Authority Trans-Hudson Train with Bumping Post at Hoboken Station | 37 injured | Hoboken, NJ | (2012)
2012 January 6 | Collision between Two CSX Transportation Freight Trains | 2 injured | Westville, IN | (2013)
2012 June 24 | Collision of Two Union Pacific Railroad Freight Trains | 3 deaths, 1 injured | Goodwell, OK | (2013)
2015 May 12 | Derailment of Amtrak Train 188 due to Over speeding | 8 deaths, 185 injured | Philadelphia, PA | (2017)

Table 1 - PTC Preventable Accidents (1996-2015)

PTC cannot prevent all railroad failures; however, it can provide a critical redundancy in railroad safety systems that could stop train accidents. For PTC systems to work properly, though, organizations need to develop a culture of safety. On April 3, 2016 Amtrak train 89 struck a backhoe with a worker inside while traveling 99 mph near Chester, Pennsylvania. Two Amtrak employees were killed and 39 passengers were injured as the result of this accident (National Transportation Safety Board, 2017). PTC is designed to prevent the incursion of a train into a work zone, and that section of the track was equipped with PTC (Laughlin, 2017). However, the accident was not prevented because the equipment used for maintenance was not detectable by the PTC system. This accident highlighted the importance of human and organizational factors in the successful implementation and performance of a PTC system.

1.2. Problem Statement and Motivation

High-risk organizations are organizations operating technologies sufficiently complex to be subject to catastrophic accidents. High-reliability Organizations (HROs) are a subset of high-risk organizations designed and managed to avoid such accidents (Roberts & Rousseau, 1989). Not all organizations need to be an HRO. However, organizations that deal with “low probability, high consequence” events really have no choice but to become one. Dr. Richard S.
Hartley, Principal Engineer at Pantex¹, notes that an HRO is important because:

“Some types of system failures are so punishing that they must be avoided at almost any cost. These classes of events are seen as so harmful that they disable the organization, radically limiting its capacity to pursue its goal, and could lead to its own destruction.” [Emphasis added by Hartley, quoting renowned UC Berkeley professor Todd LaPorte (LaPorte & Consolini, 1991)]

¹ Pantex Plant, located 17 miles northeast of Amarillo, Texas, in Carson County, is charged with maintaining the safety, security, and reliability of the nation’s nuclear weapons stockpile. The facility is managed and operated by B&W Pantex for the U.S. Department of Energy/National Nuclear Security Administration.

The railroad industry, as an example of a dynamic, high-risk, and safety-critical organization, faces the inevitable task of avoiding catastrophic events while performing dynamic tasks under very strict time constraints and operating technology that poses large-scale physical hazards. Although failures in these systems are rare, they are highly visible, making the consequences of such failures disastrous. The fact that trains are a major means of transportation around the world is another factor that makes these failures highly significant.

Federal Railroad Administration (FRA) records show that there were 59 train collisions and 891 train derailments in the United States between January and December 2017, which resulted in two deaths and 185 injuries².
According to a Rail Safety Fact Sheet published in February 2014, accidents related to human error and track defects account for more than two-thirds of all train accidents, while trespassing and highway-rail grade crossing incidents account for 6% of all rail-related fatalities. FRA reports also show considerable growth in the use of railroad transportation. Amtrak ridership is up more than 50% since 2000 and freight rail traffic is near an all-time high, which emphasizes the need for more aggressive efforts in railroad safety and reliability improvements (Federal Railroad Administration, 2014).

² The data represents January to August 2015: http://safetydata.fra.dot.gov/officeofsafety/publicsite/Query/AccidentByRegionStateCounty.aspx

One of the accidents that had a significant impact on the US railroad industry was the 2008 collision between a Metrolink commuter train and a Union Pacific freight train near Chatsworth, California. The catastrophic result of the accident was the loss of 25 lives, 135 injuries, and millions of dollars in damages.³ Shortly after the accident, the US Congress passed the Rail Safety Improvement Act, which requires Class I railroads to install Positive Train Control (PTC) systems by the end of 2015 on tracks that carry passengers or toxic-by-inhalation materials. The National Transportation Safety Board (NTSB) January 16, 2014, press release states:

“The NTSB has long been calling for PTC, which works by monitoring the location and movement of trains, then slowing or stopping a train that is not being operated in accordance with signal systems or operating rules. Just since 2004, the NTSB has completed investigations of 25 train accidents that killed 65, injured over 1100, and caused millions of dollars in damages all of which could have been prevented or mitigated by PTC.” (NTSB, 2014)

It seems that there has been a rash of serious and horrific railroad accidents in the United States in recent years.
A Metro-North railroad crash near Valhalla, New York, on February 3, 2015 killed five people and injured nine others.⁴ The derailment of an Amtrak train near Philadelphia on May 12, 2015 killed eight, injured more than 200 passengers, and interrupted the rail service in a major corridor on the East Coast.⁵ In May 2015, U.S. senators Charles Schumer and Richard Blumenthal announced a new rail safety bill, the "Positive Train Control Safety Act", in the wake of the previous February’s tragic train accident in Valhalla, NY, to ensure railroads are moving forward to install PTC.⁶

PTC is a generic term referring to a range of fully integrated technologies that overlay existing safety systems to prevent train-to-train collisions and improve worker safety. One of the challenges that the railroad industry faces in implementing PTC is the complication of introducing this new technology into an already existing system. This may seem surprising, as one would think well-established procedures should exist given the long history of railroads, but the reality is that railroads, as in all industries, must continually reestablish themselves with new technologies.

The installation of new technology always involves some changes to the organization and its components. Therefore, organizational and technological changes must be considered simultaneously. Studies show that if both organizational and technological changes are not effectively integrated and managed to achieve alignment, the technological change will fail. Therefore, all system variables, including psychological, social, organizational, and political processes as well as technological and engineering practices, should be considered for a proper alignment of technological and organizational change.

³ http://www.reuters.com/article/2008/10/02/us-usa-train-crash-idUSN0152835520081002
⁴ https://www.ntsb.gov/investigations/AccidentReports/Pages/RAR1701.aspx
⁵ https://www.ntsb.gov/investigations/AccidentReports/Pages/RAR1602.aspx
⁶ http://minutemannewscenter.com/articles/2015/04/23/fairfield/business/doc5536a590bc327750618088.txt

This research will provide a guideline for the adaptation of HRO principles as part of the implementation process for the PTC technology in a safety-sensitive railroad organization. The objective is to improve system reliability. We defined reliability in our system as “the lack of unwanted, unanticipated, and unexplainable variance in performance” (Hollnagel, 2009). In other words, we want to minimize the gap between “work-as-planned” and the accomplished outcome, or “work-as-done”. We will define human, organizational, and technological factors that influence train operations under PTC and identify the ones that have the most influence on the PTC system’s performance during a train run, or contribute to train delays. We will also discuss an adaptation of HRO principles to railroad operations under PTC, and further provide a checklist for integration of HRO principles into the factors that we identified as the most influential to a system’s reliability.

1.3. Scope

We focused on PTC objectives and factors that could potentially lead to train-to-train collisions, derailments due to over speeding, incursions into roadway work zones, and movement of a train through a switch left in the wrong position. In addition, the focus of the study is on the interaction of the subsystems with the PTC technology; the study does not evaluate the design and technical features of the PTC system and its components.

This dissertation is structured as follows: Chapter 1 provides a brief history of railroad accidents that could have been prevented by PTC, and describes the problem, the motivation, and the scope of the work. Chapter 2 presents the literature review and identifies the gaps in the existing research.
Chapter 3 discusses the objectives and research questions, and introduces the research methodology, results, and conclusions that address the research questions. Chapter 4 lists the limitations of the research and provides future research steps. Chapter 5 concludes the dissertation.
CHAPTER 2. Literature Review
The main goal of organizations operating in the mass transportation industry is the safety of their passengers, and railroads are no exception. Passenger safety in railroads could be achieved by a clear definition of the essential fundamentals of a safe operation and by forbidding operation outside this definition (Hale & Heijer, 2006). Train operation and its safety rely heavily on the engineer’s performance. Many times the only barrier standing between two trains is “the vigilance and dedication to rules and duty of the men and women in the cab of a locomotive” (Hansen, 2001), and if that fails, nothing can stop the trains and prevent the inevitable tragedy. PTC is designed to prevent such accidents. This technology calculates the time needed to stop a train before it exceeds its authority and will intercede if the locomotive engineer fails to take action. PTC is a system that uses a digital data link communications network, as well as components on the locomotive, along the wayside, and in the control center, to prevent four specific types of train accidents (Federal Railroad Administration, 2014):
• Train-to-train collisions,
• Derailments due to overspeeding,
• Incursions into roadway work zones,
• Movement of a train through a switch left in the wrong position.
The PTC system could enhance railroad security by monitoring the speed and location of all trains, providing onboard enforcement of all movement authorities and speed restrictions, and making it possible to remotely intervene to stop a train. The railroads will also be able to monitor all powered and manual switches, bridges, and tunnels (Ditmeyer, 2011).
With the existing signal and train control systems it is possible for one person, either an engineer or the dispatcher, to make a mistake that would result in a collision. The PTC system reduces the possibility and severity of accidents by integrating the human, onboard, and control center subsystems. PTC also creates features that check the system performance and integrity (Ditmeyer, 2011).
HRO provides a systematic approach to organizational safety. This methodology creates a culture of mindfulness and could provide an extra layer of defense in railroad safety systems. To the best of our knowledge and based on extensive research, HRO has not been implemented in railroads as much as in the healthcare, nuclear, and chemical-processing industries, nor has there been a system that was designed with HRO in mind. Most studies focus on transforming an organization into an HRO, while it would be more productive to design a highly reliable organization.
One of the challenges of PTC implementation is the complication of introducing this new technology to an already existing system without impairing the human-automation interaction. Failures in complex systems include equipment failure as well as human error. Traditional risk/reliability studies assumed that the majority of system failures were due to hardware failures, but accident history has shown that human error causes 20–90% of all major system failures (Verma, Ajit, & Karanki, 2010). In railroads, as an example of a complex, high-risk system, human-related errors caused 39% of accidents (Federal Railroad Administration, 2014). The installation of new technology always involves some changes to the organization and its components. Systems do not always behave the way they were designed, and designers might fail to account for the unexpected features and behaviors that fall outside the pre-established scope of work.
Therefore, human, organizational, and technological changes must be considered simultaneously.
In this chapter, we first provide a review of the PTC system literature. Then we focus on HRO and discuss the hallmarks of HRO, the systems view and decision making in these organizations, safety culture in HROs, and resiliency in HROs, and we review the literature on HRO implementation. Finally, we provide a review of studies on new technology implementation and its effect on organizational reliability.
2.1. Positive Train Control (PTC)
In 1970, the NTSB first addressed the need to require a form of automatic train control. Since then, the NTSB has issued almost 50 PTC-related safety recommendations and included PTC on its Most Wanted List every year from the list's inception in 1990 until enactment of the RSIA (NTSB, 2014). Although train accidents are rare, they are highly visible, and their consequences can be disastrous. The fact that trains are a major means of transportation around the world is another factor that makes these failures highly significant.
Shortly after the September 2008 collision between a Metrolink commuter train and a Union Pacific freight train, the US Congress passed the Rail Safety Improvement Act of 2008 (RSIA) on October 4. President George W. Bush signed the act into law on October 16. RSIA requires Class I railroads to install PTC systems on their tracks that carry passengers or toxic-by-inhalation (TIH) materials (Association of American Railroads, 2011). The law originally required fully functional PTC systems to be in place by December 31, 2015; however, in light of the railroads' challenges in implementing this technology, Congress extended the deadline by at least three years to December 31, 2018. An extension of two additional years is possible if the organizations meet certain requirements (FRA, 2017). Approximately 70,000-80,000 miles of track will be affected by the PTC mandate.
Figure 1 - September 2008, Metrolink Accident [7]
[7] Metrolink 2008 Train Crash - http://www.berglundandjohnson.com/images/metrolink-train-wreck.jpg
PTC is a generic term referring to a range of fully integrated technologies that overlay existing safety systems to prevent train-to-train collisions and improve worker safety. The current PTC system gives notice of an impending penalty brake application if the train approaches a speed-limited section at full speed or if the train is traveling beyond speed restrictions. If the engineer does not take the necessary action, the system brings the train to a stop with a full-service brake application. It also prevents the train from moving beyond the speed restrictions. The conventional safety approach used signal systems with colored signs along the track, and daily bulletin reports, to manage the speed of the train. If a train travels through unsignaled (dark) or automatic signal territories, movement authorities are transmitted to and confirmed with train crews over an analogue voice radio system (Ditmeyer, 2011).
Figure 2 - How PTC works (Ditmeyer, 2011)
The current state of train operation in most railroad organizations is Centralized Traffic Control (CTC). In CTC territory, authority for train movements and track occupancy is verbally exchanged between the dispatcher and the train crew over the radio. A movement authority consists of determining a safe point to which a train can travel, for example, an absolute signal displaying a stop aspect (Southern California Regional Rail Authority (Metrolink), 2010). The General Code of Operating Rules (GCOR) governs these exchanges of information (Federal Railroad Administration, 2003). CTC uses electrical track circuits to determine train location. In PTC systems, in addition to the electrical track circuits and wayside units, a Global Positioning System (GPS) is used for tracking train movements (Southern California Regional Rail Authority (Metrolink), 2010).
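The warn-then-enforce behavior described in this section can be illustrated with a small sketch. It is not the algorithm of any deployed PTC system: it assumes constant deceleration on level track, and the names `TrainState`, `braking_distance`, and `enforcement_action`, the deceleration of 0.5 m/s², and the 15-second warning window are all hypothetical values chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class TrainState:
    speed_mps: float             # current speed, metres per second
    distance_to_target_m: float  # distance to the next restriction or authority limit


def braking_distance(speed_mps: float, target_speed_mps: float, decel_mps2: float) -> float:
    """Distance needed to slow to the target speed at constant deceleration."""
    if decel_mps2 <= 0:
        raise ValueError("deceleration must be positive")
    if speed_mps <= target_speed_mps:
        return 0.0
    return (speed_mps ** 2 - target_speed_mps ** 2) / (2.0 * decel_mps2)


def enforcement_action(state: TrainState, target_speed_mps: float,
                       decel_mps2: float = 0.5, warning_s: float = 15.0) -> str:
    """Return 'none', 'warn', or 'penalty_brake' (hypothetical labels)."""
    if state.speed_mps <= target_speed_mps:
        return "none"  # already at or below the target speed
    needed = braking_distance(state.speed_mps, target_speed_mps, decel_mps2)
    if state.distance_to_target_m <= needed:
        return "penalty_brake"  # no margin left: the system intervenes
    # Warn while the engineer still has roughly warning_s seconds to react.
    if state.distance_to_target_m <= needed + state.speed_mps * warning_s:
        return "warn"
    return "none"
```

Under these assumptions, a train at 30 m/s approaching a stop target needs 900 m to stop, so the sketch warns inside 1,350 m and applies the penalty brake at 900 m. A real system would also account for grade, train length and weight, and brake propagation delay.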
The PTC system consists of wayside, office, and on-board elements. These elements are linked together through a communications network. This network provides the communication links needed to transmit operational and safety-critical data among the Back Office Server (BOS), the Onboard PTC package, and wayside Employee in Charge (EIC) mobile units for the movement authority.
Figure 3 - Integrated Positive Train Control (Ditmeyer, 2011)
Figure 3 shows the PTC system and its components. The following are descriptions of the major PTC components as presented by Metrolink (Southern California Regional Rail Authority, 2013):
Back Office Server System (BOS)
The PTC Back Office Server is the database containing the speed restriction, track information, and wayside signaling configuration data.
Onboard System
The Onboard System is a combination of software and hardware that monitors and controls train movement and displays train operations information.
Wayside Signal System
The Wayside Signal System includes signal equipment, Wayside Interface Units (WIUs), PTC radios, and GPS antennas. This system communicates with the BOS and the onboard system through the WIUs and PTC radios.
Communication Network Component
The Communication Network consists of a wired and wireless communication network that connects the PTC components together.
Computer-Aided Dispatching System
Computer-Aided Dispatching (CAD) interacts with the BOS to enforce a train’s authorization through the designated segments of track.
As we described in this section, PTC is a range of fully integrated technologies that overlay the existing safety system in a railroad. The successful implementation and operation of PTC depend on the flawless and uninterrupted interaction between the subsystems. Otherwise, the reliability of the whole safety system would be at risk.
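One way to see why the interaction between subsystems matters so much is a simple series-reliability calculation. The sketch below assumes that the five components listed above must all work for PTC to function and that they fail independently; both are simplifying assumptions, and the reliability figures are hypothetical, not measurements from any railroad.

```python
from math import prod


def series_reliability(subsystem_reliabilities: dict) -> float:
    """Reliability of a chain in which every subsystem must work.

    Assumes independent failures, so the overall reliability is the
    product of the individual reliabilities.
    """
    for name, r in subsystem_reliabilities.items():
        if not 0.0 <= r <= 1.0:
            raise ValueError(f"reliability of {name} must be in [0, 1]")
    return prod(subsystem_reliabilities.values())


# Hypothetical figures, for illustration only.
ptc_subsystems = {
    "back_office_server": 0.999,
    "onboard_system": 0.998,
    "wayside_signal_system": 0.997,
    "communication_network": 0.995,
    "computer_aided_dispatching": 0.999,
}

overall = series_reliability(ptc_subsystems)
print(f"overall reliability: {overall:.4f}")
```

Under these assumptions the overall figure is always lower than that of the weakest subsystem, which is the quantitative form of the point that one unreliable link puts the whole safety overlay at risk.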
High Reliability Organizations (HROs) are organizations that manage such risks and conduct relatively error-free operations.
2.2. High Reliability Organization
2.2.1. Hallmarks of High Reliability Organizations
The fundamental characteristics of an HRO foster a culture of trust, shared values, unfettered communication, and process improvement. An HRO nurtures, promotes, and takes advantage of distributed decision-making, “where the buck stops everywhere”. The culture of an HRO is one that anticipates failures within its organization and sub-systems and works diligently to avoid errors and minimize their impact. This preoccupation with the possibility of failure leads to a continual state of ‘mindfulness’ combined with a strong desire to be a ‘learning organization’. HROs actively seek to learn what they do not know, design systems to make relevant knowledge relating to a problem available to everyone in the organization, learn rapidly and efficiently, train staff to recognize and respond to system abnormalities, empower staff to act, and design redundant (sub-)systems to anticipate problems (N. Meshkati, 2010).
Two key attributes mark high reliability organizations:
1. A chronic sense of unease, i.e., they lack any sense of complacency. For example, they do not assume that because they have not had an incident for 10 years, one will not happen imminently;
2. Strong responses to weak signals, i.e., they set their threshold for intervening very low. If something does not seem right, they are very likely to stop operations and investigate. Consequently, they accept a much higher level of ‘false alarms’ than other organizations.
According to Weick and Sutcliffe (2001), “hallmarks of high reliability”, or major characteristics of HRO while “anticipating and becoming aware of the unexpected”, include:
• Preoccupation with failure
• Reluctance to simplify interpretations
• Sensitivity to operations.
In addition, when the “unexpected occurs”, HROs attempt to contain it by:
• Commitment to resilience
• Deference to expertise.
Not all of these characteristics apply to all organizational processes. The following provides a more detailed description of each HRO characteristic and the situations in which it applies:
Preoccupation with Failure
“HROs have a mindset of chronic wariness. Hubris is the enemy of system reliability” (Earl Carnes). “Hubris” has devastating effects on system safety; any lapse is a symptom that something may be wrong with the system; near misses provide opportunities to improve; and error reporting is highly encouraged.
Reluctance to Simplify Interpretations
Given the complex nature of work, HROs accept that systems can fail in ways that have never happened before, and that it is not possible to identify all the ways systems will fail in the future. Moreover, failures or near misses do not necessarily result from a single and simple cause. The work context constantly changes, meaning there is no such thing as “routine” work. Different situations require alertness, sensitivity, and a good dose of educated and improvised problem-solving capability.
Sensitivity to Operations
HROs have deep knowledge of the technology and management systems they operate, pay close attention to the front line where the actual work is done, and are aware of emerging local operating practices. HROs recognize that systems are not necessarily deterministic, orderly, stable, or routine, but are rather dynamic, complicated, and the result of continuous social construction.
Commitment to Resilience
HROs can detect, contain, and rebound from unexpected events. An HRO is not necessarily error free, but errors do not disable it; the system absorbs or adapts to perturbations and disruptions without fundamental breakdown.
Through fast, real-time communication, feedback, and improvisation, the system can restructure or reconfigure in response to external (or internal) changes or pressures. Worst-case scenarios are always imagined, modeled, and rehearsed.
Deference to Expertise
In HROs, expertise is distributed and the system controller typically defers to the person with the expertise relevant to the issue they are confronting. An expert is not necessarily the most experienced or the highest ranked person; it is usually someone at the “sharp end”, where the real work is done. In other terms, this characteristic of HRO refers to empowering expert people closest to a problem and shifting leadership to people who have the answer to the problem at hand.
2.2.2. Implementation of HRO Principles
2.2.2.1. System Views of HRO
The HRO framework has been used for reflective learning in a variety of technical domains, from the nuclear defense work of the Department of Energy’s Pantex plant to healthcare and transportation. These organizations are socio-technical systems. Madsen (2011) divides the structural elements of these organizations into two categories: 1) people and 2) technological components and information networks. Organizations as socio-technical systems are complex by nature. Perrow (1984) classifies organizations and systems based on their degree of complexity and on the degree to which organizational elements are integrated. Perrow defines interactively complex systems as systems in which two or more discrete failures can interact in unexpected ways that potentially have a complex and unpredictable effect on the overall system. The complexity of a system influences the intensity and nature of its accidents. In a complex system, accidents are driven by unplanned and unexpected interactions between system components, which make it extremely difficult to predict and prepare for accidents (Leveson, Dulac, Marais, & Carroll, 2009).
The application of HRO starts with deep knowledge of the technologies that are used in the organization, and a clear understanding of roles and responsibilities. These organizations foster a culture that values diversity and encourages productive collaboration. The concept of complex adaptive systems enables organizational change in the system and is internalized in an HRO through systems, processes, culture, and education (Carnes E., 2010). Resilience and reliability are achieved at both the meso and macro levels through systems behavior, culture, and cognition (Carnes W. E., 2011) (Figure 4). HROs develop processes that support incident reporting, risk profiling, lessons learned, investigation, and rewards recognition. The culture of the organization supports leadership and engagement. They encourage systems thinking and promote situational awareness and mindfulness (DOE Office of Corporate Safety Analysis, 2011). Although merely adopting HRO characteristics in an organization does not guarantee a safety culture, it could create a mindful environment where the traits of a healthy safety culture could prosper, people are empowered to participate in decision-making processes, and the organization is more resilient in the face of unexpected challenges.
Figure 4 - HRO Meta Models (Carnes W. E., 2011)
[Figure labels: Culture, relationships, Behavior, Processes, Resilience, Learning, adaptability, Cognition, Mental Model]
HROs strive to minimize the gap between “work-as-imagined” and “work-as-done” (Figure 5). These organizations develop detailed operation procedures and, due to their safety-sensitive nature, strive to perform within the boundaries and safety limits. However, it is a known fact that there are discrepancies between plans and the actual performance of the sub-systems (∆W # ). ∆W # represents “what” is not working that puts us out of the physics-based safety basis, and it is determined using Causal Factors Analysis (CFA).
The “why” in ∆W # is determined using some of the organizational and culture investigative CFA tools. The goal is to design, implement, and manage work and processes in a way that minimizes this gap (Hartley, High Reliability Organizations and Practical Approach, 2011).
Figure 5 - "Work-as-imagined" vs. "Work-as-done" in non-HROs (Hartley, 2011)
One approach to HRO implementation was developed and implemented at Pantex. This approach, which was adopted from Dr. Deming’s System of Profound Knowledge (SoPK) (W. Edwards Deming Institute, 2014), introduces four HRO practices to deliver high reliability in an organization (Hartley, 2014):
Manage the System not the Parts
It is leadership’s role to align organizational performance and ensure that the system provides safety, security, and quality. Leaders also evaluate variability and safety, foster a culture of reliability, and promote organizational learning.
Reduce Variability in HRO System
To deliver high reliability, we need to identify the threats and deviations from normal procedures that could result in an undesirable event and put appropriate controls in place. The leadership communicates the safety objectives of the system within the organization and assigns priority to operations. The Break-the-Chain framework (DOE, 2012) and the Swiss Cheese model (Reason, 2000) are tools that help the organization find the vulnerabilities of the system and minimize and mitigate the risk of accidents by implementing redundancy in system safety. No one failure, human or technical, is sufficient to cause an accident. Accidents often involve an unforeseeable combination of several contributing factors from different levels of the system (EUROCONTROL Experimental Center, 2006).
Foster Strong Culture of Reliability
Management should enable employees to make conservative decisions, ensure proficiency through hands-on work, and encourage open questioning of, and challenges to, the safety system.
To sustain an HRO, we should foster a strong culture of reliability.
Learn and Adapt as an Organization
The final step is to learn as an HRO, or in other words to create a cycle of learning throughout the organization. HROs also share information and learn from others’ mistakes.
2.2.2.2. Decision Making at HROs
As stated before, organizations are classified based on the degree of complexity and on the degree to which their organizational elements are integrated. Turner’s (1978) Disaster Incubation Model (DIM) explains how poor decisions occur and accumulate in these organizations. This model has six steps: 1) Starting point, 2) Incubation period, 3) Precipitating event, 4) Onset, 5) Rescue and salvage, and 6) Full cultural readjustment. The first four steps refer to the emergence of poor decision-making. The starting point in an organization involves culturally accepted views of the organization’s processes, rules, laws, codes, and procedures designed to ensure safety and accident prevention. The incubation period is a series of events that are at odds with the organization’s norms and beliefs. These discrepant events represent times during which organizations could detect and change the flaws of their models and procedures (Roberts, Madsen, & Desai, 2005). In organizations with a high probability of risk these incubating events are neglected or misunderstood, and they could be an indicator of latent failures in the system. Latent failures usually are not noticed until they interact with some triggering event; therefore, they weaken an organization’s defense system before any significant accident happens. To detect these events, HROs build redundancy into the system, or create other processes to improve decision-making. Recognizing anomalies or latent errors before the crisis is key to effective decision making that achieves high reliability (Roberts, Yu, Desai, & Madsen, 2009).
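The value of redundancy against latent failures, which is also the intuition behind the Swiss Cheese model cited in Section 2.2.2.1, can be made concrete with a short calculation. The sketch below assumes that each defensive layer fails independently with a known probability; real defenses are often correlated, so the result should be read as an optimistic bound. The function name and the probabilities are hypothetical.

```python
from math import prod


def aligned_failure_probability(layer_failure_probs: list) -> float:
    """Probability that every defensive layer fails at the same time.

    Assumes the layers fail independently of one another.
    """
    if not layer_failure_probs:
        raise ValueError("at least one defensive layer is required")
    for p in layer_failure_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each probability must be in [0, 1]")
    return prod(layer_failure_probs)


# Hypothetical layers: engineer vigilance, signal system, PTC enforcement.
two_layers = aligned_failure_probability([0.01, 0.05])
three_layers = aligned_failure_probability([0.01, 0.05, 0.02])
print(two_layers, three_layers)
```

Each added layer multiplies the chance of a complete breach by a factor below one, which is why no single failure is sufficient to cause an accident. A latent failure that silently raises one layer's failure probability toward one removes that protection without any visible sign, which is why early detection of anomalies matters.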
HROs aim to empower the employees, especially experts in different technical areas. Experts are not necessarily those with the greatest experience in the organization but those who have the best knowledge of the task at hand. Empowerment involves the decentralization of decision-making authority and responsibility, and it purportedly improves organizational flexibility by permitting more localized adjustments (Bigley & Roberts, 2001). In other words, when a hazardous situation arises, HROs flatten their command structure and give the person with more expertise more authority to make decisions. This way they expedite the decision-making process, which is a crucial factor in a successful emergency response. The other reason that HROs encourage empowering and decentralizing decision-making is that decision errors made at the top of an organizational hierarchy can ramify and intensify as they move down the authority structure. Higher-level errors can induce and interact with lower-level errors, resulting in an increasingly complex and difficult-to-understand situation (Bigley & Roberts, 2001).
2.2.2.3. Safety Culture at HROs
Over time, organizations learn how to approach and eliminate visible and routine problems, and the positive feedback that they receive creates a culture that directly influences organizational performance. The same concept does not apply to high-risk operations, since in these organizations risks are not clear. Studies show that serious events are often the result of systemic failures, human errors, or organizational weaknesses. Some of these factors may seem inconsequential when evaluated in isolation (IAEA, 2012).
According to an analysis of high-reliability systems, such as flight operations on aircraft carriers, by Weick and Roberts (1993), a culture that encourages individualism, survival of the fittest, macho heroics, and can-do reactions is often counterproductive and accident-prone.
Furthermore, interpersonal skills are not a luxury, but a necessity in high-reliability organizations. The culture of an HRO is one that anticipates failures within its organization and sub-systems and works diligently to avoid errors and minimize their impact. This preoccupation with the possibility of failure leads to a continual state of ‘mindfulness’ combined with a strong desire to be a ‘learning organization’ (Weick, Sutcliffe, & Obstfeld, 2008). HROs communicate by paying attention to system interfaces; organizational culture and flexibility are major factors in risk mitigation in large-scale complex organizations (Grabowski & Roberts, 1996). However, to encourage a culture of high reliability and mindfulness, people within an organization should believe that “their leaders are genuinely committed to safe operations and have taken appropriate measures to communicate safety principles and ensure adherence to safety standards and procedures” (Zohar, 1980). Studies show that employees’ individual perceptions of safety practice are related to the organization’s safety performance (Desai, Roberts, & Ciavarelli, 2006). Research on organizational culture and safety has outlined five organizational processes that are useful in developing HROs (Wong, Desai, Madsen, Roberts, & Ciavarelli, 2005):
1. Develop a system of process checks to spot expected and unexpected safety problems.
2. Develop a reward system to incentivize proper individual and organizational behavior.
3. Avoid degradation of current processes or development of inferior processes.
4. Develop a good sense of risk perception.
5. Develop a good organizational command and control structure.
Achieving high reliability is a journey. Merely implementing the HRO characteristics does not guarantee high reliability. The US Department of Energy published a report on the Assessment of Nuclear Safety Culture at the Pantex Plant in November 2012.
The report stated that although the organization had been trying to communicate and implement HRO principles for years, it failed to internalize those principles due to a lack of effective communication and the absence of a learning organization and long-term safety solutions. This report highlighted the importance of a “healthy safety culture” to internalize HRO principles and foster a culture of respect and trust within the organization (DOE, 2012).
On June 14, 2011, the US Nuclear Regulatory Commission (NRC) issued its final Safety Culture Policy Statement. In this statement, Safety Culture refers to “an organization’s collective commitment, by leaders and individuals, to emphasize safety as an overriding priority to competing goals and other considerations to ensure protection of people and the environment” (NRC, 2011). The NRC introduced nine traits of a positive safety culture: 1) Continuous Learning, 2) Problem Identification and Resolution, 3) Work Processes, 4) Environment for Raising Concerns, 5) Personal Accountability, 6) Effective Safety Communication, 7) Questioning Attitude, 8) Leadership Safety Values and Actions, and 9) Respectful Work Environment. In April 2013, the Institute of Nuclear Power Operations (INPO) published a report on Traits of a Healthy Nuclear Safety Culture (INPO, 2013). This report was built on the NRC statement, and added “decision making” as the 10th trait of a safety culture.
Figure 6 - HRO Characteristics vs. INPO Traits of a Healthy Safety Culture
We mapped HRO principles and safety culture traits side by side to better illustrate their relationship. The comparative analysis conveys that, although there is not an exact linear relationship between the two, there exists a strong positive relationship. In other words, improving one set of features would enhance and improve the other, and vice versa (Figure 6).
[Figure 6 labels. NRC and INPO Traits of a Healthy Safety Culture: Problem Identification and Resolution; Environment for Raising Concerns; Questioning Attitude; Work Processes; Effective Safety Communication; Continuous Learning; Respectful Work Environment; Leadership Safety Values and Actions; Personal Accountability; Decision-Making. HRO Characteristics: Preoccupation with Failure; Reluctance to Simplify Interpretations; Sensitivity to Operations; Commitment to Resilience; Deference to Expertise. Additional label: Reliability Seeking Organization.]
Organizational culture is the organizational common knowledge that has been acquired through learning (Bierly & Spender, 1995). The most practical approach to learning is trial and error; however, it is not feasible in high-risk organizations due to their complexity, tight coupling, and dangerous outcomes. Organizations often learn as much about themselves and their internal relationships as they learn about the critical event itself.
2.2.3. High Reliability Organizing and Resiliency
The Presidential Policy Directive (PPD) (Office of the Press Secretary, 2013) defines resilience as the ability to “prepare for and adapt to changing conditions and withstand and recover rapidly from disruptions.” This is similar to the generic definition of resiliency as “the power or ability to return to the original form, position, etc., after being bent, compressed, or stretched; elasticity.” Without understanding the vital role of human and organizational factors in technological systems and proactively addressing and facilitating their interactions during unexpected (“beyond design basis”) events, recovery will be a sweet dream and resiliency will only be an unattainable mirage.
A High-Reliability Organization is a resilient organization.
These organizations are ready to respond to unforeseen events by fostering characteristics like flexibility, creativity, and spontaneity, which are filtered through individuals’ capacity to perceive, understand, and make sense of events (Grøtan, Størseth, Rø, & Skjerve, October 2008). Sense making is one of the main characteristics of HROs. Studies show that HROs strive to develop the ability to identify situations that have the potential to evolve into safety-critical situations by learning from previous events (Dekker & Woods, 2010). Experience provides individuals with a valuable pool of information and knowledge to draw on when engaging in pattern recognition, which could consequently enable them to identify advantage points to create a successful improvised solution (Trotter, Salmon, & Lenne, 2014).
Complex and safety-critical organizations emphasize order, control, reliance, and routine to reduce the probability of errors, which can suppress creativity and innovation when they are faced with an unexpected situation. Improvisation in such organizations could be affected by the “chronic temptation to fall back on well-rehearsed fragments to cope with current problems even though these problems don’t exactly match those present at the time of the earlier rehearsal” (Weick K., 1998, p. 551). Ambiguity triggers innovation. If individuals and organizations shy away from ambiguity in the workplace and relationships, they will only be able to reproduce routine actions (Ahmed, 1998). “Requisite imagination” is a required principle for a resilient organization (Grøtan, Størseth, Rø, & Skjerve, October 2008).
Furthermore, it has been empirically validated that experts in high-stress, demanding situations do not usually operate using a process of analysis. Even their rules of thumb are not readily subjected to it, whereas most of the existing artificial intelligence-based automated systems always rely on an analytical decision process.
If operators of complex systems rely solely on a computer's analytic advice, they would never rise above the level of mere competence -- the level of analytical capacity -- and their effectiveness would be limited by the inability of the computer systems to make the transition from analysis to pattern recognition and other more intuitive efforts (Dreyfus & Dreyfus, 1986).
A study of two recent noteworthy cases, the astonishing 2009 emergency water ‘landing’ and safe evacuation of US Airways Flight 1549 and the restoration of the Fukushima Daini Nuclear Power Station after the 2011 Tōhoku earthquake and tsunami, concluded that front-line operators’ improvisation via dynamic problem solving and reconfiguration of available resources provides the last resort for preventing a total system failure. Despite advances in automation, operators should remain in charge of controlling and monitoring safety-critical systems. Furthermore, at the time of a major emergency, operators will always constitute society’s first and last layer of defense, and it is eventually their improvisation and ingenuity that could save the day (Meshkati & Khashe, 2015).
2.2.4. Application of HRO in Railroads
The main goal of organizations operating in the mass transportation industry is the safety of their passengers, and railroads are no exception. Passenger safety on railroads could be achieved by a clear definition of the essential fundamentals of a safe operation and by forbidding operation outside this definition (Hale & Heijer, 2006). To the best of our knowledge and based on extensive research, HRO has not been implemented in railroads as much as in the healthcare, nuclear, and chemical-processing industries, nor has there been a system that was designed with HRO in mind. Most studies focus on transforming an organization into an HRO, while it would be more productive to design highly reliable organizations.
One of the few studies in this field divides reliability failures into two main categories: organizational vulnerability to disaster at any particular time, and the gradual degradation processes leading organizations into vulnerable states (Busby, 2006). Failures in complex systems include equipment failure as well as human error. Traditional risk/reliability studies assumed that the majority of system failures were due to hardware failures, but it has been found from the accident history that human error causes 20–90% of all major system failures (Verma, Ajit, & Karanki, 2010). In railroads, as an example of a complex, high-risk system, human factor related errors caused 39% of the accidents (Federal Railroad Administration, 2014), and introducing a new technology in this system would affect organizational, technical, and human elements of the system.

In his study of reliability-seeking organizations, Busby (2006) analyzed two railroad accidents in the UK. The first was the Ladbroke Grove disaster in 1999, in which a local train collided with an express at a closing speed of about 130 mph, killing 31 people and injuring 416. The second was the Clapham Junction disaster in 1988, which involved a collision of two passenger trains and the subsequent collision of one of these with an empty train.

Figure 7 - A general model of limited reliability seeking (Busby 2006, p. 1390)

The research results suggested that there are two co-existing elements responsible for the failure of the railroad system: Requisite Processes (reliability seeking) and Limiting Processes (reliability confounding). The results indicate that limiting processes undermine the reliability-seeking efforts of requisite processes. Busby further explains that there are two systems within each organization: 1) the System-of-concern, which is the system that is the object of reform (the system that fails catastrophically), and 2) the System-of-reform, which is the system comprising the processes of reform.
Busby also introduces the model of limited reliability seeking (Figure 7). This model shows that reliability-seeking organizations create and involve processes of systematic reform that are limited by organizational conditions (Busby, 2006). Although HRO studies in railroads are limited, especially compared to the extensive work that has been done in aviation, nuclear safety, and health care, we can apply the lessons learned from these industries because of the similarities in organizational structures and risk factors.

2.2.5. New Technology Implementation in High-risk Organizations

High-risk organizations are inherently complex and depend on the latest technologies to survive and function properly. Therefore, introducing a new technology to such an organization is inevitable. Studies show that the installation of new technology always involves some changes to the organization and its members. Therefore, organizational and technological change must be considered simultaneously. Technological change often requires some organizational changes, and it will fail if both organizational and technological changes are not effectively integrated and managed to achieve alignment (Majchrzak & Meshkati, 2001).

Ann Majchrzak and Najmedin Meshkati, Professors at the University of Southern California, conducted a research study on effective management of simultaneous change in technology and organizational design. The results of this study have been published as chapter 36 of the Handbook of Industrial Engineering (2001). The results show that the practicing engineer should have the following four points in mind when designing for organizational and technological alignment:
1. Existence of clear relationships between technological and organizational changes.
2. Introducing a new technology is equal to the introduction of a Technological, Organizational, and People (TOP) change.
3.
Prior to the implementation of a new technology, all the possible combinations of TOP changes should be identified and studied.
4. To successfully implement the TOP changes, change strategies must be thoughtfully planned and executed.

As Bea et al. (2009) pointed out, the system modeling approach should be interdisciplinary, especially in the case of a high-risk organization. The same applies to efforts taken by the organization to implement a new technology. Sometimes, the technical aspects of the implementation can overshadow the human, social, and organizational elements. However, studies show that this approach could negatively affect the outcome of the implementation. According to Peter Unterweger of the UAW Research Department, the success of technological implementations can be attributed to:
• Hardware playing a subordinate role to organizational or human factors and
• Developing the technical and organizational systems in step with one another.

In a study of 2,000 U.S. firms implementing new office systems, less than 10% of the failures were attributed to technical failures; the majority of the reasons given were human and organizational in nature (Long, 1989).

A failed technology investment costs an organization considerable time, effort, and money, especially in a high-reliability organization, which requires sophisticated and expensive tools. An example of an unsuccessful implementation of a new technology is the London Ambulance Service (LAS) Computer Aided Dispatching (CAD) software failure. LAS and PTC share similarities in system elements and dynamics. Both were technological upgrades intended to improve the reliability of already existing dispatching systems and, aside from the CAD software, both require reliable human and computer interaction.
On October 26, 1992, after 36 hours of operation, the London Ambulance Service suffered the failure of the newly implemented CAD system, which brought its operations to a virtual standstill. Although there was not enough evidence to prove it, the media reported that up to 20-30 people may have died as a result of ambulances arriving too late on the scene (Finkelstein & Dowell, 1996). The report indicates that on 26 and 27 October 1992 the computer system itself did not fail in a technical sense. Response times did on occasions become unacceptable, but overall the system did what it had been designed to do. However, much of the design had fatal flaws that would and did cumulatively lead to all of the symptoms of systems failure (South West Thames Regional Health Authority, 1993, p. 5). What is clear from the Inquiry Team's investigations is that neither the CAD system nor its users were ready for full implementation.

The purpose of implementing a new technology is to improve the effectiveness of the work done in the organization. However, to achieve that goal, organizations need to reestablish boundaries between different working groups in the organization and rethink the way they interact with one another (Black, Carlile, & Repenning, 2004). Adaptive Structuration Theory (AST) suggests that the technology has structures in its own right but that social practices moderate their effects on behavior (DeSanctis & Poole, 1994). In other words, causality runs in both directions: technology influences the patterns of human activity, and the technology changes as it is modified in the course of daily activity. Therefore, it is critical to understand the relationship between technological and social factors over time. As Barley (1986) observes, “The complexity and uncertainty are functions of how the machine merged with the social system; they are not attributes of the machine itself.”

2.3.
Risk and Reliability Analysis in Railroads

Risk management uses different approaches and strategies to address both the likelihood and the consequence(s) of failure (Bea, 2011). According to Bea (2002) there are three general categories of risk management approaches:
1. Proactive - before activities are carried out
2. Reactive - after activities are carried out
3. Interactive - during performance of activities

The railroad industry is no exception. The Federal Transit Administration (FTA) has published a report providing guidelines for hazard analysis of transit projects (2000). The general concept of the framework is the same in most industries and studies (Harold & Moriarty, 1990; Bea, 2011; DOE, 2012), and it closely follows the Probabilistic Safety Assessment (PSA) steps. The report identifies five steps for a hazard resolution process: 1) Define the physical and functional characteristics of the system and evaluate people, procedures, and the environment; 2) Identify hazards, undesired events, and their causes; 3) Assess hazards and determine their severity and probability, and decide whether to accept or control the risk; 4) Resolve hazards by either assuming risk or implementing corrective actions; and 5) Follow up by monitoring effectiveness and controlling the hazard. Figure 8 illustrates the PSA procedure.

Figure 8 - PSA Procedure (Verma, Ajit, & Karanki, 2010, p. 327)

Approaches that could be used for hazard identification include:
• Preliminary Hazard Analysis (PHA)
• Hazard Operability Analysis (HAZOP)
• Failure Modes and Effects Analysis (FMEA)
• Failure Modes and Effects Criticality Analysis (FMECA)

After identifying the initiating event, the next step is to develop a model to simulate the initiation of an accident and the response of the system. A typical accident sequence consists of an initiating event, specific system failure/success, human errors, and associated consequences.
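As a rough illustration of such an accident sequence, the following Python sketch enumerates every combination of barrier success and failure after one initiating event, in the style of a small event tree. The initiating-event frequency, the two barriers, and their failure probabilities are invented for illustration only and are not taken from any railroad data; the barriers are also assumed to fail independently.

```python
from itertools import product


def enumerate_sequences(initiating_frequency, barriers):
    """Return (failed_barriers, frequency, consequence) for every path.

    barriers is a list of (name, probability_of_failure) pairs.
    Barriers are assumed independent, which is a simplification.
    """
    sequences = []
    # Each barrier either works (False) or fails (True).
    for outcome in product((False, True), repeat=len(barriers)):
        frequency = initiating_frequency
        failed = []
        for (name, p_fail), has_failed in zip(barriers, outcome):
            frequency *= p_fail if has_failed else 1.0 - p_fail
            if has_failed:
                failed.append(name)
        # Hypothetical rule: the accident occurs only if every barrier fails.
        consequence = "collision" if all(outcome) else "train stopped safely"
        sequences.append((tuple(failed), frequency, consequence))
    return sequences


# Hypothetical inputs: initiating events per year and barrier failure probabilities.
INITIATING_FREQUENCY = 0.5
BARRIERS = [("automatic enforcement", 0.01), ("crew response", 0.10)]

sequences = enumerate_sequences(INITIATING_FREQUENCY, BARRIERS)
accident_frequency = sum(f for _, f, c in sequences if c == "collision")

for failed, frequency, consequence in sequences:
    print(failed, round(frequency, 6), consequence)
```

The path frequencies always add up to the initiating-event frequency, which is a convenient sanity check when the tree is built by hand.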
Accident sequence modeling can be divided into three categories: event sequence modeling, system modeling, and consequence analysis. An event sequence model shows the sequence of events that follow an initiating event and lead to different outcomes. Event Tree Analysis (ETA) is a useful tool for identifying the sequence of events. System modeling is used to quantify system failure probabilities and accident occurrences. Fault Tree Analysis (FTA) is a common tool in hazard analysis for identifying the root causes of the problems. Consequence analysis determines the expected severity of each occurrence generated in the previous model (Verma, Ajit, & Karanki, 2010).

2.3.1. Preliminary Hazard Analysis (PHA)

Preliminary Hazard Analysis (PHA) is one of the first efforts in hazard analysis. It is first conducted during the design phase of the system life cycle and repeated frequently afterwards (Harold & Moriarty, 1990). PHA identifies critical areas, hazards, and criteria in the system. It studies hazardous components, interfaces, environmental constraints, as well as operating, maintenance, and emergency procedures. PHA verifies that corrective and preventive measures are taken into consideration in different aspects of the system. It could also provide inputs for further analysis such as FMEA (Federal Transit Administration, 2000).

2.3.2. Hazard Operability Analysis (HAZOP)

Hazard Operability Analysis (HAZOP) is a qualitative approach to identify hazard and operability problems. This method is used worldwide for studying not only the hazards of a system, but also its operability problems, by exploring the effects of any deviations from design conditions (Dunjó, Fthenakis, Vílchez, & Arnaldos, 2010). HAZOP follows the general risk analysis guideline and has four phases of analysis: 1) definition of the system, 2) preparation for the study, 3) analysis and examination by dividing the system into its basic components, and 4) documentation and follow up.
In this method, an interdisciplinary team examines the existing processes and operations to identify their possible deviation from the intended design and to evaluate the problems that might represent risks to personnel or equipment (Vinnem, 2007).

2.3.3. Failure Modes and Effects Analysis (FMEA)

Failure Modes and Effects Analysis (FMEA) is a reliability analysis with a bottom-up approach to risk. The focus of the analysis is on an event that could cause a state of unreliability or hazard in the system (Harold & Moriarty, 1990). FMEA enables us to determine the results or effects of subsystem failures on a system operation and to classify potential failures according to their severity (Federal Transit Administration, 2000). FMEA has three main approaches: Process FMEA (p-FMEA), Design FMEA (d-FMEA), and System FMEA. These methods could be used in different contexts and situations, including design of a new process, changing an existing process, carrying over processes for use in new applications/environments, and preventing a recurrence of a failure. We use p-FMEA when we have a basic knowledge of the process. System FMEA is conducted when system functions are defined, but prior to choosing specific hardware. d-FMEA is used after product functions are defined, but prior to agreeing to the design and releasing it to manufacturing (Smith, 2010).

It is more beneficial if FMEA is used in the earlier stages of the design, but it is never too late to use this tool. To perform an FMEA, the following process should be implemented (Federal Transit Administration, 2000):
• Identify all major system components, functions, and processes
• Determine consequences of interest
• Determine the potential failure modes of interest
• Specify effects of failures of the system
• Identify safety provisions to control hazards and failures
• Identify detection methods for failures
• Establish overall significance of each failure.
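One common way to establish the overall significance of each failure is a Risk Priority Number (RPN), the product of severity, occurrence, and detection ratings. The sketch below ranks a few failure modes by RPN. The 1 to 10 rating scale, the failure-mode names, and every rating are assumptions made for illustration; they are not results from this research.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    name: str
    severity: int    # assumed scale: 1 (negligible) to 10 (catastrophic)
    occurrence: int  # assumed scale: 1 (rare) to 10 (frequent)
    detection: int   # assumed scale: 1 (easily detected) to 10 (hard to detect)

    def rpn(self) -> int:
        """RPN = severity x frequency of occurrence x likelihood of detection."""
        for rating in (self.severity, self.occurrence, self.detection):
            if not 1 <= rating <= 10:
                raise ValueError(f"rating {rating} is outside the 1-10 scale")
        return self.severity * self.occurrence * self.detection


def rank_by_rpn(modes):
    """Return the failure modes sorted from highest to lowest RPN."""
    return sorted(modes, key=lambda mode: mode.rpn(), reverse=True)


# Hypothetical failure modes with invented ratings.
modes = [
    FailureMode("invalid speed reading", severity=7, occurrence=4, detection=3),
    FailureMode("navigation fault", severity=6, occurrence=3, detection=2),
    FailureMode("software download failure", severity=5, occurrence=5, detection=4),
]

ranked = rank_by_rpn(modes)
for mode in ranked:
    print(mode.name, mode.rpn())
```

Because the three ratings are multiplied, a failure that is moderately severe but frequent and hard to detect can outrank a more severe failure that is rare and easily detected, which is why the ranking is usually reviewed alongside the raw severity ratings.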
The FMEA will provide Risk Priority Numbers (RPN) to identify critical items. It could be an input to safety design criteria to eliminate or control all unacceptable and undesirable hazards (Bencini & Pautz, 2010; Harold & Moriarty, 1990).

RPN = (severity) × (frequency of occurrence) × (likelihood of detection)

2.3.4. Failure Modes and Effects Criticality Analysis (FMECA)

Failure Modes and Effects Criticality Analysis (FMECA) is a subcategory of FMEA. FMECA is used to identify critical components that are most important to the state of reliability by charting the probability of failure modes against the severity of their consequences (Harold & Moriarty, 1990).

2.3.5. Operating Hazard Analysis (OHA)

Operating Hazard Analysis (OHA) is used to identify and analyze hazards associated with personnel and procedures during production, installation, testing, training, operations, maintenance, and emergencies. This method is similar to FMEA but more focused on human actions and interaction with the system. When a potential hazard is identified, its mitigation possibilities are analyzed (Federal Transit Administration, 2000). The goal of OHA is to provide corrective or preventive measures that could be taken to minimize the possibility of human error or procedures that would result in injury or system damage. The findings of OHA could be used as inputs to future improvement and safety practices (Federal Transit Administration, 2000).

2.3.6. Fault Tree Analysis (FTA)

Fault Tree Analysis (FTA) is one of the most powerful analytic tools for analyzing sets of events arranged in the system (Harold & Moriarty, 1990). This method has a top-down approach to risk assessment and has four major advantages over other forms of system analysis:
1. It directs the analyst to accident-related events
2. It provides a depiction of system functions
3. It provides options for both qualitative and quantitative analyses
4. It provides the analyst with insight into the system behavior.
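To make the quantitative side of FTA concrete, the sketch below combines basic-event probabilities through AND and OR gates to estimate a top-event probability. The tree structure and all probabilities are hypothetical, chosen only to show the arithmetic, and the basic events are assumed to be statistically independent.

```python
from math import prod


def and_gate(probabilities):
    """Output event occurs only if all independent input events occur."""
    return prod(probabilities)


def or_gate(probabilities):
    """Output event occurs if at least one independent input event occurs."""
    return 1.0 - prod(1.0 - p for p in probabilities)


# Hypothetical basic-event probabilities per train run.
p_track_defect = 1e-4        # a track defect is present
p_inspection_miss = 0.1      # inspection fails to find the defect
p_overspeed = 5e-4           # train exceeds the speed limit
p_enforcement_fails = 0.02   # speed enforcement does not intervene

# Intermediate events: each cause needs both of its inputs (AND gate).
p_track_cause = and_gate([p_track_defect, p_inspection_miss])
p_speed_cause = and_gate([p_overspeed, p_enforcement_fails])

# Top event: either cause is sufficient (OR gate).
p_top_event = or_gate([p_track_cause, p_speed_cause])
print(p_top_event)
```

Reading the same tree qualitatively, without any numbers, gives the combinations of basic events that are sufficient to cause the top event, here {track defect, inspection miss} and {overspeed, enforcement fails}.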
FTA is usually performed during the beginning of the final design (Federal Transit Administration, 2000). Figure 9 illustrates the first level fault tree analysis for train derailment. FTA is further explained in the next chapter.

Figure 9 - First level fault tree analysis for train derailment (Hartong, Goel, & Wijesekera, 2011, p. 316)

2.3.7. Event Tree Analysis (ETA)

Event Tree Analysis (ETA) provides all possible outcomes of a potential failure. Event trees model the accident sequences, whereas fault trees model individual system responses (Verma, Ajit, & Karanki, 2010). FTA is a logic diagram. Unlike FTA, ETA starts from an initiating event and explores all possible outcomes. Figure 10 shows an example of an event tree.

2.3.8. Human Reliability Analysis (HRA)

Failures in complex infrastructure systems (CIS) include equipment failure as well as human error. Traditional reliability studies assumed that the majority of system failures were due to hardware failures, but it has been found from the accident history that human error causes 20–90% of all major system failures (Verma, Ajit, & Karanki, 2010). In railroads, as an example of a complex, high-risk system, 39% of the accidents are due to human factor causes (Federal Railroad Administration, 2014).

Figure 10 - Event Tree (Verma, Ajit, & Karanki, 2010, p. 333)

The purpose of Human Reliability Analysis (HRA) is to estimate the likelihood of the occurrence of human errors or failures, and to take action to prevent hazardous events or their triggering. It should be noted that the term human error means that an action adversely influenced safety, not that the humans involved were necessarily at fault. Figure 11 illustrates how the environment could potentially induce human errors and contribute to weakening the safety defenses (Federal Railroad Administration, 2003).
It is preferable that HRA is implemented early in the design stage; however, it could be used to change the design or investigate its functionality after the implementation (Oxstrand & Boring, 2010).

Figure 11 - Relationship of safety, human error and their influence (Federal Railroad Administration, 2003)

Derailment or train-to-train accidents are examples of top-level hazardous situations in the railroad. These events are controlled (prevented from happening) through defenses or “barriers” implemented in the system, such as train crew complying with the rulebook of operations, the use of the computer-aided dispatch system (CAD), adhering to speed limits, and the application of fail-safe design principles. FRA (2003) identifies four main tasks that need to be performed as part of an HRA:
1. Qualitative Evaluation of Human Factors Issues - Analyze the impact of the current work environment and new technology on human performance.
2. Survey of Databases for HRA Sources - Identify collections of data that may be relevant to the quantification of errors.
3. Quantification - Develop quantitative estimates of the likelihood of the human actions in question.
4. Documentation - To permit review and later understanding of the details of the quantification, all results and processes must be well documented, providing the bases for all estimates.

2.3.9. Comparison of the Risk Analysis Tools

This section provides a summary of the risk analysis tools that were discussed in this chapter. Table 2 shows a comparative evaluation of the seven risk analysis tools. In chapter three we discuss the methodology used in this research. We will examine two studies conducted as part of that methodology, and discuss the tools used for each study in further detail.
Hazard Analysis | Qualitative Capability | Quantitative Capability | Ability to Model HSI | Accommodating PTC
PHA | P | P | O | O
HAZOP | P | O | O | P
FMECA | P | P | O | P
OHA | P | O | P | P
FTA | P | P | P* | O
ETA | P | P | P* | O
HRA | P | P | P | P
* FTA and ETA could be used in combination with other methods
Project Life Cycle columns: Concept Planning, Preliminary Design, Final Design, Implementation, Operation

Table 2 - Comparison of risk analysis tools [in part (Federal Transit Administration, 2000)]

CHAPTER 3. Methodology

3.1. Overview

This chapter introduces the proposed guideline for improving service reliability and reducing service interruption in train operations under PTC by eliminating preventable failures and system variations. The main purpose of this guideline is to provide a systematic approach to identify the human, organizational, and technological factors that influence the reliability of the system, and to address and evaluate these factors using HRO characteristics. This chapter consists of three sections. Section 3.2 focuses on Adaptation of HRO Characteristics for PTC Implementation in the Railroad Industry. Section 3.3 consists of two studies: the first study focuses on the factors that influence a successful PTC performance and the second study investigates train delays due to PTC related issues. Finally, section 3.4 looks at the integration of the previous sections, and introduces a set of checklists to guide the incorporation of HRO principles into PTC operations. The data used in these studies is not disclosed in this dissertation, due to the confidentiality agreement with Metrolink.

3.2. Hypothesis Testing and Data Analysis

3.2.1. Research Question and Data Collection

As stated before, the goal of this research is to answer the following question: How might HRO principles contribute to the improvement of service reliability and the reduction of service interruption in train operations under PTC?

To achieve this goal, first we defined reliability in our system.
Reliability is “the lack of unwanted, unanticipated, and unexplainable variance in performance” (Hollnagel, 2009) [8]. Variance in the performance of train operations under PTC is captured by 1) uninterrupted/successful PTC performance during a train run and 2) train delays due to PTC related issues. The second step was to identify factors that influence the reliability of the system. We were able to acquire two sets of reports for Metrolink train operation under PTC in 2016:
• Daily PTC Run Reports – Daily reports on train operations under PTC
• PTC Delay Report – Delay reports on train schedules under PTC

These reports were used as the bases for the two studies that will be discussed in sections 3.2.2 and 3.2.3. The date ranges of the two sets of reports do not coincide, so for consistency only the data reported between April 2016 and December 2016 were analyzed. The data in these reports were collected from train runs on Metrolink subdivisions in Southern California that are operating completely or partially under PTC (Figure 12 and Figure 13). The following is the list of Metrolink subdivisions:
• Antelope Valley Line (VL)
• Orange County Line (OR)
• Perris Valley Line (PVS)
• Riverside Line (RV)
• San Bernardino Line (SG)
• Ventura County Line (VN)

[8] Erik Hollnagel, 1993, p. 51, adopted by Karl Weick and Kathleen Sutcliffe in “Managing the unexpected” presentation 2005

Figure 12 - Metrolink System Map [9]
Figure 13 - Metrolink PTC Service Map [10]

[9] http://www.metrolinktrains.com/pdfs/Ride_Guide.pdf
[10] https://www.metrolinktrains.com/rider-info/safety--security/positive-train-control/ (Accessed November 24, 2017)

3.2.2. Interviews and Site Visits

To gain a better understanding of how a PTC system and its subcomponents work and interact, and how HRO characteristics could improve PTC system reliability, we conducted several site visits and interviews with PTC and HRO experts.
The following is the list of meetings conducted with Metrolink PTC experts to understand the PTC operation and define the system parameters:
1. A site visit was made to the Metrolink yard at the Union Station in downtown Los Angeles (September 10, 2012).
2. Attendance at a Metrolink board meeting, with a follow-up meeting with Mr. Larry Day (September 27, 2013).
3. Meeting with Fred Jackson and Larry Day at USC (January 14, 2014).
4. Visit to the Metrolink PTC Revenue Service Demonstration (RSD) (February 20, 2014).
5. A site visit was made to the Metrolink Dispatch Center in Pomona, CA, to interview and observe dispatchers to understand Metrolink dispatch operations and the factors that could contribute to dispatch errors (March 5 and 24, 2014).
6. Meeting with Darrel Maxey, Sergio Marquez, Bailey Rod, Luis Carrasquero, and Gail Davis at Metrolink office in Pomona (April 28, 2016).
7. Meeting with Gail Davis and Luis Carrasquero at Metrolink office in Pomona, CA (August 12, 2016).
8. Meeting with Gail Davis at Metrolink office in Pomona, CA (February 10, 2017).

Table 3 provides a summary of the interviews. It should be mentioned that some of the information was redacted to protect the privacy and integrity of the Metrolink operations.

Visits and Interviews

Visit: PTC Roll Out, September 10, 2012
Observations:
• PTC - Putting Technology in Charge: The investigation reports of the Chatsworth accident [11], which resulted in the Rail Safety Improvement Act of 2008 (RSIA), found human error to be the main cause of the accident. In the Metrolink PTC roll out ceremony, one of the officials referred to PTC as “Putting Technology in Charge,” which underestimates the complexity of human-technology interaction in railroad safety.
• Be the first to implement PTC: Metrolink officials said that they want to be the first to implement and roll out PTC in the nation. [12] They are determined to complete the project in early 2013, which might result in incomplete implementation of technology.
Metrolink identified the development and implementation of new software and configuration management system and Computer Aided Dispatch (CAD) as two of their main challenges. [13]
• Cab Design: PTC onboard equipment, designed by Parsons [14], is consistent with the old layout.
• Mapping System Updates: In order to update the route map, Metrolink had to use a mobile operation unit to update maps by traveling the routes.

Visit: Metrolink Board Meeting, September 27, 2013
Observations:
Metrolink has strived to be the first railroad organization in the nation to implement and roll out PTC. In May 2012, Metrolink identified the development and implementation of new software and configuration management system and Computer Aided Dispatch (CAD) as two of their main challenges. [15] More than a year has passed and Metrolink officials still identify these factors, Development of a stable PTC-Compatible CAD and Software and Configuration Management System, as two of the main “hurdles” on the way to achieving their goal. [16]
Another challenge that Metrolink is facing today is the interoperability of the PTC system and technology with other railroads. Metrolink is installing I-ETMS (6.3.8), which is a revision of the I-ETMS (6.3.7) that they originally implemented to ensure interoperability with BNSF and UP. The proper implementation of this system could be challenging and time-consuming. However, it is crucial to both compatibility with other railroads and overall safety of the operation, and it should not be compromised in favor of time.

Table 3 - PTC Visits and Interviews

[11] “The NTSB also cited the lack of a positive train control system (PTC) as a contributing factor in the accident. A positive train control system would have stopped the Metrolink train short of the red signal, thus preventing the accident. "This accident shows us once again that the safety redundancy of PTC is needed now," Hersman said. "It can and will save lives even when operators ignore safety rules or simply make mistakes.” http://web.archive.org/web/20100210075446/http://www.ntsb.gov/pressrel/2010/100121.html
[12] Officials said that they would roll out the PTC system before any other railroad in the country. http://www.scpr.org/news/2012/09/10/34205/metrolink-unveil-state-art-rail-technology-designe/
[13] Metrolink PTC factsheet, http://www.metrolinktrains.com/pdfs/Agency/PTC_Fact_Sheet_1.pdf
[14] http://www.parsons.com/projects/Pages/metrolink-ptc.aspx
[15] Metrolink PTC factsheet, updated May 2012 http://www.metrolinktrains.com/pdfs/Agency/PTC_Fact_Sheet_1.pdf
[16] Metrolink PTC factsheet, updated May 2013 www.metrolinktrains.com/content/media/03/files/2013%2005%20PTC%20Fact%20Sheet_Updated%20June%202013.pdf

Visits and Interviews (Cont.)

Visit: Revenue Service Demonstration, February 20, 2014
Observations:
• Observing the operation under PTC system
• Participating in the simulation session for PTC

Visit: Metrolink Dispatching Center, March 5 and 24, 2014
Observations:
Simulator log files: These files are sources of raw data for simulation runs conducted by Metrolink engineers. Other rail companies such as Amtrak are not currently tracking these files but are setting up a tracking system. We believe that these files, if saved and maintained, could be a great resource for human-technology interaction analysis. We plan to recommend this practice to Metrolink, and we think it should evolve into an industry standard.
PTC Equipment: This includes the event recorder, On Board Computer (OBC), and Office Side Computer:
• Event Recorder: can provide PTC train operation data similar to flight recorders on aircraft.
• On Board Computer: can provide train operation data.
• Office Side Computer: Depending on the operational server, this is currently being operated by BNSF, but will be shifting to Metrolink by May 2014.
To ensure a comprehensive understanding of how PTC integrates with work procedures and processes, we have identified a document known as the Concept of Operations (CONOPs), which is similar to a process map. Currently we are seeking permission to access the document.

Visit: Meetings at Metrolink office in Pomona, April and August 2016, February 2017
Observations:
In these meetings the following topics were discussed:
• PTC operations and its progress
• The role of engineer and train crew in railroad safety
• The effects of PTC on situational awareness of the engineer
• Operability issues
• PTC reports and data

Table 3 - PTC Visits and Interviews (Cont.)

We cannot disclose the details of these interviews due to privacy issues. However, one of the main outcomes of these meetings is identifying the PTC failure modes that will be discussed in section 3.2.4.1.

3.2.3. Study 1 – Factors Influencing a Successful Performance

The first study focused on the Daily PTC Run Reports. This set of reports contains 38,289 observations (train runs) and provides information regarding the daily train operations under PTC, and whether the trains were able to complete their runs successfully. They also have general train information including date, subdivision, and the lead unit. As stated before, the Daily PTC Run Reports from April 1 to December 22, 2016 were analyzed for this study.

In this study we answer the question: What are the factors that lead to unsuccessful PTC operation, and how can they be categorized? After detailed evaluation and analysis of the reports, the performance indicators were divided into three categories: initialization, disengaged, and cut-out.

The initialization category contains the factors that influence a successful initialization of the PTC system before the train starts its run. System initialization under PTC is illustrated in Figure 14. After completing the initialization process, the train goes into the disengaged state.
The engineer selects the track that the train currently occupies when prompted by the system, and if it is compatible with the information previously entered in the system, the PTC will activate at the designated point. However, there are some factors that could cause the PTC system to return to the disengaged state after the system was activated. The main factors that could cause the system to go into the disengaged state are GPS or navigation issues, invalid speed reading due to technical malfunction of the system (tachometer issues), onboard failures, and entering a subdivision that is not yet PTC operable. The second category in this study evaluates factors that could cause the system to go into the disengaged state.

Figure 14 - System initialization under PTC [extracted from (Southern California Regional Rail Authority (Metrolink), 2010)]
Flowchart elements: Start Initialization (Start engine); Self Test (Verification); Successful test (Yes/No); Valid Movement Authority (Yes/No); PTC Mode; Intervention Mode; Restricted Mode; OBC Non-initialized State; Maintenance.

It is worth noting that the Metrolink trains are only able to operate PTC on the Metrolink section of the track. If the train goes through a section of the track that is operated by other railroad companies (Amtrak, BNSF, etc.), the PTC system goes to disengaged mode and reactivates when the train goes back to Metrolink territory. The interoperability between different railroads is one of the major issues that could affect the safety of the train operations under PTC, and it will be discussed further in the next chapters.

The final category focuses on the factors that could result in a train entering a cut-out state, in other words cutting out the PTC system and operating the train under CTC mode.
If the engineer is unable to initialize and the train is five minutes past its scheduled departure time at the originating station, the crew has to contact the train dispatcher and request permission to operate in the cut-out state and start the run without PTC. In addition, the system may transition to the cut-out state if there is an emergency application of the brakes, either engineer or PTC initiated, or due to onboard system failure. The PTC system could later be initialized or re-initialized en route (Metrolink, 2016). Table 4 shows the three categories and their factors. It also shows the frequency for each factor between April 1 and December 22, 2016. The color red shows the most frequent factor and green is the least frequent factor.

Initialization | Freq. | Disengaged | Freq. | Cut-out | Freq.
Unidentified | 241 | TBD | 730 | TBD | 319
Not Active at Designated Point | 704 | Speed invalid (wheel tach) | 200 | Onboard cut-out | 149
SW Download | 391 | GPS/Navigation Fault | 119 | Crew cut-out, Directed | 86
Consist Error | 238 | Non Sync Subdivision | 83 | Crew cut-out, Not Directed | 11
No attempt to initialize | 182 | Crew track selection | 11 | Cut-out due to outage | 47
Failed State | 147 | EBI (Fault 0x680) | 0 | Air Brake cut-out | 18
Sync Flag/Bulletin | 177 | Invalid DIO Input | 2 | Crew cut-out, Unknown | 40
Incorrect GTB/Train ID unavailable | 84 | TMC out of memory | 0 | Non Communication Flag | 26
Crew canceled initialization | 9 | PCS penalty not able to recover | 3 | |
Eng. not qualified/Invalid PIN/ID | 0 | Sync Flag | 0 | |
PTC Components removed | 350 | | | |
Tagged out Mechanical | 103 | | | |
Tagged Out- TMC | 91 | | | |
Tagged Out- Wheel tach | 33 | | | |
Tagged Out- Failed Departure Test | 42 | | | |
Tagged Out- Mag Valve | 23 | | | |
Tagged Out- Recorder | 26 | | | |
Tagged Out- Data/Radio | 7 | | | |

Table 4 - Daily PTC Run Reports Summary

The factors presented in table 4 represent the issues that contributed to an unsuccessful PTC run. In other words, each of these factors represents a time that the PTC system failed to operate successfully.
Therefore, to find out which factor has the most influence on a successful PTC run, we identify the factor that resulted in the most failures (highest frequency). To evaluate the influence of the aforementioned factors on train operation under PTC, the train operation was classified into three levels: successful, semi-successful, and failed runs. A successful PTC run is defined as a train that was able to initialize PTC on time and activate PTC at the designated point en route. In this group, the PTC system remains active during the run until the train arrives at its destination. The failed PTC runs are trains that were not able to initialize the PTC system and trains that started the run with an active PTC but were not able to finish the run successfully. The semi-successful category includes train runs that failed but were able to reinitialize the PTC system and finish the run successfully. A statistical analysis was conducted to identify the factors that influence the reliability of the PTC operations. The statistical analysis for this study consists of two steps. First, Multiple Discriminant Analysis (MDA) was conducted to evaluate the influence of PTC on train operations. The second step was to conduct a z-test to identify the most influential factor within each category. The analyses and their results are explained in sections 3.2.3.1 and 3.2.3.2.

3.2.3.1. Multiple Discriminant Analysis

In this study we used Multiple Discriminant Analysis (MDA) to create a model that predicts a qualitative variable from several quantitative PTC performance measures (independent variables). MDA is used in cases where a large number of independent variables were collected and the goal is to select a useful subset for predicting the dependent variable (Multivariate Data Analysis Using SPSS).
Discriminant analysis develops a set of weighted linear combinations of the independent variables that best discriminates between the three levels of PTC operations (dependent variable). These linear combinations (variates) together serve as the model for group differences, which is referred to as the descriptive or explanatory discriminant function (Meyers, Gamst, & Guarino). The research hypothesis for this study is:

H1: PTC implementation with HRO characteristics leads to greater service reliability and fewer system interruptions than non-HRO PTC systems.

The dependent variable in this analysis is the service reliability and system interruptions (categorical variable). As stated before, this variable contains three levels of train operations under PTC: 1) Successful, 2) Semi-successful, and 3) Failed. The independent variables are the PTC performance measures (quantitative variables) that form a weighted linear combination to differentiate between the groups. The independent variables are as follows:
• Initialization issues
• Factors resulting in the disengaged state
• Factors resulting in the cut-out state
• The operation subdivision
In this study we included subdivisions as the fourth independent variable to study the effects of geographical factors on PTC operations (Table 5).

Assumptions

MDA follows the principles of a general linear model, and therefore makes the same assumptions as multiple regression analysis, in addition to the following assumptions (Meyers, Gamst, & Guarino):
• Independence of the quantitative measures (IV)
• Group memberships are mutually exclusive
• The independent variables follow a normal distribution
• The analysis is very sensitive to outliers, so the assumption is that the outliers are not adversely affecting the result of the analysis.
Dependent Variable: Service Reliability and System Interruptions | Independent Variables: PTC [Components] Performance Measures
 | Initialization Issues | Disengage Factors | Cut-out Factors | Subdivisions
Failed PTC Run | Daily PTC Run Reports | Daily PTC Run Reports | Daily PTC Run Reports | Daily PTC Run Reports
Semi-Successful PTC Run | Daily PTC Run Reports | Daily PTC Run Reports | Daily PTC Run Reports | Daily PTC Run Reports
Successful PTC Run | Daily PTC Run Reports | Daily PTC Run Reports | Daily PTC Run Reports | Daily PTC Run Reports

Table 5 - Multiple Discriminant Analysis

Sample Size

In MDA the groups can have different sample sizes, but the sample size of the smallest group should exceed the number of independent variables. The maximum number of quantitative measures should be N-2, where N is the sample size [17] (Meyers, Gamst, & Guarino). In this study the smallest group has over 3000 observations, which far exceeds the number of independent variables.

Discriminant Function

The number of discriminant functions is limited to the smaller of the following:
• The number of predictor (independent) variables
• The degrees of freedom for the groups (k-1, where k is the number of groups)
In this study there are three categories for the dependent variable (Successful, Semi-successful, and Failed), so $k = 3$. With four independent variables and $k - 1 = 2$, the analysis yields at most two discriminant functions. A weighted linear combination of the independent variables forms a latent variable called the discriminant score:

$$D = b_1 x_1 + b_2 x_2 + \cdots + b_n x_n$$

where $b_1, b_2, \ldots, b_n$ are the discriminant coefficients and $x_1, x_2, \ldots, x_n$ are the independent variables; $b_i$ is the discriminant coefficient for $i = 1, 2, \ldots, n$.

[17] The recommended sample size for the smallest group should be at least 20 times the number of IVs.
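The two rules above (the limit on the number of discriminant functions and the weighted linear combination that forms the discriminant score) can be sketched in a few lines of Python. This is only an illustration: the function names are ours, and the coefficient and predictor values are hypothetical, not taken from the Daily PTC Run Reports.

```python
def n_discriminant_functions(n_predictors, n_groups):
    """Upper limit on the number of discriminant functions: min(p, k - 1)."""
    return min(n_predictors, n_groups - 1)


def discriminant_score(coefficients, values):
    """Weighted linear combination D = b1*x1 + b2*x2 + ... + bn*xn."""
    if len(coefficients) != len(values):
        raise ValueError("coefficients and values must have the same length")
    return sum(b * x for b, x in zip(coefficients, values))


# Four predictors (initialization, disengage, cut-out, subdivision) and
# three groups (successful, semi-successful, failed), as in this study.
n_functions = n_discriminant_functions(n_predictors=4, n_groups=3)

# Hypothetical coefficients and predictor values, for illustration only.
example_coefficients = [0.5, 0.25, 0.125, 0.0]
example_values = [2.0, 4.0, 8.0, 1.0]
example_score = discriminant_score(example_coefficients, example_values)

print(n_functions, example_score)
```

With four predictors and three groups the limit is two functions, which matches the two functions reported in the results.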
Testing the statistical significance and evaluating the quality of the solution

There are three aspects of the output of the discriminant function that can be used to evaluate the result of the analysis:

Wilks' Lambda: Wilks' Lambda is used to test the significance of the discriminant functions. Mathematically, it is one minus the explained variation, and the value ranges from 0 to 1. A smaller Wilks' Lambda indicates that the model discriminates better between the groups.

Canonical Correlation: The canonical correlation is a measure of the association between the groups in the dependent variable and the discriminant function. A high value implies a high level of association between the two, and vice versa.

Eigenvalues: The eigenvalue is the ratio between the explained and unexplained variation in a model. There is one eigenvalue for each function in the MDA, and each value should be greater than one for a good model. A larger eigenvalue indicates a stronger discriminating power of a function.

Results

Eigenvalues
Function | Eigenvalue | % of Variance | Cumulative % | Canonical Correlation
1 | 11.102 | 97.9 | 97.9 | 0.958
2 | 0.238 | 2.1 | 100.0 | 0.439

Wilks' Lambda
Test of Function(s) | Wilks' Lambda | Chi-square | df | Sig.
1 and 2 | 0.067 | 27641.792 | 10 | 0.000
2 | 0.808 | 2181.703 | 4 | 0.000

Table 6 - Summary of the Canonical Discriminant Function

Table 6 shows the summary of the Canonical Discriminant Function for this study. Wilks' Lambda is used to test the significance of the discriminant functions (Meyers, Gamst, & Guarino). Subtracting this value from 1.00 gives the proportion of explained variance. In this analysis, the Wilks' Lambda for the two functions is 0.067. Therefore, the two discriminant functions explain approximately 93% of the variance in the model. The canonical correlation value for the first function is 0.958; therefore, the first function explains approximately 91% of the variance of the dependent variable.
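As a consistency check on Table 6, the reported quantities can be related to each other with the standard identities of discriminant analysis: the squared canonical correlation equals λ/(1+λ) for eigenvalue λ, and Wilks' Lambda for a set of functions equals the product of 1/(1+λ) over those functions. These identities are not stated in the text above; the sketch assumes the reported values came from standard statistical software and only uses the two eigenvalues from Table 6.

```python
import math

# Eigenvalues of the two discriminant functions, as reported in Table 6.
eigenvalues = [11.102, 0.238]


def canonical_correlation(eigenvalue):
    """Canonical correlation implied by an eigenvalue: sqrt(l / (1 + l))."""
    return math.sqrt(eigenvalue / (1.0 + eigenvalue))


def wilks_lambda(eigs):
    """Wilks' Lambda for a set of functions: product of 1 / (1 + l)."""
    result = 1.0
    for eig in eigs:
        result *= 1.0 / (1.0 + eig)
    return result


def percent_of_variance(eigs):
    """Share of the explained variance attributed to each function."""
    total = sum(eigs)
    return [100.0 * eig / total for eig in eigs]


canonical = [canonical_correlation(eig) for eig in eigenvalues]
lambda_functions_1_and_2 = wilks_lambda(eigenvalues)
lambda_function_2 = wilks_lambda(eigenvalues[1:])
variance_shares = percent_of_variance(eigenvalues)

print([round(r, 3) for r in canonical])
print(round(lambda_functions_1_and_2, 3), round(lambda_function_2, 3))
print([round(share, 1) for share in variance_shares])
```

The computed values agree with Table 6 to within rounding of the reported eigenvalues. The squared canonical correlation of the first function is roughly 0.92, which the text reports as approximately 91%.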
The eigenvalue of the first function is 11.102, which is considerably larger than 1, and the first function is responsible for 97.9% of the total amount of explained variance of the two functions. The canonical correlation value for the second function is 0.439; therefore, this function explains approximately 19% of the variance of the dependent variable. Its eigenvalue is 0.238, which is considerably smaller than that of the first function, and it is responsible for 2.1% of the total amount of explained variance of the two functions.

 | Function 1 | Function 2
Subdivision | -0.016 | 0.027
Initialization Issues | 0.404 | -0.287
Disengage reasons | 0.194 | 0.821
Cut-out reason | 0.007 | 0.060

Table 7 - Standardized Canonical Discriminant Function Coefficients

Table 7 shows a summary of the standardized Canonical Discriminant Function Coefficients. The values are the weights for each function by which the raw scores of the variables are multiplied and summed together to produce a discriminant score of the model (Meyers, Gamst, & Guarino). Based on the standardized coefficients, Initialization and Disengage are weighted more heavily than the other predictors in both functions.

 | Function 1 | Function 2
Initialization Issues | 0.282 | -0.739
Disengage Reasons | 0.131 | 0.961
Cut-out reasons | 0.073 | 0.079
Subdivision | 0.004 | -0.060

Table 8 - Structure Coefficients Matrix

The structure coefficients (Table 8) are the correlations between each variable and the weighted linear composite. The discriminant function is primarily interpreted based on these coefficients (Meyers, Gamst, & Guarino). Each discriminant function is interpreted separately. However, based on the results of the previous section, we focus on the first function, since it is responsible for 97.9% of the total amount of explained variance of the two functions. Therefore, we conclude that Initialization issues are the most influential factor in train operation under PTC.
Factors that result in the disengaged and cut-out states are ranked second and third, respectively. The train subdivision has the lowest weight.

3.2.3.2. Z-test

Differences between the Most Influential Factors Within Each Category

After identifying the categories that have more impact on successful train operation under PTC, we needed to identify the most influential factor within each category. To achieve this goal, we conducted a z-test to evaluate the significance of the difference between the two factors with the highest number of occurrences in each category. Using $\pi_m$ to denote the proportion for the factor with the highest number of occurrences, the hypothesis for the z-test is as follows:

H2: The factor with the highest frequency is the most influential factor within each category.

Or in other words:

$$H_0: \pi_m = \pi_0 \qquad H_2: \pi_m > \pi_0$$

Since $n \ge \dfrac{5}{\min(\pi, 1-\pi)}$, the distribution of $\hat{\pi}$ can be approximated by a normal distribution with a mean of $\pi$ and a standard deviation of

$$\sigma_{\hat{\pi}} = \sqrt{\frac{\pi_0 (1 - \pi_0)}{n}}$$

where n is the number of observations in each category, and the test statistic is

$$z = \frac{\hat{\pi} - \pi_0}{\sigma_{\hat{\pi}}}$$

The decision rule is to reject $H_0$ if p-value $\le \alpha$, where $\alpha = 0.05$ (Ott & Longnecker, 2010).

Results

In this section we conducted three sets of z-tests for the dependent variable (two levels: failed and semi-successful) with each one of the independent variables (initialization, disengage, and cut-out). The subdivision variable was not included in this analysis, because the previous MDA showed that it has a very small influence on the outcome of the operation.
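The one-sided z-test for a single proportion described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are ours, the counts in the example are hypothetical, and the code does not attempt to reproduce the p-values reported in this study, since the null proportions and category sample sizes used for them are not given in this section.

```python
import math


def normal_upper_tail(z):
    """P(Z > z) for a standard normal variable, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def one_sample_proportion_ztest(successes, n, pi0, alpha=0.05):
    """One-sided test of H0: pi = pi0 against the alternative pi > pi0."""
    # Normal approximation requires n >= 5 / min(pi0, 1 - pi0).
    if n * min(pi0, 1.0 - pi0) < 5:
        raise ValueError("sample too small for the normal approximation")
    pi_hat = successes / n
    sigma = math.sqrt(pi0 * (1.0 - pi0) / n)   # standard deviation under H0
    z = (pi_hat - pi0) / sigma                  # test statistic
    p_value = normal_upper_tail(z)
    return z, p_value, p_value <= alpha


# Hypothetical example: 60 occurrences out of 100 observations, pi0 = 0.5.
z_stat, p_value, reject_h0 = one_sample_proportion_ztest(60, 100, 0.5)
print(round(z_stat, 3), round(p_value, 4), reject_h0)
```

In this hypothetical case the statistic is about 2.0 and the null hypothesis is rejected at the 0.05 level.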
Run Result | Failed | Semi-Successful
Most Observed | SW Download (358) | Not Active at Designated Point (589)
2nd Most Observed | PTC Components removed (350) | Consist Error (15)
Hypothesis | H0: π_m = π_0; H2: π_m > π_0 | H0: π_m = π_0; H2: π_m > π_0
Test Statistic | p-value = 0.015 | p-value < 0.0001
Test Result | Reject H0 | Reject H0
Most Influential Factor | SW Download | Not Active at Designated Point

Table 9 - Initialization Most Influential Factor

Software download issues at initialization are the most influential factor contributing to the failed PTC runs. The factor that has the most influence on the semi-successful runs in the initialization category is the PTC not being activated at the designated point (Table 9).

Run Result | Failed | Semi-Successful
Most Observed | Speed Invalid (wheel tach) (46) | Speed Invalid (wheel tach) (154)
2nd Most Observed | GPS/Navigation Fault (35) | GPS/Navigation Fault (86)
Hypothesis | H0: π_m = π_0; H2: π_m > π_0 | H0: π_m = π_0; H2: π_m > π_0
Test Statistic | p-value = 0.011 | p-value < 0.0001
Rejection Rule | Reject H0 | Reject H0
Most Influential Factor | Speed Invalid (wheel tach) | Speed Invalid (wheel tach)

Table 10 - Disengage Most Influential Factor

An invalid speed reading due to tachometer slip is the most influential factor that results in a disengaged state, and it contributes to both failed and semi-successful PTC runs (Table 10).

Run Result | Failed | Semi-Successful
Most Observed | Onboard cut-out (127) | Crew cut-out (58)
2nd Most Observed | Crew cut-out (98) | Onboard cut-out (26)
Hypothesis | H0: π_m = π_0; H2: π_m > π_0 | H0: π_m = π_0; H2: π_m > π_0
Test Statistic | p-value < 0.0001 | p-value < 0.0001
Rejection Rule | Reject H0 | Reject H0
Most Influential Factor | Onboard cut-out | Crew cut-out

Table 11 - Cut-out Most Influential Factor

Onboard equipment cut-out is the most influential factor in the cut-out category that contributes to the failed PTC runs.
The factor that has the most influence on the semi-successful runs in the cut-out category is the PTC cut-out by the crew, both directed and undirected (Table 11).

Differences between the Most Influential Factors between Categories

To validate our results from the previous analysis, a second z-test was conducted to evaluate the differences between the most influential factors across the three categories. The data that we have regarding train performance are categorical; therefore, we use the population proportion $\pi$ to test the differences and the level of significance between the most influential factors in the three categories of initialization, disengage, and cut-out. To compare the proportions of the categories we conduct the following hypothesis test:

H3: The factor with the highest frequency is the most influential factor between two categories.

Or in other words:

$$H_0: \pi_1 = \pi_2 \qquad H_3: \pi_1 > \pi_2$$

or

$$H_0: \pi_1 - \pi_2 = 0 \qquad H_3: \pi_1 - \pi_2 > 0$$

Since $n_1 \pi_1 \ge 5$, $n_1 (1 - \pi_1) \ge 5$, $n_2 \pi_2 \ge 5$, and $n_2 (1 - \pi_2) \ge 5$, we can use the normal approximation for our distribution. The decision rule is to reject $H_0$ if p-value $< \alpha$, where $\alpha = 0.05$ (Ott & Longnecker, 2010).

$$\sigma_{\pi_1 - \pi_2} = \sqrt{\frac{\pi_1 (1 - \pi_1)}{n_1} + \frac{\pi_2 (1 - \pi_2)}{n_2}}$$

$$z = \frac{\pi_1 - \pi_2}{\sqrt{\dfrac{\pi_1 (1 - \pi_1)}{n_1} + \dfrac{\pi_2 (1 - \pi_2)}{n_2}}}$$

Results

The differences between the most influential factors within two categories for both failed and semi-successful PTC runs were also analyzed. Tables 12-14 show the results of the z-test for the population proportion for the most influential factor within the two categories for both failed and semi-successful PTC runs. We only studied these pairs since this is an a priori study.
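The two-proportion z-test above, with the unpooled standard deviation given in the formula, can be sketched in Python as follows. The function names are ours and the counts are hypothetical; the group sizes behind the proportions in this study are not reported in this section, so the sketch does not reproduce the published p-values.

```python
import math


def normal_upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def two_proportion_ztest(x1, n1, x2, n2, alpha=0.05):
    """One-sided test of H0: pi1 = pi2 against the alternative pi1 > pi2."""
    p1, p2 = x1 / n1, x2 / n2
    # Conditions for the normal approximation, as stated in the text.
    if min(n1 * p1, n1 * (1 - p1), n2 * p2, n2 * (1 - p2)) < 5:
        raise ValueError("sample too small for the normal approximation")
    # Unpooled standard deviation of the difference in proportions.
    sigma = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = (p1 - p2) / sigma
    p_value = normal_upper_tail(z)
    return z, p_value, p_value < alpha


# Hypothetical counts: 60 of 100 in the first category, 40 of 100 in the second.
z_stat, p_value, reject_h0 = two_proportion_ztest(60, 100, 40, 100)
print(round(z_stat, 3), round(p_value, 4), reject_h0)
```

The test is directional, so swapping the two categories changes the result; the category expected to have the larger proportion goes first.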
Run Result | Failed | Semi-Successful
Initialization | SW Download (0.19) | Not Active at Designated Point (0.92)
Disengage | Speed invalid (wheel tach) (0.43) | Speed invalid (wheel tach) (0.49)
Hypothesis | H0: π_1 = π_2; H3: π_1 > π_2 | H0: π_1 = π_2; H3: π_1 > π_2
Test Statistic | p-value < 0.0001 | p-value < 0.0001
Rejection Rule | Reject H0 | Reject H0
Result | SW Download | Not Active at Designated Point

Table 12 - Differences between Initialization and Disengage

Run Result | Failed | Semi-Successful
Initialization | SW Download (0.19) | Not Active at Designated Point (0.92)
Cut-out | Onboard cut-out (0.45) | Crew cut-out (0.61)
Hypothesis | H0: π_1 = π_2; H3: π_1 > π_2 | H0: π_1 = π_2; H3: π_1 > π_2
Test Statistic | p-value < 0.0001 | p-value < 0.0001
Rejection Rule | Reject H0 | Reject H0
Result | SW Download | Not Active at Designated Point

Table 13 - Differences between Initialization and Cut-out

Run Result | Failed | Semi-Successful
Disengage | Speed invalid (wheel tach) (0.43) | Speed invalid (wheel tach) (0.49)
Cut-out | Onboard cut-out (0.34) | Crew cut-out (0.61)
Hypothesis | H0: π_1 = π_2; H3: π_1 > π_2 | H0: π_1 = π_2; H3: π_1 > π_2
Test Statistic | p-value = 0.38 | p-value = 0.0197
Rejection Rule | Fail to reject H0 | Reject H0
Result | Speed invalid (wheel tach) | Speed invalid (wheel tach)

Table 14 - Differences between Disengage and Cut-out

Table 15 summarizes the above analysis for failed PTC runs. The results show that software download issues are more influential when compared with disengage and cut-out factors. These results are in line with our previous findings.

Failed Runs | Initialization | Disengage | Cut-out
Initialization | | SW Download | SW Download
Disengage | | | Speed invalid (wheel tach) / Onboard cut-out
Cut-out | | |

Table 15 - Most Influential Factors Contributing to Failed PTC Runs

Table 16 summarizes the analysis for semi-successful PTC runs. The results show that not being activated at the designated point is more influential when compared with disengage and cut-out factors.
In addition, an invalid speed reading due to tachometer slip is more significant than crew cut-out. These results are also compatible with our previous findings.

Semi-successful Runs | Initialization | Disengage | Cut-out
Initialization | | Not Active at Designated Point | Not Active at Designated Point
Disengage | | | Speed invalid (wheel tach)
Cut-out | | |

Table 16 - Most Influential Factors Contributing to Semi-successful PTC Runs

We will discuss the relationship between these factors and HRO characteristics, and how they correspond to our primary research question, in section 3.3. However, the result of this study shows that HRO related factors affect the success of PTC operations. The factors that we identified in this study were not purely technical, but mostly represent the interaction between the subsystems or between humans and the PTC system. To Be Determined (TBD) factors had a high frequency, which strongly contradicts the “reluctance to simplify” principle of HRO.

3.2.4. Study 2 – PTC Delay Factors

As stated earlier, one of the sources of variation in the performance of train operation under PTC is train delays due to PTC related issues. In this section, we answer the question: How are HRO principles violated as we encounter delays? The Southern California Regional Rail Authority (SCRRA) identifies “on-time performance” as one of the twelve Key Performance Indicators (KPI) to measure and control its operations [18]. SCRRA defines on-time performance as “trains that reach their destination within 5 minutes and 59 seconds”. In this section we analyze the PTC factors that contribute to train operation delays. This study is focused on the second set of data that was obtained from Metrolink, the PTC Delay reports. This set of reports includes the delays reported by train crews or dispatchers that were associated with PTC. Each report contains the date of the incident, train information, alleged and actual PTC delays, and reasons for being late.
The information provided in this report is descriptive. To extract the factors that contribute to delays, we first had to define the physical and functional characteristics of PTC, and understand and evaluate the people, procedures, facilities, equipment, and the environment. To understand the PTC system, aside from extensive research, we visited the Metrolink yard at Union Station in downtown Los Angeles and the dispatching center in Pomona. We also interviewed the experts. The summary of these meetings was presented in section 3.2.2. One of the main outcomes of these meetings was the identification of PTC failure modes, which are discussed in section 3.2.4.1. The final step in this study is the identification of the factors that contribute to PTC failures and their extraction from the PTC delay reports, which are presented in section 3.2.4.2.

[18] Southern California Regional Rail Authority, Key Performance Indicators (KPI) Quarterly Performance Report, FY16/17, 2nd quarter, October – December 2016

3.2.4.1. PTC Failure Modes

Dr. Richard Hartley, one of the experts in the field of high reliability, proposed the Failure Mode and Effects Analysis (FMEA) approach for hazard identification [19]. We reached out to railroad safety experts and asked them about the tools and methods that they use for safety assurance and risk analysis of railroad operations [20]. They responded that Fault Tree Analysis (FTA) is one of the main references for their practice, and FMECA and Operating Hazard Analysis (OHA) are two major tools that are used for proactive risk analysis of equipment and human operations, respectively. These risk analysis methodologies guided the analysis to identify PTC failure modes. Table 17 summarizes the possible PTC failure modes that were identified based on literature review and expert consultation.

[19] Personal contact via Email on June 10, 2014
[20] Personal contact via Email with Mr. Tom Griego and Mr. Dave Schlesinger on June 12, 2014

PTC Failure Modes
• Wayside Failures
  - CP, Other Signal and Switch Related Failures: Complete WIU Failure; Failure of an Individual WIU Input or Output; Failure of Switch Detection; Signal Lamp Out and Downgrade of Aspect; Track Circuit Failure
  - Grade Crossing Related Failures: Failure of Communication with Crossing Location; Nearside Crossing Inhibit; Advance Crossing Activation; Failure of a Grade Crossing Warning System
• On-board PTC Failures: Failure to Sustain the Limit of Movement Authority; Failure of On-board PTC Equipment; Failure of Position Detection Function; Failure of Penalty Brake Hold Off; Failure of Engineer’s Display; Failure of On-board Printers; Failure of On-board Radio Communications
• Communications Failures: Wayside Data Network Failure
• Office Subsystem Failures: Failure of One Back Office Server (BOS); Failure of All BOS Functionality; Failure of the Interface with OCC
• Employee in Charge (EIC) Equipment Failures
• Extended PTC System Failures

Table 17 - PTC Failure mode (Southern California Regional Rail Authority (Metrolink), 2010)

As stated before, the objective of PTC is to prevent four types of train accidents. To identify the factors that contribute to these accidents, four FTAs were developed, each corresponding to one of the PTC objectives (figures 15-18). These analyses are based on an in-depth analysis of human and organizational factors, as well as the technical characteristics of the PTC system. The results were compared to previous work on PTC failure modes to better understand the system component interactions (Southern California Regional Rail Authority, 2013) (Hartong, Goel, & Wijeseker, 2011). Figure 19 summarizes and categorizes the results of the FTAs based on human, organizational, and technical factors. Safety in complex processes is based on an understanding of the interactions between ‘H’ (humans), ‘O’ (work organization), and ‘T’ (technology) (Skjerve & Kaarstad, 2014).
The reliability and safety of technological systems are a function of interactions between their personnel, organization, and engineering subsystems (Meshkati, 1995). The comparative analysis of the four accidents shows that human, organizational, and technological factors influence the results. In other words, the occurrences in the system that could result in potentially hazardous events may be technological failures, human errors, or organizational issues. To achieve the key objectives of PTC and minimize the probability of an accident, there should be a system that covers all possible causes that influence safe operations in railroads. However, as illustrated in figure 3, PTC only covers part of the root causes. PTC is very strong in the technical failure and operator error aspects of rail safety operations, but it is weak in addressing organizational factors as well as dispatcher and Maintenance of Way (MOW) performance. HRO could provide an additional layer of support that covers the human, organization, and technology aspects of the system. HRO also provides guidelines and organizational strategies in the case of PTC system failure. In Section 3.3 we discuss the integration of HRO and PTC in detail.
[Figure 15 - FTA for over speeding, in part (Hartong, Goel, & Wijeseker, 2011). Top event: Over Speeding. Contributing events: Braking system failure; Failure to enforce speed restrictions; Engineer error; Missing speed limit in database; PTC system failure.]

[Figure 16 - FTA for movement of a train through a switch left in the wrong position, in part (Hartong, Goel, & Wijeseker, 2011). Contributing events: Wayside fails to convey the correct action on board; Wayside switch fails to detect the correct position; Wayside system failure; Communication failure; Dispatcher error; Loss of power; Onboard system failure.]

[Figure 17 - FTA for incursion into established work zone, in part (Hartong, Goel, & Wijeseker, 2011). Contributing events: Over Speeding; Failure to set correct route; Failure to enforce route; Failure of switch equipment; Failure to generate correct route; PTC equipment failure; Wayside equipment failure; Communication network failure; Dispatcher error; Braking system failure; Failure of PTC system to enforce command; Failure to enforce speed restrictions; Engineer error; Missing speed limit in database; PTC system failure; MOW Error; EIC Failure.]

[Figure 18 - FTA for train-to-train accidents, in part (Hartong, Goel, & Wijeseker, 2011). Contributing events: Over Speeding; Failure to set correct route; Failure to enforce route; Failure of switch equipment; Failure to generate correct route; PTC equipment failure; Wayside equipment failure; Communication network failure; Dispatcher error; Braking system failure; Failure of PTC system to enforce command; Failure to enforce speed restrictions; Engineer error; Missing speed limit in database; PTC system failure.]

[Figure 19 - Factors influencing PTC based on FTA. The events from the four FTAs (Train-to-train Accident; Incursion into established work zone; Over Speeding; Movement of a train through a switch left in the wrong position) are grouped under Technology, Man (Human), and Organization. Events shown include: Communication network failure; Wayside equipment failure; Database failure; PTC failure; Missing speed limit in database; EIC Failure; Braking system failure; PTC system failure; Wayside system failure; Communication failure; Loss of power; Onboard system failure; Communication Problems; Failure To Successfully Switch Back To CTC; Procedural Issues; Lack Of Training; Dispatcher error; Engineer error; MOW Error.]

3.2.4.2. PTC Delay Factors

The findings of the FTA analysis were used as the basis for evaluating the PTC delay reports that were introduced earlier in the chapter. This set of reports contains 700 reported cases of operations delay due to PTC between April 4 and December 22, 2016. After an in-depth analysis of the reports, the causes of operation delays were extracted and categorized based on a human-organization-technology model. Table 18 summarizes the result of this study and provides the frequency distribution for PTC delays. PTC technology was further categorized into office, communication, wayside, and onboard segments, which are compatible with the PTC technology adopted by Metrolink. The PTC delay reports were also analyzed based on the operating subdivision. We did not identify a significant trend that was associated with the operating subdivisions. The result of this study shows that software download issues during the initialization process are the second most influential factor in PTC delays, followed by invalid speed readings due to a faulty tachometer, and GPS/navigation issues. This result is compatible with our findings in the previous study (figure 20).
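The ranking of delay factors described above can be reproduced from the frequencies reported in Table 18. The sketch below is illustrative only: it includes just the Table 18 rows with a frequency of at least 10, and the dictionary and function names are ours.

```python
# Frequencies copied from Table 18 (only rows with a frequency of 10 or more).
delay_factor_counts = {
    "Consist Error - Initialization": 37,
    "Initialization - SW Download/Update": 29,
    "Tachometer Slip/Slide": 28,
    "GPS/Navigation - Location Unknown": 26,
    "SW Failure": 23,
    "Signal Unknown": 23,
    "Run - Incorrect Track Selection": 15,
    "Onboard Cut-out": 12,
    "SW Failure - Switch Manager": 11,
    "Switch unknown": 11,
    "VHLC": 10,
}


def rank_factors(counts, top_n):
    """Return the top_n (factor, frequency) pairs, most frequent first."""
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return ordered[:top_n]


top_four = rank_factors(delay_factor_counts, top_n=4)
for factor, frequency in top_four:
    print(f"{frequency:3d}  {factor}")
```

The order of the top four entries (consist error, software download, tachometer slip, GPS/navigation) matches the ranking stated in the text.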
If the PTC is not activated at the designated point, the train continues its operation without PTC. Therefore, this factor, which had the most influence on the semi-successful runs in the initialization category, was not included in this analysis. The onboard and crew PTC cut-outs were also among the factors that contribute to PTC delay, but they were not as significant as the aforementioned factors.

Category | Segment / Role | Component | Factor | Frequency
Organization | Interoperability; Miscellaneous | | Late Meet | 3
Technology-Centric | Onboard Segment | Onboard Computer | SW Failure – Switch Manager | 11
 | | | SW Failure - Braking Calculation | 1
 | | | SW Failure | 23
 | | | HW Failure | 7
 | | | HW Failure - EBI Card | 4
 | | | Tachometer Slip/Slide | 28
 | | | Sync Error | 2
 | | | Tagged-out - Onboard | 4
 | | | Onboard Cut-out | 12
 | | PTC Equipment | Tagged-out - Mechanical | 6
 | | | Onboard Equipment | 1
 | Wayside Interface Unit | Signal Aspect | Field Signal Issues | 2
 | | | VHLC | 10
 | Communication Segment | Communication Processor | BOS & Train | 1
 | | | Latency | 2
 | | GPS | GPS/Navigation - Location Unknown | 26
 | | Tower (Cellular, Radio, Wi-Fi) | Initialization - SW Download/Update | 29
 | | | Radio/HMAC | 6
 | | | Switch unknown | 11
 | | | Signal Unknown | 23
 | | | Data Communication Loss | 9
 | Office Segment | BOS | Office Cut-out | 3
 | | | BOS Restart | 1
 | | | Incorrect GTB/Track Info/Train ID | 3
 | | | Initialization | 1
 | | CAD | CAD Failure | 3
 | | | CAD - Failure to Warn | 1
Human-Centric | Dispatcher | | Consist Error - Initialization | 37
 | | | Consist Error - Restricted Speed | 7
 | | | Incorrect GTB/Track Information | 6
 | Crew | | Cancel Initialization - Departure Test | 3
 | | | Cancel Initialization - SW Update | 1
 | | | Initialization - Incorrect GTB | 7
 | | | Run - Incorrect Track Selection | 15
 | | | Run - Incorrect Switch Selection | 2
 | | | Crew Cut-out | 5

Table 18 - PTC Delay Factors

The factor that caused the most delays was “consist error”. The train consist is the lineup or sequence of train cars. In order for the PTC system to operate, the crew needs to confirm that the PTC system is displaying the most current train consist.
If there are any discrepancies, the train crew needs to get authorization from the dispatcher to update the information (Southern California Regional Rail Authority, 2013). After the consist is updated, the train can start its run. Although train consist issues might not result in failed runs, going through the correction process can be time consuming. Hence, the consist error was not a significant factor in PTC failure, but it plays a significant role in PTC delays. The consist error is mainly a human and organizational issue, and the only approach for resolving it is through communication between the crew and the dispatcher, and well-established organizational procedures. This result highlights the importance of incorporating HRO principles, especially “deference to expertise”, in PTC operations. As in the previous study, the factors that contribute to PTC delays mostly result from interactions of humans with PTC subsystems. Another issue that needs to be mentioned is the lack of measures of organizational performance. As stated before, we were not involved in the data collection for these studies. However, we strongly recommend incorporating factors that measure organizational performance, such as interoperability between different railroads sharing tracks, in these reports. In the next section, we discuss the integration of HRO principles into train operation under PTC.

[Figure 20 - Factors Contributing to Delayed PTC Runs]

Unidentified Factors

In both sets of reports, there were factors that were labeled as “TBD” (To Be Determined) and “Unidentified”. These labels represent issues that were encountered during the train operation and resulted in an unsuccessful PTC run. Figure 21 shows the frequency of unidentified factors that affected the successful PTC runs between April and December 2016. This figure shows that there is no significant decline in the number of unidentified factors over the course of nine months.
It shows that even if causes of failures were identified, there were new issues with the PTC implementation and operation that were unknown to the implementers. This is an alarming pattern, and directly corresponds to the “reluctance to simplify” principle in HRO.

Figure 21 - Unidentified factors affecting successful PTC Runs (April - December 2016)

To resolve this issue, causes of failures should be identified and communicated to the people involved in the operation and implementation of PTC. The standard operating procedures should be updated to incorporate the findings and prevent their recurrence in the future. Although encountering unknown issues in implementing a new technology is expected, their occurrence should follow a downward trend.

3.3. Integration of HRO Principles into the PTC Operations

In this section, we discuss the integration of HRO principles into the train operation under PTC to improve system reliability. As stated before, PTC is an overlay technology to an already existing safety system. The objective of the PTC system is to prevent four types of accidents. However, as the results of previous studies showed, the successful implementation of this technology is subject to having a balanced system with aligned human, organizational, and technological subsystems.

3.3.1. Adaptation of HRO Principles for Railroad Operations under PTC

One organization attempted to adapt HRO characteristics to petrochemical operations and translate them into their own language.[21] Inspired by their work, we integrated the five HRO characteristics into the railroad operations under PTC (Table 19). The unexpected issue in this case would be the failure of the PTC system. In addition, resilience is the ability of the system to reinitiate PTC or transition to the previous safety operations.
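If resilience is read as the ability to reinitiate PTC after a failure, one simple indicator is the share of failure events after which PTC was successfully reinitiated, which is the "percentage of successful re-initialization" measure that appears in Table 20. The following is only a minimal sketch of that calculation. The record layout, the field names, and the sample records are hypothetical and are not taken from the Metrolink reports.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class PtcFailureEvent:
    """One PTC failure during a run (hypothetical record layout)."""
    run_id: str           # illustrative identifier for the train run
    reinitialized: bool   # True if PTC was successfully reinitiated afterwards


def reinit_success_rate(events: Iterable[PtcFailureEvent]) -> Optional[float]:
    """Percentage of failure events followed by a successful re-initialization.

    Returns None when there are no failure events, since the rate is undefined.
    """
    events = list(events)
    if not events:
        return None
    recovered = sum(1 for e in events if e.reinitialized)
    return 100.0 * recovered / len(events)


# Made-up records for illustration only; not data from the PTC reports.
sample = [
    PtcFailureEvent("run-A", True),
    PtcFailureEvent("run-B", False),
    PtcFailureEvent("run-C", True),
    PtcFailureEvent("run-D", True),
]
print(reinit_success_rate(sample))  # 75.0
```

Returning None for an empty list keeps "no failures recorded" distinct from "no failure was recovered", which would otherwise both read as zero.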
The first draft of the adaptation of HRO principles, presented in Table 19, was developed based on an extensive study of HRO and PTC literature. The result was reviewed with HRO and PTC experts in multiple sessions to develop an adaptation that best represents the HRO characteristics in PTC operations. As stated in previous chapters, research on organizational culture and safety has outlined developing a system of process checks to spot expected and unexpected safety problems as one of the main processes that are useful in developing HROs (Wong, Desai, Madsen, Roberts, & Ciavarelli, 2005). To integrate the HRO principles into PTC operations, we adopted Weick and Sutcliffe’s book, Managing the Unexpected (2015), as a guide and followed the steps they outlined for implementing each HRO principle. It should be mentioned that these checklists are primarily PTC oriented, and not targeted towards general organizational practices. In developing the checklists, we incorporated the findings of the interviews and site visits that were discussed in section 3.2.2, as well as the findings of the previous studies, to customize the checklist for PTC operations.

[21] Based on a conversation with EQUATE Petrochemical Company Maintenance Leader, Mr. Parthasarathy Kannan, at the 8th annual HRO conference in March 2014 - www.equate.com

HRO principles and their short form:
• Preoccupation with failure: Detect small discrepancies
• Reluctance to simplify: Do not generalize
• Sensitivity to operations: Identify trends and anticipate impact
• Commitment to resilience: Commit to resilience
• Deference to expertise: Shifting the decision making to the people at the “sharp end”

Adaptation to PTC and PTC failure (cell text in reading order, from the first principle to the last):
• Always look for small, emerging failures in train operations under PTC, because they could indicate a potential problem in the system.
• Update deliberately and often.
• Treat the unexpected PTC events/failures with concern rather than rationalization that it is normal.
• Make fewer assumptions, notice more and ignore less.
• Look for moment-to-moment changes in performance.
• Pay attention to what is actually happening on the track and its surroundings, regardless of intention, design and plans.
• Strengthen the ability to detect and contain unplanned events, and resume PTC operation.
• Remember that it is the people on the front line of the operations who have the answer to the problem at hand.

Table 19 - HRO adaptation to PTC

Preoccupation with failure

Weick and Sutcliffe define failure as a “lapse in detection”. Such a lapse could happen due to a lack of anticipation of what and how things could go wrong, failure to catch deviations as soon as possible, or lack of investigation of prior unexpected events. Using this approach, questions that could help detect signs of failure in the PTC systems are as follows:

1. What does failure mean around here? The definition of failure could change in different stages of the operation (design, implementation, etc.). The PTC system is currently at the final stages of implementation, and it is being evaluated for full operation. Therefore, failure could be defined as:
• Unsuccessful (annulled, failed, or missed-opportunity) PTC runs
• Delays in train operations due to PTC events (all enforcements, red fences, IT problems, and technical problems) and cascade events.

2. What is the organization’s approach toward PTC failures once they happen?
• Does the organization encourage people to call attention to PTC failure?
• Is there an established procedure to investigate PTC failures and document the lessons learned?
• Are the procedures updated to reflect the new findings and prevent similar failures from happening in the future?

Reluctance to simplify

Simple rules of thumb are easier to follow, and therefore more appealing. However, they could result in undetected weak signs of potential failures, and increase the probability of unreliable performance. This principle is about the concepts people have at hand to do the detecting and recovery.
The following questions could help the organization inquire about its approach towards reluctance to simplify:

1. How do people respond to unexpected PTC events?
2. What is the organization’s approach towards challenging the status quo? Are people encouraged to ask questions and express their views at all levels of the organization?
3. How do people interact with one another?

Sensitivity to operations

Weick and Sutcliffe define sensitivity as a “mix of awareness, alertness, and action that unfolds in real time, and that is anchored in the present” (Weick & Sutcliffe, 2015, p. 79). The two basic reliability mandates in railroad systems are 1) keeping the train operations flowing and 2) protecting the system, which in this research implies the PTC system (safety feature), as well as the train operating systems. The following are the qualities related to sensitivity to operations that affect how well a railroad organization can manage unexpected PTC events.

• “Be where you are with all your mind”.[22] A distracted train engineer is a danger to himself or herself, to the train’s passengers, and also to vehicles and people along the track.
• Even the most experienced engineer, conductor, or dispatcher can make mistakes if they are unaware of a danger.
• Do not underestimate the risks. An example would be the Metrolink engineer’s perception of the risk associated with texting while operating a commuter train.
• Maintain situational awareness: “Sensitivity in operating in an evolving situation” (Weick & Sutcliffe, 2015, p. 82). Overreliance on PTC technology could affect the situational awareness of the train engineer. It is especially dangerous, since the PTC system is blind to people and objects on the track.
• Be aware of routines. Reliability is a moving target, so continuing in the course of action without reevaluation would seriously hamper the reliability of the operations.
• Continuously look for feedback about issues that people encountered with the system.
• Continuously monitor workload to determine needs for additional resources. Make sure that people have access to resources if unexpected events happen.

[22] This sign used to hang in a machine shop on the New York Central Railroad, and now it hangs over the desk of Karl Weick.

Commitment to resilience

Interruption is a constant occurrence in HRO operations. However, HROs are capable of adjusting their functions prior to, during, and following interruptions, and of bouncing back to the same functions (Hollnagel, 2009). Strengthening the organization’s commitment to resilience requires inquiries into learning, knowledge, and capability development. In the case of PTC implementation, the definition of normal operations changes based on the different stages of implementation. At the time this report was drafted, if there is a problem with the PTC system that cannot be resolved in under five minutes, the PTC system is cut out and the train operation reverts to the Centralized Traffic Control (CTC) system. However, after the PTC is fully implemented and goes into operation, all trains must run under PTC at all times. Considering this issue, questions that could provide a better understanding of the resilience of the system are as follows:

1. Is the organization concerned with improving people’s knowledge and ability to respond to unexpected events?
2. What is the organization’s policy regarding resource assignment for training people in their areas of expertise?
3. Do people in the organization reach out to people in other departments to solve problems?
4. What is the definition of trust in the organization?

Deference to expertise

One of the distinct properties of HROs is migrating decision making. In other words, decisions are pushed down to the lowest levels of the organization when they require quick decision making (Roberts, Stouts, & Halpern, 1994).
In order for these decisions to migrate to experts, the organizational hierarchy needs to be loose, and the organization needs to identify the experts and envision a decision-making mechanism to reach them in time (Weick, Sutcliffe, & Obstfeld, 1999). Deference is an activity that happens when there is an interruption in the flow of the system caused by an unexpected event. In the case of railroad operations under PTC, the dispatcher plays a vital role. Dispatchers need to maintain high performance at peak times, look for potential solutions, gather and verify information from multiple sources, and maintain the reliability of the system. In chapter 2 we discussed the failure of the London Ambulance Service dispatching system. This case illustrates the importance of the expertise of the dispatchers in the context of automation (section 2.2.5). The following questions suggest how deference to expertise could be practiced in a railroad organization:

1. What is the organization’s culture in regards to expertise? And how do people value expertise and experience over rank?
2. What is the organization’s strategy in response to unexpected events and involving the most qualified people vs. rank in decision making?
3. Do people know who has the expertise in the face of an unexpected interruption in the system? And how easy is it for them to reach that person?

Using this guideline, we expanded the adaptation of HRO principles to railroad operations under PTC. To achieve this objective, we needed to create a comprehensive list of performance indicators (measures) of the PTC operations. Earlier in this chapter we defined variance in operation as 1) uninterrupted/successful PTC performance during a train run and 2) train delays due to PTC related issues. In sections 3.3.2 and 3.3.3, we conducted two studies to identify the factors that cause variation in PTC operations. Since each study was focused on a separate data pool, there is a slight disparity in the factors that were identified in these studies.
First, we conducted a comparative analysis of the factors identified in the two studies to find the correlation between these factors (Figure 22). Then the factors from the two reports were grouped together, based on their functional, technical, and organizational characteristics, into six categories:

Cut Out: This category contains measures that indicate the number of times that the PTC system was cut out, either by the crew or due to system failure.

Onboard: This category contains measures that indicate the frequency of PTC onboard system failure due to both software and hardware malfunctions.

Tagged-out: This category contains measures that indicate the number of times that a part of the PTC equipment (mechanical, onboard, communication, etc.) was removed or tagged to be removed for maintenance.

Communication / Signal Issues: This category contains measures that indicate the frequency of failures of communication network components. These failures could result in signal issues or slow download/update of the PTC system at the initialization.

Crew/Dispatcher: This category contains factors that measure the performance of the train crew and the dispatcher, and their interaction with the system.

To Be Determined (TBD): In the analysis of the PTC reports, we noticed that many times the reason for the system failure was marked as “unknown” or “TBD”. This category is dedicated to measuring the frequency of these unknown system failures.

The next step was to build on the adaptation of HRO to railroad operations. Table 20 shows the adaptation of HRO characteristics for each category of PTC performance measures. As the results show, not all HRO characteristics are applicable to all performance measures. In the next section, we provide a checklist for integration of HRO principles for the factors that we found influential to the system reliability in previous studies.

Figure 22 - Mapping between PTC Run Reports vs. Delay Reports Factors

Cut Out
  PTC Delay Report (Study 2): Onboard Cut-out; Office Cut-out; Crew Cut-out
  Daily PTC Run Reports (Study 1): Onboard cut-out; Crew cut-out, Directed; Crew cut-out, Not Directed; Cut-out due to outage; Air Brake cut-out; Crew cut-out, Unknown
  Preoccupation with Failure: Look for signs that indicate a potential problem with the system before initiating PTC, or during previous runs.
  Reluctance to Simplify: Look for an emerging pattern of failure of onboard/office system.
  Sensitivity to Operations: -
  Commitment to Resilience: When a component is cut out, the train might either finish the run without PTC or successfully reinitiate PTC. (percentage of successful re-initialization)
  Deference to Expertise: Crew cut-out, directed vs. not directed.

Onboard
  PTC Delay Report (Study 2): SW Failure - Switch Manager; SW Failure - Breaking Calculation; SW Failure; HW Failure; HW Failure - EBI Card; Speed Invalid (Tachometer Slip/Slide); Onboard Equipment; CAD Failure; CAD - Failure to Warn
  Daily PTC Run Reports (Study 1): Invalid DIO Input (Fault 0x746); TMC out of memory (Fault 0x817); EBI (Fault 0x680); Speed invalid (wheel tach); Failed State; PCS penalty not able to recover
  Preoccupation with Failure: Look for signs that indicate a potential problem with the onboard system before initiating PTC, or during previous runs.
  Reluctance to Simplify: Look for an emerging pattern of failure of onboard equipment.
  Sensitivity to Operations: -
  Commitment to Resilience: When onboard equipment fails, the train usually finishes the run without PTC. (percentage of successful re-initialization)
  Deference to Expertise: -

Tagged-out
  PTC Delay Report (Study 2): Tagged-out - Onboard; Tagged-out - Mechanical; Communication
  Daily PTC Run Reports (Study 1): PTC Components removed; Tagged out Mechanical; Tagged Out - TMC; Tagged Out - Wheel tach; Tagged Out - Failed Departure Test; Tagged Out - Mag Valve; Tagged Out - Recorder; Tagged Out - Data Radio
  Preoccupation with Failure: Maintenance is an important part of railroad operations. However, it should be managed in a way that does not interfere with the flow of operations.
  Reluctance to Simplify: There should be a thorough investigation into why the PTC components had to be removed. Is there a pattern?
  Sensitivity to Operations: -
  Commitment to Resilience: -
  Deference to Expertise: -

Communication / Signal
  PTC Delay Report (Study 2): Sync Error; BOS & Train; Latency; Radio/HMAC; Switch unknown; Signal Unknown; Data Communication Loss; GPS/Navigation - Location Unknown; Initialization - SW Download
  Daily PTC Run Reports (Study 1): Sync Flag; Sync Flag/Bulletin; Non Communication Flag; Not Active at Designated Point; GPS/Navigation Fault; SW Download
  Preoccupation with Failure: Look for signs that indicate a potential problem with the communication network, or spot communication dead zones on the track.
  Reluctance to Simplify: Look for an emerging pattern of communication network failure.
  Sensitivity to Operations: -
  Commitment to Resilience: If the signal is lost, the system cannot confirm the location of the train and/or the speed limits and/or switch alignment, and it will issue a penalty brake, or the PTC will be disengaged. It could be activated when the signal is recovered.
  Deference to Expertise: -

Crew/Dispatcher
  PTC Delay Report (Study 2): Initialization; Cancel Initialization - Departure Test; Cancel Initialization - SW Update; Incorrect GTB/Track Information; Initialization - Incorrect GTB; Run - Incorrect Track Selection; Run - Incorrect Switch Selection; Incorrect GTB/Track Information/Train ID; Consist Error - Initialization; Consist Error - Restricted Speed
  Daily PTC Run Reports (Study 1): Crew canceled initialization; No attempt to initialize; Non Sync Subdivision; Incorrect PSS; Crew track selection; Incorrect GTB/Train ID unavailable; Consist Error
  Preoccupation with Failure: Look for signs that indicate a potential problem in people's interaction with the network.
  Reluctance to Simplify: Look for a trend in errors that the crew and dispatchers make in interacting with the system.
  Sensitivity to Operations: The crew's responsibility is operating the train, maintaining situational awareness, paying attention to the tracks (platforms at the stations), and detecting errors in the PTC system. The dispatcher's job is to keep the flow of the operations while maintaining the reliability of the system.
  Commitment to Resilience: Dispatcher and the crew are the only people capable of reactivating/reinitiating PTC if it fails.
  Deference to Expertise: Technical competence, complex activities, high performance at peak levels, pressure for safety and on-time performance.

TBD
  Report labels: Unidentified; TBD
  Preoccupation with Failure: -
  Reluctance to Simplify: It is crucial to address the unknown errors.
  Sensitivity to Operations: -
  Commitment to Resilience: -
  Deference to Expertise: -

Table 20 - Integration of HRO Principles into the PTC Operations

3.3.2. Integration of HRO Principles into the Identified PTC Performance Measures

In section 3.3.2 we identified initialization issues and factors that result in the disengaged state as having the most influence on the reliability and serviceability of PTC operation. We further identified software issues and not being active at the designated point (initialization) and invalid speed reading due to a faulty tachometer (disengaged) as the most influential factors within each category. The findings of the second study presented in section 3.3.3 confirmed the results of the first study, and also showed that the train consist error is the main reason for delayed PTC runs. We used the guideline presented in section 3.4.1 and developed a checklist for integration of HRO principles into each of the abovementioned categories.

HRO Integration Checklist for Onboard Factors

The following is a checklist for integration of HRO principles into the PTC initialization process:

✓ Look for signs that indicate a potential problem with the PTC software before initiating PTC, or during previous runs.
✓ Look for an emerging pattern of failure of onboard/office system.
✓ Call attention to PTC software and onboard system failures once they happen.
✓ Conduct a thorough investigation into why the PTC components had to be removed. Is there a pattern?
✓ Improve people’s knowledge and ability to respond to unexpected PTC onboard/office failures.
✓ Update procedures to reflect the new findings and prevent similar failures from happening in the future.

HRO Integration Checklist for Communication Factors

In section 3.3.2, we identified factors that result in the disengaged state of the PTC system as the second most influential factor in the reliability of PTC operations. In this category, invalid speed reading due to tachometer slip and GPS/Navigation issues had a significant influence on the reliability of the system. These factors are both related to communication network failures. Therefore, the checklist for the disengaged category is more focused on communication issues.

✓ Look for signs that indicate a potential problem with the onboard tachometer.
✓ Conduct a thorough investigation into issues regarding the speed readings. Is there a pattern?
✓ Look for signs that indicate a potential problem with the communication network.
✓ Identify communication dead zones on the track.
✓ Develop an interoperable system with other railroads that share the track.
✓ Call attention to network failures once they happen.
✓ Improve dispatcher and crew communication abilities to respond to unexpected network failures.
✓ Update procedures to reflect the new findings and prevent similar failures from happening in the future.

HRO Integration Checklist for Crew and Dispatcher Interaction

The analysis of the PTC delay reports showed that errors with the train consist are the main cause of delayed PTC runs. Entering and confirming the train consist in the PTC operation is the job of the dispatcher and the train crew member. As stated before, if there are any discrepancies between the consist displayed by PTC and the most up-to-date information, the issue can only be resolved through communication between the train crew and the dispatcher.
Therefore, the following checklist is targeted towards the crew and the dispatcher activities.

✓ Continuously look for feedback about issues that people encountered with the system.
✓ Continuously monitor workload to determine needs for additional resources.
✓ Look for signs of potential problems with crew-dispatcher communication.
✓ Improve dispatcher and crew communication abilities to respond to unexpected network failures.
✓ Maintain situational awareness.
✓ Dispatcher and the crew are the only people capable of reactivating/reinitiating PTC if it fails. Improve their abilities to respond to unexpected system failures.

In this section, we presented an adaptation of HRO principles to PTC operations based on HRO literature, our interviews with HRO and PTC experts, site visits of Metrolink facilities, and the results of the studies on PTC operation reports. We then developed a checklist for the four categories of PTC operations that were identified as the most influential factors in the reliability of train operations under PTC. These results show that incorporation of HRO principles creates a mindful environment where employees are empowered to participate in decision-making processes, especially where only the communication between the dispatcher and the engineer could resolve PTC operation issues. In addition, detection of small deviations before they affect the reliability of the system, and reluctance to simplify the results, would prevent the occurrence of problems like unverified issues, and would create a more resilient and reliable operation. In addition, as stated in chapter 2, improving upon incorporation of HRO principles in operations would enhance and improve the traits of a healthy safety culture, and vice versa.

CHAPTER 4. Challenges and Potential Future Work

Human and organizational factors research, especially in complex high-risk organizations, is inherently challenging due to the unpredictability and complexity of human and system interactions. These organizations devote much of their attention to their technical systems and their improvement, which could potentially affect the human and organizational efforts. One of the unique features of this research was that it was conducted during the initial implementation phase of the PTC system at Metrolink. Although this window provided us with the rare opportunity to get a close look at the implementation processes and their complications, it introduced serious challenges in data collection. Before acquiring the actual PTC performance reports, we had to try different approaches to collect performance data, including the Rotem Cab Simulator and the Generalized Train Movement Simulator (GTMS).

The Rotem Cab Simulator is used for scenario-based PTC training for train engineers. During the February 2014 visit, we had a chance to work with the simulator. The data generated through different training scenarios could be saved as a simulator log file. However, at the time of this research the simulator was mainly used for training purposes, and Metrolink was in the process of setting up a tracking system for the generated data. We believe that these files, if saved and maintained properly, could be a great resource for human-technology interaction analysis, and should evolve into an industry standard.

Figure 23 - Metrolink Rotem Cab Simulator

Railroad accidents could be successfully modeled using stage-based simulation, since accidents or incidents result from a sequence of events, or a causal chain, that incrementally advances the risk of the system until all prerequisites for an accident are met (Federal Railroad Administration, 2014).
The FRA-funded Generalized Train Movement Simulator (GTMS) was developed to evaluate the potential impact of PTC on safety. We faced many difficulties developing our model using GTMS due to technical problems associated with the software, and because we lacked sensitive system specification data, we were not able to develop a simulation model ourselves. We believe that, given accurate systems and performance data, a well-developed simulation model could be a great asset in the safety and reliability assessment of PTC operations in railroads.

The reports that were provided by Metrolink included actual PTC operation data. Although this had its challenges, it was a unique opportunity to work with the actual performance data. It gave us a unique advantage in evaluating how PTC is integrated into the already existing safety system. In addition, we were able to extract information regarding the interaction of crew and dispatcher with the system, even though we were not able to design the experiments to collect the data ourselves. We believe that a study designed to evaluate human and organizational elements would be a worthy follow-up to this research, especially since, as more railroad organizations implement and operate PTC on their tracks, we will be able to evaluate the performance across the industry.

As stated before, interoperability among railroad organizations that share the PTC territory is vital to successful implementation and operation of PTC. As we approach the 2018 deadline, we should further evaluate and explore the interoperability of this technology within railroad organizations.

Highway-rail grade crossing incidents are one of the leading causes of railroad accidents. A potential future work in line with this dissertation could be the evaluation of the application of PTC to include highway-rail grade crossings.
The other area that could be further explored is the expansion of the methodology used in this dissertation to create a dashboard, as a visually advanced decision support tool and interface, to monitor and control the factors that affect service reliability and service interruption in the system. Figure 24 presents an example of a potential PTC dashboard. The data presented in this dashboard were extracted from the two sets of reports that were obtained from Metrolink, and the factors were selected from the list of influential factors that were presented in chapter 3.

Figure 24 - Example of a potential PTC Dashboard

CHAPTER 5. Summary and Conclusion

For the foreseeable future, despite increasing levels of computerization and automation, human operators will have to remain in charge of the day-to-day control and monitoring of complex technological systems, since system designers cannot anticipate all possible scenarios of failure, and hence are not able to provide pre-planned safety measures for every unexpected event and contingency. In other words, “Operators are maintained in [complex technological] systems because they are flexible, can learn and do adapt to the peculiarities of the system, and thus they are expected to plug the holes in the designer’s imagination” (Rasmussen, 1980, p. 97).

Overreliance on technology could potentially hamper the reliability and resiliency of the organization. For safety-sensitive operations, like railroads, there are always unforeseeable events that should be covered by safety systems, but are not because of one or more unrecognized elements. That is why HRO acts as a safeguard against events that we cannot wholly understand. Integrating HRO principles in the system from the early stages of PTC implementation not only ensures the achievement of the key objectives of this technology but also mitigates those events that PTC is not designed to prevent.
Such events include grade-crossing accidents and operators’ overreliance on the technology, which might affect their situational awareness, hamper their performance, and eventually reduce the overall safety and reliability of the system.

Nearly eight years after Congress instructed the US railroads to install PTC, the derailment of a New York-bound speeding Amtrak train near Philadelphia on May 12, 2015 increased the interest in PTC. Eight people were killed, and nearly 200 were injured, when the train engineer became distracted and accelerated to 106 mph in a 50 mph zone as he entered a curve (National Transportation Safety Board, 2016). Lack of PTC played a crucial role in this accident. According to the NTSB Chairman, Honorable Robert Sumwalt, “Based on what we know, had such a system [PTC] been installed in this section of track, this accident would not have occurred” (Aisch, et al., 2015). Most railroads are expected to miss the 2018 congressional deadline; consequently, the pressure to expedite the implementation is now greater than ever, and that could hamper the successful implementation and performance of the rail safety system.

Railroad organizations are mandated to implement PTC to improve the reliability of the railroad operations. However, implementing the technology alone will not guarantee such results. On April 3, 2016, Amtrak train 89 struck a backhoe with a worker inside while traveling 99 mph near Chester, Pennsylvania. Two Amtrak employees were killed and 39 passengers were injured as a result of this accident (National Transportation Safety Board, 2017). PTC is designed to prevent the incursion of the train into the work zone, and that section of the track was equipped with PTC (Laughlin, 2017). However, the accident was not prevented because equipment used for maintenance, like the backhoe in this accident, is not detectable by the PTC system; it needs to be registered in the system, and the tracks should be manually shut.
This accident highlights the importance of human and organizational factors in the PTC system, as this accident did not occur due to a lack of PTC, but because of a failed safety culture, a culture “of fear, on one hand, and normalization of deviance from the rules on the other”.[23]

[23] Robert Sumwalt, chairman of the National Transportation Safety Board, said at a hearing reviewing the investigation of the crash. http://www.philly.com/philly/business/transportation/ntsb-report-amtrak-derailment-april-2016-chester-pennsylvania-20171114.html?mobi=true

In this research, we conducted an in-depth analysis of railroad operation under PTC to evaluate the contribution of implementing HRO elements to the improvement of service reliability and the reduction of service interruption in train operations under PTC. We identified four performance factors that have a significant influence on the reliability and serviceability of the Metrolink operation under PTC between April 4 and December 22, 2016. Although technical factors such as onboard system failures or software issues affected the PTC operation, our studies showed that the factors with a significantly higher contribution to PTC failures and major delays are software download and updating issues during the initialization process, invalid speed reading due to a tachometer slip/slide, GPS or navigation issues that result in a loss of location or signal in the PTC system, the PTC system not being activated at the designated point, and an invalid train consist displayed by the PTC system at the start of or during the train run. Almost all of these issues arise from inadequate interaction of the subsystems or lack of alignment between organizational and human factors and the technical system.

To show what it means for a railroad organization to be highly reliable, we integrated the HRO principles into train operations.
We further expanded that concept to emphasize train operation under PTC, and provided a checklist for onboard factors, the communication network, and crew and dispatcher work and interaction, since these were the areas that corresponded to the factors with the most influence on the reliability of PTC operation in the observed system.
This research also outlined the importance of interoperability and the engineer’s situational awareness. The successful implementation of PTC is extremely reliant on the interoperability of the PTC system. Other Southern California railroads, including Burlington Northern Santa Fe (BNSF), Union Pacific (UP), and Amtrak, also operate on the 216-mile publicly owned portion of Metrolink’s network. At the time of this research, Metrolink trains were only able to operate PTC on the Metrolink section of the track. If a train went through a section of track operated by another railroad company, the PTC system went into disengaged mode and reactivated after reentering Metrolink territory. The same concept applies to other railroad companies. This introduces a serious safety concern, as the objectives of PTC will not be achieved unless all railroad organizations that share a track implement a fully interoperable system.
The other factor that could potentially increase the system vulnerability is the engineer’s situational awareness. Overreliance on PTC technology could affect the situational awareness of the train engineer. This is especially concerning because the PTC system is blind to people and objects on the track, while trespassing and highway-rail grade crossing incidents account for 96% of all rail-related fatalities (Federal Railroad Administration Office of Safety Analysis, 2014). Research shows that technical change is always accompanied by organizational change, and to ensure a successful implementation, these two must be aligned.
It is a proven fact that the HRO characteristics constitute the “secret of success” for a safe, sustainable, and result-oriented system that must operate in a safety-sensitive, non-routine, and rapidly changing environment. Implementing these principles from the design stage of the system would reinforce the pillars of the organization and enhance resiliency. The findings of this research will make a significant contribution to the risk reduction and safe operation of railroads. The main goal of railroad organizations is the safety of their passengers. We optimize the schedule and improve the infrastructure, but at the end of the day we want to make sure that everyone who gets on the train or works on the track returns to their loved ones safe and sound.
References
DOE Office of Corporate Safety Analysis. (2011, April 14). High Reliability – an experiment in collaborative content development. Retrieved May 2014, from Highly Reliable Performance Blog: http://hsshpi.wordpress.com/2011/04/14/high-reliability-an-experiment-in-collaborative-content-development/
Ahmed, P. K. (1998). Culture and climate for innovation. European Journal of Innovation Management, 1(1), 30-43.
Aisch, G., Buchanan, L., Keller, J., Lai, K. R., Marsh, B., Park, H., . . . Yourish, K. (2015, May). Investigating the Philadelphia. Retrieved from The New York Times: http://www.nytimes.com/interactive/2015/05/13/us/investigating-the-philadelphia-amtrak-train-crash.html
Association of American Railroads. (2011). Positive Train Control. Retrieved December 2011, from http://www.aar.org/Safety/Positive-Train-Control.aspx
Barley, S. R. (1986). Technology as an occasion for structuring: Evidence from observations of CT scanners and the social order of radiology departments. Administrative Science Quarterly, 78-108.
Bea, R. (2002). Human and Organizational Factors in Reliability Assessment and Management of Offshore Structures. Risk Analysis, 22(1), 29-45.
Bea, R. (2011).
Risk Assessment and Management: Challenges of the Macondo Well Blowout Disaster. Deepwater Horizon Study Group, Working Paper.
Bea, R., Mitroff, I., Farber, D., Foster, H., & Roberts, K. H. (2009). A new approach to risk: The implications of E3. Risk Management, 11(1), 30-43.
Bencini, L., & Pautz, S. J. (2010, March 29). FMEA Can Add Value in Various Project Stages. Retrieved May 2014, from iSixSigma.
Bierly, P. E., & Spender, J. C. (1995). Culture and High Reliability Organizations: The Case of the Nuclear Submarine. Journal of Management, 21(4), 639-656.
Bigley, G. A., & Roberts, K. H. (2001). The Incident Command System: High-Reliability Organizing for Complex and Volatile Task Environments. The Academy of Management Journal, 44, 1281-1300.
Black, L. J., Carlile, P. R., & Repenning, N. P. (2004, December). A Dynamic Theory of Expertise and Occupational Boundaries in New Technology Implementation: Building on Barley's Study of CT Scanning. Administrative Science Quarterly, 49(4), 572-607.
Busby, J. S. (2006). Failure to Mobilize in Reliability-Seeking Organizations: Two Cases from the UK Railway. Journal of Management Studies, 43, 1375-1393.
Carnes, E. (2010). Highly Reliable Governance Of Complex SocioTechnical Systems. In Deepwater Horizon Study Group, The MACONDO blowout 3rd progress report (pp. 135-165).
Carnes, W. E. (2011). Practicing High Reliability: Organizations designed for humans. CCRM HRO Conference. Retrieved 2011, from http://ccrm.berkeley.edu/conferencesandevents.shtml
Dekker, S. W., & Woods, D. D. (2010). The High Reliability Organization Perspective. In E. Salas, F. Jentsch, & D. Maurino (Eds.), Human Factors in Aviation (2nd ed., pp. 123-143). Academic Press.
Desai, V. M., Roberts, K. H., & Ciavarelli, A. P. (2006). The Relationship Between Safety Climate and Recent Accidents: Behavioral Learning and Cognitive Attributions. Human Factors, 48(4), 639–650.
DeSanctis, G., & Poole, M. S. (1994, May).
Capturing the Complexity in Advanced Technology Use: Adaptive Structuration Theory. Organization Science, 5(2), 121-147.
Ditmeyer, S. (2011). Confused about PTC yet? Trains, 71(10), 24-31.
DOE. (2012). DOE Handbook, Accident And Operational Safety Analysis (Vol. I: Accident Analysis Techniques). Washington, DC: U.S. Department of Energy.
DOE. (2012). Independent assessment of nuclear safety culture at the Pantex plant. US Department of Energy. Retrieved from http://energy.gov/sites/prod/files/2013/05/f0/Nov_2012_Pantex_IRR-Assessment_of_Nuclear_Safety_Culture_at_Pantex.pdf
Dreyfus, H. L., & Dreyfus, S. E. (1986). Mind over machine. New York, NY, USA: The Free Press.
Dunjó, J., Fthenakis, V., Vílchez, J. A., & Arnaldos, J. (2010, January 15). Hazard and operability (HAZOP) analysis. A literature review. Journal of Hazardous Materials, 173(1-3), 19–32.
EUROCONTROL Experimental Center. (2006). Revisiting the Swiss Cheese model of accidents. European Organization for the Safety of Air Navigation.
Federal Railroad Administration. (2003). Human Reliability Analysis in Support of Risk Assessment for Positive Train Control. John A. Volpe National Transportation Systems Center, Research and Special Programs Administration. Cambridge, MA: U.S. Department of Transportation.
Federal Railroad Administration. (2014). BNSF San Bernardino Case Study: Positive Train Control Risk Assessment. U.S. Department of Transportation, Office of Research and Development, Washington D.C.
Federal Railroad Administration. (2014). Rail Safety Fact Sheet. U.S. Department of Transportation.
Federal Railroad Administration Office of Safety Analysis. (2014). Ten year accident incident overview. Retrieved June 2014, from Federal Railroad Administration: http://safetydata.fra.dot.gov/officeofsafety/publicsite/query/TenYearAccidentIncidentOverview.aspx
Federal Transit Administration. (2000). Hazard analysis guidelines for Transit projects. John A.
Volpe National Transportation Systems Center, Research and Special Programs Administration. Cambridge, MA: U.S. Department of Transportation.
Finkelstein, A., & Dowell, J. (1996). A Comedy of Errors: the London Ambulance Service case study. Proceedings of the 8th International Workshop on Software Specification and Design (pp. 2-4). Schloss Velen: IEEE.
FRA. (2017). Positive Train Control. Retrieved November 29, 2017, from Federal Railroad Administration: https://www.fra.dot.gov/ptc
Grøtan, T. O., Størseth, F., Rø, M. H., & Skjerve, A. B. (2008, October). Resilience, Adaptation and Improvisation - increasing resilience by organising for successful improvisation. The 3rd Symposium on Resilience Engineering. Antibes, Juan-Les-Pins, France.
Grabowski, M., & Roberts, K. H. (1996). Human and organizational Error in Large Scale System. IEEE Transactions on Systems, Man, and Cybernetics – PART A: Systems and Humans, 26(1), 2-16.
Hale, A., & Heijer, T. (2006). Is resilience really necessary? The case of Railways. In E. Hollnagel, D. D. Woods, & N. Leveson, Resilience Engineering, Concepts and Precepts (Ch. 9, pp. 125-148). Ashgate.
Hansen, P. A. (2001). Positive train control. Trains, 61(1), 68.
Harold, R., & Moriarty, B. (1990). System Safety Engineering and Management. Wiley and Sons.
Hartley, R. S. (2011). High Reliability Organizations and Practical Approach. CCRM HRO Conference. University of California, Washington DC Center. Retrieved from http://ccrm.berkeley.edu/conferencesandevents.shtml
Hartley, R. S. (2014). High Reliability Organization Implementation. 8th Annual HRO Conference. Fort Worth, TX. Retrieved April 2014, from http://www.cedata.org/hro2014/hropres/
Hartong, M., Goel, R., & Wijesekera, D. (2011). Positive Train Control (PTC) failure modes. Journal of King Saud University – Science, 311–321.
Hollnagel, E. (2009). The Four Cornerstones of Resilience Engineering. In C. P. Nemeth, E. Hollnagel, & S.
Dekker, Resilience Engineering Perspective (Preparation and Restoration ed., Vol. 2, pp. 117-133). Farnham, Surrey, United Kingdom: Ashgate.
IAEA. (2012). Safety Culture in Pre-operational Phases of Nuclear Power Plant Projects - Safety Reports Series 74. International Atomic Energy Agency.
INPO. (2013). Traits of a healthy safety culture. Institute of Nuclear Power Operations.
LaPorte, T. R., & Consolini, P. (1991). Working in practice but not in theory: Theoretical challenges of High-Reliability Organizations. Journal of Public Administration Research and Theory, 19–47.
Laughlin, J. (2017, November 14). Safety took a back seat at Amtrak, feds say. Then workers died. Retrieved from http://www.philly.com/philly/business/transportation/ntsb-report-amtrak-derailment-april-2016-chester-pennsylvania-20171114.html?mobi=true
Leveson, N., Dulac, N., Marais, K., & Carroll, J. (2009). Moving Beyond Normal Accidents and High Reliability Organizations: A Systems Approach to Safety in Complex Systems. Organization Studies.
Long, R. J. (1989). Human Issues in New Office Technology. In T. Forester, Computers in the Human Context: Information Technology, Productivity, and People. Cambridge, MA: MIT Press.
Madsen, P. M. (2011). HRO Perspectives on Interdependence In and Across Organizations. CCRM HRO Conference. University of California, Washington DC Center. Retrieved 2011, from http://ccrm.berkeley.edu/conferencesandevents.shtml
Majchrzak, A., & Meshkati, N. (2001). Aligning Technological and Organizational Change. In G. Salvendy, Handbook of Industrial Engineering, Technology and Operations Management, Third Edition (Ch. 36, pp. 948-974). Wiley-Interscience Publication.
Meshkati, N. (1995). Human factors in process plants and facility design. In Cost-effective risk assessment for process design (pp. 113-130).
Meshkati, N., & Khashe, Y. (2015).
Operators' Improvisation in Complex Technological Systems: Successfully Tackling Ambiguity, Enhancing Resiliency and the Last Resort to Averting Disaster. Journal of Contingencies and Crisis Management, 23(2), 90-96.
Metrolink. (2016). PTC Instructions - Crew.
Meyers, L., Gamst, G., & Guarino, A. (n.d.). Applied Multivariate Research (2nd ed.). Thousand Oaks, CA: SAGE Publications.
Meyers, T., Stambouli, A., McClure, K., & Brod, D. (2012). Risk Assessment of Positive Train Control by Using Simulation of Rare Events. Journal of the Transportation Research Board, 2289, 34-41.
Multivariate Data Analysis Using SPSS. (n.d.). Retrieved September 2017, from Research Gate: https://www.researchgate.net/file.PostFileLoader.html?id=54eb12afef97130f298b4576&assetKey=AS%3A273713604300800%401442269816239
Meshkati, N. (2010, January). A High Reliability, Resilient Foreign Policymaking (HR2FP). Office of the Science and Technology Adviser to the Secretary of State and the Administrator of USAID (STAS), Jefferson Science Fellow. Washington D.C.: U.S. Department of State.
National Transportation Safety Board. (2009). Collision Between Two Massachusetts Bay Transportation Authority Green Line Trains, Newton, Massachusetts, May 28, 2008. Railroad. Washington, DC: NTSB.
National Transportation Safety Board. (2013). Collision between Two CSX Transportation Freight Trains, Westville Indiana, August 20, 2013. Washington, DC: NTSB.
National Transportation Safety Board. (2017). Amtrak Train Collision with Maintenance-of-Way Equipment Chester, Pennsylvania April 3, 2016. NTSB.
National Transportation Safety Board. (2017, November 29). Derailment of Amtrak Passenger Train 188. Retrieved from National Transportation Safety Board: https://www.ntsb.gov/investigations/AccidentReports/Pages/RAR1602.aspx
National Transportation Safety Board. (1997).
Collision involving three Consolidated Rail Corporation freight trains operating in fog on a double main track near Bryan, Ohio, January 17, 1999. Washington, D.C.: NTSB.
National Transportation Safety Board. (2001). Collision and Derailment of Maryland Rail Commuter MARC Train 286 and National Railroad Passenger Corporation AMTRAK Train 29, Silver Spring, Maryland, February 16, 1996. Washington, DC: NTSB.
National Transportation Safety Board. (2003). Collision of Burlington Northern Santa Fe Freight Train With Metrolink Passenger Train Placentia, California, April 23, 2002. Washington, DC: NTSB.
National Transportation Safety Board. (2005). Burlington Northern Santa Fe Railway Company and Union Pacific Railroad, Kelso, Washington, November 15, 2003. Washington, DC: NTSB.
National Transportation Safety Board. (2005). Collision of Norfolk Southern Freight Train 192 With Standing Norfolk Southern Local Train P22 With Subsequent Hazardous Materials Release at Graniteville, South Carolina, January 6, 2005. Washington, DC: NTSB.
National Transportation Safety Board. (2005). Derailment of Northeast Illinois Regional Commuter Railroad Train 519 in Chicago, Illinois, October 12, 2003. Washington, DC: NTSB.
National Transportation Safety Board. (2006). Collision Between Two BNSF Railway Company Freight Trains Near Gunter, Texas, May 19, 2004. Washington, DC: NTSB.
National Transportation Safety Board. (2006). Collision of Union Pacific Railroad Train MHOTU-23 With BNSF Railway Company Train MEAP-TUL-126-D With Subsequent Derailment and Hazardous Materials Release, Macdona, Texas, June 28, 2004. Washington, DC: NTSB.
National Transportation Safety Board. (2007). Collision of Two CN Freight Trains Anding, Mississippi July 10, 2005. Railroad Accident Report, Washington, DC.
National Transportation Safety Board. (2008). Collision of two Union Pacific Railroad freight trains in Bertram, California, November 10, 2007. Washington, DC: NTSB.
National Transportation Safety Board. (2009). Railroad Accident Report: Collision of Two Washington Metropolitan Area Transit Authority Metrorail Trains Near Fort Totten Station. Washington, DC: NTSB. Retrieved from http://www.ntsb.gov/investigations/summary/rar1002.html
National Transportation Safety Board. (2010). Collision of Metrolink Train 111 With Union Pacific Train LOF65–12, Chatsworth, California, September 12, 2008. Washington, DC: NTSB.
National Transportation Safety Board. (2012). Collision of Port Authority Trans-Hudson Train with Bumping Post at Hoboken Station, Hoboken, New Jersey, May 8, 2011. Washington, DC: NTSB.
National Transportation Safety Board. (2012). Collision of BNSF Coal Train With the Rear End of Standing BNSF Maintenance-of-Way Equipment Train, Red Oak, Iowa, April 17, 2011. Washington, D.C.: NTSB.
National Transportation Safety Board. (2012). Collision of Dakota, Minnesota & Eastern Railroad Freight Train and 19 Stationary Railcars, Bettendorf, Iowa, July 14, 2009. Washington, D.C.: NTSB.
National Transportation Safety Board. (2013). Head-On Collision of Two Union Pacific Railroad Freight Trains Near Goodwell, Oklahoma, June 24, 2012. Railroad Accident Report, Washington, DC.
National Transportation Safety Board. (2016). Derailment of Amtrak passenger train 188, Philadelphia, PA, May 12, 2015. NTSB/DCA15MR010, NTSB.
NRC. (2011). Final Safety Culture Policy Statement. Washington, DC: U.S. Nuclear Regulatory Commission.
NTSB. (2014). Implement Positive Train Control Systems. Retrieved from National Transportation Safety Board: http://www.ntsb.gov/safety/mwl8_2014.html
NTSB. (2014). NTSB Most wanted list. Retrieved 2014, from National Transportation Safety Board: http://www.ntsb.gov/safety/mwl2014/08_MWL_PositiveTrainControl.pdf
Office of the Press Secretary. (2013, February 12). Presidential Policy Directive -- Critical Infrastructure Security and Resilience.
Retrieved October 2014, from The White House: http://www.whitehouse.gov/the-press-office/2013/02/12/presidential-policy-directive-critical-infrastructure-security-and-resil
Ott, R. L., & Longnecker, M. (2010). An Introduction to Statistical Methods and Data Analysis. Belmont, CA: Brooks/Cole.
Oxstrand, J., & Boring, R. L. (2010). Human Reliability Guidance – How to Increase the Synergies between Human Reliability, Human Factors, and System Design & Engineering. Phase 1: The Nordic Point of View – A User Needs Analysis. Nordic nuclear safety research.
Perrow, C. (1984). Normal Accidents: Living with High Risk Technologies. New York: Basic Books.
Rasmussen, J. (1980). What can be learned from human error reports? In K. D. Duncan, M. M. Gruneberg, & D. Wallis, Changes in working life (pp. 97–113). New York, NY: Wiley.
Reason, J. (2000). Human error: models and management. BMJ, 320, 768-770.
Roberts, K. H., & Rousseau, D. M. (1989). Research in Nearly Failure-Free, High-Reliability Organizations: Having the Bubble. IEEE Transactions on Engineering Management, 32(2), 132-139.
Roberts, K. H., Madsen, P., & Desai, V. M. (2005). The Space Between in Space Transportation: A Relational Analysis of the Failure of STS 107. In M. Farjoun, & W. Starbuck, Organization at the Limit: NASA and the Columbia Disaster. Oxford, Blackwell.
Roberts, K. H., Yu, K. F., Desai, V., & Madsen, P. (2009). Employing Adaptive Structuring As Cognitive Decision Aid In High Reliability Organizations. In G. P. Hodgkinson, & W. H. Starbuck, The Oxford Handbook Of Organizational Decision Making (pp. 194-210). Oxford: Oxford University Press.
Roberts, K., Stout, S., & Halpern, J. (1994). Decision Dynamics in Two High Reliability Military Organizations. Management Science, 40(5), 614-624.
Skjerve, A. B., & Kaarstad, M. (2014). The MTO Perspective and Selected Research Activities at the Halden Project. IAEA.
Smith, D. L. (2010, February 26). FMEA: Preventing a Failure Before Any Harm Is Done.
Retrieved May 2014, from iSixSigma: http://www.isixsigma.com/tools-templates/fmea/fmea-preventing-failure-any-harm-done/
South West Thames Regional Health Authority. (1993). Report of the Inquiry Into The London Ambulance Service. South West Thames Regional Health Authority.
Southern California Regional Rail Authority (Metrolink). (2010). SCRRA Positive Train Control System Concept of Operations. Retrieved from http://www.regulations.gov/#!documentDetail;D=FRA-2010-0048-0005
Southern California Regional Rail Authority. (2013). An introduction to PTC. Retrieved June 2014, from Metrolink: http://www.metrolinktrains.com/agency/page/title/ptc
Trotter, M. J., Salmon, P. M., & Lenne, M. G. (2014). Impromaps: Applying Rasmussen’s Risk Management Framework to improvisation incidents. Safety Science, 60-70.
Turner, B. A. (1978). Man-Made Disasters. Taylor & Francis Group.
Verma, A. K., Ajit, S., & Karanki, D. R. (2010). Reliability and Safety Engineering. London, UK: Springer.
Vinnem, J. E. (2007). Offshore Risk Assessment; Principles, Modeling, and Applications of QRA Studies (2nd ed.). Springer.
W. Edwards Deming Institute. (2014). The System of profound knowledge. Retrieved May 2014, from The W. Edwards Deming Institute: https://www.deming.org/theman/theories/profoundknowledge
Weick, K. E. (1998). Introductory Essay: Improvisation as a Mindset for Organizational Analysis. Organization Science, 543-555.
Weick, K. E., & Roberts, K. H. (1993). Collective mind and organizational reliability: The case of flight operations on an aircraft carrier deck. Administrative Science Quarterly, 38, 357-381.
Weick, K. E., & Sutcliffe, K. M. (2001). Managing the Unexpected: Assuring High Performance in an Age of Complexity. Jossey-Bass.
Weick, K. E., Sutcliffe, K. M., & Obstfeld, D. (1999). Organizing for High Reliability: Processes of Collective Mindfulness. Research in Organizational Behavior, 1, 81–123.
Weick, K., & Sutcliffe, K. (2015).
Managing the Unexpected - Sustained Performance in a Complex World (3rd ed.). Hoboken, New Jersey: Wiley.
Weick, K., Sutcliffe, K., & Obstfeld, D. (2008). Organizing for high reliability: Processes of collective mindfulness. Crisis management, 81-123.
Wong, D. S., Desai, V. M., Madsen, P., Roberts, K. H., & Ciavarelli, A. (2005). Measuring Organizational Safety and Effectiveness at NASA. Engineering Management Journal, 17(4), 59-62.
Zohar, D. (1980). Safety climate in industrial organizations: Theoretical and applied implications. Journal of Applied Psychology, 96–102.
Abstract
An apparent rash of grave railroad accidents in the US has not only damaged the railroad infrastructure and interrupted its operations but also endangered the safety and lives of train crew members and passengers. Railroads operate in high-risk, hazardous, and rapidly changing environments over long periods of time while facing the inevitable task of avoiding catastrophic events. One such accident was the head-on collision of two trains, one passenger and one freight, at Chatsworth, California on September 12, 2008, which directly resulted in the US Congress passing the Rail Safety Improvement Act of 2008, requiring Class I railroads to install Positive Train Control (PTC) systems by December 2015 to prevent such accidents in the future. One of the challenges of PTC implementation is altering an existing system with new technology that changes the organization and its components. Studies show that organizational and technological changes fail when they are not properly aligned, integrated, and managed.
In this dissertation we propose a guideline for improving service reliability and reducing service interruption in train operations under PTC by eliminating preventable failures and system variations. The main purpose of this guideline is to provide a systematic approach to identify the human, organizational, and technological factors that influence the reliability of the system, and to address and evaluate these factors using High Reliability Organization (HRO) characteristics.
Asset Metadata
Creator
Khashe, Yalda (author)
Core Title
Human and organizational factors of PTC integration in railroad system and developing HRO-centric methodology for aligning technological and organizational change
School
Viterbi School of Engineering
Degree
Doctor of Philosophy
Degree Program
Industrial and Systems Engineering
Publication Date
03/06/2019
Defense Date
12/11/2017
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
high reliability organization,human and organizational factors,new technology implementation,OAI-PMH Harvest,organizational safety,Railroad,railroad safety,safety-sensitive organizations,transportation
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Meshkati, Najmedin (committee chair), Majchrzak, Ann (committee member), Rahimi, Mansour (committee member)
Creator Email
khashe@usc.edu,yaldakhashe@gmail.com
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c40-484003
Unique identifier
UC11266749
Identifier
etd-KhasheYald-6083.pdf (filename),usctheses-c40-484003 (legacy record id)
Legacy Identifier
etd-KhasheYald-6083.pdf
Dmrecord
484003
Document Type
Dissertation
Rights
Khashe, Yalda
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA