Disruptive by Design: Data Engineering for AI at the Tactical Edge
As artificial intelligence (AI) developers strive to deliver the potential of their applications to future battlefields, the largest stumbling block will likely be the lack of data engineering for an appropriate backside network and automation infrastructure. This infrastructure, commonly termed a data fabric, can deliver a federated environment that enables access to data, information and analytics across echelons, forces and even classifications. To have AI at the tactical edge, you need effective data fabrics. To deliver effective data fabrics, you need expert data engineering.
AI at the tactical edge may begin with sensors harvesting data. Data is typically encoded in rows and columns within tables. Each row holds one observation. Each column holds a detail about those observations.
For example, a weight sensor may register when a scale is pressed. A table of collected weight sensor results would include a column of unique identities for every observation. The next column may then account for the date and time of the measurement. Another column may record the weight, while another records location.
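As a rough illustration of the table described above, the following minimal Python sketch models one row per observation with the four columns named in the text. The field names, the kilogram unit, the location labels and the sample values are all assumptions made for illustration and do not describe any fielded system.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class WeightObservation:
    """One row of the hypothetical weight sensor table."""
    observation_id: int    # unique identity for the observation
    measured_at: datetime  # date and time of the measurement
    weight_kg: float       # weight registered (kilograms is an assumed unit)
    location: str          # where the sensor sits (label format is assumed)


# Two made-up observations standing in for collected sensor results.
observations = [
    WeightObservation(1, datetime(2024, 1, 1, 8, 0, tzinfo=timezone.utc), 82.5, "footbridge-north"),
    WeightObservation(2, datetime(2024, 1, 1, 8, 5, tzinfo=timezone.utc), 1450.0, "road-bridge-east"),
]

# Each observation becomes a row; the dataclass fields become the columns.
rows = [asdict(obs) for obs in observations]
columns = list(rows[0].keys())
```

Here `columns` lists the table's column names in declaration order, and every entry in `rows` is a single observation.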
With these few columns of data for each observation, an analyst could develop an algorithm to predict how many people cross a footbridge each day or how many vehicles drive over a bridge each week. Once analysts obtain averages, they can account for deviations and anomalies. Using the scale example, unmanned surveillance drones could be deployed to a given sensor whenever a particular anomaly is observed. With more sensors come more potential columns of data and more possibilities for data-driven decision making.
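One simple way to picture the step from averages to anomalies is a standard-deviation check on daily counts. The sketch below is an assumption about how such a check might look, not a description of any deployed algorithm. The function name, the threshold of two standard deviations and the sample counts are all hypothetical.

```python
from statistics import mean, stdev


def find_anomalies(daily_counts, threshold=2.0):
    """Return indices of days whose count sits more than `threshold`
    sample standard deviations away from the mean of all days."""
    avg = mean(daily_counts)
    spread = stdev(daily_counts)
    if spread == 0:
        return []  # every day is identical, so nothing stands out
    return [
        day for day, count in enumerate(daily_counts)
        if abs(count - avg) / spread > threshold
    ]


# Made-up footbridge crossings for ten days; day 6 is an obvious spike.
daily_crossings = [120, 118, 125, 122, 119, 121, 310, 117, 123, 120]
anomalous_days = find_anomalies(daily_crossings)
```

In this toy data only index 6 is flagged. In practice the threshold and the baseline window would need tuning, since a single large spike inflates both the mean and the standard deviation it is compared against.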
Producing data-driven decision making, particularly at the tactical edge, requires data engineering. Data files must reside somewhere and be updated and accessed securely. Application programming interfaces (APIs) that enable automation, either within software or hardware, must also be engineered.
Much of the data collected by the Defense Department is simply being warehoused, leading to isolated silos of information and to opportunity costs. What is required instead are data lakes connected by data fabrics that data engineers operate and maintain. Data engineers are necessary to develop data fabrics that serve as architectures to facilitate the integration of data pipelines through cloud environments.
Data engineering requires an understanding of data capture, storage and delivery processes. Data capture encompasses batch and streaming data from sources, including sensors at the tactical edge, open-source databases, enterprise cloud services or other proprietary databases. Data storage comprises all formats of data within a single repository. Data delivery concerns users, analytics, tools and data science processes. Data engineering uses intelligent and automated systems to manage data capture, storage and processing.
Returning to the weight scale example, seemingly simple decisions about the number of sensor observations to report and how frequently to report them require analysis by a data engineer. Those decisions affect network congestion and plans to synchronize stored data. If information about the sensors and their employment is sensitive, data engineers also need to help manage who has access to which columns of data within a given set of observations.
Data engineers could assist analysts and organizations in operationalizing their data by helping them deploy tools, develop real-time monitoring and automated responses, and plan to retrain and redeploy models as necessary.
As the joint community seeks to deliver a unified vision for the future of AI at the tactical edge, each service could define roles within its current formations and then establish career paths for service members to fill these positions.
Arguably, data scientists provide the best return on investment when embedded within brigade and division-level staff. They can define what military problems can benefit from data solutions, shape collection plans that properly employ tactical sensors and develop operational plans where data captured from tactical sensors drive decision points.
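The column-level access question raised above for the weight scale data can be pictured with a small filter that strips any column a given role is not cleared to read. This is only a sketch under stated assumptions: the role names, the column names and the idea that location is the sensitive column are hypothetical, and a real data fabric would enforce such rules in its storage and query layers rather than in application code.

```python
# Hypothetical mapping from a role to the columns that role may read.
ROLE_COLUMNS = {
    "analyst": {"observation_id", "measured_at", "weight_kg"},
    "sensor_manager": {"observation_id", "measured_at", "weight_kg", "location"},
}


def filter_columns(rows, role):
    """Return copies of `rows` containing only the columns allowed for `role`.

    Unknown roles are allowed no columns at all (deny by default).
    """
    allowed = ROLE_COLUMNS.get(role, set())
    return [
        {name: value for name, value in row.items() if name in allowed}
        for row in rows
    ]


# A made-up observation with a sensitive location column.
sample_rows = [
    {"observation_id": 1, "measured_at": "2024-01-01T08:00Z",
     "weight_kg": 82.5, "location": "footbridge-north"},
]

analyst_view = filter_columns(sample_rows, "analyst")
manager_view = filter_columns(sample_rows, "sensor_manager")
```

The analyst view keeps the measurements but drops `location`, while the sensor manager view retains every column. Denying unknown roles by default is a deliberate choice in this sketch, not a stated requirement of the article.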
Unmanned system maintainer-operators will benefit from guidance from data scientists about how best to employ their systems. Unmanned systems will not only be primary harvesters of tactical data, but their automated behaviors likely will require frequent updates in response to adversarial actions. Therefore, unmanned system maintainer-operators also need to collaborate with data engineers to develop plans for retraining and integration into future AI combat control systems.
The Defense Department and industry partners need to begin developing the back-end infrastructure necessary to deliver AI capabilities at the tactical edge. To do this, the department needs a plan for data fabrics and a cadre of data engineers capable of delivering this vision.
Lt. Col. Ryan Kenny, USA, created an online forum to foster discussions on emerging technologies at www.militarycommunicators.org. The views expressed here are his alone and do not represent the views and opinions of the Defense Department, U.S. Army or other organizations with which he has had an affiliation.