The emerging use of Analytics and Knowledge Discovery in Databases for Cyber Conflict and Trade Negotiations
This blog post is the first in a series of articles on cyber conflict and the use of analytics for targeting, in both offense and defense, in conflict situations.
It covers knowledge discovery in four kinds of databases (chosen because of their perceived importance, sensitivity and criticality to the functioning of the geopolitical economic system):
- Databases on unique identity identifiers, including next-generation biometric databases connected to government initiatives and banking, and current-generation databases of identifiers such as government-issued documents made available online
- Databases on financial details: these include not only traditional financial service providers but also online databases with payment details collected by corporations selling retail products, such as Sony’s PlayStation Network and Microsoft’s Xbox
- Databases on contact details, including marketing databases and contact details collected by offline businesses
- Databases on social behavior, primarily collected by online businesses such as Facebook and other social media platforms.
It examines the role of
- voluntary privacy safeguards and government regulations,
- weak cryptographic security of databases,
- weakness in balancing marketing (maximized data) with privacy (minimized data),
- and lastly, ownership patterns in database-owning corporates.
A small distinction between cyber crime and cyber conflict is that cyber crime focuses on stealing data, intellectual property and information primarily to maximize economic gains, while cyber conflict focuses on stealing information and also on disrupting the effective working of database-backed systems in order to gain notional competitive advantages in economics as well as geopolitics. Cyber terrorism is basically cyber conflict by non-state agents, or by designated terrorist states as defined by the regulations of the “target” entity. A cyber attack is an offensive action against cyber-infrastructure (like the Stuxnet worm, which disabled uranium enrichment centrifuges in Iran). Cyber attacks and cyber terrorism are out of scope of this article; we will concentrate on cyber conflicts involving databases.
Some examples are given here.
Types of Knowledge Discovery in:
1) Databases on Unique Identifiers, including biometric databases
Unique identifiers, or primary keys for identifying people, are critical to any intensive knowledge discovery program. The generated identifier must be extremely secure: it should not be possible to recover the underlying personal data by reverse engineering the cryptographic hash function used to produce it.
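One way to make a hashed identifier harder to reverse can be sketched as follows. A plain, unkeyed hash of a short identifier (for example a document number) can be reversed simply by hashing every possible value and comparing, so this sketch uses a keyed hash (HMAC-SHA256) from the Python standard library instead. The function name `pseudonymize`, the sample document number and the key handling are assumptions made for illustration, not a description of how any of the databases discussed here actually work.

```python
import hashlib
import hmac
import secrets


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Return a keyed SHA-256 digest (HMAC) of an identifier as a hex string."""
    # Normalize first so that trivially different spellings map to one token.
    normalized = identifier.strip().upper().encode("utf-8")
    return hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()


# Hypothetical key: assumed to be generated once and stored outside the database.
key = secrets.token_bytes(32)

token_a = pseudonymize("AB1234567", key)
token_b = pseudonymize(" ab1234567 ", key)  # same person, different formatting

# An unkeyed hash of the same value, shown only for comparison.
unkeyed = hashlib.sha256(b"AB1234567").hexdigest()
```

Here `token_a` and `token_b` are equal, so the token can still serve as a join key, while someone holding only the database cannot enumerate document numbers and match them against the tokens without also obtaining the key.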
For biometric databases, an interesting possibility is determining ethnic identity from biometric information, as well as mapping relatives. The biometric information currently collected consists of fingerprint data, iris data and facial data. A further feature could be the addition of voice data to biometric databases.
This is subject to obvious privacy safeguards.
For example, Google recently unveiled facial recognition to unlock Android 4.0 phones, only to find that the security feature could easily be bypassed using a photo of the owner.
Example of Biometric Databases
In Afghanistan, more than 2 million Afghans have contributed iris, fingerprint and facial data to a biometric database. In India, 121 million people have already been enrolled in the largest biometric database in the world. More than half a million customers of the Tokyo Mitsubishi Bank are already using biometric verification at ATMs.
Examples of Breached Online Databases
In 2011, Sony’s PlayStation Network (PSN) lost the data of 77 million customers, including personal information and credit card information. Additionally, the data of 24 million customers was lost by Sony Online Entertainment. The websites of open source platforms like SourceForge, WineHQ and Kernel.org were also broken into in 2011. Even retailers like McDonald’s and Walgreens reported database breaches.
Cyber conflict becomes relevant in the following cases:
- Databases are online so that legitimate users can access them and authenticate. They can therefore be breached remotely by non-owners (or “perpetrators”), with a much lower chance of the intruder being identified, detected and penalized by regulators or law enforcers (or “protectors”) than in offline modes of intellectual property theft.
- Databases are valuable to external agents (or “sponsors”) who subsidize the perpetrators (with finance, technology, information, motivation) for intellectual property theft. Databases contain information that can be used to disrupt the functioning of a particular economy or corporation (the “primary targets”), or to set off chain or domino effects in accessing other data (the “secondary targets”).
- Loss of data is more expensive to database owners than the added cost of security.
- Loss of data is more disruptive to the people whose data is contained within the database (or “customers”).
The roles played by the different parties around these kinds of databases are therefore:
1) Customers, who are in the database
2) Owners, who own the database. Together, customers and owners form the primary and secondary targets.
3) Protectors, who help customers and owners secure the databases.
and
1) Sponsors, who benefit from the theft or disruption of the database
2) Perpetrators, who execute the actual theft and disruption in the database
Topic models such as latent Dirichlet allocation (LDA) are well known for data reduction on text, and data visualization, including visualization tied to GPS-based location data, is well established for investigative purposes. However, the increasing complexity of data generation and the growing sophistication of machine-learning-driven data processing make this an interesting area to watch.
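As a rough illustration of the kind of data reduction a topic model performs, the sketch below fits LDA to a handful of toy documents using a minimal collapsed Gibbs sampler written in plain Python. The documents, the function name `lda_gibbs` and the hyperparameter values `alpha` and `beta` are all assumptions chosen for illustration. Real investigative work would normally rely on an established library and far larger corpora.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA with collapsed Gibbs sampling; docs is a list of token lists."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]          # tokens per (doc, topic)
    topic_word = [Counter() for _ in range(num_topics)]   # tokens per (topic, word)
    topic_total = [0] * num_topics                        # tokens per topic
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = [rng.randrange(num_topics) for _ in doc]
        for w, z in zip(doc, zs):
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(zs)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Unnormalized conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    # Smoothed per-document topic proportions: the reduced representation.
    theta = [
        [(doc_topic[d][k] + alpha) / (len(doc) + num_topics * alpha)
         for k in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    top_words = [
        [w for w, c in topic_word[k].most_common() if c > 0][:3]
        for k in range(num_topics)
    ]
    return theta, top_words


# Toy corpus (hypothetical): two breach-related and two biometrics-related texts.
docs = [
    "breach credit card data stolen breach".split(),
    "stolen card data breach customers".split(),
    "iris fingerprint facial biometric enrollment".split(),
    "biometric iris fingerprint database enrollment".split(),
]
theta, top_words = lda_gibbs(docs, num_topics=2)
```

Each document is reduced from a bag of words to a short vector of topic proportions in `theta`, and `top_words` lists the most frequent words assigned to each topic. With so little data the recovered topics depend on the random seed, so the output should be read as a demonstration of the mechanics only.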
The next article in this series will cover the kinds of algorithms that are currently used or being proposed for cyber conflict, the role of non-state agents, and the precautions that practitioners of knowledge discovery in databases can take to avoid breaches of security, ethics and regulation.
Citations:
- Michael A. Vatis, Cyber Attacks During the War on Terrorism: A Predictive Analysis, Dartmouth College (Institute for Security Technology Studies).
- Usama Fayyad, Gregory Piatetsky-Shapiro and Padhraic Smyth, From Data Mining to Knowledge Discovery in Databases.