The Mysterious Eva

Just when I thought there were no good transliteration puzzles left, a Japanese site led me to EVA, aka the European Voynich Alphabet, developed to transcribe the Voynich manuscript into a workable Latin alphabet. I’m voting for it being someone’s personal invented language (possibly several languages), but I’d be really interested in playing with the full text if it ever gets released online. I just hope it doesn’t upset too many Welsh librarians.

What CrisisCommons learnt from RHOK1.0

Last weekend (4th-5th June 2010) was the second Random Hacks of Kindness, aka RHOK1.0.

RHOKs are hackathons designed to rapid-prototype software that can be used to manage information before, during and after crises like the Haiti earthquake. RHOK was one of the entities born out of the first CrisisCommons camp (other entities included CrisisCamps and the Aid Information Challenges), and because of this and its closeness to the CrisisCamp aims, CrisisCommons has always kept strong ties with the RHOKs.

Lesson 1: Ask for help if you need it.  CrisisCommons realised last month that RHOK organisation was in trouble: we helped where we could, and the RHOK organisers were gracious enough to accept that help. We couldn’t save RHOK1.0 London (two weeks to go and no announcements, no organisation and, most crucially, no venue arranged in a city where BarCamp venues are getting harder and harder to find) but we did manage to do these things:

  • Passed on as much of our experience of coordinating multi-camp, multi-timezone, multi-language and high-pressure CrisisCamps as we could.  Mostly this meant lots of conference calls and emails, with Heather and Noel providing support to the Washington-based organisers.
  • Sent CrisisCommons organisers out to RHOK camps that needed them. Heather Leson (@heatherleson) did an amazing job keeping RHOK Sydney going, although I’m still kicking myself for not putting myself forward when the call for RHOK Jakarta came in (I thought there were other people who deserved the trip more than I did).
  • Raided the CrisisCommons project lists for suitable RHOK problem statements.  This meant more sitting up late getting the project statements into readable order, but the CrisisCommons project managers (including Kimberly Roluf) did us proud.
  • Gathered information, posted information and kept people engaged on RHOK problems before the RHOK camps (big up to Olidag for his work on the UAV problem).
  • Built the RHOK wiki – with one day’s notice! – ready for the RHOK camps to use (big up to @heatherleson, @spikeuk and Brian Chick for getting this sorted so fast).
  • Ran a RHOK operations centre for the 47 hours (8am Sydney until 5pm Washington/Santiago) that RHOK1.0 was on (@spikeUK, @bodaceacat, Sahana’s @rediguana on watch and wiki editing, with RHOK’s Jeremy on the IT admin).
    • Watched the RHOK IRC channel, camp Ustreams (video and chat), twitter hashtags (of which there were many), wiki recent changes log and email for information that needed capturing (projects, video feeds, organisers etc) and projects that could be hooked up with users and external information providers.
    • Maintained the wiki in real-time so each RHOK camp knew what the others were doing and could coordinate if they were working on the same projects.
    • Created a virtual camp from our VirtualCrisisCamp templates.
    • Found information sources, existing projects and potential users for project teams as they came online.
    • Kept people outside the camps informed about what RHOK was doing and how they could join in with other work post-RHOK.
    • Told RHOK organisers and administrators about emerging issues as we found them.
  • Worked as subject matter experts in RHOK hacking teams (Katie @filbertkm, for example).
  • Told potential users (e.g. DFID, CDAC, the UN) that RHOK was about to happen.
  • Judged the results (Noel)!

This didn’t just go one way though – what CrisisCommons got in return was:

  • New and/or improved applications for CrisisCamp and CrisisCommons responses – Haiti Amps Network got a huge boost from RHOK Nairobi, for example.
  • More knowledge about how to set up a string of camps very quickly – remembering that the next time we do this, it will probably be for a major crisis.
  • Much more experience in running a worldwide, short-term-focus operations centre.
  • Lots and lots of new friends around the world.

Lesson 2: Have your infrastructure ready.  The RHOK infrastructure was not ready for RHOK1.0.  Without an Ops Centre, teams working on the same problems were likely to be unaware of each other’s work.  Without a fully-built wiki, it was difficult to know who to contact about what (including who the other countries’ organisers were and where the tech support was). Without an agreed hashtag, the camps weren’t all able to see each other’s feeds.  CrisisCommons did miss a few tricks in transferring its experience into RHOK though – one of these was forgetting to set multi-language options on the wiki early (people were tweeting in English, Portuguese, Spanish and Indonesian, and language-specific wiki pages might have engaged yet more non-RHOKers with what RHOK was doing).

Lesson 3: Build on existing work. If it’s already been done, then use it (provided it’s open source, of course) to build something better.  We asked some RHOK teams why they hadn’t contacted existing teams working in their application area; we suspected they were shy, and built these bridges for them.  Other teams got out there and built their own bridges once they’d been given the contact details – the UAV team, for example, reached out to DiyDrones and other UAV user groups, and started to link groups together.

Lesson 4: Talk to your users. Crisis management software is not like office management software: it’s measured in lives, not dollars.  If you build something that nobody wants to use, then you’ve wasted effort that could have made a difference elsewhere.  If you ask the people who might use your system about what they need and want, you’ll build something that makes a difference, and it’ll probably be a better system too.  This is the big lesson that the RHOK1.0 winners (Chasm, a landslide prediction app whose team used a subject matter expert and was ready to go by the end of the weekend) can teach us all.  We linked up all the people we could, but a bit more prior preparation (like warning more of the potential users that RHOK was about to happen) could go a long way.

Lesson 5: Arrange 24-hour, visible and robust support with no single points of failure. A huge hand to Jeremy for handling a nasty DoS attack on the RHOK servers after being woken up in the middle of (his) night.  But he really shouldn’t have been the only person with the privileges to do this: two or three other admins arranged around the world could have helped a lot here.

That said, RHOK1.0 worked well and there was some amazing collaboration in teams across the world (yes, we’re looking at you, UAV people, People Finder et al).  The world has gained apps that will definitely save lives, and another bunch of people have learnt that they can make a difference by using their tech skills.  Hats off to the RHOK organisers for pulling this off from an almost standing start, and a big hand to all the people who worked behind the scenes to support them.  I think RHOK is now established as an event, and judging by the reactions of people sad to leave it this time, it already has a groundswell of support. RHOK on!

World insurance schemes

Not all insurance schemes are paid for and delivered in money. Granted, we pay for the European Union, which started as an unstated insurance scheme against war (and seems to have worked very well, given how the German and Greek governments must feel about each other at the moment); ditto the UN.

The biggest insurance scheme is still starting up now, and that’s to distribute skills, capacity and goodwill across the whole of the world, so that if any one part of it is hit by disaster, the rest are willing and able to help. As I keep saying in CrisisCamp, it’s not them and us any more, it’s us and us. I have two favourite illustrations of this at the moment – that an African-led group (Ushahidi) could help with a disaster in the Americas, and that the satellite sites for next month’s Random Hacks of Kindness (RHOK1.0) aren’t London, Paris and New York, but Nairobi, Jakarta and São Paulo.

What happens next could be interesting.  The tiger economies and the Chinese, Indian and Brazilian economies all grew rapidly in the last couple of decades.  But there’s more. The ‘third world’ is huge, complex, and to Western eyes deeply disorganised, but there is a lot of will there to learn about using and creating technologies, and a lot of work on making that possible, from the solar-powered internet station in a portacabin to the African-created idea of linking individual PC wifis from house to house to form mesh networks. If we are all becoming equal in the eyes of the Internet, then we Westerners might one day blink, and find on opening our eyes that Africa has overtaken us in innovation, entrepreneurship and hunger for change.  And then we might need yet another insurance scheme that isn’t paid for or returned in money.

Postscript: I spent part of the weekend watching the Random Hacks of Kindness team working in iHub Nairobi. Very impressive techs; see also the Africa Launch business site.

Visualising wikis

I’ve been doing some website updates recently, as part of the CrisisCommons work. My father taught me always to clean and examine something carefully before taking it apart, and this works as well for code and sites as it does for cars and houses, so I’ve been carefully analysing each site against a set of intended (and frequency-weighted) user journeys. And what would be a really nice thing to have would be a tool that generated a semantic network of a wikisite, so I could trace its hub nodes and get an easy visual representation of how much each node and link is used (colour-coding seems the obvious choice here).

Now I remember the small worlds (everything is just 6 steps from everything else if you know which 6 steps to take) and semantic network theories from uni, and I’ve knocked up a few labelled graphs myself in my time, and I know there’s some great graph generation freeware out there, so I thought “this has got to be a standard item in the open source community, surely”.

Er. No. But there are some good things out there already.

* Aharef’s visualising websites as graphs (example). Very close to what I’m looking for, but runs off weblinks and doesn’t tag the hub nodes.
* Flexplorer. Great tool for mapping all the websites you’re pointing at. Not so great for mapping just the one wikisite. Does do good labels.
* Wikmindmap. Does wikis. Does labels. Doesn’t do the whole of a wiki. More of an explorer’s torch (sees a short way but very well) than an explorer’s map (sees everything but in less detail).
* Powermapper – too literal. You get a (non-graphical) representation of the site contents rather than a summary that you can infer metainformation from.

So close, but no cigars. There is however a plan. Aharef has published his/her/their mapping code, so I’m going to see if I can change it to pull out the [[]] tags, and add a name to every heavily-linked node. It’s a plan. We’ll see how the reality pans out. But right now, I’m having a lovely time playing with the Processing visualisation tool and working out whether hacking Flexplorer is a better option.  And it seems like I’m not the only person thinking about doing this.
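To make the plan concrete, here’s a minimal sketch of the [[]]-extraction step in Python. The pages dict is a stand-in for a real wiki dump or API export, and the “heavily-linked” threshold is an arbitrary placeholder:

```python
# A minimal sketch of the wiki-graph idea: pull [[...]] links out of raw
# wikitext, build a directed edge list, and name the heavily-linked hub nodes.
# The pages dict below is a stand-in for a real wiki dump or API export.
import re
from collections import Counter

pages = {
    "Main Page": "See [[Projects]] and [[Volunteers]] for details.",
    "Projects": "Current work: [[UAV Challenge]], run by [[Volunteers]].",
    "UAV Challenge": "Back to [[Projects]].",
}

link_pattern = re.compile(r"\[\[([^\]|]+)")  # capture link target, ignore |display text

edges = [(source, target.strip())
         for source, text in pages.items()
         for target in link_pattern.findall(text)]

in_degree = Counter(target for _, target in edges)
hubs = [page for page, count in in_degree.most_common() if count > 1]
print("edges:", edges)
print("hub nodes to label:", hubs)  # e.g. ['Projects', 'Volunteers']
```

Feeding the edge list into a graph layout tool (Processing, Graphviz or similar) would then give the colour-coded map I’m after.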

Tickboxes

Yep, tickboxes.  One of the issues for many real-life entities is data entry from paper forms (as opposed to the online forms used by mainly Internet-based entities: yes, yes, the physical world does still exist, and yes, it is sometimes easier to get people to put crosses on paper than to lead them to the right website or phone app).  Specifically, they use paper because it’s their best shot at getting lots of people to fill in data fields, but they then have lots of paper data that they need to digitise.

Options for this, rated from high to low operator involvement, low to high flexibility and low to high complexity, are:

  • Use the forms as raw data, i.e. do all the analysis needed by counting up ticks in boxes etc. by hand. This works, but takes time, and isn’t easy to cross-check or extend with new analyses; it can however be optimal for small amounts of data where the analysis is needed quickly.
  • Get someone to type in all the data from the forms. This takes time, but is usually a faster way to get relatively small amounts of data ready for analysis than spending time working out how to do the digitisation.
  • Scan the forms whole into the system.  This captures the data online, but is no better for analysis than the first option.
  • Scan the forms then use OCR and image processing to capture the data on each form. This works for some limited types of data: typed data, neatly filled-in tick-boxes, carefully-spaced capital letters (a la postcode reader), but doesn’t capture free text and may have problems with messy inputs.
  • Scan the forms and use OCR/ image processing to capture as much data from the form as possible, then use the operator to cross-check the captured data (e.g. by online comparison between the original form and the results) and input any free text or other difficult-to-read data.

I’m looking at the last option for some IT4Communities stuff.  As with most autonomous systems, it makes sense to use the human’s and the machine’s strengths together, and to work comfortably somewhere away from both excessive operator workload (low autonomy) and massively complex systems (high autonomy).

I could write a segment-then-find-boxes-then-check-their-occupancy image processing subsystem, but I suspect that because this is a relatively common problem, someone has probably already done it.  I’ll attempt to write the system out of sheer curiosity if I can’t find one (a rough sketch of the idea is below), but meanwhile places I’m starting to look include:

  • census data processing
  • exam paper processing
  • searches for freeware paper-form processing tools
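And in case the searching draws a blank, here’s a rough sketch of the find-boxes-then-check-occupancy idea using OpenCV. The size limits and fill threshold are illustrative guesses that would need tuning for any real form:

```python
# A minimal sketch of "find the boxes, then check their occupancy" (OpenCV 4.x,
# pip install opencv-python). Thresholds and sizes are illustrative guesses;
# a real system would tune them per form and let an operator cross-check.
import cv2

def find_ticked_boxes(image_path, min_side=15, max_side=60, fill_threshold=0.1):
    grey = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    # Invert so ink is white (non-zero) on a black background; Otsu picks the threshold.
    _, binary = cv2.threshold(grey, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(binary, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
    results = []
    for contour in contours:
        x, y, w, h = cv2.boundingRect(contour)
        # Keep roughly square shapes in the expected tick-box size range.
        if min_side <= w <= max_side and min_side <= h <= max_side and 0.8 <= w / h <= 1.2:
            interior = binary[y + 3:y + h - 3, x + 3:x + w - 3]  # skip the box outline
            if interior.size == 0:
                continue
            fill = cv2.countNonZero(interior) / interior.size
            results.append(((x, y, w, h), fill > fill_threshold))
    return results  # list of (bounding box, ticked?) pairs for operator cross-checking
```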


The Black Swan – first impressions

I’m still in the prologue, and already I deeply dislike this book. I’m hoping that the prologue is ironic, because if not, then it appears both arrogant and unaware of much of the work on risk management and risk perception that started out in artificial intelligence with the frame problem, nonmonotonic logics and some very long (as in decades-long) arguments about value and uncertainty.

I’ll probably persist, because I was given the book to read by someone whose judgement I rate, but I suspect I’m going to find it difficult not to shout at the pages sometimes. Maybe it would have been easier if I hadn’t started this book right after the academic joy that was reading Stumbling on Happiness.

Learning Japanese, I think I’m…

Well, not really: I learnt Japanese at uni 20 years ago, and since then several other languages have caught my eye. But sometimes it’s useful to remember enough to ask the right question. Which is how I ended up on the grammar and translation sites this morning, and why I’m now amazed at how much Japanese language teaching has changed since the *cough* 1980s. Thanks to the Internet, we now have more access to native language sites and university notes than was ever possible back then, and, like French-English teaching (more of which perhaps later *), the emphasis is very different now.  Most significant is the much greater emphasis on learning character sets. When I first learnt Japanese, I learnt the hiragana characters and some useful bits of kanji (in case some joker decided to encrypt the labels on the toilets again – if you never learn anything else in a language, memorise please, thank you, do you speak English, men and women), but much of what I was reading and speaking at first was Latin-alphabet phonetics (romaji).

So I thought: a little light reading-up on grammar, a quick play with the translators (e.g. Babylon8) to check I wasn’t calling anyone a starfish, and off I’d go. Oh no. Not any more. And I really should have expected this after learning pinyin (Latin phonetics) and Chinese characters alongside each other in my Mandarin lessons, but along with the more grown-up phrase sets (I fondly remember the idiosyncrasies in my copy of Japanese for Busy People), there’s an almost total reversion to kanji for everything. Which would be great if I needed to relearn Japanese, but not so useful for a quick amusement involving a site with unknown character-set support. So if you ever find yourself in this situation too, the best places to go are this English-romaji dictionary and this kanji-romaji converter.

I think I just about got there on the translation, but I still suspect I managed to call someone a starfish.

*Footnote: If you’re ever stuck in a foggy French town and have run out of museums to visit and books to read, try this. Find a bookshop. Go to the languages section. And buy a “how to speak English” book. In French. Then spend the rest of said foggy afternoon drinking coffee and giggling over what the French writers thought were appropriate and useful English phrases.  Sadly, I suspect this game also works well in reverse for French people stuck in foggy English towns for the afternoon. As does the “everyone get their phrasebook out” game where people from several nationalities share their favourite translations into language X, and learn things like “Italian-X phrasebooks include a section on how to swear”.

Old papers: Intelligence Analysis for New Forms of Conflict

Another old friend from back in the days before old age and disillusionment (aka 1997). I don’t do this sort of thing any more, but I’d really like to think about how it could be applied to more civilian activities.

Abstract

As the volume of information available from modern sensors and sources increases, processing all available intelligence data has become too complex and time-consuming for intelligence analysts. There is a need to assess how technology can and should be used to assist them, and for a clear view of the issues raised by the data and processing methods available. This paper discusses automated intelligence analysis, its impact and its role in modern conflict analysis.

1. Intelligence Analysis

Before we can automate intelligence analysis, we need to know what it does. The output of intelligence analysis (by human or machine) is intelligence: information that is pertinent to the current area of interest of a commander or policy maker, in a form that he can use to improve his position relative to another, usually opposing, commander. The other output of intelligence work is counterintelligence: misleading information supplied to the opposing commander (well-managed counterintelligence should also improve the commander’s position relative to his opponent). Simplistically, intelligence analysis is the collection and painstaking sifting of information to gain insight into and build a model of part of the world (a battlefield, country or other topic of interest) that is relevant to a commander’s needs and actions; intelligence is a set of beliefs about the state of that world and the actions and intents of another party within it.

1.1 The need for intelligence analysis

To make decisions, commanders need accurate and timely information based on as much relevant data as possible. The importance of good intelligence cannot be overstated; in a military situation, a commander is effectively blind without the battlefield awareness gained from intelligence. The view that a commander has of a situation depends on information pools and flows, as shown in table 1 and figure 1, where each OODA loop consists of Observe (using sensors and knowledge sources), Orient (intelligence analysis), Decide (command decision making using both learnt doctrine and individual style) and Act (move, oppose or contain a move). Note that in modern conflict there may be zero, one or several other forces’ OODA loops in a commander’s viewpoint, and that too little may be known about their structures, doctrine or equipment for conventional (orbat and doctrine-based) analysis. Given that each commander in a battle has their own view of it, it becomes important for a friendly commander’s battlefield awareness to be dominant (better than that of the opposition). A further consideration for intelligence analysts is that it isn’t enough to provide a commander with information if it deluges him (he is given too much information to understand in the time available), changes too rapidly or can’t be trusted.

1.2 Why automate intelligence analysis?

Modern conflict focuses on the observation and control of other forces, where command and analysis rely on timely and accurate intelligence. Automated decision-making (currently at the research stage) will need fast, accurate sources of information to succeed. There is little argument about the need for intelligence in modern conflict, but before we consider automating intelligence processing, we need to decide whether, given the small-scale uncertainty and high complexity of modern low-intensity conflicts (Keegan’s guerrilla-style real war) and operations other than war, it is a worthwhile thing to do. There are few metrics available for intelligence analysis: for human analysts there are the principles of intelligence, and intelligence processing systems are generally judged on the speed of their message processing and the accuracy of their outputs (although accuracy is usually judged using a model of the world’s ground truth). These can be summarised by saying that intelligence processing should provide the commander with the most honest view possible of his area of interest, given the data, resources and time available.

Table 1: One commander’s information
* each force: world view including intelligence weaknesses, capabilities, intent, doctrine, deception, counterintelligence, available supplies and equipment, psychology, morale, coordination, allegiance
* environment: terrain, possible changes (for example weather), human features (for example bridges and cities)
* situation: current state of play (situation and events), possible moves and events

Intelligence processing is a natural process for humans, but humans have limits: when the input data becomes too large to process within the time constraints given, or too complex (uncertain) to think clearly about, then automation of some of the processing must be considered. Given the increasing volume and complexity of information available from modern sensors, lower-level processing and open sources, it will soon be impossible (especially in a tactical situation) for analysts to process all of it efficiently and on time, and it is time now to start thinking about how automation can help. The second argument for automating intelligence processing concerns the advantages (mainly processing speed and uncertainty handling) that a reasonable intelligence processing system would give a commander. Concerns about the conflict superiority that technology gives are not new (indeed, they date at least from when the first arrow was made), but the emphasis has changed; recent warfare has moved from the production of larger, more efficient weapons to the observation and control of other forces (although this idea dates to Sun Tsu’s predecessors). This is reflected in the focus of current technology and research at the Orient (intelligence processing) and Decide (decision making) stages of figure 1 (the progress of defence automation, as measured on the OODA loop). It is perhaps preferable to understand or possess intelligence processing systems than to fight a force that is better prepared by using them.

2. Models for automating intelligence analysis

The cognitive theory of intelligence analysis has been well studied. Not surprisingly, the intelligence cycle has much in common with models of human perception and processing. These are useful starting points in defining what is needed in an intelligence processing system. We learn that intelligence models are incomplete (we can never model the entire world), disruptible (models will always be vulnerable to external influences and counterintelligence), limited (models will always be limited by sensor capabilities), uncertain (input information and processing are usually uncertain) and continuously changing (models must deal with stale data and changes over time).
Requirements for future intelligence processing systems include rigorously handling uncertainty and partial knowledge (of inputs, area, opponent behaviour, training and equipment), detecting and cleanly retracting misleading information (including counterintelligence), giving credible and understandable explanations of reasoning, handling data that changes over time, and including quantitative sensor-derived data in a symbolic reasoning framework. How much the computer should do, and how much should be left for the analyst, is also an interesting problem. The analysis of, and attempt to automate, ‘judgement’ is a necessary step for the automation of low-level command decisions, but the change from assisting analysts to automated intelligence processing will redefine the operating procedures of military staff, and is expected to meet opposition. Since intelligence must be timely to be useful, the control of reasoning in these models must also be addressed: for example, since the creation of intelligence models is cyclic (as more intelligence is generated, so the gaps in the commander’s knowledge are pointed out, and more intelligence gathering is needed), when to stop processing and where to focus processing attention may be issues.

Decisions to be made include whether to distribute processing, and the acceptability of anytime algorithms (these output timely but less accurate information). To be useful, information must also be relevant to its user. Relevance is difficult to define; attempts to model it include user modelling and user profiling, and these should be allowed for in models if possible. Both the intelligence cycle and high-level data fusion models have been considered as a basis for intelligence processing models. These are not the only models that may be appropriate; the design of generic intelligence models and processing should, as far as possible, incorporate techniques from contributing areas like cognitive psychology (mental models), detective work (psychological profiling), data mining (background information processing and user profiles) and data fusion (situation assessment).
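The anytime idea mentioned above is easy to illustrate with a toy sketch: the estimator can be interrupted at any deadline and always returns its best answer so far (Monte Carlo estimation stands in here for whatever reasoning a real intelligence system would do):

```python
# A toy "anytime" estimator: interruptible at a deadline, always returns its
# best answer so far, trading accuracy for timeliness. Illustrative only.
import random, time

def anytime_estimate(sample, deadline_seconds):
    total, count = 0.0, 0
    deadline = time.monotonic() + deadline_seconds
    while time.monotonic() < deadline:
        total += sample()
        count += 1
    return total / max(count, 1)  # less accurate if stopped early, but always available

# Example: estimate the mean of a noisy source with a 10 ms budget.
print(anytime_estimate(lambda: random.gauss(5.0, 2.0), 0.01))
```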

2.1 High-level data fusion

Data fusion is the combination of separate views of the world (from different information sources, times or sensor bandwidths) into a big picture of the world. Data fusion can occur at different levels of processing, from combining sensor outputs (low level) to combining processed information (high level). Intelligence processing, in its combination of information from different sources, is equivalent to and can gain from high-level (and also low-level) data fusion techniques.

2.2 The intelligence cycle

The Intelligence Cycle describes intelligence processing in stages. Three models are considered here: the UK and US models (which have different names for essentially the same stages), and Trafton’s model, which adds an extra stage (utilisation) to reflect use of the intelligence. The UK model, with direction and utilisation stages added (as shown in figure 2), will be used in the rest of this section. The intelligence cycle isn’t just a description of the processes of intelligence; it also shows the flow of information within intelligence analysis, from commander to intelligence analysts and back. Within the cycle there are subcycles, the most important of which is the redirection of sensors and data collection to cover areas of ignorance found during processing. The rest of this section describes the automation of each part of the intelligence cycle.

2.3 Planning, Direction and Collection, deciding what intelligence to collect

Information that analysts use to produce intelligence includes sensor data and intelligence reports, but there are other less obvious inputs: the information that a commander has requested, both now and in the past, stored knowledge about the area of interest, and knowledge of opponents’ resources, training and expected behaviour. Input information is usually uncertain and often untrustworthy. Intelligence sources are classified into human, image, signals and measurement intelligence, the inputs from which are often labelled with uncertainties (source and information credibility for human intelligence, sensor errors for other categories of data). Since there is often a larger requirement for intelligence data than collection agencies can meet, there is scope for automating collection management with a constraint satisfaction algorithm or reasoning system, using the priority and cost of data as inputs (currently available tools include the JCMT system).
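As an illustration of automated collection management (a toy greedy heuristic, not the JCMT system’s actual method; all task names and numbers are invented):

```python
# A toy allocator for collection management: requests have a priority and a
# cost, and the collection budget is limited. A greedy priority-per-cost
# heuristic stands in for a full constraint-satisfaction solver.
def allocate(requests, capacity):
    """requests: list of (name, priority, cost); capacity: total collection budget."""
    plan, remaining = [], capacity
    # Serve the best priority-per-cost requests first until the budget runs out.
    for name, priority, cost in sorted(requests, key=lambda r: r[1] / r[2], reverse=True):
        if cost <= remaining:
            plan.append(name)
            remaining -= cost
    return plan

tasking = allocate([("bridge imagery", 9, 4), ("convoy SIGINT", 7, 2),
                    ("border HUMINT", 5, 5)], capacity=6)
print(tasking)  # ['convoy SIGINT', 'bridge imagery']
```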

2.4 Collation and Analysis, processing the input information

Intelligence processing covers the stages of the intelligence cycle where intelligence is created from information. Processing by analysts is broken down into collation (sorting input data into groups, for example by units or subject) and analysis (making sense of the data and producing the big picture from it). Analysis is sometimes broken down further into evaluation, analysis, integration and interpretation. Overall, intelligence processing uses input information to update knowledge about the situation and create relevant intelligence reports. Intelligence processing doesn’t have to be very sophisticated to make a difference to an overworked analyst; typical intelligence processing requirements are classification and matching reports to units, both of which are within the capabilities of current systems (although there is always some expected error). A good intelligence processing system should be capable of at least these two functions, with room for extension to techniques like recognising counterintelligence, behaviour and intent as research progresses. Intelligence processing is a large part of the reasoning in an intelligence analysis system, and has been given a separate (later) section in this paper.
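As an illustration of the collation step, a toy matcher that sorts incoming reports against known units; the distance-and-type scoring is an invented heuristic, not a fielded algorithm:

```python
# A toy report-to-unit matcher for the collation step: score each known unit
# by proximity and equipment type, and assign the report to the best match.
# Units, positions and types are invented for illustration.
import math

units = {"3rd Armoured": {"pos": (10.0, 4.0), "types": {"tank", "apc"}},
         "Recce Coy":    {"pos": (2.0, 7.5),  "types": {"scout car"}}}

def match_report(report, max_distance=5.0):
    best_unit, best_score = None, 0.0
    for name, unit in units.items():
        distance = math.dist(report["pos"], unit["pos"])
        if distance > max_distance or report["type"] not in unit["types"]:
            continue
        score = 1.0 - distance / max_distance  # closer sightings score higher
        if score > best_score:
            best_unit, best_score = name, score
    return best_unit  # None means a new or unexplained entity

print(match_report({"pos": (9.0, 5.0), "type": "tank"}))  # 3rd Armoured
```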

2.5 Dissemination and utilisation, getting intelligence to the user

Intelligence is no use if it doesn’t get to the user on time and in a comprehensible, credible form: dissemination is essentially an attempt to adjust the user’s mental models of the situation – something which will not happen if the user does not trust the system. Utilisation is a catch-all term for using the data; any actions taken by the user will affect the state of the real world and may change the commander’s intelligence needs, restarting the intelligence cycle (at the collection stage).
Automating intelligence processing will change the style and types of output available from it. Rigorously handling uncertainty in inputs and processing will improve the accuracy of intelligence processing outputs, but will also give the system designer a difficult choice between providing definite but inaccurate information to the user (this approach is preferred in, for example, [10]), providing more accurate but possibly confusing information (for example, “T123 is a tank, with 90 percent confidence”), or providing a set of possible explanations for the current data.

3 Automating Intelligence Processing

Table 3 shows some of the problems and issues in processing intelligence data.
The most pressing of these (uncertainty handling) is discussed further in [4].

3.1 Processing frameworks

Frameworks in existing intelligence processing systems include assumption-based truth maintenance systems, blackboards and graphical belief networks. Of these, belief networks seem most promising as a generic intelligence processing framework, since they can be used to combine uncertain sensor-derived data and knowledge within a probabilistic framework, and have a body of theory behind them that includes sensitivity analysis. The use of belief networks in intelligence processing is discussed in a separate paper; recommendations from it include further research into processing frameworks, including study of the following (a toy belief-network example follows the list):
* Focus of processing attention
* Handling of time-varying information
* Retraction of information and its ‘traces’ within the system
* Explanation of reasoning, including extraction of alternative explanations
* Multiple viewpoints and values
* Recognising and analysing group behaviour
* User profiling
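As a toy example of the belief-network style of reasoning recommended above (a two-node network with invented probabilities; real systems would use elicited or learnt conditional probability tables):

```python
# A toy two-node belief network: a hidden "vehicle" node with a sensor-report
# child. All numbers are invented for illustration.
priors = {"tank": 0.3, "truck": 0.7}
# P(sensor reports "tracked" | vehicle class) — the conditional probability table.
likelihood = {"tank": 0.9, "truck": 0.2}

def posterior(report_tracked=True):
    # Bayes' rule: weight each hypothesis by how well it explains the report.
    joint = {v: priors[v] * (likelihood[v] if report_tracked else 1 - likelihood[v])
             for v in priors}
    total = sum(joint.values())
    return {v: p / total for v, p in joint.items()}

print(posterior())  # {'tank': 0.658..., 'truck': 0.341...}
```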

Table 3: Problems for intelligence processing
* Input: multiple reports about the same event/object; information incest; repeated and single-source reports; untrustworthy information, its removal and its effects; varying credibility of information
* Representation: mixture of text-based reports and sensor data; several possible explanations for data; handling large numbers of parameters
* Speed: increasing amounts of complex information/data; need for accurate, timely information; combinatorial explosion
* Interface: credible and understandable explanation of reasoning; human inspection of data and intervention in reasoning
* Time: data that becomes irrelevant (deciding on and removing it); data and situations that change over time

More than one framework could be used to split processing into essential and background work. This would provide opportunities for analysis of the patterns and flow of intelligence data, including dominant routes, behaviour and counterintelligence.

3.2 Existing intelligence processing systems

System designs and studies of how to automate intelligence processing are increasing in response to the need to take some pressure off the users and analysts. Although systems that handle intelligence data exist, some (for example GIFT, AUSTACCS) assist users to access information and don’t attempt processing; many (for example ASAS, DELOS) do not process it beyond the correlation (sorting data into groups) stage of the intelligence cycle, and most are vulnerable to uncertainty in their inputs and missing data in their knowledge of tactics and equipment. Systems that attempt full intelligence processing (for example HiBurst, IMSA, TALON and Quarterback) are usually based on Artificial Intelligence methods, which include assumption-based truth maintenance systems, argumentation and expert systems augmented with fuzzy logic. Although probabilistic networks seem a promising base representation for intelligence processing, research on intelligence fusion using probabilistic networks appears to be confined to teams at the UK Defence Research Agency, George Mason University and some US commercial sites.

3.3 Using Data Mining as a testbed for Intelligence Processing

There is a vast amount of electronic data even in a single company. Most of it is distributed across several systems and formats, and is useful but inaccessible. Data Mining is the process of extracting implicit and relevant information from data sources, which are usually databases but can be open sources, e.g. the Internet. Data Mining gives us the opportunity to test intelligence fusion algorithms using open source information as input.

4 Automating counterintelligence

As counter-terrorism and infiltration units have discovered, getting inside the opponent’s OODA loop is more subtle than destroying communication links: his information can be manipulated to our commander’s advantage. We can evade his Observe process using stealth and EW technology; we can disrupt it using deception. Counterintelligence also affects the Orient and Decide processes, both by the data deluge of processing resources and by the creation of uncertainty in an opposing commander’s mind. The creation of models of an enemy commander from his known doctrine and reactions also allows us to anticipate his moves rather than just react to them – it also makes the targeting of counterintelligence (for instance, in information warfare) possible. This augments the current countermoves of defending against enemy actions (e.g. using air and ballistic missile defences). As conflict can be viewed as an interacting series of these pairs of OODA loops, this is also a useful starting point for the automation of low-level command decisions.

5 Conclusions

Intelligence processing creates a belief in the state of a world (i.e. a battlefield) from uncertain and often untrustworthy information together with inputs received from sensors. This is a natural process for humans, but humans have limits: when the input data becomes too large to process within the time constraints given, or too complex (uncertain) to think clearly about, then automation of some of the processing must be considered. This paper outlines issues to be considered in designing the next generation of UK intelligence analysis systems. Future systems should allow analysts to concentrate on high-level analysis rather than clerical operations like duplication elimination. Automated intelligence techniques and systems are being designed in response to this need. Most of them assist analysts by providing better information retrieval and handling tools. The automated processing of intelligence is beginning to be addressed, but lacks a mathematically sound representation; techniques based on data mining, cognitive psychology and graphical networks are promising, but need further research effort. This work also unites sensor-based data fusion systems with knowledge-based intelligence processing and decision support. Possibilities arising from using a complete and efficient representation include the ability to use most of the information available to a system, the analysis of patterns of behaviour and the generation/recognition of counterintelligence data.

Bibliography

1. NATO Intelligence Doctrine, NATO report AINTP-1, 1996.
2. Canamero D, Modeling plan recognition for decision support, European Knowledge Acquisition Workshop.
3. Companion M A, Corso G M and Kass S J, Situational awareness: an analysis and preliminary model of the cognitive process, Technical report IST-TR, University of Central Florida.
4. Farmer S J, Uncertainty handling for military intelligence systems, WUPES, Prague.
5. Katter R V, Montgomery C A and Thompson J R, Cognitive processes in intelligence analysis: a descriptive model and review of the literature, Technical report, US Army Research Institute for the Behavioural and Social Sciences.
6. Keegan J, A History of Warfare, Random House, London.
7. Keegan J, Computers can’t replace judgement, Forbes ASAP, December.
8. Laskey K B, Mahoney S and Stibio B, Probabilistic reasoning for assessment of enemy intentions, Technical report, George Mason University.
9. Shulsky A N, Silent Warfare: understanding the world of intelligence, Brassey’s, US.
10. Taylor P C J and Strawbridge R F, Data fusion of battlefield information sources, Royal Military College of Science, Shrivenham.
11. Trafton D E, Intelligence failure and its prevention, Naval War College, Newport RI.
12. Sun Tsu (date unknown), The Art of War, Oxford University Press.

Old papers: Uncertainty Handling for Military Intelligence Systems

Something I had lying around the office; I thought it would be useful for the archives in a “what I thought when I was 10 years younger” sort of way.

ABSTRACT

We describe sources of, and techniques for handling, uncertainty in military intelligence models. We discuss issues for extending and using these models to generate counterintelligence, recognise groups of uncertainly-labelled entities and recognise variations in behaviour patterns.

INTRODUCTION

Intelligence is the information that a commander uses to make his decisions. It is an informed view of the current area of interest of a commander or policymaker, in a form that he can use to improve his position relative to another, usually opposing, commander.
Uses of intelligence include the basis for command decision making and the creation of uncertainty in opposing commanders’ systems and minds. Commanders use intelligence to recognise situations (situation awareness), predict changes in situations, predict an enemy’s behaviour (threat assessment) and decide which actions to take (planning). The quality and availability of intelligence (rather than information) determines whether a force is reactive (can only react to its environment or opponent’s moves) or proactive (can make informed plans and manipulate its situation).
Two major problems for commanders in the Persian Gulf conflict were the volume and complexity of intelligence data. If these are to be alleviated, methods for producing efficient representations of input data and information must be found – this includes automating the processing of raw intelligence data into useful knowledge.

MILITARY INTELLIGENCE PROCESSING

‘Know the enemy and know yourself; in a hundred battles you will never be in peril. When you are ignorant of the enemy but know yourself, your chances of winning or losing are equal. If ignorant of both your enemy and yourself, you are certain in every battle to be in peril.’ (Sun Tsu)

The role of intelligence processing is to make sense of the world by piecing together the uncertain, conflicting but usually copious evidence available.
Intelligence is information that is pertinent to the current area of interest of a commander or policy maker, in a form that he can use to improve his position relative to another, usually opposing, commander. Although some intelligence work is the stuff of James Bond and Le Carre novels, intelligence analysis is the painstaking sifting of information to gain insight into the actions and intents of another party. Intelligence processing creates a model, or informed belief, of the state of that part of the world which is relevant to a commander’s decisions and actions. Intelligence is produced by fusing uncertain and often untrustworthy information (sensor outputs and text-based reports) with prior knowledge (e.g. enemy equipment and tactics). This is a natural process for humans, but when the input data becomes too large to process within the time constraints given, or too complex to think clearly about (people do not reason rationally under uncertainty), then automation of some of the processing must be considered. The flow of analysis is usually based on the intelligence cycle: Direction – deciding what intelligence is needed; Collection – collecting information; Collation – sorting information; Evaluation – processing information into intelligence; and Dissemination – giving that intelligence to the commanders/users. (NB this is the UK definition of the intelligence cycle; different labels are used in the US definition.)
Current intelligence is gathered on a limited number of topics or geographical areas, but currently irrelevant basic intelligence is also processed and stored, ready for when the attention of a commander or situation shifts. The creation of intelligence models is cyclic; as a better picture of the world is generated, the gaps in the commander’s knowledge are pointed out, and more intelligence gathering is needed (this is shown in the diagram below by a feedback loop from collection to evaluation).
Characteristics of intelligence systems are that they are driven by a set of goals (the commander’s requests for information), have information sources that can be partially controlled, situations that change over time and a large body of input information that is uncertain and incomplete. There is normally at least one non-cooperating red agent capable of actions against the blue commander using these systems. Enemy actions against blue’s intelligence operations include counterintelligence and deception: attempts to distort blue’s model of the world. Other agents that may need to be modelled include neutral forces and civilians. Intelligence processing concentrates on resolving the uncertainties caused by inaccurate, infrequent and incomplete inputs, cultural differences, counterintelligence and approximate reasoning.
Intelligence models are:
* incomplete (we can never model the entire world),
* disruptible (models will always be vulnerable to external influences and counterintelligence),
* limited (models will always be limited by sensor capabilities),
* uncertain (input information and processing are usually uncertain) and
* continuously changing (models must deal with stale data and changes over time)
Military intelligence can go one step further than just modelling an uncertain world; in using counterintelligence and deception about his plans, situation and actions, a commander is creating uncertainty in an opposing commander’s models.
The use of counterintelligence is one of the main differences between military intelligence and other uncertainty handling models (although there are similarities in handling counterintelligence, fraud, input errors and cultural differences).

AUTOMATING INTELLIGENCE PROCESSING

Although intelligence is currently processed by analysts, its automation is being driven by increasingly smaller time-frames and the greater volume and complexity of available information. Intelligence processing is increasingly similar to high-level data fusion (intelligence-level fusion): making sense of the world from as much input data and information as possible.
Military conflict is essentially chaotic. It is a sequence of well-defined moves that interact locally, yet produce long-range effects. At the local level it is still possible to model these effects if they are bounded by physical laws, resources and the trained behaviour or rules of the parties involved.
During the cold war, the West faced known enemies on known territory with well-modelled outcomes (a winter war across Germany). Post cold-war intelligence analysis deals with more uncertain (less is known about the enemy) and complex (conflict is more likely to be in a setting which contains neutral populations) environments and forces. Although small-scale, terrorist and guerrilla conflicts may seem random, they are still constrained (by environment and logistics), their players still trained (often in tactics well known to the West) and their sequences of actions still partially predictable.
Automated intelligence analysis systems are limited by time constraints and are unlikely to produce perfect summaries of the world. It should be stressed that their prime function should be to improve current intelligence analysis. The aim of this work is not to produce exact solutions and assessments of uncertain inputs and situations, but to give a commander as honest an assessment of a battlefield as possible within the constraints of the inputs, uncertainties and processing time available. This paper focuses on the sources of and methods for handling uncertainty in military intelligence systems: [5] discusses other aspects of automating intelligence processing in greater depth.
‘War is the realm of uncertainty; three quarters of the factors on which action in war is based are wrapped in a fog of greater or lesser uncertainty. The commander must work in a medium which his eyes cannot see, which his best deductive powers cannot always fathom, and with which, because of constant changes, he can rarely become familiar.’ [4]
Uncertainty is not an important issue most of the time, as a commander will recognise the situation and react to it. Issues to be addressed include sources of uncertainty, whether we can improve our sensor allocations to reduce uncertainty, and how much uncertainty matters (how much uncertainty we can tolerate before a system is ignored or useless).
An intelligence processing system should use all (or as much as possible) of the information available to it. This information is more than just input reports and sensor data; the context of an operation, open sources, analysts’ knowledge and the needs and preferences of users are also available. The system should not take every input fact as certain; fortunately, most of this information is tagged with source and information credibility, sensor accuracy or a range of possible values.
The reasoning framework used cannot be divorced from decisions about how to handle uncertainty. The aim of an intelligence processing system is to use prior experience and knowledge to pull out the information implicit in input data, whilst losing as little of that information as possible. One of the main differences between reasoning frameworks is the point at which they discard information. This ranges from rule-based expert systems, which force a user to decide on the truth or falsity of input statements, to systems which manage uncertainty about inputs, conclusions and reasoning to produce an assessment of a situation which takes account of all of these. The latter kind of system is the most desirable.

COUNTERINTELLIGENCE AND ERRONEOUS INPUTS

Counterintelligence is the main difference between uncertainty handling in military and other systems. Modelling a military domain is compounded by an enemy attempting to deceive sensors and subtly change our models of the situation. Counterintelligence manifests itself as conflicts between conclusions and an unexpected lack of accumulation of supporting evidence. Conflicts can be traced back to sources and information, and counterintelligence hypotheses included and evaluated. This can be incorporated from the outset by regarding inputs as observations of hidden information (either intelligence or counterintelligence).

THE USER’S UNCERTAINTY

The information output includes physical data (geography and positions; movements of forces), tactics and expected behaviour patterns, and social factors. Although most of these should have uncertainties associated with them, they currently do not, and one of the first questions in building an intelligence processing system should be whether this matters and, if so, how much. The final point at which information is discarded (uncertainty occurs) is in the user’s mind. Knowing what the user is interested in (user profiling) can focus the output. Even if an honest summary of the situation has been produced, complete with uncertainties and probabilities of different scenarios and actions, if this model is not transferred to the user’s model of the world then the processing will have been useless. Users also suffer from hypothesis lock, in which alternative explanations are rejected regardless of accumulating evidence. Managing this phenomenon requires good explanation of reasoning, uncertainty and evidence.

ARCHITECTURE

The choice of reasoning framework is central to this work, both in its flexibility and its handling of uncertainty. Although intelligence is currently processed by human analysts, attempts to model it have included fuzzy logic, belief networks, assumption-based truth maintenance systems and rule-bases with exceptions. A Belief Network is a network (nodes connected by links) of variables that probabilistically represents a model – i.e. the beliefs that a user has about a world. Its main use is as a reasoning framework for manipulating and combining uncertain knowledge about both symbolic and numeric information. Belief networks can be extended to make decisions based on a user’s stated preferences; such networks are known as Influence Diagrams. There is a large body of research into many aspects of their use, which includes learning networks from data, temporal (dynamic) networks and efficient evidence propagation. We consider Belief Networks to be an appropriate framework because they handle uncertainty in a mathematically rigorous way, and they can be manipulated to provide more than just a model of a world. Our experience in using belief networks for such a complex and uncertain application has, however, highlighted shortcomings in current belief network theory. Key problems identified include the lack of high-level structure, the treatment of time-varying information (including hysteresis effects), correlation between the real world and the model, slow speed (we may need to accept tradeoffs between uncertainty and execution times), the handling of ignorance, and their single model (viewpoint) of the world.

GROUPS AND OBJECT HIERARCHIES

Analysis of typical intelligence problems has shown information to be hierarchical (and sometimes fractal), grouped and layered. An example is air picture compilation, where an aircraft can carry several different weapons (which are each applicable to different types of target), and aircraft of different types are grouped into packages which then perform single missions. We propose the use of an object-oriented framework, where each object (i.e. aircraft) contains a network that can inherit nodes, sub-nets and conditional probability tables from a class hierarchy. Each object network contains hooks – nodes that correspond to similar nodes in other objects’ networks. Links between these nodes are often simple one-to-one conditional probability tables, but can be more complex; for instance, a package will have a one-to-many relationship with several aircraft. This allows the dynamic creation of large networks from components. It also allows the use of default propagations across objects (which are often single nodes in higher-level networks), default sub-networks (prototypes) and extra functionality (for instance the modification of input data). Using these robust architectures should improve network design times, but some consideration needs to be given to how much representation accuracy is lost in using them (for example, whether adding an extra child link to a node will change the importance of its other children proportionally). Much of the theory has already been covered in discussions of semantic networks, plates, meta-nodes and representing conditional probabilities as networks. We propose the use of self-organisation to create the boundaries between sub-nets, and the use of constraint satisfaction techniques to decide which hooks should be joined.
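A structural sketch of the objects-with-hooks proposal (class and node names invented; no probability propagation shown):

```python
# A structural sketch of the object-oriented proposal: each object carries a
# network fragment, inherits node names from its class hierarchy, and exposes
# "hook" nodes that join it to other fragments. Illustration only.
class Fragment:
    base_nodes = set()  # inherited by subclasses, like a class-hierarchy prototype

    def __init__(self, name, extra_nodes=(), hooks=()):
        self.name = name
        self.nodes = set(self.base_nodes) | set(extra_nodes)
        self.hooks = set(hooks)
        self.joins = []  # (local hook, other fragment, other hook)

    def join(self, hook, other, other_hook):
        # Constraint: only declared hook nodes may be joined across fragments.
        assert hook in self.hooks and other_hook in other.hooks
        self.joins.append((hook, other, other_hook))

class Aircraft(Fragment):
    base_nodes = {"type", "weapon_fit", "mission_role"}

class Package(Fragment):
    base_nodes = {"mission", "member_role"}

tornado = Aircraft("Tornado 1", hooks={"mission_role"})
strike = Package("Strike package A", hooks={"member_role"})
strike.join("member_role", tornado, "mission_role")  # one-to-many in practice
print(strike.joins)
```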

REAL-TIME PROCESSING

Intelligence processing is real-time and computationally expensive. Ideas for overcoming the time constraints and bottlenecks caused when processing large amounts of data include distributed processing, using hierarchical architectures to limit the spread of information, and modifying analog radial basis function chip designs to belief network representations. We propose limiting propagation by collecting information at the boundaries of meta-nodes, propagating approximately across these boundaries, then propagating batches of information properly when time allows.
Propagation can thus occur at the node or metanode level. When propagation is allowed to proceed at both levels simultaneously (this is equivalent to using two layers of networks – one deterministic/approximate, the other detailed/probabilistic), the output will reflect the most detailed model possible for the time and attention constraints.

USING REAL-WORLD INPUTS

How a network corresponds to the real world, particularly the pragmatic and semantic subtleties of representing evidence, uncertainty and ignorance, is also interesting. The problem of unreliable witnesses is so rife in intelligence processing that all information and human sources have reliability estimates attached to them. Current attempts to model this partial ignorance include using techniques from possibility theory to handle vague inputs, and using evidence nodes to spread the probabilities at input nodes.
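In its simplest binary form, the evidence-node idea reduces to weighting a report by its source’s reliability; a toy sketch with invented numbers:

```python
# A toy version of the evidence-node idea: treat a report as an observation
# of a hidden truth, weighted by the source's reliability. Numbers invented.
def soft_update(prior_true, reliability):
    """P(claim true | source asserts it), given P(source reports truthfully)."""
    p_report_given_true = reliability
    p_report_given_false = 1 - reliability  # an unreliable source asserts either way
    numerator = p_report_given_true * prior_true
    return numerator / (numerator + p_report_given_false * (1 - prior_true))

print(soft_update(prior_true=0.5, reliability=0.8))  # 0.8
print(soft_update(prior_true=0.5, reliability=0.5))  # 0.5 — a coin-flip source adds nothing
```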

OTHER INTELLIGENCE MODELLING ISSUES

Other issues that have been identified which impact on the automation of intelligence processing are:
* representation of time-varying information and feedback (for instance using recurrent belief networks and methods based on Markov chains)
* incremental build-up of errors from evidence removal (rebuilding networks using only currently available data)
* multiple space and timescales (no current solutions, but some signal processing theory may help)
* multiple utilities (multiple attribute utility theory)
* when to refer problems to human operators (sensitivity analysis of data and data flows)
* multiple viewpoints, to give a spread of possible outcomes rather than a point view of the environment (layered networks to avoid repeating entire networks – see the section on real-time processing)
* reasoning about limited resources (colored network theory)
* discovering high-level patterns and trends in information, including behaviour patterns (adapting numeric pattern processing techniques to use symbolic inputs)
* generating novel behaviours and plans (destabilising the networks – cf. chaotic net theory)

CREATING UNCERTAINTY

Since any view of an environment is subjective, limited by the knowledge and information available, that view is open to manipulation by an intelligent adversary. This is the basic premise of information warfare: the planning of counterintelligence and deception moves (i.e. mock-up tank emplacements) to manipulate or attack a red commander’s mental model of the situation. Information warfare is a powerful technique which complements existing command and control warfare (the disruption of communications between the red commander, his forces and intelligence). We already have models of blue’s view of a situation. Some theory already exists for the adjustment of network-based models to their inputs/outputs, and for multiple views of the same situation. It is therefore useful to adjust a blue model of a situation to create a blue estimate of the red commander’s viewpoint, using red’s known doctrine, sensors and reactions. Sensitivity analysis of blue’s red-commander model can then be used to determine which of several possible deception moves by blue would be most likely to alter the red commander’s view of a situation to that desired by blue.
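A toy sketch of that sensitivity-analysis loop: score candidate deception moves by how far each shifts a simple model of red’s beliefs (all priors and likelihoods invented):

```python
# A toy deception planner: model red's belief as a Bayesian update, then rank
# candidate blue moves by how strongly each shifts red's posterior towards the
# state blue wants red to believe. All numbers invented for illustration.
def red_posterior(prior_attack_north, p_obs_given_north, p_obs_given_south):
    joint_north = prior_attack_north * p_obs_given_north
    joint_south = (1 - prior_attack_north) * p_obs_given_south
    return joint_north / (joint_north + joint_south)

# Candidate moves: what red's sensors would likely observe under each.
moves = {
    "do nothing":       (0.5, 0.5),
    "mock tanks north": (0.9, 0.2),
    "radio silence":    (0.6, 0.4),
}

for move, (p_north, p_south) in moves.items():
    belief = red_posterior(0.5, p_north, p_south)
    print(f"{move}: red believes 'attack north' with p={belief:.2f}")
```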

EXAMPLE DOMAIN

The analysis of conflict, like game theory, embraces any interaction between parties with differing and usually contradictory aims. Intelligence analysis provides a viewpoint from which an agent or human can decide and act in the real world. Applications of intelligence processing techniques range from battlefield awareness to security systems and intelligent data mining; our example/test applications include classifying combat aircraft missions from sensor data and recognising criminal behaviour patterns.

CONCLUSIONS

Intelligence processing is an interesting area for the application of uncertain reasoning techniques. The main difference between this and other applications is the deliberate creation of uncertainty (counterintelligence) by both our own and opposing agents. This gives a new perspective on uncertainty – that of a useful thing to create.

REFERENCES

1. NATO Intelligence Doctrine, NATO report AINTP-1, 1996.
2. Shulsky A N, Silent Warfare, Brassey’s, US, 1993.
3. Sun Tsu, The Art of War, Oxford University Press.
4. von Clausewitz C, On War, Princeton University Press.
5. Farmer S J, Making Informed Decisions: Intelligence Analysis for New Forms of Conflict, IMA Conference on Modelling International Conflict, Oxford, April.
6. Feller W, An Introduction to Probability Theory and its Applications, Wiley.
7. Norman D A and Bobrow D G, On data-limited and resource-limited processes, Cognitive Psychology.
8. Szafranski R, A Theory of Information Warfare: Preparing for 2025, Airpower Journal, Spring 1995.
9. Tversky A and Kahneman D, Judgement under uncertainty: heuristics and biases, Science.
10. Shafer G, Savage Revisited, Statistical Science.
11. Elkan C, The Paradoxical Success of Fuzzy Logic, IEEE Expert, August.
12. Hutchins S G, Morrison J G and Kelly R T, Principles for Aiding Complex Military Decision Making, Command and Control Research and Technology Symposium, Naval Postgraduate School, Monterey, California, June.
13. Pearl J, Probabilistic Reasoning in Intelligent Systems, Morgan Kaufmann.
14. Neapolitan R E, Probabilistic Reasoning in Expert Systems, Wiley.
15. Horvitz E and Jensen F, Uncertainty in Artificial Intelligence, Morgan Kaufmann.