The Ushahidi Treasure Hunt

We’ve been busy of late setting up OpenCrisis – a place to connect UK crisismappers, local authorities, responders, citizens and government folks, to produce a coordinated and prepared population to help with overseas crises, and to be able to use crisismapping techniques and tools if an overwhelming crisis ever hits the UK.

Part of the group’s aims is to get people used to crisismapping tools (both using and responding to them) when they’re not wet, cold, scared and/or running for their lives.  Ushahidi is an important part of the crisismapping toolkit, so we’ve started with that.  And one of the best ways to get people engaged with a tool is through play.  The Ushahidi zombie game is fun, but we wanted something that Brits could play competitively at an unconference, that wasn’t too far from their previous experience and wouldn’t take very much time.

Hence the Ushahidi Treasure Hunt.  We’re still working on the ideas (and planning to play the first game at BarCamp London 8), but they go something like this.  We set up an Ushahidi instance around the camp site. It has a whole bunch of categories, with two special ones: clues and answers. We start the clock running, then one by one we post up clues. Each clue refers to something within about 50m of the clue’s point on the map: in London, for instance, that could be monuments, pubs, shops or blue plaques (history markers).  The contestants dash out and name the object in the ‘answers’ category – with bonus points given for a photograph of their camp badge in front of the referenced object.  And so it continues, for an hour, a lunchtime or as long as seems reasonable on the day.  This game has the added bonus of creating live data for a bunch of crowdsourcers running the back end of the Ushahidi instance in the camp – win-win, or at least that’s the plan.
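
We haven’t built the scoring yet, but a minimal sketch of how it might work is below. It assumes each clue and each ‘answers’ report carries a latitude and longitude, and that an answer counts only if it names the object and is pinned within the 50m radius. The field names (`target`, `name`, `has_badge_photo`) and the point values are made up for illustration; none of them come from Ushahidi itself.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, good enough at walking scale


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in metres between two points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def score_answer(clue, answer, radius_m=50, base_points=10, photo_bonus=5):
    """Score one 'answers' report against one clue (all values hypothetical)."""
    named = answer["name"].strip().lower() == clue["target"].strip().lower()
    near = distance_m(clue["lat"], clue["lon"],
                      answer["lat"], answer["lon"]) <= radius_m
    if not (named and near):
        return 0
    # Bonus for a photo of the camp badge in front of the object
    return base_points + (photo_bonus if answer.get("has_badge_photo") else 0)
```

The radius check also speaks to the play-from-anywhere question: a remote player can name the object, but the badge photo bonus stays with people who actually walked there.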

In London we’re blessed with lots of barcampers who’ve set up maps across the capital and built opendata sites based on data on its street objects. Which means that it’s theoretically possible to answer the treasure hunt without actually leaving the camp.  For a camp, the ‘photo with your badge’ rule could be useful here, but it also means that the game could be played from anywhere in the world at the same time.  It’s a thought – I’ll report back here after we run the first game.

Notes from OpenTech 2010

11th sept 2010: OpenTech, ULU, London

Data.gov.uk talk

  • Richard Stirling – Cabinet office.
  • Standards… linked data api… NAPTAN is part of those standards… focus on small lightweight patterns… core patterns that can then be specialised. E.g. Data cubes – developed general standard for these, then specialization for payments etc.
  • Important: ids for space & time. E.g. http://reference.data.gov.uk/id/day/2010-09-11 or http://transport.data.gov.uk…wat
  • http://Source.data.gov.uk/gridworks – OS tool for viewing spreadsheet-type displays of govt data. Easy to view data / search through this… could have done with this when was looking for UK LAs. NB can reconcile each column against another data source. Other tools: RDFizer… data enrichment service – gov.tso.co.uk – highlighting recognized text (just off dictionary?) .
  • http://Legislation.gov.uk Developing standards for publishing information… info for people and data for apps. Any pages : append /data.feed to bring back an atom list. /data.xml will bring back underlying xml. /data.xht brings back plain html. /data.rdf gives metadata. i.e. human website, with access to information underneath it.
  • Linked data api… restful api over linked data – maps rdf into human-readable. “avoids having to use sparql”. Default return xml, but also json… can filter via ?x= in url.
  • http://Danpaulsmith.com/gov/orgvis/?dept-bis … doing visualisations for DBIS.
  • Ongoing work: “standards, data, production processes and publishing”. Asking for open help…
  • Q… linking legislations… working with UniSheffield GATE tools – auto id amendments…
  • Q: check http://data.gov.uk/blog to find out what they’re doing.
  • Q: anyone can implement gridworks api… gridworks development company are now owned by google who are changing its name.
  • Q: govt monitoring & authentication of data. See Richard session this pm. Focus on provenance because “repeatability as basis of trust”.
  • Q: provenance. Is there machine access to allow auto provenance checks? Work on generic provenance standard, that can access in a machine-readable way. “doing provenance on the web is something that the entire web community is still learning how to do”.
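
A minimal sketch of the two URL patterns noted above: the per-day identifier under reference.data.gov.uk, and the suffixes that legislation.gov.uk pages accept for alternative representations. The function names and the `REPRESENTATIONS` table are made up for illustration; only the URL shapes come from the talk.

```python
from datetime import date

# Suffixes as noted in the talk: append to any legislation.gov.uk page URL
REPRESENTATIONS = {
    "atom": "/data.feed",   # atom list
    "xml": "/data.xml",     # underlying xml
    "html": "/data.xht",    # plain html
    "rdf": "/data.rdf",     # metadata
}


def day_uri(d):
    """Identifier for a calendar day, e.g. .../id/day/2010-09-11."""
    return "http://reference.data.gov.uk/id/day/" + d.isoformat()


def data_url(page_url, kind):
    """URL of one machine-readable representation of a legislation page."""
    return page_url.rstrip("/") + REPRESENTATIONS[kind]


print(day_uri(date(2010, 9, 11)))
print(data_url("http://www.legislation.gov.uk/ukpga/1998/29", "xml"))
```

The second call uses an example page path chosen for illustration; any page on the site should take the same suffixes if the pattern holds as described.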

LinkedGov talk

  • Hadley Beeman, @linkedgov Hadley@linkedgov.org “the tidying-government-data project”
  • Idea – is getting govt data in multiple formats, and all understandable (e.g. no codes)
  • Reality – is codes and acronyms, holes in data, multiple formats, lack of modeling & connections. People with answers to these, keen to help, in LAs and govt… getting same queries from multiple places. Need to provide answer once, in way that’s easy for the civil servants.
  • Response – data sets -> crowdsourcing module (la/govt) -> formatting/structure/modeling -> Search/query/filter tools & apis -> app, visualisation, search site. Big focus on crowdsourcing. Doesn’t appear to be built yet.
  • Crowdsourcing: decrypting budget codes, acronyms; dealing with labels and annotations. E.g. label for a ‘wrong’ piece of data that’s already been used.
  • Formatting&linking: connecting datasets, standardizing headings, assigning uris, reformatting – didn’t we just see a cabinet office data.gov.uk talk on this? Acknowledged, but ‘huge pile of data’ – want to help by adding into holes here. “where there are standards missing, we’d like to pull together all the people to help in their creation”.
  • Query tool… looks a bit like Ask, i.e. “how many govt employees work in victoria st”, with reliability score and data source – still working on the reliability score.
  • Action: How we can help – looking for architects, data structure experts, developers. Join by sending email to joinus@linkedgov.org by mon 20th sept.
  • Q: how to correlate across LAs… need to label and recognize if the underlyings are disparate. Can’t standardize through this project, but could id commonalities and represent that in the data. Like Wikipedia – not perfect, but better than anything else out there.
  • Q: suggestion mechanism for datasets and projects from local govt? Can take suggestions on the project. Idea is to fill gap in the open data process, but if there are other things we can do as a community, then we should do it.
  • Q: funding. Biggest costs hosting and legal (making sure crowdsourcing not open to lawsuits). Otherwise all volunteers, but also talking to bodies about sponsorship.

Ben Goldacre & Louise Crow on unpicking dodgy scientific claims

  • There are a lot of useful datasets out there that could be used. Biggest prob in medicine: selective publication, i.e. don’t have to tell anyone about bad trials results. FDA etc doing v badly about telling anyone about the negative results that they receive. Published data isn’t accurate reflection of results.
  • 1) Nobody’s job to check if something has been done but not published
  • 2) Incompetent regulators. E.g. drug company telling FDA about heart risk on diabetes drug but FDA not telling anyone else. UK e.g. Seroxat doesn’t work in children – drug company knew but didn’t publish. Was an aside in an MHRA submission. V common that drugs are used off-license in children… not so bad if we can see all the evidence. GSK lawyers pointed out that they weren’t obliged to tell govt about this because it was used off-license.
  • Solutions in-place are failures. E.g. clinical trials registers. 58 created… diff to find info… is nobody’s job to go back and check that trials data is published, or that trials being done are pre-registered. 1/3-1/2 not registered properly; 15% not registered at all. Clinical trials dbase has clinical trial id – then search for this id in published trials. Then produce a list of all the trials that are unpublished. Louise tech on this… recruiting people to work on it… repositories inc http://clinicaltrials.gov which includes change history of each trial, e.g. status changes. Finding lots of odd patterns in these trials. http://isarctn ?UK registry… big text fields… issue of being coherent across different repositories. EudraCT – totally closed repository… difficult to access this… ‘unresponsive to need to open up clinical trials data’… register of all trials in EU 7k-10k trials per year – can see trials, but can’t see what they are.
  • Taking trials data into PubMed… checking here to see if anything is published under each id… looking for level of false negatives – i.e. is it unpublished, or published but not with the id; is this a structural fault in the system? Looking for patterns of bad behavior, good/bad publishers, drugs to be aware of because there’s a lot of missing data under the surface.
  • Main outputs: spotting the patterns. A website where anyone looking at a drug can see which trials on it have been published… ie. Missing data and which companies hold it.
  • Most trials dbases have a contact email – are going to flood these with “where is the data on” emails.
  • Q: safebiopharma EU project. Also a US equivalent. Seen a few random repositories of results. Are looking specifically for clinical trial data.
  • Q: URL? Is slightly secret at the moment. Github/probot? ben@badscience.net – looking for people to help, e.g. cross-checking results.
  • Q: anecdotes. Not us, but http://patientslikeme.com
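
The matching step described above (take each registered trial id, look for it in published papers, list whatever is left) can be sketched roughly as below. This is not the project’s code, which wasn’t shown. It assumes ClinicalTrials.gov-style ids of the form NCT plus eight digits, and the sample ids in the usage lines are made up.

```python
import re

# ClinicalTrials.gov identifiers: "NCT" followed by eight digits
NCT_ID = re.compile(r"\bNCT\d{8}\b")


def ids_in_text(text):
    """All trial ids mentioned in one abstract or paper."""
    return set(NCT_ID.findall(text))


def unpublished(registered_ids, publications):
    """Registered ids that appear in none of the publication texts.

    A trial published without quoting its id lands in this list too,
    which is the false-negative problem raised in the talk.
    """
    published = set()
    for text in publications:
        published |= ids_in_text(text)
    return sorted(set(registered_ids) - published)


registered = ["NCT00000001", "NCT00000002", "NCT00000003"]  # made-up ids
papers = ["Results of our trial (NCT00000002) show...", "A paper with no id"]
print(unpublished(registered, papers))
```

The output is a list of candidates to chase up, not proof of non-publication.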

RewiredState

  • Emma Mulqueeny… rewiredstate 2 events a year – national hackgovernmentday, rewiredstate.
  • After govt got involved, decided to include next generation of kids in getting involved with data. Youngrewiredstate (yrs)… example: Izzy who wrote govspark.
  • Ben Webb on yrs Manchester 2010. Yrs decentralized this year – London, brighton, mancs, Norwich etc. week-long event. Tools include finding bus routes (because mancs and London have released their bus timetable data).
  • Q: project repository? http://rewiredstate.org/ http://dev.dfey.org/
  • Q: involvement with schools. Tried, but no real response from the schools. Probably because coding isn’t being taught in the schools. Norwich recruited from schools, but via media and creative arts departments. Plan is to use this to put pressure on govt on this.
  • Carbon energy hackday on 30-31st October.. hardware as well as software hacking… need arduino people.

Louise Crow on FixMyTransport

  • Aim: give people simple tangible benefits… this to find out what happens with more complex problems. Looked at http://www.fixmystreet.com … e.g. Euston station problem – Camden and NetworkRail & TFL both claiming the other was responsible.
  • One of the reasons its tricky is the question of responsibility… i.e. who to tell about graffiti on a bus stop… council, ad company, transport exec etc. High cost of finding who to tell.
  • Datasets – NaPTAN… public transport gazetteer… third dataset NPTDR (accessibility dataset) – once a year, record all the public transport journeys taken. Need to know routes and route operators. E.g. operator codes not unique… crowdsourcing the details of transport companies…
  • Action: contact Julia about this.
  • Visibility: i.e. knowing if someone else has had the same problem.
  • Workinprogress section – i.e. handwavy part – can now report problem to responsible org – what can we do next about this. Two points: dearth of campaigning tools on the internet (i.e. how to start a mini-campaign about a transport problem), and things that influence decision makers are people… e.g. campaign pages on fixmytransport e.g. “save the c10” – to get people to join a campaign. Have planned this out but not developed it yet.
  • Tools: mysociety tools for e.g. write to representative, print flyers & posters – this is the project Fosbury part… “a piece of civic infrastructure that can be useful to other things and other people” i.e. a kind of broker to send people off to do specific tasks… callbacks when jobs are done etc… something that democracy.gov did in a very straightforward way… i.e. can share between applications. “how do you figure out what the most effective thing to do next is”… i.e. give list of tools, sysadmin suggests something to you, “we think the next thing you should do is this” emails, recommendation from other campaigns, i.e. “other campaigns that were successful did these kind of things next”. Will release something later this year – will be relatively simple. Appreciate any thoughts and questions on this.
  • Q: have you spoken to transport companies? Richard George…
  • Q: what types of data? All the transport places that might have been or will be…. Not real-time at all… longer-term view. Are looking for people to help put in their own local authority transport details. Twitter.com/mysociety – link is in the last post.
  • Q: talking to transport authorities.
  • Q: fixmystreet – what % of councils are now in the process for this? Can now report to almost all councils in the UK. Councils are beginning to get problems input into their problem reporting process (instead of just sending emails). With this, can now start doing stats on how long it takes each council to fix e.g. potholes. Thought about breaking fixmytransport into fixmytransport and Fosbury.
  • Action: check out how this maps onto Ushahidi deployment… what are the overlaps between them?
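
A rough sketch of the “how long does each council take to fix potholes” stat mentioned above. The record layout (`council`, `reported`, `fixed`) is an assumption for illustration and not fixmystreet’s actual schema; unfixed reports are simply left out of the median.

```python
from collections import defaultdict
from datetime import date
from statistics import median


def median_days_to_fix(reports):
    """Median days from report to fix, per council (hypothetical fields)."""
    days = defaultdict(list)
    for r in reports:
        if r["fixed"] is None:
            continue  # still open, so no fix time yet
        days[r["council"]].append((r["fixed"] - r["reported"]).days)
    return {council: median(values) for council, values in days.items()}


# Made-up sample reports
reports = [
    {"council": "Camden", "reported": date(2010, 6, 1), "fixed": date(2010, 6, 11)},
    {"council": "Camden", "reported": date(2010, 6, 1), "fixed": date(2010, 6, 21)},
    {"council": "Kent", "reported": date(2010, 7, 1), "fixed": date(2010, 7, 6)},
    {"council": "Kent", "reported": date(2010, 7, 1), "fixed": None},
]
print(median_days_to_fix(reports))
```

Leaving open reports out flatters slow councils, so a real comparison would probably want the share of unfixed reports alongside the median.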

Tom Steinberg on Failfair/ groupsnearyou

  • Failfair – evening for people to get together to talk about projects that didn’t work and why. Talk about groupsnearyou and why it’s failed in its current instantiation.
  • Idea: internet great at connecting people together but useless at connecting people in local neighbourhoods. Gathering list of pre-existing neighbourhood organisations… e.g. neighbourhood watches… there is no definitive list like this – mostly in heads, on flyers through doors etc. Platform neutral… web service… built apis – google maps with rectangles around groups. Included crowdsourcing interface to take yahoo groups and convert them into groups in this system. Connected to fixmystreet… realized at this point that there was a problem. Problem was that the data quality was ‘awful’. Probls: groups were miscategorised (e.g. bible study group classified as general); groups put in wrong areas (e.g. single uni group covered whole of London)… 7-8/10 users were given inappropriate groups to report their problems to.
  • What went wrong? Didn’t have the money to do this project properly. Had money for a lightweight crowdsourcing tool. Fixmystreet didn’t have data quality problems because not conceptually difficult. Groupsnearyou didn’t have people in council fixing problems in the data, and was a different level of abstraction. Why more money needed to make it work? Proper A/B conversion testing – could have modified and tested until shrank number of people producing bad data. Didn’t have money to have anyone to fix the site… site didn’t have the obvious appeal to attract volunteers to go in and fix info problems.
  • Future from here? Mysociety isn’t going to do any more with this – perhaps someone else will take it on. Political and technological salience of this has gone up – matches govt big society agenda. Something like this needs to exist if we can get large number of people across UK involved in local societies… e.g. local council sending out residents association address at bottom of council tax bills – but does need a reliable database underneath it.
  • Want real failfair in the UK – thinking about organizing.
  • Q: diff between geographical and catchment areas. SJF: could build this in… i.e. “near my house/ near my work/ near my station/ 30 mins away”
  • Q: demand. Mysociety builds things that might become useful rather than what people ask for.
  • Q: idea of group is nebulous because people don’t know what they’re supposed to be putting into it.
  • Q: crowdsourcing. To get energy from volunteers, need site where people get something immediate from. This is a nice to have piece of the internet – main beneficiaries are other people who run internet sites.
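
One of the data quality problems above (a single group whose rectangle covers the whole of London) could in principle be screened for when matching a user to groups. A minimal sketch, assuming each group is stored as a lat/lon bounding box; the field names and the 0.05 degree cut-off are made up for illustration and are not how groupsnearyou worked.

```python
def contains(box, lat, lon):
    """True if the point falls inside the group's rectangle."""
    return (box["south"] <= lat <= box["north"]
            and box["west"] <= lon <= box["east"])


def span_degrees(box):
    """Longest side of the rectangle, in degrees."""
    return max(box["north"] - box["south"], box["east"] - box["west"])


def groups_for_point(groups, lat, lon, max_span=0.05):
    """Names of groups covering the point, skipping implausibly large ones."""
    return [g["name"] for g in groups
            if contains(g["box"], lat, lon)
            and span_degrees(g["box"]) <= max_span]


# Made-up sample groups
groups = [
    {"name": "Local watch",
     "box": {"south": 51.50, "north": 51.52, "west": -0.14, "east": -0.12}},
    {"name": "Uni society",  # rectangle drawn over all of London
     "box": {"south": 51.30, "north": 51.70, "west": -0.50, "east": 0.30}},
]
print(groups_for_point(groups, 51.51, -0.13))
```

A size filter only catches the wrong-area problem; miscategorised groups would still need people to check them.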

Evan/Tim on election lessons

  • Never been done before – never been a candidate quiz where every individual candidate was asked and answered questions. Specifically wanted local questions.
  • Lesson 1: divide and conquer. Democracyclub gathered volunteers then fed them to other projects and asked questions, yournextmp found local candidates, theyworkforyou sent questions out to candidates. Don’t try to do it yourself – get someone else involved and share. Conquer: do what you’re doing and do it well.
  • Lesson 2: anyone can do this. Python/perl. Being geographically distributed is now no longer a problem.
  • Lesson 3: build on other people’s work. Lots of tools already exist & are powerful. Main tool: $twfy->getConstituency($postcode) is basis for most of this work. Find out what other people have done, and use it.
  • Lesson 4: share your data. Share it early, share it dirty, but just get it out there. E.g. json/csv links at bottom of each twfy page. Users grabbing data for apps – e.g. Terence Eden wrote a mobile app. People used data, and sent corrections back to the site for it.
  • Lesson 5: use the crowd. Having generated the data, you get a crowd. Use it. Democracyclub…. Homepage suggests tasks you can do… (SJF: could do this on cc page – i.e. suggest tasks for each person). Researched 4150 candidates this way… found 3255 local issues. 949 leaflets uploaded… 52% of candidates answered the survey (used local faces to convince them to do it). Yournextmp worked well with crowdsourcing (finding contact details for candidates). Crowdsourcing worked well for straight choice – is it a leaflet or not. Local issues was a bigger crowdsourcing problem. Form optimized for quantity, which led to lots of low quality entries, which needed to be moderated and rewritten. Became a massive bottleneck in the process – small team of moderators were spending their evenings doing this. Similar problem with govt cuts website, e.g. what to cut – became a big discussion forum rather than a clear black or white.
  • Q: crowdsource rewriting? i.e. like this one, but would rewrite like this, or don’t like. Given more time, would build checking into this. Lots of effort to build an interface for this.
  • Local issues worked in the end mostly. Not a good crowdsourcing exercise – need to better crowdsource the moderation.
  • Lesson 6: think like the media. Yournextmp is a database – this isn’t very sexy. Media weren’t too interested in this, but did get interested in the democracyclub. First article was BBC “mps to be watched on local issues”. Also got interesting quotes to send to journos… built up relationships with these journos… i.e. instead of sending out press releases about how good db is, need to explain how people use the system, e.g. how someone could be held to account, dodgy leaflets etc. %ge of conservatives who didn’t answer the quiz etc. Think “what do they want in a story” rather than “what is cool in itself”.
  • Q: what would you do differently next time. Start earlier. Get quiz out earlier – week’s notice for quiz was too late in terms of newspapers – getting out earlier would attract more users. Talk to parties more, e.g. get conservatives onboard instead of just writing to their inboxes.
  • Q: what did you do with the volunteers after polling day. Haven’t. Ideal to do same for local elections. Concern that same as Obama not maintaining validity of volunteer lists and not keeping them engaged. Energy disappears after elections. Circa 60k candidates for local elections – many more. If retain volunteers from this, they become basic cadre for the next elections. NB 3 volunteers put in circa 1 year of unpaid work… wanted their lives back after this.
  • Q: need to engage the election agents.
  • Q: need for moderating crowdsource questions. Better to get volunteers to ask the questions? Prob was that were trying to make a body of questions – were v concerned about bias, e.g. Tory candidate being able to login and bias the question base. Diff problem to get round.
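
The `$twfy->getConstituency($postcode)` call in Lesson 3 looks like a client wrapper around the TheyWorkForYou API. A minimal Python sketch of the equivalent request URL is below, assuming the API takes `key`, `postcode` and `output` query parameters on a `getConstituency` endpoint; check the TheyWorkForYou API documentation before relying on this. The function name is made up and no request is actually sent.

```python
from urllib.parse import urlencode

# Assumed base URL for the TheyWorkForYou API
API_BASE = "https://www.theyworkforyou.com/api/"


def constituency_url(postcode, api_key, output="js"):
    """Build (but do not fetch) a getConstituency lookup URL."""
    query = urlencode({"key": api_key, "postcode": postcode, "output": output})
    return API_BASE + "getConstituency?" + query


# "YOUR_KEY" is a placeholder; a real key comes from the API's signup page
print(constituency_url("SW1A 1AA", "YOUR_KEY"))
```

The point of Lesson 3 is that everything else (candidates, local issues, leaflets) hangs off this one postcode-to-constituency lookup.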

Lunchtime chat with Phil Archer, W3C

Richard Stirling on transparency in Government

  • Data.gov.uk oct 2009 – said we will be launching http://data.gov.uk. Lot happened since then. Data.gov.uk, OS maps (postcodes, vector maps, os identifiers)
  • Biggest thing is the momentum that’s been built up. Enabled by ‘opening up’ data – PM who really cares about this stuff… and published detailed commitments (look at these, e.g. spending transparency and other key govt datasets). Has a transparency board (Francis Maude) – meets to ensure govt does what it says it’s going to do. “Nation of armchair auditors”.
  • First commitment – treasury dbase on public spending (COINS website… 120Gb csv file)… Guardian’s COINS data explorer… q is “what is it that we can do to get the data out there”… in a form that people can work with it… at this point, good things like this happen.
  • Is happening across the whole public sector. Real stars are London, Windsor&Maidenhead, Kent all doing amazing things.
  • Challenges: finding data, licensing data, using data. “need clear permission statement on reuse”. “winning the war with pdf”. Crown Copyright license that’s compatible with CreativeCommons attribution… i.e. can use standard creativecommons license.
  • Open approach: open data, open standards, open source tools where we can (Drupal, CKAN as registry, CMS etc).
  • Using data: csv. Api “flexible way of serving up chunks of data”. Linked data: joins up data and datasets, and can create a customizable API on top (“spitting out json”, with SPARQL on top)… possibly most flexible way to serve the data.
  • Data: about 4k items. Not all perfect… get all the national statistics including pdf. DfiD Pakistan flood monitor went up yesterday. NB random data set button.
  • Action: tell Spike about DfiD flood monitor site.
  • When building stuff, reference back to data.gov.uk and mention the dataset you’ve based it on.
  • Next: more of the same. Lots more data to get out. Quality (Hadley’s linkedgov work). Read/write: accept community data. Make it easier to publish rich data. Want people to “get excited and make things”.

? On London data experience

  • Started scoping in GLA last October – police, London development agency, city hall, 1 other agency.
  • Had an open day – “come and help us to free London’s data”. Key messages: don’t worry about standards, go ugly early, and don’t let perfection be one of the goals. Key concerns were crime and transport. Isn’t just about city hall, is about partnership with developers. Difficult to get data out of govt – people saying isn’t technically possible – developers return with “this is how you do it”, also helping by coming to meetings with her. Getting data is incredibly slow… officials emailing each other, meetings etc – not very visible… is keeping this visible via twitter.
  • Main challenges: real-time data for transport, and licensing conditions (TFL). Licensing: still asking who are you and what want data for, but are working on this… tfl have come a long distance from this.
  • Action: also talk to this speaker about transport data provision and hits on servers… big pressure to provide real-time data to end users.
  • Got locations of bikes via freedom of information data request. Need realtime datafeed to cycle data.
  • Next phase: NHS London phoning up about how to release their data… told is about engagement with developers, not just putting out the data… are hooking-up community with NHS to visualize their data etc. Mayor recently announced his “digital advisory board”.

Geoff talking about http://ASBOrometer.com app that uses this open data.

  • Tool to measure levels of antisocial behavior at a location. Queries open data sets about asbos and antisocial behavior in your location. Was more popular than facebook on istore for a while.
  • Asbo dataset is poor, more interesting data in home office survey about concerns on anti-social behavior… used this. Inc leaderboard of worst offenders… Did because huge public appetite, wide audience, content matters etc.
  • Datasets mostly csv files. Opportunity to “beat the goldrush”. Took 5 days, basic process: geocode data, write query logic, construct API, build mobile client apps. Used google app engine, geopy, geomodel, google chart tools, android sdk, appcelerator titanium. Could have used yql, yahoo placemaker, scraperwiki, fluiddb?
  • Advice:
      • make it mobile – device always on, always with you, knows where you are. Open OrdSurvey data making easier (e.g. to locate borough/town from postcodes etc).
      • make it local. People interested in what’s around them. Unique experience depending on location. Exploit local pride and rivalry. (e.g. top trumps theme, people picking their hometown in bars and comparing crackhouse figures etc).
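
The “query logic” and leaderboard steps weren’t shown in detail, so the following is only a guess at the shape of them: rank areas by incidents per 1,000 residents. The per-capita normalisation, the field names and the sample figures are all assumptions, not taken from the app.

```python
def leaderboard(areas, top=3):
    """Top areas by incidents per 1,000 residents (hypothetical fields)."""
    def rate(area):
        return 1000 * area["incidents"] / area["population"]

    ranked = sorted(areas, key=rate, reverse=True)
    return [(a["name"], round(rate(a), 1)) for a in ranked[:top]]


# Made-up sample figures
areas = [
    {"name": "Area A", "incidents": 50, "population": 10000},
    {"name": "Area B", "incidents": 30, "population": 2000},
    {"name": "Area C", "incidents": 10, "population": 5000},
]
print(leaderboard(areas, top=2))
```

Ranking by raw counts instead would mostly just list the biggest places, which undercuts the local rivalry angle.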

Post-talk chat with transport folks Fixmytransport.

Chris on Mapping DFID’s reach in Africa

  • Looking for patterns in CSV data. NB csv in various levels of ugliness… “getting a measure of where things break”. NB want open.gov to use ISO country names instead of current ones. Started looking at different countries, and how much aid is spent in them per year and what this is going into.
  • Some huge CSV files on local spending – e.g. Guildford council data… huge list of local depts. Not enough explanation, e.g. huge spend on vehicles in October – but types of vehicles not explained.
  • SJF: how about being able to crowdsource data holes, e.g. knowing what types of truck!
  • Need to spend time thinking about the people looking at what we create, e.g. visualization.
  • Examples: http://www.Owlsnearyou.com/ site! And http://mrsanearyou.appspot.com/ – both early stuff.
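
A small sketch of the kind of CSV pass described above: total spend per country per year, while keeping a note of which lines break. The column names (`country`, `year`, `amount_gbp`) and the sample rows are made up; real files will differ, which is rather the point of recording the failures.

```python
import csv
import io
from collections import defaultdict


def spend_by_country_year(csv_text):
    """Return (totals keyed by (country, year), line numbers that failed)."""
    totals = defaultdict(float)
    skipped = []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Header is line 1, so data rows start at line 2
    for line_no, row in enumerate(reader, start=2):
        try:
            country = row["country"].strip()
            year = int(row["year"])
            amount = float(row["amount_gbp"].replace(",", ""))
        except (KeyError, ValueError, AttributeError):
            skipped.append(line_no)  # a measure of where things break
            continue
        totals[(country, year)] += amount
    return dict(totals), skipped


# Made-up sample data
sample = (
    "country,year,amount_gbp\n"
    'Kenya,2009,"1,000.50"\n'
    "Kenya,2009,500\n"
    "Ghana,2009,n/a\n"
    "Ghana,2010,250\n"
)
print(spend_by_country_year(sample))
```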

Rufus Pollock on OpenTech a year ago

  • Compare with other govt IT projects… UK being looked to as a bit of a world leader here. Q: where does our money go when we pay taxes? “show us a better way” competition. Uses lots of data.gov.uk data and coins. Big thing done: sorting out the licensing, i.e. what could happen if you start reusing data – could people make you take it down was an issue.
  • Want to automatically download datasets, then start doing something with them. V tedious to wget everything line by line… not reproducible or automatable. Datapkg is a tool to do this.
  • Ecosystem around datapkg… e.g. community can clean up the datasets. Lot of work is about cleaning up the data, and joining it together. People don’t want to know why something is wrong- it’s just wrong.
  • Example from ckan.net… i.e. someone cleans up someone else’s data and restructures it – where do they put it?
  • Data.gov.uk apis: catalogues and datasets.
  • Giving people control, but beware the porn upload problem.
  • Q: is squeezing data out of govt depts. – will it still be tough in 5 years’ time. Mindset is part of public management – are asking people to turn orgs round 360 degrees, e.g. huge organizational change. Will take a long time, but door is open and there’s no going back. Is it a tough sell? V tough – risk averse culture – audit – people afraid of criticism if data is wrong… and officials have to engage with data questions, politicians to answer questions on this. Risk/reward setup not an easy one for public servants… i.e. pats on back for right vs probls if erroneous or wrong data gets out. We need to be willing to reward govt when they put out data. Journey likely to be painful, but in 5 years time… conversations are getting easier over time. E.g. transport, things like release of NAPTAN and world didn’t end. Really good environment for making progress quite quickly. “Won’t be the massive culture hit that it is today”.
  • Q: Talis platform – thanks for mentioning it (Phil Archer again!). Would apps have been better if data were available through something like Talis or another Sparql engine? Issue at the moment with linkeddata – developers “reach for their guns” over rdf. Issue is lack of toolset for rdf… getting rdf store on system could take half a day to do rather than v quick. CSV because every program can parse this. Lovely thing about Talis platform is that can get json back.
  • Q: experience of getting hold of datasets that might be used in safety-critical applications. Example: pilot. CAA publishes NOTAMs list… is published on ‘gruesome’ website… CAA edited out machine-readable line that gave this. Other countries give xml formatted datasets for this – people go to these to get it. eNotams! Answer: OPSI unlocking service is on data.gov webpage for this sort of question – current call out for datasets that can be unlocked – get to service before Wednesday next.
  • Q: Trying to extract spending data from local authority. Keep getting back, if lucky, 3 costs aggregated from 3 different categories, and cite curious exemptions. LAs tend to be v risk-averse and closed; easier for someone on the inside to be pushing from the inside.
  • Q: behavior change (couldn’t hear question). E.g. Website: “whatcanibuy” – put in e.g. 20 pounds and see what city hall has spent that on.
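
On the “tedious to wget everything… not reproducible or automatable” point from the datapkg discussion: the sketch below is not datapkg’s interface, which wasn’t shown in the talk. It only illustrates the general idea, assuming a simple name-to-URL manifest and recording a checksum so a later run can tell whether a dataset has changed.

```python
import hashlib
from urllib.request import urlopen


def fetch_url(url):
    """Download one URL and return its bytes."""
    with urlopen(url) as response:
        return response.read()


def fetch_all(manifest, fetch=fetch_url):
    """Fetch every dataset in a {name: url} manifest, in a stable order.

    The fetch function is a parameter so the logic can be tried offline.
    """
    results = {}
    for name, url in sorted(manifest.items()):
        data = fetch(url)
        results[name] = {
            "url": url,
            "bytes": len(data),
            "sha256": hashlib.sha256(data).hexdigest(),
        }
    return results


# Offline dry run with a stand-in fetcher and a placeholder URL
fake = lambda url: url.encode("utf-8")
print(fetch_all({"spending": "http://example.org/spending.csv"}, fetch=fake))
```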

Sebastian Brannstorm on Wild Ducks mobile OS at Symbian foundation

  • Trying to build own smartphone off ots hardware.
  • Symbian – 100m devices shipped/year. OS since feb 2010… EPL/LGPL. Symbian foundation London (non-profit)… arguably world’s largest open-source project.
  • “what good is open source when there is no open hardware”. i.e. can’t flash the hardware. Current open source stacks are generally linux. Symbian decided to make a phone of their own. Currently Nokia, Sony etc… starting own garage project to use open hardware for this… using the BeagleBoard.
  • BeagleBoard – is a Texas Instruments garage project. System-on-a-board platform, OMAP3530 chipset… cheap ($100-$150), extensible, open specifications. Lacks a few things, e.g. modem, radio, internet, support for lcd displays etc. Antrax Germany built BeagleUTMS board for connectivity, Chinese BYD Industries (“build your dreams” conglomerate) built the touch LCD display… will soon be commercially available. Current device – needs external power, but works pretty much as a phone should.
  • Have corporate support from Symbian corporate members, e.g. Accenture, Nokia etc… some people (1+) now full-time paid on the project, but still meeting Tuesday evening for pizza discussions about it… will buy pizza for anyone (i.e. us!) who comes to this.
  • Now: lots working, simple to get started, simple to contribute (e.g. documentation, todo lists, backlogs in wiki, code repositories, know-how), good hardware, active mailing list. But need more help – more drivers, more functionality, usb host, etc. Need to provide feedback, exchange ideas, help with developer outreach and project promotion. “Need people to talk to about how we should take this project”.
  • Wiki http://tiny.symbian.org/wildducks
  • Q: Symbian? Symbian 3. QML in Symbian 4.7.
  • Q: Drivers opensource? UMTS probably not entirely opensource. Are some GSM, but not UMTS – some Qualcomm patents prevent this.
  • Q: what can we do with this magical piece of hardware? Not solving good phone problem.. is opensource attitude to this…
  • Q: get core members into providing a development version of their mainstream phones? Are drivers that can flash, but not there yet.
  • Q: why not use OpenMoko hardware? OpenMoko is getting a bit dated now… BeagleBoard was closer to existing baseboard they already have.
  • Q: ideas for the powersupply. Haven’t really thought about this yet. Not a priority at the moment.
  • Q: do you need regulatory approval for experimental phones in the UK? Not using GSM, so not an issue – and are buying type-approved phones.
  • “Come and have pizza with us on Tuesday”.

TerenceEden on “why doesn’t your site work on my mobile phone”

  • @Edent is mobile phone consultant. US federal govt stickers on each car for e.g. safety records and QR codes (http://fueleconomy.gov/m) – takes to a mobile website… which is usable, and works on every phone with an internet connection and a web browser.
  • Stats – http://www.opera.com/smw http://communities-dominate.blogs.com http://www.gartner.com/it/page.jsp?id=1306513 – give Apple at 14%, RIM 18%, Android 17%, Symbian at 41% of the smartphone market; worldwide sales = 19% smartphones, 81% dumb phones.
  • Dumb phones are not smart, but they do have web browsers.
  • Tip: http://www.forum.nokia.com/Develop/Web/Mobile_web_browsing provides a whole bunch of templates for mobile apps. “more or less guaranteed that your service is going to work on all mobile phones”.
  • Tip: Also see deviceatlas or wurfl to investigate phone screens. Can shrink images using http://tinysrc.net
  • Tip: KISS. People are still using phones from a long time ago. Don’t over-rely on javascript… doesn’t run on anything but the most expensive macbook. Think about what your users really need. Make accessible to the largest number of people.
  • Tip: wordpress. Plenty of mobilization plugins. http://wordpress.org/extend/plugins/wordpress-mobile-pack/ (@edent helped to write this).
  • Tip: don’t ignore the smartphones. Sure, create something beautiful for the people with rich kids toys, but make your app open to as many people as possible. NB more people have private access to a mobile phone with a web browser than to a pc. If it’s sensitive, e.g. rape data, most people share a pc. Are more likely to use their personal, private mobile phone for this.
  • http://shkspr.mobi/blog/ http://edent.tel/
  • Q: do most people with mobile phones have internet access? People with older phones do access the web, but don’t access it as much, partly because of the dearth of sites for older mobile phones. Are using in a fairly limited way because the services aren’t there.
  • Q: how to test that software is compatible. Easiest is to buy a bunch of cheap phones… also services like deviceanywhere will give free testing times. Also zoonu and utest (which crowdsource phone testing, e.g. people in Africa, Russia etc). Are also good emulators for higher-end phones; easiest way to check older phones is to buy a couple and test on them.
  • Q: less-developed countries. More than 90% of people’s internet access in these is through mobile phones. If you want data to be enabled in these countries, need to think about low-end use. Also services in e.g. Kenya are done through SMS and voice. Talk to people who can provide good SMS and voice links.
  • Q: WAP phones – WAP completely forgotten? Still significant number of WAP devices around. V. different protocol and way of writing. Not a bad route if want really wide dispersal, but WAP use is shrinking in e.g. Kenya.
  • Q: what features work on which devices? See deviceatlas and wurfl for this.

Craig Heath (Symbian Chief Security Technologist) on People Power in your Pocket

  • Opportunity for activists/ community to address privacy and security concerns.
  • World’s most widely used opensource pc platform is Symbian.
  • Areas of interest: information asymmetries and better mgmt. of personal info.
  • “reciprocal surveillance” i.e. simple recording of service provider calls, with “digital notary” – hash, with trusted third party signature on hash… for “we’ve got no record of that” problem. (q: nb Ribbit speech to text could be used here to provide transcripts for court).
  • Premium-rate charges. Phonepayplus (OFCOM subsidiary) holds data about premium-rate charges. Currently can check charges with free sms to 76787… more useful if phone could do this automatically, and enforce rules, e.g. “don’t spend more than 5 pounds”, “don’t send more than 2 texts a day”.
  • Personal data sharing controls: e.g. private, not employer, people I trust etc. Could borrow “sensitivity labels” from MLS (multi-level secure) orange book, where label is indivisible from the data.
  • Control of own identity. Mydex.org – UK govt pilot project; “can share this with you but you can’t pass on to third parties”. Legal/regulatory framework as well as technology. Looking for Symbian volunteers (C++). User-driven identity.
  • Craigh@Symbian.org http://Secblog.symbian.org http://developer.symbian.org/mailman/listinfo/privacy
  • Q: reflashing Symbian phones. Concept of eclipsing – may need to get operator permission to do this. Difficulty depends on which APIs you need access to.
  • Q: change in Symbian culture. Will this feed through into phone manufacturers. Speculation: change in culture. Symbian much smaller (100 not 1700); new focus on getting incoming contributions. Remains to be seen if this culture extends to phone providers – but might be under pressure from regulators. Hoping for some reflashable phones eventually.
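
The “digital notary” idea in the reciprocal surveillance note can be sketched in a few lines. This is only an illustration under my own assumptions: the key, the timestamp and the function names are invented, and HMAC with a shared secret stands in for the public-key signature that a real trusted third party would use.

```python
import hashlib
import hmac

# Hypothetical notary secret. A real notary would sign with a private key;
# HMAC is used here only so the sketch needs nothing beyond the standard library.
NOTARY_KEY = b"demo-notary-secret"


def fingerprint(recording: bytes) -> str:
    """Return the SHA-256 hash of a call recording as a hex string."""
    return hashlib.sha256(recording).hexdigest()


def notarise(digest: str, timestamp: str) -> str:
    """Notary side: sign the hash together with the time it was submitted."""
    message = f"{digest}|{timestamp}".encode()
    return hmac.new(NOTARY_KEY, message, hashlib.sha256).hexdigest()


def verify(recording: bytes, timestamp: str, signature: str) -> bool:
    """Check that a recording still matches what the notary signed."""
    expected = notarise(fingerprint(recording), timestamp)
    return hmac.compare_digest(expected, signature)


# Made-up example data: only the hash ever leaves the phone.
call = b"...audio bytes of the call to the service provider..."
stamp = "2010-06-19T14:30:00Z"
receipt = notarise(fingerprint(call), stamp)

print(verify(call, stamp, receipt))               # True
print(verify(call + b"edited", stamp, receipt))   # False
```

The point of hashing first is that the notary never needs to hear the call: it only vouches that a recording with this fingerprint existed at this time, which is what answers the “we’ve got no record of that” problem.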

FrontlineSMS

  • “technology doesn’t work everywhere”. Ken Banks work http://kiwanja.net in Bushbuckridge.
  • 4 million people in the UK are offline. 39% over 65; 38% unemployed; 19% adults with children. Mobiles reach everywhere, and are in the hands of many.
  • In places like Nairobi, see entrepreneurship, e.g. mobile charging stations – battery in backpack.
  • Tech to auto-reply to messages, subscribe to groups etc over SMS. Ken wrote s/w for this, frontlinesms, when he broke his leg.
  • Hardware: GSM modem (O2 10 quid) in laptop/pc, outlook for SMS… can send out alerts, jobs, prices etc. and get input into radio programs… “did you experience bullying today” etc… get connection into field to send back data…
  • Auto-reply to sms; auto-subscribe; forward to email; forward to person; use to trigger an external command. Frontlineforms: create form onto java-enabled phone – sends compressed messages back to base. MMS: multimediamessage receiving just now, picture messaging includes cell-level diagnostics. New: regularly-scheduled SMS reminders (e.g. to take medicine, go to clinic).
  • Sister orgs: Frontlinemedic, frontlinelearn, frontlinelegal.
  • Use cases: UN FAO has been texting market prices to fishermen. CELAD txting out agricultural advice. Foleshillfield vision project in Coventry towerblock – building community cohesion – use software to send out e.g. gardeners meeting reminders. SurvivorsConnect – to report trafficking.
  • Ideas: Mencap to campaign and send out reminders on e.g. probation and health appts. Domestic violence helplines… people have sms helpline available – quiet, deletable and safer. Support for depression… microfinance… Manchester soup van to homeless people “are in your area”.
  • Think about your audience – do they have broadband, are they heavy SMS users. Think about, then choose a tool. “we learn from you because you’re the innovators and we’re providing the tool”.
  • @laurawhudson http://frontlinesms.com
  • Action: talk to Laura about CrisisCamp day on 25th.
  • Q: setting up. Noted that getting a short code takes a while, but this is v easy to set up.
  • Q: BT microcell project? Can find out and tweet about this. BT is opensource project. Maybe better for camp etc.
  • Q: can plug into web services to send out text? Yes.
  • Q: skype send text? No… only just found you can do this.
  • Q: geographical? Have to register with a group that corresponds to your village etc.
  • Q: USSD e.g. automatic getting of data? Tweet to ask frontlinesms about this.
  • Q: security and encryption of data. Person sending/receiving sms details, lots data on laptop. Issues in oppressive environments; tacticaltech have good security section.
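
To make the auto-reply / auto-subscribe / forward behaviour described above a bit more concrete, here is a toy keyword router. It is not FrontlineSMS code or its API: the class, the keywords (JOIN, LEAVE, PRICES), the coordinator address and the phone numbers are all invented for illustration, and “sending” just appends to a list.

```python
from dataclasses import dataclass, field


@dataclass
class SmsHub:
    """Toy keyword router; a sketch of the idea, not the FrontlineSMS API."""
    groups: dict = field(default_factory=dict)   # group name -> set of phone numbers
    outbox: list = field(default_factory=list)   # (recipient, text) pairs "sent"

    def send(self, to: str, text: str) -> None:
        self.outbox.append((to, text))

    def handle(self, sender: str, text: str) -> None:
        """Route one incoming message on its first word."""
        words = text.strip().split(maxsplit=1)
        keyword = words[0].upper() if words else ""
        rest = words[1].lower() if len(words) > 1 else ""
        if keyword == "JOIN" and rest:
            # Auto-subscribe: add the sender to the named group and confirm.
            self.groups.setdefault(rest, set()).add(sender)
            self.send(sender, f"You have joined {rest}")
        elif keyword == "LEAVE" and rest:
            self.groups.get(rest, set()).discard(sender)
            self.send(sender, f"You have left {rest}")
        elif keyword == "PRICES":
            # Auto-reply with canned text (placeholder content).
            self.send(sender, "Today's prices: (placeholder text)")
        else:
            # Anything unrecognised is forwarded to a human.
            self.send("coordinator@example.org", f"From {sender}: {text}")

    def broadcast(self, group: str, text: str) -> None:
        """Send one message to every subscriber of a group."""
        for number in sorted(self.groups.get(group, set())):
            self.send(number, text)


hub = SmsHub()
hub.handle("+447700900001", "join gardeners")
hub.handle("+447700900002", "is the clinic open?")
hub.broadcast("gardeners", "Meeting at 6pm")
for recipient, text in hub.outbox:
    print(recipient, "<-", text)
```

Registering by group name is also how the “geographical?” answer above would work in this sketch: a village is just another group that people text JOIN to.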

Iris on apps for good; CDI Europe

Also self-invited.

  • Went for higher end, e.g. app development instead, deliberately. CDI train young people to program mobile phone apps. Innovation comes from fringes of society. Useful to ask groups you haven’t asked before.
  • CDI founded 1995 by Rodrigo Bargas? Use tech to help people to solve their own problems. 3 pillars (Educational models): learn better when you try to solve something you care about; community action; technology – latin America = pcs; UK = android apps.
  • Spent last year thinking about how to add value in the UK. “train young people to develop mobile apps to change their world” NEETs.
  • Stop & search app… geotagged… initial release doesn’t include police epaulette numbers yet.
  • Launching again in a Tower Hamlets girls’ school.
  • Lessons learnt: 4 key lessons.
    • Programming isn’t for everyone. Using app inventor for android next.
    • Platform wars: be pragmatic and start somewhere – the course is larger than this
    • Apps are currently 3m: male, middle-aged, middle-class. Brixton teens are using Blackberrys. Provide phones & use emulators.
    • Poverty – in Brazil poverty of stuff, in UK poverty of ambition. Don’t have vision of where they want to get, and confidence to get there. Take ambitions and dreams seriously, but challenge them and don’t provide answers – can provide bridges, but people need to walk themselves over them.
  • Bottom-up innovation: design for multiple outcomes; don’t focus on specific tools; be pragmatic about adoption; try to raise ambitions.
  • http://appsforgirls.eventbrite.com iris.lapinski@cdieurope – can go join them in this event next Wednesday (5-7pm). http://cdieurope.eu
  • Q: where. Are talking to people in Brighton too, and planning to create an opensource online project too.
  • Action: spread word about Weds session to WomenInTechnology and GirlGeeks.

Opensource building design software

Random bonus talk… opensource/ free software for building design is ‘shocking’ and want to do something about this.

  • Lots of software, but proprietary, for windows, and v expensive. Need to satisfy building controllers with it – looking at 450 quid upwards. Dewpoint calc software at 350 quid e.g. 500 quid for something else that’s easy to fix. Online calculators aren’t flexible enough, and need to put same info into multiple apps. V. easy to reimplement some of these, e.g. ones based on British Standards.
  • http://Bimserver.org/ one of the few free sw projects online – no software on this yet. Needs people to write user interfaces online. Get e.g. info from manufacturers websites (e.g. builddesk has this info in, and manufacturers all publish datasheets that could be linked to – finding these could be crowdsourced) – looking for techie people… have sums covered, but need help on the interface.
  • Action: look for a UK crowdsourcing site. If there isn’t one, ask why not.
  • Q: is there opensource autocad software. Yes. E.g. qcad “bit 1980s”… works, but tedious (provided under Aladdin GPL model).
  • Q: why not use sketchup. Not free, and difficult to run on Linux. Building information modeling – has standards. Is a dutch opensource implementation of this, but license for this isn’t really opensource. Learnt from OpenStreetMap to start really simple, then make more complicated over time. If get system architecture right, this will help enormously over time. “real shortage of stuff that’s genuinely open at the moment”.
  • Q: frontlinesms needed to make room for e.g. plugins – e.g. an appstore for apps that can sit on top of frontlinesms – are having to go back and think again about their architecture and how they do it.
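
As a rough illustration of how small some of these calculations are, here is a dew point sketch. It uses the Magnus approximation with commonly quoted coefficients, which is my own choice and not the method from any particular British Standard or from the commercial packages mentioned above; the function names are invented too.

```python
import math

# Magnus coefficients for water vapour over liquid water: a widely used
# approximation, reasonable for roughly -45 to 60 degrees C.
# These are an assumption of this sketch, not taken from a British Standard.
A = 17.62
B = 243.12  # degrees C


def dew_point_c(temp_c: float, rel_humidity_pct: float) -> float:
    """Dew point in degrees C from air temperature and relative humidity."""
    if not 0 < rel_humidity_pct <= 100:
        raise ValueError("relative humidity must be in (0, 100]")
    gamma = math.log(rel_humidity_pct / 100.0) + A * temp_c / (B + temp_c)
    return B * gamma / (A - gamma)


def condensation_risk(surface_temp_c: float, air_temp_c: float,
                      rel_humidity_pct: float) -> bool:
    """True if a surface is at or below the dew point of the room air."""
    return surface_temp_c <= dew_point_c(air_temp_c, rel_humidity_pct)


print(round(dew_point_c(20.0, 50.0), 1))    # about 9.3
print(condensation_risk(8.0, 20.0, 50.0))   # True: an 8 degree surface is below the dew point
```

The sums really are the easy part; the work that needs volunteers is the interface and getting manufacturers’ data in without typing it into several apps.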

How to be an Innovations Manager

These are some of the notes that I’ve left for a successor.  They’re about the spirit rather than the detail of the job (I’ve left that part out to protect the guilty, er, innocent), but I’m hoping they’ll be useful beyond their original organisational boundaries.

First Advice
To do this job, you need the mind of Edison, the spirit of a streetfighter and the thick hide of a battleaxe.  It gets messy sometimes, but always remember that it’s not personal (usually): you’re asking people to change the way that they work and think, and that tends to kick in defence systems (the “corporate immune system”).  Or as my colleague used to say whilst giggling manically “the technology is difficult, but the politics is impossible” (before going back in to sort out the politics).

Your boss might be a long way away. Listen hard to them, but always have a plan in place for what you’re doing over the next few months and how you’re going to respond to unanticipated events (a new idea, a new request, a change of strategy).  They are also not your only boss: their bosses are also keen to keep innovations alive (or they wouldn’t keep signing the cheques) and may often drop by with ideas about the group’s direction and useful things it can do. Listen hard to them too – they’ve got where they are for several very good reasons – but make sure you tell your boss what’s going on if you suddenly seem to change direction on him/her.

What the Job is (and isn’t)
A large part of this job is to work very hard in lots of different areas at once, handling technical and business aspects of ideas, doing heavy-duty politics where necessary to keep them alive, and at the same time making sure that enough other people in the right places take ownership – i.e. believe that it’s their idea and their work that has got it to the stage that it’s at.  This is not a job for egotists: if you want the big prizes then stay away, but if you want quiet respect from people all over the organisation and the satisfaction of seeing things happen, and you’ve got the skills to do that, then it’s probably for you.

You need to understand business. You don’t need to have this understanding right away, but you need to be able to have that gut feel about whether an idea is worth pushing forward or not; how it fits into the organisation, what the competition landscape is, whether the team behind it can be trained to (or is already able to) succeed.  This could come from years of experience (unsolicited bid work helps lots here), but mostly it’s applied common sense: much of it can be gleaned from a combination of training courses, reading the background notes on things like Dragons Den and The Apprentice, watching and learning from the people in the organisation who are really good at it and lots and lots of practice (sponsoring Business Challenges and reading the feedback that its teams get from the business gurus is good for this).

You need to understand the business. What does the organisation do – what do connected organisations do that could help it (or that you could post not-us-but-still-connected ideas to).

You also need to be able to do business, which is not quite the same as understanding how it works. Mostly this is about people, and part of your role is to protect each ideas team whilst they learn enough to do this themselves.

You need to understand and like people. Much of the time you’re supporting people whilst they learn and develop. You need to gain their trust; done right, you will also over time earn their respect. You need to understand how to spot different personalities: their needs, drivers, the things they worry about, the things that scare them, and know when to manage them closely and when to let them fly. And you need to make them feel safe: remember that the deal is that they get the plaudits for their work, and you take the hit if it goes wrong (aka “if it goes right it’s the team’s success, if it goes wrong it’s an innovations experiment”). This isn’t to say that you should paint ‘doormat’ on your head and let people take advantage of you, but a little support in the right way can have some wonderful personal development and business results.

And being a natural systems engineer helps. Lots. Although again much of this can be learnt through patience, training and experience.

Organisation, expectations, design

I’ve written before about the issues that happen when the view from the top of an organisation doesn’t match the view from the bottom.  It’s one of the simple tests that I do to get an indication of that organisation’s health: do the people leading it have the same understanding of the company’s roles, ambitions and operating rules as the people who are doing the work?

If you find yourself in one of these organisations, you have three choices: you can try to change yourself, you can try to change the organisation, or you can leave.  I tend to start with introspection: is it me? Are my expectations too high? Am I applying the wrong cultural norms? Can I change?  Then negotiation: what is it that this organisation is expecting of me? Is it respectful (and conversely, am I – always good to have a sanity check there), is it reasonable, is it honest?  And finally: are my beliefs and the organisation’s belief set so different that there really is no acceptable solution for us both?  And sometimes, no matter how much you believe in something, you just have to walk away.

FEMA

Yesterday, CrisisCommons spent some quality time with Craig Fugate, the head of the Federal Emergency Management Agency (FEMA). He was talking about how to integrate crowdsourcing into formal emergency responses in the US, which is something that’s been bothering me since Haiti.

It goes something like this. Most governments have plans for different types of emergencies. Those plans generally treat the population as something to be contained, controlled and moved from place to place. Yesterday was the first time that I heard an official describe that population as a resource rather than a herdable set of semi-helpless victims. Now granted in an emergency, people will be responding differently and some of them will be in shock and in need of direction, but other people will be in a position (e.g. carrying mobile phones) to feed into a crisis response. FEMA seem to have made an excellent start on this by including a twitter layer into their Common Operating Picture (the name here gives us a clue to how this picture has been created and organised up til now – it’s a military term).  The questions now include how best to use this, and how to use the feedback loop from the government to the people to shift emergency control from group-directed herding at a macro level to guidance of groups and individuals at a micro level based on their gps positions and reported states.

Craig asked CrisisCommons for two technology ideas that could help FEMA’s work with crowdsourcing (i.e. help its operations to work better by using crowdsourcing techniques).  The one that I’d really like to spec out and test is emergency egress from a first-world city.  It goes something like this: every time you see a first-world emergency on TV, the most striking thing isn’t the water or the damaged houses – it’s the lines and lines of cars in traffic jams because they’re all trying to exit the area at the same time. Most times, the emergency is bad but not travelling outwards with the traffic. But. Some emergencies that we haven’t had yet – and I’m thinking about dirty bombs and city-wide fires here – do travel outwards, and for some UK cities the time taken to evacuate their population to a safe distance far exceeds the time that that emergency wave would take to cover that distance. So what could we do with crowdsourcing to help stop the crisis region overtaking the traffic escaping from it?  It may have to involve some *gasp* innovative thinking like car-sharing, people getting out just with what they’re carrying, and playing with the mix of public and private transport. But it’s a systems problem looking for some fresh systems thinking, and we happen to have a set of people like that in Washington right now.
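
A back-of-envelope version of that comparison looks something like the sketch below. Every number in it is made up for illustration (city size, road capacity, hazard speed), and the model is deliberately crude: it only asks how long the exit roads take to clear, and ignores public transport, junctions and the drive itself.

```python
def evacuation_hours(population: float, people_per_car: float,
                     lanes_out: int, cars_per_lane_per_hour: float) -> float:
    """Hours for every car to get onto the exit roads, at full road capacity."""
    cars = population / people_per_car
    return cars / (lanes_out * cars_per_lane_per_hour)


def hazard_hours(safe_distance_km: float, hazard_speed_kmh: float) -> float:
    """Hours for the emergency front to reach the safe distance."""
    return safe_distance_km / hazard_speed_kmh


# Hypothetical city: 500,000 people, 20 outbound lanes, 1,800 cars per lane per hour.
deadline = hazard_hours(safe_distance_km=30, hazard_speed_kmh=5)
solo = evacuation_hours(500_000, people_per_car=1.5, lanes_out=20, cars_per_lane_per_hour=1800)
shared = evacuation_hours(500_000, people_per_car=4, lanes_out=20, cars_per_lane_per_hour=1800)

print(f"hazard arrives in {deadline:.1f} h")           # 6.0 h
print(f"typical car occupancy: {solo:.1f} h to clear")  # about 9.3 h
print(f"car-sharing: {shared:.1f} h to clear")          # about 3.5 h
```

Even in this toy version, the lever that flips the answer is how many people share each vehicle, which is exactly the kind of thing a crowdsourced, phone-based system could help to coordinate.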

The Mysterious Eva

Just when I thought there were no good transliteration puzzles left, a Japanese site leads me to EVA, aka the European Voynich Alphabet, developed to decipher the Voynich manuscript. I’m voting for it being someone’s personal invented language (possibly several languages), but I’d be really really interested in playing with the full text if it ever gets released online.  I just hope it doesn’t upset too many Welsh librarians.

What CrisisCommons learnt from RHOK1.0

Last weekend (4th-5th June 2010) was the second Random Hacks of Kindness, aka RHOK1.0.

RHOKs are hackathons designed to rapid-prototype software that can be used to manage information before, during and after crises like the Haiti earthquake. RHOK was one of the entities born out of the first CrisisCommons camp (other entities included CrisisCamps and the Aid Information Challenges), and because of this and its closeness to the CrisisCamp aims, CrisisCommons has always kept strong ties with the RHOKs.

Lesson 1: Ask for help if you need it.  CrisisCommons realised last month that RHOK organisation was in trouble: we helped where we could, and the RHOK organisers were gracious enough to accept that help. We couldn’t save RHOK1.0 London (two weeks to go and no announcements, no organisation and most crucially no venue arranged in a town that’s getting harder and harder to find BarCamp venues in) but we did manage to do these things:

  • Passed on as much of our experience of coordinating multi-camp, multi-timezone, multi-language and high-pressure CrisisCamps as we could.  Mostly this meant lots of conference calls and emails, with Heather and Noel providing support to the Washington-based organisers.
  • Sent CrisisCommons organisers out to RHOK camps that needed them. Heather Leson (@heatherleson) did an amazing job keeping RHOK Sydney going, although I’m still kicking myself for not putting myself forward when the call for RHOK Jakarta came in (I thought there were other people who deserved the trip more than I did).
  • Raided the CrisisCommons project lists for suitable RHOK problem statements.  This meant more sitting up late getting the project statements into readable order, but the CrisisCommons project managers (including Kimberly Roluf) did us proud.
  • Gathered information, posted information and kept people engaged on RHOK problems before the RHOK camps (big up to Olidag for his work on the UAV problem).
  • Built the RHOK wiki – with one day’s notice! – ready for the RHOK camps to use (big up to @heatherleson, @spikeuk and Brian Chick for getting this sorted so fast).
  • Ran a RHOK operations centre for the 47 hours (8am Sydney til 5pm Washington/Santiago) that RHOK1.0 was on (@spikeUK, @bodaceacat, Sahana’s @rediguana on watch and wiki editing, with RHOK’s Jeremy on the IT admin).
    • Watched the RHOK IRC channel, camp Ustreams (video and chat), twitter hashtags (of which there were many), wiki recent changes log and email for information that needed capturing (projects, video feeds, organisers etc) and projects that could be hooked up with users and external information providers.
    • Maintained the wiki in real-time so the RHOK camps knew what each other were doing and could coordinate if they were working on the same projects.
    • Created a virtual camp from our VirtualCrisisCamp templates.
    • Found information sources, existing projects and potential users for project teams as they came online.
    • Kept people outside the camps informed about what RHOK was doing and how they could join in with other work post-RHOK.
    • Told RHOK organisers and administrators about emerging issues as we found them.
  • Worked as subject matter experts in RHOK hacking teams (Katie @filbertkm, for example)
  • Told potential users (e.g. DFID, CDAC, the UN) that RHOK was about to happen.
  • Judged the results (Noel)!

This didn’t just go one way though – what CrisisCommons got in return was:

  • New and/or improved applications for CrisisCamp and CrisisCommons responses – Haiti Amps Network got a huge boost from RHOK Nairobi, for example.
  • More knowledge about how to set up a string of camps very quickly – remembering that the next time we do this, it will probably be for a major crisis.
  • Much more experience in running a worldwide, short-term-focus operations centre.
  • Lots and lots of new friends around the world.

Lesson 2: Have your infrastructure ready.  The RHOK infrastructure was not ready for RHOK1.0.  Without an Ops Centre, teams working on the same problems were likely to be unaware of each other’s work.  Without a fully-built wiki, it was difficult to know who to contact about what (including who the other countries’ organisers were and where the tech support was). Without an agreed hashtag, the camps weren’t all able to see each other’s feeds.   CrisisCommons did miss a few tricks in transferring its experience into RHOK though – one of these was forgetting to set multi-language options on the wiki early (people were tweeting in English, Portuguese, Spanish and Indonesian and language-specific wikipages might have engaged yet more non-RHOKers with what RHOK was doing).

Lesson 3: Build on existing work. If it’s already been done, then use it (provided it’s opensource of course) to build something better.  We asked some RHOK teams why they hadn’t contacted existing teams working in their application area; we suspected they were shy, and built these bridges for them.  Other teams got out there and built their own bridges once they’d been given the contact details – the UAV team, for example, reached out to DiyDrones and other UAV usergroups, and started to link groups together.

Lesson 4: Talk to your users. Crisis management is not like office management software: it’s measured in lives, not dollars.  If you build something that nobody wants to use, then you’ve wasted effort that could have made a difference elsewhere.  If you ask the people who might use your system about what they need and want, you’ll build something that makes a difference, and it’ll probably be a better system too.  This is the big lesson that the RHOK1.0 winners (Chasm, a landslide prediction app that used an SME and was ready to go by the end of the weekend) can teach us all.  We linked up all the people we could, but a bit more prior preparation (like warning more of the potential users that RHOK was about to happen) could go a long way.

Lesson 5: Arrange 24-hour visible and robust support with no single points of failure. A huge hand to Jeremy for handling a nasty DOS attack on the RHOK servers having been woken up in the middle of (his) night.  But he really shouldn’t have been the only person with the privileges to do this: two or three other admins arranged around the world could have helped a lot here.

That said, RHOK1.0 worked well and there was some amazing collaboration in teams across the world (yes, we’re looking at you, UAV people, People Finder et al).  The world has gained apps that will definitely save lives, and another bunch of people have learnt that they can make a difference by using their tech skills.  Hats off to the RHOK organisers for pulling this off from an almost standing start, and a big hand to all the people who worked behind the scenes to support them.  I think RHOK is now established as an event, and judging by the reactions of people sad to leave it this time, it already has a groundswell of support. RHOK on!

World insurance schemes

Not all insurance schemes are paid for and delivered in money. Granted, we pay for the European Union, which started as an unstated insurance scheme against war (and seems to have worked very well, given how the German and Greek governments must feel about each other at the moment); ditto the UN.

The biggest insurance scheme is still starting up now, and that’s to distribute skills, capacity and goodwill across the whole of the world, so that if any one part of it is hit by disaster, the rest are willing and able to help. As I keep saying in CrisisCamp, it’s not them and us any more, it’s us and us. I have two favourite illustrations of this at the moment – that an African-led group (Ushahidi) could help with a disaster in the Americas, and that the satellite sites for next month’s Random Hacks of Kindness (RHOK1.0) aren’t London, Paris, New York, but Nairobi, Jakarta and Sao Paulo.

What happens next could be interesting.  The tiger, Chinese, Indian and Brazilian economies all grew rapidly in the last couple of decades.  But there’s more. The ‘third world’ is huge, complex, and to Western eyes deeply disorganised, but there is a lot of will there to learn about using and creating technologies, and a lot of work on making that possible, from the solar-powered internet station in a portacabin to the African-created idea of linking individual PC wifis from house to house to form Internets. If we are all becoming equal in the eyes of the Internet, then we Westerners might one day blink, and find on opening our eyes that Africa has overtaken us in innovation, entrepreneurship and hunger for change.  And then we might need yet another insurance scheme that isn’t paid for or returned in money.

Postscript: spent part of weekend watching Random Hacks of Kindness team working in iHub Nairobi.  V impressive techs; see also Africa Launch business site.

Visualising wikis

I’ve been doing some website updates recently, as part of the CrisisCommons work. My father taught me to always clean and examine something carefully before you take it apart, and this works as well for code and sites as it does for cars and houses, so I’ve been carefully analysing each site against a set of intended (and frequency-weighted) user journeys. And what would be a really nice thing to have would be a tool that generated a semantic network of a wikisite so I could trace its hub nodes and get an easy visual representation of how much each node and link is used (colour-coding seemed obvious here).

Now I remember the small worlds (everything is just 6 steps from everything else if you know which 6 steps to take) and semantic network theories from uni, and I’ve knocked up a few labelled graphs myself in my time, and I know there’s some great graph generation freeware out there, so I thought “this has got to be a standard item in the open source community, surely”.

Er. No. But there are some good things out there already.

* Aharef’s visualising websites as graphs. (example). Very close to what I’m looking for, but runs off weblinks and doesn’t tag the hub nodes.
* Flexplorer. Great tool for mapping all the websites you’re pointing at. Not so great for mapping just the one wikisite. Does do good labels.
* WikiMindMap. Does wikis. Does labels. Doesn’t do the whole of a wiki. More of an explorer’s torch (sees a short way but very well) than an explorer’s map (sees everything but in less detail).
* Powermapper – too literal. You get a (non-graphical) representation of the site contents rather than a summary that you can infer metainformation from.

So close, but no cigars. There is however a plan. Aharef has published his/her/their mapping code, so I’m going to see if I can change it to pull out the [[]] tags, and add a name to every heavily-linked node. It’s a plan. We’ll see how the reality pans out. But right now, I’m having a lovely time playing with the Processing visualisation tool and working out whether hacking Flexplorer is a better option.  And it seems like I’m not the only person thinking about doing this.
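
The “pull out the [[]] tags and name the heavily-linked nodes” step might look roughly like this. It is a standalone sketch in Python rather than a change to Aharef’s Processing code, and the sample pages, function names and hub threshold are all made up; real wikitext has more link forms (templates, namespaces, redirects) than this regular expression handles.

```python
import re
from collections import Counter

# Matches [[Target]], [[Target|label]] and [[Target#Section]], capturing only the target.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def outgoing_links(wikitext: str) -> set:
    """Set of page titles that this page links to."""
    return {m.group(1).strip() for m in WIKILINK.finditer(wikitext)}


def build_graph(pages: dict) -> dict:
    """Map each page title to the set of titles it links to."""
    return {title: outgoing_links(text) for title, text in pages.items()}


def hub_nodes(graph: dict, min_inbound: int = 2) -> list:
    """Pages with at least min_inbound inbound links, most linked-to first."""
    inbound = Counter(target for links in graph.values() for target in links)
    return [(page, n) for page, n in inbound.most_common() if n >= min_inbound]


# Tiny made-up wiki.
pages = {
    "Home": "See [[Projects]] and [[Volunteers|who we are]].",
    "Projects": "Back to [[Home]]. Needs [[Volunteers]].",
    "Volunteers": "Sign up via [[Home#Join]].",
}
graph = build_graph(pages)
print(graph["Home"])
print(hub_nodes(graph))   # Home and Volunteers each have 2 inbound links
```

The inbound counts are what would drive the labels and colour-coding in the visualisation; weighting them by the user journeys is a separate step that this doesn’t attempt.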

Tickboxes

Yep, tickboxes.  One of the issues for many real-life entities is data entry from paper forms (as opposed to the online forms of mainly Internet-based entities: yes, yes, the physical world does still exist, and yes it is easier sometimes to get people to put crosses on paper than to lead them to the right website/ phone app).   Specifically, they use paper because it’s their best shot at getting lots of people to fill in data fields, but they then have lots of paper data that they need to digitise.

Options for this, rated from high to low operator involvement, low to high flexibility and low to high complexity, are:

  • Use the forms as raw data, i.e. do all the analysis needed by counting up ticks in boxes etc by hand. This works, but takes time, and isn’t easy to cross-check or add to the analysis used; it can however be optimal for small amounts of data where the analysis is needed quickly.
  • Get someone to type in all the data from the forms. This takes time, but is usually a faster way to get relatively small amounts of data ready for analysis than spending time working out how to do the digitisation.
  • Scan the forms whole into the system.  This captures the data online, but is no better for analysis than the first option.
  • Scan the forms then use OCR and image processing to capture the data on each form. This works for some limited types of data: typed data, neatly filled-in tick-boxes, carefully-spaced capital letters (a la postcode reader), but doesn’t capture free text and may have problems with messy inputs.
  • Scan the forms and use OCR/ image processing to capture as much data from the form as possible, then use the operator to cross-check the captured data (e.g. by online comparison between the original form and the results) and input any free text or other difficult-to-read data.

I’m looking at the last option for some IT4Communities stuff.  As with most autonomy, it makes sense to use the human’s and machine’s strengths together, to work comfortably somewhere away from both excessive operator workload (low autonomy) and massively complex systems (high autonomy). 

I could write a segment-then-find-boxes-then-check-their-occupancy image processing subsystem, but I suspect that because this is a relatively common problem, someone has probably already done this.  I’ll attempt to write the system out of sheer curiosity if I can’t find one, but meanwhile places I’m starting to look include:

  • census data processing
  • exam paper processing
  • Search for freeware paper form processing
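
For what it’s worth, the check-their-occupancy step on its own is tiny. The sketch below assumes the segmenting and box-finding have already happened (the box coordinates would come from the form template), treats the scan as a plain grid of greyscale values, and uses thresholds that I’ve picked out of the air; anything between “clearly empty” and “clearly ticked” is handed to the operator, as in the last option above.

```python
def fill_ratio(image, box, dark_below=128):
    """Fraction of dark pixels inside a box.

    image: list of rows of 0-255 greyscale values (0 = black).
    box: (top, left, bottom, right), with bottom and right exclusive.
    """
    top, left, bottom, right = box
    pixels = [p for row in image[top:bottom] for p in row[left:right]]
    if not pixels:
        raise ValueError("box does not cover any pixels")
    return sum(p < dark_below for p in pixels) / len(pixels)


def classify(ratio, empty_below=0.05, ticked_above=0.20):
    """Return 'empty', 'ticked', or 'review' for the operator to check."""
    if ratio < empty_below:
        return "empty"
    if ratio > ticked_above:
        return "ticked"
    return "review"


W, K = 255, 0  # white and black pixels
blank = [[W] * 4 for _ in range(4)]
cross = [[K, W, W, W],
         [W, K, W, W],
         [W, W, K, W],
         [W, W, W, K]]
smudge = [[W, W, W, W],
          [W, K, K, W],
          [W, W, W, W],
          [W, W, W, W]]

for name, img in [("blank", blank), ("cross", cross), ("smudge", smudge)]:
    print(name, classify(fill_ratio(img, (0, 0, 4, 4))))
# blank empty / cross ticked / smudge review
```

The hard parts are the ones this skips: lining the scan up with the template and coping with skewed or messy pages, which is why it’s worth looking for existing work first.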

Helpful references include: