Software

Notes from John Sarapata’s talk on online responses to organised adversaries

John Sarapata (@JohnSarapata) = head of engineering at Jigsaw (= new name for Google Ideas).  Jigsaw = “the group at Google that tries to help users facing organized violence and oppression”.  A common thread in their work is that they’re dealing with the outputs from organized adversaries, e.g. governments, online mobs, extremist groups like ISIS. One example project is redirectmethod.org, which looks for people who are searching for extremist connections (e.g. ISIS) and shows them content from a different point of view, e.g. a user searching for travel to Aleppo might be shown realistic video of conditions there. [IMHO this is a useful application of social engineering in a clear-cut situation; threats and responses in other situations may be more subtle than this (e.g. what does ‘realistic’ mean in a political context?).] The Jigsaw team is looking at threats and counters at 3 levels of the tech stack: device/user: activities are consuming and creating content; threats include attacks by governments, phishing, surveillance,…

Software

Why am I writing about belief?

[Cross-post from LinkedIn] I’ve been meaning to write a set of sessions on computational belief for a while now, based on the work I’ve done over the years on belief, reasoning, artificial intelligence and community beliefs. With all that’s happening in our world now, both online and in the “real world”, I believe that the time has come to do this. We could start with truth. We often talk about ‘true’ and ‘false’ as though they’re immovable things: that every statement should be able to be assigned one of these values. But it’s a little more complicated than that. What we see as ‘true’ is often the result of a judgement we made, given our perception and experience of the world, that a belief is close enough to certain to be ‘true’. But what if there are no objective truths? In robotics, we talk about “ground truth” and the “god’s…

Software

WriteSpeakCode/ PyLadies joint meetup 2015-10-22: Tales of Open Source: rough notes

PyLadies: international mentorship program for female Python coders (meetup.com, NYC PyLadies). Lisa moderating. Panelists: Maia McCormick, Anna Herlihy, Julian Berman, Ben Darnell, David Turner. Intros: Maia: worked on Outreachy (formerly OPW) – gives stipends to women and minorities to work on OS code; currently at Spring. Anna: works at MongoDB, does a lot of Mongo OS work. Julian: works at Magnetic (ad company); worked on Twisted, started OS project (schema for validating JSON projects). Ben: Tornado maintainer, working on OS distributed database in Go. David: ex FSF, OpenPlans, now at Twitter, “making git faster”. Q: how to find OS projects, how to get started? D: started contributing to Xchat… someone said “wish chat had the following feature”… silence… recently, whatever the company is working on. Advice: find the right project, see if they’re interested, then write the feature. B: started on Python interpreter, was using game library, needed bindings for…

Data Science

Looking at data with Python: Matplotlib and Pandas

I like Python. R and Excel have their uses too for data analysis, but I just keep coming back to Python. One of the first things I want to do once I’ve finally wrangled a dataset out of various APIs, websites and pieces of paper, is to have a good look at what’s in it.  Two Python libraries are useful here: Pandas and Matplotlib. Pandas is Wes McKinney’s library for R-style dataframe (data in rows and columns) manipulation, summary and analysis. Matplotlib is John D Hunter’s library for Matlab-style plots of data. Before you start, you’ll need to type “pip install pandas” and “pip install matplotlib” in the terminal window.   It’s also convention to load the libraries into your code with these two lines:

import pandas as pd
import matplotlib.pyplot as plt

Some things in Pandas (like reading in datafiles) are wonderfully easy; others take a little longer to learn….
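As a rough illustration of that "first good look" at a dataset, here is a minimal sketch using the two libraries named above. The sample data, the column names (city, year, reports) and the output filename are all made up for the example; a real session would more likely call pd.read_csv on a file on disk.

```python
import io

import matplotlib
matplotlib.use("Agg")  # render to a file instead of opening a window
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample data, standing in for a real datafile.
csv_text = """city,year,reports
Manila,2013,120
Manila,2014,180
Cebu,2013,45
Cebu,2014,60
"""

# Reading a datafile is one of the easy parts of Pandas.
df = pd.read_csv(io.StringIO(csv_text))

# First look: the top rows, then summary statistics for the numeric columns.
print(df.head())
print(df.describe())

# Total reports per city, drawn as a simple bar chart.
totals = df.groupby("city")["reports"].sum()
ax = totals.plot(kind="bar")
ax.set_ylabel("reports")
plt.tight_layout()
plt.savefig("reports_by_city.png")  # assumed output filename
```

The Agg backend line is only there so the sketch runs on a machine with no display; in an interactive session plt.show() would do instead of plt.savefig.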

Software

Singularity

I’ve been thinking today about the singularity: the point at which machines become smarter than humans, about an internet of things so smart that we don’t know how to manage it with our existing software paradigms.  And I wondered: a good manager will already be managing entities that are much smarter than them (because you don’t want your best thinkers doing the paperwork, management is another discipline/ skill etc etc); is it perhaps time to think about how to use those management skills on clusters of machines?

Software

Notes from meetup: data-driven design 2.0 (Data-driven architecture), 2015-08-24

Meetup: data-driven design 2.0 (Data-driven architecture), 2015-08-24 Basics: AIANY chapter (http://main.aiany.org/)  #datadrivendesign http://www.meetup.com/Transforming-Architectural-Practice-Meetup/events/224093716/ First panel was http://main.aiany.org/eOCULUS/newsletter/data-in-the-built-environment-new-sources-new-strategies/ Melissa Marsh on intros and bios…  “Transforming architectural practice series” = thinking differently about the process of arch: tools, practice, how they run their business (leads to thinking differently about product).  Panelists showing how taking on data-led practice changes how arch does their work… incorporating different methodologies, s/m/l/xl data.  Today = moving from data sources and collection to examples within projects, how to set up projects and client relationships differently.  Came out of feedback from June event.  Continuing looking at future of design relationships.  Panelists:  Jeff Ferzoco (linepointpath),  Zak Kostura (ARUP, high performance structures – currently form found roof system for MX city)… thinking about project setup and info sharing and how it’s changing client relationships.  Darrick Borowski – on tools and techniques… data-driven design = ask better questions at the beginning… back and…

Software

This is not my journey

I spent some of my Christmas break thinking about work styles: what worked last year, what didn’t, and what I could do to improve my own.  I’ve got it down to just two things: “this is not my journey” and “do what the boss asks for”. People often talk of their jobs (and themselves) as something that they do now, as in at one particular point in time. That’s a little like saying “I’m in seat 29C” instead of “I’m flying from New York to Japan and when I get there I’m going to try out the heated toilet seats” when someone asks you where you are.  We are all on journeys – sometimes literally, but always on journeys through time, careers, relationships.  And if you want to think about your career, a journey is a useful idea. So last year I got really frustrated because I ended up doing…

Data Science

Web Scraping, part 1: files and APIs

Web scraping is extracting information from webpages, usually (but not always) as tables of data that you can save to csv files, json/xml files or databases.

Design it first, then scrape it

When you start on any piece of code, try asking yourself some design questions first; definitely do this if you’re thinking about something as potentially complex as web scraping code.   So you’ve seen a dataset hiding in a website – it might be a table of data that you need, or lists of data, or data spread across multiple pages on the site. Here are some questions for you:

1. Do you need to write scraper code at all? Is the dataset very small – if you’re talking about 10 values that don’t get updated, writing a scraper will take longer than just typing all the numbers into a spreadsheet. Has the site owner made this data…
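To illustrate the "files" half of the title: when a dataset is already published as a downloadable file, reading that file is usually much less work than scraping the page around it. This is a minimal sketch using only the Python standard library; the function names and the sample data are made up for the example, and fetch_text is defined but not called so that the sketch runs offline.

```python
import csv
import io
import urllib.request


def fetch_text(url):
    """Download a text file from a URL (hypothetical helper, not called below)."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


def rows_from_csv_text(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


# Made-up stand-in for a downloaded file.
sample = "country,value\nKenya,10\nHaiti,7\n"
rows = rows_from_csv_text(sample)

for row in rows:
    print(row["country"], row["value"])
```

With a real published file, the call would look like rows_from_csv_text(fetch_text(url)), where url is whatever address the site owner provides.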

Software

Ruby day 9: Local power!

This. Just this: local mappers made more changes to the map of the Philippines during Typhoon Ruby than anyone else in the world (by a very very big margin). Anyone who doesn’t believe in the strength of local people to build their own resilience should look very, very hard at these numbers. Ruby’s all over now for the mappers – DHN is de-activated, everyone’s gone back to work.  There’s still a lot of work to do on the cleanup: MarkC mentioned 35000+ houses destroyed and 200000 people without shelter, and there will still be OSM mapping to do for that.  This weekend Celina’s running a “train the trainers” OSM event in Manila: if you’re one of the people who created the figures above, please please go and help spread your skills further!

Software

Connecting Ushahidi data to the HDX repository

I’ve been talking to the HDX team for some time now (well, since before HDX was a thing, but then so have many of us).  HDX is a data repository for humanitarian data: basically, it’s a place to put machine- and human-readable datasets so that other people can use them too. Ushahidi tools (Ushahidi platform instances, Crowdmap instances) often have datasets in them that could be useful to other people, so part of the conversation has been about how to share data from Ushahidi sites, both on the HDX site and in the HXL humanitarian standard.

Ushahidi CSV to HDX CSV

First let’s look at how to share the CSV file that Ushahidi creates when you click on the “download reports” button. Before you do this, please, please, please read my post about mitigating potential risks to people from sharing your Ushahidi data. Converting that CSV into HXL format is pretty…
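One way to picture the conversion: HXL-tagged CSV files carry a row of hashtags directly under the header row. The sketch below inserts such a row into CSV text. The Ushahidi column names and the column-to-hashtag mapping are assumptions made for illustration, not taken from the post, and the function name is hypothetical; check both against your own export and the HXL standard before using anything like this.

```python
import csv
import io

# Assumed mapping from Ushahidi export columns to HXL hashtags (illustrative only).
HXL_TAGS = {
    "INCIDENT DATE": "#date",
    "LOCATION": "#loc+name",
    "DESCRIPTION": "#description",
    "LATITUDE": "#geo+lat",
    "LONGITUDE": "#geo+lon",
}


def add_hxl_row(csv_text, tags=HXL_TAGS):
    """Return CSV text with a hashtag row inserted under the header row.

    Columns with no entry in `tags` get an empty cell, i.e. stay untagged.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    tag_row = [tags.get(name.strip(), "") for name in header]

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerow(tag_row)
    writer.writerows(data)
    return out.getvalue()


# Made-up one-report export, for illustration.
sample = (
    "#,INCIDENT TITLE,INCIDENT DATE,LOCATION,DESCRIPTION,LATITUDE,LONGITUDE\n"
    "1,Road blocked,2014-12-07 10:00:00,Tacloban,Debris on main road,11.24,125.00\n"
)
print(add_hxl_row(sample))
```

This only adds the tag row; it does nothing about removing sensitive fields, which still needs doing before any data is shared.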