Misinformation readings

[Cross-posted from Medium https://medium.com/misinfosec/misinformation-readings-d29b62a60f10]

my misinformation bookshelf

I always tuck two books in my bag for travel, and was amused last week to see multiple Sofwerx speakers with the same yellow book (Singer and Brooking’s LikeWar). Other people were asking me for book recommendations, so here they are.

First, a caveat: these are not practical books (that’s why I’m busy writing one), and current literature on misinformation has a heavy focus on the 2016 US elections. That’s understandable: like the 9/11 attacks in 2001, the 2016 Russian attacks across US social media showed a previously-secure America fragilities it assumed that only other countries had (FWIW, my personal opinion is that the truth is more complex than “some brain hackers stole an election”, but there was definitely good craft there), and it makes a strong case study. So let’s start with those books.

  • Benkler, Faris, Roberts “Network Propaganda: manipulation, disinformation, and radicalization in American politics”. Example quote: “Something fundamental was happening to threaten democracy, and our collective eye fell on the novel and rapidly changing — technology”. A data scientists’ book, in that it looks at the systems in play, analyses the data available, and considers how insights from it could be applied in other situations. I’ve been recommending this to people who want to start building fixes.
  • Singer and Brooking “LikeWar: The Weaponization of Social Media”. Example quote: “Narrative, emotion, authenticity, community, and inundation are the most effective tools of online battles”. Clearly illustrates a lot of basic ideas in misinformation, written by people who’ve tracked its effects. Personally I find it over-hawkish: I know that we need to talk about information warfare, but one person’s infowars is sometimes another person’s politics, or advertising, or just plain interesting community or meme. I also found the pace slow, but that might be because I’ve lived through a lot of these things (some of the illustrations are events that I’ve worked myself). Could be a good introduction for someone completely new to the area.
  • Jamieson “Cyberwar: how Russian hackers and trolls helped elect a president”. Example quote: “Outcome three: the former Secretary of State is elected and the country simply moves on, but the sabotage nonetheless has magnified cultural tensions and functioned as a pilot from which to birth later success – perhaps when she runs for a second term”. A scholarly work with a lot of insight specifically on the fragility of American politics; the style reminds me of a three-letter-agency report. Useful if you want to understand how America got to where it is now; could be well matched with Heuer’s “Structured Analytic Techniques for intelligence analysis”.
  • Watts “Messing with the enemy: surviving in a social media world of hackers, terrorists, Russians and fake news”. Example quote: “Having trained at the FBI academy and studied Soviet politics, military and intelligence at West Point, I recognized the technique as the digital update to age-old spycraft”. First-person account from one of the people I call ‘trackers’: individuals who interact with terrorist and misinformation accounts and targets online. Useful if you want to understand what tracking misinformation online looks like in practice.

The next part of my bookshelf is about how we got here in the first place.

  • Denning “Information Warfare and Security”. Misinformation attacks are a form of hacking. If you really want to understand information warfare, and where the misinformation version of it might go next, Denning is the classic book. Read it.
  • Berger “Contagious: why things catch on”. If you want to understand misinformation, you need to understand belief systems, online belief transfer and specifically how we transfer beliefs through things like memes. This book helps.
  • Johnson “The information diet: a case for conscious consumption”. A book focussed on the problem of online information overload, written at the time that crisis data teams first started seeing test misinformation messages appearing during disasters (Kate Starbird has detailed these in her work), by the person who managed Obama’s online campaign. I re-read this to remind myself of where we’ve come from online.
  • Tufekci “Twitter and tear gas: the power and fragility of networked protest”. A tool is a tool. The things that are useful to someone running a misinformation campaign are also useful to someone supporting an honest political campaign, or a popular revolution. The author understands well both parts of this, and works through the Arab Spring, algorithms and surveillance states as part of this new internet power.

(that list was hard to build: McLuhan’s Mechanical Bride, Schelling’s Strategy of conflict, Holiday’s Trust me I’m lying, VanPutte’s Walking Wounded, and Galloway’s The Four are just some of the great books in the reject pile).

And finally, a section on ‘next’. YMMV, but I like to keep these three books around to ground me:

  • Brunton and Nissenbaum “Obfuscation: a user’s guide for privacy and protest”. Keeping yourself safe and private is hard work these days. This “how-to” book for individuals has lots of useful ideas that could be scaled and/or automated.
  • O’Neil “Weapons of math destruction: how big data increases inequality and threatens democracy”. Future campaigns will most likely be a combination of people and algorithms targeting people and algorithms. This book describes what happens when algorithms are applied to people’s lives without care, e.g. without checking for bias or discrimination in the data or models that they’re built with, but that lack of care could also be created through model poisoning and other algorithm-based attacks.
  • Saxe and Sanders “Malware data science: attack detection and attribution”. Misinformation, botnets, campaigns, trolls: to date they’ve been fairly crude, mostly manual or only lightly automated. That’s changing, and it’s worth keeping up with trends in the application of machine learning and artificial intelligence to hacking, to see what might be coming next. There aren’t many books on this subject yet: here’s a good one.

That’s my basic booklist, and I’ve started a Goodreads list based on it that I’d love people to add to.

The other books you read on your misinformation journey will depend on who you are: I’m looking across now at shelves containing everything from geopolitics and psychology to image processing and user experience, with a little cooking and bonsai thrown in for light relief. There are also websites, groups and people to follow — I keep a basic list of those too.

Practical influence operations

[Cross-posted from Medium https://medium.com/misinfosec/practical-influence-operations-760b03f8a493]

Sofwerx are the US Special Operations command’s open source innovations unit – it’s where scarily fit (but nice) military people, contractors, academics and people with green hair, piercings, dodgy pasts and fiercely bright creative minds come together to help solve wicked problems that need extremely unconventional but still disciplined thinking (don’t ever make the mistake of thinking hackers are just chaotic; given a good problem, they are very very focussed and determined). This week, Sofwerx gathered some experts together to tell other parts of the US government about Weaponized Information.

I was one of the experts. I’m low on purple hair and piercings, but I do have about 30 years’ practice in the defense innovation community, am lucky enough to speak at Black Hat and Defcon about things like how algorithms control humans and the ethics of adversarial machine learning, and I really get interested in things. Like how autonomous vehicles and people team together. Or how agencies like FEMA and the UN use data in disasters. Or, for the past couple of years, how to counter attacks on human belief systems. Here are some notes from my talk whilst I wait for the amazing media team to make me look elegant in the video (please please!).

I spoke at the end of the day, so my talk is rightly one of a pair of talks, with David Perlman and myself bookending the day. After a military introduction (waves to Pablo), David opened the day by describing the landscape of misinformation and other large-scale, internet-enabled influence operations; expert talks during the day built out from that, explaining lessons we can learn from earlier operations against Jihadis (Scot Terban), deep dives into specific technologies of interest (Matthew Sorrell and Irene Amerini on countering deepfakes and other multimedia forensics), then me pulling those back together with a talk setting out a framework from which we (and by we I meant the people in front of us in the room plus a discipline created from the skills of a very specific set of experts) could start to respond to the problems, before passing it back to the military in the form of Keith Dear from the RAF. 

So. Lots of people talk about the problem of misinformation, election hacking, influence operations, Russia, the Internet Research Agency blah blah blah. Very few of them talk about potential solutions, including difficult or unconventional solutions. The analogy I used at the start was from my days playing rugby union for my university. Sometimes we would play another side, and the two sides would be completely matched in power and ballskill and speed, but one side just didn’t understand the tactics of the game. And the other side would score again and again against them, to the point of embarrassment, because they knew the field and had gameplay and their opponents didn’t. And that is what recent history has felt like to me. If you’re going to talk about solutions, and about handling misinformation as a “someone has to do this, and someone is going to have to do this forever, because this is just like spam and isn’t going to go away” thing, you’re going to need processes, you’re going to need your own gameplay, and you’re going to need to understand the other side’s gameplay so you can get inside and disrupt it. Play the game. Don’t just stand on the field.

Offense

So first, offense. Usually I talk a lot about this, because I’ve spent a lot of time in the past two years raising the alarm with different communities, and asking them to frame the problem in terms of actors with intents, targets, artefacts and potential vulnerabilities. This time I skimmed over it: I mentioned the Internet Research Agency in Russia as the obvious biggest player in this game, but noted that despite their size they were playing a relatively unsubtle, unsophisticated game, and that more interesting to me were the more subtle tests and attacks that might also be happening whilst we were watching them. I defined misinformation as deliberately false information with an objective that’s often money or geopolitical gain and ranges from the strange (“Putin with aliens!”) to the individually dangerous (“muslim rape gangs in Toronto”). I also pushed back on the idea that influence operations aren’t the same as social engineering; to me, influence operations are social engineering at scale, and if we use the SE definition of “psychological manipulation of people into performing actions or divulging confidential information”, we are still talking about action, but those actions are often either in aggregate or at a remove from the original target (e.g. a population is targeted to force a politician to take action), with the scale being sometimes millions of people (Russian-owned Facebook groups in the 2016 Congress investigation had shares and interactions in the 10s of millions, although we do have to allow for botnet activities there).

Scale is important when we talk about impacts: these can range from individual – people caught up in opposing-group demonstrations deliberately created at the same place and time, to communities – disaster responses and resources being diverted around imaginary roadblocks (e.g. fake “bridge out” messaging) to nationstate (the “meme war” organizing pages that we saw with QAnon and related groups’ branding for the US, Canada and other nations in the past year). 

Targeting is scaled too: every speaker mentioned human cognitive biases; although I have my favorite biases like familiarity backfire (if you repeat a message with a negative in it, humans remember the message but not the negative) there are hundreds of other biases that can be used as a human attack surface online (the cognitive bias codex lists about 180 of them). There’s sideways scale: many efforts focus on single platforms, but misinformation is now everywhere there’s user-generated content: social media sites like Facebook, Twitter, Reddit, Eventbrite, but also comment streams, payment sites, event sites: anywhere you can leave a message, comment, image, video, content that another human can sense. Influence operations aren’t new, but social media buys reach and scale: you can buy 1000 easy-to-find bots for a few dollars or 100 very hard to detect Twitter or Facebook ‘aged users’ for $150; less if you know where to look. There are plenty of botnet setup guides online; a large cheap set can do a lot of damage very quickly, and you can play a longer, more subtle online game by adding a little pattern matching or AI to a smaller aged set.

Actors and motivations pretty much divide into: state/nonstate actors who are doing this for geopolitical gain (creating discord or swaying opinion on a specific topic), entrepreneurs doing it for money (usually driving people to view their websites and making money from advertising on them), grassroots groups doing it for fun (e.g. to create chaos as a new form of vandalism) and private influencers for either attention (the sharks on the subways) or, sometimes, money. This isn’t always a clean-cut landscape: American individual influencers have been known to create content that is cut-and-pasted onto entrepreneurs’ websites (most, but increasingly not all, entrepreneurs don’t have English as their first language and the US is a large market); that messaging is often also useful to the state actors (especially if their goal is in-country division) and attractive to grassroots groups. This is a huge snurfball, and people like Ben Nimmo do great work unravelling some of the linkages in it.

Defence

One of the most insightful comments I got at a talk was “isn’t this just like spam? Won’t they just make it go away the same way?”. I didn’t appreciate it at the time, and my first thought was “but we’re the ‘they’, dammit”, but IMHO there are some good correlates here, and that one question got me thinking about whether we could treat misinformation the same way we treat other unwanted internet content like spam and DDoS attacks.

I’ve looked at a lot of disciplines, architectures and frameworks (“lean/agile misinformation”, anyone?) and the ones that look closest to what we need come from information security.  One of these is the Gartner cycle: deceptively simple with its prevent-detect-respond-predict.  The good news is that we can take these existing frameworks and fit our problem to them, to see if there are areas that we’ve missed or need to strengthen in some way.   The other good news is that approach works well.  The bad news is that if you fit existing misinformation defense work to the Gartner cycle, we’ve got quite a lot of detect work going on, a small bit of prevent, almost no effective respond and nothing of note except some special exceptions (Chapeau! again to Macron’s election team for the wonderful con-job you pulled on your attackers) on predict.
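
The "fit existing work to the cycle and look for gaps" exercise can be sketched in a few lines of Python. This is only an illustration: the four stage names come from the cycle above, but the inventory entries and their stage labels are made-up placeholders, not a survey of real work.

```python
from collections import Counter

# The four stages of the Gartner cycle named in the text.
STAGES = ("prevent", "detect", "respond", "predict")


def coverage(activities):
    """Count activities per stage; `activities` is a list of (name, stage) pairs."""
    for name, stage in activities:
        if stage not in STAGES:
            raise ValueError(f"{name!r} has unknown stage {stage!r}")
    counts = Counter(stage for _, stage in activities)
    # Counter returns 0 for stages that nobody is working on.
    return {stage: counts[stage] for stage in STAGES}


def gaps(activities, minimum=1):
    """Return the stages with fewer than `minimum` activities, in cycle order."""
    return [stage for stage, n in coverage(activities).items() if n < minimum]


# Hypothetical inventory, for illustration only.
inventory = [
    ("botnet tracking", "detect"),
    ("known-fake-story lists", "detect"),
    ("media literacy training", "prevent"),
]
print(coverage(inventory))
print(gaps(inventory))
```

With this placeholder inventory the gaps come out as respond and predict, which is the shape of the problem described above; a real inventory would need people to agree on which stage each piece of work belongs to.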

Looking at “detect”: one big weakness of an influence operation is that the influencer has to be visible in some way (although the smart ones find ways to pop up and remove messages quickly, and target small enough to be difficult to detect) – they leave “artefacts”, traces of their activity. There are groups and sites dedicated to detecting and tracking online botnets, which are useful places to look up any ‘user’ behaving suspiciously. The artefacts they use tend to split into content and context artefacts. Content artefacts are things within a message or profile: known hashtags (e.g. #qanon), text that correlates with known bots, image artefacts in deepfake videos, known fake news URLs, known fake stories. Stories are interesting because sites like Snopes already exist to track at the story level, and groups like Buzzfeed and FEMA have started listing known fake stories during special events like natural disasters. But determining whether something is misinformation from content alone can be difficult – the Credibility Coalition and W3C credibility standards I’ve been helping with also include context-based artefacts: whether users are connected to known botnets, trolls or previous rumors (akin to the intelligence system of rating both the content and the carrier), their follower and retweet/likes patterns and metadata like advertising tags and DNS. One promising avenue, as always, is to follow the money, in this case advertising dollars; this is promising both in tracking misinformation and also in its potential to disrupt it.
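
As a rough illustration of the content/context split, here is a minimal Python sketch. Everything in it is an assumption made for the example: the watchlists, the `Message` fields and the label strings are hypothetical, and real detectors draw on far richer signals than set membership.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse

# Hypothetical watchlists; in practice these would come from tracking groups.
KNOWN_HASHTAGS = {"#qanon"}
KNOWN_FAKE_DOMAINS = {"fake-news.example"}
KNOWN_BOT_ACCOUNTS = {"bot_account_1"}


@dataclass
class Message:
    author: str
    text: str
    urls: list = field(default_factory=list)
    amplified_by: list = field(default_factory=list)  # accounts that shared it


def content_artefacts(msg):
    """Artefacts found inside the message itself: hashtags and linked domains."""
    found = []
    words = {w.strip(".,!?;:").lower() for w in msg.text.split()}
    for tag in sorted(KNOWN_HASHTAGS):
        if tag in words:
            found.append(f"hashtag:{tag}")
    for url in msg.urls:
        host = urlparse(url).hostname or ""
        if host in KNOWN_FAKE_DOMAINS:
            found.append(f"url:{host}")
    return found


def context_artefacts(msg):
    """Artefacts from who carries the message, not from what it says."""
    found = []
    if msg.author in KNOWN_BOT_ACCOUNTS:
        found.append(f"author_is_known_bot:{msg.author}")
    for account in msg.amplified_by:
        if account in KNOWN_BOT_ACCOUNTS:
            found.append(f"amplified_by_known_bot:{account}")
    return found
```

Keeping the two functions separate mirrors the idea of rating the content and the carrier independently: a message can look clean on content and still be flagged on context, or the other way round.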

There are different levels of “respond”, ranging from individual actions to community, platform and nationstates.  Individuals can report user behaviors to social media platforms; this has been problematic so far, for reasons discussed in earlier talks (basically platform hesitation at accidentally removing user accounts).  Individuals can also report brands advertising on “fake news” sites to advertisers through pressure groups like Sleeping Giants, who have been effective in communicating the risk from this to the brands.  Individuals have tools that they can use to collaboratively block specific account types (e.g. new accounts, accounts with few followers): all of these individual behaviors could be scaled.  Platforms have options: they do remove non-human traffic (the polite term for “botnets and other creepy online things”) and make trolls less visible to other users; ad exchanges do remove non-human traffic (because of a related problem, click fraud – bots don’t buy from brands) and problematic pages from their listings.
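
The account-type blocking mentioned above (new accounts, accounts with few followers) is simple enough to sketch, which is part of why it could be scaled. This is a minimal Python sketch under stated assumptions: the `Account` fields, the function names and the default thresholds are all invented for the example and do not describe any particular tool.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Account:
    handle: str
    created: date
    followers: int


def should_block(account, today, min_age_days=30, min_followers=10):
    """Flag accounts that are very new or have very few followers.

    The thresholds are arbitrary illustrative defaults.
    """
    age_days = (today - account.created).days
    return age_days < min_age_days or account.followers < min_followers


def shared_blocklist(accounts, today, **thresholds):
    """Build a sorted list of handles that a group of users could share."""
    return sorted(a.handle for a in accounts if should_block(a, today, **thresholds))
```

Rules this blunt will also catch genuine newcomers, so a shared list like this is a starting filter for humans to review, not a verdict.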

Some communities actively respond. One of my favorites is the Lithuanian ‘Elves’: an anonymous online group who fight Russian misinformation online, apparently successfully, with a combination of humor and facts. This has also been promising in small-scale trials in Panama and the US during disasters (full disclosure: I ran one of those tests). One of the geopolitical aims of influence operations that was mentioned by several other speakers was to widen political divides in a country. A community that’s been very active in countering that is the peace technology community, and specifically the Commons Project, which used techniques developed across divides including Israel-Palestine and Cyprus with a combination of bots and humans to rebuild human connections across damaged political divides.

On a smaller scale, things that have been tried in the past years include parody-based counter-campaigns, SEO hacks to place disambiguation sites above misinformation sites in search results, overwhelming (“dogpiling onto”) misinformation hashtags with unrelated content, diverting misinformation hashtag followers with spoof messages, misspelt addresses and user names (‘typosquatting’), and identifying and engaging with affected individuals. I remain awed by my co-conspirator Tim who is a master at this.

All the above has been tactical because that’s where we are right now, but there are also strategic things going on. Initiatives to inoculate and educate people about misinformation exist, and the long work of bringing it into the light continues in many places.

Adaptations

I covered offense and defence, but that’s never the whole of a game: for instance, in yet another of my interests, MLsec (the application of machine learning to information security),  the community divides its work into using machine learning to attack, using it to defend, and attacking the machine learning itself. 

Right now the game is changing, and this is why I’m emphasizing frameworks. This time also feels to me like the moment Cliff Stoll writes about in The Cuckoo’s Egg, when one man is investigating an information security incursion, a “hack”, happening through his computers, and slowly finds other people across the government who are recognizing the problem too, before that small group grows out into the huge industry we see today.

We need frameworks because the attacks are adapting quickly, and it’s going to get worse because of advances in areas like MLsec: we’re creating adaptive, machine-learning-driven attacks that learn to evade machine-learning-driven detectors and rapidly heading from artefact-based to behavior-based to intent-based discussions. Things already happening or likely to happen next include hybrid attacks where attackers combine algorithms and humans to evade and attack a combination of algorithms (e.g. detectors, popularity etc) and humans; a current shift from obvious trolls and botnets to infiltrating and weaponizing existing human communities (mass-scale “useful idiots”); and attacks across multiple channels at the same time, masked with techniques like pop-up and low-and-slow messaging. This is where we are: this is becoming an established part of hybrid warfare that needs to be considered not as war, but certainly on a similar level to, say, turning up in part of Colombia with some money and a gunboat pointed at the railway station and accidentally creating a new country from a territory you’d quite like to build a canal in (Panama). Also of note is what happens if the countries currently attacking the US make the geopolitical and personal gains they required, stop their current campaigns and leave several hundred highly-trained influence operators without a salary. Generally what happens in those situations is an industry forms around commercial targets: some of this has already happened, but those numbers could be interesting, and not in a good way.

One framework isn’t enough to cover this. The SANS sliding scale of security describes, from left to right, the work needed to secure a system, from architecting that system to be secure, through passively defending it against threats, actively responding to attacks and producing intelligence, all the way to “legal countermeasures and self-defense against an adversary”. We have some of the architecture work done. Some of the passive defence. Lots of intelligence. There’s potential for defense here. There’s going to need to be strategic and tactical collaboration, and by that I mean practical things: for instance, nobody quite knows what to call the state we’re in. It’s not war but it is a form of landgrab (later in the day I whispered “are we the Indians?” to a co-speaker, meaning this must have been what it felt like to be a powerful leader watching the settlers say “nice country, we’ll take it”), possibly politics with the addition of other means, and without that definition it’s really hard to regulate what is and isn’t allowed to happen (also perhaps important: it seems that only the military have limits on themselves in this space). With cross-platform subtle attacks, collaboration and information sharing will be crucial, so trusted third-party exchanges matter. Sharing of offensive techniques, tactics and processes matters too, so a misinformation version of the ATT&CK framework for now (I tried fitting it to the end of the existing framework and it just doesn’t fit – the shape is good but there are adjustments needed) with a SANS top 20 later (because we’re already seeing the same attack vectors repeating, misinformation versions of script kiddies etc etc). There’s a defense correlate to the algorithms + humans comment on offense above: we will most likely need a hybrid response of algorithms plus humans countering attacks by algorithms plus humans. We will need to think the unthinkable, even if we immediately reject it (“Great Wall Of America”, nah).

And we really need to talk about what offense would look like: and I don’t mean that in a kinetic sense, I mean what are valid self-defense actions.

I ended my presentation with a brief glimpse at what I’m working on right now, and a plea for the audience.  I’m working half my time helping to build the Global Disinformation Index, an independent disinformation rating system, and the rest researching areas that interest me, which right now is that misinformation equivalent to the ATT&CK techniques, tactics and procedures framework. My plea for the audience was to please not fight the last war here.

Bodacea Light Industries LLC

I have a consulting company now.  It’s not something I meant to do, and I’ve learned something important from it: creating a company for its own sake is a lot less likely to succeed than creating a company because it helps you do something else that you really wanted to do.

In my case, that’s to work full-time on countering automated and semi-automated influence operations, whether that’s through a 20-hour-a-week contract helping to create the Global Disinformation Index as part of the forever pushback on misinformation-based fraud (“fake news” sites etc), or working on practical solutions, e.g. writing a how-to book and working on infosec-style architectural frameworks for misinformation responses, so that, as I put it in a talk yesterday, we can “actually play the game, instead of standing on the field wondering what’s going on whilst the other team is running round us with gameplays and rulebooks”.

I still have much paperwork and website filling to go before I ‘hard launch’ BLightI, as I’ve started affectionately calling the company (and buying the namespace for, before you get any ideas, fellow hackers…). I also have quite a lot of work to do (more soon). In the meantime, if you’re interested in what it and I are doing, watch this space and @bodaceacat for updates.

Security frameworks for misinformation

[Cross-posted from Medium https://medium.com/misinfosec/security-frameworks-for-misinformation-d4b58e4047ec]

Someone over in the AI Village (one of the MLsec communities – check them out) asked about frameworks for testing misinformation attacks.  Whilst the original question was perhaps about how to simulate and test attacks – and more on how we did that later – one thing I’ve thought about a lot over the past year is how misinformation could fit into a ‘standard’ infosec response framework (this comes from a different thought, namely who the heck is going to do all the work of defending against the misinformation waves that are now with us forever).

I digress. I’m using some of my new free time to read up on security frameworks, and I’ve been enjoying Travis Smith’s Leveraging Mitre Att&ck video.  Let’s see how some of that maps to misinfo.

First, the Gartner adaptive security architecture.

The Four Stages of an Adaptive Security Architecture

It’s a cycle (OODA! intelligence cycle! etc!) but the main point of the Gartner article is that security is now continuous rather than incident-based. That matches well with what I’m seeing in the MLsec community (that attackers are automating, and those automations are adaptive, e.g. capable of change in real time) and with what I believe will happen next in misinformation: a shift from crude human-created, templated incidents to machine-generated, adaptive continuous attacks.

The four stages in this cycle seem sound for misinformation attacks,  but we would need to change the objects under consideration (e.g. systems might need to change to communities) and the details of actions under headings like “contain incidents”.  9/10 I’d post-it this one.

Then the SANS sliding scale of cyber security

[Image: the SANS sliding scale of cyber security]

AKA “Yes, you want to hit them, but you’ll get a better return from sorting out your basics”. This feels related to the Gartner cycle in that the left side is very much about prevention, and the right about response and prediction. As with most infosec, a lot of this is about what we’re protecting from what, when, why and how.

With my misinformation hat on, architecture seems to be key here. I know that we have to do the hard work of protecting our base systems: educating and inoculating populations (basically patching the humans), designing platforms to be harder to infect with bots and/or botlike behaviours. SANS talks about compliance, but for misinformation there’s nothing (yet) to be compliant against. I think we need to fix that. Non-human traffic pushing misinformation is an easy win here: nobody loves an undeclared bot, especially if it’s trying to scalp your grandmother.

For passive defence, we need to keep up with non-human traffic evading our detectors, and have work still to do on semi-automated misinformation classifiers and propagation detection. Misinformation firewalls are drastic but interesting: I could argue that the Great Firewall (of China) is a misinformation firewall, and perhaps that becomes an architectural decision for some subsystems too.

Active defence is where the humans come in, working on the edge cases that the classifiers can’t determine (there will most likely always be a human in the system somewhere, on each side), and hunting for subtly crafted attacks. I also like the idea of misinformation canaries (we had some of these from the attacker side, to show which types of unit were being shut down, but they could be useful on the defence side too).

Intelligence is where most misinformation researchers (on the blue team side) have been this past year: reverse-engineering attacks, looking at tactics, techniques etc. Here’s where we need repositories, sharing and exchanges of information like MisinfoCon and CredCo.

And offense is the push back – everything from taking down fake news sites and removing related advertisers to the types of creative manoeuvres exemplified by the Macron election team.

To me, Gartner is tactical, SANS is more strategic.  Which is another random thought I’d been kicking around recently: that if we look at actors and intent, looking at the strategic, tactical and execution levels of misinformation attacks can also give us a useful way to bucket them together.  But I digress: let’s get back to following Travis, who looks next at what needs to be secured.

[Image: CIS Controls V7 matrix]

For infosec, there’s the SANS Top 20, aka CIS controls (above). This interests me because it’s designed at an execution level for systems with boundaries that can be defended, and part of our misinformation problem is that we haven’t really thought hard about what our boundaries are – and if we have thought about them, have trouble deciding where they should be and how they could be implemented. It’s a useful exercise to find misinformation equivalents though, because you can’t defend what you can’t define (“you know it when you see it” isn’t a useful trust&safety guideline).  More soon on this.

Compliance frameworks aren’t likely to help us here: whilst HIPAA is useful to infosec, it’s not really forcing anyone to tell the truth online. Although I like the idea of writing hardening guides for adtech exchanges, search, cloud providers and other (unwitting) enablers of “fake news” sites and large-scale misinformation attacks, this isn’t a hardware problem (unless someone gets really creative repurposing unused bitcoin miners).

So there are still a couple of frameworks to go (and that’s not including the 3-dimensional framework I saw and liked at Hack Manhattan), which I’ll skim through for completeness, more to see if they spark any “ah, we would have missed that” thoughts.

[Image: Lockheed-Martin cyber kill chain]

Lockheed-Martin cyber kill chain: yeah, this basically just says “you’ve got a problem”. Nothing in here for us; moving on.


And finally, MITRE’s ATT&CK framework, or Adversarial Tactics, Techniques & Common Knowledge. To quote Travis, it’s a list of the 11 most common tactics used against systems, and 100s of techniques used at each phase of their kill chains. Misinformation science is moving quickly, but not that quickly, and I’ve watched the same types of attacks (indeed the same text even) be reused against different communities at different times for different ends.  We have lists from watching since 2010, and pre-internet related work from before that: either extending ATT&CK (I started on that a while ago, but had issues making it fit) or a separate repository of attack types is starting to make sense.

A final note on how we characterise attacks. I think the infosec lists on characterising attackers make sense for misinformation: persistence, privilege escalation, defence evasion, credential access, discovery, lateral movement, execution, collection, exfiltration, and command and control are all just as valid if we’re talking about communities instead of hardware and systems. Travis’s notes characterising response make sense for us too: we also need to gather information each time on what is affected, what to gather, how the technique was used by an adversary, how to prevent a technique from being exploited, and how to detect the attack.

We’re not at war — we’re just doing infosec

[Cross-posted from Medium https://medium.com/misinfosec/were-not-at-war-we-re-just-doing-infosec-fff1d25fbcd1]

I read three connected things today: the digital Maginot line, about how we’ve been in a digital “warm war” for the past few years; why influence matters in the spread of misinformation; and are ads really that bad?

The first one tells a story that I’ve told myself over the past few years — that the age-old drive to gain territory and power has moved from turning up with weapons and fighting (‘kinetic warfare’) to get other countries to hand them over, to using social media channels to persuade those countries’ people to either welcome the new overlords, or persuade their own politicians to not interfere when their neighbouring countries get overrun.

I’ve sounded warnings because I believe I’ve lived in a uniquely fragile country in uniquely fragile conditions at a uniquely fragile time. I don’t however believe that that’s always going to be the case, and I think the answers to that lie in both past and recent history, and across articles like these three. Bear with me because I’m still gathering my own thoughts (into, amongst other things, a practical book) and hadn’t meant to write so much background yet.

Every country in the world has been trying to influence every other country that it interacts with since before there have been countries: it’s geopolitics. It tries to influence populations: its own through politics, others through propaganda and other more benign forms of outreach and image manipulation.

What social media buys an aggressor country is reach and scale. The same thing that lets me chat with friends in Kenya, or lets you see that not everyone in Peru is living on a farm with 10 llamas, also allows anyone anywhere to insert themselves into an influence network in any language on any topic anywhere else on the planet. And there’s the power, right there.

One way to stop playing a game is to step outside of the game. Widespread user-generated content is relatively new: so what’s to stop defending countries from putting up their own digital firewalls, China-style, and collapsing the troublesome parts of the internet on themselves? Perhaps it’s that the entities that are massively powerful have changed, and countries are small in power now, compared to large global companies. And that gets me to a small glimmer of hope. Because those companies’ power is rooted in ad money. And ads are shown to humans. And despite all our globalisation, all our progress, all our ability to buy our Christmas decorations from Norway, I’ve seen the data, and people are still very geographically rooted in their ad clicks. It will be interesting to see what countries do.

It’s also interesting to see what people do. Because without users, a social media company becomes — is Myspace still around? And without people who can be manipulated, an influence campaign is just more conspiracy-laden shouting (how *is* qanon doing these days btw?). I said when I started that I’ve lived in a uniquely fragile country in uniquely fragile conditions at a uniquely fragile time. I think we’re still there, but I’m starting to see signs of resilience that give me hope.

Yes, we have work to do, but FWIW I don’t think we’re at war. I do think we need a new layer of information security and, amongst others, I’m working on tactics, techniques and procedures for that. We’re perhaps at the Cuckoo’s Egg part of the journey, and this is forever work, not a quick set of fixes, but it’s worth it.

Not just America

I’ve been reading lately – my current book is Benkler et al’s Network Propaganda, and in its Origins chapter, I was reading about 19th century vs 20th century American political styles and how they apply today, and caught myself thinking “am I studying America too much? Misinformation is hitting most countries around the world now – am I biased because I’m here?”.

I think that’s a valid question to ask.  Despite there being good work in many places around the world (looking at you, Lithuania and France!), much has been made of the 2016 US elections and beyond; many of the studies I’ve seen and work I’ve been involved in have a US bias, funding or teams.  And are we making our responses vulnerable because of that?

I think not.  And I think not because we’ve all been touched by American culture.  When we talk about globalisation, we’re usually talking about American companies, American movies and styles and brands, and much Internet culture has followed that too (which country doesn’t bother with a country suffix?) And just as we see Snickers and Disney all over the world, so too have our cultures been changed by things like game shows.

making a simple map (15 minute project)

It’s Portland’s tech crawl tomorrow, where we all visit each other’s offices, admire the toys (reminder to self: must put some more air in the 7′ lobster), drink each other’s beers and try to persuade each other’s techs to work for somewhere with cooler toys.  My office is on the crawl, but rather than be the token female data scientist in there, I’m going to go out visiting.
I started by grabbing and formatting some data.
There’s no map of the crawl (yet), and I needed to know how far I’d be walking (I’m still recovering from my time in a wheelchair). There is a list of companies on the route, so I cut and pasted that list into a text file that I imaginatively named techcrawl_raw.txt.
Then I applied a little Python:
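import pandas as pd

# Read the pasted list; each line is expected to look like "Company name (url)"
with open('techcrawl_raw.txt', 'r') as fin:
    txt = fin.read()

# set() drops duplicate lines; the empty string is removed explicitly
lines = set(txt.splitlines()) - set([''])
# Strip the trailing ")" and split on " (" to get [name, url]
df = pd.DataFrame([x.strip()[:-1].split(' (') for x in lines],
                  columns=['name', 'url'])
df['address'] = df['name'] + ', portland, oregon'
df.to_csv('tech_crawl.csv', index=False)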
This takes the contents of the text file, which looks like this:
Creates an array that looks like this (NB the order changed because I used set() to remove any duplicate entries):
Converts it to a pandas dataframe, and adds a new column with the company name plus “, portland, oregon”, then dumps that dataframe to a csv file that looks like this:
The code that creates the array is fragile (just one added space could break it), but it works for a one-shot thing.
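If the list were going to be reused, a slightly less fragile version of the parsing step could match each line with a regular expression instead of slicing characters off the end. This is only a sketch: it assumes each line has the form "name (url)", and parse_crawl_list, LINE_PATTERN and the sample company names are made up for illustration.

```python
import re

# Assumed line format: "Company name (url)", with optional stray spaces.
LINE_PATTERN = re.compile(r"^\s*(?P<name>.+?)\s*\(\s*(?P<url>[^()]*?)\s*\)\s*$")


def parse_crawl_list(text, city="portland, oregon"):
    """Return sorted, de-duplicated (name, url, address) rows from pasted text."""
    rows = set()
    for line in text.splitlines():
        match = LINE_PATTERN.match(line)
        if match is None:
            continue  # skip blank or unexpected lines instead of failing
        name, url = match.group("name"), match.group("url")
        rows.add((name, url, f"{name}, {city}"))
    return sorted(rows)


# Made-up sample input with a blank line, stray spaces and a duplicate.
sample = (
    "Acme Labs (acme.example)\n"
    "\n"
    "  Widget Co  ( widget.example ) \n"
    "Acme Labs (acme.example)\n"
)
rows = parse_crawl_list(sample)
```

The rows can then be written out under a name, url, address header with the standard csv module, or passed to pandas as before.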
And then I made a map.
Go to Google Maps, click on the hamburger menu then “your places”, “maps”, “create map” (that’s at the bottom of the page) then “import”. Select the csv (e.g. tech_crawl.csv), click on “address” when it says “choose columns to position your placemarks” and “name” for “column to title your markers”.
You’ve now got a map, because Google used its gazetteer to look up all the addresses you gave it.  It won’t be perfect: I got “2 rows couldn’t be shown on the map” – you can click on “open data table” and go edit the “address” field til Google finds where they are.
It’s a little fancier than the basic map. To fancy it up, I clicked on “untitled map” and gave the map a name; I hovered over “all items” until a little paintpot appeared, clicked on that and chose a colour (green) and icon (wineglass). I also clicked on “add layer” and added a layer called “important things”, used the search bar to find the start and afterparty locations, then clicked on the icons that appeared on the map, then “add to map” and used the paintpot to customise those too. And that was it. One simple map, about 15 minutes, most of which was spent creating the CSV file. And a drinking, er, visiting route that I can walk without reaching exhaustion.
Making your own maps for fun is fun, but there are other more serious maps you can help with too, like the Humanitarian OpenStreetMap maps of disaster areas – if you’re interested, the current task list is at https://tasks.hotosm.org/

Squirrel! Or, learning to love nonlinearity

I write a lot.  I write posts, and notes, comments on other people’s work and long emails to friends about things that interest me (and hopefully them too).  But I don’t write enough here.  And part of that is the perception of writing as a perfect thing, as a contiguous thread of thought, of a blog as a “themed” thing written for an audience.

So I stopped writing here. I do that sometimes. Because the things that interest me vary, and aren’t always serious, or aren’t part of the current ‘theme’ (which is currently misinformation and how people think).  Or I don’t have enough time, and leave half-written notes in my ‘drafts’ folder waiting to be turned into ‘good’ content.

But that seems a little grandiose. I’m assuming that people read this blog, that they’re looking for a specific thing, for something polished, a product.  And that it’s my job to provide that.  And that leads to the above, to stasis, to me not publishing anything.

So for now, I’m just going to write about what interests me, about the projects I’m working on, the thoughts that I spin out into larger things.  The serious stuff will be on my medium, https://medium.com/@sarajayneterp

Who handles misinformation outbreaks?

[Cross-post from Medium https://medium.com/misinfosec/who-handles-misinformation-outbreaks-e635442972df]

Misinformation attacks, the deliberate and sustained creation and amplification of false information at scale, are a problem. Some of them start as jokes (the ever-present street sharks in disasters) or attempts to push an agenda (e.g. right-wing brigading); some are there to make money (the “Macedonian teens”), or part of ongoing attempts to destabilise countries including the US, UK and Canada (e.g. Russia’s Internet Research Agency using troll and bot amplification of divisive messages).

Enough people are writing about why misinformation attacks happen, what they look like and what motivates attackers. Fewer people are actively countering attacks. Here are some of them, roughly categorised as:

  • Journalists and data scientists: Make misinformation visible
  • Platforms and governments: Reduce misinformation spread
  • Communities: Directly engage misinformation
  • Adtech: Remove or reduce misinformation rewards

Throughout this note, I want to ask questions about these entities like “why are they doing this”, “how could we do this better” and “is this sustainable”, and think about what a joined-up system to counter larger-scale misinformation attacks might look like (I’m deliberately focussing on sustained attacks, and ignoring one-off “sharks in the streets” type misinformation here).

Make Misinformation Visible

If you can see misinformation attacks in real time, you can start to do something about them. Right now, that can be as simple as checking patterns across a set of known hashtags, accounts or urls on a single social media platform. That’s what the Alliance for Securing Democracy does, with its dashboards of artefacts (hashtags, urls etc) from suspected misinformation trolls in the US (Hamilton68) and Germany (Artikel38), what groups like botsentinel do with artefacts from suspected botnets, and sites like newstracker do for artefacts from Facebook.
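At its simplest, that kind of check can be sketched in a few lines of Python. This is a rough illustration rather than how any of the dashboards above actually work: the watchlist entries, the sample messages and the function names are all made up, and the token pattern is deliberately crude.

```python
import re
from collections import Counter

# Hypothetical watchlist; a real one would come from a shared artefact repository.
KNOWN_ARTEFACTS = {"#examplehashtag", "fakenews.example", "@suspected_troll"}

# Crude pattern for hashtags, account mentions and bare or http(s) domains.
TOKEN_PATTERN = re.compile(
    r"[#@]\w+|(?:https?://)?(?:[\w-]+\.)+[a-z]{2,}", re.IGNORECASE
)


def extract_artefacts(message):
    """Return the lower-cased hashtags, mentions and domains found in one message."""
    tokens = set()
    for token in TOKEN_PATTERN.findall(message):
        token = re.sub(r"^https?://", "", token.lower())
        if token.startswith("www."):
            token = token[4:]
        tokens.add(token)
    return tokens


def count_known_artefacts(messages, known=KNOWN_ARTEFACTS):
    """Count how many messages contain each known artefact."""
    counts = Counter()
    for message in messages:
        counts.update(extract_artefacts(message) & known)
    return counts


# Made-up messages standing in for a platform feed.
messages = [
    "Read this http://fakenews.example/story #ExampleHashtag",
    "Nothing to see here, just llamas",
    "RT @suspected_troll: more at www.fakenews.example",
]
counts = count_known_artefacts(messages)
```

A sudden jump in any of these counts is only a prompt for a human to go and look; it says nothing on its own about who is behind the messages.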

In almost real-time, there are groups tracking and debunking rumours: Snopes and similar fact-checking organisations, small-scale community groups like no melon and Georgia’s Myth Detector, and state-level work like the European External Action Service East Stratcom Task Force’s EU vs Disinfo campaign and US State Department’s polygraph.info.

For some events, like US elections, there are many groups collecting and storing message-level data, but that work is often academic, aimed at learning and strategic outputs (e.g. advocacy, policy, news articles). Some work worth watching includes Jonathan Albright’s event-specific datasets; Kate Starbird’s work on crisis misinformation; Oxford Internet Institute’s Computational Propaganda project (e.g. their 2016 USA elections work) and DFR Lab’s #electionwatch and #digitalsherlocks work.

Real-time misinformation monitoring is costly to set up accurately, and is usually tied to a geography, event, group of accounts or known artefacts, and a small number of platforms. That leaves gaps (were there any dashboards for the recent Ontario elections?) and overlaps, and there’s a lot of scope here for real-time information-sharing (oh god did I just volunteer for yet another data standards body?) across platforms, countries and topics, because that’s where the attackers are starting to work.

Simple things like checking a new platform for known misinformation artefacts, e.g. the US House Intelligence Committee’s datasets [“Exhibit A”] of Russia-funded Facebook advertisements, and checking for links to known fake news sites have already unearthed new misinformation sources. We’re already seeing multi-platform attacks and evolution in tactics to make detection and responses harder, and really good techniques (the evolutions of early persona management software into large-scale natural language posts) haven’t really been used yet. When they are, our tracking will need to get broader and more sophisticated, and will need all the artefacts it can get.

More tracking projects are listed in Shane Greenup’s article, and it’s worth watching Disinfo Portal, which reports on other anti-misinformation campaigns.

Reduce misinformation spread

Large-scale misinformation is a pipeline. It starts in different places and for different reasons, but it generally needs a source, content (text, images, video etc), online presence (websites, social media accounts, advertisements, ability to write comments or other user-generated content in sites and fora etc), reach (e.g. through botnet or viral amplification), and end targets (community beliefs, people viewing advertising etc) to succeed.

The source is a good place to start. I’ve had success in the past asking someone to remove or edit their misinforming posts, but that’s not our use-case here: we’re looking at persistent and often adversarial content. Attacking a creator’s intent, or using other less-savory methods to dissuade them is perhaps something from a bad spy novel, but these have also been known to happen in real life. The creators of fake news sites are often there to make money, and either removing the promise of cash (see below) or making it riskier to obtain that money (e.g. by penalising the site creators) might work. The creators of social media-based misinformation attacks often have other incentives, getting quickly into the realm of politics and diplomacy (also see below).

The next place is the platforms that host misinformation. These are typically websites (e.g. “macedonian fake news sites” ) and social media platforms, but we’ve also seen misinformation coordinated across other user generated content hosts, including comment sections and payment sites.

Websites typically use internet service providers, domain name providers and advertising exchanges (for monetisation). Internet service providers could remove, and have removed, sites that violate their terms of service (e.g. GoDaddy removed the Daily Stormer and other sites have been removed for stealing content), but we haven’t seen them remove reported “fake news” sites yet (we’d love to be corrected on that). Sites can be removed for domain squatting or names similar to trademarks, but most fake news sites steer clear of this. Advertising exchanges and their customers do keep blacklists of sites that include fake news sites (that’s covered in more detail below). It’s also possible to disrupt views to fake news sites, by changing search engine results using sites with similar text, squatting on domains with similar names etc.

Misinformation on social media can be original content (text, images etc), pointers to fake news sites, or amplification of either of these through repetition, use of popular hashtags and usernames, or other marketing techniques to get sources and messages more attention. One area of much interest recently has been amplification of misinformation by trolls, bots and brigades (e.g. coordinated 8chan users); we’re starting to see these adapt to current detection techniques and are looking at ways they’re likely to evolve as artificial intelligence techniques improve (that’s for a different post).

Social media platforms have tried a variety of ways to stop misinformation spreading.

  • Early work focussed on the reader of misinformation, e.g. adding credibility markers to messages, adding fact-check buttons and notifying people who engaged with known bots/trolls (this is included in Facebook’s list of possible actions). Whilst reader education is important in combatting misinformation, user actions are difficult to obtain (which is why advertisers set a high monetary value on clicks and actions) and some actions (e.g. Twitter check marks) added to reader confusion.
  • The most-discussed platform actions are identifying and removing fake accounts. In an adversarial environment, this needs to be done quickly, e.g. remove botnets whilst misinformation attacks are happening, not after, and that speed increases the potential for collateral damage including misclassification and platform friction for users. Removing accounts is also not without risk to the company: Twitter suspended 70 million accounts recently, which had very little effect on the active botnets being tracked (most of the accounts removed were inactive), but did damage Twitter’s share price: Twitter, like many social media platforms, makes most of its revenue from advertising, and also has to manage perception of the company (a good example of this is the drop in transactions after AppNexus’s 2015 bot cleanout).
  • People are good pattern and nuance detectors: users can and do flag fake accounts and hate speech in messages to platforms, but this both creates a queue to be managed, and a potential for abuse itself (several of my female friends have had their social media accounts reported by abusers online). Abuse reports appear to go into ‘black holes’ (many of the well-documented botnet accounts are still active), and misinformation messages are often carefully crafted to create division without triggering platform hate speech rules, but there may still be some merit in this.
  • A softer way to reduce the spread of misinformation is to limit the visibility of suspected misinformation and its promoters, and increase the operating cost (in terms of time and attention) for accounts thought to be parts of botnets. We’ve seen shadowbans (making content invisible to everyone except its creator) and requests for account verification (e.g. by SMS) on bot accounts with specific characteristics recently: whilst reverifying every account in a 1000-bot network can be done, it takes time and adds friction to the running of a misinformation botnet.
  • The end point of misinformation is its target demographic(s). Since reach (the number of people who see each idea or message) is important, anything that limits either reach or the propagation speed of misinformation is useful. There are spam filter-style tools at this end point (e.g. BlockTogether) that highlight suspicious content in a user’s social media feeds, but platforms don’t yet have coordinated blocking at the user end point.
  • Large-scale misinformation attacks happen across multiple platforms, any one of which might not have a strong enough signal for removing messages on its own. There are meetings but not much visible response coordination across platforms yet, and the idea of an EFF-style watchdog for the online providers that misinformation flows through is a good one if it’s backed up with coordinated real-time response. This is where standards bodies like the Credibility Coalition are valuable, in helping to improve the ways that information is shared.

Stopping misinformation at the platform level can be improved by borrowing frameworks and techniques from the information security community (which yes, is also a post in its own right), including techniques for adversarial environments like creating ‘red teams’ to attack our own systems. Facebook has, encouragingly, set up a red team for misinformation; it will be interesting to see where that goes.

Stopping misinformation at the platform level also needs new policies, to provide a legal framework for removing offenders, and to align removing misinformation with the business interests of the platforms. One solution is for platforms to extend their existing policies and actions for hate speech, pornography and platform abuses. Removing misinformation comes with financial, legal and social risks to a platform, and if it’s to get into policies and development plans, it needs strong support. This is where governments can play a large part, in the same way that GDPR data privacy regulations forced change. There are already regulations in Germany, and similar activity in other countries that looks very similar to the pre-GDPR discussions on data privacy and consent; unfortunately, misinformation regulations are also being discussed by countries with a history of censorship, making it even more important to get these right. A sensible move for platforms now is to create and test their own policies and feed into government policy, before policy is forced on them from outside.

Reducing misinformation rewards

Misinformation attacks usually have goals, ranging from financial profit to creating favorable political conditions (approval or confusion: both can work) for a specific nation-state and its actions.

  • Online advertising is a main source of funding for many misinformation sites. Reducing misinformation advertising revenues reduces the profits and economic incentives of “fake news” sites, and the operating revenue available to misinformation campaigns. Adtech companies keep blacklists of fraudulent websites: although very few of them (e.g. online advertisers in Slovakia) explicitly blacklist misinformation sites, misinformation sites and bots are often blacklisted already because they’re correlated with activities like click fraud, bitcoin mining and gaming market sentiment (e.g. in cryptocurrency groups). Community campaigns like Sleeping Giants have been effective in creating adtech boycotts of political misinformation and hate speech sites. Other work supporting this includes collection and analysis of fake news sites.
  • Political misinformation is part of a larger information warfare / propaganda game: few politicians have played it better than Macron’s election team, who created false information honeypots and responses to their disclosure before the election blackout started in France. Social media is now a part of cyberwarfare, so we’re likely to see (or not see) more skillful responses like this online.

Ultimately when we respond to misinformation attacks by reducing rewards, we’re not trying to completely eradicate the misinformation — as with all good forms of conflict reduction, we’re trying to make it prohibitively costly for any but the most determined attackers to have the resources and reach to affect us.

Engaging misinformation online

Even if platforms work to remove misinformation botnets and limit the reach of trolls, misinformation will still get through. It’s not clear who’s responsible for countering misinformation attacks at this point: it’s made it past the platforms’ controls, but still has the potential to damage belief systems and communities (increasingly in real life, as evidenced by trolls setting up opposing protests at the same locations and times).

At this point, it’s appropriate to engage with the material and its creators, limiting the reach of bots and trolls whilst being mindful of personal and community safeties. This engagement can happen at most points in the misinformation pipeline, and this is another area where the infosec mindset of creatively adapting systems can be useful. Some recent notable examples include:

  • Lithuanian and Latvian ‘Elves’: roughly 100 people who respond to Russian-backed misinformation with humour and verified content, in “Elves vs. Trolls skirmishes”.
  • VOST work (in Spanish) on tagging disaster misinformation with links to verified updates, and retweeting misinformation with “Falso” stamps.
  • Overwhelming misinformation hashtags with other content (beautiful recent examples: a right-wing hashtag suddenly filled with news about a swimming contest, and an Indian guru using the Qanon hashtag).
  • Using search engine optimisation and brigading to move other stories above misinformation pages and related search terms in search and news results.

Engaging with bots and trolls has risks, including doxxing, which should be carefully considered if you’re planning to do this yourself.

The ecosystem, in general

Many of the players countering misinformation are working on a small scale. This can be effective, but we’re looking at an issue that will probably be part of the internet for the foreseeable future, and from my experience running crisismapping deployments, I know that without more support, these efforts might not be sustainable in the long term.

The social media platforms have a great deal of leverage on this problem — they could shut down misinformation almost overnight, but that would be at great potential cost to them, both in audience and in financial cost (e.g. the cost to share price of ad-supported platforms removing ad viewings). Reducing money supplies (e.g. adtech) and other misinformation incentives is another good avenue of approach, and we’ve seen some working community responses but they might be difficult to scale over time.

Misinformation attacks aren’t going away. Today they’re nation-state backed and personal attacks; tomorrow those skills could be scaled with AI and applied to companies, groups and other organisations. We need to think about this as an ecosystem, using similar tools and mindsets to information security. We also need to create more joined-up, cross-platform responses.

Thanks to Connie Moon Sehat.

“Max evil” MLsec: why should you care?

[cross-posted from Medium https://medium.com/@sarajayneterp/max-evil-mlsec-why-should-you-care-ae3a42bfea52]

[Shoutout: we’re looking for papers — if you’re working on MLsec, please consider submitting to AI Village by June 15th, CAMLIS by June 30th.]

MLsec is the intersection of machine learning, artificial intelligence, deep learning and information security. It has an active community (see the MLsec project, Defcon’s AI Village and the CAMLIS conference) and a host of applications including the offensive ones outlined in “The Malicious Use of AI”.

One of the things we’ve been talking about is what it means to be ethical in this space. MLsec work divides into using AI to attack systems and ecosystems, using AI to defend them, and attacking the AI itself, to change its models or behavior in some way (e.g. the groups that poisoned the Tay chatbot’s inputs were directly attacking its models to make it racist). Ethics applies to each of these.

Talking about Ethics

Talking about algorithm ethics is trendy, and outside MLsec, there’s been a lot of recent discussion of ethics and AI. But many of the people talking aren’t practitioners: they don’t build AI systems, or train models or make design decisions on their algorithms. There’s also talk about ethics in infosec because it’s a powerful field that affects many people, and when we twin it with another powerful field (AI), and know how much chaos we could unleash with MLsec, we really need to get its ethics right.

This discussion needs to come from practitioners: the MLsec equivalents of Cathy O’Neil (who I will love forever for seamlessly weaving analysis of penis sizes with flawed recidivism algorithms and other abuses of people). It still needs to be part of the wider conversations about hype cycles (people not trusting bots, then overtrusting them, then reaching a social compromise with them), data risk, and what happens to people and their societies when they start sharing jobs, space, homes, societal decisions and online society with algorithms and bots (sexbots, slackbots, AI-backed bots etc), but we also need to think about the risks that are unique to our domain.

A simple definition

There are many philosophy courses on ethics. In my data work, I’ve used a simple definition: ethics is about risk, which has 3 main parts:

  • how bad the consequences of something are (e.g. death is a risk, but so is having your flight delayed),
  • how likely that thing is to happen (e.g. death in a driverless train is relatively rare) and
  • who it affects (in this case, the system designer, owner, user and other stakeholders, bystanders etc).

Risk also has perceptions: for example, people believe that risks from terrorism are greater than those from train travel, and people’s attitudes to different risks can vary from risk-seeking through risk-neutral to risk-averse.

Ethics is about reducing risk in the sense of reducing the combination of the severity of adverse effects, the likelihood of them happening and the number of people or entities it affects. It’s about not being “that guy” with the technology, and being aware of the potential effects and side effects of what we do.
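One crude way to make that combination concrete is to multiply the three parts together and compare the results. The sketch below is an assumption made for illustration, not a standard formula: the multiplicative score, the 0 to 1 scales, the Risk class and every number in it are invented, and it ignores risk perception entirely.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    severity: float       # 0 (trivial) to 1 (catastrophic); the scale is an assumption
    likelihood: float     # probability-like value between 0 and 1
    people_affected: int  # rough count of people or entities affected

    def score(self) -> float:
        # Assumed combination: severity x likelihood x number affected.
        return self.severity * self.likelihood * self.people_affected


def rank_risks(risks):
    """Return the risks sorted from highest to lowest score."""
    return sorted(risks, key=lambda risk: risk.score(), reverse=True)


# Invented numbers, purely to show the mechanics.
risks = [
    Risk("flight delay", severity=0.1, likelihood=0.2, people_affected=200),
    Risk("death on a driverless train", severity=1.0, likelihood=0.000001, people_affected=1),
]
ranked = rank_risks(risks)
```

With these invented numbers the flight delay ranks higher, which is a reminder that a single score hides judgement calls: how severity is scaled matters at least as much as the arithmetic.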

Ethics in MLsec

One of my hacker heroes is @straithe. Her work on how to hack human behaviors with robots is creative, incredible, and opening up a whole new area of incursion, information exfiltration and potential destruction. She thinks the hard thoughts about MLsec risks, and some of the things she’s talked about recently include:

  • Using a kid or dead relative’s voice in phishing phone calls. Yes, we can do that: anyone who podcasts or posts videos of themselves online is leaving data about their voice, it’s relatively easy to record people talking, and Baidu’s voice-mimicking programs are already good enough to fool someone.
  • Using bots (both online and offline) to build emotional bonds with kids, then ‘killing’ or threatening to ‘kill’ those bots.
  • Using passive-aggressive bots to make people do things against their own self-interests.

The bad things here are generally personal (distress etc), but that’s not “max evil” yet. What about these?

  • Changing people’s access to resources by triggering a series of actions that reduce their social credit scores and adversely change their life (this is possible using small actions and a model of their human network).
  • Microtargeting groups of people with emotive content, to force or change their behavior (e.g. start riots and larger conflicts: when does political advertising stop and warfare start?).
  • Taking control of a set of autonomous vehicles (what responsibility do you have if one crashes and kills people?)
  • Mass unintended consequences from your machine learning system (e.g. unintentional racism in its actions).

Now we’re on a bigger scale, both in numbers and effect. When we talk about chaos, we’re talking about letting loose adaptive algorithms and mobile entities that could potentially do anything from making people cry through to destroying their lives or killing them. Welcome to my life, where we talk about just how far we could go with a technology on both attack and defense, because often we’re up against adversaries who won’t hesitate to have the evil thoughts and act on them, and someone has to think them in order to counter them. This is normal in infosec, and we need to have the same discussions about things like the limits of red team testing, blue team actions, deception and responsible disclosure.

Why we should care

People can get hurt in infosec operations. Some of those hurts are small (e.g. the loss of face from being successfully phished); some of them are larger. Sometimes the damage is to targets, sometimes it’s to your own team, and sometimes it’s to people you don’t even know (like the non-trolls who found themselves on the big list of Russian trolls).

MLsec is infosec on steroids: we have this incredibly powerful, exciting research area. I’ve giggled too at the thought of cute robots getting people to open secure doors for them, and it’s fun to think the evil thoughts, to go to places where most people (or at least people who like to sleep at night) shouldn’t go. But with that power comes responsibility, and our responsibility here is to think about ethics, about societal and moral lines before we unknowingly cross them.

Some basic actions for us:

  • When we build or use models, think about whether they’re “fair” to different demographics. Human-created data is often biased against particular groups, and we shouldn’t just blindly replicate that.
  • If our models, bots, robots etc can affect humans, think about what the worst effects could be on them, and whether we’re prepared to accept that risk for them.
  • Make our design choices wisely.
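
As a starting point for the first action above, here's a minimal sketch of one simple fairness check: comparing how often a model gives a positive prediction to each demographic group. The function names, group labels and data are hypothetical, and this gap in positive rates is only one of several fairness measures; it isn't the right one for every system.

```python
from collections import defaultdict


def positive_rate_by_group(predictions, groups):
    """Fraction of positive predictions for each group label."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        if pred:
            positives[group] += 1
    return {g: positives[g] / totals[g] for g in totals}


def parity_gap(predictions, groups):
    """Difference between the highest and lowest group positive rates."""
    rates = positive_rate_by_group(predictions, groups)
    return max(rates.values()) - min(rates.values())


# Made-up predictions (1 = positive outcome) and group labels.
preds = [1, 1, 0, 1, 0, 0, 0, 1]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

rates = positive_rate_by_group(preds, groups)
gap = parity_gap(preds, groups)
print(rates, gap)
```

A large gap doesn't prove a model is unfair, and a small one doesn't prove it's fair, but it's a cheap signal that the model deserves a closer look before it affects real people.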

Further reading: