For many organizations, social media monitoring is a key component of being able to detect specific threats from cyber threat actors. While some nation states and terrorist groups leverage social media as part of their larger strategic efforts to influence populations, recruit followers and create C2 infrastructure, hacktivists and other cyber criminals often leverage it to promote their brand and agenda.
Don’t think that it is just malicious actors, either. Many white hats and grey hats will try to reach out through social media to notify organizations about a vulnerability or a possible breach. So you should ask yourself: do you know if your social media team is tracking this type of information? Would you be notified if someone made a threat against your organization through social media, or if a security researcher publicly disclosed a vulnerability on your site? I’m going to show you how to create a tool that will allow you to get this information straight from Twitter, delivered to your inbox.
This blog post is focused on providing a high-level overview of how any organization can monitor social media based on a set of criteria that you create. Once created, the script will provide a means for analysts to receive and triage notifications from Twitter that are pertinent to their jobs. There are of course various enterprise-level tools that do this, but depending on your resources and the maturity level of your threat intel team, it might make more sense to build your own tools, establish a workflow and then make a case for additional tooling to enhance the pre-existing capabilities.
Our tool is going to be focused on three main processes. First, we want to be able to automatically pull down tweets from Twitter. Second, we want to compare those tweets to some established criteria to know whether we should notify or not. Lastly, when a tweet meets our criteria, we want an email notification to be sent to an inbox for further triaging.
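To make those three processes a bit more concrete, here is a rough sketch of how they could hang together. It is not the final tool: a hard-coded list stands in for the live Twitter stream, the email is only built and never sent, and every name in it (`matches_criteria`, `build_notification`, `run_pipeline`, the example address) is made up for illustration.

```python
from email.message import EmailMessage


def matches_criteria(text, terms):
    """Stage 2: return True if any watch term appears in the tweet text."""
    lowered = text.lower()
    return any(term.lower() in lowered for term in terms)


def build_notification(author, text, recipient):
    """Stage 3: build (but do not send) an email describing the tweet."""
    message = EmailMessage()
    message["To"] = recipient
    message["Subject"] = "Twitter alert: tweet from %s" % author
    message.set_content(text)
    return message


def run_pipeline(tweets, terms, recipient):
    """Tie the stages together for an iterable of (author, text) pairs."""
    return [
        build_notification(author, text, recipient)
        for author, text in tweets
        if matches_criteria(text, terms)
    ]


# Stage 1 stand-in: a hard-coded list instead of a live Twitter stream.
sample_tweets = [
    ("alice", "Lovely weather today"),
    ("bob", "example.com is TangoDown"),
]
alerts = run_pipeline(sample_tweets, ["tangodown", "ddos"], "intel@example.com")
print(len(alerts), alerts[0]["Subject"])
```

Only the second sample tweet contains a watch term, so only one notification gets built. Keeping each stage in its own function means each one can later be swapped for the real thing without touching the others.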
Let’s get started
For this starting series, we’re going to be looking at Twitter, and that’s for two main reasons. First, in our experience, Twitter’s open community model, where anyone can follow anyone, makes it the preferred means of communication for the majority of the threat actors that we’re interested in. Second, Twitter has a baller API.
For the uninitiated, an API, or Application Programming Interface, allows you to programmatically access resources from another service; in this case, we can collect information directly from Twitter. Twitter provides two major ways to interact with its service: the REST API and the Streaming API. The REST API allows you to interact with the basic objects found in Twitter, so you can post a tweet, pull information from an account, create lists and so on. Think of the REST API as your way of programmatically doing the things on Twitter that you can do through the web front end. All the information you’ll want to know about Twitter and their API can be found here: https://dev.twitter.com/overview/api
The Streaming API is ultimately the one we’re most interested in. This API allows us to set up a “stream” with Twitter, where we give it a list of terms or user IDs and Twitter then continuously sends us any tweets that contain any of those terms or are from any of those users. The Streaming API gives us near real-time (I mean within seconds) access to the tweets on Twitter as they’re being generated and processed.
Once you’ve got a stream set up, your system will be automagically getting the tweets directly from Twitter as they are processed. Neat stuff! However, Twitter will only send you tweets that contain a term from the list of words you sent it, so this is considered the first level of triaging and also one of the more difficult pieces to get right. If you use terms that are too generic, you’ll capture too much worthless information; if you’re too precise, you’ll probably miss a lot of really valuable information. The current version of the Twitter streamer allows up to 400 terms, which is a fair amount to leverage, so feel free to experiment.
For example, if we’re very interested in denial-of-service attacks, we can identify which terms are often associated with claimed DDoS attacks. Words like “Down”, “TangoDown”, “DDoS” and “Loic” might be good terms to use. As you continue to develop your capabilities, you’ll be able to further finesse these terms to fit your needs. The catch is that the stream will bring in EVERYTHING that has those terms, so we need some more triaging to make sure that we’re only getting what we want.
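As a taste of what that extra triaging could look like, here is a small sketch that only keeps a tweet when it mentions an attack term and also one of your own names. The organization names are placeholders, the function names are invented for this example, and the whole-word matching is just one reasonable choice, not the rule Twitter itself applies.

```python
import re

ATTACK_TERMS = ["down", "tangodown", "ddos", "loic"]  # terms from the example above
ORG_TERMS = ["example.com", "examplecorp"]            # hypothetical names for your organization


def contains_term(text, term):
    """Case-insensitive whole-word match, so 'down' does not match 'download'."""
    pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def is_relevant(text, attack_terms=ATTACK_TERMS, org_terms=ORG_TERMS):
    """Keep a tweet only if it names an attack term AND one of our own names."""
    has_attack = any(contains_term(text, term) for term in attack_terms)
    has_org = any(contains_term(text, term) for term in org_terms)
    return has_attack and has_org


for tweet in [
    "example.com is #TangoDown",
    "Download the new app from example.com",
    "Huge DDoS hitting some bank right now",
]:
    print(is_relevant(tweet), "-", tweet)
```

Only the first tweet passes: the second mentions the organization but no attack term, and the third mentions an attack but not the organization.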
Setting up the Stream
This is where we’re going to go into a little bit more of the technical detail associated with getting a Twitter stream up and running. I ended up choosing Python as my main scripting language for two main reasons: first, Python has an awesome Twitter library called Tweepy, and second, I’m one of those entitled millennials that loves pseudocode and interpreted languages. This is another opportunity for me to reiterate: I’m not a developer, just a person who likes to solve problems.
So when we want to set up a stream with Twitter, Twitter needs some information from us: five tidbits of information, to be exact.
Application Key and Application Secret:
When you’re setting up fun stuff with Twitter’s API, you need to sign up and register an application with Twitter. That can be done by following the steps found on Twitter’s Dev website: https://dev.twitter.com/oauth/overview
Now that you have registered your application with Twitter, you have access to both the application key and secret. These are the unique identifiers for your “application”, so make sure that you keep them private. You wouldn’t want someone else impersonating your application.
User Key and User Secret:
These two pieces of information are associated with a specific user account. What this means is that you can use one application key/secret and associate it with multiple user key/secret pairs. In our case, we’ve been able to do pretty much everything we need with one set of each. You can create a new Twitter account dedicated to this purpose, or you can use a pre-existing account (not recommended). To get them, go to your Twitter account, open the account settings and pull out the Twitter API keys.
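Since all four of these values need to stay private, one option is to keep them out of the script entirely and read them from environment variables. The sketch below assumes that approach; the environment variable names are invented for this example, while the dictionary keys mirror the variable names used in the script later in this post.

```python
import os

# Hypothetical environment variable names; pick whatever suits your setup.
ENV_NAMES = {
    "ckey": "TWITTER_APP_KEY",
    "csecret": "TWITTER_APP_SECRET",
    "atoken": "TWITTER_USER_KEY",
    "asecret": "TWITTER_USER_SECRET",
}


def load_credentials(environ=None):
    """Read the four Twitter credentials, failing loudly if any are missing."""
    environ = os.environ if environ is None else environ
    missing = [name for name in ENV_NAMES.values() if not environ.get(name)]
    if missing:
        raise RuntimeError(
            "missing environment variables: " + ", ".join(sorted(missing))
        )
    return {key: environ[name] for key, name in ENV_NAMES.items()}
```

Calling `load_credentials()` with no arguments reads from the real environment; passing a plain dictionary instead makes it easy to try out without touching your shell settings.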
Track Terms or UserIDs
The last tidbit of information you’ll need to open up a Twitter stream is the list of terms or list of user IDs that you want Twitter to filter on. For this example we’ll just focus on terms, but it’s pretty much the exact same process if you want to directly follow users. We call these the “track”, and on average we have about 200 of them in use, maintained by the intel team. In this case we’ll use a relatively simple list of terms to get started. Structurally, we can store this information either in the code or, more ideally, in some type of file so that you can make easy edits to it. I’ve just used a text file.
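If you go the text file route, loading the terms might look something like the sketch below. It assumes one term per line, which is my own choice of layout, and the function names are invented; the default limit of 400 is the term cap mentioned earlier in this post.

```python
def parse_track_terms(lines, limit=400):
    """Return unique, non-empty terms in file order, assuming one term per line."""
    terms = []
    seen = set()
    for line in lines:
        term = line.strip()
        key = term.lower()
        # Skip blank lines and case-insensitive duplicates.
        if term and key not in seen:
            seen.add(key)
            terms.append(term)
    if len(terms) > limit:
        raise ValueError("%d terms exceeds the limit of %d" % (len(terms), limit))
    return terms


def load_track_terms(path, limit=400):
    """Read the track terms from a text file at the given path."""
    with open(path, encoding="utf-8") as handle:
        return parse_track_terms(handle, limit)


print(parse_track_terms(["DDoS\n", "\n", "ddos\n", "TangoDown\n"]))
```

Splitting the parsing from the file handling keeps the parsing easy to check with a plain list of strings, as in the last line.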
Starting the Code
Structurally, we’ve divided our code into three different sections: one dedicated to setting up the stream, a second dedicated to processing the tweets and a last one that sends out the notifications.
When building code, especially complex code that has multiple moving pieces, I’ve found that it’s best to tackle it in a piecemeal fashion, addressing one major piece of functionality at a time and ensuring that you’re testing each part as you go along. I’m sure there’s some super fancy developer term that describes this approach, like iterative functional build process... but once again, I’m not a developer.
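One way to test a single piece at a time, without needing a live connection to Twitter, is to keep the per-tweet logic in a plain function and feed it a stand-in object. The sketch below does that for the message our listener prints later in this post; `describe_status` is a name I made up, and the fake status only mimics the one attribute we actually use.

```python
from types import SimpleNamespace


def describe_status(status):
    """Return the line we would print for a tweet, kept separate so it can be tested offline."""
    return "%s tweeted something" % status.user.name


# A stand-in for the status object Tweepy hands to the listener.
fake_status = SimpleNamespace(user=SimpleNamespace(name="Example User"))
print(describe_status(fake_status))
```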
Setting up the Foundation
The first step is going to be importing your dependencies. At the very beginning your dependencies are going to be pretty limited, since we’re only setting up the functionality for you to pull down tweets from Twitter directly.
In our file, twitter_monitoring.py, start by setting up a function named ‘main’. This is the function that will get called when you run the script.
|# written against Tweepy 3.x, where StreamListener lives in tweepy.streaming|
|import tweepy|
|from tweepy.streaming import StreamListener|
|def main():|
|    print(" Twitter monitoring starting")|
|    # these are the values associated with both the application and the user|
|    # update these to the values that you gained from the previous section|
|    ckey = ""|
|    csecret = ""|
|    atoken = ""|
|    asecret = ""|
|    # create the auth object leveraging the application (consumer) key + secret|
|    auth = tweepy.OAuthHandler(ckey, csecret)|
|    # set the user token + user secret on the new auth object|
|    auth.set_access_token(atoken, asecret)|
|    # now that we have our authentication set, we can connect to the API|
|    api = tweepy.API(auth)|
|    # create the listener, which we've defined below, and hand it the API|
|    listener = Listener(api)|
|    # when our powers combine, twitter allows us to connect!|
|    streamer = tweepy.Stream(auth=auth, listener=listener)|
|    print(' Twitter successfully connected')|
|    # interim use of track, we'll make sure this value is pulled down from a file in the final version|
|    track = ['test','awesome','hello','world']|
|    # start the stream; this call blocks and hands matching tweets to the listener|
|    streamer.filter(track=track)|
|# we need to extend the listener class from Tweepy and build upon it|
|class Listener(StreamListener):|
|    def __init__(self, api=None):|
|        self.api = api or tweepy.API()|
|    # simple proof of concept that will trigger when we receive a status|
|    def on_status(self, status):|
|        # In our example we're going to print from the status object we received|
|        # from twitter the name associated with the user who posted the tweet|
|        print('%s tweeted something' % status.user.name)|
|if __name__ == '__main__':|
|    main()|
Once you have the pieces in place, you can simply call your function by running the script:
|python twitter_monitoring.py|
If you did everything correctly, you should be receiving messages with the names of the Twitter users. Congratulations, you’ve got the first component of your Twitter streamer built! You’re now directly connected to Twitter and receiving tweets based on the filter terms you have provided.
However, as you’re examining this, you’ll realize that there’s a lot more data than can be processed, and a lot of it is probably not going to be relevant to you. So the next step is to start developing the filtering process that will extract the tweets that are most relevant to your organization.