1 Introduction


Social media sites like Twitter and Facebook, with millions of active users
producing content from all over the world, have emerged as a host of sensors
for real-world events. This direct form of information exchange could be used
in crisis situations, such as disasters brought on by natural hazards. In
emergency situations, people react immediately by posting about the situation
on their social profiles. Thus, tracking the public Twitter stream in
combination with Natural Language Processing (NLP) and clustering techniques
could make it possible to detect an emergency. But how feasible is it to track
all of these tweets (500 million tweets per day) from the Twitter public
stream?


Emergency situations, and especially no-warning emergency situations
(sudden-onset crises), generate a situation full of questions, plenty of
uncertainties, and the need for quick decisions. The scarcity of information in
the first seconds after a disaster confirms the increasingly important role of
social media communications in disaster situations, and shows that information
broadcast via social media can enhance situational awareness during an
emergency.


The continuous updating of social media by nearby users during disasters has
created tremendous opportunities for information propagation. Emergency
response agencies seem quite inadequate as information sources in comparison
with the tweets of people who are at the scene of the emergency. People post
situation-sensitive information on social media related to what they experience
or hear from other sources [1]. As a result, people who are outside the impact
zone can learn about the situation first-hand and in near real time.


In these fraught situations, there are people who have to react and make
decisions as soon as possible. Information posted to social media platforms can
have a very positive effect on the whole situation: according to previous
research, information that contributes to situational awareness is reported via
Twitter (and other social media platforms) during mass emergencies [2, 3]. A
considerable number of emergency responders recognize the value of the
information distributed on social media platforms by members of the public, and
are interested in finding ways to quickly and easily locate and organize the
information that is of most use to them [4]. Use of this information now ranges
from local fire departments to international aid agencies. In 2014 the American
Red Cross opened their Social Media Digital Operations Center for Humanitarian
Relief. The goal of this operation is to source additional information from
affected areas during emergencies to better serve people who need help, as well
as to spot trends, better foresee the public’s needs, and connect people with
the resources they need, like food, water, shelter or even emotional support.
Examples like these show that the detection of emergencies from social media
works in practice and that it is worth investigating further methods which
could achieve better results.


In this survey, the various detection methods are classified based on the
common traits they share. Categorising the methods gives the reader a
perspective on each of the general research directions and clarifies which
details distinguish the event detection techniques from one another.



2 Data collection and preparation

Twitter and Facebook, as social networks with millions of active users, provide
access to their content through an Application Programming Interface (API).
There are two types of API. One type allows queries against an archive of past
messages (search API) and the other allows data collectors to track a real-time
data feed (streaming API). With both types, data collectors are able to filter
the data by a time period, by a geographical region (for messages that have GPS
coordinates), or by a set of keywords that must be present in the messages.
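
As a rough illustration of these three filter types, the sketch below applies a
time period, a geographical bounding box and a keyword set to messages that
have already been collected. It does not call any real Twitter or Facebook API:
the `Message` record, the `matches` function and the bounding-box layout are
assumptions made only for this example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional, Tuple


@dataclass
class Message:
    """Hypothetical message record; real API payloads carry far more fields."""
    text: str
    created_at: datetime
    coords: Optional[Tuple[float, float]] = None  # (latitude, longitude) if geotagged


def matches(msg: Message,
            keywords: Optional[Iterable[str]] = None,
            start: Optional[datetime] = None,
            end: Optional[datetime] = None,
            bbox: Optional[Tuple[float, float, float, float]] = None) -> bool:
    """Return True if msg passes every filter that was supplied.

    bbox is (min_lat, min_lon, max_lat, max_lon); messages without GPS
    coordinates can never satisfy a geographical filter.
    """
    if start is not None and msg.created_at < start:
        return False
    if end is not None and msg.created_at > end:
        return False
    if bbox is not None:
        if msg.coords is None:
            return False
        lat, lon = msg.coords
        min_lat, min_lon, max_lat, max_lon = bbox
        if not (min_lat <= lat <= max_lat and min_lon <= lon <= max_lon):
            return False
    if keywords is not None:
        text = msg.text.lower()
        # Every keyword must be present in the message text.
        if not all(k.lower() in text for k in keywords):
            return False
    return True
```

The same predicate could be applied to an archive (search-style collection) or
to each message as it arrives (streaming-style collection); only the source of
the messages differs.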


2.1 Data availability

It is reasonable to expect that network connectivity may be disrupted during
disasters [5]; hence, access to social media is in general limited, which is a
serious obstacle to research and development in this space [6]. Moreover, there
are some limitations to data collection using APIs. The first is that many
posts are not publicly available to web crawlers; the second is that in
real-time data streaming a percentage of posts is not available through the
APIs. Given these limitations, tweets tend to be the more appropriate data
source for emergency detection.


2.2 Data pre-processing

The pre-processing of social media data is an indispensable step that most
researchers follow before performing the actual analysis. There are many
pre-processing techniques available, and the choice of technique depends on
the type of data at hand and the goals of the analysis.


One of the best-known ways to pre-process text messages is Natural Language
Processing (NLP), with which researchers try to extract meaningful information
from text. A number of implementations are available online, such as Stanford
NLP and NLTK for Python, as well as social-media-specific NLP toolkits like
ArkNLP [7]. ArkNLP is trained on Twitter data and is able to recognize Internet
idioms (for example “ikr”, which means “I know, right?”).


Since all of these data are going to be used by automatic
information-processing algorithms, each data item must be represented as an
information record. The representation of choice for text is usually a
numerical vector in which each position corresponds to a word or phrase. In
information retrieval this is known as the vector space model. The value in
each position can be a binary variable, indicating the presence or absence of
that word or phrase in the message, or a number following some weighting
scheme, such as TF-IDF, which favours terms that are infrequent in the
collection [8]. To reduce the number of variables, textual features can be
discarded, for example by removing stop words or by filtering on part-of-speech
(POS) tags.
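
The vector space representation can be sketched in a few lines. The example
below is a minimal illustration rather than a reference implementation: the
stop-word list is a tiny hand-picked assumption, tokenisation is plain
whitespace splitting, and the weighting is the basic form tf × log(N / df).

```python
import math
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"the", "a", "is", "in", "on", "of", "and", "to"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def tfidf_vectors(messages):
    """Return (vocabulary, vectors), one TF-IDF vector per message."""
    docs = [tokenize(m) for m in messages]
    vocab = sorted({w for d in docs for w in d})
    n_docs = len(docs)
    # Document frequency: in how many messages does each term occur?
    df = Counter(w for d in docs for w in set(d))
    idf = {w: math.log(n_docs / df[w]) for w in vocab}
    vectors = []
    for d in docs:
        tf = Counter(d)
        vectors.append([tf[w] * idf[w] for w in vocab])
    return vocab, vectors


vocab, vectors = tfidf_vectors([
    "fire in the building",
    "fire on the bridge",
    "earthquake in the city",
])
```

In this toy collection “fire” occurs in two of the three messages, so it
receives a lower weight than “building”, which occurs in only one.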


Another very important step in data pre-processing is filtering. A number of
off-topic messages use the same tags as the on-topic messages [9]. These
messages can be post-filtered by human labelling, keyword-based heuristics or
automatic classification. Moreover, a large number of spam messages are posted
automatically for financial gain and need to be removed [10].


Geotagging and Geocoding

Geotagging is the process of attaching geographical coordinates to a message,
and it is useful in emergency situations [11]. Geotagging allows the retrieval
of information about a local event by filtering the messages corresponding to a
particular geographical region. It also makes it possible to visualize
information about an event on a map, making it more actionable for emergency
responders. Predicting the epidemic transmission of diseases based on
geographical proximity is a higher-level task for which geotagging can also be
used.


Tweets, and social media posts in general, do not always contain
machine-readable location information. This depends on the user’s device (GPS
sensor) and on the user’s privacy options. In practice, only a very small
fraction of emergency-related messages include machine-readable location
information [12].


On the other hand, even when explicit location information is absent, many
tweets contain implicit references to names of places (e.g., “Big Ben is on
fire”). Geocoding is the method of finding these geographical references in the
text and linking them to geographical coordinates. This can be done by using a
named entity extractor to extract potential candidates, and then comparing
those candidates with a list of place names [13].
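
A minimal sketch of this two-step procedure follows. It is only an
illustration under strong assumptions: a regular expression that picks up runs
of capitalised words stands in for a real named entity extractor, and
`GAZETTEER` is a hypothetical two-entry list of place names with approximate
coordinates.

```python
import re

# Hypothetical mini-gazetteer: lowercase place name -> approximate (lat, lon).
GAZETTEER = {
    "big ben": (51.5007, -0.1246),
    "athens": (37.9838, 23.7275),  # Athens, Greece; other cities share the name
}


def candidate_phrases(text):
    """Stand-in for a named entity extractor: runs of capitalised words."""
    return re.findall(r"[A-Z][a-z]+(?:\s[A-Z][a-z]+)*", text)


def geocode(text):
    """Return (mention, coordinates) pairs for candidates found in the gazetteer."""
    hits = []
    for cand in candidate_phrases(text):
        coords = GAZETTEER.get(cand.lower())
        if coords is not None:
            hits.append((cand, coords))
    return hits


print(geocode("Big Ben is on fire"))
```

Candidates that are not place names (such as a capitalised first word of a
sentence) are simply dropped at the lookup step. The sketch makes no attempt to
resolve a name that maps to several places.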


Extracting location information from tweets is clearly an important component
of emergency detection; however, some ambiguities make geotagging a more
complex task. There are two categories of ambiguity, “geo/non-geo” (e.g.,
“Let’s play Texas Hold ’em”) and “geo/geo” (e.g., “There is a fire in Athens”).
In the first, the ambiguity is between a card game and a US state; in the
second, the ambiguity is between the capital of Greece and other cities in the
world with the same name. Sultanik and Fink [14] propose an unsupervised
approach to extract and disambiguate location mentions in Twitter messages
during crises.




Scalability and content issues


Large crises or emergency situations often generate a surge of social media
activity. This volume of data may be an issue, since millions of messages may
be recorded. While the text of each tweet is short, a data record for a tweet
is around 4KB when we consider the metadata attached to each message. Thus, a
Twitter collection for an emergency situation could consist of many gigabytes.
Moreover, the storage space requirement is increased significantly by
multimedia objects such as images and videos. Some methods are applied to
reduce this amount of data, for example removing repeated (reposted/retweeted)
tweets.
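
The arithmetic behind these figures, and one simple way of dropping repeated
tweets, can be sketched as follows. The 4KB record size is the figure quoted
above; the one-million-tweet collection and the “RT @user:” prefix rule are
assumptions chosen only for illustration.

```python
import re

RECORD_SIZE_BYTES = 4 * 1024  # roughly 4KB per tweet record, including metadata


def collection_size_gb(num_tweets, record_size=RECORD_SIZE_BYTES):
    """Estimated storage for num_tweets records, in gigabytes (2**30 bytes)."""
    return num_tweets * record_size / 1024 ** 3


def normalise(text):
    """Strip a leading 'RT @user:' marker, lowercase, and collapse whitespace."""
    text = re.sub(r"^RT @\w+:\s*", "", text)
    return " ".join(text.lower().split())


def drop_repeats(tweets):
    """Keep only the first occurrence of each (normalised) tweet text."""
    seen = set()
    kept = []
    for t in tweets:
        key = normalise(t)
        if key not in seen:
            seen.add(key)
            kept.append(t)
    return kept


# One million tweets at about 4KB each is close to 4 GB before any multimedia.
print(round(collection_size_gb(1_000_000), 2))
```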


Tweets, and microblog messages in general, are usually brief, informal and akin
to speech. This implies a language that is fragmentary, with a number of
typographical errors, without the use of punctuation, and sometimes downright
incoherent [15]. This poses significant challenges to computational methods,
and can lead to poor and misleading results. The quality of tweets is itself a
complex question for crisis managers, encompassing a number of attributes
including objectivity, clarity, timeliness, and conciseness, among others [16].
Additionally, different languages can be present in the same crisis, making it
difficult for both machines and humans (e.g., content annotators) to understand
or classify messages.




3 Core

Most systems concerned with emergency detection from social media start with
event detection. Events are occurrences of significance that happen at a
specific time and location [17]. However, due to the online nature of social
media, events as they play out in social media may not necessarily be
associated with a physical location. In the context of social media, the
authors of [18] define an event as: “An occurrence causing changes in the
volume of text data that discusses the associated topic at a specific time.
This occurrence is characterized by topic and time, and often associated with
entities such as people and location.”


Emergency situations can be divided into two categories, the predicted and the
unexpected. Some disaster events can be predicted to a certain level of
accuracy based on meteorological data (e.g., storms and tornadoes), and
information about them is usually broadcast publicly before the actual event
happens. Other events, such as a mass protest, may not be explicitly
anticipated but may still be forecast from social media [19]. Moreover, some
events are unpredictable (e.g., earthquakes). In this case, an automatic
detection method is useful to find out about them as quickly as possible once
they happen.


3.1 Background on Event Detection


There are some differences between event detection in social media and the
traditional event detection approaches that are suitable for other document
streams. Data streams from social media refresh very quickly and arrive in
larger volumes than traditional document streams. Moreover, tweets are usually
short and noisy, which often requires a different approach from the one used
with traditional news articles. However, techniques and evaluation metrics from
the Topic Detection and Tracking (TDT) community provide insight into methods
that might work for the Twitter domain.


3.2 New Event Detection

In the field of mass emergency detection, the task of discovering the first
message for a particular event by continuously monitoring a stream of messages
is called New Event Detection (NED). For each message, NED decides how novel it
is [20]. “New” is normally operationalized as sufficiently different from
earlier messages according to a similarity metric. The metrics commonly used in
NED are Hellinger similarity, Kullback-Leibler divergence, and cosine
similarity [21].
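
The decision rule can be sketched with cosine similarity over simple word-count
vectors. This is a minimal illustration, not any particular published system:
the threshold of 0.5, the whitespace tokenisation and the exhaustive comparison
against all earlier messages are assumptions made for the example.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def detect_new_events(stream, threshold=0.5):
    """Return the messages whose best similarity to any earlier message
    falls below the threshold, i.e. the candidate first stories."""
    seen = []
    first_stories = []
    for text in stream:
        vec = Counter(text.lower().split())
        best = max((cosine(vec, old) for old in seen), default=0.0)
        if best < threshold:
            first_stories.append(text)
        seen.append(vec)
    return first_stories


stream = [
    "earthquake hits city centre",
    "huge earthquake hits city",
    "concert tickets on sale",
]
print(detect_new_events(stream))
```

Here the second message shares three of its four words with the first (cosine
similarity 0.75), so only the first and third messages are reported as new.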



3.2.1 Retrospective New Event Detection

Retrospective new event detection (RED) has been studied for many years in
order to discover previously unidentified events. Work is ongoing to improve
RED techniques, such as distance measures and clustering approaches, in order
to overcome issues such as the high dimensionality of the data.



Among these methods, TwiCal [22] extracts significant events from Twitter by
focusing on certain types of words and phrases. More specifically, the method
is based on the extraction of event phrases, named entities, and calendar
dates. To extract named entities, the authors use a named entity tagger trained
on 800 randomly selected tweets, and to extract event mentions they use a
Twitter-tuned POS tagger [23]. The extracted events are classified
retrospectively into event types using a latent variable model that first
identifies event types from the given data and then performs classification.


Another system in this category is Twevent [24], which uses message segments
instead of individual words to detect events. According to the authors, tweet
segments, which consist of one or more consecutive words in tweets, contain
more meaningful information than individual words. In the first phase of
Twevent, each tweet is segmented, and bursty segments are identified using the
segments’ frequency in a particular time window. Next, the identified segments
are retrospectively clustered using K-Nearest Neighbors (KNN) clustering [24].
In the final step, post-filtering uses Wikipedia concepts to filter the
detected events.


3.2.2 Online New-Event Detection

In online new-event detection, events are identified without using previously
seen messages or past knowledge about the events. Online NED is performed in
real time. That means that the time between the emergence of a document
corresponding to a new event and the detection of that event is relatively
short.


3.3 Keyword burst approach

In the context of online new-event detection, one simple approach is to assume
that words used with unusually high frequency in a period of time are related
to a new event. Based on this approach, Robinson et al. (2013) present an
earthquake detector using Twitter [25]. More specifically, the earthquake
detector is based on the Emergency Situation Awareness (ESA) platform [26]. The
ESA platform tracks the keywords “earthquake” and “#eqnz” in the real-time
Twitter stream, and analyses word frequencies by applying a burst detection
method in fixed-width time windows and comparing them to historical word
frequencies. Spikes in these frequencies indicate unusual events. A comparison
between simple keyword-based approaches and data from seismological sensors was
conducted [27], and in some cases earthquakes were detected by Twitter users
faster than by seismographic instruments.
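
The general idea of comparing a fixed-width window against historical
frequencies can be sketched as below. This is not the ESA implementation: the
z-score test, the five-window history, the threshold of 3 and the floor on the
standard deviation are all assumptions chosen for the example.

```python
from statistics import mean, pstdev


def bin_counts(timestamps, width):
    """Count keyword occurrences per fixed-width window.

    timestamps are seconds since the start of monitoring; width is the
    window length in seconds.
    """
    if not timestamps:
        return []
    counts = [0] * (int(max(timestamps) // width) + 1)
    for t in timestamps:
        counts[int(t // width)] += 1
    return counts


def bursty_windows(counts, history=5, z_threshold=3.0):
    """Return indices of windows whose count is far above recent history."""
    bursts = []
    for i in range(history, len(counts)):
        past = counts[i - history:i]
        mu = mean(past)
        # Floor the deviation so a flat history does not make every
        # small fluctuation look like a burst (an assumption of this sketch).
        sigma = max(pstdev(past), 1.0)
        if (counts[i] - mu) / sigma >= z_threshold:
            bursts.append(i)
    return bursts


# Mentions of a tracked keyword per window; window 5 shows a clear spike.
print(bursty_windows([2, 3, 2, 4, 3, 40, 5]))
```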


In the same family of methods, TwitInfo [28] is a system which detects,
summarizes, and visualizes events on Twitter. More specifically, TwitInfo
collects tweets based on a user-defined query (e.g., keywords used to filter
the Twitter stream) and then detects events by identifying sharp increases in
the frequency of tweets that contain the user-defined query, as compared to the
historical weighted running average of tweets that contain that same query.
Further, tweets from the identified events are used to compute and present an
aggregated sentiment (i.e., classifying tweets into positive and negative
classes) [28]. The system was evaluated on events like earthquakes and football
games.
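
A weighted running average of this kind can be sketched with exponential
smoothing. The code below is a simplified illustration in the spirit of the
description above, not TwitInfo’s published algorithm: the smoothing factor
`alpha`, the threshold `tau` and the warm-up length are assumed values.

```python
def detect_spikes(counts, alpha=0.125, tau=2.0, warmup=5):
    """Flag bins whose count exceeds the running mean by more than
    tau times the running mean deviation.

    counts holds the number of matching tweets per time bin.
    """
    if len(counts) <= warmup:
        return []
    # Initialise the running statistics from the first few bins.
    mean = sum(counts[:warmup]) / warmup
    meandev = sum(abs(c - mean) for c in counts[:warmup]) / warmup
    spikes = []
    for i in range(warmup, len(counts)):
        c = counts[i]
        dev = max(meandev, 1e-9)  # avoid division by zero on a flat history
        if (c - mean) / dev > tau:
            spikes.append(i)
        # Update after testing, so a spike is judged against the history
        # that preceded it. Older bins decay by (1 - alpha) at each step.
        meandev = alpha * abs(c - mean) + (1 - alpha) * meandev
        mean = alpha * c + (1 - alpha) * mean
    return spikes


print(detect_spikes([10, 12, 11, 9, 10, 11, 60, 12]))
```

Because recent bins carry more weight than old ones, the baseline adapts
slowly to a gradual change in volume while a sudden jump still stands out.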


Similarly, TwitterMonitor [29] is a system that detects trends (e.g., emerging
topics such as breaking news) in real time by collecting tweets from the
Twitter stream. TwitterMonitor works in two phases. First, it identifies bursty
keywords and groups them based on their co-occurrences; then, once a trend is
identified, additional information is extracted from the tweets in order to
describe the trend.


Another Twitter event detection approach [30] uses Locality Sensitive Hashing
(LSH) to hash a fixed number of recent documents in bounded space and process
them in bounded time, in order to increase the performance of nearest-neighbour
search.
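
To illustrate how LSH keeps both space and lookup time bounded, the sketch
below uses random-hyperplane hashing, a standard LSH family for cosine
similarity, together with a fixed-size buffer of recent documents. It is a
generic illustration and not the cited system: the class name, the single hash
table, the number of bits and the buffer size are assumptions.

```python
import random
from collections import defaultdict, deque


class HyperplaneLSH:
    """Single-table random-hyperplane LSH over a bounded set of recent documents."""

    def __init__(self, dim, n_bits=8, max_docs=1000, seed=0):
        rng = random.Random(seed)
        # Each random hyperplane contributes one bit of the signature.
        self.planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]
        self.buckets = defaultdict(list)
        self.recent = deque()
        self.max_docs = max_docs

    def signature(self, vec):
        """One bit per hyperplane: which side of the plane the vector lies on."""
        return tuple(
            1 if sum(p * v for p, v in zip(plane, vec)) >= 0 else 0
            for plane in self.planes
        )

    def add(self, doc_id, vec):
        sig = self.signature(vec)
        self.buckets[sig].append(doc_id)
        self.recent.append((sig, doc_id))
        # Bounded space: evict the oldest document once the buffer is full.
        if len(self.recent) > self.max_docs:
            old_sig, old_id = self.recent.popleft()
            self.buckets[old_sig].remove(old_id)

    def candidates(self, vec):
        """Documents in the same bucket; only these need an exact comparison."""
        return list(self.buckets.get(self.signature(vec), []))


lsh = HyperplaneLSH(dim=3, max_docs=2)
lsh.add("a", [1.0, 0.0, 0.0])
print(lsh.candidates([2.0, 0.0, 0.0]))  # same direction, so same bucket as "a"
```

Vectors with a small angle between them are likely to share a signature, so a
new message is compared only against the few documents in its bucket instead of
against the whole stream.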


3.4 Beyond keyword bursts.

Methods that rely on increases in the frequency of a keyword to detect events
are quite simple, and as a consequence they have some problems. A common
problem is that popular hashtags create digital pseudo-events. A pseudo-event
is an event that is created specifically for media coverage and does not take
place in a particular physical location. The study in [31] presents an approach
to distinguish real-world events from non-events using Twitter. The researchers
use four types of features (temporal, social, topical, and Twitter-specific) to
identify real events using the Twitter stream in real time. First, based on
temporal features, like the volume of messages posted during an hour, they form
initial clusters using the most frequent terms in the messages. Clusters are
then refined using social features (i.e., users’ interactions like retweets,
replies, mentions). Next, they apply heuristics: for example, a high percentage
of retweets and replies often indicates a non-event, whereas a high percentage
of mentions indicates that there is an event. Further, cluster coherence is
estimated using a cosine similarity metric between the messages and the cluster
centroid. Finally, since the authors report that multiword hashtags are highly
indicative of some sort of Twitter-specific discussion and do not represent any
real event, they check the frequency of such hashtags in each cluster to
further refine the results.
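
The social-feature heuristic described above can be sketched as a simple rule
over a cluster of tweets. The `Tweet` record, the feature names and both
thresholds are hypothetical values introduced for this example; the cited study
does not necessarily use these numbers or this exact rule.

```python
from dataclasses import dataclass


@dataclass
class Tweet:
    """Hypothetical minimal tweet record for illustration."""
    text: str
    is_retweet: bool = False
    is_reply: bool = False
    has_mention: bool = False


def social_features(cluster):
    """Share of retweets/replies and share of mentions within a cluster."""
    n = len(cluster)
    if n == 0:
        return {"retweet_reply_ratio": 0.0, "mention_ratio": 0.0}
    rt_reply = sum(1 for t in cluster if t.is_retweet or t.is_reply)
    mentions = sum(1 for t in cluster if t.has_mention)
    return {"retweet_reply_ratio": rt_reply / n, "mention_ratio": mentions / n}


def likely_real_event(cluster, max_rt_reply=0.7, min_mention=0.2):
    """Heuristic: many retweets/replies suggest a non-event,
    while many mentions suggest a real event. Thresholds are assumptions."""
    feats = social_features(cluster)
    return (feats["retweet_reply_ratio"] <= max_rt_reply
            and feats["mention_ratio"] >= min_mention)


cluster = [
    Tweet("Flooding on Main Street, @cityhall please respond", has_mention=True),
    Tweet("Water rising near the station"),
    Tweet("RT Flooding on Main Street", is_retweet=True),
]
print(likely_real_event(cluster))
```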


Another algorithm for event detection from tweets [32] goes a step beyond the
keyword burst approach by clustering wavelet-based signals. In this approach,
the first step applies wavelet transformation and auto-correlation to find
bursts in individual words, keeping only the words with high signal
auto-correlation as event features. Then, the similarity for each pair of event
features is measured using cross-correlation, and finally a modularity-based
graph partitioning algorithm is used to detect real-life events. The strength
of this approach, which sets it apart from traditional event detection
approaches, is its ability to differentiate big real-life events from trivial
ones. Two factors that contribute to this are the number of words and the
cross-correlation among the words related to an event.


A different approach from those above is presented by Corley et al. (2013). In
this method, the detection and investigation of events is conducted through
metadata analytics and topic clustering on Twitter. Various features, such as
retweets, usage of different terms, and hashtags, are analysed over a certain
time period to determine a baseline and noise ratio. For an event to be
detected, a particular feature value has to exceed its noise boundaries and an
expected threshold. Once an event has been detected, its related topics are
identified using the topic clustering approach.


3.5 Domain-specific approaches.

In natural-language processing (NLP) applications, a very important factor
that influences emergency detection systems is the domain. Methods that focus
on a specific domain (e.g., earthquakes) generally perform better than
approaches that are open-domain or generic. A well-studied method [33]
describes an approach for detecting breaking news from Twitter. First, tweets
containing the hashtag “#breakingnews” or the phrase “breaking news” are
fetched from the Twitter streaming API. Then the extracted tweets are grouped
based on content similarity using a variant of the TF-IDF technique.
Specifically, the similarity variant assigns a high similarity score to
hashtags and proper nouns, which the authors identify using the Stanford Named
Entity Recognition (NER) implementation.


Data from traditional media sources can also be used for detecting newsworthy
events. Not surprisingly, traditional media and social media have different
editorial styles, perceived levels of credibility, and response delays to
events. Tanev et al. (2012) find news articles describing security-related
events (such as gun fights), and use keywords in their title and first
paragraph to create a query. This query is issued against Twitter to obtain
tweets related to the event. Dou et al. (2012) describe LeadLine, an
interactive visual analysis system for event identification and exploration.
LeadLine automatically identifies meaningful events in social media and news
data using burst detection. Further, named entities and geo-locations are
extracted from the filtered data and visualized on a map through its
interface.


Another domain-specific event-detection method is based on pre-specified rules
and was introduced by Li et al. (2012a). Their system, TEDAS, detects,
analyzes, and identifies relevant crime- and disaster-related events on
Twitter. First, tweets are collected based on iteratively refined rules (e.g.,
keywords, hashtags) from Twitter’s streaming API. Next, tweets are classified
via supervised learning based on content as well as Twitter-specific features
(i.e., URLs, hashtags, mentions). Additionally, location information is extracted
using both explicit geographical coordinates and implicit geographical references
in the content. Finally, tweets are ranked according to their estimated level of


Sakaki et al. (2010) detect hazards and crises such as earthquakes, typhoons,
and large traffic jams using temporal and spatial information. The authors
consider three types of features associated with tweets: statistical features
(i.e., number of words in a tweet, position of the query word within a tweet),
keyword-based features (i.e., the actual words in a tweet), and contextual
features (e.g., words appearing near the query term; for instance, if
“earthquake” is a query, terms such as “magnitude” and “rocks” would be
features in the tweet “5.3 magnitude earthquake rocks parts of Papua New
Guinea”). To determine whether a tweet corresponds to one of these hazards or
crises, they use Support Vector Machines (SVM), a well-known supervised
classification algorithm (more about supervised classification in Section
6.1.2). LITMUS (Musaev et al. 2014) detects landslides using data collected
from multiple sources. The system, which depends on the USGS seismic activity
feed, the TRMM (NASA) rainfall feed, and social sensors (e.g., Twitter,
YouTube, Instagram), detects landslides in real time by integrating
multisourced data using a relevance-ranking strategy (Bayesian model). Social
media data is processed in a series of filtering steps (keyword-based
filtering, removing stop-words, geotagging, supervised classification) and
mapped based on geo-information obtained either from metadata or from content.
Events with high relevancy are identified as “real events.”