tag:blogger.com,1999:blog-61884644808832382842024-03-13T13:06:49.216+00:00On GATE, Text and Social Media Analysis, and Detecting Misinformation OnlinePosts about our text and social media analysis work and latest news on GATE (http://gate.ac.uk) - our open source text and social media analysis platform. Also posts about the PHEME project (http://pheme.eu) and our work on automatic detection of rumours in social media. Lately also general musings about fake news, misinformation, and online propaganda.Kalina Bontchevahttp://www.blogger.com/profile/15012686857148213914noreply@blogger.comBlogger63125tag:blogger.com,1999:blog-6188464480883238284.post-1174451596161465662022-08-03T10:43:00.002+01:002022-08-03T10:43:33.194+01:00Populate a Corpus from a List of URLsGATE provides support for loading numerous different document formats, as well as a number of ways of populating corpora. Until recently, however, we've not offered any way of populating a corpus from a simple list of URLs. Worse, even though it's now quite easy to do this in GATE, it's unlikely you would come across the option by accident.
<br/><br/>
The support for this is actually hidden away inside the "Format CSV" plugin (you'll need to use version 8.7 or above) and in GATE Developer is exposed through the "Populate from CSV file..." option in the context menu of a corpus.
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgOGJcEXrmoMZCJ6qYgbuq8B54IME9M7i2vFN_sbdof1_EMWbnEW1Ga9kaY_NpsQo4DuQ_134EMH7bjMj4QOBmVJ-ZU3ZgCIjCcOEvCj0W8nWvW7n0N2sCdDhm5IAImp3929YlschFz-4u4CKw-klc9KE0QfZw2Gk9vXIKFyei7GoNT7-dr7_Dwe9Q6/s1659/Screenshot%20from%202022-08-03%2009-54-44.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="600" data-original-height="683" data-original-width="1659" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgOGJcEXrmoMZCJ6qYgbuq8B54IME9M7i2vFN_sbdof1_EMWbnEW1Ga9kaY_NpsQo4DuQ_134EMH7bjMj4QOBmVJ-ZU3ZgCIjCcOEvCj0W8nWvW7n0N2sCdDhm5IAImp3929YlschFz-4u4CKw-klc9KE0QfZw2Gk9vXIKFyei7GoNT7-dr7_Dwe9Q6/s600/Screenshot%20from%202022-08-03%2009-54-44.png"/></a></div>
In this screenshot I've configured the populator ready to build a corpus from a simple text file with one URL per line. The important settings are:
<ul>
<li>Column Separator is set to "\t". This means we are using a tab character as the column separator. We do this because a URL can't contain a tab, whereas it can contain a comma, and we don't want our URLs split in half.</li>
<li>Document Content is in column 0. We always count columns (or almost anything) starting from 0, so this just ensures we use the URL as the document content.</li>
<li>Create one document per row is selected. The next option, which is the important one, isn't available unless this is selected, as it makes no sense to try to load multiple URLs into the same GATE document.</li>
<li>Cell contains document URL is selected. This is the new feature which makes this trick possible. Essentially it looks at the contents of a cell and if it can be interpreted as a URL then it creates a document from the contents of the URL, otherwise it uses the cell content as normal to build the document.</li>
</ul>
Once configured it's simply a case of selecting your text file, one URL per line, and hitting the OK button. Be aware that there is currently no rate limiting so be careful if you are listing a lot of URLs from a single domain etc. You may also want to combine this with <a href="https://gate4ugc.blogspot.com/2022/03/gate-and-cookie-jar.html">the cookie trick from the previous post</a> to ensure you get the correct content from each of the URLs.
<br/><br/>
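The per-row decision described above can be sketched in a few lines of Java. This is only an illustration of the idea, not the plugin's actual code: the class and method names (<code>UrlCellSketch</code>, <code>firstColumn</code>, <code>asUrl</code>) are hypothetical, and the sketch assumes that "can be interpreted as a URL" means "parses as an absolute URL with a scheme Java knows how to open".

```java
import java.net.URI;
import java.net.URL;
import java.util.Optional;

// Hypothetical illustration only; this is not the Format CSV plugin's code.
final class UrlCellSketch {

    // Return column 0 of a tab-separated row. Splitting on a tab is safe
    // because a valid URL cannot contain one, whereas it may contain commas.
    static String firstColumn(String row) {
        return row.split("\t", -1)[0];
    }

    // Return the URL held in a cell, or empty if the cell should be
    // treated as ordinary document content.
    static Optional<URL> asUrl(String cell) {
        try {
            URI uri = new URI(cell.trim());
            // Relative references such as "hello" are not loadable URLs.
            return uri.isAbsolute() ? Optional.of(uri.toURL()) : Optional.empty();
        } catch (Exception e) {
            // Syntax errors and unknown schemes both mean "not a URL".
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        String[] rows = {
            "https://gate.ac.uk/a,b\tsome feature",
            "just some plain text"
        };
        for (String row : rows) {
            String cell = firstColumn(row);
            System.out.println(asUrl(cell).isPresent()
                ? "load document from " + cell
                : "use cell text as content: " + cell);
        }
    }
}
```

Note that the first example row contains a comma inside the URL, which is exactly the case that a comma separator would break.
<br/><br/>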
Of course, while this post has been about how to populate a corpus from a simple list of URLs, you can use more complex CSV or TSV files which happen to contain URLs in one column. In that case the details from the other columns will be added as document features.Mark A. Greenwoodhttp://www.blogger.com/profile/05949555648983374325noreply@blogger.com0tag:blogger.com,1999:blog-6188464480883238284.post-55933889106516099682022-03-02T09:12:00.000+00:002022-03-02T09:12:24.079+00:00GATE and the Cookie Jar<img src="https://gate.ac.uk/g8/page/show/2/sale/images/blog/cookie-jar.jpg" width="300" style="float:right;"> One of the useful features of GATE is that documents can be loaded directly from the web as well as from local files. This is particularly useful for pages which update frequently and which you might want to process repeatedly. While using this feature recently we came across some pages that refused to load correctly. These pages loaded fine in a web browser but returned a 403 Forbidden response when accessed via GATE.
<br /><br />
After a bit of debugging it turned out that this issue was related to cookies. The specific URL we were trying to load went through a number of redirects before ending up at the final page. The problem was that the first redirect set a cookie, and that needed to be present for the further redirects to work. By default Java, and hence GATE, doesn't maintain cookies across requests, as each connection is handled independently.
<br /><br />
If you are using GATE in an embedded context, then it is trivial to add support for cookies using the default Java cookie handler. This is a JVM level setting so once configured in your own code, all requests made by GATE to load documents will also gain support for handling cookies. The entire solution is the following single line of code:
<br /><br />
<div style="text-align: center;"><code>java.net.CookieHandler.setDefault(new java.net.CookieManager());</code></div>
<br /><br/>
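As a slightly fuller sketch of how that line might sit in embedded code, the example below wraps it in a small helper. The class name <code>CookieSetup</code> and its <code>enable</code> method are hypothetical; only the <code>java.net</code> calls are real JDK API. Keeping a reference to the manager is optional, but it lets you inspect which cookies were set while debugging.

```java
import java.net.CookieHandler;
import java.net.CookieManager;
import java.net.CookiePolicy;

// Hypothetical helper; only the java.net calls are real JDK API.
final class CookieSetup {

    // Install a JVM-wide cookie handler once, before any documents are loaded.
    static CookieManager enable() {
        // ACCEPT_ORIGINAL_SERVER is also the policy the no-argument constructor uses.
        CookieManager manager = new CookieManager(null, CookiePolicy.ACCEPT_ORIGINAL_SERVER);
        CookieHandler.setDefault(manager);
        return manager;
    }

    public static void main(String[] args) {
        CookieManager manager = enable();
        // ... load documents from URLs here ...
        // Afterwards, the store lists the cookies the servers have set.
        System.out.println(manager.getCookieStore().getCookies());
    }
}
```

Because the handler is installed for the whole JVM, it only needs to be called once, before the first document is loaded.
<br /><br/>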
The problem we faced, though, was that we wanted to be able to load documents that required cookies from within GATE Developer, and that required a little more thought. Whilst we could have just added the code to GATE, there are a number of reasons not to (details of which are outside the scope of this blog post) and I wanted to make it easier for all existing GATE users to be able to use cookies without needing to upgrade. The answer is the rather versatile Groovy plugin.
<br><br>
If you load the Groovy plugin into GATE Developer you can then access the Groovy Console from within the tools menu. Simply pasting that single line of code into the console and executing it is enough to add the cookie support within that instance of GATE. It's slightly annoying that it won't persist across multiple instances of GATE, but as it's such a simple trick hopefully it's easy enough to apply when needed.Mark A. Greenwoodhttp://www.blogger.com/profile/05949555648983374325noreply@blogger.com0tag:blogger.com,1999:blog-6188464480883238284.post-24481398221544251442022-02-28T15:12:00.002+00:002022-02-28T15:20:12.343+00:00How green is your recipe? Using GATE to calculate the environmental impact of recipes<p style="text-align: justify;"><img height="496" src="https://lh4.googleusercontent.com/pVeGdNymPekzpVMUHEfBn7tIdmLWoE0Veph5i9JKFNZqdMQjlQJqUg81aUPAUHkZDhEckkpuobo9Wha_jWvxjY3Q-WW9RRrvw2s1Bxz8K8Bwnb0pUAHQPRxxs1_HZFkKb0ZsOVQG=w643-h496" style="text-align: left;" width="643" /> </p><p style="text-align: justify;"><span style="font-family: Arial; text-align: justify; white-space: pre-wrap;"><br /></span></p><p style="text-align: justify;"><span style="font-family: Arial; text-align: justify; white-space: pre-wrap;">The calculation of environmental impacts from recipes remains a barrier to effective uptake of sustainable diets. In a recent project funded by <a href="https://www.alprofoundation.org/">Alpro</a>, led by <a href="https://www.city.ac.uk/about/people/academics/christian-reynolds" target="_blank">Dr Christian Reynolds</a> from the Centre for Food Policy at City University London, we explored digitised recipe texts from websites in English, Dutch and German. 
We study recipes rather than individual ingredients because this is how people typically think about environmental impact and diet.</span></p><p style="text-align: justify;"><span style="font-family: Arial; text-align: justify; white-space: pre-wrap;">Recipes are hard to process because they use different weights and measures, and sometimes quite vague or obscure terms (e.g. "a pinch of salt", "a handful of lettuce"). Together with our project partner <a href="http://textminingsolutions.co.uk" target="_blank">Text Mining Solutions</a>, we used GATE to develop customised tools to automatically extract ingredients, quantities and units from 220,168 indexed recipes, and to match these to a food environmental database of 4500 ingredients (using the classification system FoodEx2). This database provided Land Use, GHG emissions, Eutrophying Emissions, Stress-Weighted Water Use, and Freshwater Withdrawals for each ingredient.</span></p><p style="text-align: justify;"><span style="font-family: Arial; text-align: justify; white-space: pre-wrap;">Nutrition information was sourced from the USDA FoodData Central (McKillop et al., 2021) and McCance and Widdowson's Composition of Foods Integrated Database (Public Health England, 2015). Environmental and Nutrition information was matched to two classification systems (FoodEx2, containing 4,500 ingredients, and USDA Nutrient Database, containing 2,484 ingredients). 
This allowed us to calculate these impacts at the mean, 5% and 95% confidence level per recipe and per portion, enabling us to explore the environmental impacts of vegan, vegetarian and non-vegetarian (omnivore) recipes if we were to cook these recipes using contemporary ingredients.</span></p><p style="text-align: justify;"><span style="font-family: Arial; text-align: justify; white-space: pre-wrap;">To validate the tool, we manually calculated the impacts of 50 recipes from 4 websites: </span><a href="https://www.bbcgoodfood.com/" style="font-family: Arial; white-space: pre-wrap;" target="_blank">BBC Good Food</a><span style="font-family: Arial; white-space: pre-wrap;">, <a href="https://www.ah.nl/allerhande" target="_blank">Albert Heijn/Allerhande</a>, </span><a href="http://AllRecipes.com" style="font-family: Arial; white-space: pre-wrap;" target="_blank">AllRecipes.com</a><span style="font-family: Arial; white-space: pre-wrap;"> and <a href="https://www.kochbar.de/" target="_blank">Kochbar</a>, and compared these with the results from our tool. </span></p><p><span style="font-family: Arial; text-align: justify; white-space: pre-wrap;">We created a website where you can enter a recipe and get back the calculation for the recipe and per portion (with confidence intervals). 
The image below shows a sample screenshot.</span></p><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEhLQCnM3IpPSl0vhbl7AmvFsoacA6Ychw17lJdyCx6g4APxa1GaZdy2yio7HP-XFKP2rhEgjkKRuTVVZjJD9KYnVz-kvOO69woBAfxUBiBcv4Ogry-3ZEAkwQcUdSS7GS4BZ1hh5u7KtlIN_hpl9QWCyNSu0hANqZIjsrAA0sg8dcS02mvW_CwR_Kn4=s996" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="574" data-original-width="996" height="368" src="https://blogger.googleusercontent.com/img/a/AVvXsEhLQCnM3IpPSl0vhbl7AmvFsoacA6Ychw17lJdyCx6g4APxa1GaZdy2yio7HP-XFKP2rhEgjkKRuTVVZjJD9KYnVz-kvOO69woBAfxUBiBcv4Ogry-3ZEAkwQcUdSS7GS4BZ1hh5u7KtlIN_hpl9QWCyNSu0hANqZIjsrAA0sg8dcS02mvW_CwR_Kn4=w637-h368" width="637" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><br /></div><p><br /></p><p style="text-align: justify;"><span style="font-family: Arial; white-space: pre-wrap;">We presented some of our findings as a </span><a class="cow-url" href="https://gate.ac.uk/gate/doc/LEAP-Poster-2021.pdf" style="background-color: white; color: #009b00; font-family: Verdana, Arial, sans-serif; text-align: left;">poster</a><span style="text-align: left;"> </span><span style="font-family: Arial; white-space: pre-wrap;">at the </span><span face="Verdana, Arial, sans-serif" style="background-color: white; text-align: left;">Livestock, Environment and People (</span><span style="font-family: Arial; white-space: pre-wrap;">LEAP) conference in December 2021. You can find more examples of our analysis and results there.</span></p><p style="text-align: justify;"><span face="Calibri, Arial, Helvetica, sans-serif" style="caret-color: rgb(0, 0, 0); font-size: 16px;">It's interesting to see how the recipes from the different countries, as well as recipes with different protein sources, lead to different median CO2 footprints. 
Below we see a chart showing the median GHGE per portion in recipes from different protein sources (e.g. those containing beef, those containing tofu) in omnivore, vegetarian, and vegan recipes. Unsurprisingly, the dishes containing meat have higher GHGE values on the whole, though we do find variations within individual recipes. We were particularly excited to find a recipe for chocolate cake that "beat" a salad in terms of low GHGE!</span></p><p><img height="401" src="https://lh3.googleusercontent.com/UkSG0H70HetXSewynGtIt6PJnYWJFA0Dp1CYl3QRI2ZU3l0auigUfkDqwzIRheuRRvpgekc2TTXYtkCnONdG7FkfK9cE-8f6sYawYu1xH7kbqbjt9vb6ESyGKBEBh3V6Kcl0ZuaQaCM0=w692-h401" style="font-family: Arial; text-align: justify; white-space: pre-wrap;" title="Chart" width="692" /></p><span id="docs-internal-guid-32519c71-7fff-f782-929b-2c2e11b416ae"><p dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 8pt; text-align: justify;">When we compared the different datasets (depicting recipes from different European countries) in terms of median GHGE per protein source, we found that Kochbar (German) recipes typically fared the worst, followed by the BBC Good Food recipes (British), and Albert Heijn (Dutch) faring much better.</p><p dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 8pt; text-align: justify;">The work is now continuing with the development of a dashboard enabling additional visualisations and further analysis to be produced.</p></span>Diana Maynardhttp://www.blogger.com/profile/10115059373361509161noreply@blogger.com0tag:blogger.com,1999:blog-6188464480883238284.post-13596911944787254432021-02-07T23:00:00.003+00:002021-02-09T11:12:28.741+00:00New releases bringing GATE and Python closer together<h2 style="text-align: left;">Release <br /></h2><p>The GATE Team is proud to announce two new releases that bring GATE and Python together:</p><ul style="text-align: left;"><li><a href="https://gatenlp.github.io/python-gatenlp/" target="_blank">Python GateNLP 
(version 1.0.2)</a>: a Python 3 package that brings many of the concepts and the ease of handling documents, annotations and features to Python.<br /></li><li><a href="http://gatenlp.github.io/gateplugin-Python/" target="_blank">GATE Python Plugin (version 3.0.2)</a>: a new plugin that can be used from Java GATE to process documents using Python code and the methods provided by the Python GateNLP package <br /></li></ul><p></p><p>Both releases are meant as first releases to a wider community to give feedback about what users need and what the basic design should look like. </p><h2 style="text-align: left;">Feedback <br /></h2><p>Users are invited to give <b>feedback about the Python GateNLP</b> package:</p><ul style="text-align: left;"><li>If you detect a bug, or have a feature request, please use the <a href="https://github.com/GateNLP/python-gatenlp/issues" target="_blank">GitHub Issue Tracker</a></li><li>For more general discussions, ideas, asking the community for help, please use (preferably) the <a href="https://github.com/GateNLP/python-gatenlp/discussions" target="_blank">GitHub Discussions Forum</a> or the <a href="https://groups.io/g/gate-users/" target="_blank">General GATE Mailing List</a></li><li>We are also interested in feedback about the API and the functionality of the package. If you want to use the package for your own development and want to discuss changes, improvements or how you can contribute, please use the <a href="https://github.com/GateNLP/python-gatenlp/discussions" target="_blank">GitHub Discussions Forum</a> </li><li>We are happy to receive contributions! Please create an issue and discuss/plan with developers on the issue tracker before providing a pull request. 
<br /></li></ul><p>To give <b>feedback about the Python Plugin</b>:</p><ul style="text-align: left;"><li>For reporting bugs or feature requests, please use the <a href="https://github.com/GateNLP/gateplugin-Python/issues" target="_blank">GitHub Issue Tracker</a></li><li>For getting help and more general discussions, please use the <a href="https://groups.io/g/gate-users/" target="_blank">General GATE Mailing List</a> </li></ul><p>IMPORTANT: whenever you give feedback, please include as much detail about your Operating System, Java or Python version, package/plugin version and your concrete problem or question as possible! </p><h2 style="text-align: left;">GATE Course Module</h2><div style="text-align: left;">Module 11 of the <a href="https://docs.google.com/document/d/18Ip8VcTS-tIsgUxyxfGePJY3egQeuxezzt_EkUaFdhM/edit" target="_blank">upcoming online GATE course in February 2021</a> will introduce the Python GateNLP package and the GATE Python plugin. You can register for this and many other modules of the course <a href="https://docs.google.com/forms/d/12XHC23inOfTof4SiUHmYwrah1_6IO5YfabxnxwF6xyY/viewform?edit_requested=true" target="_blank">here</a>.<br /></div><h2 style="text-align: left;">Python GateNLP</h2><div style="text-align: left;"><a href="https://gatenlp.github.io/python-gatenlp/" target="_blank">Python GateNLP</a> is a Python NLP framework which provides some of the concepts and abstractions known from Java GATE in Python, plus a number of new features: </div><div style="text-align: left;"><ul style="text-align: left;"><li>Documents with arbitrarily many features, arbitrarily many named Annotation sets. 
GateNLP also adds the capability of keeping a ChangeLog</li><li>AnnotationSets with arbitrarily many (stand-off) Annotations which can overlap in any way and can span any character range (not just entire tokens/words)<br /></li><li>Annotations with arbitrarily many features, grouped per set by some annotation type name</li><li>Features which map keys to arbitrary values </li><li>Corpora: collections of documents. Python GateNLP provides corpora that directly map to files in a directory (recursively). </li><li>Prepared modules for processing documents. In GateNLP these are called "Annotators" and also allow for filtering and splitting of documents</li><li>Reading and writing in various formats. GateNLP uses three new formats, "bdocjs" (JSON serialization), "bdocym" (YAML serialization) and "bdocMP" (MessagePack serialization). Documents in these formats can be exchanged with Java GATE through the <a href="https://gatenlp.github.io/gateplugin-Format_Bdoc/" target="_blank">GATE plugin Format_Bdoc</a></li><li>Gazetteers for fast lookup and annotation of token sequences or character sequences which match a large list of known terms or phrases</li><li>A way to annotate documents using patterns over text, other annotations and annotation features: PAMPAC</li><li>An HTML visualizer which allows the user to interactively view GATE documents, annotations and features as separate HTML files or within Jupyter notebooks. 
<br /></li><li>Bridges to powerful NLP libraries and conversion of their annotations to GateNLP annotations:</li><ul><li><a href="https://spacy.io/" target="_blank">Spacy</a></li><li><a href="https://stanfordnlp.github.io/stanza/" target="_blank">Stanford Stanza</a> </li></ul><li>GateWorker: an API that allows the user to directly run Java GATE from Python and exchange documents between Python and Java</li><li>The Java GATE <a href="http://gatenlp.github.io/gateplugin-Python/">Python Plugin</a> (see below) allows the user to run Python GateNLP code directly from Java GATE and process documents with it. <br /></li></ul></div><div style="text-align: left;"><h2 style="text-align: left;">GATE Python Plugin</h2><div style="text-align: left;">The GATE <a href="http://gatenlp.github.io/gateplugin-Python/" target="_blank">Python Plugin</a> is one of many GATE plugins that extend the functionality of Java GATE. This plugin allows the user to process GATE documents running in the Java GATE GUI or via the multiprocessing <a href="https://github.com/GateNLP/gcp" target="_blank">Gate Cloud Processor (GCP)</a> with Python programs (which use the GateNLP API for manipulating documents). 
<br /></div></div>Johann Petrakhttp://www.blogger.com/profile/06869180018061661389noreply@blogger.com0tag:blogger.com,1999:blog-6188464480883238284.post-89974801619676925182020-10-20T10:19:00.004+01:002020-10-20T10:21:07.636+01:00From Entity Recognition to Ethical Recognition: a museum terminology journey<div class="separator" style="clear: both; text-align: center;"><div class="separator" style="clear: both; text-align: justify;"><br /></div><div class="separator" style="clear: both; text-align: justify;"><div class="separator" style="clear: both; text-align: center;"><a href="https://lh3.googleusercontent.com/-ACaWEKSMyVM/X46gLtEz-EI/AAAAAAAAAiY/oHniP7KwmtAy9a6A12EctnW4gYT5ZdhgwCNcBGAsYHQ/image.png" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img alt="" data-original-height="69" data-original-width="145" height="95" src="https://lh3.googleusercontent.com/-ACaWEKSMyVM/X46gLtEz-EI/AAAAAAAAAiY/oHniP7KwmtAy9a6A12EctnW4gYT5ZdhgwCNcBGAsYHQ/w200-h95/image.png" width="200" /></a></div>This guest blog post from Jonathan Whitson Cloud tells the story of "how a relatively simple entity recognition project at the Horniman Museum has, thanks to the range and flexibility of tools available in GATE, opened the door to a method for the democratisation and decolonisation of terminology in Museums."</div><div class="separator" style="clear: both; text-align: justify;">In 2018 the Horniman Museum opened a new long term display called the <a href="https://www.horniman.ac.uk/event/world-gallery/" target="_blank">World Gallery.</a> As is usual with museum displays, there was only a very limited amount of space for text giving context to the over 3,000 items in the cases. As is also now usual, the Horniman looked to its website to share more of the research and stories the curators had unearthed in the 6 year gestation of the gallery. 
</div><div class="separator" style="clear: both; text-align: justify;"><br /></div><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://lh3.googleusercontent.com/-hM7nmSnGKi4/X46grKEyQzI/AAAAAAAAAik/HmojpWNGXJwCvL4QVUDNYPzUcrA5LPUMwCNcBGAsYHQ/image.png" style="margin-left: auto; margin-right: auto; text-align: center;"><img alt="" data-original-height="887" data-original-width="945" height="601" src="https://lh3.googleusercontent.com/-hM7nmSnGKi4/X46grKEyQzI/AAAAAAAAAik/HmojpWNGXJwCvL4QVUDNYPzUcrA5LPUMwCNcBGAsYHQ/w640-h601/image.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The Horniman World Gallery<br /></td></tr></tbody></table><div class="separator" style="clear: both; text-align: justify;"><br /></div><h3 style="clear: both; text-align: justify;">Entity Recognition</h3><div class="separator" style="clear: both; text-align: justify;"><div class="separator" style="clear: both;">Central to the ambition for the web content was a desire to bridge the gap between the database that the museum uses to record its collections and the narrative and research texts recorded in a wiki. The link would be the database terminologies and authority lists, used as business controls in the database. The construction of these terminologies has a revered place in museum practice. Museums as they are today emerged from the enlightenment project to categorise and bring order to the world. More on the consequences of this later, but for now it was useful to have a series of reference terms for the types of objects in the gallery, the cultures they came from, the people, places and materials etc. 
</div><div class="separator" style="clear: both;"><br /></div><div class="separator" style="clear: both;">I had learnt about GATE and participated in the week's training course in 2015, when I first became interested and aware of the potential for Natural Language Processing as a way of managing and getting the most out of the vast and often messy data holdings in museums. </div><div class="separator" style="clear: both;"><br /></div><div class="separator" style="clear: both;">My hope was that the terminologies and authorities in our collections database could serve as gazetteers for gazetteer-based entity recognition in GATE. The terminology entities from the database-generated gazetteers would be matched in the wiki texts and rendered as hyperlinks to reference pages for the entities on our website.</div><div class="separator" style="clear: both;"><br /></div><div class="separator" style="clear: both;">This worked pretty well, and we released over 500 <a href="https://www.horniman.ac.uk/collections/subject/1304/" target="_blank">wiki pages of marked up text,</a> with new pages continuing to come on line. The gazetteer matching, though, was only accurate enough to be suggestive, with many strings appearing in multiple gazetteers (people’s names were particularly difficult). I had been wanting an excuse to explore the machine learning potential in GATE and this seemed like an opportunity, so I came up to Sheffield for an additional day’s training (thank you Xingyi) in early March 2020, and came away with a pipeline that used Machine Learning to identify term types independently of the gazetteers, which could then be built into a set of rules that improved the gazetteer identification significantly. 
The annotations produced were still checked prior to publication, but with considerably fewer adjustments required.</div><div class="separator" style="clear: both;"><br /></div><div class="separator" style="clear: both;"><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://lh3.googleusercontent.com/-esxs_aq6aLo/X46haAc4xRI/AAAAAAAAAiw/CBPgecuUsCU_-0zsADm41nNu1OHaWM3vwCNcBGAsYHQ/image.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="603" data-original-width="637" height="378" src="https://lh3.googleusercontent.com/-esxs_aq6aLo/X46haAc4xRI/AAAAAAAAAiw/CBPgecuUsCU_-0zsADm41nNu1OHaWM3vwCNcBGAsYHQ/w400-h378/image.png" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The Gazetteer Pipeline developed in GATE<br /><br /></td></tr></tbody></table></div><div class="separator" style="clear: both;">The next experiment was to run the machine learning enhanced gazetteer pipeline over a set of gallery texts for an older exhibition. 
This produced a lot of matches/links, and should we publish these texts online, they will appear with in-line links to terms already in use in our Mimsy and the World Gallery Wiki texts, so becoming an integrated part of the web of linked terms and texts.</div><div class="separator" style="clear: both;"><br /></div><div class="separator" style="clear: both;"><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://lh3.googleusercontent.com/-HTkSqKBUfXs/X46hpS6MwII/AAAAAAAAAi0/MeaRGomntu8F8EtG-OuD4Ger1MLyCq8hQCNcBGAsYHQ/image.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="332" data-original-width="626" height="213" src="https://lh3.googleusercontent.com/-HTkSqKBUfXs/X46hpS6MwII/AAAAAAAAAi0/MeaRGomntu8F8EtG-OuD4Ger1MLyCq8hQCNcBGAsYHQ/w400-h213/image.png" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The Machine Learning pipeline built in GATE<br /><br /></td></tr></tbody></table></div><div class="separator" style="clear: both;"><br /></div><div class="separator" style="clear: both;">Another very welcome outcome of this process was that the pipeline identified a number of terms that were not in our gazetteers and which became suggested new terms for our terminologies, demonstrating GATE’s ability to create as well as identify terminology, and it is this function that we are now looking to exploit in a new project.</div><div class="separator" style="clear: both;"><br /></div><h3 style="clear: both;">Decolonisation of Museum Collections</h3><div class="separator" style="clear: both;">In 2019 the Horniman was appointed by the Department of Culture Media and Sport (DCMS) to lead a group of museums in <a href="https://www.horniman.ac.uk/project/rethinking-relationships/" target="_blank">developing new collecting and interpretation practice</a> addressing the historic 
and ongoing cultural impact of the UK as a colonising power. The terminology that museums use about their collections is very much a subject of interest to museums seeking to decolonise their collections. As mentioned before, the creation and application of categories has been fundamental to museum practice since museums emerged as knowledge organisations in the 18th century. It has now become painfully clear, however, that these categories have been created and applied with the same scant regard for the rights and culture of the people who made and used the items to which they have been applied as the ‘collecting’ of them. That is to say, at best rudely and at worst violently. </div><div class="separator" style="clear: both;"><br /></div><div class="separator" style="clear: both;">We are currently building a mechanism, again based on a wiki and GATE, whereby new and existing texts authored by the communities who made and used the items in the museum collection can also be marked up by those communities to make learning corpora. A machine learning pipeline will then build new terminologies to be applied to the items that the communities made and used. This is not only decolonising but democratising as it gives value to texts by any members of a community, not just cultural academics or other specialists, in many media including social media.</div><div class="separator" style="clear: both;"><br /></div><div class="separator" style="clear: both;">The GATE tool with its modular architecture has enabled me to take an experimental and incremental approach to accessing advanced NLP tools, despite not being an NLP or even a computer expert. 
That it is open source and supported by an active user community makes it ideal for the Cultural Heritage sector which otherwise lacks the funding, the confidence and the expertise to access the powerful NLP techniques and all they offer for the redirecting of museum interpretation away from expert exposition towards a truly democratic and decolonised future. </div></div></div><p></p>Diana Maynardhttp://www.blogger.com/profile/10115059373361509161noreply@blogger.com0tag:blogger.com,1999:blog-6188464480883238284.post-16416629611001966942020-02-24T11:00:00.003+00:002021-09-14T12:21:04.256+01:00Online Abuse toward Candidates during the UK General Election 2019<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<br /></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 700; text-decoration: none; vertical-align: baseline; white-space: pre-wrap; white-space: pre;"><span id="docs-internal-guid-fa23b435-7fff-4c7b-3f71-6b92bffc9165" style="font-weight: normal;"><span style="font-size: 11pt; font-variant-east-asian: normal; font-variant-numeric: normal; font-weight: 700; vertical-align: baseline;"><span style="border: none; display: inline-block; height: 329px; overflow: hidden; width: 602px;"><img height="329" src="https://gate.ac.uk/sale/images/blog/most-abused-mps-2019.png" style="margin-left: 0px; margin-top: 0px;" width="602" /></span></span></span></span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<br /></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
In this blog post I’m going to discuss the 2019 UK general election and the increase in abuse aimed at politicians online. We collected 4.2 million tweets sent to or from election candidates in the six-week period from the start of November until shortly after the December 12th election. The graph above shows the candidates who received the most abuse up to and including December 14th, with Boris Johnson and Jeremy Corbyn receiving the most by far.</div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<br /></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
The 2016 "Brexit" referendum left the parliament and the nation divided. Since then we have seen two general elections, and two Prime Ministers jostle to strengthen their majority and improve their negotiating position with the EU. National feeling has never been so polarised and it will come as no surprise that with the social changes brought about through the rise of social media, abuse towards politicians in the UK has increased. Using natural language processing we can identify abuse and type it according to whether it is political, sexist or simply generic abuse.</div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<br /></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
Our work investigates a large tweet collection on which natural language processing has been performed in order to identify abusive language, the politicians it is targeted at and the topics in the politician’s original tweet that tend to trigger abusive replies, thus enabling large scale quantitative analysis. A list of slurs, offensive words and potentially sensitive identity markers was used. The slurs list contained 1081 abusive terms or short phrases in British and American English, comprising mostly an extensive collection of insults, racist and homophobic slurs, as well as terms that denigrate a person’s appearance or intelligence, gathered from sources that include http://hatebase.org and Farrell et al [2].</div>
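<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
As a rough illustration of list-based matching, the Python sketch below looks up terms from a word list in a tweet. It is not the pipeline used in this work: the three terms are mild placeholders rather than entries from the real list, and the function names <code>build_matcher</code> and <code>find_abusive_terms</code> are hypothetical.</div>

```python
import re


def build_matcher(terms):
    """Compile one case-insensitive pattern that matches any listed term as a whole word or phrase."""
    # Longest terms first, so multi-word phrases are preferred over their sub-words.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = r"\b(?:" + "|".join(re.escape(term) for term in ordered) + r")\b"
    return re.compile(pattern, re.IGNORECASE)


def find_abusive_terms(text, matcher):
    """Return every listed term found in the text, lower-cased, in order of appearance."""
    return [match.group(0).lower() for match in matcher.finditer(text)]


# Placeholder terms for illustration only; the real list had 1081 entries.
PLACEHOLDER_TERMS = ["idiot", "liar", "waste of space"]
matcher = build_matcher(PLACEHOLDER_TERMS)
```

<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
The word-boundary anchors stop a listed term from firing inside a longer word, which matters for short terms. A tweet would count as abusive in this simplified view if <code>find_abusive_terms</code> returns a non-empty list.</div>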
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<br /></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<b>Method</b></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<br /></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
Tweets were collected in real time using Twitter’s streaming API. We began collecting immediately for every candidate who had been entered into Democracy Club’s database[10] and had a Twitter account. We used the API to follow the accounts of all candidates over the campaign period. This means we collected all the tweets sent by each candidate, any replies to those tweets, and any retweets either made by the candidate or of the candidate’s own tweets. Note that this approach does not collect all tweets which an individual would see in their timeline, as it does not include those in which they are just mentioned. We took this approach because the analysis results are more reliable: replies are directed at the politician who authored the tweet, and thus any abusive language is more likely to be directed at them. Ethics approval was granted to collect the data through application 25371 at the University of Sheffield.</div>
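<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
To make the tweet categories concrete, here is a minimal Python sketch that sorts one collected tweet into the kinds described above. It assumes tweets arrive as dictionaries in the Twitter v1.1 JSON layout (<code>user.id_str</code>, <code>in_reply_to_user_id_str</code>, <code>retweeted_status</code>); the function name and the category labels are our own illustration, not the code used for the study.</div>

```python
def classify_tweet(tweet, candidate_ids):
    """Label one tweet relative to the set of followed candidate account IDs."""
    author = tweet["user"]["id_str"]
    by_candidate = author in candidate_ids

    # Retweets carry the original tweet under "retweeted_status".
    if "retweeted_status" in tweet:
        if by_candidate:
            return "candidate_retweet"
        if tweet["retweeted_status"]["user"]["id_str"] in candidate_ids:
            return "retweet_of_candidate"
        return "other"

    # Replies name the account they answer.
    reply_to = tweet.get("in_reply_to_user_id_str")
    if reply_to is not None:
        if by_candidate:
            return "candidate_reply"
        if reply_to in candidate_ids:
            return "reply_to_candidate"
        return "other"

    # Anything else is an original tweet; tweets that merely mention
    # a candidate fall through to "other".
    return "candidate_original" if by_candidate else "other"
```

<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
Only tweets labelled <code>reply_to_candidate</code> would feed an abuse count in this sketch, which mirrors the choice to analyse replies rather than mentions.</div>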
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<br /></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<b>Findings</b></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<br /></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
Table 1 gives overall statistics for the research period, which contains a total of 184,014 candidate-authored original tweets, 334,952 retweets and 131,292 replies. We found 3,541,769 replies to politicians, of which 4.46% were abusive. The second row gives the same statistics for the 2017 general election period. It is evident that the level of abuse received by political candidates has risen in the intervening two and a half years, from 3.27% to 4.46% of replies.</div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<br /></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
In terms of representation in the sample of election candidates with Twitter accounts, gender balance is skewed heavily in favour of men for the Conservatives and LibDems; Labour in contrast had more female/non-binary than male candidates. Most abuse is aimed at Jeremy Corbyn and Boris Johnson, with Matthew Hancock, Jacob Rees-Mogg, Jo Swinson, Michael Gove, David Lammy and James Cleverly also receiving substantial abuse. Michael Gove received a great deal of personal abuse following the climate debate. Jo Swinson received the most sexist abuse.</div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<br /></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span id="docs-internal-guid-90ed7a4c-7fff-a19d-5efb-a42ea71c53d0"><br /></span></div>
<div align="left" dir="ltr" style="margin-left: 0pt;">
<table style="border-collapse: collapse; border: none; table-layout: fixed; width: 451.27559055118115pt;"><colgroup><col></col><col></col><col></col><col></col><col></col><col></col><col></col></colgroup><tbody>
<tr style="height: 0pt;"><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; overflow-wrap: break-word; overflow: hidden; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 700; text-decoration: none; vertical-align: baseline; white-space: pre-wrap; white-space: pre;">Period</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; overflow-wrap: break-word; overflow: hidden; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 700; text-decoration: none; vertical-align: baseline; white-space: pre-wrap; white-space: pre;">Original MP tweets</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; overflow-wrap: break-word; overflow: hidden; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 700; text-decoration: none; vertical-align: baseline; white-space: pre-wrap; white-space: pre;">MP retweets</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; overflow-wrap: break-word; overflow: hidden; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 700; text-decoration: none; vertical-align: baseline; white-space: pre-wrap; white-space: pre;">MP</span></div>
<div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 700; text-decoration: none; vertical-align: baseline; white-space: pre-wrap; white-space: pre;">replies</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; overflow-wrap: break-word; overflow: hidden; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 700; text-decoration: none; vertical-align: baseline; white-space: pre-wrap; white-space: pre;">Replies to MPs</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; overflow-wrap: break-word; overflow: hidden; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 700; text-decoration: none; vertical-align: baseline; white-space: pre-wrap; white-space: pre;">Abusive replies to MPs</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; overflow-wrap: break-word; overflow: hidden; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 700; text-decoration: none; vertical-align: baseline; white-space: pre-wrap; white-space: pre;">%</span></div>
<div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 700; text-decoration: none; vertical-align: baseline; white-space: pre-wrap; white-space: pre;">Abuse</span></div>
</td></tr>
<tr style="height: 0pt;"><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; overflow-wrap: break-word; overflow: hidden; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap; white-space: pre;">3 Nov–15 Dec 2019</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; overflow-wrap: break-word; overflow: hidden; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap; white-space: pre;">184,014 </span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; overflow-wrap: break-word; overflow: hidden; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap; white-space: pre;">334,952</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; overflow-wrap: break-word; overflow: hidden; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap; white-space: pre;">131,292 </span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; overflow-wrap: break-word; overflow: hidden; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap; white-space: pre;">3,541,769 </span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; overflow-wrap: break-word; overflow: hidden; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap; white-space: pre;">157,844</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; overflow-wrap: break-word; overflow: hidden; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap; white-space: pre;">4.46</span></div>
</td></tr>
<tr style="height: 0pt;"><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; overflow-wrap: break-word; overflow: hidden; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap; white-space: pre;">29 Apr–9 Jun 2017</span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; overflow-wrap: break-word; overflow: hidden; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap; white-space: pre;">126,216 </span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; overflow-wrap: break-word; overflow: hidden; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap; white-space: pre;">245,518 </span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; overflow-wrap: break-word; overflow: hidden; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap; white-space: pre;">71,598 </span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; overflow-wrap: break-word; overflow: hidden; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap; white-space: pre;">961,413 </span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; overflow-wrap: break-word; overflow: hidden; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap; white-space: pre;">31,454 </span></div>
</td><td style="border-bottom: solid #000000 1pt; border-left: solid #000000 1pt; border-right: solid #000000 1pt; border-top: solid #000000 1pt; overflow-wrap: break-word; overflow: hidden; padding: 5pt 5pt 5pt 5pt; vertical-align: top;"><div dir="ltr" style="line-height: 1.2; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap; white-space: pre;">3.27</span></div>
</td></tr>
</tbody></table>
</div>
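<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
The percentage column can be reproduced from the two count columns next to it. The short Python sketch below does that arithmetic using the figures from Table 1; the dictionary layout and the function name are just for illustration.</div>

```python
# Counts copied from Table 1.
TABLE_1 = {
    "3 Nov-15 Dec 2019": {"replies_to_mps": 3_541_769, "abusive_replies": 157_844},
    "29 Apr-9 Jun 2017": {"replies_to_mps": 961_413, "abusive_replies": 31_454},
}


def abuse_percentage(row):
    """Abusive replies as a percentage of all replies, rounded to two decimals."""
    return round(100 * row["abusive_replies"] / row["replies_to_mps"], 2)


rates = {period: abuse_percentage(row) for period, row in TABLE_1.items()}
for period, rate in rates.items():
    print(f"{period}: {rate}% of replies were abusive")
```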
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<br /></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<br /></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<b>Who is getting abuse?</b></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<br /></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
The topic of Brexit draws abuse for all three parties. Conservative candidates initially move away from this, toward their safer topic of taxation, before returning to Brexit. Liberal Democrats continue to focus on Brexit despite receiving abuse. Labour candidates consistently don’t focus on Brexit; public health is a safe topic for Labour. </div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<br /></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<br /></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
Levels of abuse increased in the run-up to the election. The figure below shows the number of abusive tweets received by the three major parties. There is a considerable spike for both Labour and the Conservatives in the week prior to the election.</div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<br /></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span id="docs-internal-guid-5b547bbd-7fff-8b22-2159-8f779e577e26"><span style="font-family: Arial; font-size: 11pt; font-variant-east-asian: normal; font-variant-numeric: normal; font-weight: 700; vertical-align: baseline; white-space: pre-wrap;"><span style="border: none; display: inline-block; height: 363px; overflow: hidden; width: 602px;"><img height="363" src="https://gate.ac.uk/g8/page/show/2/sale/images/blog/abuse-per-party-per-week.png" style="margin-left: 0px; margin-top: 0px;" width="602" /></span></span></span></div>
<div>
<br /></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<br /></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<br /></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
In the graph below we compare the average abuse per month received by MPs who did not stand again with that received by MPs who did. MPs who stood down received more abuse than those who chose to stand again in all but one month in the first half of 2019, and in June they received over 50% more abuse.</div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<br /></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span id="docs-internal-guid-12833cce-7fff-e3b8-300c-2edaeeae0abb"><span style="font-family: Arial; font-size: 11pt; font-variant-east-asian: normal; font-variant-numeric: normal; font-weight: 700; vertical-align: baseline; white-space: pre-wrap;"><span style="border: none; display: inline-block; height: 312px; overflow: hidden; width: 554px;"><img height="312" src="https://gate.ac.uk/g8/page/show/2/sale/images/blog/volume-abuse-2019.png" style="margin-left: 0px; margin-top: 0px;" width="554" /></span></span></span></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<br /></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<b>Conclusions</b></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<br /></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
Between November 3rd and December 15th, we found 157,844 abusive replies to candidates’ tweets (4.46% of all replies received). This is a low estimate, probably around half of the actual abusive tweets. Overall, abuse levels climbed week on week in November and early December, as the election campaign progressed, from 17,854 in the first week to 41,421 in the week of the December 12th election. The escalation in abuse was toward Conservative candidates specifically, with abuse levels towards candidates from the other two main parties remaining stable week on week; however, after Labour’s decisive defeat, their candidates were subjected to a spike in abuse. Abuse levels are not constant; abuse is triggered by external events (e.g. leadership debates) or controversial tweets by the candidates. Abuse levels have also climbed roughly month on month over the year, and in November were more than double the January volume.</div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<br /></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: Arial; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre-wrap; white-space: pre;">[1] </span><a href="https://gate-socmedia.group.shef.ac.uk/election-analysis-and-hate-speech/" style="text-decoration: none;"><span style="-webkit-text-decoration-skip: none; background-color: transparent; color: #1155cc; font-family: Arial; font-size: 11pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration-skip-ink: none; text-decoration: underline; vertical-align: baseline; white-space: pre-wrap; white-space: pre;">https://gate-socmedia.group.shef.ac.uk/election-analysis-and-hate-speech/ge2019-supp-mat/</span></a></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span id="docs-internal-guid-fd241f43-7fff-6d75-9508-ecdc358f2704"><br /><span style="font-family: Arial; font-size: 11pt; font-variant-east-asian: normal; font-variant-numeric: normal; vertical-align: baseline; white-space: pre-wrap;">[2]</span><span style="color: #1155cc; font-family: Arial; font-size: 11pt; font-variant-east-asian: normal; font-variant-numeric: normal; text-decoration-line: underline; text-decoration-skip-ink: none; vertical-align: baseline; white-space: pre-wrap;"><a href="https://www.stylist.co.uk/long-reads/women-mps-standing-down-uk-onlineabuse-election/325744" style="text-decoration-line: none;">https://www.stylist.co.uk/long-reads/women-mps-standing-down-uk-onlineabuse-election/325744</a></span></span></div>
Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-6188464480883238284.post-19031874487695815242019-11-06T13:18:00.002+00:002021-09-14T12:39:37.034+01:00Which MPs changed party affiliation, 2017-2019<div class="separator" style="clear: both; text-align: center;"><a href="https://gate.ac.uk/sale/images/blog/mp-change-party.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://gate.ac.uk/sale/images/blog/mp-change-party.png" width="640" height="512" data-original-width="1187" data-original-height="949" /></a></div>
As part of our work tracking Twitter abuse towards MPs and candidates going into the December 12th general election, I've been updating our data files regarding party membership. I thought you might be interested to see the result!<br /><br />
Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-6188464480883238284.post-9222206714417736892019-08-12T10:22:00.001+01:002021-09-14T12:59:37.157+01:00In the News: Online Abuse of Politicians, BBC<div class="separator" style="clear: both; text-align: center;"><a href="https://gate.ac.uk/g8/page/show/2/sale/images/blog/in-the-news-1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://gate.ac.uk/g8/page/show/2/sale/images/blog/in-the-news-1.png" width="640" data-original-width="1273" data-original-height="686" /></a></div><br />
We've been working together with the BBC to bring public attention to the issue of online abuse against politicians. Rising tensions in Q1 and Q2 of 2019 meant that politicians were seeing more verbal abuse on Twitter than we had previously observed. The findings were presented on the 6 o'clock and 10 o'clock news on Tuesday, August 6th. As the histogram above shows, we found the level of incivility rising to almost 4%. <strong><a href="https://www.bbc.co.uk/news/uk-politics-49247808">You can see the BBC article describing the work here</a></strong>.<br /><br />
The BBC also ran a survey. Of the 172 MPs who responded, 139 said that either they or their staff had faced abuse in the past year. More than 60% (108) of those who replied said they had been in contact with the police about threats in the last 12 months.<br /><br />
We found that levels of abuse on Twitter fluctuate over time, with spikes driven by events such as the death of IS bride Shamima Begum's baby or key events in the Brexit negotiations. Labour MP David Lammy has received the most abuse of any MP on Twitter so far this year.<br /><br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://gate.ac.uk/g8/page/show/2/sale/images/blog/in-the-news-2.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" src="https://gate.ac.uk/g8/page/show/2/sale/images/blog/in-the-news-2.png" width="400" data-original-width="1275" data-original-height="723" /></a></div>
As previously, we also found that on average, male MPs attract significantly more general incivility than female ones, though women attract more sexist abuse. Conservative MPs on average, as previously, attracted significantly more abuse than Labour ones, perhaps because they are in power. Sexist abuse is more prevalent than homophobic or racist abuse.
Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-6188464480883238284.post-38629399247743832032019-07-30T14:20:00.001+01:002020-10-20T13:44:54.794+01:00GATE Cloud services for Google Sheets featured in the CLARIN Newsflash<a href="https://www.clarin.eu/content/clarin-in-a-nutshell">CLARIN ERIC</a> is a research infrastructure across Europe and beyond that encourages the sharing and sustainability of language data and tools for research in the humanities and social sciences. We are pleased to announce that our functions for text analysis in Google Sheets were featured in the <a href="https://mailchi.mp/9d7f50c03029/clarin-newsflash-july-2019">July 2019 issue</a> of the CLARIN Newsflash.<br />
<br />
We are still working on getting Google to publish our add-on, which we hope to have available in the marketplace in a few months. Until then, you can follow the instructions in our <a href="https://gate4ugc.blogspot.com/2019/07/gate-cloud-services-for-google-sheets.html">previous blog post</a> to use this tool, which currently provides standard and Twitter-oriented named entity recognition for English, French, and German; named entity linking for English, French, and German; and rumour veracity evaluation for English. In the future we will expand the range of functions to cover a wider variety of <a href="https://cloud.gate.ac.uk/shopfront">GATE Cloud services</a>.Adam Funkhttp://www.blogger.com/profile/10598570295455716024noreply@blogger.com0tag:blogger.com,1999:blog-6188464480883238284.post-26150667546566642302019-07-15T07:58:00.002+01:002020-10-20T13:53:11.762+01:00GATE Cloud services for Google SheetsSpreadsheets are an increasingly popular way of storing all kinds of information, including text, and giving it some informal structure, and systems like Google Sheets are especially popular for collaborative work and sharing data.<br />
<br />
In response to the demand for standard natural language processing (NLP) tasks in spreadsheets, we have developed a Google Sheets add-on that provides functions to carry out the following tasks on text cells using GATE Cloud services:<br />
<ul>
<li>named entity recognition (NER) for standard text (e.g. news) in English, French, or German;</li>
<li>NER tuned for tweets in English, French, or German;</li>
<li>named entity linking using our YODIE service in English, French, or German;</li>
<li>veracity reporting for rumours in tweets.</li>
</ul>
<div>
<br />
We have demonstrated this work several times, most recently at the IAMCR conference "Communication, Technology and Human Dignity: Disputed Rights, Contested Truths", which took place on 7–11 July at the Universidad Complutense de Madrid in Spain. There we used it to show how organisations monitoring the safety of journalists could automatically add information about entities and events to their spreadsheets. Potential users have said it looks very useful and they would like access to it as soon as possible.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://1.bp.blogspot.com/-np8QUjnjQ7E/XSxGl-gZEtI/AAAAAAAAARM/gUJjVThEqlYteeqLb8Z16qg1h8Mi0HMnwCLcBGAs/s1600/Screenshot%2B2019-07-15%2B10.18.04.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="496" data-original-width="1146" height="276" src="https://1.bp.blogspot.com/-np8QUjnjQ7E/XSxGl-gZEtI/AAAAAAAAARM/gUJjVThEqlYteeqLb8Z16qg1h8Mi0HMnwCLcBGAs/s640/Screenshot%2B2019-07-15%2B10.18.04.png" width="640" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td class="tr-caption" style="font-size: 12.8px;">Google sheet showing Named Entity and Linking applications run over descriptions of journalist killings from the Committee to Protect Journalists (CPJ) databases</td></tr>
</tbody></table>
</div>
<div>
<br /></div>
<div>
We are applying to have this add-on published in the G Suite Marketplace, but the process is very slow, so we are making the software available now as a read-only Google Drive document that anyone can copy and re-use. </div>
<div>
<br /></div>
<div>
The document contains several examples, and instructions are available from the <i>Add-ons</i> → <i>GATE Text Analysis</i> menu item. The language processing is actually done on our servers; the spreadsheet functions send the text to GATE Cloud using the REST API and reformat the output into a human-readable form, so they require a network connection and are subject to rate-limiting. You can use the functions without setting up a GATE Cloud account, but if you <a href="https://cloud.gate.ac.uk/register/index">create one</a> and authenticate while using this add-on, rate-limiting will be reduced.</div>
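<div>
To give a rough idea of the "reformat the output" step, here is a minimal Python sketch (the add-on itself is not written in Python). It assumes the service returns JSON containing the original <i>text</i> plus an <i>entities</i> map from annotation type to a list of annotations with <i>indices</i> character offsets; that response shape, the function name and the sample data are assumptions made for illustration, and no network call is made.</div>

```python
def summarise_entities(response: dict) -> str:
    """Turn an entity-annotation response into one readable line.

    Assumed (hypothetical) shape:
    {"text": "...", "entities": {"Person": [{"indices": [start, end]}, ...]}}
    """
    text = response.get("text", "")
    parts = []
    for ann_type, annotations in sorted(response.get("entities", {}).items()):
        # Each annotation is assumed to carry [start, end] character offsets.
        mentions = [text[a["indices"][0]:a["indices"][1]] for a in annotations]
        if mentions:
            parts.append(f"{ann_type}: {', '.join(mentions)}")
    return "; ".join(parts)


# Hand-made sample standing in for a real service response.
sample = {
    "text": "Diana Maynard travelled to Addis Ababa.",
    "entities": {
        "Person": [{"indices": [0, 13]}],
        "Location": [{"indices": [27, 38]}],
    },
}
print(summarise_entities(sample))
```

<div>
A spreadsheet function along these lines would place the returned string in the cell next to the input text.</div>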
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://1.bp.blogspot.com/-ZI75VBkxWJM/XSioXz6LGAI/AAAAAAAAAOU/S7UDuLXbU3ghLFsXKik_awK0oyMiuB6egCLcBGAs/s1600/gate_ner_cloud.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="591" data-original-width="1065" height="353" src="https://1.bp.blogspot.com/-ZI75VBkxWJM/XSioXz6LGAI/AAAAAAAAAOU/S7UDuLXbU3ghLFsXKik_awK0oyMiuB6egCLcBGAs/s640/gate_ner_cloud.png" width="640" /></a></div>
<div style="text-align: center;">
<br /></div>
<br />
Open <a href="https://docs.google.com/spreadsheets/d/1Hfb_4aHf4qoDWxFHxp0Y5PwLKfSDiyn7XD50mW7QGXs/edit?usp=sharing">this Google spreadsheet</a>, then use <i>File</i> → <i>Make a copy</i> to save a copy to your own Google Drive (you can’t edit the original). For the functions to work, you will have to grant permission for the scripts to send data to and from GATE Cloud services and to use your user-level cache.<br />
<br />
This work has been supported by the European Union’s Horizon 2020 research and innovation programme under grant agreements No 687847 (<a href="https://www.comrades-project.eu/">COMRADES</a>) and No 654024 (<a href="http://www.sobigdata.eu/">SoBigData</a>).<br />
<br />
<br />
Posted by Adam Funk<br />
<br />
<h2>Using GATE to drive robots at Headstart 2019</h2>
Published 12 July 2019<br />
<br />
In collaboration with Headstart (a charitable trust that provides hands-on science, engineering and maths taster courses), the Department of Computer Science has just run its fourth annual summer school for maths and science A-level students. This residential course ran from 8 to 12 July 2019 and included practical work in computer programming, Lego robots, and project development as well as tours of the campus and talks about the industry.<br />
<br />
For the third year in a row, we have included a section on natural language processing using GATE Developer and a special GATE plugin (which uses the <a href="https://github.com/ramsay-t/ShefRobot">ShefRobot library</a> available from GitHub) that allows JAPE rules to operate the Lego robots. As before, we provided the students with a starter GATE application (essentially the same as in <a href="https://gate4ugc.blogspot.com/2018/09/students-use-gate-and-twitter-to-drive.html">last year's course</a>) containing just enough gazetteer entries, JAPE, and sample code to let them tweet variations like "turn left" and "take a left" to make the robot do just that. We also use the <a href="https://cloud.gate.ac.uk/shopfront/displayItem/twitter-collector">GATE Cloud Twitter Collector</a>, which we have modified to run locally: the students set it up on a lab computer, where it follows their own Twitter accounts and processes their tweets through the GATE application, sending commands to the robots when the JAPE rules match.<br />
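<br />
The idea of combining gazetteer entries with a matching rule can be illustrated outside GATE with a small Python sketch. This is not JAPE and not the plugin's code; the trigger phrases, direction names and function name are hypothetical, chosen only to show how variations like "turn left" and "take a left" could map to the same command.<br />

```python
import re
from typing import List

# Hypothetical "gazetteer": direction words mapped to robot commands.
DIRECTIONS = {"left": "LEFT", "right": "RIGHT"}
# Hypothetical trigger phrases that may precede a direction word.
TRIGGERS = ["turn", "take a", "go", "hang a"]

# Rule: a trigger phrase followed by whitespace and a direction word.
_PATTERN = re.compile(
    r"\b(?:%s)\s+(%s)\b"
    % ("|".join(map(re.escape, TRIGGERS)), "|".join(DIRECTIONS)),
    re.IGNORECASE,
)


def commands_from_tweet(tweet: str) -> List[str]:
    """Return the robot commands matched in a tweet, in order of appearance."""
    return [DIRECTIONS[m.group(1).lower()] for m in _PATTERN.finditer(tweet)]


print(commands_from_tweet("Please take a left, then turn right"))
```

Adding a new variation then only means adding a trigger phrase or a direction entry, which mirrors how the students extended the starter application with extra gazetteer entries and rules.<br />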
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://1.bp.blogspot.com/-RYKSq287pzQ/W4_pjKpHm2I/AAAAAAAAAH0/t_FDAIq2f84d4Tb0sPRiPNzAEKmLidyWwCPcBGAYYCw/s1600/twitter-robot.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1057" data-original-width="1600" height="211" src="https://1.bp.blogspot.com/-RYKSq287pzQ/W4_pjKpHm2I/AAAAAAAAAH0/t_FDAIq2f84d4Tb0sPRiPNzAEKmLidyWwCPcBGAYYCw/s320/twitter-robot.png" width="320" /></a></div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
Based on lessons learned from the previous years, we put more effort into improving the instructions and the Twitter Collector software to help the students get it running faster. This time the first robot started moving under GATE's control less than 40 minutes from the start of the presentation, and the students rapidly progressed with the development of additional rules and then tweeting commands to their robots.</div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://1.bp.blogspot.com/-WA41tmj0V6E/XScR3BnIZ1I/AAAAAAAAAN4/NvRRb13cQ8YYVEfJ26x_CsBBMlMgejLWwCLcBGAs/s1600/IMG_2110.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1200" data-original-width="1600" height="240" src="https://1.bp.blogspot.com/-WA41tmj0V6E/XScR3BnIZ1I/AAAAAAAAAN4/NvRRb13cQ8YYVEfJ26x_CsBBMlMgejLWwCLcBGAs/s320/IMG_2110.JPG" width="320" /></a></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://1.bp.blogspot.com/-M8bRqi_T488/XScR3AuX8GI/AAAAAAAAAN0/wX7JEhHcCKkC4rbBlvBL3pmjCncXT8JOwCLcBGAs/s1600/IMG_2111.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1200" data-original-width="1600" height="240" src="https://1.bp.blogspot.com/-M8bRqi_T488/XScR3AuX8GI/AAAAAAAAAN0/wX7JEhHcCKkC4rbBlvBL3pmjCncXT8JOwCLcBGAs/s320/IMG_2111.JPG" width="320" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
The structure and broader coverage of this year's course meant that the students had more resources available and a more open project assignment, so not all of them chose to use GATE in their projects, but it was much easier and more streamlined for them to use than in previous years.</div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<iframe allowfullscreen='allowfullscreen' webkitallowfullscreen='webkitallowfullscreen' mozallowfullscreen='mozallowfullscreen' width='320' height='266' src='https://www.blogger.com/video.g?token=AD6v5dygJdiCnfb5uS3BctPvGWE3AJBksuOvqvL4tILXQXipKn9msxMV1IVZPkf4UpqORpds4bJHOYiHuli6iIAVDg' class='b-hbp-video b-uploaded' frameborder='0'></iframe></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div style="text-align: center;">
<iframe allowfullscreen='allowfullscreen' webkitallowfullscreen='webkitallowfullscreen' mozallowfullscreen='mozallowfullscreen' width='320' height='266' src='https://www.blogger.com/video.g?token=AD6v5dycfDGwS0HOiKKX0bVzaSh0IWMwfPc8S9xhZCzz2lMgmTGtHPlJKCQY1LfONg9_WJfeXzNE_QALyHB7O06mdQ' class='b-hbp-video b-uploaded' frameborder='0'></iframe></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<iframe allowfullscreen='allowfullscreen' webkitallowfullscreen='webkitallowfullscreen' mozallowfullscreen='mozallowfullscreen' width='320' height='266' src='https://www.blogger.com/video.g?token=AD6v5dzWnnu8KxIKvPR9EbrZpDGTev_kbA5fCdVe24vidEnXAbYDiXRBKXapjiVW1q-A9KQqszoOybkGdj4MRO4y9w' class='b-hbp-video b-uploaded' frameborder='0'></iframe></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div style="text-align: center;">
<iframe allowfullscreen='allowfullscreen' webkitallowfullscreen='webkitallowfullscreen' mozallowfullscreen='mozallowfullscreen' width='320' height='266' src='https://www.blogger.com/video.g?token=AD6v5dz4hRXXZFiCHBKOtv3SJaDy3sUdtGC6e6D4TNiOA3yxf6SKEjttdfW6z19oF8fj5hRpyKlHL38gPVsRmLN8Ig' class='b-hbp-video b-uploaded' frameborder='0'></iframe></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div style="text-align: center;">
<iframe allowfullscreen='allowfullscreen' webkitallowfullscreen='webkitallowfullscreen' mozallowfullscreen='mozallowfullscreen' width='320' height='266' src='https://www.blogger.com/video.g?token=AD6v5dwIBs2e7GhDsx9fXY8dhYQVVHQ0fmTG3wVX0aeYYon4B7Qsfvji7U44cnxtShtME4sJsu-XejTGScvN-jySYQ' class='b-hbp-video b-uploaded' frameborder='0'></iframe></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
This year 42 students (14 female; 28 male) from around the UK attended the Computer Science Headstart Summer School.</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-nvktnZjymg8/XScR_P-DoZI/AAAAAAAAAOA/c4ofsXEfmE0hcFMrC8vjqLIAXOFEekP5wCLcBGAs/s1600/Computer_Science_Map_2019.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="695" data-original-width="670" height="320" src="https://1.bp.blogspot.com/-nvktnZjymg8/XScR_P-DoZI/AAAAAAAAAOA/c4ofsXEfmE0hcFMrC8vjqLIAXOFEekP5wCLcBGAs/s320/Computer_Science_Map_2019.png" width="308" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Geography of male students</td></tr>
</tbody></table>
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-tpYlDxqZx4Y/XScR_FS9OaI/AAAAAAAAAN8/nXIWD-I1Y58vBsDy_Q3RXyNBYmFQXnKTACLcBGAs/s1600/Female_Computer_Science_Map_2019.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="748" data-original-width="671" height="320" src="https://1.bp.blogspot.com/-tpYlDxqZx4Y/XScR_FS9OaI/AAAAAAAAAN8/nXIWD-I1Y58vBsDy_Q3RXyNBYmFQXnKTACLcBGAs/s320/Female_Computer_Science_Map_2019.png" width="287" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Geography of female students</td></tr>
</tbody></table>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
The <a href="https://gate.ac.uk/sale/talks/headstart-2019/handout.pdf">handout</a> and <a href="https://gate.ac.uk/sale/talks/headstart-2019/presentation.pdf">slides</a> are publicly available from the GATE website, which also hosts <a href="https://gate.ac.uk/download/">GATE Developer</a> and other software products in <a href="https://gate.ac.uk/family/">the GATE family</a>. Source code is available from <a href="https://github.com/GateNLP">our GitHub site</a>. </div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
GATE Cloud development is supported by the European Union’s Horizon 2020 research and innovation programme under grant agreement No 654024 (the <a href="http://sobigdata.eu/">SoBigData</a> project).</div>
<div class="separator" style="clear: both;">
<br /></div>
<br />
Posted by Adam Funk<br />
Location: The Diamond, The University of Sheffield, 32 Leavygreave Rd, Sheffield S3 7RD, UK<br />
<br />
<h2>12th GATE Summer School (17-21 June 2019)</h2>
Published 3 July 2019<br />
<h2>
12th GATE Training Course: open-source natural language processing with an emphasis on social media</h2>
For over a decade, the GATE team has provided an annual course in using our technology. The course content and track options have changed a bit over the years, but it always includes material to help novices get started with GATE as well as introductory and more advanced use of the <a href="https://gate.ac.uk/sale/tao/splitch8.html#chap:jape">JAPE language</a> for matching patterns of document annotations.<br />
<br />
The latest course also included machine learning, crowdsourcing, sentiment analysis, and an optional programming module (aimed mainly at Java programmers to help them embed GATE libraries, applications, and resources in web services and other "behind the scenes" processing). We have also added examples and new tools in GATE to cover the increasing demand for getting data out of and back into spreadsheets, and updated our work on social media analysis, another growing field.<br />
<div class="separator" style="clear: both; text-align: center;">
</div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-_hA18iUKwgM/XRzJkmABEOI/AAAAAAAAANg/zHMigNtFHeMNNX05Ar-3qQ86njKEjtdjQCLcBGAs/s1600/Screenshot%2B2019-07-03%2B13.47.29.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="445" data-original-width="743" height="191" src="https://1.bp.blogspot.com/-_hA18iUKwgM/XRzJkmABEOI/AAAAAAAAANg/zHMigNtFHeMNNX05Ar-3qQ86njKEjtdjQCLcBGAs/s320/Screenshot%2B2019-07-03%2B13.47.29.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Information in "feral databases" (spreadsheets)</td></tr>
</tbody></table>
We also disseminated work from several current research projects.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-YCAZ5Ualx8k/XRy6hxlXQeI/AAAAAAAAAM8/vIV231cf0HEIYatrZpxN4gvtj4XeGJvMgCLcBGAs/s1600/IMG_2077.JPG" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1200" data-original-width="1600" height="240" src="https://1.bp.blogspot.com/-YCAZ5Ualx8k/XRy6hxlXQeI/AAAAAAAAAM8/vIV231cf0HEIYatrZpxN4gvtj4XeGJvMgCLcBGAs/s320/IMG_2077.JPG" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Semantics in scientometrics</td></tr>
</tbody></table>
<br />
<ul>
<li>From <a href="https://www.knowmak.eu/">KNOWMAK</a> and <a href="https://www.risis2.eu/">RISIS</a>, we presented our work on using semantic technologies in scientometrics, by applying NLP and ontologies to document categorization in order to contribute to a searchable knowledge base that allows users to find aggregate and specific data about scientific publications, patents, and research projects by geography, category, etc.</li>
<li>Much of <a href="https://gate-socmedia.group.shef.ac.uk/">our recent work on social media analysis</a>, including opinion mining and abuse detection and measurement, has been done as part of the <a href="http://sobigdata.eu/index">SoBigData</a> project.</li>
<li>The increasing range of tools for languages other than English links with our participation in the <a href="https://www.european-language-grid.eu/">European Language Grid</a>, which is also supporting further development of <a href="https://cloud.gate.ac.uk/">GATE Cloud</a>, our platform for text analytics as a service.</li>
</ul>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-p_z1-oswDSU/XRzI8a1K2rI/AAAAAAAAANU/AMY2N-MvlAIzSle1jQOdfSNjtKkYO8L6QCLcBGAs/s1600/D9frd_cXkAAvR7y.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="900" data-original-width="1200" height="240" src="https://1.bp.blogspot.com/-p_z1-oswDSU/XRzI8a1K2rI/AAAAAAAAANU/AMY2N-MvlAIzSle1jQOdfSNjtKkYO8L6QCLcBGAs/s320/D9frd_cXkAAvR7y.jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Conditional processing of multilingual documents</td></tr>
</tbody></table>
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-jbzpYOA3mKQ/XRzI8UHsZGI/AAAAAAAAANQ/OjyqFGSZqpYIFYV-8yFrX7bJ8SBrFf7JwCLcBGAs/s1600/D9freeqXoAAsaZg.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="900" data-original-width="1200" height="240" src="https://1.bp.blogspot.com/-jbzpYOA3mKQ/XRzI8UHsZGI/AAAAAAAAANQ/OjyqFGSZqpYIFYV-8yFrX7bJ8SBrFf7JwCLcBGAs/s320/D9freeqXoAAsaZg.jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Processing German in GATE</td></tr>
</tbody></table>
The GATE software distributions, documentation, and training materials from our courses can all be downloaded <a href="https://gate.ac.uk/">from our website</a> under open licences. Source code is also available from <a href="https://github.com/gatenlp">our GitHub page</a>.<br />
<div>
<h3>
Acknowledgements</h3>
</div>
<span style="font-family: inherit;">The course included research funded by the European Union's Horizon 2020 research and innovation programme under grant agreements No. 726992 (KNOWMAK), No. 654024 (SoBigData), No. </span>824091 (RISIS), <span style="font-family: inherit;">and No. </span>825627 (European Language Grid); by the <a href="https://www.freepressunlimited.org/">Free Press Unlimited</a> pilot project "Developing a database for the improved collection and systematisation of information on incidents of violations against journalists"; by EPSRC grant EP/I004327/1; by the British Academy under the call "The Humanities and Social Sciences: Tackling the UK’s International Challenges"; and by <a href="http://nesta.org.uk/">Nesta</a>.<br />
Posted by Adam Funk<br />
<br />
<h2>GATE's submission wins 2nd place in United Nations General Assembly Resolutions Extraction and Elicitation Global Challenge</h2>
Published 27 June 2019<br />
<div class="MsoNormal">
In May 2019 we submitted a prototype to the <a href="https://uniteideas.spigit.com/unga-resolutions/Page/Home">United Nations General Assembly Resolutions Extraction and Elicitation Global Challenge</a>, which asked for submissions using mature natural language processing techniques to produce semantically enhanced, machine-readable documents from PDFs of UN GA resolutions, with particular interest in identifying named entities and items in certain thesauri and ontologies and in making use of the document structure (in particular, the distinction between preamble and operative paragraphs).</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
Our prototype included a customized GATE application designed to read PDFs of United Nations General Assembly resolutions and identify named entities, resolution adoption information (resolution number and adoption date), preamble sections, operative sections, and references to keywords and phrases in the English parts of the <a href="https://lib-thesaurus.un.org/DPI/DHL/DHLUNBISThesaurus.nsf?Open">UN Bibliographical Information System thesaurus</a>.</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
We downloaded and automatically annotated over 2800 resolution documents and pushed the results into a <a href="https://demos.gate.ac.uk/unga/mimir/demo/search/index">Mímir index to allow semantic search</a> using combinations of the entities and sections identified, such as the following (more examples are provided in <a href="https://github.com/GateNLP/gateplugin-UNGA/blob/master/doc/GATE-UNGA-2019-04-26.pdf">the documentation</a> that we submitted):</div>
<div class="MsoNormal">
</div>
<ul>
<li>find sentences in operative paragraphs containing a person and an UNBIS term;</li>
<li>find preamble paragraphs containing a person, an organization, and a date;</li>
<li>find combinations referring to a specific UNBIS term.</li>
</ul>
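<br />
Queries of this kind boil down to containment tests over annotation spans. The following Python sketch shows that logic on plain character offsets; it is not Mímir's query language or API, and the annotation type names ("Operative", "Sentence", "Person", "UNBIS") and the toy offsets are assumptions made for illustration.<br />

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Span:
    kind: str   # hypothetical annotation type, e.g. "Sentence" or "Person"
    start: int  # character offset where the annotation begins
    end: int    # character offset where the annotation ends


def within(inner: Span, outer: Span) -> bool:
    """True if inner lies entirely inside outer."""
    return outer.start <= inner.start and inner.end <= outer.end


def sentences_matching(
    spans: List[Span], section_kind: str, required_kinds: Iterable[str]
) -> List[Span]:
    """Sentences inside a section of the given kind that contain
    at least one annotation of every required kind."""
    sections = [s for s in spans if s.kind == section_kind]
    hits = []
    for sent in (s for s in spans if s.kind == "Sentence"):
        if not any(within(sent, sec) for sec in sections):
            continue
        kinds_inside = {s.kind for s in spans if s is not sent and within(s, sent)}
        if set(required_kinds) <= kinds_inside:
            hits.append(sent)
    return hits


# Toy annotations: one operative and one preamble section, three sentences.
spans = [
    Span("Operative", 0, 100), Span("Preamble", 100, 200),
    Span("Sentence", 0, 50), Span("Sentence", 50, 100), Span("Sentence", 100, 150),
    Span("Person", 5, 15), Span("UNBIS", 20, 30),
    Span("Person", 60, 70),
    Span("Person", 105, 110), Span("UNBIS", 120, 130),
]
print(sentences_matching(spans, "Operative", ["Person", "UNBIS"]))
```

An index such as Mímir answers the same question without scanning every annotation, which is what makes these searches practical over thousands of documents.<br />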
<br />
<div class="MsoNormal">
We also developed an <a href="https://demos.gate.ac.uk/unga/search/">easier to use web front end</a> for exploring co-occurrences of keywords and semantic annotations.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://1.bp.blogspot.com/-dmRPaQTv7T4/XQzAt7VrJTI/AAAAAAAAAMY/Nat2_7r0F98D4pod2Cbi2VrUBMcbRIR_QCLcBGAs/s1600/demo-search.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="282" data-original-width="1160" height="95" src="https://1.bp.blogspot.com/-dmRPaQTv7T4/XQzAt7VrJTI/AAAAAAAAAMY/Nat2_7r0F98D4pod2Cbi2VrUBMcbRIR_QCLcBGAs/s400/demo-search.png" width="400" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://1.bp.blogspot.com/-JMYNksugYX4/XQzAt1kgzcI/AAAAAAAAAMc/zttvJTGWQjMhAKaydiIRKutXUBuEaj_vgCLcBGAs/s1600/demo-matrix.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="295" data-original-width="324" height="291" src="https://1.bp.blogspot.com/-JMYNksugYX4/XQzAt1kgzcI/AAAAAAAAAMc/zttvJTGWQjMhAKaydiIRKutXUBuEaj_vgCLcBGAs/s320/demo-matrix.png" width="320" /></a></div>
<br /></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
We are excited to receive the second place award, along with an invitation to improve our work with more feedback and a "lessons learned" discussion with the panel. The panel highlighted in particular the submission of comprehensive and testable code, and the use of GATE as a mature, respected framework.<br />
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://1.bp.blogspot.com/-73K_TF_83_g/XQylMn7yaII/AAAAAAAAAMM/-Khsd-UjXwYb09M_fHRMsy3I_EJ56SofQCLcBGAs/s1600/UNGA_Resolutions_2nd.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="540" data-original-width="720" height="300" src="https://1.bp.blogspot.com/-73K_TF_83_g/XQylMn7yaII/AAAAAAAAAMM/-Khsd-UjXwYb09M_fHRMsy3I_EJ56SofQCLcBGAs/s400/UNGA_Resolutions_2nd.png" width="400" /></a></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
Our GitHub site contains the <a href="https://github.com/GateNLP/gateplugin-UNGA">information extraction</a> and <a href="https://github.com/GateNLP/UNGA-search">search front end</a> software, licensed under the GPL-3.0 and available for anyone to download and use.</div>
Posted by Adam Funk<br />
<br />
<h2>Toxic Online Discussions during the UK European Parliament Election Campaign</h2>
Published 6 June 2019<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://gate.ac.uk/g8/page/show/2/sale/images/blog/pies-for-blog.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="449" data-original-width="1096" src="https://gate.ac.uk/g8/page/show/2/sale/images/blog/pies-for-blog.png" width="580" /></a></div>
<br />
The Brexit Party attracted the most engagement on Twitter in the run-up to the UK European Parliament election on May 23rd, their candidates receiving as many tweets as all the other parties combined. Brexit Party leader Nigel Farage was the most interacted-with UK candidate on Twitter, with over twice as many replies as the next most replied-to candidate, Andrew Adonis of the Labour Party.<br />
<br />
We studied all tweets sent to or from (or retweets of or by) UK European Election candidates in the month of May, and classified them as abusive or not using the classifier presented <a href="https://arxiv.org/pdf/1904.11230.pdf">here</a>. It must be noted, in particular, that the classifier only identifies reliably whether a reply is abusive or not. It is not sufficiently accurate for us to reliably judge the target politician or party of this abusive reply. What this means is that we can only reliably identify which EP candidates triggered abuse-containing discussion threads on Twitter, but that often this abuse is actually aimed at other politicians or parties.<br />
<br />
In addition to attracting the most replies, the Brexit Party candidates also triggered an unusually high level of abuse-containing Twitter discussions. In particular, we found that posts by Farage triggered almost six times as many abuse-containing Twitter threads as the next most replied-to candidate, Gavin Esler of Change UK, during May 2019.<br />
<br />
There is an important difference, however, in that many of the abuse-containing replies to posts by Farage and the Brexit Party were actually abusive towards other politicians (most notably the prime minister and the leader of the Labour party) and not Farage himself. In contrast, abusive replies to Gavin Esler were primarily aimed at the politician himself, triggered by his use of the phrase "village idiot" in connection with the Leave Campaign.<br />
<br />
Candidates from other parties that triggered unusually high levels of abuse-containing discussions were those from the UK Independence Party, now considered far right, and Change UK, a newly formed but unstable remain party. Change UK was the most active on Twitter, with candidates sending more tweets than other parties. Gavin Esler was the most replied-to Change UK candidate, and also received an unusually high level of abuse. The abuse often referred to <a href="https://www.huffingtonpost.co.uk/entry/gavin-esler-tv-news-must-stop-giving-airtime-to-the-village-idiots-of-brexit_uk_5cc5c36fe4b0fd8e35bda67d">his use of the phrase "village idiot"</a> in connection with the leave campaign, which resulted in anger and resentment.<br />
<br />
In contrast, MEP candidates from the Conservative and Labour Parties were not hubs of polarised, abuse-containing discussions on Twitter.<br />
<br />
What these findings, unsurprisingly, demonstrate is that politicians and parties who themselves use divisive and abusive language, for example, to brand political opponents as “village idiots”, “traitors”, or as “desperate to betray”, are thus triggering the toxic online responses and deep political antagonism that we have witnessed.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://gate.ac.uk/g8/page/show/2/sale/images/blog/bars-for-blog.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" data-original-height="322" data-original-width="594" src="https://gate.ac.uk/g8/page/show/2/sale/images/blog/bars-for-blog.png" width="500" /></a></div>
After the Brexit Party, the next most replied-to MEP candidates were from the Labour Party, followed by Change UK.<br />
<br />
MEP candidates from both the Liberal Democrats and the Green Party were also active on Twitter, with the Green MEP candidates second only to Change UK ones for number of tweets sent, but they did not get much engagement in return. The Liberal Democrats in particular received a low number of replies. This may suggest that these parties became the default choices for a population of discouraged remainers, as both made gains in the election. Both parties attracted a particularly civil tone of reply.<br />
<br />
Brexit Party candidates were also the ones that replied most to those who tweeted them, rather than authoring original tweets or retweeting other tweets.<br />
<br />
Acknowledgements: Research carried out by Genevieve Gorrell, Mehmet Bakir, and Kalina Bontcheva. This work was partially supported by the European Union under grant agreements No. 654024 SoBigData and No. 825297 WeVerify.<br />
<br />
<h2>GATE at World Press Freedom Day</h2>
Published 9 May 2019<br />
<h1 class="entry-title" style="background-color: white; border: 0px; box-sizing: border-box; color: #00844d; font-family: "Open Sans"; font-size: 23px; font-stretch: inherit; font-variant-east-asian: inherit; font-variant-numeric: inherit; letter-spacing: 0.5px; line-height: 1.5em; margin: 0px; padding: 0px 0px 10px; text-align: center; text-transform: uppercase; vertical-align: baseline;">
GATE at World Press Freedom Day: STRENGTHENING THE MONITORING OF SDG 16.10.1</h1>
<div>
<br /></div>
<div style="text-align: justify;">
In her role with <a href="http://www.cfom.org.uk/">CFOM</a> (the University's Centre for Freedom of the Media, hosted in the department of Journalism Studies), Diana Maynard travelled to Ethiopia together with CFOM members Sara Torsner and Jackie Harrison to present their research at the <a href="https://en.unesco.org/commemorations/worldpressfreedomday">World Press Freedom Day</a> <a href="https://en.unesco.org/wpfd-2019-academic-conference">Academic Conference on the Safety of Journalists</a> in Addis Ababa, on 1 May 2019. This ongoing research aims to facilitate the comprehensive monitoring of violations against journalists, in line with Sustainable Development Goal (SDG) <a href="https://www.sdgdata.gov.au/goals/peace-and-justice/16.10.1">16.10.1</a>. This is part of a collaborative project between CFOM and the press freedom organisation <a href="https://www.freepressunlimited.org/en">Free Press Unlimited</a>, which aims to develop a methodology for systematic data collection on a range of attacks on journalists, and to provide a mechanism for dealing with missing, conflicting and potentially erroneous information.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
Discussing how NLP tools could support a monitoring infrastructure that systematises and organises the range of information and data sources related to violations against journalists, Diana proposed several areas of research to explore this in more depth. These include: switching to an events-based methodology, reconciling data from multiple sources, and investigating information validity.</div>
<div style="text-align: justify;">
<br /></div>
<div class="separator" style="clear: both; text-align: justify;">
<a href="https://4.bp.blogspot.com/-JzGcNi2kaNU/XNQ713zc0SI/AAAAAAAAANI/LZdwm7E2Z00SrjzFDETLUPF0I6Ts_UfXACLcBGAs/s1600/IMG_4160-1280x640.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="640" data-original-width="1280" height="320" src="https://4.bp.blogspot.com/-JzGcNi2kaNU/XNQ713zc0SI/AAAAAAAAANI/LZdwm7E2Z00SrjzFDETLUPF0I6Ts_UfXACLcBGAs/s640/IMG_4160-1280x640.jpg" width="640" /></a></div>
<div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
Whereas monitoring of violations against journalists has traditionally taken a person-based approach, recording information centred on an individual, we suggest that an events-based methodology instead places the violation itself at the centre: <i>‘by enabling the contextualising and recording of in-depth information related to a single instance of violence such as a killing, including information about key actors and their interrelationship (victim, perpetrator and witness of a violation), the events-based approach enables the modelling of the highly complex structure of a violation. It also allows for the recording of the progression of subsequent violations as well as multiple violations experienced by the same victim (e.g. detention, torture and killing)’.</i></div>
<div style="text-align: justify;">
<i><br /></i></div>
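<div style="text-align: justify;">
As a rough illustration of the events-based idea, the sketch below puts each violation at the centre of its own record and attaches actors to it by role. It is only a minimal sketch under our own assumptions: the class names (<code>Actor</code>, <code>Role</code>, <code>Event</code>), the helper <code>violations_against</code> and the sample records are all hypothetical placeholders, and this is not the HURIDOCS model or the project's actual implementation.</div>

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple


class Role(Enum):
    """Roles an actor can play in a single violation."""
    VICTIM = "victim"
    PERPETRATOR = "perpetrator"
    WITNESS = "witness"


@dataclass(frozen=True)
class Actor:
    name: str


@dataclass
class Event:
    """One violation (e.g. a detention) and the actors involved in it."""
    kind: str
    date: str  # ISO 8601 date string, so plain string sorting is chronological
    involvement: List[Tuple[Actor, Role]] = field(default_factory=list)

    def actors_with(self, role: Role) -> List[Actor]:
        """Return the actors who hold the given role in this event."""
        return [actor for actor, r in self.involvement if r is role]


def violations_against(events: List[Event], victim: Actor) -> List[Event]:
    """Return the events in which `victim` is a victim, in date order."""
    hits = [e for e in events if victim in e.actors_with(Role.VICTIM)]
    return sorted(hits, key=lambda e: e.date)


# Invented placeholder records, purely for illustration.
journalist = Actor("Journalist A")
official = Actor("Official B")
events = [
    Event("killing", "2019-03-10", [(journalist, Role.VICTIM)]),
    Event("detention", "2019-03-01",
          [(journalist, Role.VICTIM), (official, Role.PERPETRATOR)]),
    Event("torture", "2019-03-04",
          [(journalist, Role.VICTIM), (official, Role.PERPETRATOR)]),
]

# The progression of violations experienced by the same victim.
timeline = [e.kind for e in violations_against(events, journalist)]
print(timeline)
```

<div style="text-align: justify;">
Because roles are attached to the event rather than to the person, the same actor could in principle appear as a victim in one event and as a witness in another, which a person-centred record would struggle to express.</div>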
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: justify;"><tbody>
<tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-2bzBnDUjC_M/XNROLAXfEzI/AAAAAAAAANk/ZPkm0ta9fYcYsRnJ-o6_XSLO7gAzlD9XQCLcBGAs/s1600/event-based-data-model.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="639" data-original-width="706" height="361" src="https://1.bp.blogspot.com/-2bzBnDUjC_M/XNROLAXfEzI/AAAAAAAAANk/ZPkm0ta9fYcYsRnJ-o6_XSLO7gAzlD9XQCLcBGAs/s400/event-based-data-model.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Event-based data model from HURIDOCS Source:
<!--[if gte mso 9]><xml>
<w:LatentStyles DefLockedState="false" DefUnhideWhenUsed="false"
DefSemiHidden="false" DefQFormat="false" DefPriority="99"
LatentStyleCount="376">
<w:LsdException Locked="false" Priority="46"
Name="List Table 1 Light Accent 6"/>
<w:LsdException Locked="false" Priority="47" Name="List Table 2 Accent 6"/>
<w:LsdException Locked="false" Priority="48" Name="List Table 3 Accent 6"/>
<w:LsdException Locked="false" Priority="49" Name="List Table 4 Accent 6"/>
<w:LsdException Locked="false" Priority="50" Name="List Table 5 Dark Accent 6"/>
<w:LsdException Locked="false" Priority="51"
Name="List Table 6 Colorful Accent 6"/>
<w:LsdException Locked="false" Priority="52"
Name="List Table 7 Colorful Accent 6"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Mention"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Smart Hyperlink"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Hashtag"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Unresolved Mention"/>
<w:LsdException Locked="false" SemiHidden="true" UnhideWhenUsed="true"
Name="Smart Link"/>
</w:LatentStyles>
</xml><![endif]-->
<br />
<div class="MsoCaption" style="line-height: 115%; margin-left: 18.0pt; text-align: justify; text-justify: inter-ideograph;">
<span style="font-family: "cambria" , serif; font-size: 10.0pt; line-height: 115%;">: </span><a href="https://openevsys.org/the-methodology-designing-formats-and-data-consistency/"><span style="font-family: "cambria" , serif; font-size: 10.0pt; line-height: 115%;">https://openevsys.org/the-methodology-designing-formats-and-data-consistency/</span></a><span style="font-family: "cambria" , serif; font-size: 10.0pt; line-height: 115%;"> </span></div>
</td></tr>
</tbody></table>
<div style="text-align: justify;">
Another area of research is the use of NLP techniques to reconcile information on violations against journalists from different databases and sources. Such methods would allow partial and contradictory data about the elements constituting a given attack on a journalist to be assessed and compiled. ‘<i>By creating a central categorisation scheme we would essentially be able to facilitate the mapping and pooling of data from various sources into one data source, thus creating a monitoring infrastructure for SDG 16.10.1</i>’, said Diana Maynard. Data on a range of violations against journalists, gathered in a methodologically systematic and transparent way, would also help to address issues of information validity and source verification: ‘<i>Ultimately such data would facilitate the investigation of patterns, trends and early warnings, leading to a better understanding of the contexts in which threats to journalists can escalate into a killing undertaken with impunity</i>’. We thus propose a framework for mapping between different datasets and event categorisation schemes in order to harmonise information.</div>
</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<a href="https://3.bp.blogspot.com/-VhJV1YJjDCg/XNRPdGmqvUI/AAAAAAAAAN0/smi2hJucQz4vN_zvOVI8ikCPC4w5EuFQACLcBGAs/s1600/grand-scheme-categorisation-incident.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" data-original-height="650" data-original-width="1350" height="307" src="https://3.bp.blogspot.com/-VhJV1YJjDCg/XNRPdGmqvUI/AAAAAAAAAN0/smi2hJucQz4vN_zvOVI8ikCPC4w5EuFQACLcBGAs/s640/grand-scheme-categorisation-incident.png" width="640" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<div style="text-align: justify;">
<br /></div>
<div>
<div style="text-align: justify;">
In our proposed methodology, GATE tools can be used to extract information from the free-text portions of existing databases and link it to external knowledge sources, in order to acquire more detailed information about an event and to enable semantic reasoning about entities and events. This helps both to reconcile information at different levels of granularity (e.g. Dublin vs Ireland; shooting vs killing) and to structure information for further search and analysis. </div>
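<div style="text-align: justify;">
As a rough illustration of what reconciling granularity could involve, the Python sketch below treats two values as compatible when they are equal or when one is a broader term for the other. The <code>BROADER</code> table and the helper functions are hypothetical, hand-written for this example; in the proposed methodology that knowledge would come from external knowledge sources rather than a hard-coded dictionary.</div>

```python
# Hypothetical "broader term" table, hand-written for illustration only.
BROADER = {
    "Dublin": "Ireland",
    "Ireland": "Europe",
    "shooting": "killing",  # the example pair used in the text above
}


def ancestors(term):
    """Return the term followed by all of its broader terms."""
    chain = [term]
    while chain[-1] in BROADER:
        chain.append(BROADER[chain[-1]])
    return chain


def is_compatible(a, b):
    """True if a and b are equal or one is a broader term for the other."""
    return a in ancestors(b) or b in ancestors(a)


def most_specific(a, b):
    """Return the finer-grained of two compatible values, else None."""
    if not is_compatible(a, b):
        return None
    return a if b in ancestors(a) else b


print(most_specific("Ireland", "Dublin"))
print(most_specific("Dublin", "killing"))
```

<div style="text-align: justify;">
Two records that give "Dublin" and "Ireland" as the location would then be treated as consistent, with the more specific value kept, while an unrelated pair is flagged for manual review.</div>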
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<a href="https://4.bp.blogspot.com/-dL-Li50SQn8/XNQ9DuTTjaI/AAAAAAAAANQ/xdi7FLjnuNkaIGjGLz4MbMXAqC-BIUbMACLcBGAs/s1600/free-text-annotation.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em; text-align: center;"><img border="0" data-original-height="519" data-original-width="1600" height="206" src="https://4.bp.blogspot.com/-dL-Li50SQn8/XNQ9DuTTjaI/AAAAAAAAANQ/xdi7FLjnuNkaIGjGLz4MbMXAqC-BIUbMACLcBGAs/s640/free-text-annotation.png" width="640" /></a></div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
Slides from the presentation are available <a href="https://www.slideshare.net/dianamaynard/methodological-possibilities-for-strengthening-the-monitoring-of-sdg-indicator-16101" target="_blank">here</a>; the full journal paper is forthcoming.<br />
The original article from which this post is adapted is available on the <a href="http://www.cfom.org.uk/2019/05/06/ground-breaking-cfom-research-strengthening-the-monitoring-of-sdg-16-10-1/" target="_blank">CFOM website</a>. </div>
</div>
Posted by Diana Maynard<br />
<h3>WeVerify: Algorithm-Supported Verification of Digital Content</h3>
Published 17 April 2019; last updated 7 June 2019<br />
<div>
<div bgcolor="#FFFFFF" style="background-color: white;" text="#000000">
Announcing WeVerify: a new project developing AI-based tools for computer-supported digital content verification. The WeVerify platform will provide an independent, community-driven environment for verifying online content, designed to help journalists gather and verify online content quickly. Prof. Kalina Bontcheva will serve as the Scientific Director of the project.<br />
<br />
<i>Online disinformation and fake media content have emerged as a
serious threat to democracy, economy and society. Content verification is currently far from trivial, even for
experienced journalists, human rights activists or media literacy scholars. Moreover, recent advances in artificial intelligence
(deep learning) have enabled the creation of intelligent bots and highly realistic synthetic multimedia content. Consequently, it is
extremely challenging for citizens and journalists to assess the credibility of online content, and to navigate the highly
complex online information landscapes.</i><br />
<i><br />
WeVerify aims to address the complex content verification
challenges through a participatory verification approach, open source algorithms, low-overhead human-in-the-loop machine learning
and intuitive visualizations. Social media and web content will be analysed and contextualised within the broader
online ecosystem, in order to expose fabricated content, through cross-modal content verification, social network analysis,
micro-targeted debunking and a blockchain-based public database of known fakes.</i><br />
<i><br /></i>
<br />
<div bgcolor="#FFFFFF" text="#000000">
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://weverify.eu/wp-content/uploads/2018/12/Screenshot-2018-12-03-at-16.02.01-1024x638.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="398" src="https://weverify.eu/wp-content/uploads/2018/12/Screenshot-2018-12-03-at-16.02.01-1024x638.png" width="640" /></a></td></tr>
</tbody></table>
<i>A key outcome will be the WeVerify platform for collaborative,
decentralised content verification, tracking, and debunking.</i><br />
<i><br />
The platform will be open source to engage communities and citizen
journalists alongside newsroom and freelance journalists. To enable low-overhead integration with in-house
content management systems and support more advanced newsroom needs, a premium version of the platform will also be
offered. It will be furthermore supplemented by a digital companion to assist with verification tasks.</i></div>
<i><br /></i></div>
</div>
<div>
<div bgcolor="#FFFFFF" style="background-color: white;" text="#000000">
<i>Results will be validated by professional journalists and
debunking specialists from project partners (DW, AFP, DisinfoLab), external participants (e.g. members of the First Draft News
network), the community of more than 2,700 users of the InVID verification plugin, and by media literacy, human rights and
emergency response organisations.</i><br />
<br />
The WeVerify website can be found at <a href="https://weverify.eu/">https://weverify.eu/</a>, and WeVerify can be found on Twitter <a href="https://twitter.com/WeV3rify">@WeV3rify</a>!</div>
</div>
<h3>Coming Up: 12th GATE Summer School 17-21 June 2019</h3>
Published 11 March 2019; last updated 14 September 2021<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://gate.ac.uk/g8/page/show/2/sale/images/blog/fig12.jpeg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="563" data-original-width="1600" src="https://gate.ac.uk/g8/page/show/2/sale/images/blog/fig12.jpeg" width="640" /></a></div>
It is approaching that time of the year again! The GATE training course will be held on 17-21 June 2019 at the University of Sheffield, UK.<br />
<br />
No previous experience or programming expertise is necessary, so it's suitable for anyone with an interest in text mining and using GATE, including people from humanities backgrounds, social sciences, etc.<br />
<br />
This event will follow a similar format to that of the 2018 course, with one track Monday to Thursday, and two parallel tracks on Friday, all delivered by the GATE development team. You can read more about it and register <a href="https://gate.ac.uk/conferences/fig/fig12.html">here</a>. Early bird registration is available at a discounted rate until 1 May.<br />
<br />
The focus will be on mining text and social media content with GATE. Many of the hands-on exercises will focus on analysing news articles, tweets, and other textual content.<br />
<br />
The planned schedule is as follows (NOTE: may still be subject to timetabling changes).<br />
Single track from Monday to Thursday (9am - 5pm):<br />
<ul>
<li>Monday: Module 1: Basic Information Extraction with GATE
<ul>
<li>Intro to GATE + Information Extraction (IE)</li>
<li>Corpus Annotation and Evaluation</li>
<li>Writing Information Extraction Patterns with JAPE</li>
</ul>
</li>
<li>Tuesday: Module 2: Using GATE for social media analysis
<ul>
<li>Challenges for analysing social media, GATE for social media</li>
<li>Twitter intro + JSON structure</li>
<li>Language identification, tokenisation for Twitter</li>
<li>POS tagging and Information Extraction for Twitter</li>
</ul>
</li>
<li>Wednesday: Module 3: Crowdsourcing, GATE Cloud/MIMIR, and Machine Learning
<ul>
<li>Crowdsourcing annotated social media content with the GATE crowdsourcing plugin</li>
<li>GATE Cloud, deploying your own IE pipeline at scale (how to process 5 million tweets in 30 mins)</li>
<li>GATE Mimir - how to index and search semantically annotated social media streams</li>
<li>Challenges of opinion mining in social media</li>
<li>Training Machine Learning Models for IE in GATE</li>
</ul>
</li>
<li>Thursday: Module 4: Advanced IE and Opinion Mining in GATE
<ul>
<li>Advanced Information Extraction</li>
<li>Useful GATE components (plugins)</li>
<li>Opinion mining components and applications in GATE</li>
</ul>
</li>
</ul>
On Friday, there is a choice of modules (9am - 5pm):<br />
<ul>
<li>Module 5: GATE for developers
<ul>
<li>Basic GATE Embedded</li>
<li>Writing your own plugin</li>
<li>GATE in production - multi-threading, web applications, etc.</li>
</ul>
</li>
<li>Module 6: GATE Applications
<ul>
<li>Building your own applications</li>
<li>Examples of some current GATE applications: social media summarisation, visualisation, Linked Open Data for IE, and more</li>
</ul>
</li>
</ul>
These two modules are run in parallel, so you can only attend one of them. You will need to have some programming experience and knowledge of Java to follow Module 5 on the Friday. No particular expertise is needed for Module 6.<br />
Hope to see you in Sheffield in June!<br />
<h3>Python: using ANNIE via its web API</h3>
Published 7 March 2019<br />
<a href="https://cloud.gate.ac.uk/">GATE Cloud</a> is GATE, the world-leading text-analytics platform, made available on the web with both human user interfaces and programmatic ones.<br />
<br />
My name is David Jones and part of my role is to make it easier for you to use GATE. This article is aimed at Python programmers and people who are, rightly, curious to see if Python can help with their text analysis work.<br />
<br />
GATE Cloud exposes a web API for many of its services. In this article, I'm going to sketch an example in Python that uses the GATE Cloud API to call <a href="https://cloud.gate.ac.uk/shopfront/displayItem/annie-named-entity-recognizer">ANNIE, the English Named Entity Recognizer</a>.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://4.bp.blogspot.com/-gnTgLNUifFg/XH-odbGyGMI/AAAAAAAAAFM/3AiHl37dRnM2ezgrS40a-s4Iw4L5FxOBgCLcBGAs/s1600/annie.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="499" data-original-width="755" height="211" src="https://4.bp.blogspot.com/-gnTgLNUifFg/XH-odbGyGMI/AAAAAAAAAFM/3AiHl37dRnM2ezgrS40a-s4Iw4L5FxOBgCLcBGAs/s320/annie.png" width="320" /></a></div>
<br />
<br />
I'm writing in <a href="https://www.python.org/">Python 3</a> using the really excellent <a href="https://pypi.org/project/requests/">requests library</a>.<br />
<br />
The <a href="https://cloud.gate.ac.uk/info/help/online-api.html">GATE Cloud API documentation</a> describes the general outline of using the API, which is that you make an HTTP request setting particular headers.<br />
<br />
The full code that I'm using is <a href="https://github.com/GateNLP/gate-cloud-python-example">available on GitHub</a> and is installable and runnable.<br />
<br />
A simple use is to pass text to ANNIE and get annotated results back.<br />
In terms of Python:<br />
<br />
<blockquote class="tr_bq">
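<span style="font-family: "courier new" , "courier" , monospace;"> import requests</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> # url holds the endpoint of the ANNIE service on GATE Cloud</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> text = "David Jones joined the University of Sheffield this year"</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> headers = {'Content-Type': 'text/plain'}</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> response = requests.post(url, data=text, headers=headers)</span></blockquote>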
<br />
The <span style="font-family: "courier new" , "courier" , monospace;">Content-Type</span> header is required and specifies the MIME type of the text we are sending. In this case it's <span style="font-family: "courier new" , "courier" , monospace;">text/plain</span> but GATE Cloud supports many types including PDF, HTML, XML, and Twitter's JSON format; details are in the <a href="https://cloud.gate.ac.uk/info/help/online-api.html">GATE Cloud API documentation</a>.<br />
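<br />
When the input is a file rather than a string, the header value has to match the file. The sketch below is one hedged way to pick it, using only the Python standard library so that it runs without network access: it guesses the MIME type from the file extension and builds, but does not send, a POST request. <code>EXAMPLE_URL</code>, <code>content_type_for</code> and <code>build_request</code> are hypothetical names for this illustration, and whether GATE Cloud accepts a particular guessed type still needs checking against the API documentation.<br />
<br />

```python
import mimetypes
import urllib.request

# Placeholder endpoint: an assumption for illustration, not a real GATE Cloud URL.
EXAMPLE_URL = "https://example.invalid/annotate"


def content_type_for(filename, default="text/plain"):
    """Guess a MIME type from the file extension, falling back to a default."""
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or default


def build_request(url, body, filename):
    """Build (but do not send) a POST request with an explicit Content-Type."""
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": content_type_for(filename)},
        method="POST",
    )


request = build_request(EXAMPLE_URL, b"%PDF-1.4 ...", "report.pdf")
# urllib stores header names capitalised as "Content-type".
print(request.get_method(), request.get_header("Content-type"))
```

<br />
With <span style="font-family: "courier new" , "courier" , monospace;">requests</span>, the same value would simply go into the <span style="font-family: "courier new" , "courier" , monospace;">headers</span> dictionary passed to <span style="font-family: "courier new" , "courier" , monospace;">requests.post()</span>.<br />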
<br />
The default output is JSON and in this case once I've used Python's <span style="font-family: "courier new" , "courier" , monospace;">json.dumps(thing, indent=2)</span> to format it nicely, it looks like this: <br />
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace;">{<br /> "text": "David Jones joined the University of Sheffield this year",<br /> "entities": {<br /> "Date": [<br /> {<br /> "indices": [<br /> 47,<br /> 56<br /> ],<br /> "rule": "ModifierDate",<br /> "ruleFinal": "DateOnlyFinal",<br /> "kind": "date"<br /> }<br /> ],<br /> "Organization": [<br /> {<br /> "indices": [<br /> 23,<br /> 46<br /> ],<br /> "orgType": "university",<br /> "rule": "GazOrganization",<br /> "ruleFinal": "OrgFinal"<br /> }<br /> ],<br /> "Person": [<br /> {<br /> "indices": [<br /> 0,<br /> 11<br /> ],<br /> "firstName": "David",<br /> "gender": "male",<br /> "surname": "Jones",<br /> "kind": "fullName",<br /> "rule": "PersonFull",<br /> "ruleFinal": "PersonFinal"<br /> }<br /> ]<br /> }<br />}</span></blockquote>
The JSON returned here is designed to have a similar structure to the format used by Twitter: <a href="https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/entities-object">Tweet JSON</a>. The outermost dictionary has a <span style="font-family: "courier new" , "courier" , monospace;">text</span> key and an <span style="font-family: "courier new" , "courier" , monospace;">entities</span> key. The <span style="font-family: "courier new" , "courier" , monospace;">entities</span> object is a dictionary that contains arrays of annotations of different types; each annotation being a dictionary with an <span style="font-family: "courier new" , "courier" , monospace;">indices</span> key and other metadata. I find this kind of thing is impossible to describe and impossible to work with until I have an example and half-working code in front of me.<br />
<br />
The full Python example uses this code to unpick the annotations and display their type and text:<br />
<br />
<blockquote>
<span style="font-family: "courier new" , "courier" , monospace;"> gate_json = response.json()<br /> response_text = gate_json["text"]<br /> for annotation_type, annotations in gate_json["entities"].items():<br /> &nbsp;&nbsp;&nbsp;&nbsp;for annotation in annotations:<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;i, j = annotation["indices"]<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;print(annotation_type, ":", response_text[i:j])</span></blockquote>
<br />
With the text I gave above, I get this output:<br />
<blockquote class="tr_bq">
<span style="font-family: "courier new" , "courier" , monospace;">Date : this year<br />Organization : University of Sheffield<br />Person : David Jones</span></blockquote>
We can see that ANNIE has correctly picked out a date, an organisation, and a person, from the text. It's worth noting that the JSON output has more detail that I'm not using in this example: "University of Sheffield" is identified as a university; "David Jones" is identified with the gender "male".<br />
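<br />
To get at that extra detail, a minimal sketch along the following lines could be used. It runs offline against part of the sample response shown earlier (the Date entry is left out for brevity). The <code>describe</code> helper and the <code>BOOKKEEPING</code> set are hypothetical names, and treating <code>rule</code> and <code>ruleFinal</code> as bookkeeping rather than entity features is an assumption made for this sketch.<br />
<br />

```python
# Part of the sample response shown earlier (Date entry omitted for brevity).
gate_json = {
    "text": "David Jones joined the University of Sheffield this year",
    "entities": {
        "Organization": [
            {"indices": [23, 46], "orgType": "university",
             "rule": "GazOrganization", "ruleFinal": "OrgFinal"}
        ],
        "Person": [
            {"indices": [0, 11], "firstName": "David", "gender": "male",
             "surname": "Jones", "kind": "fullName",
             "rule": "PersonFull", "ruleFinal": "PersonFinal"}
        ],
    },
}

# Keys that give position or rule provenance rather than describing the entity.
# Treating "rule" and "ruleFinal" this way is an assumption of this sketch.
BOOKKEEPING = {"indices", "rule", "ruleFinal"}


def describe(gate_json):
    """Yield (type, text, features) for every annotation in the response."""
    text = gate_json["text"]
    for annotation_type, annotations in gate_json["entities"].items():
        for annotation in annotations:
            i, j = annotation["indices"]
            features = {k: v for k, v in annotation.items()
                        if k not in BOOKKEEPING}
            yield annotation_type, text[i:j], features


for annotation_type, span, features in describe(gate_json):
    print(annotation_type, ":", span, features)
```
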
<br />
<h4>
Some notes on programming</h4>
<ul>
<li><span style="font-family: "courier new" , "courier" , monospace;">requests</span> is nice.</li>
<li>The <span style="font-family: "courier new" , "courier" , monospace;">Content-Type</span> header is required.</li>
<li><span style="font-family: "courier new" , "courier" , monospace;">requests</span> has a <span style="font-family: "courier new" , "courier" , monospace;">response.json()</span> method which is a shortcut for parsing the JSON into Python objects.</li>
<li>the JSON response has a <span style="font-family: "courier new" , "courier" , monospace;">text</span> field, which is the text that was analysed (in my example they are the same, but for PDF we need the linear text so that we can unambiguously assign index values within it).</li>
<li>the JSON response has an <span style="font-family: "courier new" , "courier" , monospace;">entities</span> field, which is where all the annotations are, first separated and keyed by their annotation type. </li>
<li>the indices returned in the JSON are 0-based end-exclusive which matches the Python string slicing convention, hence we can use <span style="font-family: "courier new" , "courier" , monospace;">response_text[i:j]</span> to get the correct piece of text.</li>
</ul>
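Because the indices line up with Python slicing, the annotations can also be written back into the text. The sketch below is one way to do that, assuming the annotated spans do not overlap; <code>mark_up</code> is a hypothetical helper and the input is the sample response from earlier, trimmed to its indices.<br />
<br />

```python
def mark_up(text, entities):
    """Wrap each annotated span as [Type: span], assuming spans do not overlap."""
    spans = [
        (annotation["indices"][0], annotation["indices"][1], annotation_type)
        for annotation_type, annotations in entities.items()
        for annotation in annotations
    ]
    # Work right-to-left so earlier offsets stay valid as the text grows.
    for start, end, annotation_type in sorted(spans, reverse=True):
        text = (text[:start] + "[" + annotation_type + ": "
                + text[start:end] + "]" + text[end:])
    return text


text = "David Jones joined the University of Sheffield this year"
entities = {
    "Date": [{"indices": [47, 56]}],
    "Organization": [{"indices": [23, 46]}],
    "Person": [{"indices": [0, 11]}],
}
print(mark_up(text, entities))
```

<br />
Processing the spans from the end of the text backwards means that inserting markup never shifts the offsets of spans that have yet to be handled.<br />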
<h4>
Quota and API keys</h4>
<br />
The public service has a fairly limited quota, but if you <a href="https://cloud.gate.ac.uk/register/index">create an account on GATE Cloud</a> you can create an API key which will allow you to access the service with increased quota and fewer limits.<br />
<br />
To use your API key, use HTTP basic authentication, passing in the Key ID as the <i>user-id</i> and the API key password as the <i>password</i>. <span style="font-family: "courier new" , "courier" , monospace;">requests</span> makes this pretty simple, as you can supply <span style="font-family: "courier new" , "courier" , monospace;">auth=(user, password)</span> as an additional keyword argument to <span style="font-family: "courier new" , "courier" , monospace;">requests.post()</span>. Possibly even simpler though is to put those values in your <span style="font-family: "courier new" , "courier" , monospace;">~/.netrc</span> file (<span style="font-family: "courier new" , "courier" , monospace;">_netrc</span> in Windows):<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;"> machine cloud-api.gate.ac.uk<br /> login 71rs93h36m0c<br /> password 9u8ki81lstfc2z8qjlae</span><br />
<br />
The nice thing about this is that <span style="font-family: "courier new" , "courier" , monospace;">requests</span> will find and use these values automatically without you having to write any code.<br />
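<br />
For the curious, the sketch below shows roughly what happens behind the scenes, using only the Python standard library: it looks up a host in a netrc-format file and builds the HTTP basic authentication header from the result. It is an illustration rather than how <span style="font-family: "courier new" , "courier" , monospace;">requests</span> is implemented; the helper names are hypothetical, and the key ID and password are placeholders written to a throwaway file.<br />
<br />

```python
import base64
import netrc
import os
import tempfile


def basic_auth_header(user_id, password):
    """Return the Authorization header value for HTTP basic authentication."""
    raw = f"{user_id}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def credentials_from_netrc(path, host):
    """Look up (login, password) for a host in a netrc-format file."""
    entry = netrc.netrc(path).authenticators(host)
    if entry is None:
        raise KeyError(host)
    login, _account, password = entry
    return login, password


# Write a throwaway netrc file with placeholder values (not real credentials).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "netrc")
    with open(path, "w") as handle:
        handle.write(
            "machine cloud-api.gate.ac.uk\n"
            "  login my-key-id\n"
            "  password my-key-password\n"
        )
    user_id, password = credentials_from_netrc(path, "cloud-api.gate.ac.uk")

print(basic_auth_header(user_id, password))
```
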
<br />
Go <a href="https://github.com/GateNLP/gate-cloud-python-example">try using the web API now</a>, and let us know how you get on!<br />
<h3>Brexit--The Regional Divide</h3>
Published 5 March 2019; last updated 20 September 2021<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://gate.ac.uk/g8/page/show/2/sale/images/blog/referendum.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" data-original-height="761" data-original-width="550" src="https://gate.ac.uk/g8/page/show/2/sale/images/blog/referendum.png" width="170" /><br />
Referendum result</a></div>
Although the UK voted by a narrow margin to leave the EU in the 2016 EU membership referendum, that outcome failed to capture the diverse feelings held in various regions. It's a curious observation that the UK regions with the most economic dependence on the EU were the regions more likely to vote to leave it. The image below on the right is taken from <a href="https://www.cer.eu/insights/brexiting-yourself-foot-why-britains-eurosceptic-regions-have-most-lose-eu-withdrawal">this article</a> from the Centre for European Reform, and makes the point in a few different ways. This and similar research inspired <a href="https://www.thebritishacademy.ac.uk/projects/uk-international-challenges-17-social-understandings-scale-role-print-and-social-media-eu-referendum-debate">a current project</a> the GATE team are undertaking with colleagues in the Geography and Journalism departments at Sheffield University, under the leadership of <a href="https://www.sheffield.ac.uk/geography/staff/jaun_miguel_kanai">Miguel Kanai</a> and with funding from the British Academy, aiming to understand <b>whether a lack of awareness of individual local situations played a role in the referendum outcome</b>.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://gate.ac.uk/g8/page/show/2/sale/images/blog/chart1_econintegration_1.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="668" data-original-width="800" src="https://gate.ac.uk/g8/page/show/2/sale/images/blog/chart1_econintegration_1.jpg" width="350" /></a></div>
Our Brexit tweet corpus contains tweets collected during the run-up to the Brexit referendum, and we've annotated almost half a million accounts for Brexit vote intent with high accuracy. You can read about that <a href="https://link.springer.com/chapter/10.1007/978-3-030-01129-1_17">here</a>. So we thought we'd be well positioned to bring some insights. We also annotated user accounts with location: many Twitter users volunteer that information, though there can be a lot of variation in how people describe their location, so that was harder to do accurately. We also used local and national news media corpora from the time of the referendum, in order to contrast national coverage with local issues around the country.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://gate.ac.uk/g8/page/show/2/sale/images/blog/topic-histo.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;">
Topic representation in different media<br /><img border="0" data-original-height="391" data-original-width="672" src="https://gate.ac.uk/g8/page/show/2/sale/images/blog/topic-histo.png" width="380" />
</a></div>
<table border="1px" cellpadding="5px"><tbody>
<tr><td><b><i>"People's resistance to propaganda and media‐promoted ideas derives from their close ties in real communities"</i></b><br />
<a href="https://onlinelibrary.wiley.com/doi/full/10.1111/1467-923X.12296">Jean Seaton</a></td></tr>
</tbody></table>
Using topic modelling and named entity recognition, we were able to look for similarities and differences in the focus of local and national media and Twitter users. The bar chart on the left gets us started, illustrating that foci differ between media. Twitter users give more air time than news media to trade and immigration, whereas local press takes the lead on employment, local politics and agriculture. National press gives more space to terrorism than either Twitter or local news.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://gate.ac.uk/g8/page/show/2/sale/images/blog/entities-terrorism-immigration-agriculture.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="767" data-original-width="1600" height="192" src="https://gate.ac.uk/g8/page/show/2/sale/images/blog/entities-terrorism-immigration-agriculture.png" width="400" /><br />
NER diff between national and local press</a></div>
On the right is just one of many graphs in which we unpack this on a region-by-region basis (you can find more on the <a href="http://services.gate.ac.uk/politics/ba-brexit/">project website</a>). In this choropleth, red indicates that the topic was significantly more discussed in national press than in local press in that area, and green indicates that the topic was significantly more discussed in local press there than in national press. Terrorism and immigration have perhaps been subject to a certain degree of media and propaganda inflation--we talk about this in <a href="https://link.springer.com/chapter/10.1007/978-3-030-01129-1_17">our Social Informatics paper</a>. Where media focus on locally relevant issues, foci are more grounded, for example in practical topics such as agriculture and employment. <b>We found that across the regions, Twitter remainers showed a closer congruence with local press than Twitter leavers.</b><br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://gate.ac.uk/g8/page/show/2/sale/images/blog/comp-survey-twitter-readership.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="364" data-original-width="674" height="173" src="https://gate.ac.uk/g8/page/show/2/sale/images/blog/comp-survey-twitter-readership.png" width="320" /></a></div>
The graph on the right shows the number of times a newspaper was linked on Twitter, contrasted against the percentage of people that said they read that newspaper in the <a href="https://www.britishelectionstudy.com/">British Election Study</a>. It shows that the dynamics of popularity on Twitter are very different to traditional readership. This highlights a need to understand how the online environment is affecting the news reportage we are exposed to, creating a market for a different kind of material, and a potentially more hostile climate for quality journalism, as discussed by project advisor Prof. Jackie Harrison <a href="https://inforrm.org/2018/05/03/fake-news-has-always-existed-but-quality-journalism-has-a-history-of-survival-jackie-harrison/">here</a>. Furthermore, local press are increasingly <a href="https://www.theguardian.com/media/2018/feb/06/decline-of-local-journalism-threatens-democracy-says-may">struggling to survive</a>, so it feels important to highlight their value through this work.<br />
You can see more choropleths on the <a href="http://services.gate.ac.uk/politics/ba-brexit/">project website</a>. There's also an extended version <a href="https://arxiv.org/pdf/1902.06521.pdf">here</a> of an article currently under review.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-6188464480883238284.post-20592827435040471842019-02-20T17:12:00.000+00:002019-03-20T09:19:51.508+00:00GATE team wins first prize in the Hyperpartisan News Detection Challenge<div>
<div style="text-align: justify;">
<span style="font-family: "arial" , "helvetica" , sans-serif;">SemEval 2019 recently launched the Hyperpartisan News Detection Task in order to evaluate how well tools could automatically classify hyperpartisan news texts. The idea behind this is that "<i>given a news text, the system must decide whether it follows a hyperpartisan argumentation, i.e. whether it exhibits blind, prejudiced, or unreasoning allegiance to one party, faction, cause, or person.</i>" </span></div>
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<br />
<div style="text-align: justify;">
<span style="font-family: "arial" , "helvetica" , sans-serif;">Below are excerpts from two news stories about Donald Trump from the challenge data. The one on the left is considered hyperpartisan, as it presents a clearly biased viewpoint. The one on the right simply reports a story and is not considered hyperpartisan. The distinction is difficult even for humans, because there are no exact rules about what makes a story hyperpartisan.</span></div>
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://3.bp.blogspot.com/-wnOkPbwoiNs/XG12dNBPdTI/AAAAAAAAAKg/O6kBdk3QhMw2jmq4dylqkQCi37Q384F3wCLcBGAs/s1600/Picture1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="416" data-original-width="1156" height="228" src="https://3.bp.blogspot.com/-wnOkPbwoiNs/XG12dNBPdTI/AAAAAAAAAKg/O6kBdk3QhMw2jmq4dylqkQCi37Q384F3wCLcBGAs/s640/Picture1.png" width="640" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<br />
<div style="text-align: justify;">
<span style="font-family: "arial" , "helvetica" , sans-serif;">In total, 322 teams registered to take part, of which 42 actually submitted an entry, including the GATE team consisting of Ye Jiang, Xingyi Song and Johann Petrak, with guidance from Kalina Bontcheva and Diana Maynard.</span></div>
<br />
<br />
<div style="text-align: justify;">
<span style="font-family: "arial" , "helvetica" , sans-serif;"><span style="font-family: "arial" , "helvetica" , sans-serif;">The main performance measure for the task is accuracy on a balanced set of articles; precision, recall and F1-score were additionally measured for the hyperpartisan class. </span><span style="font-family: "arial" , "helvetica" , sans-serif;">In the final submission, the GATE team's hyperpartisan classifier achieved <span style="background-color: #f8f8f8; color: #222222; text-align: right;">0.822 accuracy on the manually annotated evaluation set, and <a href="https://pan.webis.de/semeval19/semeval19-web/leaderboard.html" target="_blank">ranked first on the final leaderboard</a>.</span></span></span></div>
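<div style="text-align: justify;">
<span style="font-family: "arial" , "helvetica" , sans-serif;">For readers less familiar with these measures, here is a minimal sketch of how accuracy and the per-class precision, recall and F1-score can be computed from gold and predicted labels. The function name and the toy labels are illustrative assumptions; this is not the official task evaluation code.</span></div>

```python
def binary_metrics(gold, predicted, positive=True):
    """Return accuracy plus precision, recall and F1 for the `positive` class."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    pairs = list(zip(gold, predicted))
    # Count true positives, false positives and false negatives.
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    correct = sum(1 for g, p in pairs if g == p)
    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy example: True marks an article labelled hyperpartisan.
gold = [True, True, False, False]
predicted = [True, False, False, True]
scores = binary_metrics(gold, predicted)
```

<div style="text-align: justify;">
<span style="font-family: "arial" , "helvetica" , sans-serif;">On a balanced evaluation set, a classifier that always predicts the same class scores 0.5 accuracy, which is the baseline against which the leaderboard figures should be read.</span></div>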
</div>
<div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://2.bp.blogspot.com/-OektflXtFAA/XG2Jzm-bjoI/AAAAAAAAAKs/5LbnUdvGqAQ9g6G-xdtw1pO3b2RPjZR5ACLcBGAs/s1600/Screenshot%2B2019-02-20%2B17.09.04.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="369" data-original-width="832" height="283" src="https://2.bp.blogspot.com/-OektflXtFAA/XG2Jzm-bjoI/AAAAAAAAAKs/5LbnUdvGqAQ9g6G-xdtw1pO3b2RPjZR5ACLcBGAs/s640/Screenshot%2B2019-02-20%2B17.09.04.png" width="640" /></a></div>
<br /></div>
<div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div style="text-align: justify;">
<span style="font-family: "arial" , "helvetica" , sans-serif;">Our winning system represents each sentence as the average of its word embeddings, generated by the pre-trained ELMo model, and feeds these sentence representations to a Convolutional Neural Network with Batch Normalization, trained on the provided dataset. An averaged ensemble of models was then used to generate the final predictions. </span><br />
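<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif;">To make the ensembling step concrete, here is a minimal sketch of one common way to average an ensemble: each model outputs a probability that an article is hyperpartisan, the probabilities are averaged per article, and the average is thresholded. The function name, the 0.5 threshold and the toy scores are assumptions made for illustration; see the linked repository for what the actual system does.</span><br />

```python
def ensemble_predict(model_probabilities, threshold=0.5):
    """Average per-article probabilities across models, then threshold.

    model_probabilities: one list of probabilities per model, all of the
    same length (one entry per article). Hypothetical input format.
    """
    n_models = len(model_probabilities)
    # zip(*...) groups the scores that different models gave the same article.
    averaged = [sum(scores) / n_models for scores in zip(*model_probabilities)]
    return [score >= threshold for score in averaged]


# Three toy models scoring the same three articles.
probabilities = [
    [0.9, 0.2, 0.6],
    [0.7, 0.4, 0.3],
    [0.8, 0.1, 0.5],
]
labels = ensemble_predict(probabilities)
```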
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif;"><span style="text-align: start;">The source code and full system description are available </span><a href="https://github.com/GateNLP/semeval2019-hyperpartisan-bertha-von-suttner" style="text-align: start;" target="_blank">on GitHub</a><span style="text-align: start;">.</span></span><br />
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif;">One of the major challenges of this task is that the model must be able to adapt to a wide range of article lengths. Most state-of-the-art neural network approaches for document classification use a token sequence as network input, but in this case such an approach would mean either a massive computational cost or a loss of information, depending on how the maximum sequence length is set. We got around this problem by first pre-calculating sentence-level embeddings as the average of the word embeddings for each sentence, and then representing the document as a sequence of these sentence embeddings. We also found that ignoring some of the provided training data (which was labelled automatically, based on the document's publishing source) actually improved our results, which leads to important conclusions about the trustworthiness of training data and its implications.</span><br />
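<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif;">The document representation described above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the code from our repository: <code>embed_word</code> is a hypothetical stand-in for a pre-trained embedding lookup (ELMo itself embeds tokens in the context of their sentence), and <code>max_sentences</code> is an assumed cap on document length.</span><br />

```python
def average_vectors(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def document_matrix(sentences, embed_word, dim, max_sentences):
    """Represent a document as a fixed-length sequence of sentence embeddings.

    sentences: list of sentences, each a list of tokens.
    embed_word: hypothetical function mapping a token to a list of `dim` floats.
    """
    rows = []
    for tokens in sentences[:max_sentences]:  # truncate very long documents
        if tokens:
            rows.append(average_vectors([embed_word(t) for t in tokens]))
        else:
            rows.append([0.0] * dim)  # empty sentence becomes a zero vector
    # Pad short documents with zero vectors so every document has the same shape.
    rows.extend([0.0] * dim for _ in range(max_sentences - len(rows)))
    return rows


# Toy embedding, purely for demonstration: (token length, constant 1.0).
def toy_embedding(token):
    return [float(len(token)), 1.0]


doc = [["a", "bbb"], ["cc"]]
matrix = document_matrix(doc, toy_embedding, dim=2, max_sentences=3)
```

<span style="font-family: "arial" , "helvetica" , sans-serif;">The network input then has one row per sentence rather than one per token, so its length grows with the number of sentences in an article instead of the number of words.</span><br />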
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif;">Overall, doing well on the hyperpartisan news prediction task is important both for improving knowledge about neural networks for language processing generally, and because a better understanding of the nature of biased news is critical for society and democracy.</span></div>
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif;"></span><br />
<br /></div>
<div>
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span></div>
<div>
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span></div>
<div>
<br /></div>
Xingyi Songhttp://www.blogger.com/profile/04132189359238936416noreply@blogger.com0tag:blogger.com,1999:blog-6188464480883238284.post-56127055958764237462019-02-18T12:19:00.001+00:002021-09-23T00:59:37.293+01:00Russian Troll Factory: Sketches of a Propaganda Campaign<div class="separator" style="clear: both; text-align: center;">
<a href="https://gate.ac.uk/g8/page/print/2/ira/network/#" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="759" data-original-width="840" height="578" src="https://gate.ac.uk/g8/page/show/2/sale/images/blog/account-networks.png" width="640" /></a></div>
When Twitter shared <a href="https://about.twitter.com/en_us/values/elections-integrity.html#data">a large archive of propaganda tweets</a> late in 2018 we were excited to get access to over 9 million tweets from almost 4 thousand unique Twitter accounts controlled by Russia's <a href="https://en.wikipedia.org/wiki/Internet_Research_Agency">Internet Research Agency</a>. The tweets are posted in 57 different languages, but most are in Russian (53.68%) and English (36.08%). Average account age is around four years, and the oldest accounts are as much as ten years old.<br />
A large amount of activity in both the English and Russian accounts is devoted to <b>news</b> provision. Secondly, many accounts seem to engage in <b>hashtag games</b>, which may be a way to establish an account and get some followers. Of particular interest, however, are the political trolls. <b>Left trolls</b> pose as individuals interested in the Black Lives Matter campaign. <b>Right trolls</b> are patriotic, anti-immigration Trump supporters. Among left and right trolls, several have achieved large follower numbers and even <a href="https://www.theguardian.com/technology/shortcuts/2017/nov/03/jenna-abrams-the-trump-loving-twitter-star-who-never-really-existed">a degree of fame</a>. Finally, there are <b>fearmonger</b> trolls, which propagate scares, and a small number of <b>commercial</b> trolls. The Russian-language accounts divide along similar lines, perhaps posing as individuals with opinions about Ukraine or western politics. These categories were proposed by <a href="https://www.wired.com/story/twitters-dated-data-dump-doesnt-tell-us-about-future-meddling/">Darren Linvill and Patrick Warren</a>, from Clemson University. In the word clouds below you can see the hashtags we found left and right trolls using.<br />
<table>
<tbody>
<tr>
<td align="center"><a href="https://gate.ac.uk/g8/page/show/2/sale/images/blog/left-troll-hashtags.png" imageanchor="1"><img border="0" data-original-height="322" data-original-width="512" src="https://gate.ac.uk/g8/page/show/2/sale/images/blog/left-troll-hashtags.png" width="320" /></a><br />
<b>Left Troll Hashtags</b>
</td>
<td align="center"><a href="https://gate.ac.uk/g8/page/show/2/sale/images/blog/right-troll-hashtags.png" imageanchor="1"><img border="0" data-original-height="608" data-original-width="936" src="https://gate.ac.uk/g8/page/show/2/sale/images/blog/right-troll-hashtags.png" width="320" /></a><br />
<b>Right Troll Hashtags</b>
</td>
</tr>
</tbody></table>
Mehmet E. Bakir has created some interactive graphs enabling us to explore the data. In the network diagram at the start of the post you can see the network of mention/retweet/reply/quote counts we created from the highly followed accounts in the set. You can <a href="https://gate.ac.uk/g8/page/print/2/ira/network/#">click through</a> to an interactive version, where you can zoom in and explore different troll types.<br />
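For readers curious about how such a network can be assembled, here is a minimal sketch that turns a list of interactions into weighted, directed edges. The <code>(source, target, kind)</code> record format and the function name are assumptions made for illustration, not the format of the Twitter archive or the code behind the interactive graph.<br />

```python
from collections import Counter


def interaction_counts(interactions, kinds=("mention", "retweet", "reply", "quote")):
    """Count directed interactions between accounts.

    interactions: iterable of (source, target, kind) tuples (hypothetical format).
    Returns a Counter mapping (source, target) to the number of interactions,
    which can serve as the edge weight in a network diagram.
    """
    edges = Counter()
    for source, target, kind in interactions:
        if kind in kinds:  # ignore any other kind of record
            edges[(source, target)] += 1
    return edges


# Toy data with made-up account names.
records = [
    ("account_a", "account_b", "retweet"),
    ("account_a", "account_b", "mention"),
    ("account_b", "account_a", "reply"),
    ("account_c", "account_a", "like"),
]
edges = interaction_counts(records)
```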
In the graph below, you can see activity in different languages over time (interactive version <a href="https://gate.ac.uk/ira/daily_tweets.html">here</a>, or interact with the embedded version below; you may have to scroll right). It shows that the Russian language operation came first, with English language operations following after. The timing of this part of the activity coincides with Russia's interest in Ukraine.<br />
<iframe height="550" src="https://gate.ac.uk/g8/page/print/2/ira/daily_tweets.html" width="100%"> </iframe>
<br />
In the graph below, also available <a href="https://gate.ac.uk/ira/retweet_counts_excluding_from_trolls.html">here</a>, you can see how different types of behavioural strategy pay off in terms of achieving higher numbers of retweets. Using Linvill and Warren's manually annotated data, Mehmet built a classifier that enabled us to classify all the accounts in the dataset. It is evident that the political trolls have by far the greatest impact in terms of retweets achieved, with left trolls being the most successful. Russia's interest in the Black Lives Matter campaign perhaps suggests that the first challenge for agents is to win a following, and that exploiting divisions in society is an effective way to do that. How that following is then used to influence minds is a separate question. You can see a pre-print of our paper describing our work so far, in the context of the broader picture of partisanship, propaganda and post-truth politics, <a href="https://arxiv.org/abs/1902.01752">here</a>.<br />
<iframe height="550" src="https://gate.ac.uk/g8/page/print/2/ira/retweet_counts_excluding_from_trolls.html" width="100%"> </iframe>
Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-6188464480883238284.post-73764973080575688162019-02-08T15:40:00.001+00:002019-02-08T15:48:11.976+00:00Teaching computers to understand the sentiment of tweets<span style="color: #505050; font-family: "arial" , "helvetica" , sans-serif; text-align: justify;">As part of the </span><a href="http://sobigdata.eu/" style="font-family: arial, helvetica, sans-serif; text-align: justify;">EU SoBigData</a><span style="color: #505050; font-family: "arial" , "helvetica" , sans-serif; text-align: justify;"> project, the GATE team hosts a number of short research visits, between 2 weeks and 2 months, for all kinds of data scientists (PhD students, researchers, </span><span style="color: #505050; font-family: "arial" , "helvetica" , sans-serif; text-align: justify;">academics, professionals) to come and work with us and to use our tools and/or datasets on a project involving text mining and social media analysis. <a href="https://www.linkedin.com/in/kristoffer-stensbo-smidt/" target="_blank">Kristoffer Stensbo-Smidt</a> </span><span style="color: #505050; font-family: "arial" , "helvetica" , sans-serif; text-align: justify;">visited us in the summer of 2018 from the University of Copenhagen, to work on developing machine learning tools for sentiment analysis of tweets, and was supervised by GATE team member <a href="http://staffwww.dcs.shef.ac.uk/people/D.Maynard/" target="_blank">Diana Maynard</a> and by former team member <a href="https://twitter.com/IAugenstein" target="_blank">Isabelle Augenstein</a>, who is now at the University of Copenhagen. Kristoffer has a background in Machine Learning but had not worked in NLP before, so this visit helped him understand how to apply his skills to this kind of domain.</span><br />
<span style="color: #505050; font-family: "arial" , "helvetica" , sans-serif; text-align: justify;"><br /></span>
<span style="color: #505050; font-family: "arial" , "helvetica" , sans-serif; text-align: justify;">After his visit, Kristoffer wrote up an excellent <a href="https://towardsdatascience.com/making-computers-understand-the-sentiment-of-tweets-1271ab270bc7" target="_blank">summary of his research</a>. He essentially tested a number of different approaches to processing text, and analysed how much of the sentiment they were able to identify. Given a tweet and an associated topic, the aim is to ascertain automatically whether the sentiment expressed about this topic is positive, negative or neutral. Kristoffer experimented with different word embedding-based models in order to test how much information different word embeddings carry about the sentiment of a tweet. This involved choosing which embedding models to test, and how to transform the topic vectors. The main conclusions he drew from the work were that, in general, word embeddings contain a lot of useful information about sentiment, with newer embeddings containing significantly more. This is not particularly surprising, but shows the importance of advanced models for this task.</span><br />
<span style="color: #505050; font-family: "arial" , "helvetica" , sans-serif; text-align: justify;"><br /></span>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://4.bp.blogspot.com/-4jiukLxRG8I/XF2iCjLatrI/AAAAAAAAAJo/qAk1BQS3Xv4T6cV6SFku-884DgWWUNBrgCLcBGAs/s1600/1_VlCUcwHnoZVVWPR437a5MQ.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="412" data-original-width="800" height="329" src="https://4.bp.blogspot.com/-4jiukLxRG8I/XF2iCjLatrI/AAAAAAAAAJo/qAk1BQS3Xv4T6cV6SFku-884DgWWUNBrgCLcBGAs/s640/1_VlCUcwHnoZVVWPR437a5MQ.png" width="640" /></a></div>
<span style="color: #505050; font-family: "arial" , "helvetica" , sans-serif; text-align: justify;"><br /></span>
Diana Maynardhttp://www.blogger.com/profile/10115059373361509161noreply@blogger.com0tag:blogger.com,1999:blog-6188464480883238284.post-92189078721551534182019-02-08T14:04:00.000+00:002019-03-31T18:11:14.899+01:003rd International Workshop on Rumours and Deception in Social Media (RDSM)<div data-mce-style="text-align: center;" style="text-align: center;">
<span data-mce-style="color: #939393;" style="color: #939393; font-family: "arial" , "helvetica" , sans-serif;">June 11, 2019 in Munich, Germany<br />
Collocated with ICWSM<a data-mce-href="https://www.icwsm.org/2019/" href="https://www.icwsm.org/2019/">'2019</a></span></div>
<h2 align="justify" data-mce-style="text-align: center;" lang="en-AU" style="text-align: center;">
<span data-mce-style="color: #000000;" style="color: black;"><span data-mce-style="font-family: Times New Roman, serif;" style="font-family: "times new roman" , serif;"><span data-mce-style="font-size: large;" style="font-size: medium;"><b><span data-mce-style="font-family: Helvetica, serif;" style="font-family: "arial" , "helvetica" , sans-serif;">Abstract</span></b></span></span></span></h2>
<span data-mce-style="color: #000000;" style="color: black;"><span data-mce-style="font-family: Helvetica, serif;" style="font-family: "helvetica" , serif;"><span data-mce-style="font-size: medium;" style="font-family: "arial" , "helvetica" , sans-serif; font-size: small;">The 3rd edition of the RDSM workshop will particularly focus on online information disorder and its interplay with public opinion formation.</span></span></span><br />
<br />
<span style="font-family: "arial" , "helvetica" , sans-serif;">Social media is a valuable resource for mining all kinds of information, from opinions to factual information. However, social media also harbours issues that pose serious threats to society. Online information disorder and its power to shape public opinion are foremost among those issues. Known aspects include the spread of false rumours, fake news and even social attacks such as hate speech or other forms of harmful social posts. In this workshop the aim is to bring together researchers and practitioners interested in social media mining and analysis to deal with the emerging issues of information disorder and manipulation of public opinion. The focus of the workshop will be on themes such as the detection of fake news, verification of rumours and the understanding of their impact on public opinion. Furthermore, we aim to put great emphasis on the usefulness and trust aspects of automated solutions tackling the aforementioned themes.</span><br />
<h2 align="left" data-mce-style="text-align: center;" lang="en-AU" style="text-align: center;">
<span data-mce-style="color: #000000;" style="color: black;"><span data-mce-style="font-family: Times New Roman, serif;" style="font-family: "times new roman" , serif;"><span data-mce-style="font-size: large;" style="font-size: medium;"><b><span data-mce-style="font-family: Helvetica, serif;" style="font-family: "arial" , "helvetica" , sans-serif;">Workshop Theme and Topics</span></b></span></span></span></h2>
<span style="font-family: "arial" , "helvetica" , sans-serif;">The aim of this workshop is to bring together researchers and practitioners interested in social media mining and analysis to deal with the emerging issues of veracity assessment, fake news detection and manipulation of public opinion. We invite researchers and practitioners to submit papers reporting results on these issues. Qualitative studies performing user studies on the challenges encountered with the use of social media, such as the veracity of information and fake news detection, as well as papers reporting new data sets are also welcome. Finally, we also welcome studies reporting the usefulness and trust of social media tools tackling the aforementioned problems.</span><br />
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<br />
<h2 align="left" data-mce-style="text-align: center;" lang="en-AU" style="text-align: center;">
<span data-mce-style="font-size: medium;" style="font-size: small;"><b><span data-mce-style="color: #000000;" style="color: black;"><span data-mce-style="font-family: Helvetica, serif;" style="font-family: "arial" , "helvetica" , sans-serif;">Topics of interest include, but are not limited to:</span></span></b></span></h2>
<ul>
<li><span style="font-family: "arial" , "helvetica" , sans-serif;">Detection and tracking of rumours.</span></li>
<li><span style="font-family: "arial" , "helvetica" , sans-serif;">Rumour veracity classification.</span></li>
<li><span style="font-family: "arial" , "helvetica" , sans-serif;">Fact-checking social media.</span></li>
<li><span style="font-family: "arial" , "helvetica" , sans-serif;">Detection and analysis of disinformation, hoaxes and fake news.</span></li>
<li><span style="font-family: "arial" , "helvetica" , sans-serif;">Stance detection in social media.</span></li>
<li><span style="font-family: "arial" , "helvetica" , sans-serif;">Qualitative user studies assessing the use of social media.</span></li>
<li><span style="font-family: "arial" , "helvetica" , sans-serif;">Bot detection in social media.</span></li>
<li><span style="font-family: "arial" , "helvetica" , sans-serif;">Measuring public opinion through social media.</span></li>
<li><span style="font-family: "arial" , "helvetica" , sans-serif;">Assessing the impact of social media on public opinion.</span></li>
<li><span style="font-family: "arial" , "helvetica" , sans-serif;">Political analyses of social media.</span></li>
<li><span style="font-family: "arial" , "helvetica" , sans-serif;">Real-time social media mining.</span></li>
<li><span style="font-family: "arial" , "helvetica" , sans-serif;">NLP for social media analysis.</span></li>
<li><span style="font-family: "arial" , "helvetica" , sans-serif;">Network analysis and diffusion of dis/misinformation.</span></li>
<li><span style="font-family: "arial" , "helvetica" , sans-serif;">Usefulness and trust analysis of social media tools.</span></li>
<li><span style="font-family: "arial" , "helvetica" , sans-serif;">AI-generated fake content (image / text).</span></li>
</ul>
<br />
<div style="text-align: center;">
<span data-mce-style="color: #000000;" style="color: black;"><span data-mce-style="font-family: Times New Roman, serif;" style="font-family: "times new roman" , serif;"><span data-mce-style="font-size: large;" style="font-size: medium;"><b><span data-mce-style="font-family: Helvetica, serif;" style="font-family: "arial" , "helvetica" , sans-serif;">Workshop Program Format</span></b></span></span></span></div>
<br />
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif;">We will have 1-2 experts in the field delivering keynote speeches. We will then have a set of 8-10 presentations of peer-reviewed submissions, organised into 3 sessions by subject (the first two sessions about online information disorder and public opinion, and the third session about the usefulness and trust aspects). After the sessions we also plan to have group work (in groups of 4-5 attendees), where each group will sketch a social media tool for tackling e.g. rumour verification, fake news detection, etc. The emphasis of the sketch should be on aspects like usefulness and trust. This should take no longer than 120 minutes (sketching, presentation/discussion time). We will close the workshop with a summary and take-home messages (max. 15 minutes). Attendance will be open to all interested participants.</span><br />
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif;">We welcome both full papers (5-8 pages) to be presented as oral talks and short papers (2-4 pages) to be presented as posters and demos.</span><br />
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<span data-mce-style="font-size: medium;" style="font-size: small;"><span data-mce-style="font-family: Times New Roman, serif;" style="font-family: "times new roman" , serif;"><span data-mce-style="font-size: large;" style="font-size: medium;"><b><span data-mce-style="font-family: Helvetica, serif;" style="font-family: "arial" , "helvetica" , sans-serif;">Workshop Schedule/Important Dates</span></b></span></span></span><br />
<ul>
<li><span style="font-family: "arial" , "helvetica" , sans-serif;">Submission deadline: April 1st 2019</span></li>
<li><span style="font-family: "arial" , "helvetica" , sans-serif;">Notification of Acceptance: April 15th 2019 </span></li>
<li><span style="font-family: "arial" , "helvetica" , sans-serif;">Camera-Ready Versions Due: April 26th 2019</span></li>
<li><span style="font-family: "arial" , "helvetica" , sans-serif;">Workshop date: June 11, 2019 </span></li>
</ul>
<h3 data-mce-style="line-height: 1.15; margin-top: 10pt; margin-bottom: 0pt; text-align: center;" style="line-height: 1.15; margin-bottom: 0pt; margin-top: 10pt; text-align: center;">
<b><span data-mce-style="font-size: 16px; font-family: 'Trebuchet MS'; color: #000000; background-color: transparent; font-weight: bold; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline;" style="background-color: transparent; color: black; font-family: "arial" , "helvetica" , sans-serif; font-size: 16px; font-style: normal; font-variant: normal; font-weight: bold; text-decoration: none; vertical-align: baseline;"> </span></b></h3>
<h3 data-mce-style="line-height: 1.15; margin-top: 10pt; margin-bottom: 0pt; text-align: center;" style="line-height: 1.15; margin-bottom: 0pt; margin-top: 10pt; text-align: center;">
<b><span data-mce-style="font-size: 16px; font-family: 'Trebuchet MS'; color: #000000; background-color: transparent; font-weight: bold; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline;" style="background-color: transparent; color: black; font-family: "arial" , "helvetica" , sans-serif; font-size: 16px; font-style: normal; font-variant: normal; font-weight: bold; text-decoration: none; vertical-align: baseline;">Submission Procedure</span></b></h3>
<br />
<div dir="ltr" id="docs-internal-guid-0ce6c617-7fff-f307-fe53-dc25c535a2bf" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 10pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;">We invite two kinds of submissions:</span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 10pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;"><br /></span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 10pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;"><br /></span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 10pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;">- Long papers/</span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 10pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;"><span id="docs-internal-guid-41ba8e38-7fff-c12a-9297-52d52bb8575e" style="background-color: transparent; color: black; font-family: "arial"; font-size: 10pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;">Brief Research Report</span> (max 8 pages + 2 references)</span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 10pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;"><br /></span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 10pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: 
baseline; white-space: pre;">- Demos and posters (short papers) (max 4 pages + 2 references)</span></div>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 10pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;">Proceedings of the workshop will be published jointly with other ICWSM workshops in a special </span><br />
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 10pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;">issue of Frontiers in Big Data.</span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 10pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;"></span><br />
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 10pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;"></span><br />
<br />
<div dir="ltr" id="docs-internal-guid-6bafd11b-7fff-31da-78a9-7ff2b2efa31d" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 10pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;">Papers must be submitted electronically in PDF format or any format that is supported by the </span><br />
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 10pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;">submission site through </span><a href="https://www.frontiersin.org/research-topics/9706" style="text-decoration: none;"><span style="background-color: transparent; color: blue; font-family: "arial"; font-size: 10pt; font-style: normal; font-variant: normal; font-weight: 700; text-decoration: underline; vertical-align: baseline; white-space: pre;">https://www.frontiersin.org/research-topics/9706</span></a><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 10pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;"> (click on "Submit your manuscript"). </span><br />
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 10pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;">Note that submitting authors should choose one of the specific track organizers as their preferred Editor.</span></div>
<br /></div>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 10pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;">You can find detailed information on the file submission requirements here: </span><br />
<a href="https://www.frontiersin.org/about/author-guidelines#FileRequirements" style="text-decoration: none;"><span style="background-color: transparent; color: blue; font-family: "arial"; font-size: 10pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: underline; vertical-align: baseline; white-space: pre;">https://www.frontiersin.org/about/author-guidelines#FileRequirements</span></a></div>
<br />
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 10pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;">Submissions will be peer-reviewed by at least three members of the programme </span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 10pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;"><br /></span><span style="background-color: transparent; color: black; font-family: "arial"; font-size: 10pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;">committee. The accepted papers will appear in the proceedings published at </span><br />
<span style="background-color: transparent; color: black; font-family: "arial"; font-size: 10pt; font-style: normal; font-variant: normal; font-weight: 400; text-decoration: none; vertical-align: baseline; white-space: pre;"> </span><a href="https://www.frontiersin.org/research-topics/9706" style="text-decoration: none;"><span style="background-color: transparent; color: blue; font-family: "arial"; font-size: 10pt; font-style: normal; font-variant: normal; font-weight: 700; text-decoration: underline; vertical-align: baseline; white-space: pre;">https://www.frontiersin.org/research-topics/9706</span></a></div>
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<br />
<h2 align="left" data-mce-style="text-align: center;" lang="en-AU" style="text-align: center;">
<span data-mce-style="color: #000000;" style="color: black;"><span data-mce-style="font-family: Times New Roman, serif;" style="font-family: "times new roman" , serif;"><span data-mce-style="font-size: large;" style="font-size: medium;"><b><span data-mce-style="font-family: Helvetica, serif;" style="font-family: "arial" , "helvetica" , sans-serif;">Workshop Organizers<br />
</span></b></span></span></span></h2>
<ul>
<li lang="en-AU"><span data-mce-style="color: #000000;" style="color: black;"><span data-mce-style="font-family: Helvetica, serif;" style="font-family: "helvetica" , serif;"><span data-mce-style="font-size: medium;" style="font-family: "arial" , "helvetica" , sans-serif; font-size: small;">Ahmet Aker, University of Duisburg-Essen, Germany; University of Sheffield, UK<br />
<span data-mce-style="color: #0000ff;" style="color: blue;"><span lang="zxx"><u><a data-mce-href="mailto:a.aker@is.inf.uni-due.de" href="mailto:a.aker@is.inf.uni-due.de">a.aker@is.inf.uni-due.de</a></u></span></span></span></span></span></li>
<li lang="en-AU"><span data-mce-style="color: #000000;" style="color: black; font-family: "arial" , "helvetica" , sans-serif;"> <span data-mce-style="font-family: Helvetica, serif;" style="font-family: "helvetica" , serif;"><span data-mce-style="font-size: medium;" style="font-size: small;">Arkaitz Zubiaga, Queen Mary University of London, UK<br />
<span style="color: blue;"><u><a href="mailto:arkaitz@zubiaga.org">arkaitz@zubiaga.org</a></u></span></span></span></span></li>
<li lang="en-AU"><span data-mce-style="color: #000000;" style="color: black; font-family: "arial" , "helvetica" , sans-serif;"> <span data-mce-style="font-family: Helvetica, serif;" style="font-family: "helvetica" , serif;"><span data-mce-style="font-size: medium;" style="font-size: small;">Kalina Bontcheva, University of Sheffield, UK<br />
<span data-mce-style="color: #0000ff;" style="color: blue;"><span lang="zxx"><u><a data-mce-href="mailto:k.bontcheva@sheffield.ac.uk" href="mailto:k.bontcheva@sheffield.ac.uk">k.bontcheva@sheffield.ac.uk</a></u></span></span></span></span></span></li>
<li lang="en-AU"><span data-mce-style="color: #000000;" style="color: black; font-family: "arial" , "helvetica" , sans-serif;"> <span data-mce-style="font-family: Helvetica, serif;" style="font-family: "helvetica" , serif;"><span data-mce-style="font-size: medium;" style="font-size: small;">Maria Liakata, University of Warwick and Alan Turing Institute, UK<br />
<span data-mce-style="color: #0000ff;" style="color: blue;"><span lang="zxx"><u><a data-mce-href="mailto:m.liakata@warwick.ac.uk" href="mailto:m.liakata@warwick.ac.uk">m.liakata@warwick.ac.uk</a></u></span></span></span></span></span></li>
<li lang="en-AU"><span style="font-family: "arial" , "helvetica" , sans-serif;"><span data-mce-style="color: #000000;" style="color: black;"> <span data-mce-style="font-family: Helvetica, serif;" style="font-family: "helvetica" , serif;"><span data-mce-style="font-size: medium;" style="font-size: small;">Rob Procter, University of Warwick and Alan Turing Institute, UK<br />
<span data-mce-style="color: #0000ff;" style="color: blue;"><span lang="zxx"><u><a data-mce-href="mailto:rob.procter@warwick.ac.uk" href="mailto:rob.procter@warwick.ac.uk">rob.procter@warwick.ac.uk</a></u></span></span></span></span></span></span></li>
<li lang="en-AU"><span style="font-family: "arial" , "helvetica" , sans-serif;"><span data-mce-style="color: #000000;" style="color: black;"> <span data-mce-style="font-family: Helvetica, serif;" style="font-family: "helvetica" , serif;"><span data-mce-style="font-size: medium;" style="font-size: small;">Symeon Papadopoulos, Centre for Research and Technology Hellas, Greece<br />
<span data-mce-style="color: #0000ff;" style="color: blue;"><span lang="zxx"><u><a data-mce-href="mailto:papadop@iti.gr" href="mailto:papadop@iti.gr">papadop@iti.gr</a></u></span></span></span></span></span></span></li>
</ul>
<h2 align="justify" data-mce-style="text-align: center;" lang="en-AU" style="text-align: center;">
<b><span data-mce-style="font-size: 16px; font-family: 'Trebuchet MS'; color: #000000; background-color: transparent; font-weight: bold; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline;" style="background-color: transparent; color: black; font-family: "arial" , "helvetica" , sans-serif; font-size: 16px; font-style: normal; font-variant: normal; font-weight: bold; text-decoration: none; vertical-align: baseline;">Programme Committee (Tentative)<br />
</span></b></h2>
<ul>
<li><span data-mce-style="color: #000000;" style="color: black;">Nikolas Aletras, University of Sheffield, UK<span data-mce-style="font-family: Helvetica, serif;" style="font-family: "helvetica" , serif;"><span data-mce-style="font-size: medium;" style="font-family: "arial" , "helvetica" , sans-serif; font-size: small;"> </span></span></span></li>
<li><div>
<span style="font-family: "arial" , "helvetica" , sans-serif;">Emilio Ferrara, University of Southern California, USA</span></div>
</li>
<li><span data-mce-style="color: #000000;" style="color: black;"><span data-mce-style="font-family: Helvetica, serif;" style="font-family: "helvetica" , serif;"><span data-mce-style="font-size: medium;" style="font-family: "arial" , "helvetica" , sans-serif; font-size: small;">Bahareh Heravi, University College Dublin, Ireland</span></span></span></li>
<li><span data-mce-style="color: #000000;" style="color: black;"><span data-mce-style="font-family: Helvetica, serif;" style="font-family: "helvetica" , serif;"><span data-mce-style="font-size: medium;" style="font-family: "arial" , "helvetica" , sans-serif; font-size: small;">Petya Osenova, Ontotext, Bulgaria</span></span></span></li>
<li><span data-mce-style="color: #000000;" style="color: black;"><span data-mce-style="font-family: Helvetica, serif;" style="font-family: "helvetica" , serif;"><span data-mce-style="font-size: medium;" style="font-family: "arial" , "helvetica" , sans-serif; font-size: small;">Damiano Spina, RMIT University, Australia</span></span></span></li>
<li><span data-mce-style="color: #000000;" style="color: black;"><span data-mce-style="font-family: Helvetica, serif;" style="font-family: "helvetica" , serif;"><span data-mce-style="font-size: medium;" style="font-family: "arial" , "helvetica" , sans-serif; font-size: small;">Peter Tolmie, Universität Siegen, Germany</span></span></span></li>
<li><span style="font-family: "arial" , "helvetica" , sans-serif;">Marcos Zampieri, University of Wolverhampton, UK</span></li>
<li><span style="font-family: "arial" , "helvetica" , sans-serif;">Milad Mirbabaie, University of Duisburg-Essen, Germany</span></li>
<li><span style="font-family: "arial" , "helvetica" , sans-serif;">Tobias Hecking, University of Duisburg-Essen, Germany </span></li>
<li><span style="font-family: "arial" , "helvetica" , sans-serif;">Kareem Darwish, QCRI, Qatar</span></li>
<li><span style="font-family: "arial" , "helvetica" , sans-serif;">Hassan Sajjad, QCRI, Qatar</span></li>
<li><span style="font-family: "arial" , "helvetica" , sans-serif;">Sumithra Velupillai, King's College London, UK </span><div>
</div>
</li>
</ul>
<h2 align="justify" data-mce-style="text-align: center;" lang="en-AU" style="text-align: center;">
<b><span style="font-family: "arial" , "helvetica" , sans-serif;">Invited Speaker(s)<br />
</span></b></h2>
<span style="font-family: "arial" , "helvetica" , sans-serif;">To be announced</span><br />
<h2 align="justify" data-mce-style="text-align: center;" lang="en-AU" style="text-align: center;">
<b><span style="font-family: "arial" , "helvetica" , sans-serif;">Sponsors</span></b></h2>
<div>
<span style="font-family: "arial" , "helvetica" , sans-serif;">This workshop is supported by the European Union under grant agreement No. 654024, SoBigData.</span></div>
<div>
</div>
<a data-mce-href="https://www.pheme.eu/wp-content/uploads/2018/05/logo-SoBigData-DEFINITIVO.png" href="https://www.pheme.eu/wp-content/uploads/2018/05/logo-SoBigData-DEFINITIVO.png"><span style="font-family: "arial" , "helvetica" , sans-serif;"><img alt="" class="alignnone wp-image-508" data-mce-src="https://www.pheme.eu/wp-content/uploads/2018/05/logo-SoBigData-DEFINITIVO.png" height="212" src="https://www.pheme.eu/wp-content/uploads/2018/05/logo-SoBigData-DEFINITIVO.png" width="598" /> </span></a><br />
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif;">It is also supported by WeVerify, an EU co-funded Horizon 2020 project on algorithm-supported verification of digital content. </span><br />
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<br />
<div style="text-align: center;">
<a href="https://weverify.eu/"><span style="font-family: "arial" , "helvetica" , sans-serif;"><img alt="WeVerify" height="209" src="https://weverify.eu/wp-content/uploads/2018/12/weverify-logo-1.png" width="400" /></span></a> </div>
Unknownnoreply@blogger.comtag:blogger.com,1999:blog-6188464480883238284.post-10157264150869937382019-01-21T14:42:00.000+00:002019-01-22T16:40:10.463+00:00 SoBigData funded travel grant for short-term visiting Scholar<div class="p1">
As part of SoBigData's Transnational Access (TNA) activities, the Department of Computer Science at the University of Sheffield is keen to host scholars from non-UK universities who would like to visit Sheffield for a short period of research, as part of a scheme to promote international cooperation and the dissemination of knowledge. Grants are available to cover one to two months of research for scholars at non-UK universities or organisations. During the visit, scholars will join one of the following research projects:<br />
<div>
<br /></div>
</div>
<div class="p1">
<b> • Social media part of speech tagging in multiple languages</b></div>
<div class="p1">
— Part of Speech is one of the most widely used linguistic features to analyse social media content. The project aims to build models to tag social media content with the universal POS tag set.</div>
<div class="p2">
<br /></div>
<div class="p1">
<b> • Social media named entity recognition in multiple languages</b></div>
<div class="p1">
— Named entities are generally presented differently in social media than in news articles, so NER systems trained on news articles tend to perform poorly on social media. The aim of this project is to build NER models for social media in different European languages.</div>
<div class="p2">
<br /></div>
<div class="p1">
<b> • Sentiment Analysis for Twitter posts</b></div>
<div class="p1">
— Sentiment analysis is one of the basic components used to analyse societal debates. This project aims to build a sentiment analysis model based on short and noisy Twitter posts.</div>
<div class="p2">
<br /></div>
<div class="p2">
<br /></div>
<div class="p1">
<b> What is covered (</b>up to 4500 euros<b>):</b></div>
<div class="p1">
Return flight/train tickets to Sheffield</div>
<div class="p1">
Accommodation during the visiting period</div>
<div class="p1">
Daily subsistence</div>
<div class="p1">
GATE Summer School</div>
<div class="p1">
Mentor from GATE members </div>
<div class="p2">
<br /></div>
<div class="p1">
<b> Deadlines:</b></div>
<div class="p1">
Application before: 30 March 2019</div>
<div class="p1">
Notification: within 2 months after submission</div>
<div class="p2">
<br /></div>
<div class="p2">
<br /></div>
<div class="p1">
<b> Eligibility Requirements:</b></div>
<div class="p1">
Candidates must:<br />
• have a PhD degree or be enrolled in a doctoral programme offered by an educational institution recognised by that country’s authorities<br />
• not be enrolled as a student at, or have worked in, a higher education institution in the United Kingdom<br />
• resume their studies or work in their home country after the end of the grant period<br />
<div>
<br /></div>
</div>
<div class="p1">
<b> How to apply:</b></div>
<div class="p3">
<span class="s1">Applicants should apply through SoBigData TransNationalAccess (<a href="http://www.sobigdata.eu/content/open-call-sobigdata-funded-transnational-access"><span class="s2">http://www.sobigdata.eu/content/open-call-sobigdata-funded-transnational-access</span></a>)</span></div>
<style type="text/css"> p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 12.0px 'Helvetica Neue'} p.p2 {margin: 0.0px 0.0px 0.0px 0.0px; font: 12.0px 'Helvetica Neue'; min-height: 14.0px} p.p3 {margin: 0.0px 0.0px 0.0px 0.0px; font: 12.0px 'Helvetica Neue'; color: #dca10d} span.s1 {color: #000000} span.s2 {text-decoration: underline} </style>
<br />
<div class="p3">
<span class="s1">Submit the completed application form (<a href="http://www.sobigdata.eu/sites/default/files/SoBigData%20TNA%202018-c%20Application%20form.doc"><span class="s2">http://www.sobigdata.eu/sites/default/files/SoBigData%20TNA%202018-c%20Application%20form.doc</span></a>) to <span class="s2"><a href="mailto:ta-admin@sobigdata.eu">ta-admin@sobigdata.eu</a></span></span></div>
<div class="p3">
<span class="s1"><br /></span></div>
<div class="p3">
<span class="s1">For any questions about the projects, please contact Xingyi Song (x.song@sheffield.ac.uk).</span></div>
Xingyi Songhttp://www.blogger.com/profile/04132189359238936416noreply@blogger.com0tag:blogger.com,1999:blog-6188464480883238284.post-35227070046934883312018-12-17T08:24:00.000+00:002019-03-31T18:05:19.178+01:00Open Call for SoBigData-funded Transnational Access!<span style="background-color: white; color: #83939c; font-family: "raleway" , "arial" , "helvetica" , sans-serif; font-size: 14px;">The <a href="http://sobigdata.eu/content/open-call-sobigdata-funded-transnational-access">SoBigData project</a> invites researchers and professionals to apply to participate in Short-Term Scientific Missions (STSMs) to carry forward their own big data projects. <b>The Natural Language Processing (NLP) group at the University of Sheffield are taking part in this initiative and invite all applications.</b></span><br />
<span style="background-color: white; color: #83939c; font-family: "raleway" , "arial" , "helvetica" , sans-serif; font-size: 14px;"><br /></span>
<span style="background-color: white; color: #83939c; font-family: "raleway" , "arial" , "helvetica" , sans-serif; font-size: 14px;">Funding is available for STSMs (2 weeks to 2 months) of up to 4500 euros, covering daily subsistence, accommodation and flights. These bursaries are awarded on a competitive basis.</span><br />
<span style="background-color: white; color: #83939c; font-family: "raleway" , "arial" , "helvetica" , sans-serif; font-size: 14px;"><br /></span>
<span style="background-color: white; color: #83939c; font-family: "raleway" , "arial" , "helvetica" , sans-serif; font-size: 14px;">Research areas are varied but include studies involving societal debate, online misinformation and rumour analysis. A key topic is <b>analysis of social media and newspaper articles</b> to understand the <b>state of public debate</b> in terms of <b>what </b>is being discussed, <b>how </b>it is being discussed, <b>who </b>is discussing it, and how this discussion is being <b>influenced</b>. The effects of <b>online disinformation campaigns</b> (especially <b>hyper-partisan </b>content) and the use of <b>bot accounts</b> to perpetrate this disinformation are also of particular interest.</span><br />
<span style="background-color: white; color: #83939c; font-family: "raleway" , "arial" , "helvetica" , sans-serif; font-size: 14px;"><br /></span>
<span style="background-color: white; color: #83939c; font-family: "raleway" , "arial" , "helvetica" , sans-serif; font-size: 14px;">Applications are welcomed for visits between 1 November 2018 and 31 July 2019!</span><br />
<span style="background-color: white; color: #83939c; font-family: "raleway" , "arial" , "helvetica" , sans-serif; font-size: 14px;"><br /></span>
<span style="background-color: white; color: #83939c; font-family: "raleway" , "arial" , "helvetica" , sans-serif; font-size: 14px;">For specific details, eligibility criteria, and to apply, click <a href="http://sobigdata.eu/content/open-call-sobigdata-funded-transnational-access">here</a>!</span><br />
<span style="background-color: white; color: #83939c; font-family: "raleway" , "arial" , "helvetica" , sans-serif; font-size: 14px;"><br /></span>Unknownnoreply@blogger.com0