The Sleepy President – Ib Gulbrandsen on “Trumps Tweets”

Recently the chief rat of the DigitalMediaLab was featured as an expert on the DR2 series “Trumps Tweets”. Having analysed and catalogued every tweet written by Trump in the first 100 days of his presidency, Ib describes the daily rhythm of the president and how it can be seen as influencing his Twitter habits.

You can watch the full episode here:

https://www.dr.dk/drtv/se/trumps-tweets_-den-soevnige-praesident_173970

Primaries and Super Tuesday

March 3 was an important day for the Democratic presidential candidates: it was the day on which more than 33 percent of the delegates were to be allocated.

In total, 1,357 of the 3,979 available delegates were to be allocated, and with only five presidential candidates left to compete for them, it was an exciting contest to follow.

Before Super Tuesday got under way, Pete Buttigieg and Amy Klobuchar chose to drop out of the race to become the Democratic Party's presidential candidate. That left only five candidates for Super Tuesday: Bernie Sanders, Joe Biden, Elizabeth Warren, Michael Bloomberg and Tulsi Gabbard. Since Super Tuesday, two more have dropped out: Elizabeth Warren and Michael Bloomberg.

Digital Media Lab has therefore followed Super Tuesday closely, and has once again looked at which of the presidential candidates taking part in Super Tuesday were mentioned most often. This time we did so over a period of three days: the day before Super Tuesday (2 March), the day itself (3 March) and the day after (4 March). Note that we have used the EST time zone (Eastern Standard Time) as our reference. We will do the same for future data collections concerning the Democratic Party's selection of its final presidential candidate. In addition, Donald J. Trump was mentioned 16,458 times.

In total, 501,024 tweets were collected during the period.

We have chosen to group the collected data into four time intervals per day, so that the development over each day shows most clearly in the chart. The intervals are 00:00 to 06:00, 06:00 to 12:00, 12:00 to 18:00 and 18:00 to 00:00. The graph looks like this:

The chart shows the number of @mentions for the hashtag #supertuesday.
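The grouping into four six-hour intervals per day, counted in EST, can be sketched in a few lines of Python. This is only an illustration under our own assumptions, not the script used for the chart: the function names and the sample timestamps are hypothetical, and EST is modelled as a fixed UTC-5 offset.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Assumption: EST as a fixed UTC-5 offset (no daylight saving time applied).
EST = timezone(timedelta(hours=-5), name="EST")

def six_hour_bin(created_at_utc):
    """Return a label such as '2020-03-03 06:00-12:00' for an aware UTC timestamp."""
    local = created_at_utc.astimezone(EST)
    start = (local.hour // 6) * 6   # 0, 6, 12 or 18
    end = (start + 6) % 24          # the 18:00 interval ends at 00:00
    return f"{local:%Y-%m-%d} {start:02d}:00-{end:02d}:00"

def count_per_bin(timestamps):
    """Count how many timestamps fall into each six-hour interval."""
    return Counter(six_hour_bin(ts) for ts in timestamps)

# Hypothetical sample timestamps in UTC, not taken from the collected tweets.
sample = [
    datetime(2020, 3, 3, 4, 30, tzinfo=timezone.utc),   # 2 March 23:30 EST
    datetime(2020, 3, 3, 12, 0, tzinfo=timezone.utc),   # 3 March 07:00 EST
    datetime(2020, 3, 3, 16, 59, tzinfo=timezone.utc),  # 3 March 11:59 EST
]
for label, n in sorted(count_per_bin(sample).items()):
    print(label, n)
```

Converting to EST before binning matters: the first sample tweet was posted on 3 March in UTC but belongs to the evening of 2 March in EST.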

The graph shows, just as the primary results did, that Joe Biden made a big comeback during and after Super Tuesday. The chart also shows that Bernie Sanders was the person mentioned by the most people (@mentions), until Joe Biden was announced as the winner of Super Tuesday. It is also interesting to see how little Michael Bloomberg was mentioned, considering that he has spent around 3.7 billion Danish kroner on his campaign.

At DigitalMediaLab we are interested not only in who was mentioned the most, but also in who communicates with whom. We have therefore made a network graph of the 250 most mentioned accounts, this time based on the hashtag #supertuesday. The network graph again shows that people who are not politically active make up the largest group communicating with the candidates via Twitter. In addition, the sitting president, Donald J. Trump, is also among those mentioned most often.

Network graph showing the 250 most mentioned Twitter accounts in the period from 2 March at 06:00:00 to 4 March at 06:00:00. Times are in EST (Eastern Standard Time).

We will continue to follow the American election campaign on Twitter. Follow along here on the blog for the latest analyses.

Nationalism without borders? Transnational networks among right-wing online news sites

by Eva Mayerhöffer

Just in time for the European Parliament Election 2019, a somewhat notorious figure made a public re-appearance. Steve Bannon, the co-founder of alt-right news site Breitbart News and former chief strategist for the Trump administration, embarked on a mission to foster a global right-wing populist movement (https://www.nytimes.com/2018/03/09/world/europe/horowitz-europe-populism.html). Europe was supposed to be the first step.

The Movement, as it was called, never really took off – that it should require an American to ‘save’ Europe was too much to swallow for European nationalists after all. Yet, while Steve Bannon may have failed on the political scene, chances are he has fostered the global alt-right movement on a whole other level. In the past couple of years, many countries around the globe have seen the emergence of online news media that position themselves as a counterforce to a perceived liberal mainstream in media and politics. Many of these right-wing alternative (or ‘alt-right’) online news media have been inspired not least by Bannon’s brainchild Breitbart News.

Is it maybe through these news sites that we can see a transnational alt-right movement emerging? And how transnationally oriented are these news sites in the first place? Can we find evidence for an emerging network of right-wing online news sites across countries? We tried to answer these questions by focusing on alternative news sites from six different countries – the US, the UK, Sweden, Denmark, Germany and Austria. With the US and Sweden, the selection includes two countries that are frequently named as ‘exporters’ of alt-right ideology, as well as three country pairs of (cultural) neighbors. That right-wing news sites from precisely these countries should entertain at least some relations with each other across borders is thus not unlikely. 

As most of these news sites are online native and rely heavily on social media as a dissemination platform, digital methods naturally played an important part in data collection and analysis. In assessing the transnational networking potential of these sites, we focused on two aspects in particular: 1) We looked at whether they re-tweeted posts of, or mentioned, other right-wing news sites on Twitter. To do so, we harvested the mention and re-tweet activity of a total of 65 sites through DMI-TCAT-user, hosted by RUC’s Digital Media Lab (https://digitalmedialab.ruc.dk/hosted-resources/). 2) We also studied whether they hyperlinked to other right-wing news sites in article content published on their websites. As alternative news sites are not regularly included in media archives such as Infomedia or LexisNexis, we collected article hyperlinks through the platform MediaCloud.org, which collects online news stories through the RSS feeds of online media sources. To scrape all hyperlinks embedded in these articles, we used the R package ‘rvest’.
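For readers who want a feel for the hyperlink step, here is a minimal sketch of the same idea in Python, using only the standard library. It is not the rvest code used in the study: the class and function names, the HTML snippet and the URLs are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in an HTML document."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the article URL.
                self.links.append(urljoin(self.base_url, href))

def external_domains(html, article_url):
    """Return the domains an article links to, excluding its own site."""
    parser = LinkCollector(article_url)
    parser.feed(html)
    own = urlparse(article_url).netloc
    domains = []
    for link in parser.links:
        netloc = urlparse(link).netloc
        # Links without a domain (e.g. mailto:) and internal links are skipped.
        if netloc and netloc != own:
            domains.append(netloc)
    return domains

# Hypothetical article snippet and URLs, for illustration only.
html = '<p><a href="/about">About</a> <a href="https://example.org/story">Source</a></p>'
print(external_domains(html, "https://example.com/article"))
```

Reducing each link to its domain is what makes it possible to check afterwards whether the target is one of the 65 sites, a legacy media outlet or something else.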

Why not Facebook? As many of these sites use Facebook almost exclusively as a platform to disseminate their website article content, collecting website hyperlinks by extracting data from Facebook’s API would indeed have been a viable alternative, not least because it would also make it possible to include audience engagement patterns in the analysis. Unfortunately, apps for doing so, e.g. DMI’s Netvizz application, are no longer allowed on Facebook.

But back to the question of transnational networks among right-wing news sites. To understand their linking behavior better, we have to quickly consider why right-wing news sites may be inclined to refer to other alternative news sites in the first place. Very broadly, we can distinguish between two strategies. On the one hand, right-wing news sites may refer to each other based on a movement logic. This means that they predominantly perceive themselves as part of a larger right-wing movement beyond the conservative mainstream that includes actors and organizations from the populist right-wing, ‘alt-right’, far right and extreme right spectrum. Hyperlinks, mentions and re-tweets serve here to cement political alliances, to build and reinforce a group identity, and to increase the visibility and exaggerate the importance of issues relevant to the movement. On the other hand, alt-right news sites may link to other sites based on a professional logic. By hyperlinking to additional material in articles, the sites can seek to heighten the article’s concision and depth through background information. So-called citational links to the original producer of news or other materials demonstrate facticity and thereby strengthen the credibility of an article and of the entire website. The higher the societal status and overall credibility of the linked source, the higher the chances that the linking practice enhances one’s own reputation.

But why link to right-wing news sites from abroad? For one, it has been argued that transnational networking is particularly relevant when the national alternative digital news environment is relatively underdeveloped, as is the case in Denmark, for example. Secondly, including sources from abroad widens the spectrum of news stories of partisan news value. A typical news story featured on these sites is, for example, a criminal offense committed by immigrants. Yet, even though these sites are working hard to suggest otherwise, these offenses do not come in infinite numbers. And where are additional stories easier to find than on alt-right news sites from other countries? Finally, alt-right news sites may perceive themselves as competing with each other on the national level, but less so on the transnational level, and thus be more prone to refer to each other there.

Over the course of a 3-month period, we managed to extract more than 700,000 relevant hyperlinks (that is, excluding links to e.g. advertisements or social media platforms) from articles published by the 65 right-wing alternative websites. Roughly 24,000 of those were connections between these 65 sites. This low share is not surprising: even if their linking pattern were strictly based on a movement logic, right-wing news sites would of course also link to other right-wing partisan actors (parties, movements, bloggers, etc.) and to right-wing online news sites from countries other than the six studied. If we additionally consider the professional logic, right-wing news sites will moreover include citational and background links to established actors and organizations, including legacy media.

What is maybe more surprising is that fewer than 1,000 of these connections were transnational. At first sight, the transnational outlook of these sites thus appears minimal. However, whether or not right-wing news sites linked to their national or international peers was also highly country-dependent. In the US (pink), the country with by far the most elaborate ‘alt-right’ digital news infrastructure, 99.9% of all article links to other right-wing news sites were national. In Sweden (blue) and Germany (purple), the majority of links were likewise national; in Austria (green), the distribution was rather even. In Denmark (beige) and the UK (yellow), where the right-wing digital news infrastructure is relatively weak, literally all links were transnational.


Hyperlinks in website articles (primary network), based on 65 news sites and 23,806 connections. Graph created in R. Layout: Fruchterman-Reingold. Node size represents out-degree. Edges between two nodes from the same country take the color of that country group; edges between two nodes from different countries are gray.
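The national versus transnational split reported above boils down to comparing the country of the linking site with the country of the linked site. A minimal Python sketch of that calculation follows. The graph itself was made in R, so this is only an illustration, and the function name, site names and links are hypothetical rather than the study's data.

```python
from collections import defaultdict

def transnational_share(edges, country_of):
    """For each source country, return the share of links that cross borders.

    edges: iterable of (source_site, target_site) pairs.
    country_of: dict mapping each site to its country code.
    """
    totals = defaultdict(int)
    crossing = defaultdict(int)
    for source, target in edges:
        src_country = country_of[source]
        totals[src_country] += 1
        if country_of[target] != src_country:
            crossing[src_country] += 1
    return {country: crossing[country] / totals[country] for country in totals}

# Hypothetical sites and links, for illustration only.
country_of = {"us1": "US", "us2": "US", "dk1": "DK", "se1": "SE"}
edges = [("us1", "us2"), ("us2", "us1"), ("dk1", "se1"), ("dk1", "us1"), ("se1", "se1")]
print(transnational_share(edges, country_of))
```

The same comparison decides the edge color in the graph: same country means country color, different countries means gray.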

Many of the transnational links are between neighboring countries: German and Austrian sites link to each other, while Danish sites entertain a strong connection to Swedish sites. In general, however, it is US-based right-wing news media, and here not least Breitbart News (brt.), that serve as a hub in the transnational ecosystem of right-wing alternative news displayed in the graph.

Yet, our right-wing news sites are also connected across borders in another way. If we extend the view beyond direct links between right-wing online news sites, we can see that our right-wing news sites form part of a transnational network held together by the fact that many of them refer to the same third-party sites. Interestingly enough, the majority of these are established legacy media outlets like the New York Times, The Guardian, BBC, CNN, Swedish Aftonbladet, German BILD or Israeli Haaretz. In a case study based on the Danish right-wing sites, we found that these links to established media from abroad are only rarely used to delegitimize the source, but much more often serve to enhance the facticity and credibility of a given news story (“see, even the New York Times writes it”).

In contrast to website content, direct links between right-wing news sites based on mentions and re-tweets matter more on Twitter. Where the logic seemed more professional for website article hyperlinking, Twitter indeed seems to provide a better platform for movement-based networking. But even on Twitter, transnational mentions and re-tweets of legacy media sources carry quite some weight.

Did we find evidence for a transnational alt-right movement spearheaded by alternative media? Not so much. What did we find then? We did uncover interesting differences in how linking patterns vary between Twitter (movement logic!) and website (professional logic!) communication, as well as between countries with established and weak digital right-wing news infrastructures. We found a rather central position of Breitbart and a few other US-based alt-right media in what could eventually amount to a transnational network of right-wing alternative media. And, not least, we found a rather surprising reliance on legacy media as a journalistic source for a type of media that defines itself as working against the so-called media mainstream.

This blog entry has been written by lab member Eva Mayerhöffer and is based on her research conducted in collaboration with the research group ‘Digitalisation and the Transnational Public Sphere’, Weizenbaum Institute for the Networked Society, Berlin (https://www.weizenbaum-institut.de/en/research/rg15/).

Primaries in the USA and #2020election

On November 3, Americans will vote on who they want as president for the following four years. DigitalMediaLab is continuously following the American “Twittersphere” through a series of studies.

Digital Media Lab has its eyes firmly fixed on the USA. But before Americans get that far, the primaries have to be completed. The primaries are the period in which candidates from the two parties compete to become their party's final candidate for the presidency and thereby appear on the ballot on November 3. The sitting president, Donald J. Trump of the Republican Party, should be safe, but the situation is different in the Democratic Party. Eight people are currently competing to become the party's presidential candidate, and as things stand, it is still a close race.

In the lab we collect data from the hashtag #2020election, and we have looked at the last two weeks (11 February at 12:30:00 to 25 February at 12:30:00) to see whether there are already some trends worth keeping an eye on. Among other things, we have looked at who is mentioned most often (@mentions) together with the hashtag.

The ten most mentioned accounts together with the hashtag #2020election are:

  1. Donald J. Trump (3503 times – @realDonaldTrump)
  2. Polls Of Politics (2411 times – @pollsofpolitics)
  3. Bernie Sanders (845 times – @BernieSanders)
  4. A.F. Branco Cartoons (818 times – @afbranco)
  5. Lindsey Graham (758 times – @LindseyGrahamSC)
  6. Herman Cain (536 times – @THEHermanCain)
  7. Hillary Clinton (501 times – @HillaryClinton)
  8. Juliet [surname unknown] (414 times – @Julietknows1)
  9. The Western Journal (376 times – @Westjournalism)
  10. Maria Bartiromo (396 times – @MariaBartiromo)

The list includes politicians such as Donald J. Trump (1), Bernie Sanders (3), Lindsey Graham (5) and Hillary Clinton (7). Polls Of Politics (2) is also on the list; it is an independent Twitter profile that runs polls on its profile. There are also TV personalities such as Herman Cain (6) and Maria Bartiromo (10).

It is interesting to see that Bernie Sanders is the only Democratic Party candidate in the top ten. Further down the list we also find Pete Buttigieg (281 mentions) and Elizabeth Warren (211). Looking only at the Democratic Party's candidates, the ranked list looks like this:

  1. Bernie Sanders (845 times – @BernieSanders)
  2. Pete Buttigieg (281 times – @PeteButtigieg)
  3. Elizabeth Warren (211 times – @ewarren)
  4. Mike Bloomberg (196 times – @MikeBloomberg)
  5. Joe Biden (148 times – @JoeBiden)
  6. Tulsi Gabbard (57 times – @TulsiGabbard)
  7. Amy Klobuchar (45 times – @amyklobuchar)
  8. Tom Steyer (29 times – @TomSteyer)
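Mention counts like the ones in the lists above can be produced with a few lines of Python. This is a minimal sketch and not the lab's own pipeline: the function name and sample tweets are hypothetical, handles are compared case-insensitively, and a handle that appears twice in one tweet is counted twice.

```python
import re
from collections import Counter

def top_mentions(tweets, hashtag, n=10):
    """Count @mentions in tweets that carry the given hashtag."""
    wanted = hashtag.lstrip("#").lower()
    counts = Counter()
    for text in tweets:
        lowered = text.lower()
        # Match whole hashtags so that e.g. "#2020elections" does not count.
        if wanted in re.findall(r"#(\w+)", lowered):
            counts.update(re.findall(r"@(\w+)", lowered))
    return counts.most_common(n)

# Hypothetical tweets, for illustration only.
tweets = [
    "Vote! @BernieSanders #2020election",
    "@realDonaldTrump vs @berniesanders #2020Election",
    "@JoeBiden on the trail",  # no hashtag, so not counted
]
print(top_mentions(tweets, "#2020election"))
```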

We have also looked at social networks to see who mentions whom together with the hashtag #2020election. The network graph can be seen at the bottom of the article. The graph consists of circles of different sizes (nodes) and a large number of arrows (connections). The size of a node reflects how many times the account has been mentioned. It is therefore no surprise that Donald J. Trump's circle is the largest, since as sitting president he can be expected to be mentioned most often. One can also see that the connection between the Twitter user @chris_1791 and @breitbartnews is the largest, since @chris_1791 is the only one who has mentioned @breitbartnews.

For this collection we took the 100 most mentioned accounts, and we also removed nodes that only mention themselves and are therefore not connected to other nodes.
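The construction described above (node size from mention counts, a cut-off at the most mentioned accounts, and removal of nodes that only mention themselves) can be sketched as follows. This is a simplified illustration under our own assumptions, not the lab's actual procedure, and the function name and sample pairs are hypothetical.

```python
from collections import Counter

def mention_network(mentions, top_n=100):
    """Build a weighted mention network around the most mentioned accounts.

    mentions: iterable of (author, mentioned_account) pairs.
    Returns (node_sizes, edge_weights).
    """
    mentions = list(mentions)
    # Node size: how many times an account has been mentioned.
    mentioned = Counter(target for _, target in mentions)
    top = {account for account, _ in mentioned.most_common(top_n)}

    # Keep edges pointing at a top account and drop self-mentions.
    edges = Counter(
        (author, target)
        for author, target in mentions
        if target in top and author != target
    )
    # Nodes left without any edge (self-mention only) disappear here.
    connected = {node for edge in edges for node in edge}
    sizes = {node: mentioned[node] for node in connected}
    return sizes, dict(edges)

# Hypothetical (author, mentioned) pairs, for illustration only.
pairs = [("a", "trump"), ("b", "trump"), ("a", "trump"), ("c", "c"), ("chris", "breitbart")]
sizes, edges = mention_network(pairs)
print(sizes)
print(edges)
```

In this toy example, account "c" only mentions itself and therefore does not appear in the result, while the edge from "a" to "trump" gets weight 2.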

The network graph also shows that it is not only politicians who use the hashtag, but also people we have identified as non-politicians. It is also worth noting that the hashtag #2020election does not yet have much activity, but it is expected to be used heavily once the final presidential candidate for the Democratic Party has been found and the campaign for the presidency begins in earnest.

Network graph showing the 100 most mentioned Twitter accounts in the period from 11 February at 12:30:00 to 25 February at 12:30:00.

Digital Media Lab and the coming weeks

Over the coming weeks, Digital Media Lab will, among other things, keep a sharp focus on the phenomenon of ‘Super Tuesday’, which takes place on March 3. More than 33% of the delegate votes will be handed out there, and so far we have collected over 18,000 tweets. We will also keep an eye on Twitter in connection with the debates taking place on American TV.

If you have a suggestion for what we should look for, you are as always more than welcome to contact us by email: digitalmedialab@ruc.dk

The 2019 Danish General Election on Twitter – Week 3

The third week of the election campaign is over, and we have entered the final exciting week before the election on June 5. Before then, Digital Media Lab takes stock of the previous week. In this case, the week runs from 21 May up to and including 27 May 2019.

In this edition we look, as we did last week, at the following:

  • Which profile has posted the most tweets?
  • Which hashtags have been the most popular?
  • Which profile has been mentioned the most?
  • Which profile has tweeted the most over the whole election campaign?

In addition to the above, we also take a closer look at the influence of the European Parliament election on the general election campaign.

For good measure, let us take a closer look at which profiles posted the most tweets in the past week.

Number of tweets
In total, 6,660 tweets were posted, spread across 560 candidates. This also includes the parties' own Twitter profiles. The top five most active profiles are:

  1. Karen Melchior – incoming Member of the European Parliament for Radikale Venstre (758 tweets)
  2. Uffe Elbæk – leader and member of the party Alternativet (193 tweets)
  3. Uwe Max Jensen – parliamentary candidate for Stram Kurs in North Jutland (172 tweets)
  4. Andreas Albertsen – parliamentary candidate for SF in West Jutland (148 tweets)
  5. Niels Callesøe – parliamentary candidate for SF in East Jutland (147 tweets).

It is worth noting in this context that retweets are included in the total number of tweets. Karen Melchior is still active on Twitter, but most of her tweets are retweets. Karen Melchior, who was recently elected to the European Parliament, thus goes off the chart with her 758 tweets. To put that number in perspective, it corresponds to about 4.5 tweets per hour for seven full days.

The bar chart below shows this:

Hashtags
As noted earlier, there were 6,660 tweets in this period, and the topics they touch on are very similar. For good measure, we should mention again that the graphs below exclude the hashtags #dkpol, #fv19 and #dkmedier. In this edition we have made two graphs: one that includes the hashtag #ep19dk and one that does not. We did this to give a more realistic picture of which topics were discussed. We should also mention that election hashtags such as #folketingsvalg19, #fv2019 and #valg2019 have been merged so that all counts are included. This means that the placement of hashtags in the graph does not necessarily match their actual placement.

The ranking is:

  1. #ep19dk (used 510 times)
  2. #dkgreen (used 168 times)
  3. #fremad (used 135 times)
  4. #viereuropa (used 90 times)
  5. #stemgrønnest (used 66 times)
  6. #alternativet (used 51 times)
  7. #grøntvalg (used 41 times)
  8. #hopeisback (used 31 times)
  9. #sundpol (used 29 times)
  10. #Klimamarch (used 23 times)
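The exclusion and merging rules described above can be expressed as a small counting routine. The Python below is a minimal sketch under our own assumptions: the merged label `valg2019` and the sample tweets are hypothetical, hashtags are compared case-insensitively, and every occurrence is counted.

```python
import re
from collections import Counter

# Hashtags left out of the graphs, as described in the text.
EXCLUDED = {"dkpol", "fv19", "dkmedier"}
# Election hashtags counted as one; the shared label is a hypothetical choice.
MERGED = {"folketingsvalg19": "valg2019", "fv2019": "valg2019", "valg2019": "valg2019"}

def hashtag_counts(tweets):
    """Count hashtags across tweets, skipping excluded ones and merging variants."""
    counts = Counter()
    for text in tweets:
        for tag in re.findall(r"#(\w+)", text.lower()):
            if tag in EXCLUDED:
                continue
            counts[MERGED.get(tag, tag)] += 1
    return counts

# Hypothetical tweets, for illustration only.
tweets = [
    "#dkpol #ep19dk #fv2019",
    "#EP19dk #valg2019",
    "#dkgreen #folketingsvalg19 #dkmedier",
]
print(hashtag_counts(tweets).most_common())
```

Dropping `ep19dk` from the result before plotting gives the second graph, the one without the European Parliament hashtag.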

The graph including the hashtag #ep19dk looks like this:

The graph without the hashtag #ep19dk looks like this:

Mentions
In the previous two weeks of the campaign, Socialdemokratiet's Twitter profile (Spolitik) was mentioned most often. In the third week of the campaign, it was beaten by Alternativet's Twitter profile (alternativet_). The ranking is as follows:

  1. Alternativet's Twitter profile (alternativet_) was mentioned 295 times
  2. Socialdemokratiet's Twitter profile (Spolitik) was mentioned 244 times
  3. Venstre's Twitter profile (venstredk) was mentioned 191 times
  4. Det Radikale Venstre's Twitter profile (radikale) was mentioned 176 times
  5. Karen Melchior's Twitter profile (karmel80) was mentioned 166 times

Further down the list are party chairmen and chairwomen such as Uffe Elbæk, Lars Løkke Rasmussen, Pia Kjærsgaard and Pia Olsen Dyhr. The graph looks like this:

Total tweets
Three weeks of campaigning have passed, and the final week has only just begun. Digital Media Lab has therefore taken a dive into who has tweeted the most. At this point it can hardly come as a surprise that Karen Melchior, incoming Member of the European Parliament for Radikale Venstre, holds a rock-solid first place. With 1,361 tweets, she has posted more than twice as many as the runner-up. Second place is held by Uwe Max Jensen, parliamentary candidate for Stram Kurs in North Jutland.

The ranking is:

  1. Karen Melchior – incoming Member of the European Parliament for Radikale Venstre (1361 tweets)
  2. Uwe Max Jensen – parliamentary candidate for Stram Kurs in North Jutland (562 tweets)
  3. Andreas Albertsen – parliamentary candidate for SF in West Jutland (551 tweets)
  4. Uffe Elbæk – leader and member of the party Alternativet (508 tweets)
  5. Niels Callesøe – parliamentary candidate for SF in East Jutland (486 tweets).

The graph looks like this:

This overview was produced by research assistant Nicolaj Sveiger in collaboration with assistant professor Sander Andreas Schwartz.

Dataharvest 19 – Interesting software takeaways

We are currently attending the European Investigative Journalism Conference (EIJC19) in Mechelen, and will write about interesting software usable for research purposes.

This is a list of software we took note of and that you might find interesting.

  • Datashare – a cool piece of software for reading documents and turning unstructured data into structured data.
  • Neo4j – Graph software that turns relational data into just that: relations, which can be visualized in many ways. It uses a language quite similar to SQL, though the syntax is pretty complex.
  • Anaconda – Once again the Anaconda platform seems to be the weapon of choice for coders around the world.
  • OSINT Framework – Framework focused on gathering information from free tools or resources. Also on GitHub.
  • Python Package Index – Great overview of libraries. Searchable, obviously.

And as a small bonus, we did a bit of coding while at the conference. We have updated our scripts for converting handles and IDs for Twitter users (and vice versa). They now output both to the command line and to CSV files 🙂

Datashare – Interesting tool developed by ICIJ

Have you ever encountered the following?

You have read a bunch of documents. Think a lot of documents. But after reading, you are thinking of a specific detail. A detail you have forgotten. The name of a company or product, for instance. And really, you are not interested in reading all the documents again. So you start skimming the texts for the missing detail.

Good news – now you won’t have to skim them yourself. Let Datashare do it for you.

Will Fitzgibbon of ICIJ has written a pretty good guide to Datashare that you could start by reading.

ICIJ is the International Consortium of Investigative Journalists, the organization behind the Panama Papers, Lux Leaks, Offshore Leaks, Implant Files and many other large-scale international collaborative (data) journalism projects.

We are currently attending the European Investigative Journalism Conference (EIJC19) in Mechelen, and will write about interesting software usable for research purposes.

Link: Datashare on ICIJ
Link: Dataharvest/EIJC

Collecting data from Facebook pages

If you are interested in collecting and analyzing data from Facebook pages, we hereby provide a short how-to guide. You can only collect data from public pages, which means no automated collection from groups or personal profiles. You might experience some issues with posts that are not collected. Always check that the data you collect appears to be complete by going to the Facebook page and comparing the numbers of likes/comments/shares and posts. If you experience any issues, try to collect smaller amounts of data covering a shorter period of time. You can always merge multiple data sets later on.

This guide requires Excel and a Facebook account.

  1. Find Facebook ID
    • Go to the Facebook page that you want to collect data from via a web browser
    • Copy the URL address
    • Paste the address into the available space at https://lookup-id.com/
    • Copy the number (Facebook ID) that you receive through this process.
  2. Collect data from Facebook pages via Netvizz
    • Go to the app Netvizz via url: https://apps.facebook.com/107036545989762/
    • Install and accept permission for the app. This is a university app developed by the University of Amsterdam, and your data is not stored on their servers.
    • Press the link Page posts and insert the FB ID
    • Choose date and other collection details. Leave the posts statistics only selected unless you are specifically interested in the content of comments
    • Press posts by page only or posts by page and users
    • Scroll down to the bottom, below both graphs. This is tricky on a Mac, since you might not see the option for scrolling. Below the two graphs you will find the download link saying Download link as zip file. Press that link to download the file.
  3. Open in Excel
    • Unpack the zip file by double-clicking it.
    • Choose the file that is NOT called statsperday.
    • Open the file by right-clicking it, choosing Open with and then Other.
    • Select all programs rather than recommended programs and then find your Excel program.
    • Click the Excel program and view the data via Excel.

You should now have an overview of the collected data in Excel, sorted into columns and rows. If the data does not appear to be sorted into columns, select column A and press Data and then Text to columns. Choose delimited, then select tab and press finish.
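If you would rather work outside Excel, the same tab-delimited files can be read and merged with a few lines of Python. This is a minimal sketch and not part of Netvizz: the function names are ours, and the column names and values are hypothetical, so check the header of your own export first.

```python
import csv
import io

def read_tab_file(handle):
    """Read a tab-separated file into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle, delimiter="\t"))

def merge_datasets(datasets, key):
    """Merge several collections, keeping the first row seen for each key."""
    merged = {}
    for rows in datasets:
        for row in rows:
            merged.setdefault(row[key], row)
    return list(merged.values())

# Hypothetical exports with overlapping posts, for illustration only.
first = io.StringIO("post_id\tlikes\n1\t10\n2\t5\n")
second = io.StringIO("post_id\tlikes\n2\t5\n3\t7\n")
rows = merge_datasets([read_tab_file(first), read_tab_file(second)], key="post_id")
print(len(rows))
```

For real files, pass `open(path, newline="", encoding="utf-8")` instead of the in-memory examples. Merging on a post identifier means that overlapping collection periods do not produce duplicate rows.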

Watch two how-to guides from the developer himself. The interface has changed a little since then, but overall it is very similar.

Link to YouTube video

 

Scraping 101 and basic programming concepts

Credit: Screendump from http://sumsum.se/posts/scraping101-part2/ by Mikko Helsig

Ever wondered how scraping works? If you are pretty much blank when it comes to programming, this guide is probably not for you. However, if you have the basic concepts in place, in a few steps the author, Mikko Helsig, shows you how to scrape a site in Python (and also how to install Python in a Windows environment).

The only prerequisite is a basic understanding of programming. In return you get an idea of how Python works, how scraping works and the really cool libraries requests, requests_cache, BeautifulSoup and Gender (the latter is a library used for guessing and parsing the gender of names).

Link: Scraping 101

If you are totally new to programming, we encourage you to start by learning a little Python. There are numerous places to do this. For instance:

We also deeply encourage you to start with programming in an environment such as Anaconda. A short description of the Anaconda Navigator can be found here.

Get data from Instagram with Instaloader

We have added Instaloader to our External resources.

Instaloader is a tool to download pictures (or videos) along with their captions and other metadata from Instagram. You can either download profiles or hashtags, and it’s possible to set up filters (for instance date filters, see below) to narrow your search.

To use Instaloader, do the following:

  1. Download and set up a new Anaconda environment with a Python version higher than 3.5.
  2. Install Jupyter Notebook in the environment and open a new terminal.
  3. Do a pip (not pip3, as that does not work with Anaconda) install of instaloader and its dependencies:
    pip install instaloader
  4. Create a new folder in your root environment (typically your documents folder) called, for instance, Instaloader.
  5. In the terminal, change into that folder:
    cd Instaloader
  6. This avoids everything being saved in your base folder 🙂
  7. Run the command line commands you need in your terminal. Please note that the interface is rudimentary, but filters can be applied using boolean expressions, for instance:
    instaloader "#HASHTAG" --post-filter="date_utc >= datetime(2017,1,1) and date_utc <= datetime(2018,1,1)" --login=USERNAME
  8. We would love to implement this as a hosted service. However, it is not likely we will do so just now. Therefore, please experiment with it yourself. You can also ask our advice, and we will do our best to help. If you plan to use this tool on a regular basis or for larger datasets, you should probably be ready to use several user accounts and/or proxies to avoid being banned.
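To see what the date filter in step 7 actually keeps, it helps to write the same boolean expression as plain Python. The sketch below assumes, as we understand the Instaloader documentation, that the filter is a Python expression evaluated against each post's attributes; the function name `in_date_range` is ours and not part of Instaloader.

```python
from datetime import datetime

def in_date_range(date_utc, start=datetime(2017, 1, 1), end=datetime(2018, 1, 1)):
    """Mirror the filter expression from the command line example above."""
    return date_utc >= start and date_utc <= end

# Hypothetical post timestamps, for illustration only.
print(in_date_range(datetime(2017, 6, 1, 12, 0)))  # inside the range
print(in_date_range(datetime(2018, 1, 1, 0, 0)))   # exactly on the end boundary
print(in_date_range(datetime(2018, 1, 1, 0, 1)))   # one minute past the boundary
```

Note that `datetime(2018,1,1)` means midnight at the start of 1 January 2018, so the example filter covers all of 2017 but only the very first instant of 2018.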